===================================================================
PLOVER - Preliminary List Of Vulnerability Examples for Researchers
===================================================================

[*] Author: Steve Christey (coley@mitre.org)
[*] Date: March 15, 2006
[*] Document Version: 0.24

Disclaimer: This is a DRAFT document that does not represent an official position of The MITRE Corporation. This document has been created to spur progress in vulnerability classification and vulnerability research.

===================================================================
SECTION.1. [INTRO] Introduction
===================================================================

Recently, there has been renewed interest in the classification and categorization of vulnerabilities, attacks, faults, and other concepts. Past efforts have largely focused on high-level theories, taxonomies, or schemes that do not sufficiently cover the wide variety of security issues that are found in today's products.

PLOVER - the Preliminary List Of Vulnerability Examples for Researchers - is a working document that lists over 1400 diverse, real-world examples of vulnerabilities, identified by their CVE number. The vulnerabilities are organized within a novel, detailed conceptual framework. The framework does not solve the entire classification problem, but it provides useful discussion points and an effective vocabulary for describing vulnerabilities at a low level of detail. PLOVER defines a set of terms and concepts that could help in communicating about vulnerabilities at an abstract level.

PLOVER is intended for use by parties who are interested in vulnerability research and classification, including academic researchers, code auditing tool developers, secure programming researchers, and others. It is a resource for knowledgeable and skilled vulnerability analysts and may be of less use to the general public.

PLOVER includes: [*] Vulnerability Theory: a conceptual framework for describing and discussing several aspects of vulnerabilities at a low level [*] High-level and low-level vulnerability types and definitions, the relevant attributes, and the inter-relationships between those types, for a current total of 290 types [*] Over 1400 real-world examples of vulnerabilities, identified by their CVE name [*] Discussion of current terminology and its limitations [*] Research gaps PLOVER is an extension and improvement of the "Vulnerability Auditing Checklist," which was posted to various security mailing lists between 2002 and 2004. That checklist has been retired, although PLOVER can still be used as a checklist. After review and revision, it is hoped that PLOVER can become a concise, well-defined, commonly used set of terms and concepts that will improve communications regarding vulnerabilities, support the development and evaluation of code analysis tools, and provide a rich environment for academic research. PLOVER will be used by the CVE project to (1) define terms used in CVE descriptions, (2) provide clarity in distinguishing between different bug types when applying CVE content decisions, and (3) perform more precise vulnerability trend analysis. It must be emphasized that PLOVER, while extensive, is a working document that may contain errors or omissions. It has not been closely validated or compared with past efforts. However, due to the increased interest in vulnerability classification, the author believes that PLOVER can be a useful resource for advancing vulnerability theory within the security community. =================================================================== SECTION.2. [TOC] Table of Contents =================================================================== SECTION.1. [INTRO] Introduction SECTION.2. [TOC] Table of Contents SECTION.3. [DEFS] Terms and Definitions SECTION.4. [VC] Additional Vulnerability Concepts SECTION.5. 
[TERMPROB] Problems with Existing Terminology SECTION.6. [DIAG] Diagnostic Errors and Challenges SECTION.7. [HOT] Hypotheses, Observations, and Theories SECTION.8. [GENESIS] Genesis of Vulnerabilities SECTION.9. [WIFF] WIFFs: Weaknesses, Idiosyncrasies, Faults, Flaws SECTION.9.1. [BUFF] Buffer overflows, format strings, etc. SECTION.9.2. [SVM] Structure and Validity Problems SECTION.9.3. [SPEC] Special Elements (Characters or Reserved Words) SECTION.9.4. [SPECM] Common Special Element Manipulations SECTION.9.5. [SPECTS] Technology-Specific Special Elements SECTION.9.6. [PATH] Path Traversal and Equivalence Errors SECTION.9.7. [CP] Channel and Path Errors SECTION.9.8. [CCC] Cleansing, Canonicalization, and Comparison Errors SECTION.9.9. [INFO] Information Management Errors SECTION.9.10. [RACE] Race Conditions SECTION.9.11. [PPA] Permissions, Privileges, and ACLs SECTION.9.12. [HAND] Handler Errors SECTION.9.13. [UI] User Interface Errors SECTION.9.14. [INT] Interaction Errors SECTION.9.15. [INIT] Initialization and Cleanup Errors SECTION.9.16. [RES] Resource Management Errors SECTION.9.17. [NUM] Numeric Errors SECTION.9.18. [AUTHENT] Authentication Error SECTION.9.19. [CRYPTO] Cryptographic errors SECTION.9.20. [RAND] Randomness and Predictability SECTION.9.21. [CODE] Code Evaluation and Injection SECTION.9.22. [ERS] Error Conditions, Return Values, Status Codes SECTION.9.23. [VER] Insufficient Verification of Data SECTION.9.24. [MAID] Modification of Assumed-Immutable Data SECTION.9.25. [MAL] Product-Embedded Malicious Code SECTION.9.26. [ATTMIT] Common Attack Mitigation Failures SECTION.9.27. [CONT] Containment errors (container errors) SECTION.9.28. [MISC] Miscellaneous WIFFs SECTION.10. Additional Examples SECTION.10.1. [ALT] Alternate Elements Examples SECTION.10.2. [MAN] Manipulations Examples SECTION.10.3. [ACON] Atomic Consequences - Examples SECTION.10.4. [CAT] Additional Categorized Examples SECTION.10.5. 
[UNCAT] Additional Uncategorized Examples SECTION.11. References SECTION.12. Contributors / Acknowledgements SECTION.13. Change Log ================================================================ SECTION.3. [DEFS] Terms and Definitions ================================================================ This section identifies the most critical terms, definitions, and concepts that are used in PLOVER. They are used throughout the rest of the document. Use of these terms and concepts may improve communication about vulnerability theory. =============================================================== DEFS.CDEFS. Core Definitions PROPERTY: A characteristic of data or an action (step) that is relevant to the security of a product. Examples: Is the data well-formed? Is the step allowed given the previous step? ATTACKER: A person or independently executing program that intends to compromise the confidentiality, integrity, or availability of a product. MANIPULATION: A modification by an ATTACKER of a data element, group of elements, action, or group of actions based on one or more PROPERTIES. Examples: modify the input by removing a required argument; perform steps out of order. WIFF: Weakness, Idiosyncrasy, Flaw, or Fault. An algorithm, sequence of code, or a configuration in the product, whether it arises from implementation, design, or other processes, that can cross data or object boundaries that could not be crossed during normal operation of the product. CONSEQUENCE: An action performed by the product after a data or object boundary has been crossed, which could not have occurred otherwise. CHANNEL: A communications channel, or an interface, between two entities. ATTACK VECTOR: The minimal set of MANIPULATIONS, CHANNELs, and operational constraints, by the attacker or the product, that are required to cause the product to reach a WIFF through one or more CHANNELs. 
ATTACK CHANNEL: A CHANNEL in an ATTACK VECTOR that must be controlled or influenced by an ATTACKER for the attack to succeed. VULNERABILITY: A WIFF in a specific product, or a design intended for a class of products that provide the same functionality, that has at least one ATTACK VECTOR. ATTACK: The set of actions by which an ATTACKER follows an ATTACK VECTOR to exploit a VULNERABILITY to achieve a desired CONSEQUENCE. =============================================================== DEFS.ODEFS. Other definitions RESULTANT: Only existing as a result of another WIFF, VULNERABILITY, or CONSEQUENCE. PRIMARY: Existing independently of another WIFF, VULNERABILITY, or CONSEQUENCE. MULTI-FACTOR VULNERABILITY (MFV): A vulnerability that contains two or more WIFFs, two or more manipulations, or two or more attack channels. MULTI-CHANNEL VULNERABILITY: A vulnerability whose attack vector contains two or more attack channels that must be controlled by the attacker. MULTI-CHANNEL ATTACK: An ATTACK on a multi-channel vulnerability. MULTI-MANIPULATION ATTACK: An ATTACK that requires two or more "trigger" manipulations. ATOMIC CONSEQUENCE: The first low-level product action that crosses data or object boundaries. Examples: read or write data past boundary, perform operation on wrong object. FUNCTIONAL CONSEQUENCE: A higher-level action whose security implications can only be described at the functional level of the product. Examples: source code disclosure, authentication bypass, code execution. DIAGNOSIS: The process by which a person analyzes the product in order to identify the underlying WIFFs, CONSEQUENCES, MANIPULATIONS, or ATTACK VECTORS of a VULNERABILITY. =============================================================== DEFS.DPROP. Data Properties There are several properties of data that are relevant to vulnerabilities. STRUCTURE: The Data is either well-formed or malformed. VALIDITY: Data is either valid or invalid. Invalid data includes (1) data of the wrong type (e.g. 
an alphabetic string when a number is expected), (2) an out-of-range numeric value, or (3) an undefined value (e.g. "Maybe" when the expected answers are either "Yes" or "No"). CONSISTENCY: The relationships between data elements or steps are either consistent or inconsistent. Examples: when the boundary string specified in a multipart MIME header is used in the body (consistent), or when the length field for an input does not match the actual length of the input. EQUIVALENCE: Equivalence determines whether multiple identifiers or references can exist for the same entity within a particular context. The data can be "equivalent" or "exclusive." An example of exclusive data is a primary record key in a database. In a data entry application, "F" and "Female" and "f" might all be treated as equivalent when entered by the user. ENCODING: There may be multiple encodings or representations that are supported for the data. For example, a web application might accept straight ASCII text, URL encoding sequences such as "%20", or Unicode. MUTABILITY: This specifies whether the data is expected to vary as the product executes. For example, a search query, a subject line in a forum post, or the name of a new user in a registration form might all be mutable; an internal buffer for storing the administrator's e-mail address might be immutable. Note that vulnerabilities can arise from violations of expected properties of data. Frequently, data can be manipulated in ways that violate the developer's assumptions. Each of the above properties might be an assumed property by the programmer, which could lead to WIFFs. It might be useful to discuss certain data elements in these terms, e.g. "assumed-immutable", "assumed-consistent", or "assumed-exclusive." For example, a PHP file include vulnerability might allow direct requests to support scripts (assumed-valid access) that can facilitate modification of global variables (assumed-immutable). 
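
The STRUCTURE, VALIDITY, and CONSISTENCY properties above can be illustrated with a small check. The following is a minimal sketch, not part of PLOVER itself: the "LENGTH:ANSWER" record format, the `check_record` function, and the `PropertyReport` type are all hypothetical names invented for this illustration. It reuses the "Yes"/"No"/"Maybe" example and the length-field example from the definitions.

```python
from dataclasses import dataclass

# Hypothetical set of defined answers, taken from the "Yes"/"No" example above.
ALLOWED_ANSWERS = {"Yes", "No"}


@dataclass(frozen=True)
class PropertyReport:
    well_formed: bool  # STRUCTURE
    valid: bool        # VALIDITY
    consistent: bool   # CONSISTENCY


def check_record(record: str) -> PropertyReport:
    """Evaluate a hypothetical 'LENGTH:ANSWER' record against three data properties."""
    # STRUCTURE: exactly one ':' separating an ASCII decimal length from the answer.
    length_text, sep, answer = record.partition(":")
    well_formed = (
        sep == ":"
        and length_text.isascii()
        and length_text.isdigit()
        and ":" not in answer
    )
    if not well_formed:
        # Malformed input cannot be meaningfully checked for the other properties.
        return PropertyReport(False, False, False)

    # VALIDITY: the answer must be one of the defined values.
    valid = answer in ALLOWED_ANSWERS

    # CONSISTENCY: the declared length must match the actual length.
    consistent = int(length_text) == len(answer)

    return PropertyReport(True, valid, consistent)


if __name__ == "__main__":
    for sample in ("3:Yes", "5:Maybe", "2:Yes", "Yes"):
        print(sample, check_record(sample))
```

In this sketch "5:Maybe" is well-formed and consistent but invalid, while "2:Yes" is well-formed and valid but inconsistent. A programmer who checks only one property is implicitly treating the others as assumed properties.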
=============================================================== DEFS.ALT. Alternate Elements Alternate elements are elements that have more than one identifier, reference, object, or method of access. They are important factors in many vulnerabilities, so they are briefly described here. More specific examples are provided in other sections. ALTERNATE CHANNEL: A specific action or data in a product is accessible through one channel, but another channel exists. Example: a web server opens up another listening port on TCP/8080. ALTERNATE NAME: ("alias"). An entity has a name or identifier that is typically used, but there are other names/identifiers that identify the same entity. Example: "abc/def.txt" and "ghi/../abc/def.txt" are alternate names for the same file. ALTERNATE PATH: Within a single channel, the product has one typical "path" of steps that the user must follow to reach a certain functionality, but there are other paths that reach the same functionality. Example: "admin.php" is assumed to be reachable only from links within "index.php", but the attacker can directly access "admin.php". **** DEFS.ALT.NAMES. Examples of Alternate Names Some alternate names include: [*] symbolic links [*] hard links [*] ".." in a path [*] absolute path [*] relative path [*] "C:" drive letter (Windows) [*] 8.3 filenames [*] CLSID NOTE: the current list includes mostly filenames, but there are other examples. =============================================================== DEFS.MANIPS. 
Manipulations

There are two main classes of manipulations:

Data manipulation: data is modified
Step manipulation: steps are modified

****

Data manipulations may include, but are not limited to:

[*] providing more or less input than expected
[*] injecting a special character
[*] using invalid syntax
[*] using an alternate encoding
[*] omitting a required value
[*] providing data of the wrong type
[*] modifying one item so that it is inconsistent with another item

Step manipulations may include, but are not limited to:

[*] skipping the first step
[*] skipping a required step
[*] performing steps out of order
[*] performing repeated steps
[*] not finishing a step
[*] interrupting a step

NOTE: more specific examples are provided in another section.

****

Some examples of manipulations at the product level include:

[*] well-formed data with an invalid value, e.g. a web command:
    GETTT / HTTP/1.0 http://www.example.com/
[*] malformed data with valid value
    - GET /
    - "GET" is a valid command and "/" is a valid URI, but there's no version specifier, so the input is malformed
[*] well-formed data with valid value:
    - ABCDEF~1.DAT (equivalent filename for "ABCDEFGHIJ.DAT")
[*] well-formed data with inconsistent value
    - $String = "Hello World!"; $StringLength = 2;
[*] log into FTP server and send PASS command before USER
[*] connect to telnet server but don't send any data
[*] exit connection while server is still sending data
[*] press "Escape" key instead of entering screensaver password

****

The same manipulation may have different data properties depending on the context. For example, the string "O'Neill" is valid in a text file but not in a SQL query. "i < 3" is a well-formed expression in JavaScript, but it is syntactically incorrect in HTML.

This is a strong argument for performing canonicalization ONLY at data borders - as soon as the data comes in, and just before it goes out.

****

Manipulations serve different roles to an attacker.

TRIGGER: specifically intended to exploit a WIFF.

ESSENTIAL: must be performed to properly interact with the product. Examples: a parameter in a CGI script must be base64-encoded; or, the attacker must log in and navigate to a specific menu. These manipulations are, by definition, valid and well-formed.

FACILITATOR: must be performed to work within the constraints of product execution. Examples: shellcode for a buffer overflow exploit must be less than 100 bytes and cannot contain any null characters; an XSS issue requires a ">" before the malicious string in order to terminate an open HTML tag being generated by the product.

Note that PLOVER only covers Trigger manipulations.

===============================================================
DEFS.CON. Consequences

As defined above, there are two types of consequences, atomic and functional.

Note that functional consequences can be primary or resultant. For example, a SQL injection issue might have a primary functional consequence of modifying a database; if the database is used for authentication, then the resultant consequence is modification or theft of authentication credentials. Primary disk consumption could lead to resultant CPU consumption as the processor does more bookkeeping work than normally needed.

Note that a "bypass" Consequence can occur as a result of manipulations of different alternate entity properties such as alternate name and alternate channel.

****

DEFS.CON.ATOM. Atomic Consequences

These are informal categories that may partially overlap. They are intended to demonstrate the concept rather than precisely define it.

[*] out-of-bounds write (buffer overflow or underflow)
[*] out-of-bounds read
[*] execute code
[*] operation on wrong entity
[*] wrong operation on entity
[*] numeric overflow
[*] undefined mathematical operation (e.g.
divide-by-zero) [*] invalid pointer dereference, including null dereference [*] infinite loop [*] long loop [*] infinite recursion [*] deep recursion [*] deadlock [*] access of stale identifier [*] access of previously freed memory, including double-free [*] access of uninitialized memory **** DEFS.CON.FUNC. Functional Consequences These are informal categories that may partially overlap. They are intended to demonstrate the concept rather than precisely define it. [*] path traversal [*] code execution [*] command execution [*] path disclosure [*] information leak [*] username enumeration [*] source code disclosure [*] authentication bypass [*] filter bypass [*] detection evasion / information hiding [*] wrong operation on object [*] operation on wrong object [*] hang or freeze [*] corrupt memory [*] refuse new connections [*] drop existing connections [*] memory consumption or exhaustion [*] CPU consumption or exhaustion [*] disk consumption or exhaustion [*] resource consumption or exhaustion [*] inability to restart [*] lockout [*] network amplification (e.g. storm) [*] data amplification [*] authentication credentials disclosure [*] obtain meta-data [*] decrypt data [*] determine filename existence [*] hide activities [*] hide attack source [*] disabled or weakened security feature [*] gain additional privileges, rights, roles, etc. [*] modify permissions or ACLs Each of these operations can be controlled (attacker has full control over the operation on the object), partially controlled, or uncontrolled. For an uncontrolled consequence, the attacker has no role except to take advantage of the consequence whenever it occurs. =============================================================== DEFS.CHANNELS. Channels Here are some examples of channels. Note that any channel can be an attack channel for some vulnerability. DEFS.CHANNELS.REM. Remote Channels Remote channels include: [*] user-to-server [*] server-to-consultant - e.g. 
RADIUS, DNS server lookups [*] user-to-intermediary DEFS.CHANNELS.LOCAL. Local Channels Local Channels include: [*] command line [*] process invocation [*] data file or object [*] file or directory name [*] file descriptor [*] profile (e.g. user name, GECOS field) [*] environment variable [*] signal or semaphore [*] registry [*] configuration file [*] keyboard device [*] mouse device [*] GUI API [*] alternate data stream [*] shared memory [*] mapped memory [*] Windows named pipe DEFS.CHANNELS.PHYS. Physical Channels Physical channels include: [*] serial port [*] keyboard [*] mouse [*] floppy disk [*] CD drive [*] USB device =============================================================== DEFS.ENDPOINTS. Endpoints Channels exist between two entities, which are called endpoints. The attacker must perform actions as one (or more) of these endpoints in order to exploit a vulnerability. USER: user of the product, possibly an administrator SERVICE: (or server). A networked or local service. OUTSIDER: an entity that may perform actions outside of the context of the product. For example, an attacker who sends a malicious URL via e-mail to exploit a web application vulnerability, is acting as an outsider. An attack that requires social engineering may involve an outsider. CONSULTANT: a separate entity that is used by a product to provide information that affects how the product operates. For example, a product uses a DNS server as a consultant in order to look up the IP address of a given hostname; a product that performs authentication might use a RADIUS or LDAP server as a consultant to verify that the provided credentials are correct. INTERMEDIARY: an entity that controls the channels between endpoints, possibly limiting the kinds of interactions that are allowed within accepted channels. Examples include a firewall, anti-virus product, proxy. 
Effectively, an intermediary splits a single channel between A and B into two channels - A to the intermediary, and the intermediary to B.

MONITOR: a monitor observes the data or actions that are used within the channel, but it is a passive observer. Examples include a sniffer, log file monitor, or intrusion detection system.

===============================================================
SECTION.4. [VC] Additional Vulnerability Concepts
===============================================================

****

VC.DIRLOC. Direction and Location of Channels

Note: this is a new concept that is still being refined.

Vulnerabilities that require complicated attack chains, especially those that involve more than two endpoints, can be further described in terms of the "direction" and "location" of the channels that are involved.

LOCATION: The LOCATION of a channel is relative to a particular endpoint and to the nature of the interaction when a vulnerable condition is being entered. The channel's location can be:

   EXTERNAL: out of the control of the endpoint, but involving data or steps that are relevant to the endpoint
   INTERNAL: involving the endpoint

DIRECTION: The DIRECTION of a channel is also relative to a particular endpoint and to the nature of the interaction when a vulnerable condition is being entered. The channel's direction can be:

   INCOMING: at the particular time, the interaction is being driven by the other end of the channel
   OUTGOING: at the particular time, the interaction is being driven by the endpoint itself
   TRANSIENT: the endpoint is a MONITOR and the interaction is occurring between two other endpoints

Note that the direction and location of a channel change depending on the endpoint from which it is viewed.

Consider an attack that involves a client exploiting a WIFF on a server. For the attacker, the channel would be OUTGOING and INTERNAL. For the server, the channel would be INCOMING and INTERNAL.
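
The way one interaction is labeled differently from each endpoint's perspective can be sketched in code. This is a simplified illustration under stated assumptions, not a complete model of the concept: the `Interaction` type and the `describe` function are hypothetical names, and the sketch covers only the cases defined above (the endpoint drives the interaction, receives it, or passively observes it as a MONITOR).

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    INCOMING = "INCOMING"
    OUTGOING = "OUTGOING"
    TRANSIENT = "TRANSIENT"


class Location(Enum):
    INTERNAL = "INTERNAL"
    EXTERNAL = "EXTERNAL"


@dataclass(frozen=True)
class Interaction:
    driver: str    # endpoint driving the interaction at this particular time
    receiver: str  # endpoint at the other end of the channel


def describe(interaction: Interaction, endpoint: str) -> tuple[Direction, Location]:
    """Label one interaction relative to the given endpoint."""
    if endpoint == interaction.driver:
        # The endpoint itself drives the interaction.
        return Direction.OUTGOING, Location.INTERNAL
    if endpoint == interaction.receiver:
        # The other end of the channel drives the interaction.
        return Direction.INCOMING, Location.INTERNAL
    # Assumption: any other endpoint is treated as a passive MONITOR
    # observing an interaction between two other endpoints.
    return Direction.TRANSIENT, Location.EXTERNAL


if __name__ == "__main__":
    attack = Interaction(driver="attacker", receiver="server")
    for who in ("attacker", "server", "sniffer"):
        direction, location = describe(attack, who)
        print(f"{who}: {direction.value}/{location.value}")
```

For the client-exploits-server example, the sketch labels the channel OUTGOING/INTERNAL for the attacker and INCOMING/INTERNAL for the server, matching the text. Cases in which a participating endpoint uses a channel outside the product are deliberately left out.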
Consider an attack in which an FTP server exploits a buffer overflow in a client by sending a long response to a request. The channel would be OUTGOING/INTERNAL for the server and INCOMING/INTERNAL for the victim.

Consider another case in which the attacker manipulates network traffic in a way that exploits a vulnerability in a sniffer. For the sniffer, the channel would be EXTERNAL/TRANSIENT as Monitor; for the attacker, the channel would be INTERNAL/OUTGOING as Outsider.

===============================================================
VC.MULTCHAN. Multi-Channel Attacks

Vulnerabilities can be viewed in terms of the channels and endpoints that are involved. Most vulnerabilities involve one channel - user-to-server over a network connection in remote cases, or user-to-user via a program execution in local cases. Other vulnerabilities, or their associated attacks, are multi-channel.

Consider a buffer overflow involving reverse DNS. The attacker connects to a target web server from a particular IP address, then has a DNS server send a long response when the target performs reverse resolution to get the domain name for the IP address.

Step 1, Channel 1: attacker-as-user to service: INTERNAL/OUTGOING.
Step 2, Channel 2: service to attacker-as-consultant: INTERNAL/OUTGOING.
Step 3, Channel 2: attacker-as-consultant to service: INTERNAL/OUTGOING.

Consider a more complicated example involving cross-site scripting. XSS can involve two or three channels, with three endpoints.

Suppose there is a WIFF in which the attacker uses a web service to inject HTML onto a page that is then viewed by all users of that application. The channels are:

[*] (1) attacker-as-user to service
[*] (2) service-to-user

When analyzing the service, the channels and attack steps are:

[*] 1. Channel 1: attacker-as-user to service: INCOMING/INTERNAL
[*] 2. Channel 2: service to user: OUTGOING/INTERNAL

When analyzing the user, the channels and attack steps are:

[*] 1. Channel 2: service to user: INCOMING/INTERNAL

When analyzing the attacker, the channels and attack steps are:

[*] 1. Channel 1: attacker-as-user to service: OUTGOING/INTERNAL

No other channels or steps are needed for the attacker before the WIFF is exploited.

Using the direction/location model, one can see one reason why XSS is common: it is easy for the attacker to exploit, being single-step and single-channel, but on the server side, there are two separate channels involved.

Now, consider the classic cross-site scripting issue in which the attacker must force a user to click on a link while the user is interacting with the product. There are still two channels:

(1) attacker-as-outsider to user
(2) user-to-server

However, the number of steps, and the directionality or location, differ.

When analyzing the service, the channels and attack steps are:

[*] 1. Channel 2: user to service: INCOMING/INTERNAL
[*] 2. Channel 2: service to user: OUTGOING/INTERNAL

When analyzing the user, the channels and attack steps are:

[*] 1. Channel 1: attacker-as-outsider to user: INCOMING/EXTERNAL
[*] 2. Channel 2: user to service: OUTGOING/INTERNAL
[*] 3. Channel 2: service to user: INCOMING/INTERNAL

When analyzing the attacker, the channels and attack steps are:

[*] 1. Channel 1: attacker-as-outsider to user: OUTGOING/EXTERNAL

There are some interesting observations here. First, the XSS attack is effectively launched by the user, in a trusted channel between the user and the server. This makes it more understandable why a programmer might allow an XSS problem in this context - users are not expected to attack themselves. Second, the attacker requires less interaction than any other endpoint, and the attacker doesn't even need to use an internal channel with the product. This is another explanation for why XSS appears so frequently.

Note that the above scenario also describes Cross-Site Request Forgery (CSRF) attacks.
The user is performing an action which, from the server perspective, is coming directly from the user. Further study is needed to determine whether this concept is useful in identifying more complex vulnerabilities and attack scenarios. =============================================================== VC.MFV. Multi-Factor Vulnerabilities (MFV) Vulnerabilities are often thought of as atomic entities. It is believed that there is a single fault in one place in the code (or its design), which opens one or more vectors for attack. However, many vulnerabilities are really combinations of multiple factors or problems, which include WIFFs, attack channels, and manipulations. Removal of just one of the factors usually results in the elimination of the vulnerability, or at least a reduction in the attack surface. Multi-Factor Vulnerabilities (MFVs) can be more complicated to prevent, find, and exploit than atomic vulnerabilities. They are also difficult to classify effectively, since currently available schemes treat vulnerabilities as if they are atomic. Understanding the role of multi-factor vulnerabilities is important in making improvements to existing terminology and classification. The classic MFV is symbolic link following. Factors may include: [*] permissions (the attacker must have access to a directory that the victim operates in; the product doesn't check the ownership of a file being written to) [*] filename predictability (the attacker knows, or can predict, the name of the file that will be accessed) [*] race condition [*] design factor: lack of built-in support for safe temporary file creation in most programming languages, lack of atomic operations for effectively creating symlinks Another common MFV is encoded path traversal. Consider the application that protects itself against "../" strings, but not against "%2e%2e%2fTARGET" strings. PHP remote file include vulnerabilities are also multi-factor. 
The product allows global variables to be modified, and the attacker must interact over two separate channels (one to interact directly with the product, and another to provide a malicious file). Unlike XSS, however, the attacker must control a Consultant endpoint in order to provide the malicious PHP file. ================================================================ SECTION.5. [TERMPROB] Problems with Existing Terminology ================================================================ Here are some of the problems with existing terminology in the vulnerability world. Note: The heavily-discussed subjects like "attack" versus "threat" versus "vulnerability" are avoided here. 1) The same term can be used to describe a WIFF, a manipulation, a vulnerability, or a consequence. "Buffer overflow" is the most obvious example. There are many WIFFs that can result in buffer overflows, such as format strings, off-by-one errors, integer signedness errors, and array index problems, not to mention the "classic" variants; however, they are all referred to as buffer overflows. At the same time, an attacker manipulation by crafting an extremely large input is not necessarily exploiting an overflow, although it may be called such. From an operational defensive standpoint, this distinction is usually immaterial; but for understanding vulnerabilities in terms of WIFFs, it is essential. Other multiple-use terms include "directory traversal," "authentication bypass," "path disclosure," and many others. 2) Due to their nature, multi-factor vulnerabilities and multi-channel attacks do not usually have a single term. In addition, often a single term will be used for a multi-factor vulnerability, which obscures the true nature of the issue. 3) The same manipulation could be useful in attacks on multiple WIFFs, which can cause people to label it as if it is one WIFF, when it could be another. 
The "buffer overflow" by long input is one example; a product may treat a long input as if it is an invalid value, but poor error handling could trigger a crash. Another example occurs when an attacker provides a "-1" argument that causes a crash, it could be due to an integer overflow, a signedness error, or other factors. It is likely that overflows and signedness errors are frequently reported as the wrong bug type. 4) The same consequence can result from a broad range of WIFFs, but some problems are only described in terms of their consequence. The most egregious example, by far, is "denial of service," which can be triggered by a wide variety of WIFFs, but the term also covers a variety of consequences, some of which may be unimportant or irrelevant to a system administrator. For a less obvious example, consider a null dereference, which could be the result of a parsing error (due to a missing argument), a failed memory allocation (due to an integer signedness error), or a state machine violation (due to an inability to detect out-of-order steps). 5) Some terms refer to manipulations, but terms do not exist for the associated vulnerabilities, WIFFs, or consequences. 6) Some terms are used in different ways for multiple WIFFs. For example, the "leak" term can refer to the disclosure of information, an error in reclaiming used resources ("memory leak"), or inadvertently providing a trusted resource to an untrusted entity ("file descriptor leak"). ============================================================= SECTION.6. [DIAG] Diagnostic Errors and Challenges ============================================================= Some diagnostic errors and challenges are covered in specific WIFF entries. Additional comments are below. ** DRAFT DRAFT DRAFT ** NOTE: this section has not been refined yet. **** DIAG.PRES. 
Lack of distinction between primary and resultant errors

A large percentage of researcher reports focus only on the resultant errors from particular manipulations. The researcher does not perform sufficient diagnosis to identify the primary factors or to ensure that all elements of a multi-factor vulnerability are known and understood.

**** DIAG.MANIP. Role of manipulations in diagnostic errors

The same manipulation could be used in multiple WIFFs. The same WIFF could have multiple manipulations.

Diagnostic errors are likely to occur with manipulations that can trigger different faults. For example, a "long input" could trigger a buffer overflow, a null dereference due to an invalid value, an unhandled error condition, or other factors.

There is a diagnostic difficulty in distinguishing between integer errors. For example, a "-1" input could lead to a signedness error or an integer overflow, but you can't label it as a signedness error just because a -1 was provided as the input. To make matters worse, sometimes a signedness error enables an integer overflow.

Also, manipulations in one data context (e.g. a "<" special character for XSS) could produce unrelated, resultant errors in another context which, if not diagnosed, mask a more serious underlying WIFF. For example, a SQL syntax error that's generated on a "<" XSS character injection could be an indicator of SQL injection.

**** DIAG.DOS. Insufficient diagnosis in "DoS" vulnerabilities

Most DoS vulnerabilities are not diagnosed to determine the associated WIFFs. The manipulations are often less structured, too. Thus, there is not much understanding of the underlying causes of DoS (i.e. the WIFFs).

Many vulnerabilities are described as crashes, which could be the result of infinite loops that cause memory allocation, eventually leading to an unhandled error condition, a null dereference, etc.
Some vulnerabilities that involve flooding attacks of large numbers of connections are reported to cause a crash, but the crash could be resultant from overflows in arrays that are used to manage the connections.

The use of fuzzers and fault injection, while powerful technologies, makes it easier for researchers to cause product instability without needing to know what manipulations caused it, or why.

**** DIAG.OBSC. Surface-level diagnosis obscuring the real problem

Product-external error message infoleaks can be the source of many diagnostic errors. They are often simply reported as infoleaks when the underlying WIFF is more serious.

In addition, a particular data manipulation could be directly inserted into the resulting message, which leads to a resultant WIFF. For example, an XSS manipulation might trigger a SQL syntax error that is product-external, and reflected directly back to the user with the XSS intact. This might be reported as XSS when it's really an indicator of some SQL problem. Or, an XSS manipulation might cause a fault because the product cannot handle *any* invalid values, not just XSS, but the invalid value is used in an external uncontrolled error message and thus appears to be primary XSS.

The author strongly suspects that many "XSS" vulnerabilities in SQL-friendly PHP applications are actually resultant XSS, from resultant infoleaks of SQL injection vulns, where the SQL syntax error message is reflected back to the user.

XSS/SQL is being used as an example here, but there are other similar problems.

These diagnostic errors happen quite frequently, but sometimes researchers do not publish enough relevant details to know whether the vulnerability is resultant or not.

=============================================================
SECTION.7.
[HOT] Hypotheses, Observations, and Theories
=============================================================

This is a very free-form collection of hypotheses, observations, and theories about vulnerabilities.

** DRAFT DRAFT DRAFT **

**** HOT.DOS. On "denial of service"

A key note: the phrase "denial of service" is often treated like it is a vulnerability. However, it is a CONSEQUENCE of the exploitation (or attempted exploitation) of a vulnerability. A variety of WIFFs can lead to a denial of service, and there are also many types of "denial of service". There is little research that tries to identify the underlying causes for "DoS."

**** HOT.IMM. On immutable vs. mutable

Some immutable data isn't critical (text color); but some critical data IS mutable (username upon login).

In some cases, an attacker can make something immutable and have an impact, e.g. changing permissions on a shared file so that others can't read it. This is not (currently) well-covered in PLOVER.

**** HOT.CLASS. On classification and taxonomies

Multi-factor vulnerabilities, by their nature, could fit into two or more separate categories, especially when they are multi-WIFF. Thus MFVs are good stress testers for any classification scheme.

**** HOT.COMPLEX. On vulnerability complexity

Theory: can vulnerability complexity - and/or attack complexity - be measured in terms of PLOVER concepts?

[*] number of attack channels the attacker has to control
[*] number of manipulations needing to be performed
[*] "popularity" of those manipulations (i.e. how often are those manipulations publicly reported?)
[*] number of WIFFs necessary for exploit
[*] minimum number of inputs required as part of the attack
[*] environmental / operational constraints

NOTE that the attack complexity is different from the vuln complexity. For example, symlinks and PHP file inclusion are both multi-factor vulns, but the attacks are usually very simple.
A buffer overflow involving an off-by-one error that requires a long hostname returned by DNS resolution is multi-channel but single-WIFF.

Might want to cover multi-input attacks somehow, e.g. a path traversal where you have to provide 2 parameters, one for directory and one for filename. This is at least a little more complex.

**** HOT.VULNS. Thoughts on vulnerabilities

Notice how argument injection in process invocation is similar to attribute injection in XSS variants. Both involve whitespace and "arguments." Some SQL injection exploits require whitespace as well.

The exploitation of a vulnerability can involve the introduction of invalid, malformed, or inconsistent input in one context, which is valid, well-formed, and consistent in another context. One example is SQL injection. A counter-example is a vuln that results in DoS. Maybe this ONLY APPLIES to vulns that cross security boundaries? DoS does NOT cross boundaries... or more precisely, it crosses different boundaries than code execution, privilege escalation, etc.

More vulnerabilities are multi-factor than you'd expect.

A product that's vulnerable to a single-factor issue is likely to be simultaneously vulnerable to multi-factor variants of that issue. For example, an application that blindly accepts "../" is likely to accept "%2e%2e%2f" and so on. However, once a product begins to perform cleansing, the new manipulations could be invalid for an older version of the product.

Web browser vulnerabilities are often multi-factor, multi-input, multi-step, and/or multi-channel.

Some kinds of "intentional" infoleaks don't require any manipulations by the attacker, and the attacker only needs to have a Monitor role in a particular channel. This is not yet well-covered in PLOVER.

The same vector or line of code can have multiple manipulations and WIFFs for completely different fault types. Consider the statement "open($filename)" in Perl.
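As a rough illustration of the open($filename) point, the following Python sketch labels a few ways that a string passed to Perl's two-argument open could be interpreted. The function name, the labels, and the sample inputs are hypothetical and are not part of PLOVER. The checks cover only a handful of well-known behaviors (pipe characters, output redirection prefixes, and "../" sequences) and are not a complete model of open.

```python
from typing import List


def classify_perl_open_arg(filename: str) -> List[str]:
    """Label possible interpretations of a string given to Perl's
    two-argument open(). Hypothetical helper; the labels are illustrative
    and the checks are deliberately incomplete."""
    labels: List[str] = []
    # Two-argument open ignores leading and trailing whitespace.
    trimmed = filename.strip()

    # A leading or trailing "|" causes a command to be run.
    if trimmed.startswith("|") or trimmed.endswith("|"):
        labels.append("command execution")

    # A leading ">" (or ">>") opens the file for writing or appending.
    if trimmed.startswith(">"):
        labels.append("file overwrite or append")

    # "../" sequences can move outside the intended directory.
    if "../" in trimmed:
        labels.append("path traversal")

    if not labels:
        labels.append("plain read")
    return labels


if __name__ == "__main__":
    for sample in ["report.txt", "id |", "> ../../tmp/out", "  |mail"]:
        print(repr(sample), "->", classify_perl_open_arg(sample))
```

The same line of code yields a different fault type for each input, and one input can carry more than one manipulation at the same time.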
Note how vulnerability "variants" often require different manipulations to exploit.

The role of context switching should be examined more closely; it is present in many vulns.

**** HOT.DESIMP. Design vs. Implementation

Thoughts on whether particular vulns are design vs. implementation - sometimes you can't tell without knowing the developer's intentions!

Also, some non-traditional implementation bugs are the result of failure to implement security mechanisms as required by the designs, e.g. "basic constraints" certificates.

Design decisions play a role in many vulnerabilities, if not all. This is especially the case with MFVs. Programming language design plays a major role in WIFFs.

Theory: every implementation bug is multi-factor - at least one fault, which is effectively enabled by at least one design flaw or weakness. (hmmm need to rephrase this, but I know what I mean)

**** HOT.POP. Popularity of Some Vulnerabilities

Why are buffer overflows still so common today? New faults are discovered... multi-step and multi-input attacks are being found... new manipulations are being discovered.

Why is XSS so common? See the Alternate Channels sections for more specific details, but... It's multi-channel, so developers don't think of the attack. The exploit is single-path, so it's easy for researchers to find. There are many different manipulations that can bypass the more obvious protections, and at least some of the XSS that's reported is really resultant XSS instead of primary XSS, e.g. when an XSS manipulation triggers an SQL error due to invalid syntax.

**** HOT.RESPRI. Resultant and Primary Vulns

A resultant vuln in one context could be primary in another. For example, suppose a researcher finds 2 issues. Issue 1 allows the attacker to gain extra privileges, but not administrator privileges. Then, in Issue 2, the attacker can use those extra privileges to gain administrator privileges.
Issue 2 is resultant from Issue 1, but it is also independent of it; if Issue 1 did not exist, then Issue 2 would still be a problem.

It would be very useful to identify the relationships between primary and resultant WIFFs. E.g. buffer overflow can be a resultant vuln of format string, signedness error, etc.; XSS is a resultant vuln of SQL injection if the manipulation contains XSS and the SQL engine generates an error.

**** HOT.STD. Standards vs. Non-standards

The lack of standards compliance is a MAJOR FACTOR in interaction errors, especially multiple interpretation errors. This makes the job of monitors and intermediaries extremely difficult to do correctly.

**** HOT.EVOL. Evolution of Security of a Product

This is based on observations.

Initial vulnerability reports for a product involve the most obvious entry points and the most obvious WIFFs, e.g. buffer overflows in username/pass, subject lines, etc., or basic "../" path traversal.

As the product matures, more complex manipulations, multiple manipulations, or alternate channels may be required. Less obvious entry points are found. For example, once all the commands of a product have been tested, what about file format manipulations of the files that it processes?

The most mature, well-tested product is only subject to rare kinds of WIFFs, or entirely new classes of WIFFs.

**** HOT.CODE. On the State of Code Analysis

Code analysis technologies have different focuses.

- fuzzers seem focused on data manipulations, but not step manipulations
- code auditing tools are fault focused

**** HOT.BUFF. Buffer Overflows, Today and Yesterday

Most of yesterday's "classic" buffer overflows are single-input and single-channel. The attacker fills a single field with long input of any set of characters, the program blindly accepts the input, and it crashes or executes code.

Many of today's "classic" buffer overflows are multi-factor.
While classic "blind unbounded copy" buffer overflows still exist today, there are many multi-factor vulnerabilities today that are also referred to as "buffer overflows."

One common MFV overflow involves the input field and a length field, in which the attacker modifies the length field and provides an input field whose actual length is inconsistent with the length field. Integer overflows can be one factor of an MFV in this scenario.

Another MFV overflow example is an off-by-one error that overwrites the terminating null character of a string, which effectively causes the string to be larger than expected, even when the programmer has otherwise kept very close track of string lengths. The factors involved here include the off-by-one error itself (possibly made easier by the design factor of 0-based vs. 1-based array indexing), the design factor in C of using terminator characters for strings, and the fault during execution, i.e. that a large input is copied into a small buffer.

Other MFV overflows can include "expansion-based" buffer overflows, in which the attacker provides special inputs that are translated into larger strings (think "&" to "&amp;" in web applications), or overflows that involve long sequences of special characters that cause the parser to lose track of where it is in the buffer that it is writing to.

Note that none of these MFV overflows are easily detectable using brute force black-box techniques. Each requires inputs that are more well-crafted than a long string of "A" characters followed by shellcode. This demonstrates how MFVs can have more complicated exploit scenarios, and it might explain why most MFV overflows are only found by the top researchers.

===================================================================
SECTION.8. [GENESIS] Genesis of Vulnerabilities
===================================================================

This section identifies specific phases of the software life cycle.
Contrary to popular opinion, most vulnerabilities can be introduced during any of several phases. However, some vulnerabilities do tend to appear in one phase or another.

The phases include:

[*] design
[*] implementation
[*] bundling
[*] distribution
[*] installation
[*] configuration
[*] documentation
[*] patch
[*] removal

**** GENESIS.DESIGN. Design

Note: this seems under-studied, especially with respect to classification of design flaws. Most "design limitations" or "design errors" are probably covered by other vulnerability categories. It is the author's belief that many implementation bugs are enabled by design flaws.

Common problems in this phase include:

[*] introduction of many WIFFs
[*] failure to introduce design elements or patterns that minimize the likelihood and risk of classes of implementation errors (e.g. "use lookup table for valid values" to avoid special character, MAID, and overflow errors)
[*] Incomplete specification, leading to interpretation errors
[*] Vague specification, leading to multiple interpretation errors
[*] Lack of support for security-relevant options
[*] Required adherence to an insecure standard. For example, the DOCSIS standard has certain design flaws, as does IP/TCP/UDP/ICMP.

**** GENESIS.IMPLEMENTATION. Implementation

WIFFs in this phase are well-covered by PLOVER.

**** GENESIS.TESTING. Testing

Common problems in this phase include practices that make testing more efficient:

[*] introducing back doors to facilitate testing.
    CVE-2002-1272 - back door intended for development accidentally left enabled in production
[*] leaving in debugging code.
    CVE-2001-0528 - debugging version of DLL logs plaintext password.
    CVE-1999-0095 - debug command in product left enabled
[*] using insecure configuration.
    CAN-2003-0983 - default settings that should have been disabled by the vendor include a user account and an open TCP port

**** GENESIS.BUNDLING. Bundling Phase.
A product may have dependencies on third-party products or libraries that need to be bundled or made available on the end system for proper functioning.

Common problems in this phase include:

[*] The bundled product itself may have vulnerabilities. Exploitation might require proxied channels through the main product, or direct channels with the bundled product.
[*] There may be interaction errors between the main product and the bundled product, such as behavioral changes.

Examples:

CAN-2005-2385 - AV product uses a third-party library that contains directory traversal and buffer overflow issues

**** GENESIS.DISTRIB. Distribution

Common problems in this phase include:

[*] not undoing modifications from the testing phase (debugging code, back doors, insecure configuration)
[*] not providing a mechanism for integrity checking of the software. This is especially problematic for automatic download or update.
    - CVE-2002-0671, CVE-2002-0676, CAN-2001-1125, CAN-2003-0237 - product downloads executables from a web site but does not verify integrity of the executables, allowing malicious injection using DNS spoofing
[*] introduction of embedded malicious code at the distribution point
    - CAN-2002-1840 - backdoor in the configuration file of an IRC client downloaded from compromised site
    - CAN-2002-2049 - configure compilation script modified at distribution point

**** GENESIS.INSTALL. Installation Phase

Common problems in this phase include:

[*] insecure permissions
[*] undeleted temporary files containing cleartext sensitive information
[*] WIFFs in the installation scripts themselves, e.g. symlink following in shell scripts

**** GENESIS.PATCH.
Patch Error

Common problems in this phase include:

[*] regression error: an old vulnerability is introduced into new code
    - CAN-2005-2158, CAN-2005-1937
    - CAN-2002-1233 - regression error enables symlink
    - CAN-2005-1649 - regression error of "Land" vulnerability (spoofed packet, self-referencing manipulation, infinite loop)
[*] overwrite of security patch with older patch
    - CAN-2002-1670 - upgrade overwrites previous security-relevant patches
[*] interaction errors with other patches
[*] overwrite of configuration to less secure options
[*] WIFFs that arise from the patching process itself
[*] incomplete vulnerability fix. Typically this involves fixing a specific WIFF but not considering other manipulations, alternate channels, etc.
    - CAN-2005-0206 - incomplete patch misses 64-bit architecture
[*] other errors
    - CVE-1999-1047 - patches applied in a particular sequence allow firewall bypass and do not log events

**** GENESIS.DOC. Documentation Error

Common problems in this phase include:

[*] Omission of security-critical information
[*] Error/typo causes user to introduce a vulnerability or risk
[*] Specific recommendation of insecure practices

**** GENESIS.PORT. Porting

A product may be ported to a different environment (e.g. OS, language, or hardware platform). The port must account for differences from the original environment; otherwise, vulnerabilities may be introduced that are specific to the new environment.

For example, a product that was originally developed and secured on Unix could be ported to a Windows platform and become subject to very basic Windows-specific bugs, e.g. directory traversal using "\" instead of "/". The reverse is also true, of course, although examples are not immediately available.

Common types of ports are:

[*] port to different OS
[*] port to different hardware / architecture (e.g. chip)
[*] port to different programming language
[*] port from single-user to multi-user
[*] port from non-networked to networked

**** GENESIS.CONFIG.
Configuration

Note: configuration errors are vastly under-studied, especially in terms of classification. They can be more complex than vulnerabilities, which are often discrete and easily separable. In addition, configuration overlaps with the general area of "policy," which can have elements that are not always considered to be relevant to security.

Common configuration problems include:

[*] Default password
[*] Default, non-essential service or component
[*] Default less-secure operating mode
[*] Administrator capability accessible to arbitrary hosts

**** GENESIS.OPENV. Genesis - Operating Environment

The product might be deployed into an operating environment or context that violates its most basic assumptions, introducing entire classes of WIFFs that were not previously relevant.

For example, a program designed for local users might be called from a CGI wrapper, thus rendering all inputs under possible control by an untrusted party.

Common operating environment changes are:

[*] make program setuid
[*] port local program to networked
[*] single-user to multi-user environment

============================================================
SECTION.9. [WIFF] WIFFs: Weaknesses, Idiosyncrasies, Faults, Flaws
============================================================

The bulk of this document covers a large variety of WIFFs, with a large number of real-world vulnerability examples. The order of presentation, and the categorization implied by the different sections, is not intended to be authoritative.

Each WIFF attempts to include a definition, notes on terminology, research gaps, common overlap with other WIFFs, and other information.

Two or three examples are provided for each WIFF. For many WIFFs, an appendix lists additional examples that further illustrate the subtlety and variety of vulnerabilities. Multi-factor examples may be included.
The examples use CVE identifiers (CVE-yyyy-nnnn or CAN-yyyy-nnnn) for specific vulnerabilities that demonstrate the given category. The identifiers can be accessed from the search form at http://cve.mitre.org/cve

Following is a summary of the main categories.

[BUFF] Buffer overflows, format strings, etc.

    Buffer Boundary Violations ("buffer overflow"), Unbounded Transfer ("classic overflow"), Boundary beginning violation ("buffer underflow" ?), Out-of-bounds Read, Buffer over-read, Buffer under-read, Array index overflow, Length Parameter Inconsistency, Other length calculation error, Format string vulnerability

[SVM] Structure and Validity Problems

    Missing Value Error, Missing Parameter Error, Missing Element Error, Extra Value Error, Extra Parameter Error, Undefined Parameter Error, Undefined Value Error, Wrong Data Type, Incomplete Element, Inconsistent Elements

[SPEC] Special Elements (Characters or Reserved Words)

    General Special Element Problems, Parameter Delimiter, Value Delimiter, Record Delimiter, Line Delimiter, Section Delimiter, Input Terminator, Input Leader, Quoting Element, Escape, Meta, or Control Character / Sequence, Comment Element, Macro Symbol, Substitution Character, Variable Name Delimiter, Wildcard or Matching Element, Whitespace, Grouping Element / Paired Delimiter, Delimiter between Expressions or Commands, Null Character / Null Byte

[SPECM] Common Special Element Manipulations

    Special Element Injection, Equivalent Special Element Injection, Leading Special Element, Multiple Leading Special Elements, Trailing Special Element, Multiple Trailing Special Elements, Internal Special Element, Multiple Internal Special Element, Missing Special Element, Extra Special Element, Inconsistent Special Elements

[SPECTS] Technology-Specific Special Elements

    Cross-site scripting (XSS), Basic XSS, XSS in error pages, Script in IMG tags, XSS using Script in Attributes, XSS using Script Via Encoded URI Schemes, Doubled character XSS manipulations, e.g. "<