CWE-789: Memory Allocation with Excessive Size Value
Weakness ID: 789
Vulnerability Mapping:
ALLOWEDThis CWE ID may be used to map to real-world vulnerabilities Abstraction: VariantVariant - a weakness that is linked to a certain type of product, typically involving a specific language or technology. More specific than a Base weakness. Variant level weaknesses typically describe issues in terms of 3 to 5 of the following dimensions: behavior, property, technology, language, and resource.
View customized information:
For users who are interested in more notional aspects of a weakness. Example: educators, technical writers, and project/program managers.For users who are concerned with the practical application and details about the nature of a weakness and how to prevent it from happening. Example: tool developers, security researchers, pen-testers, incident response analysts.For users who are mapping an issue to CWE/CAPEC IDs, i.e., finding the most appropriate CWE for a specific issue (e.g., a CVE record). Example: tool developers, security researchers.For users who wish to see all available information for the CWE/CAPEC entry.For users who want to customize what details are displayed.
×
Edit Custom Filter
Description
The product allocates memory based on an untrusted, large size value, but it does not ensure that the size is within expected limits, allowing arbitrary amounts of memory to be allocated.
Alternate Terms
Stack Exhaustion:
When a weakness allocates excessive memory on the stack, it is often described as "stack exhaustion," which is a technical impact of the weakness. This technical impact is often encountered as a consequence of CWE-789 and/or CWE-1325.
Common Consequences
This table specifies different individual consequences associated with the weakness. The Scope identifies the application security area that is violated, while the Impact describes the negative technical impact that arises if an adversary succeeds in exploiting this weakness. The Likelihood provides information about how likely the specific consequence is expected to be seen relative to the other consequences in the list. For example, there may be high likelihood that a weakness will be exploited to achieve a certain impact, but a low likelihood that it will be exploited to achieve a different impact.
Not controlling memory allocation can result in a request for too much system memory, possibly leading to a crash of the application due to out-of-memory conditions, or the consumption of a large amount of memory on the system.
Potential Mitigations
Phases: Implementation; Architecture and Design
Perform adequate input validation against any value that influences the amount of memory that is allocated. Define an appropriate strategy for handling requests that exceed the limit, and consider supporting a configuration option so that the administrator can extend the amount of memory to be used if necessary.
Phase: Operation
Run your program using system-provided resource limits for memory. This might still cause the program to crash or exit, but the impact to the rest of the system will be minimized.
Relationships
This table shows the weaknesses and high level categories that are related to this weakness. These relationships are defined as ChildOf, ParentOf, MemberOf and give insight to similar items that may exist at higher and lower levels of abstraction. In addition, relationships such as PeerOf and CanAlsoBe are defined to show similar weaknesses that the user may want to explore.
Relevant to the view "Research Concepts" (CWE-1000)
Nature
Type
ID
Name
ChildOf
Base - a weakness
that is still mostly independent of a resource or technology, but with sufficient details to provide specific methods for detection and prevention. Base level weaknesses typically describe issues in terms of 2 or 3 of the following dimensions: behavior, property, technology, language, and resource.
Base - a weakness
that is still mostly independent of a resource or technology, but with sufficient details to provide specific methods for detection and prevention. Base level weaknesses typically describe issues in terms of 2 or 3 of the following dimensions: behavior, property, technology, language, and resource.
Variant - a weakness
that is linked to a certain type of product, typically involving a specific language or technology. More specific than a Base weakness. Variant level weaknesses typically describe issues in terms of 3 to 5 of the following dimensions: behavior, property, technology, language, and resource.
Base - a weakness
that is still mostly independent of a resource or technology, but with sufficient details to provide specific methods for detection and prevention. Base level weaknesses typically describe issues in terms of 2 or 3 of the following dimensions: behavior, property, technology, language, and resource.
Base - a weakness
that is still mostly independent of a resource or technology, but with sufficient details to provide specific methods for detection and prevention. Base level weaknesses typically describe issues in terms of 2 or 3 of the following dimensions: behavior, property, technology, language, and resource.
The different Modes of Introduction provide information about how and when this weakness may be introduced. The Phase identifies a point in the life cycle at which introduction may occur, while the Note provides a typical scenario related to introduction during the given phase.
Phase
Note
Implementation
Applicable Platforms
This listing shows possible areas for which the given weakness could appear. These may be for specific named Languages, Operating Systems, Architectures, Paradigms, Technologies, or a class of such platforms. The platform is listed along with how frequently the given weakness appears for that instance.
Languages
C (Undetermined Prevalence)
C++ (Undetermined Prevalence)
Class: Not Language-Specific (Undetermined Prevalence)
Demonstrative Examples
Example 1
Consider the following code, which accepts an untrusted size value and allocates a buffer to contain a string of the given size.
(bad code)
Example Language: C
unsigned int size = GetUntrustedInt(); /* ignore integer overflow (CWE-190) for this example */
This will cause 305,419,896 bytes (over 291 megabytes) to be allocated for the string.
Example 2
Consider the following code, which accepts an untrusted size value and uses the size as an initial capacity for a HashMap.
(bad code)
Example Language: Java
unsigned int size = GetUntrustedInt(); HashMap list = new HashMap(size);
The HashMap constructor will verify that the initial capacity is not negative, however there is no check in place to verify that sufficient memory is present. If the attacker provides a large enough value, the application will run into an OutOfMemoryError.
Example 3
This code performs a stack allocation based on a length calculation.
(bad code)
Example Language: C
int a = 5, b = 6;
size_t len = a - b;
char buf[len]; // Just blows up the stack
}
Since a and b are declared as signed ints, the "a - b" subtraction gives a negative result (-1). However, since len is declared to be unsigned, len is cast to an extremely large positive number (on 32-bit systems - 4294967295). As a result, the buffer buf[len] declaration uses an extremely large size to allocate on the stack, very likely more than the entire computer's memory space.
Miscalculations usually will not be so obvious. The calculation will either be complicated or the result of an attacker's input to attain the negative value.
Example 4
This example shows a typical attempt to parse a string with an error resulting from a difference in assumptions between the caller to a function and the function's action.
(bad code)
Example Language: C
int proc_msg(char *s, int msg_len)
{
// Note space at the end of the string - assume all strings have preamble with space
int pre_len = sizeof("preamble: ");
char buf[pre_len - msg_len]; ... Do processing here if we get this far
}
char *s = "preamble: message\n";
char *sl = strchr(s, ':'); // Number of characters up to ':' (not including space)
int jnklen = sl == NULL ? 0 : sl - s; // If undefined pointer, use zero length
int ret_val = proc_msg ("s", jnklen); // Violate assumption of preamble length, end up with negative value, blow out stack
The buffer length ends up being -1, resulting in a blown out stack. The space character after the colon is included in the function calculation, but not in the caller's calculation. This, unfortunately, is not usually so obvious but exists in an obtuse series of calculations.
Example 5
The following code obtains an untrusted number that is used as an index into an array of messages.
(bad code)
Example Language: Perl
my $num = GetUntrustedNumber(); my @messages = ();
$messages[$num] = "Hello World";
The index is not validated at all (CWE-129), so it might be possible for an attacker to modify an element in @messages that was not intended. If an index is used that is larger than the current size of the array, the Perl interpreter automatically expands the array so that the large index works.
If $num is a large value such as 2147483648 (1<<31), then the assignment to $messages[$num] would attempt to create a very large array, then eventually produce an error message such as:
Out of memory during array extend
This memory exhaustion will cause the Perl program to exit, possibly a denial of service. In addition, the lack of memory could also prevent many other programs from successfully running on the system.
Example 6
This example shows a typical attempt to parse a string with an error resulting from a difference in assumptions between the caller to a function and the function's action. The buffer length ends up being -1 resulting in a blown out stack. The space character after the colon is included in the function calculation, but not in the caller's calculation. This, unfortunately, is not usually so obvious but exists in an obtuse series of calculations.
(bad code)
Example Language: C
int proc_msg(char *s, int msg_len)
{
int pre_len = sizeof("preamble: "); // Note space at the end of the string - assume all strings have preamble with space
char buf[pre_len - msg_len];
... Do processing here and set status
return status;
}
char *s = "preamble: message\n";
char *sl = strchr(s, ':'); // Number of characters up to ':' (not including space)
int jnklen = sl == NULL ? 0 : sl - s; // If undefined pointer, use zero length
int ret_val = proc_msg ("s", jnklen); // Violate assumption of preamble length, end up with negative value, blow out stack
(good code)
Example Language: C
int proc_msg(char *s, int msg_len)
{
int pre_len = sizeof("preamble: "); // Note space at the end of the string - assume all strings have preamble with space
}
char *s = "preamble: message\n";
char *sl = strchr(s, ':'); // Number of characters up to ':' (not including space)
int jnklen = sl == NULL ? 0 : sl - s; // If undefined pointer, use zero length
int ret_val = proc_msg ("s", jnklen); // Violate assumption of preamble length, end up with negative value, blow out stack
Chain: Python library does not limit the resources used to process images that specify a very large number of bands (CWE-1284), leading to excessive memory consumption (CWE-789) or an integer overflow (CWE-190).
large Content-Length HTTP header value triggers application crash in instant messaging application due to failure in memory allocation
Weakness Ordinalities
Ordinality
Description
Primary
(where the weakness exists independent of other weaknesses)
Resultant
(where the weakness is typically related to the presence of some other weaknesses)
Detection Methods
Fuzzing
Fuzz testing (fuzzing) is a powerful technique for generating large numbers of diverse inputs - either randomly or algorithmically - and dynamically invoking the code with those inputs. Even with random inputs, it is often capable of generating unexpected results such as crashes, memory corruption, or resource consumption. Fuzzing effectively produces repeatable test cases that clearly indicate bugs, which helps developers to diagnose the issues.
Effectiveness: High
Automated Static Analysis
Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially-vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.)
Effectiveness: High
Memberships
This MemberOf Relationships table shows additional CWE Categories and Views that reference this weakness as a member. This information is often useful in understanding where a weakness fits within the context of external information sources.
Nature
Type
ID
Name
MemberOf
Category - a CWE entry that contains a set of other entries that share a common characteristic.
(this CWE ID could be used to map to real-world vulnerabilities)
Reason: Acceptable-Use
Rationale:
This CWE entry is at the Variant level of abstraction, which is a preferred level of abstraction for mapping to the root causes of vulnerabilities.
Comments:
Carefully read both the name and description to ensure that this mapping is an appropriate fit. Do not try to 'force' a mapping to a lower-level Base/Variant simply to comply with this preferred level of abstraction.
Notes
Relationship
This weakness can be closely associated with integer overflows (CWE-190). Integer overflow attacks would concentrate on providing an extremely large number that triggers an overflow that causes less memory to be allocated than expected. By providing a large value that does not trigger an integer overflow, the attacker could still cause excessive amounts of memory to be allocated.
Applicable Platform
Uncontrolled memory allocation is possible in many languages, such as dynamic array allocation in perl or initial size parameters in Collections in Java. However, languages like C and C++ where programmers have the power to more directly control memory management will be more susceptible.
Taxonomy Mappings
Mapped Taxonomy Name
Node ID
Fit
Mapped Node Name
WASC
35
SOAP Array Abuse
CERT C Secure Coding
MEM35-C
Imprecise
Allocate sufficient memory for an object
SEI CERT Perl Coding Standard
IDS32-PL
Imprecise
Validate any integer that is used as an array index
[REF-62] Mark Dowd, John McDonald
and Justin Schuh. "The Art of Software Security Assessment". Chapter 10, "Resource Limits", Page 574. 1st Edition. Addison Wesley. 2006.