Weakness ID: 134 (Weakness Base)
| Status | Draft |
| Description | Summary: The software uses externally-controlled format strings in printf-style functions, which can lead to buffer overflows or data representation problems. |
| Functional Area | logging, errors, general output |
| Likelihood of Exploit | Very High |
| Weakness Ordinality | Primary (Weakness exists independent of other weaknesses) |
| Causal Nature | Implicit (This is an implicit weakness) |
| Affected Resource | Memory |
| Common Consequences | Confidentiality: Format string problems allow information disclosure, which can greatly simplify exploitation of the program.
Access Control: Format string problems can result in the execution of arbitrary code. |
| Potential Mitigations | Requirements specification: Choose a language which is not subject to this flaw.
Implementation: Ensure that all format string functions are passed a static string which cannot be controlled by the user, and that the proper number of arguments is always passed to that function as well. If at all possible, do not use the %n operator in format strings.
Build: Heed the warnings of compilers and linkers, since they may alert you to improper usage. |
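The implementation and build mitigations above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `log_format`, `log_untrusted`, and the `PRINTF_LIKE` macro are hypothetical names introduced here, and the `format` attribute is a GCC/Clang extension that is assumed to be available (the macro expands to nothing on other compilers).

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumption: GCC or Clang. The attribute asks the compiler to check every
 * call to the wrapper the same way it checks calls to printf(). */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* Hypothetical wrapper: formats into a caller-supplied buffer. */
int log_format(char *out, size_t out_size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int log_format(char *out, size_t out_size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(out, out_size, fmt, ap);
    va_end(ap);
    return n;
}

/* The format string is a literal; the untrusted text is only ever passed
 * as data for %s, so directives inside it are never interpreted. */
int log_untrusted(char *out, size_t out_size, const char *untrusted)
{
    return log_format(out, out_size, "user input: %s", untrusted);
}
```

With this shape, a call such as `log_untrusted(buf, sizeof buf, "%x%n")` stores the characters `%x%n` verbatim. Building with warnings such as `-Wformat -Wformat-security` (GCC and Clang) may additionally flag calls that pass a non-literal format string with no arguments.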
| Demonstrative Examples | The following example is exploitable because of the printf() call in the printWrapper() function. Note: The stack buffer was added to make exploitation simpler. C Example:
    #include <stdio.h>
    #include <string.h>  /* memcpy() */

    void printWrapper(char *string) {
        printf(string);  /* VULNERABLE: caller-supplied data is used as the format string */
    }

    int main(int argc, char **argv) {
        char buf[5012];
        memcpy(buf, argv[1], 5012);
        printWrapper(argv[1]);
        return (0);
    }
The following code copies a command line argument into a buffer using snprintf().
    int main(int argc, char **argv){
        char buf[128];
        ...
        snprintf(buf,128,argv[1]);
    }
This code allows an attacker to view the contents of the stack and write to the
stack using a command line argument containing a sequence of formatting directives. The
attacker can read from the stack by providing more formatting directives, such as %x, than
the function takes as arguments to be formatted. (In this example, the function takes no
arguments to be formatted.) By using the %n formatting directive, the attacker can write
to the stack, causing snprintf() to write the number of bytes output thus far to the
specified argument (rather than reading a value from the argument, which is the intended
behavior). A sophisticated version of this attack will use four staggered writes to
completely control the value of a pointer on the stack.
Certain implementations make more advanced attacks even easier by providing format
directives that control the location in memory to read from or write to. An example of
these directives is shown in the following code, written for glibc:
    printf("%d %d %1$d %1$d\n", 5, 9);
This code produces the following output:
    5 9 5 5
It is also possible to use half-writes (%hn) to accurately control arbitrary DWORDS in memory, which greatly reduces the complexity needed to execute an attack that would otherwise require four staggered writes, such as the one described above. |
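The positional directives described in the last example can be tried safely with a constant format string. The sketch below is illustrative only: `format_positional` is a hypothetical helper, it assumes a POSIX-style printf implementation such as glibc's, and it uses numbered directives throughout because POSIX leaves the result of mixing numbered and unnumbered directives undefined.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats two integers with numbered (positional)
 * directives. "%1$d" always refers to the first argument after the format
 * string, regardless of where the directive appears. Assumes a POSIX-style
 * printf implementation such as glibc's. */
int format_positional(char *out, size_t out_size, int first, int second)
{
    return snprintf(out, out_size, "%1$d %2$d %1$d %1$d", first, second);
}
```

Calling `format_positional(out, sizeof out, 5, 9)` on glibc should leave `5 9 5 5` in `out`. An attacker who controls the format string gets the same ability to pick which stack slot a directive reads from or, with %n, writes through.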
| Observed Examples | | Reference | Description |
|---|
| CVE-2002-1825 | format string in Perl program |
| CVE-2001-0717 | format string in bad call to syslog function |
| CVE-2002-0573 | format string in bad call to syslog function |
| CVE-2002-1788 | format strings in NNTP server responses |
| CVE-2007-2027 | Chain: untrusted search path enabling resultant format string by loading malicious internationalization messages |
|
| Context Notes | While Format String vulnerabilities typically fall under the Buffer Overflow category,
technically they are not overflowed buffers. The Format String vulnerability is fairly new (circa
1999) and stems from the fact that there is no realistic way for a function that takes a variable
number of arguments to determine just how many arguments were passed in. The most common functions
that take a variable number of arguments, including C-runtime functions, are the printf() family
of calls. The Format String problem appears in a number of ways. A *printf() call without a format
specifier is dangerous and can be exploited. For example, printf(input); is exploitable, while printf(y, input); is not exploitable in that context, provided y is a format string the attacker cannot control. In the first call, the input string itself is used as the format string, which allows an attacker to peek at stack memory. The attacker can stuff the input string with format specifiers and begin reading stack values, since the arguments those specifiers expect will be pulled from the stack.
Worst case, this improper use may give away enough control to allow an arbitrary value (or values, in the case of an exploit program) to be written into the memory of the running program.
Frequently targeted entities are file names, process names, and identifiers.
Format string problems are a classic C/C++ issue that is now rare due to the ease of discovery. Format string vulnerabilities can be exploited because of the %n operator. The %n operator writes the number of characters printed by the format string thus far to the memory pointed to by its argument. Through skilled creation of a format string, a malicious user may use values on the stack to create a write-what-where condition. Once this is achieved, the attacker can execute arbitrary code. |
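The behavior of %n described above can be observed in a controlled setting where the format string is a literal and the pointer argument is supplied by the programmer. This is only a sketch of the semantics: `chars_before_marker` is a hypothetical name, and some C libraries restrict or disable %n, so the behavior is assumed to match a typical glibc build.

```c
#include <stdio.h>

/* Hypothetical demonstration of %n with a constant format string.
 * %n prints nothing; it stores the number of characters produced so far
 * into the int that its argument points to. */
int chars_before_marker(void)
{
    char out[32];
    int written = -1;

    /* "hello" (5 characters) precedes %n, so 5 is stored in written. */
    snprintf(out, sizeof out, "hello%n world", &written);
    return written;
}
```

In an attack, the format string comes from the attacker and no matching argument was passed, so the pointer that %n writes through is whatever value happens to occupy the corresponding argument slot.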
| Research Gaps | Format string issues are under-studied for languages other than C. Memory or disk
consumption, control flow or variable alteration, and data corruption may result from format
string exploitation in applications written in other languages such as Perl, PHP, Python, etc.
Since format strings are often used in rarely occurring error conditions, it is highly likely that many latent issues exist in executables that do not have associated source code (or equivalent source). |
| References | |
| Relationships | |
| Source Taxonomies | PLOVER: Format string vulnerability; 7 Pernicious Kingdoms: Format String; CLASP: Format string problem |
| Applicable Platforms | All |
| Time of Introduction | Implementation |
| Related Attack Patterns | | CAPEC-ID | Attack Pattern Name |
|---|
| 67 | String Format Overflow in syslog() |
|
| White Box Definition | A weakness where the code path has:
1. start statement that accepts input
2. end statement that passes a format to format output function where
   a. the input data is part of the format and
   b. the format is undesirable
Where “undesirable” is defined through the following scenarios:
1. not validated
2. incorrectly validated
|
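As a rough illustration of the "validated" case in the definition above, a program that cannot avoid building a format from input might at least reject input containing conversion directives. The helper below is a hypothetical sketch, not part of any library, and passing input as data to a constant format string remains the preferred fix.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical validator: returns true if s contains a '%' that is not
 * part of the literal "%%" escape, i.e. something a printf-style function
 * would try to interpret as a conversion directive. */
bool has_format_directive(const char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] != '%')
            continue;
        if (s[i + 1] == '%') {
            i++;        /* skip the second half of an escaped "%%" */
            continue;
        }
        return true;    /* a lone '%' starts a directive (or is malformed) */
    }
    return false;
}
```

A check like this treats `%x` and a trailing `%` as directives and accepts `100%% done`. Validation that misses one of these cases is an instance of the "incorrectly validated" scenario.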