CWE-116: Improper Encoding or Escaping of Output
View customized information:
For users who are interested in more notional aspects of a weakness. Example: educators, technical writers, and project/program managers.
For users who are concerned with the practical application and details about the nature of a weakness and how to prevent it from happening. Example: tool developers, security researchers, pen-testers, incident response analysts.
For users who are mapping an issue to CWE/CAPEC IDs, i.e., finding the most appropriate CWE for a specific issue (e.g., a CVE record). Example: tool developers, security researchers.
For users who wish to see all available information for the CWE/CAPEC entry.
For users who want to customize what details are displayed.
×
Edit Custom FilterThe product prepares a structured message for communication with another component, but encoding or escaping of the data is either missing or done incorrectly. As a result, the intended structure of the message is not preserved.
Improper encoding or escaping can allow attackers to change the commands that are sent to another component, inserting malicious commands instead. Most products follow a certain protocol that uses structured messages for communication between components, such as queries or commands. These structured messages can contain raw data interspersed with metadata or control information. For example, "GET /index.html HTTP/1.1" is a structured message containing a command ("GET") with a single argument ("/index.html") and metadata about which protocol version is being used ("HTTP/1.1"). If an application uses attacker-supplied inputs to construct a structured message without properly encoding or escaping, then the attacker could insert special characters that will cause the data to be interpreted as control information or metadata. Consequently, the component that receives the output will perform the wrong operations, or otherwise interpret the data incorrectly. This table specifies different individual consequences
associated with the weakness. The Scope identifies the application security area that is
violated, while the Impact describes the negative technical impact that arises if an
adversary succeeds in exploiting this weakness. The Likelihood provides information about
how likely the specific consequence is expected to be seen relative to the other
consequences in the list. For example, there may be high likelihood that a weakness will be
exploited to achieve a certain impact, but a low likelihood that it will be exploited to
achieve a different impact.
This table shows the weaknesses and high level categories that are related to this
weakness. These relationships are defined as ChildOf, ParentOf, MemberOf and give insight to
similar items that may exist at higher and lower levels of abstraction. In addition,
relationships such as PeerOf and CanAlsoBe are defined to show similar weaknesses that the user
may want to explore.
Relevant to the view "Research Concepts" (CWE-1000)
Relevant to the view "Weaknesses for Simplified Mapping of Published Vulnerabilities" (CWE-1003)
The different Modes of Introduction provide information
about how and when this
weakness may be introduced. The Phase identifies a point in the life cycle at which
introduction
may occur, while the Note provides a typical scenario related to introduction during the
given
phase.
This listing shows possible areas for which the given
weakness could appear. These
may be for specific named Languages, Operating Systems, Architectures, Paradigms,
Technologies,
or a class of such platforms. The platform is listed along with how frequently the given
weakness appears for that instance.
Languages Class: Not Language-Specific (Often Prevalent) Technologies AI/ML (Undetermined Prevalence) Database Server (Often Prevalent) Web Server (Often Prevalent) Example 1 This code displays an email address that was submitted as part of a form. (bad code)
Example Language: JSP
<% String email = request.getParameter("email"); %>
... Email Address: <%= email %> The value read from the form parameter is reflected back to the client browser without having been encoded prior to output, allowing various XSS attacks (CWE-79). Example 2 Consider a chat application in which a front-end web application communicates with a back-end server. The back-end is legacy code that does not perform authentication or authorization, so the front-end must implement it. The chat protocol supports two commands, SAY and BAN, although only administrators can use the BAN command. Each argument must be separated by a single space. The raw inputs are URL-encoded. The messaging protocol allows multiple commands to be specified on the same line if they are separated by a "|" character. First let's look at the back end command processor code (bad code)
Example Language: Perl
$inputString = readLineFromFileHandle($serverFH);
# generate an array of strings separated by the "|" character. @commands = split(/\|/, $inputString); foreach $cmd (@commands) { # separate the operator from its arguments based on a single whitespace ($operator, $args) = split(/ /, $cmd, 2); $args = UrlDecode($args); if ($operator eq "BAN") { ExecuteBan($args); }elsif ($operator eq "SAY") { ExecuteSay($args); }The front end web application receives a command, encodes it for sending to the server, performs the authorization check, and sends the command to the server. (bad code)
Example Language: Perl
$inputString = GetUntrustedArgument("command");
($cmd, $argstr) = split(/\s+/, $inputString, 2); # removes extra whitespace and also changes CRLF's to spaces $argstr =~ s/\s+/ /gs; $argstr = UrlEncode($argstr); if (($cmd eq "BAN") && (! IsAdministrator($username))) { die "Error: you are not the admin.\n"; }# communicate with file server using a file handle $fh = GetServerFileHandle("myserver"); print $fh "$cmd $argstr\n"; It is clear that, while the protocol and back-end allow multiple commands to be sent in a single request, the front end only intends to send a single command. However, the UrlEncode function could leave the "|" character intact. If an attacker provides: (attack code)
SAY hello world|BAN user12
then the front end will see this is a "SAY" command, and the $argstr will look like "hello world | BAN user12". Since the command is "SAY", the check for the "BAN" command will fail, and the front end will send the URL-encoded command to the back end: (result)
SAY hello%20world|BAN%20user12
The back end, however, will treat these as two separate commands: (result)
SAY hello world
BAN user12 Notice, however, that if the front end properly encodes the "|" with "%7C", then the back end will only process a single command. Example 3 This example takes user input, passes it through an encoding scheme and then creates a directory specified by the user. (bad code)
Example Language: Perl
sub GetUntrustedInput {
return($ARGV[0]); }sub encode { my($str) = @_; }$str =~ s/\&/\&/gs; $str =~ s/\"/\"/gs; $str =~ s/\'/\'/gs; $str =~ s/\</\</gs; $str =~ s/\>/\>/gs; return($str); sub doit { my $uname = encode(GetUntrustedInput("username")); }print "<b>Welcome, $uname!</b><p>\n"; system("cd /home/$uname; /bin/ls -l"); The programmer attempts to encode dangerous characters, however the denylist for encoding is incomplete (CWE-184) and an attacker can still pass a semicolon, resulting in a chain with command injection (CWE-77). Additionally, the encoding routine is used inappropriately with command execution. An attacker doesn't even need to insert their own semicolon. The attacker can instead leverage the encoding routine to provide the semicolon to separate the commands. If an attacker supplies a string of the form: (attack code)
' pwd
then the program will encode the apostrophe and insert the semicolon, which functions as a command separator when passed to the system function. This allows the attacker to complete the command injection.
This MemberOf Relationships table shows additional CWE Categories and Views that
reference this weakness as a member. This information is often useful in understanding where a
weakness fits within the context of external information sources.
Relationship
This weakness is primary to all weaknesses related to injection (CWE-74) since the inherent nature of injection involves the violation of structured messages.
Relationship CWE-116 and CWE-20 have a close association because, depending on the nature of the structured message, proper input validation can indirectly prevent special characters from changing the meaning of a structured message. For example, by validating that a numeric ID field should only contain the 0-9 characters, the programmer effectively prevents injection attacks. However, input validation is not always sufficient, especially when less stringent data types must be supported, such as free-form text. Consider a SQL injection scenario in which a last name is inserted into a query. The name "O'Reilly" would likely pass the validation step since it is a common last name in the English language. However, it cannot be directly inserted into the database because it contains the "'" apostrophe character, which would need to be escaped or otherwise neutralized. In this case, stripping the apostrophe might reduce the risk of SQL injection, but it would produce incorrect behavior because the wrong name would be recorded. Terminology
The usage of the "encoding" and "escaping" terms varies widely. For example, in some programming languages, the terms are used interchangeably, while other languages provide APIs that use both terms for different tasks. This overlapping usage extends to the Web, such as the "escape" JavaScript function whose purpose is stated to be encoding. The concepts of encoding and escaping predate the Web by decades. Given such a context, it is difficult for CWE to adopt a consistent vocabulary that will not be misinterpreted by some constituency.
Theoretical
This is a data/directive boundary error in which data boundaries are not sufficiently enforced before it is sent to a different control sphere.
Research Gap
While many published vulnerabilities are related to insufficient output encoding, there is such an emphasis on input validation as a protection mechanism that the underlying causes are rarely described. Within CVE, the focus is primarily on well-understood issues like cross-site scripting and SQL injection. It is likely that this weakness frequently occurs in custom protocols that support multiple encodings, which are not necessarily detectable with automated techniques.
More information is available — Please edit the custom filter or select a different filter. |
Use of the Common Weakness Enumeration (CWE™) and the associated references from this website are subject to the Terms of Use. CWE is sponsored by the U.S. Department of Homeland Security (DHS) Cybersecurity and Infrastructure Security Agency (CISA) and managed by the Homeland Security Systems Engineering and Development Institute (HSSEDI) which is operated by The MITRE Corporation (MITRE). Copyright © 2006–2025, The MITRE Corporation. CWE, CWSS, CWRAF, and the CWE logo are trademarks of The MITRE Corporation. |