CWE - CWE-116: Improper Encoding or Escaping of Output (4.17)

Weakness ID: 116

Vulnerability Mapping: ALLOWED (with careful review of mapping notes). This CWE ID could be used to map to real-world vulnerabilities in limited situations requiring careful review.
Abstraction: Class. A Class is a weakness that is described in a very abstract fashion, typically independent of any specific language or technology. It is more specific than a Pillar weakness, but more general than a Base weakness. Class-level weaknesses typically describe issues in terms of 1 or 2 of the following dimensions: behavior, property, and resource.


Description

The product prepares a structured message for communication with another component, but encoding or escaping of the data is either missing or done incorrectly. As a result, the intended structure of the message is not preserved.

Extended Description

Improper encoding or escaping can allow attackers to change the commands that are sent to another component, inserting malicious commands instead.

Most products follow a certain protocol that uses structured messages for communication between components, such as queries or commands. These structured messages can contain raw data interspersed with metadata or control information. For example, "GET /index.html HTTP/1.1" is a structured message containing a command ("GET") with a single argument ("/index.html") and metadata about which protocol version is being used ("HTTP/1.1").

If an application uses attacker-supplied inputs to construct a structured message without properly encoding or escaping, then the attacker could insert special characters that will cause the data to be interpreted as control information or metadata. Consequently, the component that receives the output will perform the wrong operations, or otherwise interpret the data incorrectly.
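The effect described above can be sketched with the HTTP request line from the earlier example. This is a minimal illustration, not a complete HTTP client: the function names and the attacker-supplied path are hypothetical, and percent-encoding with the standard library's `urllib.parse.quote` is assumed to be the encoding the receiving component expects.

```python
from urllib.parse import quote


def naive_request_line(path: str) -> str:
    # Unsafe: the path is copied into the structured message verbatim,
    # so spaces, CR, and LF in the data act as protocol delimiters.
    return f"GET {path} HTTP/1.1\r\n"


def encoded_request_line(path: str) -> str:
    # Percent-encode everything except unreserved characters and "/",
    # so the data cannot be read as control information.
    return f"GET {quote(path, safe='/')} HTTP/1.1\r\n"


# Hypothetical attacker input that tries to smuggle a header and a second request.
attacker_path = "/index.html HTTP/1.1\r\nX-Injected: yes\r\n\r\nGET /admin"

print(repr(naive_request_line(attacker_path)))
print(repr(encoded_request_line(attacker_path)))
```

In the naive version the receiver sees several lines; in the encoded version the message keeps its intended shape of one command, one argument, and one protocol version.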

Alternate Terms

Output Sanitization
Output Validation
Output Encoding

Common Consequences

This table specifies different individual consequences associated with the weakness. The Scope identifies the application security area that is violated, while the Impact describes the negative technical impact that arises if an adversary succeeds in exploiting this weakness. The Likelihood provides information about how likely the specific consequence is expected to be seen relative to the other consequences in the list. For example, there may be high likelihood that a weakness will be exploited to achieve a certain impact, but a low likelihood that it will be exploited to achieve a different impact.

Impact	Details
Modify Application Data	Scope: Integrity The communications between components can be modified in unexpected ways. Unexpected commands can be executed, bypassing other security mechanisms. Incoming data can be misinterpreted.
Execute Unauthorized Code or Commands	Scope: Integrity, Confidentiality, Availability, Access Control The communications between components can be modified in unexpected ways. Unexpected commands can be executed, bypassing other security mechanisms. Incoming data can be misinterpreted.
Bypass Protection Mechanism	Scope: Confidentiality The communications between components can be modified in unexpected ways. Unexpected commands can be executed, bypassing other security mechanisms. Incoming data can be misinterpreted.

Potential Mitigations

Phase(s)	Mitigation
Architecture and Design	Strategy: Libraries or Frameworks Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, consider using the ESAPI Encoding control [REF-45] or a similar tool, library, or framework. These will help the programmer encode outputs in a manner less prone to error. Alternatively, use built-in functions, but consider using wrappers in case those functions are discovered to have a vulnerability.
Architecture and Design	Strategy: Parameterization If available, use structured mechanisms that automatically enforce the separation between data and code. These mechanisms may be able to provide the relevant quoting, encoding, and validation automatically, instead of relying on the developer to provide this capability at every point where output is generated. For example, stored procedures can enforce database query structure and reduce the likelihood of SQL injection.
Architecture and Design; Implementation	Understand the context in which your data will be used and the encoding that will be expected. This is especially important when transmitting data between different components, or when generating outputs that can contain multiple encodings at the same time, such as web pages or multi-part mail messages. Study all expected communication protocols and data representations to determine the required encoding strategies.
Architecture and Design	In some cases, input validation may be an important strategy when output encoding is not a complete solution. For example, you may be providing the same output that will be processed by multiple consumers that use different encodings or representations. In other cases, you may be required to allow user-supplied input to contain control information, such as limited HTML tags that support formatting in a wiki or bulletin board. When this type of requirement must be met, use an extremely strict allowlist to limit which control sequences can be used. Verify that the resulting syntactic structure is what you expect. Use your normal encoding methods for the remainder of the input.
Architecture and Design	Use input validation as a defense-in-depth measure to reduce the likelihood of output encoding errors (see CWE-20).
Requirements	Fully specify which encodings are required by components that will be communicating with each other.
Implementation	When exchanging data between components, ensure that both components are using the same character encoding. Ensure that the proper encoding is applied at each interface. Explicitly set the encoding you are using whenever the protocol allows you to do so.
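The strict-allowlist mitigation in the table above (limited HTML tags in a wiki or bulletin board) could be sketched as follows. This is an illustration under stated assumptions: the tag list and the function name `render_comment` are hypothetical, and only exact, attribute-free tags are restored after everything has been encoded with the standard library's `html.escape`.

```python
import html

# Hypothetical allowlist: only these exact, attribute-free tags are permitted.
ALLOWED_TAGS = ("<b>", "</b>", "<i>", "</i>")


def render_comment(text: str) -> str:
    # Step 1: apply the normal encoding to the whole input, so that no
    # character supplied by the user is treated as markup.
    encoded = html.escape(text, quote=True)
    # Step 2: restore only the exact allowlisted control sequences.
    for tag in ALLOWED_TAGS:
        encoded = encoded.replace(html.escape(tag), tag)
    return encoded


print(render_comment("<b>hi</b><script>alert(1)</script>"))
```

Encoding first and re-enabling a small fixed set afterwards means that anything not on the allowlist stays inert by default. The sketch does not check that the restored tags are balanced, so verifying the resulting syntactic structure, as the mitigation advises, would still be a separate step.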

Relationships

This table shows the weaknesses and high level categories that are related to this weakness. These relationships are defined as ChildOf, ParentOf, MemberOf and give insight to similar items that may exist at higher and lower levels of abstraction. In addition, relationships such as PeerOf and CanAlsoBe are defined to show similar weaknesses that the user may want to explore.

Relevant to the view "Research Concepts" (View-1000)

Nature	Type	ID	Name
ChildOf	Pillar	707	Improper Neutralization
ParentOf	Base	117	Improper Output Neutralization for Logs
ParentOf	Variant	644	Improper Neutralization of HTTP Headers for Scripting Syntax
ParentOf	Base	838	Inappropriate Encoding for Output Context
CanPrecede	Class	74	Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')

Relevant to the view "Weaknesses for Simplified Mapping of Published Vulnerabilities" (View-1003)

Nature	Type	ID	Name
MemberOf	View	1003	Weaknesses for Simplified Mapping of Published Vulnerabilities
ParentOf	Base	838	Inappropriate Encoding for Output Context

Modes Of Introduction

The different Modes of Introduction provide information about how and when this weakness may be introduced. The Phase identifies a point in the life cycle at which introduction may occur, while the Note provides a typical scenario related to introduction during the given phase.

Phase	Note
Implementation
Operation

Applicable Platforms

This listing shows possible areas for which the given weakness could appear. These may be for specific named Languages, Operating Systems, Architectures, Paradigms, Technologies, or a class of such platforms. The platform is listed along with how frequently the given weakness appears for that instance.

Languages	Class: Not Language-Specific (Often Prevalent)
Technologies	AI/ML (Undetermined Prevalence); Database Server (Often Prevalent); Web Server (Often Prevalent)

Likelihood Of Exploit

High

Demonstrative Examples

Example 1

This code displays an email address that was submitted as part of a form.

(bad code)

Example Language: JSP

<% String email = request.getParameter("email"); %>
...
Email Address: <%= email %>

The value read from the form parameter is reflected back to the client browser without having been encoded prior to output, allowing various XSS attacks (CWE-79).
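For comparison, a Python analogue of the same output step with encoding applied might look like the sketch below. It is not the JSP fix itself: the function name `render_email_line` is hypothetical, and the output context is assumed to be HTML body text, for which the standard library's `html.escape` is appropriate.

```python
import html


def render_email_line(email: str) -> str:
    # Encode the untrusted value for the HTML context before it is
    # placed into the page, so markup characters are displayed as text.
    return f"Email Address: {html.escape(email, quote=True)}"


print(render_email_line("<script>alert(1)</script>"))
print(render_email_line("user@example.com"))
```

An ordinary address passes through unchanged, while markup characters are rendered as text instead of being interpreted by the browser.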

Example 2

Consider a chat application in which a front-end web application communicates with a back-end server. The back-end is legacy code that does not perform authentication or authorization, so the front-end must implement it. The chat protocol supports two commands, SAY and BAN, although only administrators can use the BAN command. Each argument must be separated by a single space. The raw inputs are URL-encoded. The messaging protocol allows multiple commands to be specified on the same line if they are separated by a "|" character.

First, consider the back-end command processor code:

(bad code)

Example Language: Perl

$inputString = readLineFromFileHandle($serverFH);

# generate an array of strings separated by the "|" character.
@commands = split(/\|/, $inputString);

foreach $cmd (@commands) {
    # separate the operator from its arguments based on a single whitespace
    ($operator, $args) = split(/ /, $cmd, 2);

    $args = UrlDecode($args);
    if ($operator eq "BAN") {
        ExecuteBan($args);
    }
    elsif ($operator eq "SAY") {
        ExecuteSay($args);
    }
}

The front end web application receives a command, encodes it for sending to the server, performs the authorization check, and sends the command to the server.

(bad code)

Example Language: Perl

$inputString = GetUntrustedArgument("command");
($cmd, $argstr) = split(/\s+/, $inputString, 2);

# removes extra whitespace and also changes CRLF's to spaces
$argstr =~ s/\s+/ /gs;

$argstr = UrlEncode($argstr);
if (($cmd eq "BAN") && (! IsAdministrator($username))) {
    die "Error: you are not the admin.\n";
}

# communicate with file server using a file handle
$fh = GetServerFileHandle("myserver");

print $fh "$cmd $argstr\n";

It is clear that, while the protocol and back-end allow multiple commands to be sent in a single request, the front end only intends to send a single command. However, the UrlEncode function could leave the "|" character intact. If an attacker provides:

(attack code)

SAY hello world|BAN user12

then the front end will see that this is a "SAY" command, and the $argstr will look like "hello world|BAN user12". Since the command is "SAY", the administrator check, which applies only to "BAN", is skipped, and the front end will send the URL-encoded command to the back end:

(result)

SAY hello%20world|BAN%20user12

The back end, however, will treat these as two separate commands:

(result)

SAY hello world
BAN user12

Notice, however, that if the front end properly encodes the "|" with "%7C", then the back end will only process a single command.

Example 3

This example takes user input, passes it through an encoding scheme and then creates a directory specified by the user.

(bad code)

Example Language: Perl

sub GetUntrustedInput {
    return($ARGV[0]);
}

sub encode {
    my($str) = @_;
    $str =~ s/\&/\&amp;/gs;
    $str =~ s/\"/\&quot;/gs;
    $str =~ s/\'/\&apos;/gs;
    $str =~ s/\</\&lt;/gs;
    $str =~ s/\>/\&gt;/gs;
    return($str);
}

sub doit {
    my $uname = encode(GetUntrustedInput("username"));
    print "<b>Welcome, $uname!</b><p>\n";
    system("cd /home/$uname; /bin/ls -l");
}

The programmer attempts to encode dangerous characters; however, the denylist for encoding is incomplete (CWE-184), and an attacker can still pass a semicolon, resulting in a chain with command injection (CWE-77).

Additionally, the encoding routine is used inappropriately with command execution. An attacker doesn't even need to insert their own semicolon. The attacker can instead leverage the encoding routine to provide the semicolon to separate the commands. If an attacker supplies a string of the form:

(attack code)

' pwd

then the program will encode the apostrophe and insert the semicolon, which functions as a command separator when passed to the system function. This allows the attacker to complete the command injection.
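One way to avoid reusing a single encoding across two contexts is sketched below in Python. The sketch makes several assumptions: the username policy in `USERNAME_RE` is hypothetical, HTML output is encoded with `html.escape`, and the directory listing is run from an argument list without a shell rather than through an encoded command string.

```python
import html
import re
import subprocess
from typing import List

# Hypothetical username policy; adjust to the local account rules.
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def welcome_html(uname: str) -> str:
    # HTML context: HTML entity encoding is the right tool here.
    return f"<b>Welcome, {html.escape(uname)}!</b><p>\n"


def listing_argv(uname: str) -> List[str]:
    # Command context: validate against a strict allowlist and pass the
    # value as one argument, so no shell ever parses it.
    if not USERNAME_RE.fullmatch(uname):
        raise ValueError("invalid username")
    return ["/bin/ls", "-l", "--", f"/home/{uname}"]


def list_home(uname: str) -> str:
    # An argument list (no shell=True) means ";" and "&" have no special meaning.
    result = subprocess.run(listing_argv(uname), capture_output=True,
                            text=True, check=False)
    return result.stdout
```

Here the HTML encoder is never applied to the command, so the entity's trailing semicolon cannot reach a shell, and the attack string from the example is rejected before any process is started.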

Selected Observed Examples

Note: this is a curated list of examples for users to understand the variety of ways in which this weakness can be introduced. It is not a complete list of all CVEs that are related to this CWE entry.

Reference	Description
CVE-2021-41232	Chain: authentication routine in Go-based agile development product does not escape user name (CWE-116), allowing LDAP injection (CWE-90)
CVE-2008-4636	OS command injection in backup software using shell metacharacters in a filename; correct behavior would require that this filename could not be changed.
CVE-2008-0769	Web application does not set the charset when sending a page to a browser, allowing for XSS exploitation when a browser chooses an unexpected encoding.
CVE-2008-0005	Program does not set the charset when sending a page to a browser, allowing for XSS exploitation when a browser chooses an unexpected encoding.
CVE-2008-5573	SQL injection via password parameter; a strong password might contain "&"
CVE-2008-3773	Cross-site scripting in chat application via a message subject, which normally might contain "&" and other XSS-related characters.
CVE-2008-0757	Cross-site scripting in chat application via a message, which normally might be allowed to contain arbitrary content.

Detection Methods

Method	Details
Automated Static Analysis	This weakness can often be detected using automated static analysis tools. Many modern tools use data flow analysis or constraint-based techniques to minimize the number of false positives. Effectiveness: Moderate. Note: This is not a perfect solution, since 100% accuracy and coverage are not feasible.
Automated Dynamic Analysis	This weakness can be detected using dynamic tools and techniques that interact with the software using large test suites with many diverse inputs, such as fuzz testing (fuzzing), robustness testing, and fault injection. The software's operation may slow down, but it should not become unstable, crash, or generate incorrect results.
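As a small illustration of the dynamic approach in the table above, the sketch below fuzzes an argument encoder with random strings and checks two properties: no protocol delimiter survives encoding, and decoding restores the original input. The encoder, the delimiter set, and the alphabet are assumptions chosen for illustration, not part of any particular tool.

```python
import random
from urllib.parse import quote, unquote

# Assumed delimiters of a hypothetical line-based protocol.
DELIMITERS = "| \r\n"


def encode_argument(arg: str) -> str:
    # Encoder under test: percent-encode all reserved characters.
    return quote(arg, safe="")


def fuzz_encoder(trials: int = 1000, seed: int = 0) -> int:
    rng = random.Random(seed)
    alphabet = "abcXYZ019" + DELIMITERS + "%é'\"<>;&"
    for _ in range(trials):
        length = rng.randint(0, 20)
        arg = "".join(rng.choice(alphabet) for _ in range(length))
        encoded = encode_argument(arg)
        # Property 1: no delimiter may appear in the encoded output.
        assert not any(ch in encoded for ch in DELIMITERS), (arg, encoded)
        # Property 2: decoding must give back exactly the original data.
        assert unquote(encoded) == arg, (arg, encoded)
    return trials


print(fuzz_encoder())
```

A failing assertion reports the offending input, which can then be turned into a regression test for the encoder.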

Memberships

This MemberOf Relationships table shows additional CWE Categories and Views that reference this weakness as a member. This information is often useful in understanding where a weakness fits within the context of external information sources.

Nature	Type	ID	Name
MemberOf	Category	751	2009 Top 25 - Insecure Interaction Between Components
MemberOf	Category	845	The CERT Oracle Secure Coding Standard for Java (2011) Chapter 2 - Input Validation and Data Sanitization (IDS)
MemberOf	Category	883	CERT C++ Secure Coding Section 49 - Miscellaneous (MSC)
MemberOf	Category	992	SFP Secondary Cluster: Faulty Input Transformation
MemberOf	Category	1134	SEI CERT Oracle Secure Coding Standard for Java - Guidelines 00. Input Validation and Data Sanitization (IDS)
MemberOf	Category	1179	SEI CERT Perl Coding Standard - Guidelines 01. Input Validation and Data Sanitization (IDS)
MemberOf	Category	1347	OWASP Top Ten 2021 Category A03:2021 - Injection
MemberOf	Category	1407	Comprehensive Categorization: Improper Neutralization

Vulnerability Mapping Notes

Usage	ALLOWED-WITH-REVIEW (this CWE ID could be used to map to real-world vulnerabilities in limited situations requiring careful review)
Reason	Abstraction
Rationale	This CWE entry is a Class and might have Base-level children that would be more appropriate
Comments	Examine children of this entry to see if there is a better fit

Notes

Relationship

This weakness is primary to all weaknesses related to injection (CWE-74) since the inherent nature of injection involves the violation of structured messages.

Relationship

CWE-116 and CWE-20 have a close association because, depending on the nature of the structured message, proper input validation can indirectly prevent special characters from changing the meaning of a structured message. For example, by validating that a numeric ID field should only contain the 0-9 characters, the programmer effectively prevents injection attacks.

However, input validation is not always sufficient, especially when less stringent data types must be supported, such as free-form text. Consider a SQL injection scenario in which a last name is inserted into a query. The name "O'Reilly" would likely pass the validation step since it is a common last name in the English language. However, it cannot be directly inserted into the database because it contains the "'" apostrophe character, which would need to be escaped or otherwise neutralized. In this case, stripping the apostrophe might reduce the risk of SQL injection, but it would produce incorrect behavior because the wrong name would be recorded.
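Both halves of this note can be sketched in Python with the standard library's `sqlite3` module. The table layout and the function names are hypothetical. The numeric ID is handled by validation alone, while the free-form last name is passed as a bound parameter, so the apostrophe is stored unchanged instead of being stripped.

```python
import re
import sqlite3


def is_numeric_id(value: str) -> bool:
    # Strict validation: only the characters 0-9, and at least one of them.
    return re.fullmatch(r"[0-9]+", value) is not None


def store_and_fetch(last_name: str) -> str:
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT)")
        # The "?" placeholder keeps the name as data; it is never parsed as SQL.
        conn.execute("INSERT INTO people (last_name) VALUES (?)", (last_name,))
        row = conn.execute(
            "SELECT last_name FROM people WHERE last_name = ?",
            (last_name,)).fetchone()
        return row[0]
    finally:
        conn.close()


print(is_numeric_id("12345"), is_numeric_id("1 OR 1=1"))
print(store_and_fetch("O'Reilly"))
```

The name is recorded exactly as entered, which is the behavior that stripping the apostrophe would have broken.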

Terminology

The usage of the "encoding" and "escaping" terms varies widely. For example, in some programming languages, the terms are used interchangeably, while other languages provide APIs that use both terms for different tasks. This overlapping usage extends to the Web, such as the "escape" JavaScript function whose purpose is stated to be encoding. The concepts of encoding and escaping predate the Web by decades. Given such a context, it is difficult for CWE to adopt a consistent vocabulary that will not be misinterpreted by some constituency.

Theoretical

This is a data/directive boundary error in which data boundaries are not sufficiently enforced before the data is sent to a different control sphere.

Research Gap

While many published vulnerabilities are related to insufficient output encoding, there is such an emphasis on input validation as a protection mechanism that the underlying causes are rarely described. Within CVE, the focus is primarily on well-understood issues like cross-site scripting and SQL injection. It is likely that this weakness frequently occurs in custom protocols that support multiple encodings, which are not necessarily detectable with automated techniques.

Taxonomy Mappings

Mapped Taxonomy Name	Node ID	Fit	Mapped Node Name
WASC	22		Improper Output Handling
The CERT Oracle Secure Coding Standard for Java (2011)	IDS00-J	Exact	Sanitize untrusted data passed across a trust boundary
The CERT Oracle Secure Coding Standard for Java (2011)	IDS05-J		Use a subset of ASCII for file and path names
SEI CERT Oracle Coding Standard for Java	IDS00-J	Imprecise	Prevent SQL injection
SEI CERT Perl Coding Standard	IDS33-PL	Exact	Sanitize untrusted data passed across a trust boundary

Related Attack Patterns

CAPEC-ID	Attack Pattern Name
CAPEC-104	Cross Zone Scripting
CAPEC-73	User-Controlled Filename
CAPEC-81	Web Server Logs Tampering
CAPEC-85	AJAX Footprinting

References

[REF-45]	OWASP. "OWASP Enterprise Security API (ESAPI) Project". <http://www.owasp.org/index.php/ESAPI>.
[REF-46]	Joshbw. "Output Sanitization". 2008-09-18. <https://web.archive.org/web/20081208054333/http://analyticalengine.net/archives/58>. (URL validated: 2023-04-07)
[REF-47]	Niyaz PK. "Sanitizing user data: How and where to do it". 2008-09-11. <https://web.archive.org/web/20090105222005/http://www.diovo.com/2008/09/sanitizing-user-data-how-and-where-to-do-it/>. (URL validated: 2023-04-07)
[REF-48]	Jeremiah Grossman. "Input validation or output filtering, which is better?". 2007-01-30. <https://blog.jeremiahgrossman.com/2007/01/input-validation-or-output-filtering.html>. (URL validated: 2023-04-07)
[REF-49]	Jim Manico. "Input Validation - Not That Important". 2008-08-10. <https://manicode.blogspot.com/2008/08/input-validation-not-that-important.html>. (URL validated: 2023-04-07)
[REF-50]	Michael Eddington. "Preventing XSS with Correct Output Encoding". <http://phed.org/2008/05/19/preventing-xss-with-correct-output-encoding/>.
[REF-7]	Michael Howard and David LeBlanc. "Writing Secure Code". Chapter 11, "Canonical Representation Issues" Page 363. 2nd Edition. Microsoft Press. 2002-12-04. <https://www.microsoftpressstore.com/store/writing-secure-code-9780735617223>.

Content History

Submissions
Submission Date	Submitter	Organization
2006-07-19 (CWE Draft 3, 2006-07-19)	CWE Community
2006-07-19 (CWE Draft 3, 2006-07-19)	Submitted by members of the CWE community to extend early CWE versions
Modifications
Modification Date	Modifier	Organization
2024-07-16 (CWE 4.15, 2024-07-16)	CWE Content Team	MITRE
2024-07-16 (CWE 4.15, 2024-07-16)	updated Applicable_Platforms
2023-06-29	CWE Content Team	MITRE
2023-06-29	updated Mapping_Notes
2023-04-27	CWE Content Team	MITRE
2023-04-27	updated References, Relationships, Time_of_Introduction
2023-01-31	CWE Content Team	MITRE
2023-01-31	updated Description
2022-10-13	CWE Content Team	MITRE
2022-10-13	updated Observed_Examples
2021-10-28	CWE Content Team	MITRE
2021-10-28	updated Relationships
2021-03-15	CWE Content Team	MITRE
2021-03-15	updated Relationships, Terminology_Notes
2020-06-25	CWE Content Team	MITRE
2020-06-25	updated Applicable_Platforms, Demonstrative_Examples, Potential_Mitigations
2020-02-24	CWE Content Team	MITRE
2020-02-24	updated Relationships
2019-06-20	CWE Content Team	MITRE
2019-06-20	updated Relationships
2019-01-03	CWE Content Team	MITRE
2019-01-03	updated Relationships, Taxonomy_Mappings
2018-03-27	CWE Content Team	MITRE
2018-03-27	updated References
2017-11-08	CWE Content Team	MITRE
2017-11-08	updated Applicable_Platforms, Common_Consequences, Demonstrative_Examples, Likelihood_of_Exploit, References, Taxonomy_Mappings
2017-05-03	CWE Content Team	MITRE
2017-05-03	updated Related_Attack_Patterns
2017-01-19	CWE Content Team	MITRE
2017-01-19	updated Relationships
2015-12-07	CWE Content Team	MITRE
2015-12-07	updated Relationships
2014-07-30	CWE Content Team	MITRE
2014-07-30	updated Demonstrative_Examples, Relationships
2014-06-23	CWE Content Team	MITRE
2014-06-23	updated References
2012-10-30	CWE Content Team	MITRE
2012-10-30	updated Potential_Mitigations
2012-05-11	CWE Content Team	MITRE
2012-05-11	updated References, Relationships, Taxonomy_Mappings
2011-09-13	CWE Content Team	MITRE
2011-09-13	updated Relationships, Taxonomy_Mappings
2011-06-01	CWE Content Team	MITRE
2011-06-01	updated Common_Consequences, Relationships, Taxonomy_Mappings
2011-03-29	CWE Content Team	MITRE
2011-03-29	updated Relationship_Notes, Relationships
2010-06-21	CWE Content Team	MITRE
2010-06-21	updated Potential_Mitigations
2010-04-05	CWE Content Team	MITRE
2010-04-05	updated Potential_Mitigations
2010-02-16	CWE Content Team	MITRE
2010-02-16	updated Detection_Factors, Potential_Mitigations, References, Taxonomy_Mappings
2009-12-28	CWE Content Team	MITRE
2009-12-28	updated Demonstrative_Examples, Potential_Mitigations
2009-10-29	CWE Content Team	MITRE
2009-10-29	updated Relationships
2009-07-27	CWE Content Team	MITRE
2009-07-27	updated Demonstrative_Examples
2009-05-27	CWE Content Team	MITRE
2009-05-27	updated Related_Attack_Patterns
2009-03-10	CWE Content Team	MITRE
2009-03-10	updated Description, Potential_Mitigations
2009-01-12	CWE Content Team	MITRE
2009-01-12	updated Alternate_Terms, Applicable_Platforms, Common_Consequences, Demonstrative_Examples, Description, Likelihood_of_Exploit, Name, Observed_Examples, Potential_Mitigations, References, Relationship_Notes, Relationships, Research_Gaps, Terminology_Notes, Theoretical_Notes
2008-09-08	CWE Content Team	MITRE
2008-09-08	updated Name, Relationships
2008-07-01	Eric Dalci	Cigital
2008-07-01	updated Time_of_Introduction
2008-07-01	Sean Eidemiller	Cigital
2008-07-01	added/updated demonstrative examples
Previous Entry Names
Change Date	Previous Entry Name
2008-04-11	Output Validation
2008-09-09	Incorrect Output Sanitization
2009-01-12	Insufficient Output Sanitization


Use of the Common Weakness Enumeration (CWE™) and the associated references from this website are subject to the Terms of Use. CWE is sponsored by the U.S. Department of Homeland Security (DHS) Cybersecurity and Infrastructure Security Agency (CISA) and managed by the Homeland Security Systems Engineering and Development Institute (HSSEDI) which is operated by The MITRE Corporation (MITRE). Copyright © 2006–2025, The MITRE Corporation. CWE, CWSS, CWRAF, and the CWE logo are trademarks of The MITRE Corporation.