CWE-138: Improper Neutralization of Special Elements
Weakness ID: 138
The software receives input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could be interpreted as control elements or syntactic markers when they are sent to a downstream component.
Most languages and protocols have their own special elements, such as special characters and reserved words. These special elements can carry control implications. If software does not prevent external control or influence over the inclusion of such special elements, the control flow of the program may be altered from what was intended. For example, both Unix shells and the Windows command interpreter treat the symbol < ("less than") as meaning "read input from a file".
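The "<" example can be sketched in Python. This is only an illustration under stated assumptions: the file name value is hypothetical, and shlex is used as a rough stand-in for how a POSIX shell splits a command line, not as an exact model of any particular shell.

```python
import shlex


def shell_tokens(command_line: str) -> list[str]:
    """Tokenize roughly the way a POSIX shell would, keeping operators separate."""
    return list(shlex.shlex(command_line, posix=True, punctuation_chars=True))


# Hypothetical attacker-supplied value that contains the "<" special element.
untrusted_name = "notes.txt < /etc/passwd"

# Interpolating the value into a command string lets "<" become its own token,
# which a shell would treat as a redirection operator rather than as data.
as_string = "wc -l " + untrusted_name
tokens = shell_tokens(as_string)

# An argument vector keeps the whole value as one data element; a call such as
# subprocess.run(argv) would not involve a shell at all.
argv = ["wc", "-l", "--", untrusted_name]
```

In the string form the downstream component sees "<" as syntax; in the list form it remains part of a single argument.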
Time of Introduction
Technical Impact: Execute unauthorized code or
commands; Alter execution
logic; DoS: crash / exit /
Multi-channel issue. Terminal escape sequences not
filtered from log files.
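As a hedged illustration of the log-file case, the sketch below escapes non-printable characters (including ESC, CR and LF) before a message is written. The function name neutralize_for_log is hypothetical, and the escaping scheme is one possible choice rather than a prescribed one.

```python
def neutralize_for_log(message: str) -> str:
    """Replace non-printable characters with visible escapes.

    Backslashes are escaped as well, so the escaped form stays unambiguous.
    """
    out = []
    for ch in message:
        if ch.isprintable() and ch != "\\":
            out.append(ch)
        else:
            # unicode_escape yields e.g. \x1b for ESC and \n for a newline
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)


# The ESC byte that would start a terminal escape sequence becomes plain text.
safe_line = neutralize_for_log("login failed for bob\x1b[2J\nadmin logged in")
```

A terminal or log viewer that later displays safe_line receives only printable characters, so the embedded sequence is shown rather than interpreted.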
Developers should anticipate that special elements (e.g., delimiters,
symbols) will be injected into input vectors of their software system.
One defense is to create a whitelist (e.g., a regular expression) that
defines valid input according to the requirements specifications.
Strictly filter out any input that does not match the whitelist.
Properly encode your output, and quote any elements that have special
meaning to the component with which you are communicating.
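A minimal sketch of both halves of this advice follows. The username rule is an assumed requirement invented for the example, and a POSIX shell is assumed as the downstream component; other components need their own quoting or encoding routine.

```python
import re
import shlex

# Assumed requirement: lowercase letter first, then 2-15 letters, digits or "_".
USERNAME_RE = re.compile(r"[a-z][a-z0-9_]{2,15}")


def is_valid_username(value: str) -> bool:
    """Accept only input that matches the whitelist in full."""
    return USERNAME_RE.fullmatch(value) is not None


def quote_for_shell(value: str) -> str:
    """Quote a value so a POSIX shell treats it as a single literal word."""
    return shlex.quote(value)
```

fullmatch is used rather than match so that trailing characters, such as a newline, cannot slip past the pattern.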
Strategy: Input Validation
Assume all input is malicious. Use an "accept known good" input
validation strategy, i.e., use a whitelist of acceptable inputs that
strictly conform to specifications. Reject any input that does not
strictly conform to specifications, or transform it into something that
does.
When performing input validation, consider all potentially relevant
properties, including length, type of input, the full range of
acceptable values, missing or extra inputs, syntax, consistency across
related fields, and conformance to business rules. As an example of
business rule logic, "boat" may be syntactically valid because it only
contains alphanumeric characters, but it is not valid if the input is
only expected to contain colors such as "red" or "blue."
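The "boat" versus "red" example could look like the sketch below. The length limit and the set of colors are assumptions chosen for illustration; a real application would take them from its own specification.

```python
ALLOWED_COLORS = frozenset({"red", "blue"})  # assumed business rule
MAX_LENGTH = 16                              # assumed length limit


def validate_color(value):
    """Check type, length, syntax, and finally the business rule."""
    if not isinstance(value, str):
        raise TypeError("color must be a string")
    if not 1 <= len(value) <= MAX_LENGTH:
        raise ValueError("color has an unexpected length")
    if not (value.isascii() and value.isalnum()):
        raise ValueError("color contains unexpected characters")
    if value not in ALLOWED_COLORS:
        # "boat" passes every check above but is still not a color.
        raise ValueError("color is not one of the accepted values")
    return value
```

Each check covers a different property, so an input that is syntactically clean can still be rejected by the final rule.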
Do not rely exclusively on looking for malicious or malformed inputs
(i.e., do not rely on a blacklist). A blacklist is likely to miss at
least one undesirable input, especially if the code's environment
changes. This can give attackers enough room to bypass the intended
validation. However, blacklists can be useful for detecting potential
attacks or determining which inputs are so malformed that they should be
rejected outright.
Use and specify an appropriate output encoding to ensure that the
special elements are well-defined. A normal byte sequence in one
encoding could be a special element in another.
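One well-known instance, shown here only as an illustration: in Shift_JIS the second byte of some two-byte characters is 0x5C, the same value as an ASCII backslash. The helper names below are hypothetical.

```python
TRICKY = "ソ"  # KATAKANA LETTER SO, chosen only as an illustration


def has_backslash_bytes(data: bytes) -> bool:
    """Naive byte-level scan that ignores the declared encoding."""
    return b"\\" in data


def has_backslash_text(data: bytes, encoding: str) -> bool:
    """Decode with the declared encoding first, then look for the character."""
    return "\\" in data.decode(encoding)


sjis = TRICKY.encode("shift_jis")  # two bytes, the second one is 0x5C
utf8 = TRICKY.encode("utf-8")      # three bytes, none of them 0x5C
```

A component that scans sjis byte by byte sees a backslash that a Shift_JIS-aware component does not, which is why the encoding has to be stated and agreed on by both sides.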
Strategy: Input Validation
Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass whitelist validation schemes by introducing dangerous inputs after they have been checked.
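The ordering described above might be sketched as follows, assuming percent-encoded input and an invented file-name whitelist. The function name validated_name and the pattern are hypothetical.

```python
import re
from urllib.parse import unquote

# Assumed whitelist: a simple base name with an optional extension.
NAME_RE = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def validated_name(raw: str) -> str:
    """Decode exactly once, validate the decoded form, and return that form."""
    decoded = unquote(raw)
    if NAME_RE.fullmatch(decoded) is None:
        raise ValueError("name does not match the whitelist")
    return decoded


# What a second decoding step would do to a doubly encoded value.
once = unquote("%252e%252e")
twice = unquote(once)
```

Callers should use the returned value as-is. Decoding it again downstream would reopen the gap, since a string that is harmless after one pass can become ".." after a second.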
Primary (where the weakness exists independent of other weaknesses)
This weakness can be related to interpretation conflicts or interaction
errors in intermediaries (such as proxies or application firewalls) when the
intermediary's model of an endpoint does not account for protocol-specific
special elements.
See this entry's children for different types of special elements that
have been observed at one point or another. However, it can be difficult to
find suitable CVE examples. In an attempt to be complete, CWE includes some
types that do not have any associated observed example.
This weakness is probably under-studied for proprietary or custom formats.
It is likely that these issues are fairly common in applications that use
their own custom format for configuration files, logs, meta-data, messaging,
etc. They would only be found by accident or with a focused effort based on
an understanding of the format.