CWE-180: Incorrect Behavior Order: Validate Before Canonicalize

Weakness ID: 180
The software validates input before it is canonicalized, which prevents the software from detecting data that becomes invalid after the canonicalization step.

This can be used by an attacker to bypass the validation and launch attacks that expose weaknesses that would otherwise be prevented, such as injection.

  • Implementation
Technical Impact: Bypass protection mechanism

The following code attempts to validate a given input path by checking it against a whitelist and then returns the canonical path. In this specific case, the path is considered valid if it starts with the string "/safe_dir/".

String path = getInputPath();
// Vulnerable: the check runs on the raw, non-canonical input.
if (path.startsWith("/safe_dir/")) {
    File f = new File(path);
    return f.getCanonicalPath();
}

The problem with the above code is that validation occurs before canonicalization. An attacker could provide an input path of "/safe_dir/../" that would pass the validation step. However, the canonicalization process treats the double dot as a traversal to the parent directory, so the canonicalized path becomes just "/".

To avoid this problem, validation should occur after canonicalization takes place. In this case canonicalization is performed by the call to getCanonicalPath(); constructing the File object does not by itself resolve ".." segments. The code below fixes the issue.

String path = getInputPath();
File f = new File(path);
// Canonicalize first, then validate the canonical form.
String canonicalPath = f.getCanonicalPath();
if (canonicalPath.startsWith("/safe_dir/")) {
    return canonicalPath;
}
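
The difference between the two orderings can be seen in a minimal, self-contained sketch. It is an illustration under stated assumptions, not part of the original entry: the class name, method names, and the attack string "/safe_dir/../etc/passwd" are hypothetical, and it uses java.nio.file.Path.normalize() instead of File.getCanonicalPath() so that the result depends only on the string and not on what exists on disk. normalize() is purely lexical and does not resolve symbolic links; code that must account for links would need getCanonicalPath() or Path.toRealPath() instead.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CanonicalizeThenValidate {
    // Hypothetical allowed root, taken from the "/safe_dir/" example above.
    private static final Path SAFE_DIR = Paths.get("/safe_dir");

    // Wrong order: the prefix check inspects the raw string,
    // so ".." segments are still present when it runs.
    public static boolean validateThenCanonicalize(String input) {
        return input.startsWith("/safe_dir/");
    }

    // Right order: collapse "." and ".." segments first,
    // then check that the result is strictly inside the allowed root.
    public static boolean canonicalizeThenValidate(String input) {
        Path normalized = Paths.get(input).normalize();
        // Path.startsWith compares whole name elements, so a sibling such as
        // "/safe_dir_evil" does not count as being under "/safe_dir".
        return normalized.startsWith(SAFE_DIR) && !normalized.equals(SAFE_DIR);
    }

    public static void main(String[] args) {
        String attack = "/safe_dir/../etc/passwd";
        System.out.println("raw check accepts attack:        " + validateThenCanonicalize(attack));
        System.out.println("normalized check accepts attack: " + canonicalizeThenValidate(attack));
        System.out.println("normalized check accepts file:   " + canonicalizeThenValidate("/safe_dir/report.txt"));
    }
}
```

On a POSIX-style file system the raw check accepts the attack string, while the normalized check rejects it and still accepts "/safe_dir/report.txt".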

Product allows remote attackers to view restricted files via an HTTP request containing a "*" (wildcard or asterisk) character.
Product modifies the first two letters of a filename extension after performing a security check, which allows remote attackers to bypass authentication via a filename with a .ats extension instead of a .hts extension.
Database consumes an extra character when processing a character that cannot be converted, which could remove an escape character from the query and make the application subject to SQL injection attacks.
Overlaps "fakechild/../realchild"
Product checks URI for "<" and other literal characters, but does it before hex decoding the URI, so encoded sequences such as "%3E" (the encoded form of ">") are allowed.
Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass whitelist validation schemes by introducing dangerous inputs after they have been checked.
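
As a hedged illustration of the advice above, the following sketch decodes a percent-encoded value exactly once and validates only the decoded form. The class name, method names, and the angle-bracket rule are assumptions chosen for the example, not part of this entry. Note also that java.net.URLDecoder implements form decoding (it turns "+" into a space), so it is only a stand-in for whatever decoder the application actually uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeThenValidate {
    // Hypothetical validation rule for illustration: reject angle brackets.
    static boolean containsForbidden(String s) {
        return s.indexOf('<') >= 0 || s.indexOf('>') >= 0;
    }

    // Wrong order: the check sees "%3C" and "%3E", not "<" and ">".
    public static boolean checkBeforeDecoding(String raw) {
        return !containsForbidden(raw);
    }

    // Decode exactly once; later stages should use this value as-is
    // and must not decode it again (CWE-174).
    public static String decodeOnce(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Right order: validate the decoded form that the application will use.
    public static boolean checkAfterDecoding(String raw) {
        return !containsForbidden(decodeOnce(raw));
    }

    public static void main(String[] args) {
        String raw = "%3Cscript%3E";
        System.out.println("accepted before decoding: " + checkBeforeDecoding(raw));
        System.out.println("accepted after decoding:  " + checkAfterDecoding(raw));
        // A doubly encoded value survives one decode as "%3C"; it only
        // becomes "<" if some later stage decodes the same input again.
        System.out.println("single decode of %253C:   " + decodeOnce("%253C"));
    }
}
```

The doubly encoded case shows why the second half of the advice matters: the decoded value "%3C" passes the check, so the check only holds if nothing downstream decodes it a second time.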

ChildOf: Category 171, Cleansing, Canonicalization, and Comparison Errors. View: Development Concepts (primary), 699
ChildOf: Weakness Base 179, Incorrect Behavior Order: Early Validation. View: Research Concepts (primary), 1000
ChildOf: Category 722, OWASP Top Ten 2004 Category A1 - Unvalidated Input. View: Weaknesses in OWASP Top Ten (2004) (primary), 711
ChildOf: Category 845, CERT Java Secure Coding Section 00 - Input Validation and Data Sanitization (IDS). View: Weaknesses Addressed by the CERT Java Secure Coding Standard (primary), 844
ChildOf: Category 896, SFP Cluster: Tainted Input. View: Software Fault Pattern (SFP) Clusters (primary), 888
This overlaps other categories.

  • Non-specific
OWASP Top Ten 2004: A1, Unvalidated Input
CERT Java Secure Coding: IDS01-J, Normalize strings before validating them
