CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')
Weakness ID: 89 (Weakness Base)
Status: Draft
Description
Description Summary
The software constructs all or part of an SQL command using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the intended SQL command when it is sent to a downstream component.
Extended Description
Without sufficient removal or quoting of SQL syntax in user-controllable inputs, the generated SQL query can cause those inputs to be interpreted as SQL instead of ordinary user data. This can be used to alter query logic to bypass security checks, or to insert additional statements that modify the back-end database, possibly including execution of system commands.
SQL injection has become a common issue with database-driven web sites. The flaw is easily detected, and easily exploited, and as such, any site or software package with even a minimal user base is likely to be subject to an attempted attack of this kind. This flaw depends on the fact that SQL makes no real distinction between the control and data planes.
Time of Introduction
Architecture and Design
Implementation
Operation
Applicable Platforms
Languages
All
Technology Classes
Database-Server
Modes of Introduction
This weakness typically appears in data-rich applications that save user
inputs in a database.
Common Consequences
Scope: Confidentiality
Technical Impact: Read application data
Since SQL databases generally hold sensitive data, loss of
confidentiality is a frequent problem with SQL injection
vulnerabilities.
Scope: Access Control
Technical Impact: Bypass protection mechanism
If poor SQL commands are used to check user names and passwords, it
may be possible to connect to a system as another user with no previous
knowledge of the password.
Scope: Access Control
Technical Impact: Bypass protection mechanism
If authorization information is held in a SQL database, it may be
possible to change this information through the successful exploitation
of a SQL injection vulnerability.
Scope: Integrity
Technical Impact: Modify application data
Just as it may be possible to read sensitive information, it is also
possible to make changes or even delete this information with a SQL
injection attack.
Likelihood of Exploit
Very High
Enabling Factors for Exploitation
The application dynamically generates queries that contain user
input.
Detection Methods
Automated Static Analysis
This weakness can often be detected using automated static analysis
tools. Many modern tools use data flow analysis or constraint-based
techniques to minimize the number of false positives.
Automated static analysis might not be able to recognize when proper
input validation is being performed, leading to false positives - i.e.,
warnings that do not have any security consequences or do not require
any code changes.
Automated static analysis might not be able to detect the usage of
custom API functions or third-party libraries that indirectly invoke SQL
commands, leading to false negatives - especially if the API/library
code is not available for analysis.
This is not a perfect solution, since 100% accuracy and coverage are
not feasible.
Automated Dynamic Analysis
This weakness can be detected using dynamic tools and techniques that
interact with the software using large test suites with many diverse
inputs, such as fuzz testing (fuzzing), robustness testing, and fault
injection. The software's operation may slow down, but it should not
become unstable, crash, or generate incorrect results.
Effectiveness: Moderate
Manual Analysis
Manual analysis can be useful for finding this weakness, but it might
not achieve desired code coverage within limited time constraints. This
becomes difficult for weaknesses that must be considered for all inputs,
since the attack surface can be too large.
Demonstrative Examples
Example 1
In 2008, a large number of web servers were compromised using the
same SQL injection attack string. This single string worked against many
different programs. The SQL injection was then used to modify the web sites
to serve malicious code. [1]
Example 2
The following code dynamically constructs and executes a SQL query
that searches for items matching a specified name. The query restricts the
items displayed to those where owner matches the user name of the
currently-authenticated user.
(Bad Code)
Example Language: C#
...
string userName = ctx.getAuthenticatedUserName();
string query = "SELECT * FROM items WHERE owner = '" + userName +
"' AND itemname = '" + ItemName.Text + "'";
sda = new SqlDataAdapter(query, conn);
DataTable dt = new DataTable();
sda.Fill(dt);
...
The query that this code intends to execute follows:
SELECT * FROM items WHERE owner = <userName> AND
itemname = <itemName>;
However, because the query is constructed dynamically by concatenating
a constant base query string and a user input string, the query only
behaves correctly if itemName does not contain a single-quote character.
If an attacker with the user name wiley enters the string:
(Attack)
name' OR 'a'='a
for itemName, then the query becomes the following:
(Attack)
SELECT * FROM items WHERE owner = 'wiley' AND itemname = 'name' OR
'a'='a';
The addition of the:
(Attack)
OR 'a'='a'
condition causes the WHERE clause to always evaluate to true, so the
query becomes logically equivalent to the much simpler query:
(Attack)
SELECT * FROM items;
This simplification of the query allows the attacker to bypass the
requirement that the query only return items owned by the authenticated
user; the query now returns all entries stored in the items table,
regardless of their specified owner.
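The attack and its remedy can be reproduced in a few lines. The following is a minimal sketch in Python, using the standard-library sqlite3 module with an in-memory database in place of the C# and SQL Server setup above. The table contents and the function names find_items_unsafe and find_items_safe are assumptions made for illustration.

```python
import sqlite3

# Illustrative in-memory database; the rows are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (owner TEXT, itemname TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("wiley", "name"), ("wiley", "anvil"), ("alice", "secret")],
)


def find_items_unsafe(owner, item_name):
    # Vulnerable: user input is concatenated directly into the SQL text.
    query = (
        "SELECT owner, itemname FROM items WHERE owner = '" + owner
        + "' AND itemname = '" + item_name + "'"
    )
    return conn.execute(query).fetchall()


def find_items_safe(owner, item_name):
    # Parameterized: the values are bound separately from the SQL text,
    # so quote characters in item_name stay ordinary data.
    return conn.execute(
        "SELECT owner, itemname FROM items WHERE owner = ? AND itemname = ?",
        (owner, item_name),
    ).fetchall()


attack = "name' OR 'a'='a"
leaked = find_items_unsafe("wiley", attack)   # every row, including alice's
blocked = find_items_safe("wiley", attack)    # no item has that literal name
```

With the concatenated query, the attack string returns all rows in the table. With bound parameters, the same string is compared as a literal item name and matches nothing.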
Example 3
This example examines the effects of a different malicious value
passed to the query constructed and executed in the previous
example.
If an attacker with the user name wiley enters the string:
(Attack)
name'; DELETE FROM items; --
for itemName, then the query becomes the following two queries:
(Attack)
Example Language: SQL
SELECT * FROM items WHERE owner = 'wiley' AND itemname =
'name';
DELETE FROM items;
--'
Many database servers, including Microsoft(R) SQL Server 2000, allow
multiple SQL statements separated by semicolons to be executed at once.
While this attack string results in an error on Oracle and other
database servers that do not allow the batch-execution of statements
separated by semicolons, on databases that do allow batch execution,
this type of attack allows the attacker to execute arbitrary commands
against the database.
Notice the trailing pair of hyphens (--), which specifies to most
database servers that the remainder of the statement is to be treated as
a comment and not executed. In this case the comment character serves to
remove the trailing single-quote left over from the modified query. On a
database where comments are not allowed to be used in this way, the
general attack could still be made effective using a trick similar to
the one shown in the previous example.
If an attacker enters the string
(Attack)
name'; DELETE FROM items; SELECT * FROM items WHERE 'a'='a
then the following three valid statements will be created:
(Attack)
SELECT * FROM items WHERE owner = 'wiley' AND itemname =
'name';
DELETE FROM items;
SELECT * FROM items WHERE 'a'='a';
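Whether the stacked statements run depends on the API that receives the query string. As a hedged illustration, Python's standard-library sqlite3 module offers both behaviors: execute() refuses a string containing more than one statement, while executescript() runs a whole batch. The sketch below uses an in-memory table with made-up rows; make_db and build_query are hypothetical helper names.

```python
import sqlite3


def make_db():
    # Fresh in-memory database with two illustrative rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (owner TEXT, itemname TEXT)")
    conn.execute("INSERT INTO items VALUES ('wiley', 'name')")
    conn.execute("INSERT INTO items VALUES ('alice', 'secret')")
    conn.commit()
    return conn


def build_query(owner, item_name):
    # Vulnerable string concatenation, as in the example above.
    return (
        "SELECT * FROM items WHERE owner = '" + owner
        + "' AND itemname = '" + item_name + "'"
    )


attack = "name'; DELETE FROM items; SELECT * FROM items WHERE 'a'='a"

# Single-statement API: the batch is rejected before anything runs.
single = make_db()
try:
    single.execute(build_query("wiley", attack))
    single_rejected = False
except (sqlite3.Warning, sqlite3.Error):
    # The exception class differs between Python versions.
    single_rejected = True
rows_after_single = single.execute("SELECT COUNT(*) FROM items").fetchone()[0]

# Batch API: all three statements execute, including the DELETE.
batch = make_db()
batch.executescript(build_query("wiley", attack))
rows_after_batch = batch.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

A single-statement API limits this particular attack but does not remove the injection itself; the tautology attack from the previous example still works through execute().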
One traditional approach to preventing SQL injection attacks is to
handle them as an input validation problem and either accept only
characters from a whitelist of safe values or identify and escape a
blacklist of potentially malicious values. Whitelisting can be a very
effective means of enforcing strict input validation rules, but
parameterized SQL statements require less maintenance and can offer more
guarantees with respect to security. As is almost always the case,
blacklisting is riddled with loopholes that make it ineffective at
preventing SQL injection attacks. For example, attackers can:
Target fields that are not quoted
Find ways to bypass the need for certain escaped
meta-characters
Use stored procedures to hide the injected meta-characters.
Manually escaping characters in input to SQL queries can help, but it
will not make your application secure from SQL injection attacks.
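The first loophole, an unquoted field, can be shown with a short sketch. It is written in Python with the standard-library sqlite3 module; the messages table, the escape_quotes helper, and the function names are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, subject TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, "hello"), (2, "private"), (3, "payroll")],
)


def escape_quotes(value):
    # Hypothetical hand-rolled escaping: doubles single quotes only.
    return value.replace("'", "''")


def get_message_escaped(message_id):
    # The id field is numeric and unquoted, so the attacker never needs
    # a quote character and the escaping step has nothing to act on.
    query = "SELECT id, subject FROM messages WHERE id = " + escape_quotes(message_id)
    return conn.execute(query).fetchall()


def get_message_bound(message_id):
    # Bound parameter: the input is always treated as a single value.
    return conn.execute(
        "SELECT id, subject FROM messages WHERE id = ?", (message_id,)
    ).fetchall()


attack = "1 OR 1=1"
escaped_result = get_message_escaped(attack)  # all three rows
bound_result = get_message_bound(attack)      # no rows
```

The attack string contains no quote at all, so quote escaping passes it through unchanged.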
Another solution commonly proposed for dealing with SQL injection
attacks is to use stored procedures. Although stored procedures prevent
some types of SQL injection attacks, they do not protect against many
others. For example, the following PL/SQL procedure is vulnerable to the
same SQL injection attack shown in Example 2.
(Bad Code)
procedure get_item (itm_cv IN OUT ItmCurTyp, usr in varchar2, itm in varchar2)
is
begin
open itm_cv for
'SELECT * FROM items WHERE owner = ''' || usr ||
''' AND itemname = ''' || itm || '''';
end get_item;
Stored procedures typically help prevent SQL injection attacks by
limiting the types of statements that can be passed to their parameters.
However, there are many ways around the limitations and many interesting
statements that can still be passed to stored procedures. Again, stored
procedures can prevent some exploits, but they will not make your
application secure against SQL injection attacks.
Example 4
Microsoft SQL Server provides a built-in extended stored procedure,
xp_cmdshell, that enables shell command execution. An SQL injection in
such a context could be disastrous. For example, a query of the form:
(Bad Code)
SELECT ITEM,PRICE FROM PRODUCT WHERE ITEM_CATEGORY='$user_input'
ORDER BY PRICE
where $user_input is taken from an untrusted source.
If the user provides the string:
(Attack)
'; exec master..xp_cmdshell 'dir' --
The query will take the following form:
(Attack)
SELECT ITEM,PRICE FROM PRODUCT WHERE ITEM_CATEGORY=''; exec
master..xp_cmdshell 'dir' --' ORDER BY PRICE
Now, this query can be broken down into:
a first SQL query: SELECT ITEM,PRICE FROM PRODUCT WHERE
ITEM_CATEGORY='';
a second SQL query, which executes the dir command in the shell:
exec master..xp_cmdshell 'dir'
an MS SQL comment: --' ORDER BY PRICE
As can be seen, the malicious input changes the semantics of the query
into a query, a shell command execution and a comment.
Example 5
This code intends to print a message summary given the message
ID.
(Bad Code)
Example Language: PHP
$id = $_COOKIE["mid"];
mysql_query("SELECT MessageID, Subject FROM messages WHERE
MessageID = '$id'");
The programmer may have skipped any input validation on $id under the
assumption that attackers cannot modify the cookie. However, this is
easy to do with custom client code or even in the web browser.
While $id is wrapped in single quotes in the call to mysql_query(), an
attacker could simply change the incoming mid cookie to:
(Attack)
1432' or '1' = '1
This would produce the resulting query:
(Result)
SELECT MessageID, Subject FROM messages WHERE MessageID = '1432'
or '1' = '1'
Not only will this retrieve message number 1432, it will retrieve all
other messages.
In this case, the programmer could apply a simple modification to the
code to eliminate the SQL injection:
(Good Code)
Example Language: PHP
$id = intval($_COOKIE["mid"]);
mysql_query("SELECT MessageID, Subject FROM messages WHERE
MessageID = '$id'");
However, if this code is intended to support multiple users with different message boxes, the code might also need an access control check (CWE-285) to ensure that the application user has the permission to see that message.
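A combined version of both steps, type conversion plus an ownership check, might look like the following Python sketch using the standard-library sqlite3 module. The Owner column, the sample rows, and the get_summary function are assumptions for illustration and are not part of the PHP example. Note that Python's int() rejects a string such as 1432' or '1' = '1 outright, whereas PHP's intval() would truncate it to 1432.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The Owner column is a hypothetical addition used for the access check.
conn.execute("CREATE TABLE messages (MessageID INTEGER, Owner TEXT, Subject TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1432, "alice", "Budget"), (1433, "bob", "Layoffs")],
)


def get_summary(cookie_value, current_user):
    # Step 1: enforce the expected type; anything non-numeric is rejected.
    try:
        message_id = int(cookie_value)
    except (TypeError, ValueError):
        return None
    # Step 2: bind the values and restrict the result to the caller's messages.
    return conn.execute(
        "SELECT MessageID, Subject FROM messages WHERE MessageID = ? AND Owner = ?",
        (message_id, current_user),
    ).fetchone()
```

Here get_summary("1432", "alice") returns the message, while the tampered cookie value and a request by a different user both return None.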
Example 6
This example attempts to take a last name provided by a user and
enter it into a database.
(Bad Code)
Example Language: Perl
$userKey = getUserID();
$name = getUserInput();
# ensure only letters, hyphens and apostrophes are allowed
$name = whiteList($name, "^a-zA-Z'-$");
$query = "INSERT INTO last_names VALUES('$userKey', '$name')";
While the programmer applies a whitelist to the user input, it has
shortcomings. First of all, the user is still allowed to provide hyphens
which are used as comment structures in SQL. If a user specifies -- then
the remainder of the statement will be treated as a comment, which may
bypass security logic. Furthermore, the whitelist permits the apostrophe
which is also a data / command separator in SQL. If a user supplies a
name with an apostrophe, they may be able to alter the structure of the
whole statement and even change control flow of the program, possibly
accessing or modifying confidential information. In this situation, both
the hyphen and apostrophe are legitimate characters for a last name and
permitting them is required. Instead, a programmer may want to use a
prepared statement or apply an encoding routine to the input to prevent
any data / directive misinterpretations.
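The prepared-statement approach can be sketched as follows, in Python with the standard-library sqlite3 module rather than Perl. The table layout and the save_last_name function are assumptions for illustration. Names containing apostrophes and hyphens are stored unchanged, and a hostile value is stored as plain text instead of being executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE last_names (user_key TEXT, name TEXT)")


def save_last_name(user_key, name):
    # Both values are bound as parameters, so apostrophes and hyphens
    # in the name cannot change the structure of the INSERT statement.
    conn.execute("INSERT INTO last_names VALUES (?, ?)", (user_key, name))
    conn.commit()


save_last_name("u1", "O'Reilly-Smith")
save_last_name("u2", "x'); DROP TABLE last_names; --")

stored = conn.execute(
    "SELECT name FROM last_names ORDER BY user_key"
).fetchall()
```

Both rows are present afterwards and the table still exists, because the second value was never interpreted as SQL.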
Potential Mitigations
Phase: Architecture and Design
Strategy: Libraries or Frameworks
Use a vetted library or framework that does not allow this weakness to
occur or provides constructs that make this weakness easier to
avoid.
For example, consider using persistence layers such as Hibernate or
Enterprise Java Beans, which can provide significant protection against
SQL injection if used properly.
Phase: Architecture and Design
Strategy: Parameterization
If available, use structured mechanisms that automatically enforce the
separation between data and code. These mechanisms may be able to
provide the relevant quoting, encoding, and validation automatically,
instead of relying on the developer to provide this capability at every
point where output is generated.
Process SQL queries using prepared statements, parameterized queries, or stored procedures. These features should accept parameters or variables and support strong typing. Do not dynamically construct and execute query strings within these features using "exec" or similar functionality, since this may re-introduce the possibility of SQL injection. [R.89.3]
Phases: Architecture and Design; Operation
Strategy: Environment Hardening
Run your code using the lowest privileges that are required to accomplish the necessary tasks [R.89.12]. If possible, create isolated accounts with limited privileges that are only used for a single task. That way, a successful attack will not immediately give the attacker access to the rest of the software or its environment. For example, database applications rarely need to run as the database administrator, especially in day-to-day operations.
Specifically, follow the principle of least privilege when creating
user accounts to a SQL database. The database users should only have the
minimum privileges necessary to use their account. If the requirements
of the system indicate that a user can read and modify their own data,
then limit their privileges so they cannot read/write others' data. Use
the strictest permissions possible on all database objects, such as
execute-only for stored procedures.
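As one small illustration of limiting what a connection can do, SQLite allows a database file to be opened read-only. The sketch below uses Python's standard-library sqlite3 module with a temporary file; the file name and table are made up. Server databases would normally express the same idea through per-account privileges, so this is an analogy under stated assumptions, not a general recipe.

```python
import sqlite3
import tempfile
from pathlib import Path

# Create a small database file with an administrative connection.
db_path = Path(tempfile.mkdtemp()) / "app.db"
admin = sqlite3.connect(str(db_path))
admin.execute("CREATE TABLE items (owner TEXT, itemname TEXT)")
admin.execute("INSERT INTO items VALUES ('wiley', 'name')")
admin.commit()
admin.close()

# The application path that only needs to read opens the file read-only.
reader = sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)
try:
    reader.execute("DELETE FROM items")  # e.g. an injected statement
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

remaining = reader.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Even if an attacker manages to inject a DELETE through this connection, the database rejects it and the data is unchanged.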
Phase: Architecture and Design
For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.
Phase: Implementation
Strategy: Output Encoding
While it is risky to use dynamically-generated query strings, code, or commands that mix control and data together, sometimes it may be unavoidable. Properly quote arguments and escape any special characters within those arguments. The most conservative approach is to escape or filter all characters that do not pass an extremely strict whitelist (such as everything that is not alphanumeric or white space). If some special characters are still needed, such as white space, wrap each argument in quotes after the escaping/filtering step. Be careful of argument injection (CWE-88).
Instead of building a new implementation, such features may be
available in the database or programming language. For example, the
Oracle DBMS_ASSERT package can check or enforce that parameters have
certain properties that make them less vulnerable to SQL injection. For
MySQL, the mysql_real_escape_string() API function is available in both
C and PHP.
Phase: Implementation
Strategy: Input Validation
Assume all input is malicious. Use an "accept known good" input
validation strategy, i.e., use a whitelist of acceptable inputs that
strictly conform to specifications. Reject any input that does not
strictly conform to specifications, or transform it into something that
does.
When performing input validation, consider all potentially relevant
properties, including length, type of input, the full range of
acceptable values, missing or extra inputs, syntax, consistency across
related fields, and conformance to business rules. As an example of
business rule logic, "boat" may be syntactically valid because it only
contains alphanumeric characters, but it is not valid if the input is
only expected to contain colors such as "red" or "blue."
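The difference between a syntax check and a business-rule check can be sketched in Python as follows. The allowed color set, the length limit, and the function names are hypothetical values chosen for illustration.

```python
import re

# Hypothetical specification: a color is a short lowercase word
# taken from a fixed set of known values.
ALLOWED_COLORS = frozenset({"red", "blue", "green"})
SYNTAX = re.compile(r"[a-z]{1,16}")


def is_syntactically_valid(value):
    # Checks type, length, and character set only.
    return isinstance(value, str) and SYNTAX.fullmatch(value) is not None


def is_valid_color(value):
    # Adds the business rule: the value must be a known color.
    return is_syntactically_valid(value) and value in ALLOWED_COLORS
```

The input "boat" passes is_syntactically_valid but fails is_valid_color, while an input containing quote characters fails both checks.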
Do not rely exclusively on looking for malicious or malformed inputs
(i.e., do not rely on a blacklist). A blacklist is likely to miss at
least one undesirable input, especially if the code's environment
changes. This can give attackers enough room to bypass the intended
validation. However, blacklists can be useful for detecting potential
attacks or determining which inputs are so malformed that they should be
rejected outright.
When constructing SQL query strings, use stringent whitelists that
limit the character set based on the expected value of the parameter in
the request. This will indirectly limit the scope of an attack, but this
technique is less important than proper output encoding and
escaping.
Note that proper output encoding, escaping, and quoting is the most
effective solution for preventing SQL injection, although input
validation may provide some defense-in-depth because it
limits what will appear in output. Input validation will not
always prevent SQL injection, especially if you are required to support
free-form text fields that could contain arbitrary characters. For
example, the name "O'Reilly" would likely pass the validation step,
since it is a common last name in the English language. However, it
cannot be directly inserted into the database because it contains the
"'" apostrophe character, which would need to be escaped or otherwise
handled. In this case, stripping the apostrophe might reduce the risk of
SQL injection, but it would produce incorrect behavior because the wrong
name would be recorded.
When feasible, it may be safest to disallow meta-characters entirely,
instead of escaping them. This will provide some defense in depth. After
the data is entered into the database, later processes may neglect to
escape meta-characters before use, and you may not have control over
those processes.
Phase: Architecture and Design
Strategy: Enforcement by Conversion
When the set of acceptable objects, such as filenames or URLs, is
limited or known, create a mapping from a set of fixed input values
(such as numeric IDs) to the actual filenames or URLs, and reject all
other inputs.
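In an SQL setting, the same idea applies to parts of a query that cannot be bound as parameters, such as a column name in an ORDER BY clause. The following Python sketch uses the standard-library sqlite3 module; the SORT_COLUMNS mapping, the table, and the list_items function are assumptions made for illustration.

```python
import sqlite3

# Fixed mapping from client-visible IDs to column names (hypothetical).
SORT_COLUMNS = {"1": "itemname", "2": "price"}


def list_items(conn, sort_id):
    # Only values from the fixed mapping ever reach the SQL text.
    column = SORT_COLUMNS.get(sort_id)
    if column is None:
        raise ValueError("unsupported sort option")
    return conn.execute(
        "SELECT itemname, price FROM items ORDER BY " + column
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (itemname TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)", [("widget", 9.5), ("anvil", 20.0)]
)
by_name = list_items(conn, "1")
by_price = list_items(conn, "2")
```

Any sort_id outside the mapping raises ValueError before a query is built, so attacker-supplied text never becomes part of the statement.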
Phase: Implementation
Ensure that error messages only contain minimal details that are
useful to the intended audience, and nobody else. The messages need to
strike the balance between being too cryptic and not being cryptic
enough. They should not necessarily reveal the methods that were used to
determine the error. Such detailed information can be used to refine the
original attack to increase the chances of success.
If errors must be tracked in some detail, capture them in log messages
- but consider what could occur if the log messages can be viewed by
attackers. Avoid recording highly sensitive information such as
passwords in any form. Avoid inconsistent messaging that might
accidentally tip off an attacker about internal state, such as whether a
username is valid or not.
In the context of SQL Injection, error messages revealing the
structure of a SQL query can help attackers tailor successful attack
strings.
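One way to separate what the user sees from what is logged is sketched below in Python, using the standard-library sqlite3 and logging modules. The logger name, the response format, and the lookup_item function are assumptions for illustration.

```python
import logging
import sqlite3

logger = logging.getLogger("app.db")  # hypothetical logger name


def lookup_item(conn, item_name):
    try:
        rows = conn.execute(
            "SELECT itemname FROM items WHERE itemname = ?", (item_name,)
        ).fetchall()
        return {"ok": True, "items": [row[0] for row in rows]}
    except sqlite3.Error:
        # Full details, including the database error text, go to the log.
        logger.exception("item lookup failed")
        # The caller receives a generic message with no query structure.
        return {"ok": False, "error": "The request could not be completed."}
```

If the query fails, for example because the items table is missing, the response contains only the generic message, while the database error is recorded in the log for the operators.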
Phase: Operation
Strategy: Firewall
Use an application firewall that can detect attacks against this
weakness. It can be beneficial in cases in which the code cannot be
fixed (because it is controlled by a third party), as an emergency
prevention measure while more comprehensive software assurance measures
are applied, or to provide defense in depth.
Effectiveness: Moderate
An application firewall might not cover all possible input vectors. In
addition, attack techniques might be available to bypass the protection
mechanism, such as using malformed inputs that can still be processed by
the component that receives those inputs. Depending on functionality, an
application firewall might inadvertently reject or modify legitimate
requests. Finally, some manual effort may be required for
customization.
Phases: Operation; Implementation
Strategy: Environment Hardening
When using PHP, configure the application so that it does not use register_globals. During implementation, develop the application so that it does not rely on this feature, but be wary of implementing a register_globals emulation that is subject to weaknesses such as CWE-95, CWE-621, and similar issues.
SQL injection can be resultant from special character mismanagement, MAID,
or blacklist/whitelist problems. It can be primary to authentication
errors.
2. end statement that performs an SQL command where
a. the input is part of the SQL command and
b. input contains SQL syntax (esp. query separator)
References
[R.89.1] [REF-17] Michael Howard, David LeBlanc and John Viega. "24 Deadly Sins of Software Security". "Sin 1: SQL Injection." Page 3. McGraw-Hill. 2010.
[R.89.2] [REF-11] M. Howard and D. LeBlanc. "Writing Secure Code". Chapter 12, "Database Input Issues", Page 397. 2nd Edition. Microsoft. 2002.
[R.89.6] David Litchfield, Chris Anley, John Heasman and Bill Grindlay. "The Database Hacker's Handbook: Defending Database Servers". Wiley. 2005-07-14.
[R.89.7] David Litchfield. "The Oracle Hacker's Handbook: Hacking and Defending Oracle". Wiley. 2007-01-30.
[R.89.13] [REF-7] Mark Dowd, John McDonald and Justin Schuh. "The Art of Software Security Assessment". Chapter 8, "SQL Queries", Page 431. 1st Edition. Addison Wesley. 2006.
[REF-7] Mark Dowd, John McDonald and Justin Schuh. "The Art of Software Security Assessment". Chapter 17, "SQL Injection", Page 1061. 1st Edition. Addison Wesley. 2006.