CWE-362: Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition')
Weakness ID: 362 (Weakness Class)
Status: Draft
Description
Description Summary
The program contains a code sequence that can run concurrently with other code, and the code sequence requires temporary, exclusive access to a shared resource, but a timing window exists in which the shared resource can be modified by another code sequence that is operating concurrently.
Extended Description
This can have security implications when the expected synchronization is in security-critical code, such as recording whether a user is authenticated or modifying important state information that should not be influenced by an outsider.
A race condition occurs within concurrent environments, and is effectively a property of a code sequence. Depending on the context, a code sequence may be in the form of a function call, a small number of instructions, a series of program invocations, etc.
A race condition violates these properties, which are closely related:
Exclusivity - the code sequence is given exclusive access to the shared resource, i.e., no other code sequence can modify properties of the shared resource before the original sequence has completed execution.
Atomicity - the code sequence is behaviorally atomic, i.e., no other thread or process can concurrently execute the same sequence of instructions (or a subset) against the same resource.
A race condition exists when an "interfering code sequence" can still access the shared resource, violating exclusivity. Programmers may assume that certain code sequences execute too quickly to be affected by an interfering code sequence; when that assumption does not hold, atomicity is violated. For example, the single "x++" statement may appear atomic at the code layer, but it is actually non-atomic at the instruction layer, since it involves a read (the original value of x), followed by a computation (x+1), followed by a write (save the result to x).
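The read, compute, write decomposition of "x++" can be made concrete with a small simulation. The C sketch below is illustrative only: instead of real threads it replays the three steps by hand for two logical threads, so the outcome is repeatable. All names (struct increment, step_read, step_compute, step_write, run_sequential, run_interleaved) are hypothetical.

```c
#include <assert.h>

/* One logical thread executing x++ as three separate steps. */
struct increment {
    int tmp;    /* private "register" holding the value read from x */
};

static void step_read(struct increment *t, const int *x)   { t->tmp = *x; }
static void step_compute(struct increment *t)              { t->tmp += 1; }
static void step_write(const struct increment *t, int *x)  { *x = t->tmp; }

/* A finishes all three steps before B starts: both updates survive. */
int run_sequential(void) {
    int x = 0;
    struct increment a, b;
    step_read(&a, &x); step_compute(&a); step_write(&a, &x);
    step_read(&b, &x); step_compute(&b); step_write(&b, &x);
    return x;
}

/* B reads x before A has written its result back: one update is lost. */
int run_interleaved(void) {
    int x = 0;
    struct increment a, b;
    step_read(&a, &x);          /* A reads 0 */
    step_read(&b, &x);          /* B also reads 0 */
    assert(a.tmp == b.tmp);     /* both work from the same original value */
    step_compute(&a);
    step_compute(&b);
    step_write(&a, &x);         /* x becomes 1 */
    step_write(&b, &x);         /* x is set to 1 again, overwriting A's update */
    return x;
}
```

Under these assumptions, run_sequential() returns 2 and run_interleaved() returns 1, which is the lost update that an interfering code sequence can cause on real hardware.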
The interfering code sequence could be "trusted" or "untrusted." A trusted interfering code sequence occurs within the program; it cannot be modified by the attacker, and it can only be invoked indirectly. An untrusted interfering code sequence can be authored directly by the attacker, and typically it is external to the vulnerable program.
Time of Introduction
Architecture and Design
Implementation
Applicable Platforms
Languages
C: (Sometimes)
C++: (Sometimes)
Java: (Sometimes)
Language-independent
Architectural Paradigms
Concurrent Systems Operating on Shared Resources: (Often)
Common Consequences
When a race condition makes it possible to bypass a resource cleanup routine or trigger multiple initialization routines, it may lead to resource exhaustion (CWE-400).
When a race condition allows multiple control flows to access a
resource simultaneously, it might lead the program(s) into unexpected
states, possibly resulting in a crash.
Confidentiality
Integrity
Technical Impact: Read files or directories; Read application data
When a race condition is combined with predictable resource names and loose permissions, it may be possible for an attacker to overwrite or access confidential data (CWE-59).
Likelihood of Exploit
Medium
Detection Methods
Black Box
Black box methods may be able to identify evidence of race conditions
via methods such as multiple simultaneous connections, which may cause
the software to become unstable or crash. However, race conditions with
very narrow timing windows may go undetected.
White Box
Common idioms are detectable in white box analysis, such as time-of-check-time-of-use (TOCTOU) file operations (CWE-367), or double-checked locking (CWE-609).
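As an illustration of the kind of idiom a reviewer might look for, the following C sketch contrasts a check-then-use file access with a variant that opens first and then inspects the opened descriptor. It is a minimal sketch assuming a POSIX system; the function names open_checked_then_used and open_then_verify are hypothetical, and the second function narrows the window rather than covering every file-related race.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Racy idiom: the path is checked by name and then opened by name.
 * Another process can replace the file (for example with a symlink)
 * between the two calls. Returns a descriptor, or -1 on failure. */
int open_checked_then_used(const char *path) {
    assert(path != NULL);
    if (access(path, R_OK) != 0)        /* time of check */
        return -1;
    return open(path, O_RDONLY);        /* time of use: path may differ now */
}

/* Narrower alternative: open once, refuse a symlink as the final path
 * component, then examine the object that was actually opened. */
int open_then_verify(const char *path) {
    struct stat st;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY | O_NOFOLLOW);
    if (fd < 0)
        return -1;
    /* fstat() describes the open descriptor, not whatever the name
     * refers to at this moment. */
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The design point is that every decision in open_then_verify() is made about the descriptor returned by a single open() call, so a later change to the path name cannot affect what is read.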
Automated Dynamic Analysis
This weakness can be detected using dynamic tools and techniques that
interact with the software using large test suites with many diverse
inputs, such as fuzz testing (fuzzing), robustness testing, and fault
injection. The software's operation may slow down, but it should not
become unstable, crash, or generate incorrect results.
Race conditions may be detected with a stress test that calls the
software simultaneously from a large number of threads or processes and
looks for evidence of any unexpected behavior.
Insert breakpoints or delays in between relevant code statements to
artificially expand the race window so that it will be easier to
detect.
Effectiveness: Moderate
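The stress-test and delay-insertion techniques described above can be combined in a small harness. The C sketch below is illustrative and assumes POSIX threads and C11 atomics; the ticket counter, the 20 ms delay, and the names widen_window, buy, and stress are all hypothetical. Each individual operation is atomic, so the only defect being exercised is the gap between the check and the update.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define WORKERS 8

static atomic_int tickets;      /* shared resource: tickets left */
static atomic_int sold;         /* number of sales recorded */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Artificial delay that expands the race window. */
static void widen_window(void) {
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 20 * 1000 * 1000 };
    nanosleep(&delay, NULL);
}

static void *buy(void *arg) {
    int locked = *(const int *)arg;
    if (locked)
        pthread_mutex_lock(&lock);
    if (atomic_load(&tickets) > 0) {        /* check */
        widen_window();                     /* delay between check and act */
        atomic_fetch_sub(&tickets, 1);      /* act */
        atomic_fetch_add(&sold, 1);
    }
    if (locked)
        pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts WORKERS threads that compete for a single ticket and returns
 * how many sales were recorded. locked != 0 selects the variant that
 * holds the mutex across the check and the update. */
int stress(int locked) {
    pthread_t workers[WORKERS];
    atomic_store(&tickets, 1);
    atomic_store(&sold, 0);
    for (int i = 0; i < WORKERS; i++) {
        int rc = pthread_create(&workers[i], NULL, buy, &locked);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < WORKERS; i++)
        pthread_join(workers[i], NULL);
    return atomic_load(&sold);
}
```

With the mutex held across the check and the update, stress(1) records exactly one sale. The result of stress(0) varies from run to run and is likely to exceed one, which is the unexpected behavior the test is looking for.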
Demonstrative Examples
Example 1
This code could be used in an e-commerce application that supports
transfers between accounts. It takes the total amount of the transfer, sends
it to the new account, and deducts the amount from the original
account.
(Bad Code)
Example Language: Perl
$transfer_amount = GetTransferAmount();
$balance = GetBalanceFromDatabase();    # read: the race window opens here
if ($transfer_amount < 0) {
    FatalError("Bad Transfer Amount");
}
$newbalance = $balance - $transfer_amount;
if ($newbalance < 0) {
    FatalError("Insufficient Funds");
}
SendNewBalanceToDatabase($newbalance);  # write: the race window closes here
NotifyUser("Transfer of $transfer_amount succeeded.");
NotifyUser("New balance: $newbalance");
A race condition could occur between the calls to
GetBalanceFromDatabase() and SendNewBalanceToDatabase().
Suppose the balance is initially 100.00. An attack could be
constructed as follows:
(Attack)
Example Language: PseudoCode
The attacker makes two simultaneous calls of the program, CALLER-1 and CALLER-2. Both callers are for the same user account.
CALLER-1 (the attacker) is associated with PROGRAM-1 (the instance that handles CALLER-1). CALLER-2 is associated with PROGRAM-2.
CALLER-1 makes a transfer request of 80.00.
PROGRAM-1 calls GetBalanceFromDatabase() and sets $balance to 100.00.
PROGRAM-1 calculates $newbalance as 20.00, then calls SendNewBalanceToDatabase().
Due to high server load, the PROGRAM-1 call to SendNewBalanceToDatabase() encounters a delay.
CALLER-2 makes a transfer request of 1.00.
PROGRAM-2 calls GetBalanceFromDatabase() and sets $balance to 100.00. This happens because the PROGRAM-1 update has not been processed yet.
PROGRAM-2 determines the new balance as 99.00.
After the initial delay, PROGRAM-1 commits its balance to the database, setting it to 20.00.
PROGRAM-2 sends a request to update the database, setting the balance to 99.00.
At this stage, the attacker should have a balance of 19.00 (due to
81.00 worth of transfers), but the balance is 99.00, as recorded in the
database.
To prevent this weakness, the programmer has several options,
including using a lock to prevent multiple simultaneous requests to the
web application, or using a synchronization mechanism that includes all
the code between GetBalanceFromDatabase() and
SendNewBalanceToDatabase().
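The second option, a synchronization mechanism that covers everything between the read and the write, might look like the following C sketch. It is a simplified illustration, not a port of the Perl code: the database is replaced by a hypothetical in-memory variable (balance_cents, with amounts held in cents), and the names transfer, transfer_thread, and replay_attack are assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical in-memory stand-in for the account row; amounts in cents. */
static long balance_cents = 10000;
static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 on success, -1 if the amount is invalid, the lock cannot be
 * taken, or the funds are insufficient. */
int transfer(long amount_cents) {
    int status = -1;
    long current;

    if (amount_cents < 0)
        return -1;
    if (pthread_mutex_lock(&balance_lock) != 0)
        return -1;
    /* Critical section: the read, the check, and the write form one unit. */
    current = balance_cents;                    /* GetBalanceFromDatabase() */
    if (current - amount_cents >= 0) {
        balance_cents = current - amount_cents; /* SendNewBalanceToDatabase() */
        status = 0;
    }
    pthread_mutex_unlock(&balance_lock);
    return status;
}

static void *transfer_thread(void *arg) {
    transfer(*(const long *)arg);
    return NULL;
}

/* Replays the attack: simultaneous transfers of 80.00 and 1.00 against a
 * balance of 100.00. Returns the final balance in cents. */
long replay_attack(void) {
    pthread_t t1, t2;
    long first = 8000, second = 100;
    int rc1, rc2;

    balance_cents = 10000;
    rc1 = pthread_create(&t1, NULL, transfer_thread, &first);
    rc2 = pthread_create(&t2, NULL, transfer_thread, &second);
    assert(rc1 == 0 && rc2 == 0);
    (void)rc1; (void)rc2;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return balance_cents;
}
```

Whichever transfer takes the lock first, the other sees the updated balance, so the final balance is 19.00. A mutex of this kind only protects threads inside one process; when separate program instances share a database, the corresponding tool would be a database transaction or row lock covering the same read-check-write span.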
Example 2
The following function attempts to acquire a lock in order to
perform operations on a shared resource.
(Bad Code)
Example Language: C
void f(pthread_mutex_t *mutex) {
    pthread_mutex_lock(mutex);
    /* access shared resource */
    pthread_mutex_unlock(mutex);
}
However, the code does not check the value returned by
pthread_mutex_lock() for errors. If pthread_mutex_lock() cannot acquire
the mutex for any reason, the function may introduce a race condition
into the program and result in undefined behavior.
In order to avoid data races, correctly written programs must check
the result of thread synchronization functions and appropriately handle
all errors, either by attempting to recover from them or by reporting
them to higher levels.
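One way to apply this advice is to return the error to the caller and leave the shared resource untouched when the lock was not acquired. The C sketch below is a minimal illustration; the name f_checked and the shared_counter variable standing in for the shared resource are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static int shared_counter;      /* hypothetical shared resource */

/* Returns 0 on success, or the error number reported by the pthreads
 * call that failed. */
int f_checked(pthread_mutex_t *mutex) {
    int result;

    assert(mutex != NULL);
    result = pthread_mutex_lock(mutex);
    if (result != 0)
        return result;          /* lock not held: do not touch the resource */

    shared_counter++;           /* access shared resource */

    return pthread_mutex_unlock(mutex);
}
```

A caller can then decide whether to retry, abort the operation, or report the failure, instead of continuing as though the critical section had been protected.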
chain: time-of-check time-of-use (TOCTOU) race
condition in program allows bypass of protection mechanism that was designed
to prevent symlink attacks.
chain: race condition might allow resource to be
released before operating on it, leading to NULL dereference
Potential Mitigations
Phase: Architecture and Design
In languages that support it, use synchronization primitives. Only
wrap these around critical code to minimize the impact on
performance.
Phase: Architecture and Design
Use thread-safe capabilities such as the data access abstraction in
Spring.
Phase: Architecture and Design
Minimize the usage of shared resources in order to remove as much
complexity as possible from the control flow and to reduce the
likelihood of unexpected conditions occurring.
Additionally, this will minimize the amount of synchronization necessary and may even help to reduce the likelihood of a denial of service where an attacker may be able to repeatedly trigger a critical section (CWE-400).
Phase: Implementation
When using multithreading and operating on shared variables, only use
thread-safe functions.
Phase: Implementation
Use atomic operations on shared variables. Be wary of innocent-looking
constructs such as "x++". This may appear atomic at the code layer, but
it is actually non-atomic at the instruction layer, since it involves a
read, followed by a computation, followed by a write.
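In C11, an atomic read-modify-write can replace the plain increment. The following sketch is illustrative and assumes a toolchain that provides <stdatomic.h> and POSIX threads; the counter name hits, the thread and iteration counts, and the function names worker and count_hits are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define THREADS 4
#define INCREMENTS 100000

static atomic_long hits;        /* shared counter */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        atomic_fetch_add(&hits, 1);     /* indivisible read-modify-write */
    return NULL;
}

/* Runs THREADS workers concurrently and returns the final count. */
long count_hits(void) {
    pthread_t threads[THREADS];

    atomic_store(&hits, 0);
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&hits);
}
```

Because each atomic_fetch_add() completes as a single step, no increment is lost and the total equals THREADS times INCREMENTS. An atomic variable protects one operation on one variable; a sequence of several atomic operations is still not atomic as a whole.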
Phase: Implementation
Use a mutex if available, but be sure to avoid related weaknesses such as an unrestricted externally accessible lock (CWE-412).
Phase: Implementation
Avoid double-checked locking (CWE-609) and other implementation errors that arise when trying to avoid the overhead of synchronization.
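For one-time initialization, which is where double-checked locking is usually attempted, POSIX offers pthread_once() as a ready-made alternative. The sketch below is a minimal illustration assuming POSIX threads; config_value, init_config, get_config, reader, and run_readers are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t config_once = PTHREAD_ONCE_INIT;
static int init_calls;          /* counts how often the initializer ran */
static int config_value;        /* hypothetical lazily initialized resource */

static void init_config(void) {
    init_calls++;
    config_value = 42;
}

/* Callable from any thread: pthread_once() runs init_config() exactly
 * once and makes later callers wait until it has finished. */
int get_config(void) {
    int rc = pthread_once(&config_once, init_config);
    assert(rc == 0);
    (void)rc;
    return config_value;
}

static void *reader(void *arg) {
    *(int *)arg = get_config();
    return NULL;
}

/* Calls get_config() from four threads; returns how often init ran. */
int run_readers(void) {
    pthread_t threads[4];
    int seen[4];

    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&threads[i], NULL, reader, &seen[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);
        assert(seen[i] == 42);
    }
    return init_calls;
}
```

The library takes care of the ordering that hand-written double-checked locking tends to get wrong, so the caller does not need an unsynchronized "already initialized" flag.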
Phase: Implementation
Disable interrupts or signals over critical parts of the code, but
also make sure that the code does not go into a large or infinite
loop.
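On POSIX systems, one way to keep a signal handler out of a critical section is to block the signal around it with pthread_sigmask(). The following C sketch is illustrative only; the choice of SIGUSR1 and the names on_usr1 and critical_section_demo are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal;    /* set by the handler */

static void on_usr1(int signo) {
    (void)signo;
    got_signal = 1;
}

/* Blocks SIGUSR1 around a short critical section. Reports whether the
 * handler had run inside the section and after it. Returns 0 on success. */
int critical_section_demo(int *seen_inside, int *seen_after) {
    sigset_t block, previous;
    struct sigaction sa;

    assert(seen_inside != NULL && seen_after != NULL);
    got_signal = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (pthread_sigmask(SIG_BLOCK, &block, &previous) != 0)
        return -1;

    /* Critical section: keep it short and free of unbounded loops. */
    raise(SIGUSR1);                 /* delivery is deferred while blocked */
    *seen_inside = got_signal;

    /* Restoring the old mask lets the pending signal be delivered. */
    if (pthread_sigmask(SIG_SETMASK, &previous, NULL) != 0)
        return -1;
    *seen_after = got_signal;
    return 0;
}
```

The signal raised inside the section stays pending until the mask is restored, assuming SIGUSR1 was not already blocked by the caller. Keeping the blocked region short matters because every signal that arrives in the meantime is delayed for its whole duration.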
Phase: Implementation
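Use the volatile type modifier for critical variables to avoid
unexpected compiler optimization or reordering. This does not solve the
synchronization problem by itself: in C and C++, volatile provides
neither atomicity nor ordering between threads, and in Java it
guarantees visibility of individual reads and writes but does not make
compound operations such as "x++" atomic.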
Phases: Architecture and Design; Operation
Strategy: Environment Hardening
Run your code using the lowest privileges that are required to accomplish the necessary tasks [R.362.11]. If possible, create isolated accounts with limited privileges that are only used for a single task. That way, a successful attack will not immediately give the attacker access to the rest of the software or its environment. For example, database applications rarely need to run as the database administrator, especially in day-to-day operations.
Race conditions in web applications are under-studied and probably
under-reported. However, as of 2008, interest in this area has been
growing.
Much of the focus of race condition research has been in Time-of-check Time-of-use (TOCTOU) variants (CWE-367), but many race conditions are related to synchronization problems that do not necessarily require a time-of-check.
Taxonomy Mappings
Mapped Taxonomy Name | Node ID | Fit | Mapped Node Name
PLOVER | | | Race Conditions
CERT C Secure Coding | FIO31-C | | Do not simultaneously open the same file multiple times
CERT Java Secure Coding | VNA03-J | | Do not assume that a group of calls to independently atomic methods is atomic
CERT C++ Secure Coding | FIO31-CPP | | Do not simultaneously open the same file multiple times
Leveraging Time-of-Check and Time-of-Use (TOCTOU) Race Conditions
References
[R.362.1] [REF-17] Michael Howard, David LeBlanc and John Viega. "24 Deadly Sins of Software Security". "Sin 13: Race Conditions." Page 205. McGraw-Hill. 2010.
[R.362.2] Andrei Alexandrescu. "volatile - Multithreaded Programmer's Best Friend". Dr. Dobb's. 2008-02-01. <http://www.ddj.com/cpp/184403766>.
The relationship between race conditions and synchronization problems (CWE-662) needs to be further developed. They are not necessarily two perspectives of the same core concept, since synchronization is only one technique for avoiding race conditions, and synchronization can be used for other purposes besides race condition prevention.