CWE

Common Weakness Enumeration

A Community-Developed List of Software & Hardware Weakness Types

CWE Individual Dictionary Definition (4.4)

CWE-1333: Inefficient Regular Expression Complexity

Weakness ID: 1333
Abstraction: Base
Structure: Simple
Status: Draft
+ Description
The product uses a regular expression with an inefficient, possibly exponential worst-case computational complexity that consumes excessive CPU cycles.
+ Extended Description
Some regular expression engines have a feature called "backtracking". If the token cannot match, the engine "backtracks" to a position that may result in a different token that can match. Backtracking becomes a weakness if all of these conditions are met:
  • The number of possible backtracking attempts is exponential relative to the length of the input.
  • The input can fail to match the regular expression.
  • The input can be long enough.

Attackers can craft inputs that intentionally cause the regular expression to backtrack excessively, causing CPU consumption to spike.
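The first condition above can be made concrete by counting how many ways a run of word characters can be divided among repetitions of a group such as (\w+)*. The sketch below is illustrative only: countSplits is a hypothetical helper, and it models the number of candidate splits rather than the exact work performed by any particular engine.

```javascript
// Count the distinct ways a run of n word characters can be split into
// consecutive non-empty chunks, i.e. the ways (\w+)* can consume the run.
// A backtracking engine may have to try each split before reporting failure.
function countSplits(n) {
  // ways[k] = number of ways to split the first k characters
  const ways = new Array(n + 1).fill(0);
  ways[0] = 1; // the empty prefix has exactly one (empty) split
  for (let k = 1; k <= n; k++) {
    // The last chunk may start at any earlier position j.
    for (let j = 0; j < k; j++) {
      ways[k] += ways[j];
    }
  }
  return ways[n];
}

for (const n of [5, 10, 20, 30]) {
  console.log(n, countSplits(n));
}
```

For n >= 1 the count is 2^(n-1), so each additional character doubles the number of candidate splits. This is why a modest increase in input length can turn a fast failure into a very slow one.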

+ Alternate Terms
ReDoS:
ReDoS is an abbreviation of "Regular expression Denial of Service".
Regular Expression Denial of Service:
While this term is attack-focused, this is commonly used to describe the weakness.
Catastrophic backtracking:
This term is used to describe the behavior of the regular expression as a negative technical impact.
+ Relationships

The table below shows the weaknesses and high-level categories that are related to this weakness. These relationships are defined as ChildOf, ParentOf, and MemberOf, and give insight into similar items that may exist at higher and lower levels of abstraction. In addition, relationships such as PeerOf and CanAlsoBe are defined to show similar weaknesses that the user may want to explore.

+ Relevant to the view "Research Concepts" (CWE-1000)
Nature | Type | ID | Name
ChildOf | Class | 185 | Incorrect Regular Expression
ChildOf | Class | 407 | Inefficient Algorithmic Complexity

Class: a weakness that is described in a very abstract fashion, typically independent of any specific language or technology. More specific than a Pillar Weakness, but more general than a Base Weakness. Class level weaknesses typically describe issues in terms of 1 or 2 of the following dimensions: behavior, property, and resource.
+ Modes Of Introduction

The different Modes of Introduction provide information about how and when this weakness may be introduced. The Phase identifies a point in the life cycle at which introduction may occur, while the Note provides a typical scenario related to introduction during the given phase.

Phase | Note
Implementation | A RegEx can be easy to create and read using unbounded matching characters, but the programmer might not consider the risk of excessive backtracking.
+ Applicable Platforms
The listings below show possible areas for which the given weakness could appear. These may be for specific named Languages, Operating Systems, Architectures, Paradigms, Technologies, or a class of such platforms. The platform is listed along with how frequently the given weakness appears for that instance.

Languages

Class: Language-Independent (Undetermined Prevalence)

+ Common Consequences

The table below specifies different individual consequences associated with the weakness. The Scope identifies the application security area that is violated, while the Impact describes the negative technical impact that arises if an adversary succeeds in exploiting this weakness. The Likelihood provides information about how likely the specific consequence is expected to be seen relative to the other consequences in the list. For example, there may be high likelihood that a weakness will be exploited to achieve a certain impact, but a low likelihood that it will be exploited to achieve a different impact.

Scope | Impact | Likelihood
Availability | Technical Impact: DoS: Resource Consumption (CPU) | High
+ Likelihood Of Exploit
High
+ Demonstrative Examples

Example 1

This example attempts to check if an input string is a "sentence" [REF-1164].

(bad code)
Example Language: JavaScript 
var test_string = "Bad characters: $@#";
var bad_pattern = /^(\w+\s?)*$/i;
var result = test_string.search(bad_pattern);

The regular expression has a vulnerable backtracking clause inside (\w+\s?)*$ which can be triggered to cause a Denial of Service by processing particular phrases.

To fix the backtracking problem, the word match is wrapped in a lookahead, (?=(\w+)), and then consumed with the backreference \2. Because the engine does not backtrack into a lookahead once it has matched, the word can no longer be split in multiple ways. The modified example is:

(good code)
Example Language: JavaScript 
var test_string = "Bad characters: $@#";
var good_pattern = /^((?=(\w+))\2\s?)*$/i;
var result = test_string.search(good_pattern);

Note that [REF-1164] has a more thorough (and lengthy) explanation of everything going on within the RegEx.
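As a quick sanity check, the two patterns from this example can be compared on a few short inputs to see that they return the same result. This is a minimal sketch: the sample strings and variable names are illustrative assumptions, and only short inputs are passed to the vulnerable pattern so that the check itself stays fast.

```javascript
// Patterns copied from the example above.
const badPattern = /^(\w+\s?)*$/i;
const goodPattern = /^((?=(\w+))\2\s?)*$/i;

// Short inputs only: long non-matching inputs would make badPattern very slow.
const samples = [
  "",
  "A good sentence",
  "trailing space ",
  "two  spaces",
  "Bad characters: $@#",
];

for (const s of samples) {
  const bad = s.search(badPattern);   // 0 on a match, -1 otherwise
  const good = s.search(goodPattern); // expected to agree with badPattern
  console.log(JSON.stringify(s), bad, good);
}

// A long run of word characters followed by a character that cannot match.
// Only the hardened pattern is exercised here.
const longInput = "a".repeat(5000) + "!";
console.log(longInput.search(goodPattern));
```

Agreement on a handful of samples does not prove the patterns are equivalent, but it is a cheap regression check to run whenever a pattern is rewritten to remove backtracking.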

Example 2

This example attempts to check if an input string is a "sentence" and is modified for Perl [REF-1164].

(bad code)
Example Language: Perl 
my $test_string = "Bad characters: \$\@\#";
# Store the outcome of the match (true or false) rather than a copy of the input.
my $bdrslt = ($test_string =~ /^(\w+\s?)*$/i);

The regular expression has a vulnerable backtracking clause inside (\w+\s?)*$ which can be triggered to cause a Denial of Service by processing particular phrases.

To fix the backtracking problem, the word match is wrapped in a lookahead, (?=(\w+)), and then consumed with the backreference \2. Because the engine does not backtrack into a lookahead once it has matched, the word can no longer be split in multiple ways. The modified example is:

(good code)
Example Language: Perl 
my $test_string = "Bad characters: \$\@\#";
# Store the outcome of the match (true or false) rather than a copy of the input.
my $gdrslt = ($test_string =~ /^((?=(\w+))\2\s?)*$/i);

Note that [REF-1164] has a more thorough (and lengthy) explanation of everything going on within the RegEx.

+ Observed Examples
Reference | Description
Markdown parser uses inefficient regex when processing a message, allowing users to cause CPU consumption and delay preventing processing of other messages.
Long string in a version control product allows DoS due to an inefficient regex.
JavaScript code allows ReDoS via a long string due to excessive backtracking.
ReDoS when parsing time.
ReDoS when parsing documents.
ReDoS when validating URL.
+ Potential Mitigations

Phase: Architecture and Design

Use regular expressions that cannot trigger excessive backtracking, e.g., by removing nested quantifiers.

Effectiveness: High

Note: This is one of the few effective solutions when using user-provided regular expressions.
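One way to remove the nested quantifier from the "sentence" pattern in the Demonstrative Examples is sketched below. The rewrite is an assumption for illustration and does not come from this entry or its references: flatPattern requires every repeated group to end in a whitespace character, so a run of word characters can be divided in only one way.

```javascript
// Hypothetical rewrite (not from the CWE entry): every repeated group must end
// in a whitespace character, so a run of word characters can be split only one way.
const flatPattern = /^(?:\w+\s)*\w*$/i;

// Original vulnerable pattern, kept here only for comparison on short inputs.
const nestedPattern = /^(\w+\s?)*$/i;

const shortInputs = [
  "",
  "one",
  "one two",
  "one two ",
  "one  two",
  " one",
  "Bad characters: $@#",
];

for (const s of shortInputs) {
  console.log(JSON.stringify(s), nestedPattern.test(s), flatPattern.test(s));
}

// Long non-matching input: only the rewritten pattern is exercised.
console.log(flatPattern.test("a".repeat(5000) + "!"));
```

Because \w and \s never match the same character, backing off inside \w+ fails immediately, so the failure path stays roughly proportional to the input length instead of doubling with each character.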

Phase: System Configuration

Configure backtracking limits in the configuration of the regular expression implementation, such as PHP's pcre.backtrack_limit. Also consider limits on execution time for the process.

Effectiveness: Moderate

Phase: Implementation

Do not use regular expressions with untrusted input. If regular expressions must be used, avoid using backtracking in the expression.

Effectiveness: High
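Where a check is simple enough, it can be written without a regular expression at all. The sketch below is a hypothetical hand-written validator for the "sentence" check used in the Demonstrative Examples. It assumes ASCII whitespace only, which is narrower than the \s class, and the name isSentence is an illustration rather than part of any library.

```javascript
// Hypothetical helper: accepts the same strings as /^(\w+\s?)*$/ when whitespace
// is limited to ASCII (space, tab, LF, VT, FF, CR); the regex \s class is broader.
function isSentence(input) {
  let prevWasWord = false; // whitespace is only legal directly after a word character
  for (let i = 0; i < input.length; i++) {
    const c = input.charCodeAt(i);
    const isWord =
      (c >= 48 && c <= 57) ||  // 0-9
      (c >= 65 && c <= 90) ||  // A-Z
      (c >= 97 && c <= 122) || // a-z
      c === 95;                // _
    const isSpace = c === 32 || (c >= 9 && c <= 13);
    if (isWord) {
      prevWasWord = true;
    } else if (isSpace && prevWasWord) {
      prevWasWord = false;
    } else {
      return false; // illegal character, leading whitespace, or doubled whitespace
    }
  }
  return true;
}

console.log(isSentence("A good sentence"));
console.log(isSentence("Bad characters: $@#"));
```

The loop visits each character once and never revisits earlier positions, so the running time is linear in the input length regardless of content.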

Phase: Implementation

Limit the length of the input that the regular expression will process.

Effectiveness: Moderate
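A length cap can be applied before the input ever reaches the regular expression engine. The following is a minimal sketch: matchWithLengthLimit, the example pattern, and the limit of 64 are all assumptions for illustration, and a suitable limit depends on the pattern. For an exponential pattern even a few dozen characters can be too many, which is consistent with the Moderate effectiveness rating.

```javascript
// Hypothetical wrapper: returns null when the input is rejected by the length
// check, otherwise the boolean result of the regular expression test.
function matchWithLengthLimit(pattern, input, maxLength) {
  if (typeof input !== "string" || input.length > maxLength) {
    return null; // rejected before the regex engine ever sees the input
  }
  return pattern.test(input);
}

// Assumed, application-specific values; not recommendations.
const MAX_LENGTH = 64;
const wordPattern = /^\w+$/;

console.log(matchWithLengthLimit(wordPattern, "short_input", MAX_LENGTH));
console.log(matchWithLengthLimit(wordPattern, "x".repeat(1000), MAX_LENGTH));
```

Returning null for rejected input keeps "too long" distinguishable from "did not match", so callers can report the two cases differently.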

+ References
[REF-1162] Jan Goyvaerts. "Runaway Regular Expressions: Catastrophic Backtracking". 2019-12-22. <https://www.regular-expressions.info/catastrophic.html>.
[REF-1163] Adar Weidman. "Regular expression Denial of Service - ReDoS". <https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS>.
[REF-1164] Ilya Kantor. "Catastrophic backtracking". 2020-12-13. <https://javascript.info/regexp-catastrophic-backtracking>.
[REF-1165] Cristian-Alexandru Staicu and Michael Pradel. "Freezing the Web: A Study of ReDoS Vulnerabilities in JavaScript-based Web Servers". USENIX Security Symposium. 2018-07-11. <https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-staicu.pdf>.
[REF-1166] James C. Davis, Christy A. Coghlan, Francisco Servant and Dongyoon Lee. "The Impact of Regular Expression Denial of Service (ReDoS) in Practice: An Empirical Study at the Ecosystem Scale". 2018-08-01. <https://people.cs.vt.edu/fservant/papers/Davis_Coghlan_Servant_Lee_ESECFSE18.pdf>.
[REF-1167] James Davis. "The Regular Expression Denial of Service (ReDoS) cheat-sheet". 2020-05-23. <https://levelup.gitconnected.com/the-regular-expression-denial-of-service-redos-cheat-sheet-a78d0ed7d865>.
Page Last Updated: March 15, 2021