Regular Expression Denial of Service (ReDoS) attacks exploit poorly constructed regular expressions in applications. These attacks can cause an application to consume excessive CPU time and eventually crash or slow down, resulting in a denial of service.
How ReDoS Attacks Work
ReDoS attacks target regular expressions that exhibit exponential time complexity when processing certain inputs. These expressions often contain ambiguous quantifiers or nested quantifiers (e.g., (a+)+
or .*a.*a.*
). Attackers craft input strings designed to exploit this inefficiency, forcing the applicationโs regex engine to backtrack excessively.
Key Concepts:
- Backtracking: Regex engines often use a backtracking mechanism to test different possibilities for matching patterns. Backtracking is an efficient way to match complex patterns in most cases, but poorly constructed regex patterns can lead to excessive backtracking and slow performance.
-
Ambiguous Patterns: Patterns with overlapping possibilities (e.g.,
.*
anda+
) can lead to inefficiency because the regex engine must explore multiple matching paths. - Crafted Input: Attackers create inputs that maximize backtracking steps. These inputs are often longer strings with repetitive or non-matching segments designed to trap the regex engine in a loop of failed attempts.
Detailed Workflow:
- The regex engine attempts to match the input against the pattern, starting from the first character.
- If a match is not found, the engine backtracks to try different possibilities.
- For certain patterns, the number of backtracking steps grows exponentially with the length of the input.
For example, given the regex pattern (a+)+
and the input aaaaab
, the engine first matches all five a
s, then fails at b
. It then backtracks to try splitting the a
s into smaller groups (e.g., [aaaa][ab]
, [aaa][aab]
), repeating this process for every possible grouping.
Examples of Vulnerable Regular Expressions
Example 1: Nested Quantifiers
Pattern: ^(a+)+$
Input: aaaaab
This pattern is highly vulnerable to backtracking because the nested quantifier (a+)+
forces the engine to explore all possible groupings of a
s before concluding that b
cannot be matched.
Example 2: Overlapping Wildcards
Pattern: ^(.*a.*a.*)$
Input: aaaaaaaaaaaaaaaaab
The engine tries to match the first .*
with all a
s, then backtracks to test shorter segments to satisfy the rest of the pattern. This process is computationally expensive for long strings.
Example 3: Alternation with Wildcards
Pattern: ^(a|aa)+$
Input: aaaaaaaaaaaaaaaaab
The alternation (a|aa)
leads to multiple overlapping possibilities for how a
s are grouped. The engine explores all these possibilities, causing excessive backtracking.
Real-World Impact
ReDoS attacks can have serious consequences, especially in web applications that validate user inputs with regex. Attackers can exploit these vulnerabilities to crash servers, delay responses, or consume resources, leading to denial-of-service conditions.
Metrics of an Attack
Consider the pattern ^(a+)+$
:
-
Input length: 5
Execution time: ~1ms
-
Input length: 50
Execution time: ~10 seconds
-
Input length: 500
Execution time: effectively infinite (application crashes).
These metrics demonstrate how small increases in input size result in exponential increases in processing time.
Preventing ReDoS Attacks
-
Avoid Ambiguous Quantifiers: Simplify regex patterns to avoid overlapping or nested quantifiers. For instance, instead of
(a+)+
, usea{2,}
. - Use ReDoS-Safe Libraries: Many modern libraries (e.g., RE2) avoid backtracking by design. These libraries implement regex matching in linear time, making them immune to ReDoS attacks.
- Limit Input Length: Restrict the size of inputs processed by regular expressions. Input validation can prevent attackers from supplying overly long strings.
- Timeouts: Impose a timeout on regex operations to prevent excessive resource consumption.
- Test for Vulnerabilities: Use tools like fuzz testing to identify regex patterns that exhibit exponential complexity.
Additional Examples of Secure Alternatives
Securing Example 1:
Instead of ^(a+)+$
, use:
javascriptconst regex = /^a+$/;
This eliminates the nested quantifiers and ensures linear performance.
Securing Example 2:
Instead of ^(.*a.*a.*)$
, use:
javascriptconst regex = /^[^a]*a[^a]*a[^a]*$/;
This removes ambiguity by explicitly specifying the expected positions of a
characters.
Securing Example 3:
Instead of ^(a|aa)+$
, use:
javascriptconst regex = /^a+$/;
By clarifying the expected input, the risk of excessive backtracking is mitigated.
Conclusion
ReDoS attacks highlight the importance of designing efficient and secure regex patterns. Developers must understand the potential pitfalls of regex backtracking and test patterns against adversarial inputs. By following best practices and leveraging modern tools, applications can remain resilient against this subtle but impactful threat.