Understanding ReDoS (Regular Expression Denial of Service) Attacks

Regular Expression Denial of Service (ReDoS) attacks exploit poorly constructed regular expressions in applications. These attacks can cause an application to consume excessive CPU time and eventually crash or slow down, resulting in a denial of service.

How ReDoS Attacks Work

ReDoS attacks target regular expressions that exhibit exponential time complexity when processing certain inputs. These expressions often contain ambiguous quantifiers or nested quantifiers (e.g., (a+)+ or .*a.*a.*). Attackers craft input strings designed to exploit this inefficiency, forcing the application’s regex engine to backtrack excessively.

Key Concepts:

Backtracking: Regex engines often use a backtracking mechanism to test different possibilities for matching patterns. Backtracking is an efficient way to match complex patterns in most cases, but poorly constructed regex patterns can lead to excessive backtracking and slow performance.
Ambiguous Patterns: Patterns with overlapping possibilities (e.g., .* and a+) can lead to inefficiency because the regex engine must explore multiple matching paths.
Crafted Input: Attackers create inputs that maximize backtracking steps. These inputs are often longer strings with repetitive or non-matching segments designed to trap the regex engine in a loop of failed attempts.

Detailed Workflow:

The regex engine attempts to match the input against the pattern, starting from the first character.
If a match is not found, the engine backtracks to try different possibilities.
For certain patterns, the number of backtracking steps grows exponentially with the length of the input.

For example, given the regex pattern (a+)+ and the input aaaaab, the engine first matches all five as, then fails at b. It then backtracks to try splitting the as into smaller groups (e.g., [aaaa][ab], [aaa][aab]), repeating this process for every possible grouping.

Examples of Vulnerable Regular Expressions

Example 1: Nested Quantifiers

Pattern: ^(a+)+$

Input: aaaaab

This pattern is highly vulnerable to backtracking because the nested quantifier (a+)+ forces the engine to explore all possible groupings of as before concluding that b cannot be matched.

Example 2: Overlapping Wildcards

Pattern: ^(.*a.*a.*)$

Input: aaaaaaaaaaaaaaaaab

The engine tries to match the first .* with all as, then backtracks to test shorter segments to satisfy the rest of the pattern. This process is computationally expensive for long strings.

Example 3: Alternation with Wildcards

Pattern: ^(a|aa)+$

Input: aaaaaaaaaaaaaaaaab

The alternation (a|aa) leads to multiple overlapping possibilities for how as are grouped. The engine explores all these possibilities, causing excessive backtracking.

Real-World Impact

ReDoS attacks can have serious consequences, especially in web applications that validate user inputs with regex. Attackers can exploit these vulnerabilities to crash servers, delay responses, or consume resources, leading to denial-of-service conditions.

Metrics of an Attack

Consider the pattern ^(a+)+$:

Input length: 5

Execution time: ~1ms
Input length: 50

Execution time: ~10 seconds
Input length: 500

Execution time: effectively infinite (application crashes).

These metrics demonstrate how small increases in input size result in exponential increases in processing time.

Preventing ReDoS Attacks

Avoid Ambiguous Quantifiers: Simplify regex patterns to avoid overlapping or nested quantifiers. For instance, instead of (a+)+, use a{2,}.
Use ReDoS-Safe Libraries: Many modern libraries (e.g., RE2) avoid backtracking by design. These libraries implement regex matching in linear time, making them immune to ReDoS attacks.
Limit Input Length: Restrict the size of inputs processed by regular expressions. Input validation can prevent attackers from supplying overly long strings.
Timeouts: Impose a timeout on regex operations to prevent excessive resource consumption.
Test for Vulnerabilities: Use tools like fuzz testing to identify regex patterns that exhibit exponential complexity.

Additional Examples of Secure Alternatives

Securing Example 1:

Instead of ^(a+)+$, use:

javascript
const regex = /^a+$/;

This eliminates the nested quantifiers and ensures linear performance.

Securing Example 2:

Instead of ^(.*a.*a.*)$, use:

javascript
const regex = /^[^a]*a[^a]*a[^a]*$/;

This removes ambiguity by explicitly specifying the expected positions of a characters.

Securing Example 3:

Instead of ^(a|aa)+$, use:

javascript
const regex = /^a+$/;

By clarifying the expected input, the risk of excessive backtracking is mitigated.

Conclusion

ReDoS attacks highlight the importance of designing efficient and secure regex patterns. Developers must understand the potential pitfalls of regex backtracking and test patterns against adversarial inputs. By following best practices and leveraging modern tools, applications can remain resilient against this subtle but impactful threat.