How regex backtracking works
Most regex engines use a backtracking NFA (Non-deterministic Finite Automaton) approach. When a pattern fails to match at a given position, the engine backtracks — it undoes its last decision and tries an alternative. This is what makes regex flexible and powerful. It is also what makes certain patterns catastrophically slow.
Consider the simple case: the regex a* applied to the string "aaa". The engine can match 3 a's, 2 a's, 1 a, or zero. If it tries 3 and a later part of the pattern fails, it backtracks to 2, tries again, fails, backtracks to 1, and so on. For a simple pattern, this is a handful of attempts. For certain pathological patterns with nested quantifiers, the number of possible backtrack paths grows exponentially with input length.
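The give-back behavior can be observed directly with a capture group. The following is a minimal sketch, assuming a standard JavaScript engine; the helper name greedyCapture and the pattern /^(a*)a$/ are chosen here for illustration and do not come from any library.

```javascript
// Hypothetical helper: reports what the greedy a* ended up holding
// after the engine backtracked to let the trailing literal `a` match.
function greedyCapture(input) {
  const match = /^(a*)a$/.exec(input);
  return match ? match[1] : null;
}

// a* first grabs all three a's, the literal `a` then has nothing left to match,
// so the engine gives one character back and retries.
console.log(greedyCapture("aaa")); // "aa"
console.log(greedyCapture("a"));   // "" (a* gave back its only character)
console.log(greedyCapture("b"));   // null (no way to match at all)
```

Each give-back is one backtracking step. With a single quantifier there are at most as many steps as there are characters.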
The classic catastrophic pattern
// This regex is catastrophically slow on certain inputs:
/(a+)+$/
// Test input: "aaaaaaaaaaaaaaaaaaaX"
// The X at the end causes a match failure.
// The engine must explore every possible grouping of 'a' characters.
// For n=20 'a' characters before X: ~2^20 = 1,048,576 attempts.
// For n=30: ~2^30 = 1,073,741,824 attempts. Seconds become hours.
The pattern (a+)+ allows the outer + to try every possible number of repetitions of the inner group, and the inner + to try every possible length of each repetition. The combinations multiply exponentially. This is catastrophic backtracking — the match fails, but failing takes O(2ⁿ) time instead of O(n).
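The growth can be made concrete with a small model. The sketch below does not run a real regex engine. It assumes a naive backtracker for (a+)+$ that tries the longest run of a's first, and it counts how often that backtracker reaches $ and fails on the trailing X. The function name countFailedPaths is made up for illustration, and production engines apply optimizations this model ignores.

```javascript
// Model of a naive backtracking match of (a+)+$ against "a".repeat(n) + "X".
// Returns how many times the engine reaches `$` and fails.
function countFailedPaths(n) {
  let attempts = 0;

  // The outer + is about to run the inner a+ starting at position `pos`.
  function tryGroup(pos) {
    // Inner a+ is greedy: try the longest run first, then give back one char at a time.
    for (let len = n - pos; len >= 1; len--) {
      const next = pos + len;
      // Option 1: repeat the group once more, starting at `next`.
      tryGroup(next);
      // Option 2: stop repeating and test `$`, which always fails because of the X.
      attempts++;
    }
  }

  tryGroup(0);
  return attempts;
}

console.log(countFailedPaths(3));  // 7
console.log(countFailedPaths(10)); // 1023
console.log(countFailedPaths(20)); // 1048575
```

In this model the count is 2^n - 1, so each additional 'a' roughly doubles the work.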
The Cloudflare incident
The July 2019 Cloudflare outage was caused by a new WAF (Web Application Firewall) rule that contained a regex with catastrophic backtracking. When deployed, the rule was applied to every HTTP request passing through Cloudflare's infrastructure. Under real traffic, inputs arrived that triggered the backtracking condition. The regex engine consumed all available CPU time. All other processing on those machines stopped. The outage affected Cloudflare's entire network.
The rule was introduced by a human writing a WAF signature — a routine security operations task. The catastrophic backtracking was not obvious from code review. The testing environment did not generate the specific input pattern that triggers worst-case behavior. The deployed regex pattern looked plausible. The consequences were a global internet incident.
Patterns vulnerable to ReDoS
// Nested quantifiers (DANGEROUS):
(a+)+      // nested + inside +
(a*)*b     // nested * inside *
(a|aa)+    // alternation with overlap inside quantifier

// Alternation where options share prefixes (DANGEROUS):
(a|ab)+c   // 'a' and 'ab' share prefix, forces backtracking at 'c'
(x+x+)+y   // classic catastrophic pattern

// Safe equivalents:
a+         // not nested, linear time
(a|b)+     // no overlap between alternatives, linear time
The common warning signs: nested quantifiers ((a+)*, (a*)+), alternation inside a repeated group where the alternatives overlap ((a|ab)+), and repeated groups that can match the empty string.
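The first warning sign, a quantified group that itself contains a quantifier, is mechanical enough to check with a few lines of code. The following is a rough sketch under stated assumptions, not how any published tool works: hasNestedQuantifier is a hypothetical name, only + and * are treated as quantifiers ({m,n} is ignored), and overlapping alternations such as (a|aa)+ are not detected. It can also flag harmless patterns, so treat a hit as a prompt for review.

```javascript
// Hypothetical heuristic: true if a group containing + or * is itself followed by + or *.
function hasNestedQuantifier(source) {
  const stack = []; // one flag per open group: has a quantifier been seen inside it?
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === "\\") { i++; continue; } // skip the escaped character
    if (ch === "[") {
      // Skip a character class; + and * are literals inside it.
      i++;
      while (i < source.length && source[i] !== "]") {
        if (source[i] === "\\") i++;
        i++;
      }
      continue;
    }
    if (ch === "(") { stack.push(false); continue; }
    if (ch === ")") {
      const innerHasQuantifier = stack.pop() ?? false;
      const next = source[i + 1];
      if (innerHasQuantifier && (next === "+" || next === "*")) return true;
      // A quantifier nested deeper still counts for the enclosing group.
      if (innerHasQuantifier && stack.length > 0) stack[stack.length - 1] = true;
      continue;
    }
    if ((ch === "+" || ch === "*") && stack.length > 0) {
      stack[stack.length - 1] = true;
    }
  }
  return false;
}

console.log(hasNestedQuantifier("(a+)+"));    // true
console.log(hasNestedQuantifier("(x+x+)+y")); // true
console.log(hasNestedQuantifier("(a|b)+"));   // false
console.log(hasNestedQuantifier("(a|aa)+"));  // false: overlap is outside this check
```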
How to protect against ReDoS
- Static analysis tools. Tools like safe-regex (npm), recheck, and vuln-regex-detector analyze regex patterns for catastrophic backtracking without executing them. Integrate them into CI.
- Input length limits. Catastrophic backtracking scales exponentially with input length. Enforcing a maximum length on user input before regex evaluation bounds the worst-case time, even for vulnerable patterns.
- Use linear-time regex engines. RE2 (whose approach Go's regexp package shares, and which has bindings for other languages) and Rust's regex crate use an automaton-based approach that guarantees linear-time matching, trading certain Perl-style features (backreferences, lookaheads) for safety. For most validation regexes, these features are not needed.
- Possessive quantifiers and atomic groups. Some engines (Java, PCRE) support possessive quantifiers (a++) and atomic groups ((?>...)) that disable backtracking for that group, eliminating the catastrophic path at the pattern level.
- Execution timeouts. As a last line of defense, apply a timeout to regex execution. JavaScript's RegExp methods do not accept a timeout, so in Node.js the match has to run somewhere it can be interrupted, such as a worker thread or a vm context created with a timeout option. Some other runtimes, such as .NET, accept a match timeout directly. Set one. Any regex that takes more than a few milliseconds on user input is a bug.
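Two of these defenses, the length limit and the execution timeout, can be combined in one wrapper. The following is a minimal sketch for Node.js, assuming the built-in vm module and its timeout option are used to interrupt a runaway match. The name guardedTest, its defaults, and its return shape are hypothetical. Note that the call still blocks the event loop for up to timeoutMs, so a worker thread may be a better fit for busy servers.

```javascript
const vm = require("node:vm");

// Hypothetical helper: the name, defaults, and return shape are illustrative only.
function guardedTest(source, flags, input, { maxLength = 1000, timeoutMs = 50 } = {}) {
  // 1. Length limit: reject oversized input before the regex ever runs.
  if (input.length > maxLength) {
    return { ok: false, reason: "too_long" };
  }
  // 2. Timeout: run the match in a separate context that can be interrupted.
  try {
    const matched = vm.runInNewContext(
      "new RegExp(source, flags).test(input)",
      { source, flags, input },
      { timeout: timeoutMs }
    );
    return { ok: true, matched };
  } catch (err) {
    if (err && err.code === "ERR_SCRIPT_EXECUTION_TIMEOUT") {
      return { ok: false, reason: "timeout" };
    }
    throw err; // e.g. a SyntaxError from an invalid pattern
  }
}

console.log(guardedTest("^a+$", "", "aaa"));                   // ok, matched
console.log(guardedTest("(a+)+$", "", "a".repeat(5000)));      // rejected: too_long
console.log(guardedTest("(a+)+$", "", "a".repeat(40) + "X"));  // rejected: timeout
```

A caller would typically treat both rejection reasons as invalid input and log the timeout case, since it points at a pattern that needs fixing.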
The takeaway for production code
Every regex that runs on user-supplied input in a server context is a potential ReDoS vector. This is not an excuse to avoid regex; it is an argument for treating regex like any other code that handles untrusted input: with scrutiny, tooling, and defensive limits. Run safe-regex on your validation patterns in CI. Enforce input length limits before regex evaluation. Consider RE2 for validation-heavy code paths. The Cloudflare incident was not the first ReDoS-caused outage and will not be the last. The patterns that cause it are common. The mitigations are available. There is no good reason not to use them.