Regex Flavors Family Index

type: language-family-index
family: regex-flavors
languages_catalogued: 22
tags: [language-reference, family-index, regex-flavors, regular-expressions, pattern-matching, pcre, re2, redos, unicode]

Regex Flavors — Family Index

Family overview

Regex has two histories that don’t quite fit together. The theoretical history starts with Stephen Kleene’s 1956 paper on “regular sets”: an algebraic description of the languages recognised by finite automata, equivalent in expressive power to NFAs and DFAs and matchable in O(n) time once compiled to a DFA. The practical history starts with Ken Thompson’s 1968 paper “Regular Expression Search Algorithm,” whose ideas fed the regex engines in ed and then grep, and were eventually standardised as POSIX BRE and ERE in IEEE POSIX.2 (1992). Up through this point, “regex” still mostly meant “regular language” in the formal sense: apart from the backreferences that ed and BRE already carried, every supported feature could be expressed as a finite automaton and matched in linear time.

Then came Perl. Larry Wall shipped Perl 1.0 in 1987 with a pragmatic regex engine, and from Perl 2 through Perl 5.10 (1988–2007) it accreted features that broke true regularity or made linear-time matching impractical: backreferences (\1, which require remembering arbitrary captured substrings and make matching NP-complete in the worst case), lookaround assertions ((?=...), (?<=...), (?!...), (?<!...)), atomic groups ((?>...)), possessive quantifiers (*+, ++, ?+), conditionals ((?(1)yes|no)), and recursion ((?R), (?1)). Philip Hazel reimplemented this dialect as the standalone PCRE library in 1997, and PCRE became the de facto Ur-flavor that everyone else cloned, extended, or deliberately departed from. PCRE2 (the 10.x series, current 10.47 released October 2025) is the modern continuation; the original PCRE 8.x is end-of-life.

The Perl/PCRE family of engines is implemented as backtracking NFAs — when a quantifier is ambiguous, the engine tries one path and rewinds if it fails. This is fast on most inputs but has a notorious worst case: catastrophic backtracking, where a pattern like ^(a+)+$ matched against "aaaaaaaaaaaaaaa!" takes exponential time. Russ Cox documented this in his 2007 series “Regular Expression Matching Can Be Simple and Fast”, reviving Thompson’s NFA-simulation algorithm and showing that for the truly regular subset (no backreferences, no lookarounds), linear-time matching is straightforward. This work became RE2 at Google (2010, C++), and inspired a whole lineage of “RE2-style” engines — Go’s regexp (standard library), Rust’s regex crate, and parts of .NET’s RegexOptions.NonBacktracking mode. The modern split is: backtracking engines (PCRE2, .NET default, Java java.util.regex, JavaScript V8/JSC, Python re and regex, Ruby Onigmo) accept the full Perl-extended dialect including backreferences and lookarounds and risk ReDoS; automata engines (RE2, Go, Rust, Hyperscan, .NET non-backtracking) refuse the irregular features and guarantee linear time. ReDoS as a security category became a CVE-able class around 2012–2017 and is now a routine finding in static analysis tools.
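
The catastrophic-backtracking case described above can be observed directly. The following is a minimal sketch, assuming CPython's built-in `re` module as a representative backtracking engine; the helper name `time_failed_match` is hypothetical, and absolute timings will vary by machine, so only the growth trend matters.

```python
import re
import time

# Nested quantifiers: each 'a' can belong to the inner or the outer repetition,
# so a backtracking engine tries roughly 2**n ways to split the run before failing.
EVIL = re.compile(r"^(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a's followed by '!'."""
    subject = "a" * n + "!"
    start = time.perf_counter()
    result = EVIL.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' makes a match impossible
    return elapsed


if __name__ == "__main__":
    # Small sizes only: each extra 'a' roughly doubles the work.
    for n in (12, 16, 20):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

Keeping `n` at 20 or below keeps the demonstration to a fraction of a second on typical hardware; pushing it a few steps higher is enough to see the exponential curve.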

A further axis is Unicode. The Unicode Consortium’s UTS #18 defines three conformance levels for regex Unicode support: Level 1 (basic code-point handling, simple loose matches, basic property classes like \p{L}), Level 2 (extended grapheme clusters, full case folding, named characters, default-ignorable handling), and Level 3 (locale-tailored). Most engines hit Level 1; only a few (ICU, Python regex, .NET, increasingly JS with /v) push into Level 2. JavaScript’s /v flag (ES2024, Stage 4 in 2023, shipping in V8 11.2 / Chrome 112 / Safari 17 / Node.js 20) is the most consequential recent addition: it enables Unicode “set notation,” meaning nested character classes, set difference ([A--B]), set intersection ([A&&B]), properties of strings via \p{...}, string literals via \q{...}, and proper case-insensitive matching for negated property sets. It supersedes the older ES2015 /u flag for any new Unicode-aware regex work.

In our deep library

Languages with first-class regex stories that have their own deep notes:

perl — the Ur-flavor; PCRE descends from Perl 5.x, and many Perl extensions were retrofitted into PCRE2.
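python — built-in re (more limited; \w is Unicode-aware by default for str patterns, with re.ASCII / (?a) to restrict it; atomic groups and possessive quantifiers only since Python 3.11) and the third-party regex module (variable-length lookbehinds, possessive quantifiers, atomic groups, full Unicode property support, concurrent=True GIL release).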
javascript — V8/JSC regex engines; /u (ES2015) and /v (ES2024) flags, lookbehinds (ES2018), named groups (ES2018), Unicode property escapes (ES2018).
java — java.util.regex.Pattern, backtracking NFA, UNICODE_CHARACTER_CLASS flag for UTS #18 Level 1 conformance, named groups since Java 7.
csharp — System.Text.RegularExpressions, source-generated regexes via [GeneratedRegex] (.NET 7+), RegexOptions.NonBacktracking (.NET 7+, derivative-based linear-time engine from MSR).
ruby — Onigmo (since Ruby 2.0), forked from Oniguruma which was archived in April 2025; backtracking NFA with broad encoding support.
go — regexp package, RE2 lineage, deliberately rejects backreferences and lookarounds for O(n) guarantees.
rust — regex crate by Andrew Gallant, RE2-style with Rust-specific optimisations, paired with the lower-level regex-automata and the third-party fancy-regex for Perl-like features.
cpp — std::regex (C++11, ECMAScript dialect by default + POSIX modes), Boost.Xpressive (compile-time + run-time regex via expression templates), Boost.Regex.
php — preg_* family wraps PCRE2 directly.
bash — =~ operator uses POSIX ERE; grep, sed, awk flavors covered below.
lua — uses Lua patterns, deliberately not a full regex flavor (no alternation, no quantifiers on groups — see “Notable threads” below).
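
Whether `\w` in Python's built-in `re` is ASCII-only or Unicode-aware depends on the pattern type and flags, and is easy to check directly. A minimal sketch, assuming Python 3 and `str` (not `bytes`) patterns; the sample text is made up for illustration and written with escapes so the code points are unambiguous.

```python
import re

# "naive" with i-diaeresis, "cafe" with e-acute, two CJK characters, and an ASCII token
text = "na\u00efve caf\u00e9 \u6771\u4eac x1"

# Default for str patterns in Python 3: \w follows Unicode word characters.
unicode_words = re.findall(r"\w+", text)

# re.ASCII (inline form: (?a)) restricts \w, \d, \s and \b to ASCII.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

if __name__ == "__main__":
    print(unicode_words)  # four tokens, accented and CJK letters kept intact
    print(ascii_words)    # ['na', 've', 'caf', 'x1']
```

The ASCII run splits words at every non-ASCII letter, which is the usual symptom when a pattern written for one mode is run in the other.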

Adjacent Tier 3 notes:

query — SQL LIKE, SIMILAR TO (POSIX-flavor in PostgreSQL/Snowflake), and dialect-specific REGEXP_* functions; Splunk SPL’s rex; Lucene query parser regex.
config-and-dsl — .gitignore, Apache mod_rewrite, and assorted config DSLs that embed regex or glob fragments.
notation-spec — ABNF, EBNF, PEG; the family of grammar formalisms that subsumes regex.

Tier 3 family table

Flavor	First appeared	Origin	Engine type	Notable features	Status (2026)	URL
POSIX BRE	1992 (POSIX.2)	IEEE / Open Group	NFA-simulation, regular language	Basic Regular Expressions: backslash-escaped metacharacters (`\(`, `\)`, `\{`, `\}`); the dialect of `grep` (no flag) and traditional `sed`; foundational and intentionally minimal	Stable, mostly legacy outside Unix tooling	https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
POSIX ERE	1992 (POSIX.2)	IEEE / Open Group	NFA-simulation, regular language	Extended Regular Expressions: bare metacharacters (`(`, `)`, `{`, `}`, `+`, `?`, `|`); the dialect of `egrep`/`grep -E`, `awk`, modern `sed -E`	Stable; the lingua franca of POSIX text tooling	https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
Perl 5 regex	1987 (Perl 1) → mature in Perl 5 (1994)	Larry Wall	Backtracking NFA	The Ur-flavor: lookarounds, named captures (`(?<name>...)`), atomic groups (`(?>...)`), possessive quantifiers (`*+`), conditionals, recursion (`(?R)`, `(?&name)`), `\K` keep-out, embedded code `(?{...})`	Active; tracks Perl release cadence	https://perldoc.perl.org/perlre
PCRE2	1997 (PCRE 1.0) → 2015 (PCRE2 10.0)	Philip Hazel, University of Cambridge	Backtracking NFA + JIT	Standalone C library cloning Perl 5 regex; UTF-8/16/32 modes; `pcre2grep` CLI; widely embedded (PHP, nginx, Apache, R, many languages); current 10.47 (Oct 2025)	Very active; PCRE 8.x is EOL, PCRE2 is the supported line	https://www.pcre.org/
RE2	2010	Russ Cox / Google	NFA-simulation, regular language	Linear-time guarantee for arbitrary input; no backreferences, no general lookarounds; bounded memory; safe for adversarial patterns; C++ library	Very active (used in Google production, Cloud Logging, code search)	https://github.com/google/re2
Go `regexp`	2012 (Go 1.0)	Russ Cox / Go team	RE2 port in Go	RE2 syntax exactly; same restrictions (no backreferences, no lookarounds); deliberate language-level commitment to ReDoS safety; `regexp/syntax` exposes the AST	Stable; standard library	https://pkg.go.dev/regexp
Rust `regex` crate	2014 (crate v0.1)	Andrew Gallant (“BurntSushi”)	RE2-style NFA/DFA hybrid	RE2 lineage in syntax and guarantees; rewritten internally as `regex-automata` (multiple matching strategies); paired with `fancy-regex` for backref/lookaround if needed	Very active; de facto Rust ecosystem standard	https://docs.rs/regex/
Java `java.util.regex`	2002 (Java 1.4)	Sun Microsystems	Backtracking NFA	Pattern/Matcher API; named groups since Java 7; UTS #18 Level 1 with `UNICODE_CHARACTER_CLASS` flag (or `(?U)` inline); `\p{InGreek}` Unicode block syntax; lookarounds and backreferences supported	Active (tracks JDK releases; Java 21/22/23 stable)	https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/regex/Pattern.html
.NET `System.Text.RegularExpressions`	2002 (.NET 1.0)	Microsoft	Backtracking NFA (default) + DFA mode + non-backtracking mode	Default backtracking with rich Perl-like features (balancing groups for matched-paren parsing — unique to .NET); `RegexOptions.Compiled` (IL emit, since .NET 1.0); `[GeneratedRegex]` source-generated regex (.NET 7, 2022); `RegexOptions.NonBacktracking` linear-time mode (.NET 7, 2022, derivative-based MSR engine)	Very active	https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions
JavaScript regex (`/u`, `/v`)	1997 (JavaScript 1.2; standardised in ES3, 1999) → `/u` (ES2015) → `/v` (ES2024)	Brendan Eich → ECMA TC39	Backtracking NFA (V8 Irregexp, JSC YarrJIT, SpiderMonkey)	Built into the language as a literal syntax (`/pattern/flags`); `/v` flag (ES2024, TC39 Stage 4 in 2023) added set notation `[A--B]` / `[A&&B]`, nested classes, `\q{...}` string literals; lookbehinds + named groups since ES2018; sticky `/y` since ES2015	Very active; `/v` shipping in V8 11.2 / Chrome 112 / Safari 17 / Node.js 20 (all 2023 era)	https://tc39.es/ecma262/#sec-regexp-regular-expression-objects
Python `re`	1997 (Python 1.5)	Guido van Rossum / core team	Backtracking NFA	Standard library; `\w` is Unicode-aware by default for `str` patterns (`re.ASCII` / `(?a)` restricts it to ASCII); lookaheads + fixed-length lookbehinds only; named groups; possessive quantifiers and atomic groups only since Python 3.11; no recursion	Active; Python 3.14 docs current	https://docs.python.org/3/library/re.html
Python `regex` (PyPI)	2009	Matthew Barnett	Backtracking NFA	Drop-in `re` superset: variable-length lookbehinds, possessive quantifiers, atomic groups, recursive patterns, `\p{Script=Greek}` properties, grapheme clusters `\X`, `concurrent=True` GIL release, fuzzy matching	Very active (latest release April 2026)	https://pypi.org/project/regex/
Ruby Onigmo	2002 (Oniguruma) → 2011 (Onigmo fork) → Ruby 2.0 (2013)	K. Kosako (Oniguruma) → K. Takata (Onigmo)	Backtracking NFA	Multi-encoding (UTF-8, EUC-JP, Shift_JIS, etc.) baked in from the start; backports Perl 5.10+ features like named captures and `\K`; Oniguruma upstream archived April 2025, Onigmo continues for Ruby	Active (Onigmo); Oniguruma archived	https://github.com/k-takata/Onigmo
Vim regex	1991 (Vi → Vim)	Bram Moolenaar (RIP 2023)	Backtracking NFA + NFA-simulation since 7.4	Four “magicness” levels: `\v` very-magic (egrep-like), `\m` magic (default), `\M` nomagic, `\V` very-nomagic; idiosyncratic atom syntax (`\(`, `\)` for groups in default magic); since Vim 7.4 (2013) supports a Thompson NFA engine alongside the old backtracker	Active (Vim 9.x and Neovim 0.10+)	https://vimdoc.sourceforge.net/htmldoc/pattern.html
Emacs regex	1985	Richard Stallman / GNU Emacs	Backtracking NFA	Lisp-string regexes: every backslash doubles in source code (`"\\("` for `\(`); group via `\(...\)`, alternation via `\|`; `re-search-forward`, `looking-at`, `replace-regexp`; the `rx` macro (overhauled in Emacs 27) provides s-expression syntax that compiles to the underlying flavor	Active (Emacs 30)	https://www.gnu.org/software/emacs/manual/html_node/elisp/Regular-Expressions.html
Tcl ARE	1999 (Tcl 8.1)	Henry Spencer	Hybrid (NFA + DFA)	“Advanced Regular Expressions”: POSIX ERE superset with Perl-style extensions (lookarounds, non-greedy, named); also used by PostgreSQL’s `~`/`SIMILAR TO` since 7.4; Spencer’s library underpinned MySQL pre-8.0.4 too	Stable	https://www.tcl-lang.org/man/tcl/TclCmd/re_syntax.htm
GNU grep / sed extensions	1988 (GNU grep) / 1992 (GNU sed)	Mike Haertel (grep), Jay Fenlason (sed), GNU project	NFA / DFA hybrid	Extends POSIX BRE/ERE with `\<`, `\>` word boundaries, `\b`, `\B`, `\w`, `\W`, `\s`, `\S`; `grep -P` (Perl mode) hands the pattern to the PCRE2 library; `grep -E`/`-G`/`-F` switch flavors	Very active (maintained as separate GNU packages)	https://www.gnu.org/software/grep/manual/grep.html
AWK ERE	1977 (AWK)	Aho, Weinberger, Kernighan / Bell Labs	NFA-simulation	POSIX ERE in `gawk`/`mawk`/`nawk`; pattern matching as a first-class language construct (`/regex/ { action }` rule head); GNU `gawk` adds `\<`, `\>`, `\B`, `\y` word boundaries	Active (gawk 5.x)	https://www.gnu.org/software/gawk/manual/html_node/Regexp.html
POSIX glob	1992 (POSIX.2)	IEEE / Open Group	Token-based, not regex	Not a regex flavor strictly but constantly conflated with one: `*`, `?`, `[abc]`, `[!abc]`, `[a-z]` for filename matching; `**` (recursive) is an extension (bash `globstar`, zsh, fish); `fnmatch(3)` and `glob(3)` are the C APIs	Stable, ubiquitous	https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13
Boost.Xpressive	2007 (Boost 1.34)	Eric Niebler	Backtracking NFA	C++ template library: regexes as expression templates — write the pattern as C++ code at compile time (`sregex re = (s1 = +_w) >> '@' >> (s2 = +_w)`) or as a runtime string; same engine handles both; semantic actions in C++	Stable (header-only, in current Boost)	https://www.boost.org/doc/libs/release/doc/html/xpressive.html
Hyperscan / Vectorscan	2008 (Sensory Networks) → 2015 OSS at Intel → 2020 Vectorscan fork	Sensory Networks → Intel → VectorCamp	NFA + literal matchers, SIMD-accelerated	Multi-pattern matching: compile thousands of regexes simultaneously and stream-scan input; SSE/AVX/AVX-512 acceleration; powers Snort, Suricata, ClamAV; Hyperscan 5.4 last OSS release (Intel went proprietary at 5.5); Vectorscan is the community ARM-NEON / Power-VSX / SIMDe portable fork	Active (Vectorscan); Intel branch closed-source post-5.4	https://github.com/intel/hyperscan
Lua patterns	1993	Roberto Ierusalimschy / PUC-Rio	Small hand-written matcher, no automaton construction	Deliberately not regex: no alternation, no `?`/`*`/`+`/`-` quantifiers on groups, only on single character classes; ~500 LoC implementation; the language docs are explicit that it’s a simpler alternative trading expressive power for tininess	Active (tracks Lua releases)	https://www.lua.org/manual/5.4/manual.html#6.4.1
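
The glob row in the table is the one most often mistaken for a regex flavor, and the difference is easy to show. The sketch below assumes Python's `fnmatch` module as a stand-in for shell-style globbing (it is not the POSIX `fnmatch(3)` C API, and unlike a shell it does not treat a leading dot specially); the file names are made up for illustration.

```python
import fnmatch
import re

names = ["notes.txt", "notes.txt.bak", "a.txt", ".txt"]

# As a glob, '*' means "any run of characters" and '.' is a literal dot.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.txt")]

# Read as a regex, the same string is not even valid: '*' has nothing to repeat.
try:
    re.compile("*.txt")
    regex_error = None
except re.error as exc:
    regex_error = str(exc)

# fnmatch.translate() produces the regex that the glob actually corresponds to.
translated = re.compile(fnmatch.translate("*.txt"))
regex_hits = [n for n in names if translated.match(n)]

if __name__ == "__main__":
    print(glob_hits)
    print(regex_error)
    print(regex_hits)
```

The translated pattern's exact text differs between Python versions, so it is compiled and used here rather than printed and compared.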

Notable threads

ReDoS and the Cox/RE2 response (2007 → today). Russ Cox’s 2007 article series is the single most influential piece of writing in modern regex history. He showed that the algorithm Perl, Python, Ruby, Java, .NET, and JavaScript all used (backtracking NFA) had a worst case that was exponential in the input size for patterns like (a?a?a?a?a?aaaaa), and that Ken Thompson’s 1968 NFA-simulation algorithm matched the same patterns in O(nm) time for a pattern of size m and an input of length n. The exceptions are backreferences and lookarounds, which fall outside what the simulation handles. RE2 (2010) was Cox’s implementation at Google. Go’s standard regexp (2012) is RE2 in Go; Rust’s regex (2014) is “RE2-but-in-Rust”; .NET’s RegexOptions.NonBacktracking (.NET 7, 2022) is a derivative-based variant from Microsoft Research that achieves the same linear-time guarantee. The 2010s saw ReDoS become a documented attack class: Stack Overflow’s 2016 outage, caused by a single whitespace-trimming regex that backtracked badly on one malformed post, is the canonical case study, and modern tooling (CodeQL, Semgrep, npm audit advisories) flags it as a routine finding.
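
The state-set idea behind the O(nm) bound can be sketched for the a?…a?a…a pattern family above. This is a hand-rolled illustration, not RE2's implementation: it only handles a flat sequence of literal characters, each optionally marked `?`, and the names `simulate` and `cox_pattern` are hypothetical.

```python
import re


def simulate(items, text):
    """Full-match text against a sequence of (char, optional) items by tracking
    the set of reachable pattern positions (Thompson-style simulation)."""

    def closure(states):
        # Follow "skip" edges: an optional item can be bypassed without input.
        seen = set()
        stack = list(states)
        while stack:
            s = stack.pop()
            if s in seen:
                continue
            seen.add(s)
            if s < len(items) and items[s][1]:
                stack.append(s + 1)
        return seen

    current = closure({0})
    for ch in text:
        # Advance every live state that can consume ch; no state is ever retried.
        advanced = {s + 1 for s in current if s < len(items) and items[s][0] == ch}
        current = closure(advanced)
        if not current:
            return False
    return len(items) in current


def cox_pattern(n):
    """Build a?{n}a{n} as a list of (char, optional) items."""
    return [("a", True)] * n + [("a", False)] * n


if __name__ == "__main__":
    # n = 200 would be hopeless for a backtracker; here it is about n * 2n set operations.
    print(simulate(cox_pattern(200), "a" * 200))  # True
    # Cross-check against the backtracking engine on a size it can still handle.
    n = 10
    print(bool(re.fullmatch("a?" * n + "a" * n, "a" * n)))  # True
```

Each input character touches at most one entry per pattern position, which is where the O(nm) bound comes from: the work per character is capped by the pattern size, regardless of how many ways the optional items could be combined.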
The JavaScript /v flag (ES2024): set notation, finally. ES2015 added /u for code-point-correct matching, and ES2018 added \p{...} Unicode property escapes. The remaining gap was that you couldn’t compose property classes directly: you could match \p{Script=Greek} and \p{Letter} separately, but not “Greek letters” as a single class. The /v flag (TC39 proposal-regexp-v-flag, Stage 4 in 2023, ECMA-262 in ES2024, V8 11.2 / Chrome 112 / Safari 17 / Node 20) added: nested character classes ([[A-Z]&&[^AEIOU]]), set difference ([\p{Decimal_Number}--[0-9]] for non-ASCII digits), set intersection ([\p{Letter}&&\p{Script=Greek}]), and string literals inside classes via \q{...} ([\q{ng|gh|sh}]). It’s effectively UTS #18 Level 1 done properly and brings JS regex meaningfully closer to ICU and the Python regex module. The /v flag turns on everything /u does (the two cannot be combined) and forbids the legacy “annex B” loosenesses, so it’s also a quiet cleanup of the language’s regex surface.
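
For comparison, engines without set notation usually fake intersection and difference with a lookahead in front of a single class. The sketch below shows that workaround in Python's built-in `re`; it is an emulation of two of the /v examples above, not the /v syntax itself, and it assumes Python 3 `str` patterns, where `\d` matches any Unicode decimal digit.

```python
import re

# Roughly [\p{Decimal_Number}--[0-9]]: a decimal digit that is not ASCII 0-9.
non_ascii_digit = re.compile(r"(?![0-9])\d")

# Roughly [[A-Z]&&[^AEIOU]]: an uppercase ASCII letter that is not a vowel.
consonant = re.compile(r"(?=[A-Z])[^AEIOU]")

# ASCII 7, ARABIC-INDIC DIGIT THREE (U+0663), DEVANAGARI DIGIT ONE (U+0967)
sample = "7 \u0663 \u0967"

digits = non_ascii_digit.findall(sample)
letters = "".join(consonant.findall("REGEX FLAVORS"))

if __name__ == "__main__":
    print([f"U+{ord(d):04X}" for d in digits])  # ['U+0663', 'U+0967']
    print(letters)                              # RGXFLVRS
```

The lookahead form works one code point at a time and gets unwieldy once more than two sets are involved, which is the gap dedicated set operators close.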
POSIX ERE vs PCRE: the \1 line. POSIX ERE without backreferences is a true regular language: you can compile any pattern to a DFA and match in O(n) time with memory that does not grow with the input. Backreferences (\1 matching whatever the first capture group captured) already existed in ed and POSIX BRE, but Perl made them part of the everyday dialect, and with them the language jumps expressive class to something that can match a^n b a^n (via (a+)b\1) and is no longer regular at all. PCRE inherited this. Russ Cox’s series notes that matching with PCRE-style backreferences is NP-complete in the worst case. Most production engines that accept backreferences therefore can’t promise linear time; engines that do promise it (RE2, Go, Rust, Hyperscan) reject backreferences as a category. This is the deepest, oldest fault line in the regex world, and every flavor lives on one side of it.
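
A short illustration of what a backreference buys, assuming Python's built-in `re` as the backtracking engine; the helper name `is_copy` is hypothetical. The pattern accepts a run of a's, a single b, and then exactly the same run again, which no finite automaton can check because it would have to count arbitrarily high.

```python
import re

# (a+)b\1 : whatever group 1 captured must reappear verbatim after the 'b'.
COPY = re.compile(r"(a+)b\1")


def is_copy(s: str) -> bool:
    """True if s has the form a^n b a^n for some n >= 1."""
    return COPY.fullmatch(s) is not None


if __name__ == "__main__":
    print(is_copy("aaabaaa"))  # True
    print(is_copy("aaabaa"))   # False
    print(is_copy("aabaaa"))   # False
```

An RE2-lineage engine rejects this pattern at compile time rather than attempting it, which is the trade described above.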
.NET’s three-engine story is unique. Microsoft’s System.Text.RegularExpressions is the only mainstream library that ships three matching engines side by side: (1) the original interpreted backtracking engine (default); (2) a compiled backtracker, available either as runtime-emitted IL (RegexOptions.Compiled, since .NET 1.0) or, increasingly preferred, as source-generated code via the [GeneratedRegex] attribute (.NET 7, 2022), which emits real C# at build time; and (3) the non-backtracking derivative-based engine (RegexOptions.NonBacktracking, .NET 7, 2022) that gives RE2-like O(n) guarantees while preserving backtracking semantics for the supported subset (no lookarounds, no backreferences). .NET also has the only mainstream engine with balancing groups ((?<-name>...) pops a stack), which lets you match arbitrarily nested parentheses, the canonical example of “regex shouldn’t be able to do this.”
Hyperscan: SIMD multi-pattern at line rate. When Snort or Suricata inspects a 100 Gbit/s network link looking for thousands of intrusion-detection signatures simultaneously, you can’t run 5000 regexes one at a time. Hyperscan (Sensory Networks, then Intel, with the open-source line now carried by the Vectorscan fork since Intel went proprietary at version 5.5) compiles a set of regexes into a single combined matcher and uses SSE/AVX/AVX-512 instructions to advance the automaton across multiple input bytes in parallel. The trade-offs: matches are reported in left-to-right scan order by end offset (start-of-match positions are not tracked by default), there are no backreferences or general lookarounds, and the compile step is slow because it does heavy literal extraction and graph optimisation up front. The Vectorscan fork (VectorCamp, 2020+) extends portability to ARM NEON and Power VSX and remains ABI-compatible with Hyperscan 5.4, the last OSS Intel release.
Go and Rust as language-level commitments to safety. Both languages chose RE2 lineage on purpose. Go’s regexp (Cox, 2012) is RE2 in Go. Rust’s regex (BurntSushi, 2014) explicitly cites RE2 as its blueprint. The deliberate decision in both ecosystems is that the standard regex library cannot ReDoS. If you want backreferences in Go, you reach for regexp2 (third-party, .NET-derived); in Rust, you reach for fancy-regex (third-party, a backtracking layer that delegates to regex where it can). This is a quietly important language-design statement: in safety-conscious languages, regex is a place where you trade expressive power for predictable performance, and the standard library reflects that. Compare the Perl/PCRE/Python lineage, where the maximally permissive flavor is the default and ReDoS is left as an exercise for the user.
Lua patterns as the contrarian case. Lua patterns are intentionally not a regex flavor at all. There is no alternation operator. The ?/*/+/- quantifiers only apply to single-character classes, never to groups. The implementation is around 500 lines of C. Programming in Lua is explicit that this is a deliberate cost/value trade: a typical POSIX regex implementation would be bigger than all of Lua’s standard libraries together. This makes Lua patterns the smallest practically-useful pattern language in mainstream use, and a clean argument for “regex is bigger than it needs to be.”
The Onigmo / Oniguruma split. Ruby uses Onigmo, a fork of Oniguruma (the original by K. Kosako, used in PHP mb_ereg, TextMate’s grammar engine, Atom, and Ruby 1.9). The fork (Onigmo, by K. Takata, since ~2011) backports Perl 5.10+ features Oniguruma upstream didn’t take. As of April 2025, Oniguruma was archived; Onigmo carries on as the canonical Ruby regex engine. This is a quiet but meaningful event: GitHub’s Linguist, TextMate-grammar tooling, and PHP mb_ereg may need to migrate to Onigmo or accept depending on an archived, unmaintained upstream.

Citations

POSIX.2 regular expressions (Open Group): https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
PCRE2 home and manual: https://www.pcre.org/ ; https://www.pcre.org/current/doc/html/pcre2.html
PCRE2 release notes (10.47, October 2025): https://github.com/PCRE2Project/pcre2/blob/master/NEWS
Russ Cox, “Regular Expression Matching Can Be Simple And Fast” (2007): https://swtch.com/~rsc/regexp/regexp1.html
Russ Cox, regex articles index (#1, #2, #3, #4): https://swtch.com/~rsc/regexp/
RE2 (Google): https://github.com/google/re2 and https://github.com/google/re2/wiki/Syntax
Go regexp package: https://pkg.go.dev/regexp ; syntax: https://pkg.go.dev/regexp/syntax
Rust regex crate: https://docs.rs/regex/latest/regex/
Andrew Gallant, “Regex engine internals as a library”: https://burntsushi.net/regex-internals/
Java java.util.regex.Pattern (JDK 21): https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/regex/Pattern.html
.NET regex options: https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-options
“.NET 7 Regex Improvements” (Stephen Toub, 2022): https://devblogs.microsoft.com/dotnet/regular-expression-improvements-in-dotnet-7/
“Derivative-based Nonbacktracking Real-World Regex Matching” (MSR): https://www.microsoft.com/en-us/research/publication/derivative-based-nonbacktracking-real-world-regex-matching-with-backtracking-semantics/
TC39 proposal-regexp-v-flag: https://github.com/tc39/proposal-regexp-v-flag
ECMA-262 (latest): https://tc39.es/ecma262/#sec-regexp-regular-expression-objects
MDN, “RegExp.prototype.unicodeSets”: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets
“Regexes Got Good” (Smashing Magazine, 2024) — JS regex history overview: https://www.smashingmagazine.com/2024/08/history-future-regular-expressions-javascript/
Python re (3.x): https://docs.python.org/3/library/re.html
Python regex (PyPI): https://pypi.org/project/regex/
Onigmo (Ruby’s regex engine): https://github.com/k-takata/Onigmo
Oniguruma (archived April 2025): https://github.com/kkos/oniguruma
Vim :help pattern: https://vimdoc.sourceforge.net/htmldoc/pattern.html
Emacs Lisp regular expressions: https://www.gnu.org/software/emacs/manual/html_node/elisp/Regular-Expressions.html
Tcl ARE syntax: https://www.tcl-lang.org/man/tcl/TclCmd/re_syntax.htm
Boost.Xpressive: https://www.boost.org/doc/libs/release/doc/html/xpressive.html
Hyperscan (Intel): https://github.com/intel/hyperscan
Vectorscan (community fork): https://github.com/VectorCamp/vectorscan
“Hyperscan: A Fast Multi-pattern Regex Matcher” (NSDI ‘19): https://www.usenix.org/conference/nsdi19/presentation/wang-xiang
GNU grep manual: https://www.gnu.org/software/grep/manual/grep.html
GNU gawk regexp manual: https://www.gnu.org/software/gawk/manual/html_node/Regexp.html
Lua 5.4 patterns: https://www.lua.org/manual/5.4/manual.html#6.4.1
Unicode UTS #18 (Unicode Regular Expressions): https://unicode.org/reports/tr18/

Caveats

Hyperscan / Vectorscan version cadence post-2024. Intel’s continued internal Hyperscan development (5.5+) is under a proprietary licence; the open-source surface is fixed at 5.4 and Vectorscan tracks that as a portability fork. The community split is real but the precise Intel-internal version cadence isn’t publicly documented; treat any “current Intel Hyperscan version” claim above 5.4 as unverified.
Vim regex engine internals. Vim 7.4 (2013) introduced an NFA-simulation engine that runs alongside the original backtracker; the engine selection is heuristic and the exact rules are documented only loosely in :help two-engines. Specific performance claims should be benchmarked rather than asserted.
Tcl ARE / PostgreSQL regex. PostgreSQL’s ~ operator and SIMILAR TO use Henry Spencer’s library, but the exact feature subset has drifted across PostgreSQL versions; cite the PG docs for version-specific syntax claims rather than relying on the generic Tcl ARE description.
