A fast DFA based Regex engine that supports submatches
This is currently a PoC for a DFA that supports submatches extraction. The match time complexity is linear in length of the text to match. Read the article if you are interested in the implementation.
[!WARNING] Pls use nim-regex for anything serious, instead of this package.
nimble install nregex
Nim +1.0.4
import pkg/nregex
var m: RegexMatch
doAssert match("abc", re"abc", m)
doAssert match("ab", re"a(b|c)", m)
doAssert match("aabcd", re"(aa)bcd", m)
doAssert m.group(0) == @[0 .. 1]
doAssert match("aab", re"((a)*b)", m)
doAssert m.group(0) == @[0 .. 2]
doAssert m.group(1) == @[0 .. 0, 1 .. 1]
doAssert "abcd".find(re"bc", m)
doAssert "2222".find(re"(22)*", m)
doAssert m.group(0) == @[0 .. 1, 2 .. 3]
doAssert re"bc" in "abcd"
doAssert re"(23)+" in "112323211"
The following benchmarks show nregex is up to 22 times faster than PCRE. However, when the RE contains capture groups, PCRE is about 4 times faster than nregex.
relative | time/iter | iters/s | regex | text | |
---|---|---|---|---|---|
CPU | 294.85ps | 3.39G | |||
PCRE | 1.10ms | 912.11 | ^\w*sol\w*$ | (a*100000)sol(b*100000) | |
nregex | 739.52% | 148.25us | 6.75K | ||
PCRE | 174.87ns | 5.72M | ^[0-9]+-[0-9]+-[0-9]+$ | 650-253-0001 | |
nregex | 2280.84% | 7.67ns | 130.43M | ||
PCRE | 179.23ns | 5.58M | ^[0-9]+..+$ | 650-253-0001 | |
nregex | 1447.15% | 12.38ns | 80.74M |
nimble test
MIT