A pure Swift NFA implementation of a regular expression engine
Work in progress: V2 is a second attempt that uses DFAs instead of NFAs to get grep-like performance.
To avoid repeated compilation overhead, create a `Regex` instance once and reuse it:
// Compile the expression
let regex = try! Regex(pattern: "[a-zA-Z]+")
let string = "RegEx is tough, but useful."
// Search for matches
let words = regex.match(string)
/*
words = [
RegexMatch(match: "RegEx", groups: []),
RegexMatch(match: "is", groups: []),
RegexMatch(match: "tough", groups: []),
RegexMatch(match: "but", groups: []),
RegexMatch(match: "useful", groups: []),
]
*/
If compilation overhead is not an issue, use the `=~` operator to match a string directly:
let fourLetterWords = "drink beer, it's very nice!" =~ "\\b\\w{4}\\b" ?? []
/*
fourLetterWords = [
RegexMatch(match: "beer", groups: []),
RegexMatch(match: "very", groups: []),
RegexMatch(match: "nice", groups: []),
]
*/
By default the `Global` flag is active. To change which flags are active, add a `/` at the start of the pattern and `/<flags>` at the end. The available flags are:
- `g` Global: allows multiple matches
- `i` Case Insensitive: case-insensitive matching
- `m` Multiline: `^` and `$` also match the beginning and end of a line
// Global and Case Insensitive search
let regex = try! Regex(pattern: "/\\w+/ig")
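The delimiter syntax can be illustrated with a small parser. This is only a sketch of how a `/pattern/flags` string might be split. `RegexFlags` and `splitPattern` are hypothetical names, not the library's API, and the fallback to Global for undelimited patterns simply mirrors the default described above.

```swift
/// Hypothetical flag set mirroring the g / i / m flags described above.
struct RegexFlags: OptionSet {
    let rawValue: Int
    static let global = RegexFlags(rawValue: 1 << 0)
    static let caseInsensitive = RegexFlags(rawValue: 1 << 1)
    static let multiline = RegexFlags(rawValue: 1 << 2)
}

/// Splits "/body/flags" into the pattern body and its flags.
/// A pattern without a leading "/" is returned unchanged with the Global flag.
/// Returns nil for a missing closing "/" or an unknown flag letter.
func splitPattern(_ source: String) -> (pattern: String, flags: RegexFlags)? {
    guard source.hasPrefix("/") else { return (source, [.global]) }
    // The last "/" separates the body from the flags
    guard let lastSlash = source.lastIndex(of: "/"),
          lastSlash != source.startIndex else { return nil }
    let bodyStart = source.index(after: source.startIndex)
    let body = String(source[bodyStart..<lastSlash])
    var flags: RegexFlags = []
    for flag in source[source.index(after: lastSlash)...] {
        switch flag {
        case "g": flags.insert(.global)
        case "i": flags.insert(.caseInsensitive)
        case "m": flags.insert(.multiline)
        default: return nil
        }
    }
    return (body, flags)
}

let parsed = splitPattern("/\\w+/ig")
let plain = splitPattern("[a-z]+")
```

Here `parsed` carries the body `\w+` with the Global and Case Insensitive flags, while `plain` keeps its pattern as-is and falls back to Global.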
Pattern | Description | Supported |
---|---|---|
. | [^\n\r] | |
[^] | [\s\S] | |
\w | [A-Za-z0-9_] | |
\W | [^A-Za-z0-9_] | |
\d | [0-9] | |
\D | [^0-9] | |
\s | [\ \r\n\t\v\f] | |
\S | [^\ \r\n\t\v\f] | |
[ABC] | Any in the set | |
[^ABC] | Any not in the set | |
[A-Z] | Any in the range inclusively | |
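The shorthand classes in this table are plain ASCII sets, so each one can be read as a simple predicate. The sketch below spells that out. The function names are hypothetical and this is not how the library implements them.

```swift
/// \d == [0-9]
func isDigit(_ c: Character) -> Bool {
    guard let ascii = c.asciiValue else { return false }
    return (UInt8(ascii: "0")...UInt8(ascii: "9")).contains(ascii)
}

/// \w == [A-Za-z0-9_]
func isWord(_ c: Character) -> Bool {
    guard let ascii = c.asciiValue else { return false }
    return isDigit(c)
        || (UInt8(ascii: "A")...UInt8(ascii: "Z")).contains(ascii)
        || (UInt8(ascii: "a")...UInt8(ascii: "z")).contains(ascii)
        || ascii == UInt8(ascii: "_")
}

/// \s == [ \r\n\t\v\f]
/// Swift has no \v or \f string escapes, so those two are written as \u{0B} and \u{0C}.
func isSpace(_ c: Character) -> Bool {
    let spaces: [Character] = [" ", "\r", "\n", "\t", "\u{0B}", "\u{0C}"]
    return spaces.contains(c)
}

// The negated classes \D, \W and \S are the logical negation of the predicates above.
let isNonWord = !isWord("-")
```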
Pattern | Description | Supported |
---|---|---|
^ | Beginning of string | |
$ | End of string | |
\b | Word boundary | |
\B | Not word boundary | |
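A word boundary is a position rather than a character: `\b` holds where exactly one of the two neighbouring characters is a word character (`\w`). A minimal sketch of that definition follows, assuming the ASCII `\w` set from the character class table. `isWordCharacter` and `isWordBoundary` are hypothetical helpers, not library functions.

```swift
/// ASCII word character test, i.e. [A-Za-z0-9_].
func isWordCharacter(_ c: Character) -> Bool {
    return c.isASCII && (c.isLetter || c.isNumber || c == "_")
}

/// True when `position` (an offset between 0 and text.count) is a \b position.
/// The start and end of the text count as non-word neighbours.
func isWordBoundary(in text: [Character], at position: Int) -> Bool {
    let before = position > 0 && isWordCharacter(text[position - 1])
    let after = position < text.count && isWordCharacter(text[position])
    return before != after
}

let text = Array("it's nice")
let boundaries = (0...text.count).filter { isWordBoundary(in: text, at: $0) }
```

`\B` is the negation: it holds at every position where `isWordBoundary` returns false.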
Pattern | Description | Supported |
---|---|---|
\0 | Octal escaped character | |
\00 | Octal escaped character | |
\000 | Octal escaped character | |
\xFF | Hex escaped character | |
\uFFFF | Unicode escaped character | |
\cA | Control character | |
\t | Tab | |
\n | Newline | |
\v | Vertical tab | |
\f | Form feed | |
\r | Carriage return | |
\0 | Null | |
\. | . | |
\\ | \ | |
\+ | + | |
\* | * | |
\? | ? | |
\^ | ^ | |
\$ | $ | |
\{ | { | |
\} | } | |
\[ | [ | |
\] | ] | |
\( | ( | |
\) | ) | |
\/ | / | |
\\\| | \| | |
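The numeric and control escapes all reduce to a single Unicode scalar. The sketch below shows one way to decode them, assuming the escape body is passed without its leading backslash and that `\cA` through `\cZ` map to code points 1 through 26. `scalar(from:radix:)` and `decodeEscape` are hypothetical helpers, not the library's API.

```swift
/// Parses a run of ASCII hex or octal digits into a Unicode scalar.
func scalar(from digits: String, radix: Int) -> Unicode.Scalar? {
    // Reject signs and non-ASCII digits before handing the text to the integer parser
    guard !digits.isEmpty,
          digits.allSatisfy({ $0.isASCII && $0.isHexDigit }) else { return nil }
    return UInt32(digits, radix: radix).flatMap { Unicode.Scalar($0) }
}

/// Decodes the part of an escape that follows the backslash.
func decodeEscape(_ body: String) -> Unicode.Scalar? {
    guard let kind = body.first else { return nil }
    let rest = String(body.dropFirst())
    switch kind {
    case "x": // \xFF: exactly two hex digits
        return rest.count == 2 ? scalar(from: rest, radix: 16) : nil
    case "u": // \uFFFF: exactly four hex digits
        return rest.count == 4 ? scalar(from: rest, radix: 16) : nil
    case "c": // \cA: control character (assumed A...Z only)
        guard rest.count == 1, let ascii = rest.first?.asciiValue,
              (UInt8(ascii: "A")...UInt8(ascii: "Z")).contains(ascii) else { return nil }
        return Unicode.Scalar(ascii - 64)
    case "0"..."7": // \0, \00, \000: one to three octal digits
        return body.count <= 3 ? scalar(from: body, radix: 8) : nil
    default:
        return nil
    }
}

let capitalA = decodeEscape("x41")
let eAcute = decodeEscape("u00E9")
```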
Pattern | Description | Supported |
---|---|---|
(ABC) | Capture group | |
(<name>ABC) | Named capture group | |
\1 | Back reference | |
\'name' | Named back reference | |
(?:ABC) | Non-capturing group | |
(?=ABC) | Positive lookahead | |
(?!ABC) | Negative lookahead | |
(?<=ABC) | Positive lookbehind | |
(?<!ABC) | Negative lookbehind | |
Pattern | Description | Supported |
---|---|---|
+ | One or more | |
* | Zero or more | |
? | Optional | |
{n} | n | |
{,} | Same as * | |
{,n} | n or less | |
{n,} | n or more | |
{n,m} | n to m | |
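Every quantifier in this table boils down to a minimum and an optional maximum repetition count. As a sketch of that reading, the code below parses the text between `{` and `}` into such a pair. `Repetition`, `parseCount` and `parseRepetition` are hypothetical names chosen for illustration.

```swift
/// A repetition range; `max == nil` means there is no upper bound.
struct Repetition: Equatable {
    var min: Int
    var max: Int?
}

/// Parses a non-negative decimal count, rejecting signs and empty input.
func parseCount(_ text: Substring) -> Int? {
    guard text.allSatisfy({ $0.isASCII && $0.isNumber }) else { return nil }
    return Int(text)
}

/// Parses the body of {n}, {,}, {,n}, {n,} or {n,m}.
func parseRepetition(_ body: String) -> Repetition? {
    let parts = body.split(separator: ",", omittingEmptySubsequences: false)
    switch parts.count {
    case 1: // {n}
        guard let n = parseCount(parts[0]) else { return nil }
        return Repetition(min: n, max: n)
    case 2:
        // An empty lower bound means zero
        guard let lower = parts[0].isEmpty ? 0 : parseCount(parts[0]) else { return nil }
        // An empty upper bound means unbounded
        if parts[1].isEmpty { return Repetition(min: lower, max: nil) }
        guard let upper = parseCount(parts[1]), upper >= lower else { return nil }
        return Repetition(min: lower, max: upper)
    default:
        return nil
    }
}

// In the same terms: + is (1, nil), * is (0, nil) and ? is (0, 1).
let twoToFour = parseRepetition("2,4")
```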
Pattern | Description | Supported |
---|---|---|
+? | One or more (lazy) | |
*? | Zero or more (lazy) | |
?? | Optional (lazy) | |
{n}? | n (lazy) | |
{,n}? | n or less (lazy) | |
{n,}? | n or more (lazy) | |
{n,m}? | n to m (lazy) | |
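The `?` suffix makes a quantifier lazy: it matches as few repetitions as possible instead of as many as possible. To keep the example self-contained, the sketch below uses Foundation's `NSRegularExpression` as a stand-in engine, not this library, so it only illustrates the general semantics. `firstMatch(of:in:)` is a hypothetical helper.

```swift
import Foundation

/// Returns the text of the first match of `pattern` in `text`, if any.
func firstMatch(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: range),
          let matched = Range(match.range, in: text) else { return nil }
    return String(text[matched])
}

let greedy = firstMatch(of: "<.+>", in: "<a><b>")   // "<a><b>"
let lazy = firstMatch(of: "<.+?>", in: "<a><b>")    // "<a>"
```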
Pattern | Description | Supported |
---|---|---|
\| | Everything before or everything after | |
Pattern | Description | Supported |
---|---|---|
i | Case insensitive | |
g | Global | |
m | Multiline | |
(Similar to before)
char(a), char(b) -> string(ab) (for better performance)
Swift treats `\r\n` as a single `Character`. Use `\n\r` to have both as separate characters.
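The behaviour is easy to check with the standard library alone. A short demonstration, independent of this library:

```swift
let crlf = "\r\n"
let lfcr = "\n\r"

// CR followed by LF forms one extended grapheme cluster, so it is one Character
print(crlf.count)                 // 1
print(crlf.unicodeScalars.count)  // 2

// In the reverse order the two scalars stay separate Characters
print(lfcr.count)                 // 2
```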