An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
This release marks the 1.0 release of regex.
While this release includes some breaking changes, most users of older versions of the regex library should be able to migrate to 1.0 by simply bumping the version number. The important changes are as follows:
* [...] `RegexBuilder`.
* `(?-u:\B)` is no longer allowed in Unicode regexes since it can match at invalid UTF-8 code unit boundaries. `(?-u:\b)` is still allowed in Unicode regexes.
* The `From<regex_syntax::Error>` impl has been removed. This formally removes the public dependency on `regex-syntax`.
* A new feature, `use_std`, has been added and enabled by default. Disabling the feature will result in a compilation error. In the future, this may permit us to support `no_std` environments (with `alloc`) in a backwards compatible way.

For more information and discussion, please see the 1.0 release tracking issue.
This release includes a ground-up rewrite of the regex-syntax crate, which has been in development for over a year.
New features:
* [...] `&&`, `--` and `~~` binary operators within classes.
* [...] `\p{..}` character classes. Things like `\p{scx:Hira}`, `\p{age:3.2}` or `\p{Changes_When_Casefolded}` now work. All property name and value aliases are supported, and properties are selected via loose matching, e.g., `\p{Greek}` is the same as `\p{G r E e K}`.
* A `UNICODE.md` document has been added to this repository that exhaustively documents support for UTS#18.
* [...] `()+` is now a valid regex.
* The `Ast` type in `regex-syntax` now contains span information.
* [...] `\u`, `\u{...}`, `\U` and `\U{...}` syntax for specifying code points in a regular expression.
* [...] `Replace::by_ref` adapter for use of a replacer without consuming it.

Bug fixes:
New features:
* [...] `[\p{Greek}&&\pL]` matches Greek letters and `[[0-9]&&[^4]]` matches every decimal digit except `4`. (Much thanks to @robinst, who contributed this awesome feature.)

Bug fixes:
* [...] `(?x)` flag.
* [...] `Captures::get` to API documentation.
* [...] `(?x)` is used.
* [...] `rure_captures_len` in the C binding.
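
The set operators on character classes mentioned in these notes (`&&` for intersection, `--` for difference, `~~` for symmetric difference) can be read as ordinary set operations over characters. The following is a minimal sketch of those semantics using only the standard library. It does not use or reproduce the regex crate's implementation: the `Class` alias and the `class` and `set_ops_demo` helpers are hypothetical names invented for this illustration, and `[^4]` is restricted to ASCII to keep the sets small.

```rust
use std::collections::BTreeSet;
use std::ops::RangeInclusive;

// Hypothetical model: a character class is just an ordered set of chars.
type Class = BTreeSet<char>;

// Build a class from an inclusive range, e.g. '0'..='9' for `[0-9]`.
fn class(range: RangeInclusive<char>) -> Class {
    range.collect()
}

// Returns the members of three example classes, in ascending order:
//   `[[0-9]&&[^4]]`, `[[0-5]--[3-9]]` and `[[0-5]~~[3-9]]`.
fn set_ops_demo() -> (String, String, String) {
    let digits = class('0'..='9');
    // `[^4]`, limited to ASCII here (an assumption made for brevity).
    let not_four: Class = class('\0'..='\x7f')
        .into_iter()
        .filter(|&c| c != '4')
        .collect();
    let low = class('0'..='5');
    let high = class('3'..='9');

    // `&&`: characters present in both classes.
    let intersection: String = digits.intersection(&not_four).collect();
    // `--`: characters in the left class but not in the right one.
    let difference: String = low.difference(&high).collect();
    // `~~`: characters in exactly one of the two classes.
    let symmetric: String = low.symmetric_difference(&high).collect();

    (intersection, difference, symmetric)
}

fn main() {
    let (intersection, difference, symmetric) = set_ops_demo();
    println!("[[0-9]&&[^4]]  -> {}", intersection);
    println!("[[0-5]--[3-9]] -> {}", difference);
    println!("[[0-5]~~[3-9]] -> {}", symmetric);
}
```

Under this model, `[[0-9]&&[^4]]` keeps every decimal digit except `4`, which agrees with the example given in the notes.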