Fuzzy matching and more functionality for spaCy.
python<=3.11,>=3.7, along with rapidfuzz>=1.0.0.
"spaczz_" prepended optional SpaczzRuler init arguments. Also, sorry to do this without a deprecation cycle.
Matcher.pipe methods, which were deprecated, are now removed.
spaczz_span custom attribute, which was deprecated, is now removed.
TokenMatcher. Spaczz expects token matches returned in order of ascending match start, then descending match length. However, spaCy's Matcher does not return matches in this order by default. Added a sort in the TokenMatcher to ensure this.
SpaczzRuler optional arguments no longer need to be prepended with "spaczz_". Prepended arguments will still work in most cases, offering some backwards compatibility. However, optional arguments prepended with "spaczz_" will not work with spaCy v3's new spacy.load and nlp.add_pipe config-driven APIs. It is therefore recommended that users move away from the prepended versions if using spaCy v3. Note, however, that the prepended arguments are still necessary if using spaczz with spaCy v2.
Matcher.pipe methods are now deprecated in accordance with spaCy v3.
spaczz_span custom attribute is deprecated in favor of spaczz_ent. They both have the same functionality, but the spaczz_ent name makes more sense.
Adds the TokenMatcher to spaczz and integrates it with the SpaczzRuler. Also overhauls spaczz's custom attributes and includes some quality-of-life improvements and bug fixes.
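The match ordering mentioned in the TokenMatcher fix above (ascending match start, then descending match length) can be illustrated with a minimal sketch. The (label, start, end) tuple layout and the sort_matches helper are assumptions made for this example only; they are not spaczz's actual match format or internal code.

```python
from typing import List, Tuple

# Hypothetical match layout for illustration: (label, start, end).
# This is an assumption, not spaczz's real match structure.
Match = Tuple[str, int, int]


def sort_matches(matches: List[Match]) -> List[Match]:
    """Order matches by ascending start, then descending length (end - start)."""
    return sorted(matches, key=lambda m: (m[1], -(m[2] - m[1])))


# Unordered matches, similar in spirit to what a matcher might return.
unordered = [("B", 3, 5), ("A", 0, 2), ("A", 0, 4), ("C", 3, 4)]
ordered = sort_matches(unordered)
print(ordered)
```

Negating the length in the sort key lets a single ascending sort place the longest match first among matches that share a start index.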