Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
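To make the terms concrete, here is a minimal plain-Python sketch of what n-grams and fixed-size skipgrams look like. It does not use the Colibri Core API and says nothing about how the library stores or computes patterns; the function names and the ``GAP`` marker are hypothetical, chosen only for this illustration.

```python
from itertools import combinations

# Hypothetical placeholder for one skipped token (illustration only).
GAP = "{*}"


def ngrams(tokens, n):
    """Return all contiguous n-grams of length n as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgrams(tokens, n):
    """Return patterns of length n in which one or more inner tokens
    are replaced by a gap. The first and last token are always kept,
    so n must be at least 3 to yield any skipgram."""
    result = []
    for gram in ngrams(tokens, n):
        inner = range(1, n - 1)  # positions that may become gaps
        for k in range(1, len(inner) + 1):
            for positions in combinations(inner, k):
                result.append(
                    tuple(GAP if i in positions else tok
                          for i, tok in enumerate(gram))
                )
    return result


tokens = "to be or not to be".split()
bigrams = ngrams(tokens, 2)
trigram_skips = skipgrams(tokens, 3)
```

With this input, ``trigram_skips`` holds patterns such as ``("to", "{*}", "or")``: the outer words are fixed and the middle position is left open.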
Important bugfix release:
(All users are urged to upgrade!)
``setup.py`` more robust for manual installation mode (without compiling C++ lib)
(v2.4.7 was skipped)
``-t`` (threshold) behaviour was wrong (was interpreted as +1)
``--simplereport`` (``-r``) option that generates a report without coverage information (more limited but a lot faster)
v2.4.2 was prematurely released, one minor test was corrupt. Fixed now in this release.
Bugfix release, fixes issue #25
Minor fix release prior to paper publication:
Various fixes:
``Pattern.instanceof()`` should be faster and is now available from Python too
New features:
``ignorenewlines`` option in class encoding. Useful if you have source text split by, for instance, sentences (one per line), but want a model that crosses sentence boundaries.
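The effect of ignoring newlines can be illustrated with a small plain-Python sketch. This is a conceptual illustration only, not Colibri Core's implementation or API; the function names are hypothetical. It contrasts bigrams extracted per line (one sentence per line) with bigrams extracted from the text as one continuous token stream.

```python
def bigrams(tokens):
    """Return all contiguous token pairs as tuples."""
    return [tuple(tokens[i:i + 2]) for i in range(len(tokens) - 1)]


def bigrams_per_line(lines):
    """Treat every line as a separate unit: no pattern spans two lines."""
    result = []
    for line in lines:
        result.extend(bigrams(line.split()))
    return result


def bigrams_ignoring_newlines(lines):
    """Treat the whole text as one token stream, so patterns may
    cross line (here: sentence) boundaries."""
    tokens = [tok for line in lines for tok in line.split()]
    return bigrams(tokens)


lines = ["the cat sat", "on the mat"]
per_line = bigrams_per_line(lines)
crossing = bigrams_ignoring_newlines(lines)
```

Here ``("sat", "on")`` appears only in ``crossing``, because it spans the boundary between the two lines.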