Ukrainian lemmatizer plugin for ElasticSearch
Release for the latest ES in the v5.6.x branch.
This marvelous release lets the lemmatizer be used within custom analysis settings on an ad hoc basis, following the approach described in the official documentation. Together with some additional improvements, this also means you can now feed an additional stopword list to the lemmatizer. Curious readers can find a working example in test.sh.
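As a rough illustration of what such custom analysis settings might look like, here is a minimal sketch that builds an index-settings body in Python. The `standard` tokenizer, the `lowercase` filter, the `stop` filter and the `custom` analyzer type are stock Elasticsearch features. Everything else is an assumption made for this example: the filter name `ukrainian_lemmatizer`, the analyzer name `ukrainian_custom`, the filter name `extra_stop` and the helper `build_index_settings` do not come from the plugin. The sketch also passes the extra stopwords through the built-in `stop` filter rather than through the lemmatizer itself; test.sh remains the reference for the configuration the plugin actually supports.

```python
import json

# ASSUMPTION: the name under which the plugin registers its token filter.
# This value is a placeholder; check test.sh for the real one.
LEMMATIZER_FILTER = "ukrainian_lemmatizer"


def build_index_settings(extra_stopwords):
    """Build an index-settings body that defines a custom analyzer.

    The helper and the names "extra_stop" and "ukrainian_custom" are
    hypothetical and exist only for this illustration.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Built-in "stop" token filter with a user-supplied list.
                    "extra_stop": {
                        "type": "stop",
                        "stopwords": list(extra_stopwords),
                    }
                },
                "analyzer": {
                    "ukrainian_custom": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Filters run in order: lowercase, drop stopwords,
                        # then hand the remaining tokens to the lemmatizer.
                        "filter": ["lowercase", "extra_stop", LEMMATIZER_FILTER],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    body = build_index_settings(["або", "але"])
    # ensure_ascii=False keeps the Cyrillic stopwords readable in the output.
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON could be sent as the body of an index-creation request. Because the stop filter runs before the lemmatizer in this sketch, the stopword list has to contain surface forms rather than lemmas.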
The release consists mostly of subtle changes, yet they make our lives better, don't they?
Next release for the latest ES in the v2.4.x branch.
Excellent news! The plugin has gotten smaller (only 3.8 MB!) and builds have become faster.
So this has finally happened: the plugin no longer relies on that ugly CSV burden. The current implementation uses the Morfologik project as a stemming engine. Moreover, it now includes the most recent dictionary produced by those magnificent guys in Brown-Uk. Whenever you run into any of them (and especially @arysin), show your gratitude. 😉
The plugin is now available for ES v2.2.1. A Gradle wrapper has also been added for convenience.
This particular release addresses an issue on OSes whose default encoding is not Unicode-based (like Windows) and a problem with running tests in Gradle.