Tools for shrinking fastText models (in gensim format)
Closes https://github.com/avidale/compress-fasttext/issues/19#event-10655768121 by removing the pqkmeans
dependency, which has uninstallable dependencies of its own.
gensim version
FastTextTransformer, a scikit-learn-like wrapper for feature extraction

Russian models based on geowac_tokens_none_fasttextskipgram_300_5_2020 from RusVectores, 1.9GB:
Model | RAM size, MB | Similarity to the original model |
---|---|---|
geowac_tokens_sg_300_5_2020-100K-20K-100.bin | 26 | 0.9619 |
geowac_tokens_sg_300_5_2020-400K-100K-300.bin | 202 | 0.9990 |

English models based on cc.en.300.bin from the Facebook website, 7.2GB:
Model | RAM size, MB | Similarity to the original model |
---|---|---|
ft_cc.en.300_freqprune_50K_5K_pq_100.bin | 12 | 0.3570 |
ft_cc.en.300_freqprune_100K_20K_pq_100.bin | 25 | 0.6081 |
ft_cc.en.300_freqprune_100K_20K_pq_300.bin | 48 | 0.6268 |
ft_cc.en.300_freqprune_400K_100K_pq_300.bin | 199 | 0.8782 |

Many more small models for various languages are available at https://zenodo.org/record/4905385.
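
The tables above report a "similarity to the original model" without spelling out the metric. One plausible reading, and it is only an assumption here, is the average cosine similarity between the vectors that the original and the compressed model return for the same words. The sketch below illustrates that reading with toy dictionaries standing in for the two models; the function names and the data are hypothetical and are not part of the library.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_similarity(original, compressed, words):
    """Average cosine similarity between two word -> vector lookups."""
    scores = [cosine(original(w), compressed(w)) for w in words]
    return sum(scores) / len(scores)


# Toy stand-ins for the two models (assumption: any word -> vector callable works).
original_vectors = {"cat": [1.0, 0.0, 2.0], "dog": [0.5, 1.5, 0.0]}
compressed_vectors = {"cat": [1.0, 0.0, 2.0], "dog": [1.5, 0.5, 0.0]}

score = mean_similarity(original_vectors.get, compressed_vectors.get, ["cat", "dog"])
print(round(score, 4))
```

With real models, the two lookups would be replaced by whatever call returns a word vector in each model, and the word list by a sample of the vocabulary.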
sklearn and pqkmeans only in the [full] setup mode
Arithmetic operations on compressed matrices no longer raise errors. However, they convert these matrices to numpy.array, which costs time and memory.
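
To make the cost concrete, here is a minimal pure-Python sketch of the general pattern: row lookups read the compressed storage directly, while arithmetic first materializes a full dense copy. The `QuantizedMatrix` class is hypothetical and much simpler than the library's compressed matrices (which convert to numpy.array rather than to nested lists); it only illustrates why arithmetic is the expensive path.

```python
class QuantizedMatrix:
    """Toy compressed matrix: each row is stored as an index into a small codebook."""

    def __init__(self, codebook, codes):
        self.codebook = codebook      # list of distinct row vectors
        self.codes = codes            # one codebook index per matrix row
        self.dense_conversions = 0    # counts how often a full copy was built

    def __getitem__(self, row):
        # Row lookup stays cheap: no decompression of the whole matrix.
        return self.codebook[self.codes[row]]

    def to_dense(self):
        # Builds a full uncompressed copy; memory grows with the number of rows.
        self.dense_conversions += 1
        return [list(self.codebook[code]) for code in self.codes]

    def __mul__(self, scalar):
        # Arithmetic does not fail, but it falls back to a dense copy.
        return [[x * scalar for x in row] for row in self.to_dense()]


matrix = QuantizedMatrix(codebook=[[1.0, 2.0], [3.0, 4.0]], codes=[0, 1, 0, 0])

row = matrix[2]        # served from the codebook, no conversion
doubled = matrix * 2   # triggers one dense conversion
print(row, doubled, matrix.dense_conversions)
```

In practice this suggests preferring per-word lookups on a compressed model and avoiding whole-matrix arithmetic where possible.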
The prune_ft_freq method now takes into account not only the frequency of each n-gram but also the norm of its embedding.
This improves the accuracy of the compressed model at the same model size.
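
The idea can be sketched as follows. The exact way prune_ft_freq combines the two signals is not stated here, so the scoring rule below (frequency multiplied by the embedding norm) is an assumption chosen for illustration, and `select_ngrams` is a hypothetical helper, not a library function.

```python
import math


def select_ngrams(counts, vectors, keep, use_norm=True):
    """Return the ids of the `keep` highest-scoring n-grams.

    counts:  n-gram id -> frequency
    vectors: n-gram id -> embedding (list of floats)
    Scoring rule (an assumption for this sketch): frequency * embedding norm.
    """
    def score(ngram_id):
        value = counts[ngram_id]
        if use_norm:
            value *= math.sqrt(sum(x * x for x in vectors[ngram_id]))
        return value

    ranked = sorted(counts, key=score, reverse=True)
    return ranked[:keep]


# Toy data: n-gram 0 is frequent but has a near-zero embedding,
# n-gram 2 is rare but has a large embedding.
counts = {0: 100, 1: 90, 2: 10}
vectors = {0: [0.01, 0.0], 1: [0.6, 0.8], 2: [3.0, 4.0]}

by_frequency = select_ngrams(counts, vectors, keep=2, use_norm=False)
by_frequency_and_norm = select_ngrams(counts, vectors, keep=2, use_norm=True)
print(by_frequency, by_frequency_and_norm)
```

A frequent n-gram whose embedding is close to zero contributes little to any word vector, so dropping it in favor of a rarer n-gram with a large embedding loses less information for the same number of retained rows.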
We publish the code for compressing Gensim FastText models and using their small versions.
We also publish 4 compressed versions of the ruscorpora_none_fasttextskipgram_300_2_2019 model from RusVectores.
Model | RAM, MB | Similarity to the original | Intrinsic evaluation (relative to the original) |
---|---|---|---|
ft_freqprune_50K_5K_pq_100.bin | 13 | 92.7% | 89.9% |
ft_freqprune_100K_20K_pq_100.bin | 28 | 96.1% | 96.6% |
ft_freqprune_100K_20K_pq_300.bin | 51 | 98.2% | 97.9% |
ft_freqprune_400K_100K_pq_300.bin | 180 | 99.7% | 99.9% |