Library for fast text representation and classification.
We are happy to announce the release of version 0.9.2.
We are excited to release fastText bindings for WebAssembly. Classification tasks are widely used in web applications and we believe giving access to the complete fastText API from the browser will notably help our community to build nice tools. See our documentation to learn more.
Finding the best hyperparameters is crucial for building efficient models. However, searching for the best hyperparameters manually is difficult. This release includes the autotune feature, which automatically finds the best hyperparameters for your dataset. You can find more information on how to use it here.
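As a rough illustration of how autotune is typically invoked from Python, here is a minimal sketch. The wrapper name `train_with_autotune`, the file paths, and the default duration are assumptions made for the example; the `autotuneValidationFile` and `autotuneDuration` keywords are the ones described in the autotune documentation, so check them against your installed version.

```python
def train_with_autotune(train_path, valid_path, duration_s=300):
    """Train a supervised model and let autotune pick the hyperparameters.

    train_with_autotune is a hypothetical helper, not part of fastText.
    """
    # Imported lazily so this file can be loaded without the package installed.
    import fasttext

    return fasttext.train_supervised(
        input=train_path,                    # training file in __label__ format
        autotuneValidationFile=valid_path,   # held-out file that scores each trial
        autotuneDuration=duration_s,         # search budget in seconds
    )

# Example call (paths are placeholders):
# model = train_with_autotune("train.txt", "valid.txt", duration_s=600)
```

The validation file should be distinct from the training file, otherwise the search rewards overfitting.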
fastText loves Python. In this release, we have:
The autotune feature is fully integrated with our Python API. This allows us to have a more stable autotune optimization loop from Python and to synchronize the best hyper-parameters with the _FastText model object.
We release two helper scripts:
They can also be used directly from our Python API.
When you test a trained model, you can now have more detailed results for the precision/recall metrics of a specific label or all labels.
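To make the per-label metrics concrete, the sketch below computes precision and recall for each label from a list of gold labels and a list of predictions. It is a plain-Python illustration of what the numbers mean, not fastText's implementation; the function name and the toy data are made up for the example.

```python
from collections import Counter

def per_label_precision_recall(gold, predicted):
    """Return {label: {"precision": p, "recall": r}} for single-label data."""
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)
    pred_count = Counter(predicted)   # how often each label was predicted
    gold_count = Counter(gold)        # how often each label is the truth
    metrics = {}
    for label in sorted(set(gold) | set(predicted)):
        # A label never predicted (or never present) has an undefined ratio.
        precision = true_pos[label] / pred_count[label] if pred_count[label] else float("nan")
        recall = true_pos[label] / gold_count[label] if gold_count[label] else float("nan")
        metrics[label] = {"precision": precision, "recall": recall}
    return metrics

gold = ["__label__pos", "__label__pos", "__label__neg", "__label__neg"]
pred = ["__label__pos", "__label__neg", "__label__neg", "__label__neg"]
scores = per_label_precision_recall(gold, pred)
for label, m in scores.items():
    print(label, m)
```

In this toy data, `__label__pos` is predicted rarely but always correctly (high precision, lower recall), while `__label__neg` is over-predicted (lower precision, full recall).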
This release contains the source code of the unsupervised multilingual alignment paper.
We want to thank our community for giving us feedback on Facebook and on GitHub.
We are happy to announce the release of version 0.9.1.
The main goal of this release is to merge two existing Python modules: the official fastText module, which was available on our GitHub repository, and the unofficial fasttext module, which was available on pypi.org.
You can find an overview of the new API here, and more insight in our blog post.
This version includes a massive rewrite of internal classes. Training and testing are now split across three classes: Model, which takes care of the computational aspect; Loss, which handles the loss and applies gradients to the output matrix; and State, which is responsible for holding the model's state inside each thread.
That makes the code more straightforward to read and also gives a smaller memory footprint, because the data needed for loss computation is now held only once, whereas before there was one copy for each thread.
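The split described above can be pictured with a toy Python sketch. This is not the actual C++ code: the method names, the logistic loss, and the lookup table are assumptions chosen only to show one shared Loss, one shared Model, and one State per worker.

```python
import math

class Loss:
    """Shared by all workers: owns the output matrix and a table built once."""
    def __init__(self, output_matrix):
        self.output = output_matrix  # one row per label
        # Built a single time and shared, rather than copied per thread.
        self.sigmoid_table = [1.0 / (1.0 + math.exp(-i / 10.0)) for i in range(-80, 81)]

    def sigmoid(self, x):
        idx = min(160, max(0, round(x * 10) + 80))
        return self.sigmoid_table[idx]

    def forward(self, target, state, lr):
        """Logistic loss for the target label; updates its output row."""
        row = self.output[target]
        score = self.sigmoid(sum(w * h for w, h in zip(row, state.hidden)))
        alpha = lr * (1.0 - score)
        for i, h in enumerate(state.hidden):
            state.grad[i] += alpha * row[i]  # gradient w.r.t. the hidden vector
            row[i] += alpha * h              # gradient step on the output matrix
        return -math.log(max(score, 1e-8))

class State:
    """Per-worker scratch space; each thread would own exactly one."""
    def __init__(self, dim):
        self.hidden = [0.0] * dim
        self.grad = [0.0] * dim
        self.loss_sum = 0.0
        self.n_examples = 0

class Model:
    """Computation only: averages input rows and delegates to the loss."""
    def __init__(self, input_matrix, loss):
        self.input = input_matrix
        self.loss = loss

    def update(self, token_ids, target, lr, state):
        dim = len(state.hidden)
        state.hidden = [sum(self.input[t][i] for t in token_ids) / len(token_ids)
                        for i in range(dim)]
        state.grad = [0.0] * dim
        state.loss_sum += self.loss.forward(target, state, lr)
        state.n_examples += 1
        for t in token_ids:
            for i, g in enumerate(state.grad):
                self.input[t][i] += g / len(token_ids)

loss = Loss([[0.0, 0.0], [0.0, 0.0]])
model = Model([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], loss)
state_a, state_b = State(2), State(2)  # one per (hypothetical) thread
model.update([0, 1], 0, 0.1, state_a)
model.update([2], 1, 0.1, state_b)
```

Only the small State objects are duplicated per worker; the table and matrices held by Loss and Model exist once, which is the source of the memory saving mentioned above.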
on_unicode_error argument that helps to handle unicode issues one can face with some datasets
py::str class between python2 and python3
aws to fbaipublicfiles
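For the on_unicode_error item, the sketch below shows the kind of problem it addresses, using only Python's built-in decoding. It assumes the argument accepts the standard codec error-handler names ('strict', 'replace', 'ignore'); the byte string is a made-up example of a word stored in Latin-1 rather than UTF-8.

```python
# A word encoded as Latin-1: the byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9 au lait"

# 'strict' raises on the first invalid byte.
try:
    raw.decode("utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for the invalid byte and keeps going.
replaced = raw.decode("utf-8", errors="replace")

# 'ignore' silently drops the invalid byte.
ignored = raw.decode("utf-8", errors="ignore")

print(strict_failed, repr(replaced), repr(ignored))
```

With 'replace' the damaged word stays visible in the output, which makes it easier to trace back to the offending line of the dataset than with 'ignore'.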
As always, we want to thank you for your help and your precious feedback, which help make this project better.
We are happy to announce the change of the license from BSD+patents to MIT and the release of fastText 0.2.0.
The main purpose of this release is to set a beta C++ API of the FastText class. The class now behaves as a computational library: we moved the display and some usage error handling outside of it (mainly to main.cc and fasttext_pybind.cc). It is still compatible with older versions of the class, but some methods are now marked as deprecated and will probably be removed in the next release.
In this respect, we also introduce official support for Python. The Python binding of fastText is a client of the FastText class.
Here is a short summary of the 104 commits since 0.1.0:
-loss ova or -loss one-vs-all command line option (8850c51b972ed68642a15c17fbcd4dd58766291d).
FastText class (256032b87522cdebc4850c99b204b81b3255cb2a).
setup.py OS X compiler flags, pybind11 include.
README.md
Makefile and setup.py in order to build for measuring the coverage.
We want to thank you all for being a part of this community and sharing your passion with us. Some of these improvements would not have been possible without your help.