VNLP Versions

State-of-the-art, lightweight NLP tools for the Turkish language. Developed by VNGRS.

v0.2.3

1 year ago
  • cyhunspell has been replaced by spylls. As a result, VNLP now supports Python 3.10; Python 3.6 support has been dropped.
  • Newer versions of TensorFlow no longer rely on Keras-Preprocessing. This caused issues because our tokenizers were saved via pickle. They are now stored as JSON and loaded in a way that does not depend on the TensorFlow version.
  • TensorFlow warnings are suppressed.
  • The Read the Docs build and files are updated due to tensorboard, protobuf and grpcio dependency issues.
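
The pickle-to-JSON change described above can be illustrated with a small, standard-library-only sketch. It is not VNLP's actual tokenizer code: SimpleTokenizer and its methods are hypothetical stand-ins, used only to show why a plain JSON payload can be reloaded without depending on the library version that produced it.

```python
import json


class SimpleTokenizer:
    """Toy word-level tokenizer (hypothetical stand-in, not VNLP's class)."""

    def __init__(self, word_index=None, oov_token="<OOV>"):
        self.oov_token = oov_token
        # Index 0 is left free for padding; index 1 is the out-of-vocabulary token.
        self.word_index = dict(word_index) if word_index else {oov_token: 1}

    def fit(self, texts):
        # Assign the next free integer id to every unseen word.
        for text in texts:
            for word in text.lower().split():
                if word not in self.word_index:
                    self.word_index[word] = len(self.word_index) + 1

    def encode(self, text):
        oov_id = self.word_index[self.oov_token]
        return [self.word_index.get(w, oov_id) for w in text.lower().split()]

    def to_json(self):
        # Only plain data is stored, so no class or module path is baked in,
        # unlike a pickle, which references the defining module on load.
        config = {"oov_token": self.oov_token, "word_index": self.word_index}
        return json.dumps(config, ensure_ascii=False)

    @classmethod
    def from_json(cls, payload):
        config = json.loads(payload)
        return cls(word_index=config["word_index"], oov_token=config["oov_token"])


tokenizer = SimpleTokenizer()
tokenizer.fit(["bu bir deneme"])
restored = SimpleTokenizer.from_json(tokenizer.to_json())
print(restored.encode("bu yeni deneme"))  # unseen words map to the OOV id
```

Because the saved payload holds only the vocabulary and settings, any code that understands this layout can rebuild the tokenizer, regardless of which package originally defined it.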

v0.2

1 year ago
  • SentencePiece Unigram Context (SPUContext) models are added for Named Entity Recognition, Dependency Parsing, Part of Speech Tagging and Sentiment Analysis. These are now the default models.
  • SPUContext models are even more compact, up to 4x faster, and perform significantly better. See the metrics table on the main page for a comparison.
  • SPUContext models use SentencePiece Unigram tokenization.
  • The wheel file is now 80% smaller, and each model downloads its weights the first time it is initialized.
  • To evaluate a DL-based model, pass the "evaluate = True" flag when initializing it, e.g., NamedEntityRecognizer(model = 'CharNER', evaluate = True). This loads the weights that were NOT trained on the test sets.
  • The former Python API has become a generic user API that provides an abstraction over the implemented methods. The desired model can be selected with the "model" argument, e.g., NamedEntityRecognizer(model = 'CharNER').
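
The initialization behaviour described in this release (a "model" argument, an "evaluate" flag, and weights fetched on first use) can be sketched roughly as follows. This is an illustration under stated assumptions, not VNLP's implementation: RecognizerSketch, the weight file names, the cache_dir and fetch parameters, and the 'SPUContextNER' model name are all hypothetical, and the "download" is simulated by writing a placeholder file.

```python
import tempfile
from pathlib import Path

# Hypothetical registry: model name -> weight files for normal use and for evaluation.
# The file names (and the 'SPUContextNER' name) are assumptions made for this sketch.
_WEIGHT_FILES = {
    "SPUContextNER": {"prod": "spucontext_ner_prod.weights",
                      "eval": "spucontext_ner_eval.weights"},
    "CharNER": {"prod": "char_ner_prod.weights",
                "eval": "char_ner_eval.weights"},
}


def _fake_fetch(target: Path) -> None:
    """Stand-in for a real download; writes placeholder bytes to the target path."""
    target.write_bytes(b"placeholder weights")


class RecognizerSketch:
    """Generic front end that selects an implementation by name (hypothetical)."""

    def __init__(self, model="SPUContextNER", evaluate=False, cache_dir=None, fetch=None):
        if model not in _WEIGHT_FILES:
            raise ValueError(f"Unknown model: {model!r}")
        self.model = model
        self.evaluate = evaluate
        # evaluate=True selects weights that were not trained on the test sets.
        variant = "eval" if evaluate else "prod"
        cache = Path(cache_dir) if cache_dir else Path(tempfile.gettempdir()) / "vnlp_sketch_cache"
        cache.mkdir(parents=True, exist_ok=True)
        self.weights_path = cache / _WEIGHT_FILES[model][variant]
        # Fetch the weights only if no earlier initialization has cached them.
        self.downloaded = not self.weights_path.exists()
        if self.downloaded:
            (fetch or _fake_fetch)(self.weights_path)


with tempfile.TemporaryDirectory() as tmp:
    first = RecognizerSketch(model="CharNER", evaluate=True, cache_dir=tmp)
    second = RecognizerSketch(model="CharNER", evaluate=True, cache_dir=tmp)
    print(first.downloaded, second.downloaded)
```

Keeping the weights out of the package and resolving them at construction time is one plausible way to get a much smaller wheel; the real library may organize this differently.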