NLP Cube Versions

Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing

2 years ago

This release includes support for tagging, parsing, tokenization, sentence splitting, and lemmatization of raw text.

It was evaluated during the CoNLL Shared Task on Universal Dependencies Parsing and has pretrained language models for the entire UD Corpus.
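
To make the annotation layers listed above concrete, here is a minimal sketch of how one analysed sentence can be written in CoNLL-U, the ten-column format used by Universal Dependencies treebanks. It does not call NLP Cube: the `Token` class and `sentence_to_conllu` helper are hypothetical names introduced for illustration, and the annotations are written by hand rather than produced by a model.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One annotated token (hypothetical helper, not part of NLP Cube)."""
    index: int    # 1-based position in the sentence (tokenization)
    form: str     # surface form
    lemma: str    # lemmatization
    upos: str     # universal part-of-speech tag
    head: int     # index of the syntactic head, 0 for the root
    deprel: str   # dependency relation to the head

    def to_conllu(self) -> str:
        # CoNLL-U columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
        # Columns this sketch does not model are filled with "_".
        cols = [str(self.index), self.form, self.lemma, self.upos,
                "_", "_", str(self.head), self.deprel, "_", "_"]
        return "\t".join(cols)


def sentence_to_conllu(tokens):
    """Join the token rows of one sentence, one row per line."""
    return "\n".join(t.to_conllu() for t in tokens)


# Hand-written annotations for "Cats sleep." (illustrative, not model output).
sentence = [
    Token(1, "Cats", "cat", "NOUN", 2, "nsubj"),
    Token(2, "sleep", "sleep", "VERB", 0, "root"),
    Token(3, ".", ".", "PUNCT", 2, "punct"),
]

print(sentence_to_conllu(sentence))
```

Each row carries the result of one pipeline stage per column, so a sentence splitter decides where the block ends, the tokenizer fixes the rows, and the lemmatizer, tagger, and parser fill in the remaining fields.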

Features

Model store with pretrained models for selected languages
Training pipeline for building custom models
Supports multiple language models: transformer, fasttext, languasito, dummy (no embeddings)
Updated models with large improvements in the F-Score
Flavours: build a joint model from multiple treebanks at the same time, with language code conditioning (increases performance in most cases)

5 years ago

This release includes support for tagging, parsing, tokenization, sentence splitting, and lemmatization of raw text.

It was evaluated during the CoNLL Shared Task on Universal Dependencies Parsing and has pretrained language models for the entire UD Corpus.

Features: