NLP made easy
BERT pre-training on BooksCorpus and English Wikipedia with mixed precision and gradient accumulation on GPUs. Fine-tuning from the resulting checkpoint gave the following results on the validation sets (#482, #505, #489). Thank you @haven-jeon
Dataset | MRPC | SQuAD 1.1 (EM/F1) | SST-2 | MNLI-mm |
---|---|---|---|---|
Score | 87.99% | 80.99/88.60 | 93% | 83.6% |
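
For readers unfamiliar with the gradient accumulation mentioned for pre-training, here is a minimal sketch of the idea in plain Python. It is not the pre-training script itself: the toy model `y = w * x`, the data, and the function names are all assumptions made for illustration. It shows that averaging gradients over equally sized micro-batches reproduces the gradient of the full batch, so a single optimizer step can emulate a larger batch than fits in GPU memory.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, num_micro_batches):
    """Average the gradients of equally sized micro-batches."""
    if len(batch) % num_micro_batches != 0:
        raise ValueError("batch size must be divisible by num_micro_batches")
    size = len(batch) // num_micro_batches
    total = 0.0
    for i in range(num_micro_batches):
        micro = batch[i * size:(i + 1) * size]
        # Scale each micro-batch gradient so the running sum is the full-batch mean.
        total += grad_mse(w, micro) / num_micro_batches
    return total


# Hypothetical toy data: (x, y) pairs roughly following y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

full = grad_mse(w, data)
accum = accumulated_grad(w, data, num_micro_batches=2)

# The parameter is updated once, only after all micro-batches are processed.
w_new = w - 0.01 * accum
```

In a real training loop the same effect is usually obtained by letting the framework add gradients across several forward/backward passes before calling the optimizer.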
BERT fine-tuning on various sentence classification datasets with checkpoints converted from the official repository (#600, #571, #481). Thank you @kenjewu @haven-jeon
Dataset | MRPC | RTE | SST-2 | MNLI-m/mm |
---|---|---|---|---|
Score | 88.7% | 70.8% | 93% | 84.55%, 84.66% |
BERT fine-tuning on question answering datasets with checkpoints converted from the official repository (#493). Thank you @fiercex
Dataset | SQuAD 1.1 | SQuAD 1.1 | SQuAD 2.0 |
---|---|---|---|
Model | bert_12_768_12 | bert_24_1024_16 | bert_24_1024_16 |
F1/EM | 88.53/80.98 | 90.97/84.05 | 81.02/77.96 |
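
The F1 and EM (exact match) columns above are the usual SQuAD metrics. The sketch below shows how they are commonly computed for a single prediction and reference answer. It follows the normalization convention of the SQuAD v1.1 evaluation (lowercasing, removing punctuation and English articles), but it is a simplified illustration, not the evaluation script used to produce these numbers: taking the maximum over multiple reference answers, averaging over the dataset, and SQuAD 2.0 no-answer handling are all omitted.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


em_same = exact_match("the Eiffel Tower", "Eiffel Tower")
f1_partial = f1_score("Eiffel Tower in Paris", "Eiffel Tower")
```

Because a prediction that matches exactly also has perfect token overlap, F1 is never lower than EM for the same set of predictions.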
BERT model conversion scripts for checkpoints from the original TensorFlow repository, and more converted models (#456, #461, #449). Thank you @fiercex
Scripts and command line interface for BERT embedding of raw sentences (#587, #618). Thank you @imgarylai
Scripts for exporting BERT model for deployment (#624)
BERT
ELMo
Word Embedding
Natural Language Inference
Dependency Parsing
Text Classification
GluonNLP v0.3 contains many exciting new features. (depends on MXNet 1.3.0b20180725)
TokenEmbedding class (#185)
GluonNLP provides its users with easy access to
Gluon NLP Toolkit supplies model definitions for common NLP tasks. These can be adapted to the user's requirements or taken as a blueprint for new developments. All of them are implemented using Gluon Blocks, allowing easy reuse as plug-and-play neural network building blocks.
Gluon NLP Toolkit provides tools for building efficient data pipelines for NLP tasks by defining a Dataset class interface and utilities for transforming datasets. Several datasets are included by default and are downloaded automatically when first used.
Gluon NLP further ships with common data transformation functions, dataset samplers that determine how to iterate through a dataset, and functions that generate data batches.
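
The four pipeline pieces named above (a dataset interface, a transform, a sampler, and a batch-generating function) can be sketched in plain Python as follows. This is a conceptual illustration only and deliberately does not use the toolkit's API: `ListDataset`, `length_sorted_batches`, and `pad_batch` are hypothetical names, and the sentences are made-up examples.

```python
class ListDataset:
    """Minimal dataset interface: length plus index-based access."""

    def __init__(self, samples):
        self._samples = list(samples)

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, idx):
        return self._samples[idx]

    def transform(self, fn):
        """Return a new dataset with fn applied to every sample."""
        return ListDataset(fn(s) for s in self._samples)


def length_sorted_batches(lengths, batch_size):
    """Sampler: group indices of similar length to reduce padding."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


def pad_batch(sequences, pad_value=0):
    """Batchify: right-pad every sequence to the longest one in the batch."""
    max_len = max(len(s) for s in sequences)
    return [s + [pad_value] * (max_len - len(s)) for s in sequences]


sentences = ["gluon makes nlp easy", "hello world", "nlp", "hello gluon nlp"]

# Build a toy vocabulary; index 0 is reserved for padding.
tokens = sorted({tok for s in sentences for tok in s.split()})
vocab = {tok: i + 1 for i, tok in enumerate(tokens)}

# Transform: map each sentence to a list of token indices.
dataset = ListDataset(sentences).transform(
    lambda s: [vocab[t] for t in s.split()]
)

# Sample indices by length, then pad each group into a rectangular batch.
lengths = [len(dataset[i]) for i in range(len(dataset))]
batches = [
    pad_batch([dataset[i] for i in idx])
    for idx in length_sorted_batches(lengths, batch_size=2)
]
```

Grouping sequences of similar length before padding keeps the amount of padding per batch small, which is the main reason a pipeline separates the sampler from the batch-generating function.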