BERT Extension in TensorFlow
BERT (Bidirectional Encoder Representations from Transformers) is an autoencoding pretraining method proposed by the Google AI Language team, which obtained new state-of-the-art results on 11 NLP tasks ranging from question answering and natural language inference to sentiment analysis. BERT pretrains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which allows it to be fine-tuned for downstream tasks without substantial task-specific architecture modifications. This project aims to provide extensions built on top of the current BERT codebase and bring the power of BERT to other NLP tasks such as NER and NLU.
Figure 1: Illustrations of fine-tuning BERT on different tasks
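Conceptually, each extension keeps the pretrained BERT encoder and only adds a small task-specific head on top, as in Figure 1. Below is a minimal sketch of what the NER case amounts to, assuming `modeling.py` from the google-research/bert codebase is importable; the function and variable names are illustrative, not this project's actual code.

```python
import tensorflow as tf
import modeling  # modeling.py from the google-research/bert codebase

def create_ner_model(bert_config, is_training, input_ids, input_mask,
                     segment_ids, num_labels):
    """Illustrative token-level classification head on top of BERT."""
    bert = modeling.BertModel(
        config=bert_config,
        is_training=is_training,
        input_ids=input_ids,
        input_mask=input_mask,
        token_type_ids=segment_ids)

    # [batch_size, max_seq_length, hidden_size] contextual token representations.
    sequence_output = bert.get_sequence_output()
    if is_training:
        sequence_output = tf.nn.dropout(sequence_output, keep_prob=0.9)

    # Project each token representation onto the NER label space.
    logits = tf.layers.dense(sequence_output, num_labels, name="ner_logits")
    predicted_labels = tf.argmax(logits, axis=-1)
    return logits, predicted_labels
```

Fine-tuning then updates this head jointly with all encoder layers, which is why no substantial task-specific architecture is required.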
Preprocess the raw CoNLL2003 data into JSON format:
python prepro/prepro_conll.py \
--data_format json \
--input_file data/ner/conll2003/raw/eng.xxx \
--output_file data/ner/conll2003/xxx-conll2003/xxx-conll2003.json
Fine-tune BERT on the CoNLL2003 NER task, then evaluate, predict, and export the trained model:
CUDA_VISIBLE_DEVICES=0 python run_ner.py \
--task_name=conll2003 \
--do_train=true \
--do_eval=true \
--do_predict=true \
--do_export=true \
--data_dir=data/ner/conll2003 \
--vocab_file=model/cased_L-12_H-768_A-12/vocab.txt \
--bert_config_file=model/cased_L-12_H-768_A-12/bert_config.json \
--init_checkpoint=model/cased_L-12_H-768_A-12/bert_model.ckpt \
--max_seq_length=128 \
--train_batch_size=32 \
--eval_batch_size=8 \
--predict_batch_size=8 \
--learning_rate=2e-5 \
--num_train_epochs=5.0 \
--output_dir=output/ner/conll2003/debug \
--export_dir=output/ner/conll2003/export
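The export step writes a timestamped SavedModel under `--export_dir`. Before serving it, the exact input and output tensor names of its serving signature can be checked; below is a minimal TF 1.x sketch, where `xxxxx` stands for the generated timestamp directory.

```python
import tensorflow as tf

# Timestamped SavedModel directory produced by --do_export (name is generated at export time).
export_dir = "output/ner/conll2003/export/xxxxx"

with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(sess, ["serve"], export_dir)
    # List the serving signatures with their input and output tensor names.
    for name, signature in meta_graph.signature_def.items():
        print(name, list(signature.inputs.keys()), "->", list(signature.outputs.keys()))
```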
Training progress can be monitored with TensorBoard:
tensorboard --logdir=output/ner/conll2003
Serve the exported model with TensorFlow Serving:
docker run -p 8500:8500 \
-v $(pwd)/output/ner/conll2003/export/xxxxx:/models/ner \
-e MODEL_NAME=ner \
-t tensorflow/serving
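Once the container is running, predictions can be requested over gRPC on port 8500. Below is a minimal client sketch using the `tensorflow-serving-api` package; the feature names `input_ids`, `input_mask`, and `segment_ids` are assumptions about the exported signature, so verify them with the signature inspection shown earlier.

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "ner"                        # matches MODEL_NAME above
request.model_spec.signature_name = "serving_default"  # assumed default signature

# Dummy single-example batch padded to max_seq_length=128; real requests would carry
# ids produced by the BERT tokenizer. The feature names below are assumptions.
max_seq_length = 128
for feature in ("input_ids", "input_mask", "segment_ids"):
    request.inputs[feature].CopyFrom(
        tf.make_tensor_proto([[0] * max_seq_length], dtype=tf.int32))

response = stub.Predict(request, timeout=10.0)
print(response.outputs)
```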
Figure 2: Illustrations of fine-tuning BERT on the NER task
| CoNLL2003 - NER | Avg. (5-run) | Best |
| --- | --- | --- |
| Precision | 91.37 ± 0.33 | 91.87 |
| Recall | 92.37 ± 0.25 | 92.68 |
| F1 Score | 91.87 ± 0.28 | 92.27 |
Table 1: Test set performance of the fine-tuned BERT-large model on the CoNLL2003-NER task with settings: batch size = 16, max sequence length = 128, learning rate = 2e-5, number of epochs = 5.0
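The numbers above are entity-level (CoNLL-style) precision, recall, and F1. This project's own evaluation script is not reproduced here, but equivalent chunk-based scores can be computed from gold and predicted tag sequences with the seqeval package; a small illustrative example:

```python
from seqeval.metrics import precision_score, recall_score, f1_score

# Gold and predicted BIO tag sequences, one list per sentence (illustrative values).
y_true = [["B-PER", "I-PER", "O", "B-LOC", "O"]]
y_pred = [["B-PER", "I-PER", "O", "O", "O"]]

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))
```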
Figure 3: Illustrations of fine-tuning BERT on the NLU task
| ATIS - NLU | Avg. (5-run) | Best |
| --- | --- | --- |
| Accuracy - Intent | 97.38 ± 0.19 | 97.65 |
| F1 Score - Slot | 95.61 ± 0.09 | 95.53 |
Table 2: Test set performance of the fine-tuned BERT-large model on the ATIS-NLU task with settings: batch size = 16, max sequence length = 128, learning rate = 2e-5, number of epochs = 5.0