State-of-the-art faster Transformers with TensorFlow 2.0 (NLP, Computer Vision, Audio).
Website: https://legacyai.github.io/tf-transformers
tf-transformers: faster and easier state-of-the-art Transformer in TensorFlow 2.0
Imagine auto-regressive generation being 90% faster. tf-transformers (TensorFlow Transformers) is designed to harness the full power of TensorFlow 2, built specifically for Transformer-based architectures.
These models can be applied to text (NLP), images (computer vision), and audio.
Benchmark: GPT2 text generation with `max_length=64` and `num_beams=3`:

| Implementation | Time |
| --- | --- |
| tf_transformers | 31 minutes |
| huggingface_tf | 83 minutes |
| huggingface_pt | 36 minutes |
| huggingface_jax | 35 minutes |
Going from 83 minutes to 31 minutes is a significant speedup: roughly 2.7x faster, a 63% reduction in wall-clock time.
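For reference, here is a minimal sketch of how such a wall-clock comparison can be measured for the HuggingFace TensorFlow baseline; the prompts and loop are placeholders, not the actual benchmark script:

```python
# Hypothetical timing sketch for the HuggingFace TF baseline.
# The prompts and loop are placeholders, not the real benchmark.
import time

from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2LMHeadModel.from_pretrained("gpt2")

prompts = ["Machine learning is", "The weather today is"]  # placeholder data
start = time.perf_counter()
for text in prompts:
    input_ids = tokenizer(text, return_tensors="tf").input_ids
    model.generate(input_ids, max_length=64, num_beams=3)
print(f"elapsed: {time.perf_counter() - start:.1f}s")
```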
On average, tf-transformers is 80-90% faster than the HuggingFace TensorFlow implementation, and in most cases it is comparable to or faster than PyTorch.
More benchmarks can be found in the benchmark documentation.
This repository is tested on Python 3.7+ and TensorFlow 2.7.
Install the required dependencies:

```bash
pip install sentencepiece
pip install "tensorflow-text>=2.7.3"
pip install tqdm
```

Install TensorFlow >= 2.7.0 (CPU or GPU) as appropriate for your machine.
You should install tf-transformers in a virtual environment. If you're unfamiliar with Python virtual environments, check out the user guide.
First, create a virtual environment with the version of Python you're going to use and activate it.
Then, you will need to install TensorFlow. Please refer to the TensorFlow installation page for the specific install command for your platform. We highly recommend installing tensorflow-text (https://www.tensorflow.org/text).
Once TensorFlow has been installed, tf-transformers can be installed using pip as follows:
```bash
pip install tf-transformers
```
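You can verify the installation with a quick import:

```bash
python -c "import tf_transformers"
```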
To install from source:

```bash
git clone https://github.com/legacyai/tf-transformers.git
pip install poetry
cd tf-transformers
poetry install
```
The tf-transformers API is simple and minimalistic.
```python
>>> from tf_transformers.models import GPT2Model
>>> model = GPT2Model.from_pretrained('gpt2')
>>> model.save_checkpoint("/tmp/gpt2_model/")  # save the model weights
```
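To restore those weights later, a matching load call should work; `load_checkpoint` and its argument here are an assumption based on the save API above, so check the library docs for the exact signature:

```python
>>> from tf_transformers.models import GPT2Model
>>> model = GPT2Model.from_pretrained('gpt2')
>>> model.load_checkpoint("/tmp/gpt2_model/")  # assumed counterpart of save_checkpoint
```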
For text generation, it is very important to add `use_auto_regressive=True`. This is required for all models.
```python
>>> from tf_transformers.models import GPT2Model
>>> model = GPT2Model.from_pretrained('gpt2', use_auto_regressive=True)
```
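As a sketch of how the auto-regressive model might then be used for decoding: `TextDecoder`, its `decode` signature, and the input dict below are assumptions based on typical tf-transformers usage (with the HuggingFace tokenizer used only for illustration), so consult the documentation for the exact API:

```python
# Hypothetical decoding sketch; names and signatures are assumptions.
import tensorflow as tf
from transformers import GPT2Tokenizer  # tokenizer used for illustration only

from tf_transformers.text import TextDecoder  # assumed module path

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
inputs = {"input_ids": tf.constant([tokenizer.encode("Machine learning is")])}

decoder = TextDecoder(model=model)  # model built with use_auto_regressive=True
result = decoder.decode(inputs, mode="beam", num_beams=3, max_iterations=64)
```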
To serialize (save) and load a model:
```python
>>> from tf_transformers.models import GPT2Model
>>> model = GPT2Model.from_pretrained('gpt2')
>>> model.save_transformers_serialized("/tmp/gpt2_serialized/")

# To load a serialized model for inference in production:
>>> import tensorflow as tf
>>> loaded = tf.saved_model.load("/tmp/gpt2_serialized/")
>>> model = loaded.signatures['serving_default']
```
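Continuing from the block above, the loaded signature can then be called with keyword tensors; the input name `input_ids` below is an assumption, so inspect the signature's structured input to confirm:

```python
>>> print(model.structured_input_signature)  # inspect expected input names/dtypes
>>> outputs = model(input_ids=tf.constant([[318, 257, 1332]]))  # 'input_ids' is an assumed name
```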
In tf-transformers we mostly follow the Keras Functional API. All models in tf-transformers are connected and always provide the following ways to inspect their inputs and outputs:
- Inputs: if the model is a `tf.keras.Model` or `tf_transformers.core.LegacyModel`, use `print(model.input)`; if it is a `tf.keras.layers.Layer` or `tf_transformers.core.LegacyLayer`, use `print(model.model_inputs)`.
- Outputs: if the model is a `tf.keras.Model` or `tf_transformers.core.LegacyModel`, use `print(model.output)`; if it is a `tf.keras.layers.Layer` or `tf_transformers.core.LegacyLayer`, use `print(model.model_outputs)`.
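For example, for a model returned by `GPT2Model.from_pretrained` (a `tf.keras.Model`/`LegacyModel`, as in the quick tour above):

```python
>>> from tf_transformers.models import GPT2Model
>>> model = GPT2Model.from_pretrained('gpt2')
>>> print(model.input)   # symbolic input tensor(s)
>>> print(model.output)  # symbolic output tensor(s)
```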
We have tutorials covering pre-training, fine-tuning, classification, QA, NER, and much more.
- Use state-of-the-art models in production with fewer than 10 lines of code.
- Make industry-grade experience available to students and the community through clear tutorials.
- Train any model on GPU, multi-GPU, or TPU with `tf.keras.Model.fit` (see the sketch after this list).
- Customize any model or pipeline with minimal or no code changes.
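As referenced in the list above, here is a generic Keras sketch of the `tf.keras.Model.fit` workflow on multiple GPUs; the toy model, dataset, and loss are placeholders, not a tf-transformers-specific recipe:

```python
# Generic tf.keras sketch: the same pattern scales from one GPU to
# multi-GPU (MirroredStrategy) or TPU (TPUStrategy). The toy model
# and dataset below are placeholders.
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # swap for TPUStrategy on TPU
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(2),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Placeholder dataset of (features, labels).
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal((32, 16)), tf.random.uniform((32,), maxval=2, dtype=tf.int32))
).batch(8)
model.fit(dataset, epochs=1)
```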
The Research section has code for pre-training different models, ranging from **MLM, T5, CLIP, etc.** All these scripts are designed to harness the full power of the tensorflow-io pipeline and are tested on TPU V2 and TPU V3. Bugs are expected there, but the scripts serve as a starting point for practitioners to build on or modify what we have already done.
We have conducted a few experiments to squeeze more out of ALBERT base models (the concept is applicable to any model, and in tf-transformers it is available out of the box). The idea is to minimize the loss for the specified task at each layer of the model and check the predictions at each layer. In our experiments, we were able to get the best smaller model (thanks to ALBERT): from layer 4 onwards we beat all smaller models on the GLUE benchmark, and by layer 6 we reached a GLUE score of 81.0, which is 4 points ahead of DistilBERT (GLUE score 77) and 3 points ahead of MobileBERT (GLUE score 78). The ALBERT model has 14 million parameters, and by using only the first 6 layers we were able to speed up computation by 50%. The concept is applicable to all models and tasks.
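A minimal sketch of that idea, with placeholder shapes and a toy task head (this illustrates the concept, not the tf-transformers implementation):

```python
# Sketch: joint loss over per-layer predictions. `layer_outputs` stands in
# for per-layer hidden states of an encoder (e.g. ALBERT); the shared head
# and loss are placeholders for the real task setup.
import tensorflow as tf

num_layers, batch, hidden, num_classes = 12, 8, 64, 2
layer_outputs = [tf.random.normal((batch, hidden)) for _ in range(num_layers)]
labels = tf.random.uniform((batch,), maxval=num_classes, dtype=tf.int32)

head = tf.keras.layers.Dense(num_classes)  # shared task head
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Total loss is the mean of the task loss applied at every layer, so each
# layer can serve as a standalone, smaller model at inference time.
per_layer_losses = [loss_fn(labels, head(h)) for h in layer_outputs]
total_loss = tf.reduce_mean(per_layer_losses)
```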
By splitting the input sequence into blocks, attending within each block, and merging with an FFN layer, we have shown that smaller machines can process sequences of up to 4096 tokens on a single V100 GPU. The model outperforms Pegasus Base (128 million parameters) on PubMed summarisation despite having only 60 million parameters.
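A rough sketch of the block-attention idea with illustrative shapes (not the actual model):

```python
# Illustrative only: local (block) self-attention followed by an FFN layer.
import tensorflow as tf

batch, seq_len, block, d_model = 2, 4096, 256, 64
x = tf.random.normal((batch, seq_len, d_model))

# Split the sequence into blocks: (batch * num_blocks, block, d_model).
num_blocks = seq_len // block
blocks = tf.reshape(x, (batch * num_blocks, block, d_model))

# Full attention only within each block: cost grows with the block size,
# not with the full 4096-token sequence.
attn = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)
local = attn(blocks, blocks)

# Recombine the blocks and apply the FFN merge layer.
ffn = tf.keras.Sequential([
    tf.keras.layers.Dense(4 * d_model, activation="gelu"),
    tf.keras.layers.Dense(d_model),
])
out = ffn(tf.reshape(local, (batch, seq_len, d_model)))
```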
tf-transformers currently provides the following architectures.
We now have a page you can cite for the tf-transformers library.