A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
TFMOT 0.8.0 forces users to use Keras v2.
TFMOT 0.7.5 fixes compatibility issues with newer Keras versions.
TFMOT 0.7.4 adds a from_config method to the QuantizeConfig class, since the new Keras serialization enforces it.
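The serialization contract behind the 0.7.4 change can be sketched in plain Python. This is a minimal illustration, not TFMOT code: `FixedBitsConfig` and its `num_bits` and `per_axis` arguments are hypothetical names, and the sketch only assumes the usual Keras convention that `from_config` rebuilds an object from the dictionary returned by `get_config`.

```python
class FixedBitsConfig:
    """Hypothetical config object used for illustration; not a TFMOT class."""

    def __init__(self, num_bits=8, per_axis=False):
        self.num_bits = num_bits
        self.per_axis = per_axis

    def get_config(self):
        # Return only plain, serializable constructor arguments.
        return {"num_bits": self.num_bits, "per_axis": self.per_axis}

    @classmethod
    def from_config(cls, config):
        # Rebuild the object from the dictionary produced by get_config().
        return cls(**config)


original = FixedBitsConfig(num_bits=4, per_axis=True)
restored = FixedBitsConfig.from_config(original.get_config())
```

A custom QuantizeConfig subclass would likely need the same pair of methods so that a saved model can be reloaded, but check the class documentation for the exact requirements.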
TFMOT 0.7.3 adds a remove_input_range method that removes the input range after quantization has been applied.
TFMOT 0.7.2 removes support for the PeepholeLSTMCell layer, which was removed in Keras.
TFMOT 0.7.1 fixes a bug in tensor_encoding in commit 9e4c106267a4a7f61e0d90b0848db15fd063b80e.
TFMOT 0.7.0 updates the Quantization Aware Training (QAT) and Pruning APIs. It adds support for structured (MxN) pruning. QAT now also supports layers with swish activations and offers the ability to disable per-axis quantization in the default 8-bit scheme. This release also adds support for combining pruning, QAT and weight clustering.
Keras quantization API: Tested against TensorFlow 2.6.0, 2.5.1 and nightly with Python 3.
Keras pruning API: Tested against TensorFlow 2.6.0, 2.5.1 and nightly with Python 3.
Keras clustering API:
Actual commit for release: d6556c2a591c928fc8b9b723b4909639193ecf14
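The structured (MxN) pruning added in 0.7.0 can be illustrated with a small plain-Python sketch. It is not the library's implementation: `prune_m_by_n` is a hypothetical helper, and it assumes the common reading of the pattern, in which the M smallest-magnitude values in every block of N consecutive weights are set to zero (2 out of 4 here).

```python
def prune_m_by_n(weights, m=2, n=4):
    """Zero the m smallest-magnitude values in each block of n consecutive weights."""
    if len(weights) % n != 0:
        raise ValueError("length of weights must be a multiple of n")
    pruned = []
    for start in range(0, len(weights), n):
        block = weights[start:start + n]
        # Positions of the m entries with the smallest absolute value.
        drop = set(sorted(range(n), key=lambda i: abs(block[i]))[:m])
        pruned.extend(0.0 if i in drop else w for i, w in enumerate(block))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse_row = prune_m_by_n(row)
# sparse_row == [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Because every block keeps the same number of non-zero values, the resulting layout is regular enough for hardware that accelerates fixed sparsity patterns.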
TFMOT 0.6.0 adds some additional features for Quantization Aware Training. It adds support for overriding and subclassing default quantization schemes, and an input quantizer for annotated quantized layers without annotated input layers. It also adds a pruning policy for pruning registries to support different hardware, as well as support for Conv2DTranspose and tanh activations.
Keras quantization API: Tested against TensorFlow 2.4.2, 2.5.0 and nightly with Python 3.
Keras pruning API: Tested against TensorFlow 2.4.2, 2.5.0 and nightly with Python 3.
Keras clustering API:
Actual commit for release: https://github.com/tensorflow/model-optimization/commit/525accb4d3ed3bc6d345143fb0fa1d8faa0ce23d.
TFMOT 0.5.0 adds some additional features for Quantization Aware Training. QAT now supports the Keras layers SeparableConv2D and SeparableConv1D. It also provides a new quantizer, AllValuesQuantizer, which allows for more flexibility with range selection.
Keras clustering API: Tested against TensorFlow 1.14.0 and 2.3.0 with Python 3.
Keras quantization API: Tested against TensorFlow 2.3.0 with Python 3.
Keras pruning API: Tested against TensorFlow 1.14.0 and 2.3.0 with Python 3.
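The range selection idea behind the AllValuesQuantizer introduced in 0.5.0 can be sketched without TensorFlow. This is an illustration under assumptions, not the library's code: `AllValuesRange` and `fake_quantize` are hypothetical names, and the sketch assumes the range is the minimum and maximum of every value seen so far, with values then snapped to an evenly spaced 8-bit grid inside that range.

```python
class AllValuesRange:
    """Hypothetical tracker that widens its range to cover every value seen."""

    def __init__(self):
        self.min = float("inf")
        self.max = float("-inf")

    def update(self, batch):
        self.min = min(self.min, min(batch))
        self.max = max(self.max, max(batch))


def fake_quantize(values, range_min, range_max, num_bits=8):
    """Clip values to [range_min, range_max] and snap them to 2**num_bits levels."""
    levels = 2 ** num_bits - 1
    scale = (range_max - range_min) / levels
    if scale == 0:
        return list(values)
    out = []
    for v in values:
        clipped = min(max(v, range_min), range_max)
        step = round((clipped - range_min) / scale)
        out.append(range_min + step * scale)
    return out


tracker = AllValuesRange()
for batch in ([0.1, 0.5, -0.2], [1.5, -1.0, 0.3]):
    tracker.update(batch)

# The tracked range is now [-1.0, 1.5]; values outside it are clipped.
quantized = fake_quantize([0.0, 2.0, -3.0], tracker.min, tracker.max)
```

Tracking the full observed range avoids clipping any value seen during calibration, at the cost of coarser steps when a few outliers stretch the range.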
TFMOT 0.4.1 fixes a bug that made the 0.4.0 quantization code fail when run against tf-nightly builds since July 31, 2020. The code now works against different versions of TF and is not broken by changes to smart_cond in core TF.
Keras clustering API:
Keras quantization API:
Keras pruning API: