AIMET Versions

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

1.31.0

1 month ago

1.30.0

3 months ago

What's New

ONNX

  • Upgraded AIMET to support ONNX version 1.14 and ONNX Runtime version 1.15.
  • Added support for AutoQuant.

Documentation

1.29.0

5 months ago

What's New

Keras

  • Fixed issues with TFOpLambda layers in the QcQuantizeWrapper call.

PyTorch

  • [experimental] Support for embedding AIMET encodings within the graph using ONNX quantize/dequantize operators. Currently, this option is supported only when using 8-bit per-tensor quantization.
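
A minimal sketch of how this export path might be exercised, assuming the standard QuantizationSimModel workflow; the use_embedded_encodings flag name, the toy model, and the calibration callback below are assumptions rather than the documented API:

    # Hedged sketch: export a quantized model with encodings embedded as ONNX
    # quantize/dequantize nodes. The use_embedded_encodings flag name is an
    # assumption; consult the AIMET API docs for the exact export signature.
    import torch
    from aimet_torch.quantsim import QuantizationSimModel

    model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
    dummy_input = torch.randn(1, 3, 32, 32)

    def calibrate(sim_model, _):
        # Run representative data through the sim model to collect ranges
        with torch.no_grad():
            sim_model(dummy_input)

    sim = QuantizationSimModel(model, dummy_input=dummy_input,
                               default_param_bw=8, default_output_bw=8)
    sim.compute_encodings(forward_pass_callback=calibrate,
                          forward_pass_callback_args=None)

    # Export with encodings embedded as QDQ operators (8-bit per-tensor only)
    sim.export(path='./export', filename_prefix='model_qdq',
               dummy_input=dummy_input, use_embedded_encodings=True)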

ONNX

  • Added support for AdaRound (adaptive rounding).
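
A hedged sketch of applying AdaRound through the new aimet_onnx path, assuming its interface mirrors the aimet_torch AdaRound API; the module path, AdaroundParameters fields, and the calibration-loader helper are assumptions:

    # Hedged sketch: AdaRound for an ONNX model. Module path and signatures are
    # assumed to mirror the aimet_torch API; check the aimet_onnx docs.
    import onnx
    from aimet_onnx.adaround.adaround_weight import Adaround, AdaroundParameters

    model = onnx.load('model.onnx')
    data_loader = make_calibration_loader()  # hypothetical iterable of input batches

    params = AdaroundParameters(data_loader=data_loader, num_batches=4,
                                default_num_iterations=32)

    # Writes the rounded weights and parameter encodings under ./adaround_out
    adarounded_model = Adaround.apply_adaround(model, params,
                                               path='./adaround_out',
                                               filename_prefix='model',
                                               default_param_bw=8)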

TensorFlow

  • No significant updates

Documentation

1.28.1

6 months ago

1.28.0

8 months ago

What's New

Keras

  • Added support for the Spatial SVD compression feature.
  • [experimental] Debugging APIs have been added for dumping intermediate tensor outputs. This data can be used with current QNN/SNPE tools for debugging accuracy problems.

PyTorch

  • Upgraded the AIMET PyTorch default version to 1.13. AIMET remains compatible with PyTorch version 1.9.

ONNX

  • [experimental] Debugging APIs have been added for dumping intermediate tensor outputs. This data can be used with current QNN/SNPE tools for debugging accuracy problems.

TensorFlow

  • No significant updates

Documentation

1.27.0

9 months ago

What's New

Keras

  • Updated support for TFOpLambda layers with extra call args/kwargs in batch norm folding.

PyTorch

  • Upgraded AIMET to support PyTorch version 1.13.0. Only ONNX opset 14 is supported for export.
  • [experimental] Debugging APIs have been added for dumping intermediate tensor data, which can be used with current QNN/SNPE tools for debugging accuracy problems (a hedged usage sketch follows this list). Known issue: the Layer Output Generation API gives incorrect tensor data for the layer just before a ReLU when used on the original FP32 model.
  • [experimental] Support for embedding AIMET encodings within the graph using ONNX quantize/dequantize operators. Currently, this option is supported only when using 8-bit per-tensor quantization.
  • Fixed a bug in AIMET QuantSim for PyTorch models to handle non-contiguous tensors.
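
A hedged sketch of the intermediate-output dumping flow mentioned above; the aimet_torch.layer_output_utils module path, the LayerOutputUtil constructor, and the toy model are assumptions based on the AIMET documentation:

    # Hedged sketch: dump per-layer intermediate outputs for accuracy debugging.
    # Module path and method names are assumptions; see the AIMET PyTorch docs.
    import torch
    from aimet_torch.layer_output_utils import LayerOutputUtil

    model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
    input_batch = torch.randn(1, 3, 32, 32)

    # Writes one file per layer output under the given directory; the same call
    # can be made on a QuantizationSimModel to compare quantized vs. FP32 data.
    layer_output_util = LayerOutputUtil(model=model, dir_path='./layer_outputs')
    layer_output_util.generate_layer_outputs(input_batch)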

ONNX

  • Added support for ONNX 1.11.0. However, op support in QNN/SNPE is currently limited; if the model fails to load, please continue to use opset 11 for export.

TensorFlow

  • [experimental] Debugging APIs have been added for dumping intermediate tensor outputs. This data can be used with current QNN/SNPE tools for debugging accuracy problems.

Documentation

1.26.1

10 months ago

What's New

TensorFlow

  • Upgraded AIMET to support TensorFlow version 2.10.1 (AIMET remains compatible with TensorFlow 2.4).
  • Several bug fixes

Common

  • Upgraded to Ubuntu 20 base image for all variants.

Documentation

1.26.0

1 year ago

What's New

Keras

  • Added a feature called BN Re-estimation that can improve model accuracy after QAT for INT4 quantization.
  • Updated the AutoQuant feature to automatically choose the optimal calibration scheme and to create an HTML report describing which optimizations were applied.
  • Updated the Model Preparer to replace separable convolutions with depthwise and pointwise conv layers.
  • Fixed the BN fold implementation to account for a subsequent multi-input layer.
  • Fixed a bug where min/max encoding values were not aligned with scale/offset during QAT.

PyTorch

  • Several bug fixes

TensorFlow

  • Added a feature called BN Re-estimation that can improve model accuracy after QAT for INT4 quantization.
  • Updated the AutoQuant feature to automatically choose the optimal calibration scheme and to create an HTML report describing which optimizations were applied.
  • Fixed a bug where min/max encoding values were not aligned with scale/offset during QAT.

Common

  • Documentation updates for taking AIMET models to target.
  • Converted standalone BatchNorm layer parameters so that the layer behaves as a linear/dense layer.

Experimental

  • Added new Architecture Checker feature to identify and report model architecture constructs that are not ideal for quantized runtimes. Users can utilize this information to change their model architectures accordingly.
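
A hedged sketch of how the checker might be invoked on a PyTorch model; the aimet_torch.arch_checker module path, the check_model_arch entry point, and the toy model are assumptions:

    # Hedged sketch: run the experimental Architecture Checker on a PyTorch model.
    # Module path and entry-point name are assumptions; see the AIMET docs.
    import torch
    from aimet_torch.arch_checker.arch_checker import ArchChecker

    model = torch.nn.Sequential(torch.nn.Conv2d(3, 30, 3), torch.nn.ReLU()).eval()
    dummy_input = torch.randn(1, 3, 32, 32)

    # Emits a report flagging constructs that tend to quantize or run poorly
    # on target (for example, unusual channel counts or activation choices).
    ArchChecker.check_model_arch(model, dummy_input)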

Documentation

1.25.0

1 year ago

What's New

Keras

  • Added the QuantAnalyzer feature (a hedged usage sketch follows this list)
  • Added batch normalization folding for functional Keras models. This allows the default config files to work for supergroups.
  • Resolved an issue with quantizer placement in Sequential blocks in subclassed models
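
A hedged sketch of QuantAnalyzer on a Keras model, assuming the Keras entry point mirrors the PyTorch QuantAnalyzer API; the module path, the CallbackFunc import, the analyze() arguments, and the toy model and callbacks are all assumptions:

    # Hedged sketch: QuantAnalyzer on a Keras model. Module path, CallbackFunc
    # and analyze() arguments are assumptions; consult the AIMET Keras API docs.
    import numpy as np
    import tensorflow as tf
    from aimet_common.utils import CallbackFunc
    from aimet_tensorflow.keras.quant_analyzer import QuantAnalyzer

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(8, 3, input_shape=(32, 32, 3)),
        tf.keras.layers.ReLU()])
    calib_data = np.random.randn(8, 32, 32, 3).astype(np.float32)

    def forward_pass(keras_model, _):
        keras_model.predict(calib_data)   # calibration forward passes

    def evaluate(keras_model, _):
        return 0.0                        # hypothetical accuracy metric

    analyzer = QuantAnalyzer(model,
                             forward_pass_callback=CallbackFunc(forward_pass, None),
                             eval_callback=CallbackFunc(evaluate, None))

    # Produces per-layer sensitivity data and encoding range statistics
    analyzer.analyze(default_param_bw=8, default_output_bw=8,
                     results_dir='./quant_analyzer_results')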

PyTorch

  • Added AutoQuant V2, which includes advanced features such as out-of-the-box inference, model preparer, quant scheme search, an improved summary report, etc. (a hedged usage sketch follows this list)
  • Fixes to resolve minor accuracy diffs in the learnedGrid quantizer for per-channel quantization
  • Fixes to improve EfficientNetB4 accuracy with respect to target
  • Fixed rare case where quantizer may calculate incorrect offset when generating QAT 2.0 learned encodings
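
A hedged sketch of the AutoQuant V2 flow; the aimet_torch.auto_quant_v2 module path, constructor arguments, optimize() signature, and the toy model, data loader and evaluation callback are assumptions and have changed across releases:

    # Hedged sketch: AutoQuant V2 on a PyTorch model. Module path and signatures
    # are assumptions; they have changed across AIMET releases, so consult the
    # version-specific API docs.
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from aimet_torch.auto_quant_v2 import AutoQuant

    model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
    dummy_input = torch.randn(1, 3, 32, 32)
    data_loader = DataLoader(TensorDataset(torch.randn(8, 3, 32, 32)), batch_size=4)

    def eval_callback(eval_model, num_samples=None):
        return 0.0   # hypothetical accuracy metric

    auto_quant = AutoQuant(model, dummy_input=dummy_input,
                           data_loader=data_loader, eval_callback=eval_callback)

    # Applies BN folding, CLE and AdaRound as needed while searching for a
    # quant scheme that stays within the allowed accuracy drop, and writes a
    # summary report of the optimizations applied.
    quantized_model, accuracy, encoding_path = auto_quant.optimize(
        allowed_accuracy_drop=0.01)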

TensorFlow

  • Added QuantAnalyzer feature
  • Fixed an accuracy issue due to rare cases where the incorrect BN epsilon was being used
  • Fixed an accuracy issue due to Quantsim export incorrectly recomputing QAT2.0 encodings

Common

  • Updated AIMET python package version format to support latest pip
  • Fixed an issue where not all inputs might be quantized properly

Documentation

1.24.0

1 year ago

What's New

  • Export quantsim configuration for configuring downstream target quantization

PyTorch

  • Fixes to resolve minor accuracy diffs in the learnedGrid quantizer for per-channel quantization
  • Added support for AMP 2.0 which enables faster automatic mixed precision
  • Added support for QAT for INT4 quantized models, including a feature for performing BN Re-estimation after QAT (a hedged usage sketch follows)
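
A hedged sketch of BN Re-estimation after QAT; the bn_reestimation and batch_norm_fold module paths, the function signatures, and the toy model, data loader and callbacks are assumptions:

    # Hedged sketch: BN re-estimation after INT4 QAT. Module paths and function
    # signatures are assumptions; check the AIMET PyTorch API docs.
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from aimet_torch.quantsim import QuantizationSimModel
    from aimet_torch.bn_reestimation import reestimate_bn_stats
    from aimet_torch.batch_norm_fold import fold_all_batch_norms_to_scale

    model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3),
                                torch.nn.BatchNorm2d(8),
                                torch.nn.ReLU()).eval()
    dummy_input = torch.randn(1, 3, 32, 32)
    data_loader = DataLoader(TensorDataset(torch.randn(8, 3, 32, 32)), batch_size=4)

    sim = QuantizationSimModel(model, dummy_input=dummy_input,
                               default_param_bw=4, default_output_bw=8)
    sim.compute_encodings(lambda m, _: m(dummy_input), None)

    # ... fine-tune sim.model with the usual QAT training loop here ...

    # Re-estimate BN statistics on a few batches, then fold BN into the
    # quantized layers' scale/offset before export.
    reestimate_bn_stats(sim.model, data_loader, num_batches=2,
                        forward_fn=lambda m, batch: m(batch[0]))
    fold_all_batch_norms_to_scale(sim)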

Keras

  • Added support for AMP 2.0 which enables faster automatic mixed precision
  • Support for basic transformer networks
  • Added support for subclassed models. The current subclassing feature includes support for only a single level of subclassing and does not support lambdas.
  • Added QAT per-channel gradient support
  • Minor updates to the quantization configuration
  • Fixed QuantSim bug where layers using dtypes other than float were incorrectly quantized

TensorFlow

  • Added an additional PReLU mapping pattern to ensure proper folding and quantsim node placement
  • Fixed per-channel encoding representation to align with PyTorch and Keras

Documentation