AIMET Versions

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

1.23.0

1 year ago

What's New

  • The TF-enhanced calibration scheme is now accelerated with a custom CUDA kernel and runs significantly faster
  • Installation instructions are now combined with the rest of the documentation (User Guide and API docs)

PyTorch

  • Fixed backward pass of the fake-quantize (QcQuantizeWrapper) nodes to handle symmetric mode correctly
  • Per-channel quantization is now enabled on a per-op-type basis (see the config sketch after this list)
  • Support for recursively excluding modules from a root module in QuantSim
  • Support for excluding layers when running model validator and model preparer
  • Reduced memory usage in AdaRound
  • Fixed bugs in AdaRound for per-channel quantization
  • Made ConnectedGraph more robust when identifying custom layers
  • Added Jupyter notebook-based examples for the following features: AutoQuant
  • Added support for sparse conv layers in QuantSim (experimental)
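The per-op-type control above is driven through the QuantSim configuration file. Below is a minimal sketch, assuming the usual config-file layout ("defaults" / "params" / "op_type" sections) and a per_channel_quantization key; the exact schema should be checked against the config-file documentation for your AIMET version. torchvision's resnet18 and the file name are placeholders.

    # Hedged sketch: enable per-channel quantization for Conv ops only via a
    # QuantSim config file, then build a QuantizationSimModel with it.
    import json
    import torch
    from torchvision.models import resnet18
    from aimet_common.defs import QuantScheme
    from aimet_torch.quantsim import QuantizationSimModel

    # Assumed config schema -- verify key names against your AIMET version's docs.
    config = {
        "defaults": {
            "ops": {"is_output_quantized": "True"},
            "params": {"is_quantized": "True"},
            "per_channel_quantization": "False",
        },
        "params": {"bias": {"is_quantized": "False"}},
        "op_type": {
            "Conv": {"per_channel_quantization": "True"}  # per-op-type override
        },
        "supergroups": [],
        "model_input": {},
        "model_output": {},
    }
    with open("per_channel_config.json", "w") as f:
        json.dump(config, f)

    model = resnet18().eval()
    dummy_input = torch.randn(1, 3, 224, 224)
    sim = QuantizationSimModel(model, dummy_input=dummy_input,
                               quant_scheme=QuantScheme.post_training_tf_enhanced,
                               default_param_bw=8, default_output_bw=8,
                               config_file="per_channel_config.json")
    sim.compute_encodings(lambda m, _: m(dummy_input), None)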

Keras

  • Added support for Keras per-channel quantization
  • Changed the CLE interface to accept a pre-compiled model (see the sketch after this list)
  • Added Jupyter notebook-based examples for the following features: Transformer quantization
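A minimal sketch of the updated Keras CLE call, assuming the entry point is aimet_tensorflow.keras.cross_layer_equalization.equalize_model and that a compiled tf.keras model can now be passed to it directly; MobileNet is only a placeholder model.

    # Hedged sketch: Cross-Layer Equalization on a pre-compiled Keras model.
    import tensorflow as tf
    from aimet_tensorflow.keras.cross_layer_equalization import equalize_model

    model = tf.keras.applications.MobileNet(weights=None)
    model.compile(optimizer="adam", loss="categorical_crossentropy")  # pre-compiled

    # CLE = batch-norm folding + cross-layer scaling + high-bias folding
    equalized_model = equalize_model(model)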

TensorFlow

  • Fix to avoid unnecessary indexing in AdaRound

Documentation

1.22.2

1 year ago

What's new

TensorFlow

  • Added support for supergroups: MatMul + Add
  • Added support for TF-Slim BN name with backslash
  • Added support for Depthwise + Conv in CLS (cross-layer scaling)

Documentation

1.22.1

1 year ago

What's Changed

  • Added support for QuantizableMultiHeadAttention for PyTorch nn.transformer layers by @quic-kyuykim
  • Support functional conv2d in model preparer by @quic-kyuykim (see the sketch after this list)
  • Enable QAT with multi-GPU by @quic-mangal
  • Optimize forward pass logic of PyTorch QAT 2.0 by @quic-geunlee
  • Fix functional depthwise conv support on model preparer by @quic-kyuykim
  • Fix bug in model validator to correctly identify functional ops in leaf module by @quic-klhsieh
  • Support dynamic functional conv2d in model preparer by @quic-kyuykim
  • Added updated default runtime config, also a per-channel one. Fixed n… by @quic-akhobare
  • Include residing module info in model validator by @quic-klhsieh
  • Support for Keras MultiHeadAttention Layer by @quic-ashvkuma
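Several of the items above extend the model preparer to models that call functional ops such as F.conv2d directly. A minimal sketch follows, assuming aimet_torch.model_preparer.prepare_model and ModelValidator.validate_model keep the signatures shown; TinyNet is a made-up example module.

    # Hedged sketch: rewrite functional calls as modules, then re-validate.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from aimet_torch.model_preparer import prepare_model
    from aimet_torch.model_validator.model_validator import ModelValidator

    class TinyNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(16, 3, 3, 3))

        def forward(self, x):
            x = F.conv2d(x, self.weight, padding=1)  # functional conv2d
            return F.relu(x)                         # functional activation

    model = TinyNet().eval()
    dummy_input = torch.randn(1, 3, 32, 32)

    prepared = prepare_model(model)                  # functionals become modules
    ModelValidator.validate_model(prepared, model_input=dummy_input)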

Documentation

1.22.0

1 year ago

This release has the following changes

  • Support for simulation and QAT for PyTorch transformer models, including torch.nn MultiheadAttention and encoder layers (a minimal sketch follows)
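A minimal sketch of what this enables, assuming the standard QuantizationSimModel / compute_encodings flow applies unchanged to torch.nn transformer layers; the layer sizes and random calibration tensor are placeholders.

    # Hedged sketch: quantization simulation for a torch.nn transformer encoder.
    import torch
    import torch.nn as nn
    from aimet_common.defs import QuantScheme
    from aimet_torch.quantsim import QuantizationSimModel

    model = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True),
        num_layers=2).eval()
    dummy_input = torch.randn(8, 32, 128)            # (batch, seq_len, d_model)

    sim = QuantizationSimModel(model, dummy_input=dummy_input,
                               quant_scheme=QuantScheme.post_training_tf_enhanced,
                               default_param_bw=8, default_output_bw=8)

    def forward_pass(m, _):
        with torch.no_grad():
            m(dummy_input)                           # real calibration data here

    sim.compute_encodings(forward_pass, None)
    # For QAT, fine-tune sim.model with a normal PyTorch loop, then sim.export(...)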

Documentation

1.21.0

1 year ago

This release has the following changes

  • New feature: PyTorch QuantAnalyzer - Visualize per-layer sensitivity and per-quantizer PDF histograms
  • New feature: TensorFlow AutoQuant - Automatically apply various AIMET post-training quantization techniques
  • PyTorch QAT with Range Learning: Added support for Per Channel Quantization (see the sketch after this list)
  • PyTorch: Enabled exporting of encodings for multi-output leaf modules
  • TensorFlow Adaround
    • Added ability to use configuration file in API to adapt to a specific runtime target
    • Added Per-Channel Quantization support
  • TensorFlow QuantSim: Added support for FP16 inference and QAT
  • TensorFlow Per Channel Quantization
    • Fixed speed and accuracy issues
    • Fixed zero accuracy for 16-bit per-channel quantization
    • Added support for DepthWise Conv2d Op
  • Multiple other bug fixes
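A minimal sketch of the QAT-with-range-learning flow, assuming QuantScheme.training_range_learning_with_tf_init is the scheme to select and that per-channel quantization would be switched on through a quantsim config file; the model, data, and training loop below are placeholders, not a recipe.

    # Hedged sketch: QAT with range learning -- encodings are learned during training.
    import torch
    import torch.nn as nn
    from torchvision.models import resnet18
    from aimet_common.defs import QuantScheme
    from aimet_torch.quantsim import QuantizationSimModel

    model = resnet18().eval()
    dummy_input = torch.randn(1, 3, 224, 224)
    sim = QuantizationSimModel(model, dummy_input=dummy_input,
                               quant_scheme=QuantScheme.training_range_learning_with_tf_init,
                               default_param_bw=8, default_output_bw=8)
    sim.compute_encodings(lambda m, _: m(dummy_input), None)

    # Fine-tune sim.model as usual; quantizer ranges train alongside the weights.
    sim.model.train()
    optimizer = torch.optim.SGD(sim.model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()
    for _ in range(2):                               # placeholder data / epochs
        inputs = torch.randn(4, 3, 224, 224)
        labels = torch.randint(0, 1000, (4,))
        optimizer.zero_grad()
        criterion(sim.model(inputs), labels).backward()
        optimizer.step()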

User guide: https://quic.github.io/aimet-pages/releases/1.21.0/user_guide/index.html
API documentation: https://quic.github.io/aimet-pages/releases/1.21.0/api_docs/index.html
Documentation main page: https://quic.github.io/aimet-pages/index.html

1.20.0

2 years ago

1.19.1.py37

2 years ago

Release of the AI Model Efficiency toolkit package

  • PyTorch: Added CLE support for Conv1d, ConvTranspose1d and Depthwise Separable Conv1d layers
  • PyTorch: Added High-Bias Fold support for Conv1D layer
  • PyTorch: Modified Elementwise Concat Op to support any number of tensors
  • Minor dependency fixes

User guide: https://quic.github.io/aimet-pages/releases/1.19.1/user_guide/index.html
API documentation: https://quic.github.io/aimet-pages/releases/1.19.1/api_docs/index.html
Documentation main page: https://quic.github.io/aimet-pages/index.html

NOTE: This release is functionally equivalent to the 1.19.1 release (https://github.com/quic/aimet/releases/tag/1.19.1). But it has NOT undergone rigorous testing. It has been created only for compatibility with Google Colab.

1.19.1

2 years ago

Release of the AI Model Efficiency toolkit package

  • PyTorch: Added CLE support for Conv1d, ConvTranspose1d and Depthwise Separable Conv1d layers (see the sketch after this list)
  • PyTorch: Added High-Bias Fold support for Conv1D layer
  • PyTorch: Modified Elementwise Concat Op to support any number of tensors
  • Minor dependency fixes
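A minimal sketch of the new Conv1d coverage in CLE, assuming the aimet_torch.cross_layer_equalization.equalize_model API with an input_shapes argument as used in this release line; the layer sizes are arbitrary.

    # Hedged sketch: Cross-Layer Equalization on a small Conv1d model (in place).
    import torch.nn as nn
    from aimet_torch.cross_layer_equalization import equalize_model

    model = nn.Sequential(
        nn.Conv1d(8, 16, kernel_size=3, padding=1),
        nn.BatchNorm1d(16),
        nn.ReLU(),
        nn.Conv1d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
    ).eval()

    # Folds batch norm, applies cross-layer scaling and high-bias folding.
    equalize_model(model, input_shapes=(1, 8, 64))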

User guide: https://quic.github.io/aimet-pages/releases/1.19.1/user_guide/index.html
API documentation: https://quic.github.io/aimet-pages/releases/1.19.1/api_docs/index.html
Documentation main page: https://quic.github.io/aimet-pages/index.html

1.18.0.py37

2 years ago

Release of the AI Model Efficiency toolkit package

Release Notes

  • Multiple bug fixes
  • Additional feature examples for PyTorch and TensorFlow

User guide: https://quic.github.io/aimet-pages/releases/1.18.0/user_guide/index.html
API documentation: https://quic.github.io/aimet-pages/releases/1.18.0/api_docs/index.html
Documentation main page: https://quic.github.io/aimet-pages/index.html

NOTE: This release is functionally equivalent to the 1.18.0 release (https://github.com/quic/aimet/releases/tag/1.18.0). But it has NOT undergone rigorous testing. It has been created only for compatibility with Google Colab.

1.18.0

2 years ago

Release of the AI Model Efficiency toolkit package

Release Notes

  • Multiple bug fixes
  • Additional feature examples for PyTorch and TensorFlow

User guide: https://quic.github.io/aimet-pages/releases/1.18.0/user_guide/index.html
API documentation: https://quic.github.io/aimet-pages/releases/1.18.0/api_docs/index.html
Documentation main page: https://quic.github.io/aimet-pages/index.html