Neural Magic SparseML Versions

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

v1.4.2

1 year ago

This is a patch release for 1.4.0 that contains the following changes:

v1.4.0

1 year ago

New Features:

  • OpenPifPaf training prototype support (#1171)
  • Layerwise distillation support for the PyTorch DistillationModifier (#1272); see the recipe sketch after this list
  • Recipe template API added in PyTorch for simple creation of default recipes (#1147)
  • Ability to create sample inputs and outputs on export for transformers, YOLOv5, and image classification pathways (#1180)
  • Loggers and one-shot support for torchvision training script (#1299, #1300)
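
A minimal sketch of loading a distillation recipe through SparseML's PyTorch pathway, assuming the standard DistillationModifier fields (hardness, temperature, distill_output_keys); the exact layerwise options introduced in #1272 should be verified against the docs.

    # Sketch: load a distillation recipe and attach it to a PyTorch training loop.
    from sparseml.pytorch.optim import ScheduledModifierManager

    recipe = """
    training_modifiers:
      - !DistillationModifier
        start_epoch: 0.0
        hardness: 0.5                  # weight of distillation loss vs. task loss
        temperature: 2.0               # softmax temperature for teacher/student logits
        distill_output_keys: [logits]
    """

    manager = ScheduledModifierManager.from_yaml(recipe)
    # Assumed wiring: pass the teacher via manager.modify, e.g.
    # optimizer = manager.modify(model, optimizer, steps_per_epoch=len(loader),
    #                            distillation_teacher=teacher_model)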

Changes:

  • Refactored the ONNX export pipeline to standardize implementations, add functionality for more complicated models, and improve debugging support; see the export sketch after this list. (#1192)
  • Refactored the PyTorch QuantizationModifier to expand supported models and operators and simplify the interface. (#1183)
  • YOLOv5 integration upgraded to the latest upstream. (#1322)
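
For reference, a minimal sketch of the PyTorch-to-ONNX export flow that the refactor above touches. ModuleExporter is SparseML's standard export helper; export_samples is assumed here to be the call behind the sample inputs/outputs feature (#1180).

    # Sketch: export a small model to ONNX along with sample inputs/outputs.
    import torch
    from sparseml.pytorch.utils import ModuleExporter

    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 8, kernel_size=3, stride=2),
        torch.nn.AdaptiveAvgPool2d(1),
        torch.nn.Flatten(),
        torch.nn.Linear(8, 10),
    )
    exporter = ModuleExporter(model, output_dir="exported")
    sample_batch = torch.randn(1, 3, 224, 224)
    exporter.export_onnx(sample_batch)       # writes exported/model.onnx
    exporter.export_samples([sample_batch])  # assumed API for sample inputs/outputs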

Resolved Issues:

  • recipe_template CLI no longer has improper code documentation that was impairing operability. (#1170)
  • ONNX export now enforces that all quantized graphs have uint8 values, fixing issues for some quantized models that were crashing in DeepSparse. (#1181)
  • PyTorch pruning modifiers changed over to vector_norm, resolving crashes in older PyTorch versions. (#1167)
  • Model loading fixed for the torchvision script; models had been failing to load unless a recipe was supplied. (#1281)

Known Issues:

  • None

v1.3.1

1 year ago

This is a patch release for 1.3.0 that contains the following changes:

  • NumPy version pinned to <=1.21.6 to avoid deprecation warning/index errors in pipelines.

v1.3.0

1 year ago

New Features:

  • NLP multi-label training and eval support added.
  • SQuAD v2.0 support provided.
  • Recipe template APIs introduced, enabling easier creation of recipes for custom models with standard sparsification pathways.
  • EfficientNetV2 model architectures implemented; see the registry sketch after this list.
  • Sample inputs and outputs exportable for YOLOv5, transformers, and image classification integrations.
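
A minimal sketch of instantiating one of the new EfficientNetV2 architectures through SparseML's PyTorch model registry; the exact registry key names are assumptions, so the snippet discovers the available keys first.

    # Sketch: find registered EfficientNet keys, then build one.
    from sparseml.pytorch.models import ModelRegistry

    keys = [k for k in ModelRegistry.available_keys() if "efficientnet" in k]
    print(keys)                                      # confirm the key names
    model = ModelRegistry.create(keys[0], pretrained=False)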

Changes:

  • PyTorch 1.12 and Python 3.10 now supported.
  • YOLOv5 pipelines upgraded to the latest version from Ultralytics.
  • Transformers pipelines upgraded to latest version from Hugging Face.
  • PyTorch image classification pathway upgraded using torchvision standards.
  • Recipe arguments now support list types, as shown in the sketch below.
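
A minimal sketch of overriding recipe arguments at load time, including a list-typed value. recipe_variables on ScheduledModifierManager.from_yaml is the standard override hook; the variable names and the eval() substitutions here are illustrative.

    from sparseml.pytorch.optim import ScheduledModifierManager

    recipe = """
    num_epochs: 10
    prune_params: [conv1.weight, conv2.weight]

    modifiers:
      - !GMPruningModifier
        params: eval(prune_params)
        init_sparsity: 0.0
        final_sparsity: 0.8
        start_epoch: 0.0
        end_epoch: eval(num_epochs)
        update_frequency: 0.5
    """

    # Override a scalar and a list-typed recipe argument at load time.
    manager = ScheduledModifierManager.from_yaml(
        recipe,
        recipe_variables={
            "num_epochs": 20,
            "prune_params": ["conv1.weight", "conv2.weight", "conv3.weight"],
        },
    )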

Resolved Issues:

  • Improper URLs in the ONNX export documentation fixed to point to the correct documentation pages.
  • The latest transformers version hosted by Neural Magic now installs automatically; previously it would pin to older versions and not receive updates.

Known Issues:

  • None

v1.2.0

1 year ago

New Features:

  • Document classification training and export pipelines added for transformers integration.

Changes:

  • Transformers training and export integration code refactored to enable more code reuse across use cases.
  • List of supported quantized nodes expanded to enable more complex quantization patterns for ResNet-50 and MobileBERT, improving performance for similar models.
  • Transformers integration expanded to enable saving and reloading of optimizer state from trained checkpoints.
  • Deployment folder added for the image classification integration; exports are now written to it.
  • Gradient accumulation support added for the image classification integration; see the sketch after this list.
  • Minimum Python version changed to 3.7, as 3.6 has reached EOL.
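
Gradient accumulation follows the standard PyTorch pattern; a generic, self-contained sketch (not the integration's actual code):

    # Sketch: sum gradients over several micro-batches before one optimizer
    # step, emulating a larger effective batch size.
    import torch

    model = torch.nn.Linear(8, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = torch.nn.CrossEntropyLoss()
    loader = [(torch.randn(4, 8), torch.randint(0, 2, (4,))) for _ in range(8)]

    accum_steps = 4                          # micro-batches per optimizer step
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        loss = criterion(model(inputs), targets) / accum_steps  # scale so the sum matches one big batch
        loss.backward()                      # gradients accumulate across micro-batches
        if (step + 1) % accum_steps == 0:
            optimizer.step()                 # one update per accumulation window
            optimizer.zero_grad()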

Resolved Issues:

  • Quantized checkpoints for image classification models now instantiate correctly, no longer leading to random initialization of weights instead of restoring them.
  • TrainableParamsModifier for PyTorch now enables and disables params properly so weights are frozen while training.
  • Quantized embeddings no longer cause crashes while training with distributed data parallel.
  • Improper EfficientNet definitions fixed that would lead to accuracy issues due to convolutional strides being duplicated.
  • Protobuf version for ONNX 1.12 compatibility pinned to prevent install failures on some systems.

Known Issues:

  • None

v1.1.1

1 year ago

This is a patch release for 1.1.0 that contains the following changes:

  • Some structurally modified image classification models in PyTorch would crash on reload; they now reload properly.

v1.1.0

1 year ago

New Features:

  • Native YOLACT segmentation training integration added to SparseML.
  • OBSPruning modifier added, implementing Optimal BERT Surgeon pruning (https://arxiv.org/abs/2203.07259).
  • QAT now supported for MobileBERT.
  • Custom module support provided for QAT to enable quantization of layers such as GELU; see the recipe sketch after this list.
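
A minimal sketch of enabling QAT from a recipe, assuming the QuantizationModifier interface of this release; how custom layers such as GELU are registered changed across releases, so check the 1.1.0 docs for the exact hook.

    # Sketch: turn on quantization-aware training via a recipe.
    from sparseml.pytorch.optim import ScheduledModifierManager

    recipe = """
    modifiers:
      - !QuantizationModifier
        start_epoch: 0.0
        disable_quantization_observer_epoch: 3.0  # freeze observer stats late in training
        freeze_bn_stats_epoch: 2.0                # freeze BatchNorm stats before convergence
    """

    manager = ScheduledModifierManager.from_yaml(recipe)
    # optimizer = manager.modify(model, optimizer, steps_per_epoch=len(loader))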

Changes:

  • Updates made across the repository for new SparseZoo Python APIs.
  • Non-string keys are now supported in recipes for layer and module names.
  • Native support added for DDP training with pruning in PyTorch pathways.
  • YOLOv5 P6 models default to their native activations instead of being overwritten to Hardswish.
  • Transformers eval pathways changed to turn off AMP (FP16) to give more stable results.
  • TensorBoard logger added to transformers integration.
  • Python setuptools requirement set at 59.5 to avoid installation issues with other packages.
  • DDP now works for quantized training of embedding layers; tensors had been placed on incorrect devices, causing training crashes.

Resolved Issues:

  • ConstantPruningModifier propagated None in place of the start_epoch value when start_epoch > 0. It now propagates the proper value.
  • Quantization of BERT models was improperly dropping accuracy by quantizing the identity branches.
  • SparseZoo stubs were not loading model weights for image classification pathways when using DDP training.

Known Issues:

  • None

v1.0.1

1 year ago

This is a patch release for 1.0.0 that contains the following changes:

  • Quantized ONNX graph folding resolution that prevents an extra quant/dequant pair from being added into the residuals for BERT-style models. This was causing an accuracy drop of up to 1% after exporting to ONNX and is now fixed.

v1.0.0

1 year ago

New Features:

  • One-shot and recipe arguments support added for transformers, yolov5, and torchvision.
  • Dockerfiles and new build processes created for Docker.
  • CLI formats standardized, and CLIs are now included on install of SparseML for transformers, yolov5, and torchvision.
  • N:M pruning mask creator deployed for use in PyTorch pruning modifiers; see the recipe sketch after this list.
  • Masked_language_modeling training CLI added for transformers.
  • Documentation additions made across all standard integrations and pathways.
  • GitHub action tests running for end-to-end testing of integrations.
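
A minimal sketch of an N:M (here 2:4) pruning recipe using the new mask creator. GMPruningModifier and its scheduling fields are the standard gradual-magnitude-pruning interface; the "2:4" mask_type string is an assumption to verify against the docs.

    from sparseml.pytorch.optim import ScheduledModifierManager

    recipe = """
    modifiers:
      - !GMPruningModifier
        params: __ALL_PRUNABLE__
        init_sparsity: 0.0
        final_sparsity: 0.5     # 2:4 patterns imply 50% sparsity
        start_epoch: 1.0
        end_epoch: 10.0
        update_frequency: 0.5
        mask_type: "2:4"        # assumed N:M mask string
    """

    manager = ScheduledModifierManager.from_yaml(recipe)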

Changes:

  • Click added as a root dependency; it is the new preferred route for CLI invocation and argument management.
  • Provider parameter added for ONNXRuntime InferenceSessions; see the sketch after this list.
  • onnxruntime moved to an optional install extra; it is no longer a root dependency and is only imported when specific pathways are used.
  • QAT export pipelines improved with better support for QATMatMul and custom operators.
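
For reference, the provider parameter on an ONNXRuntime InferenceSession, which the change above threads through SparseML's ONNX pathways. The session calls are standard onnxruntime API; the model path and input shape are placeholders.

    # Sketch: choose an execution provider when creating an ONNXRuntime session.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("exported/model.onnx",
                                   providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    batch = np.random.randn(1, 3, 224, 224).astype(np.float32)
    outputs = session.run(None, {input_name: batch})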

Resolved Issues:

  • Incorrect commands and models in older docs for transformers, yolov5, and torchvision updated.
  • YOLOv5 issues addressed where data files, configs, and datasets were not easily accessible with the new install pathway; they are now included in the sparseml src folder for yolov5.
  • An extra batch no longer runs for the PyTorch ModuleRunner.
  • A None sparsity parameter was being improperly propagated in the PyTorch ConstantPruningModifier.
  • PyPI dependency conflicts no longer occur with the latest ONNX and Protobuf upgrades.
  • yolov5 pathways were not working when GPUs were unavailable.
  • Transformers export was not working properly when neither --do_train nor --do_eval arguments were passed in.
  • Non-string keys now allowed within recipes.
  • Numerous fixes applied for pruning modifiers, including improper mask casting, improper initialization, and improper arguments passed through for MFAC.
  • YOLOv5 export formatting error addressed.
  • Missing or incorrect data corrected for logging and recording statements.
  • PyTorch DistillationModifier for transformers was ignoring both "self" distillation and "disable" distillation values; instead, normal distillation would be used.
  • FP16 was not deactivating on QAT start for torchvision.

Known Issues:

  • PyTorch > 1.9 quantized ONNX export is broken; waiting on PyTorch resolution and testing.

v0.12.2

1 year ago

This is a patch release for 0.12.0 that contains the following changes:

  • Protobuf is restricted to version < 4.0 as the newer version breaks ONNX.