DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.12.4...v0.12.5
deepspeed-kernels only on Linux by @aphedges in https://github.com/microsoft/DeepSpeed/pull/4739
.gitignore file to be parsed properly by @aphedges in https://github.com/microsoft/DeepSpeed/pull/4740
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.12.3...v0.12.4
module_state_dict by @LZHgrla in https://github.com/microsoft/DeepSpeed/pull/4587
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.12.2...v0.12.3
master to ensure mismatched cuda environments are shown to the user https://github.com/microsoft/DeepSpeed/commit/4f7dd7214b1d81dbbdff826015a67accc10390d2
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.12.1...v0.12.2
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.12.0...v0.12.1
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.11.2...v0.12.0
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.11.1...v0.11.2
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.11.0...v0.11.1
set_to_none=true in zero_grad methods by @Jackmin801 in https://github.com/microsoft/DeepSpeed/pull/4438
ignore_unused_parameters by @UniverseFly in https://github.com/microsoft/DeepSpeed/pull/4418
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.10.3...v0.11.0
non_reentrant_checkpoint fix requires_grad of input must be true for activation checkpoint layer in pipeline train. by @inkcherry in https://github.com/microsoft/DeepSpeed/pull/4224
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.10.2...v0.10.3