Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
- MaskablePPO was updated to match the latest SB3 PPO version (timeout handling and new method for the policy object)
- Added TRPO algorithm (@cyprienc)
- HerReplayBuffer is currently not supported by MaskablePPO
WARNING: This version will be the last one supporting Python 3.6 (end of life in Dec 2021). We highly recommend that you upgrade to Python >= 3.7.
- Removed sde_net_arch
- Added MaskablePPO algorithm (@kronion)
- MaskablePPO Dictionary Observation support (@glmcdona)
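The idea behind MaskablePPO's invalid-action masking can be sketched in a few lines. This is a minimal NumPy illustration assuming a discrete action space; `masked_softmax` is a hypothetical helper for exposition, not part of the sb3_contrib API (the library works on torch logits inside the policy).

```python
import numpy as np

def masked_softmax(logits, mask):
    """Illustrative sketch: zero out invalid actions before sampling.

    Setting the logits of invalid actions to -inf gives them exactly
    zero probability after the softmax, while the probabilities of the
    valid actions are renormalized. (Assumes at least one valid action,
    otherwise the result is NaN.)
    """
    masked = np.where(mask, logits, -np.inf)
    shifted = masked - masked.max()   # for numerical stability
    exp = np.exp(shifted)             # exp(-inf) == 0.0
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.5, -1.0])
mask = np.array([True, False, True, False])  # actions 1 and 3 are invalid
probs = masked_softmax(logits, mask)
# probs[1] and probs[3] are exactly 0.0; the rest sum to 1
```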
- Added TQC. Blog post: https://araffin.github.io/post/sb3/
- Fixed QR-DQN predict method when using deterministic=False with image space
- Added TimeFeatureWrapper to the wrappers
- Added QR-DQN algorithm (@ku2482)
- Fixed bug in TQC when saving/loading the policy only with non-default number of quantiles
- Fixed bug in QR-DQN when calculating the target quantiles (@ku2482, @guyk1971)
- Updated TQC to match new SB3 version
- Moved quantile_huber_loss to common/utils.py (@ku2482)
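The quantile Huber loss shared by QR-DQN and TQC can be sketched as follows. This is a simplified NumPy version for a single 1-D set of predicted quantiles, not the library's exact implementation (which operates on batched torch tensors); the quantile fractions are assumed to be the usual midpoints (i + 0.5) / n.

```python
import numpy as np

def quantile_huber_loss(quantiles, targets, kappa=1.0):
    """Illustrative quantile Huber loss (QR-DQN / TQC style).

    quantiles: predicted quantile values, shape (n,)
    targets:   target values, shape (m,)
    The loss is asymmetric: each quantile i is weighted by
    |tau_i - 1{u < 0}|, which pushes quantile i toward the
    tau_i-th quantile of the target distribution.
    """
    n = quantiles.shape[-1]
    tau = (np.arange(n) + 0.5) / n            # quantile midpoints, shape (n,)
    # pairwise TD errors u[i, j] = targets[j] - quantiles[i]
    u = targets[None, :] - quantiles[:, None]
    abs_u = np.abs(u)
    # Huber loss: quadratic within kappa, linear outside
    huber = np.where(abs_u <= kappa,
                     0.5 * u ** 2,
                     kappa * (abs_u - 0.5 * kappa))
    weight = np.abs(tau[:, None] - (u < 0).astype(float))
    return (weight * huber).mean()

# Perfect prediction -> zero loss; constant offset of 1 -> 0.25 here
quantile_huber_loss(np.zeros(3), np.zeros(3))  # 0.0
quantile_huber_loss(np.zeros(3), np.ones(3))   # 0.25
```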