Stable Baselines3 Contrib

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code

v1.4.0

2 years ago

Breaking Changes:

  • Dropped Python 3.6 support
  • Upgraded to Stable-Baselines3 >= 1.4.0
  • MaskablePPO was updated to match latest SB3 PPO version (timeout handling and new method for the policy object)

New Features:

  • Added TRPO (@cyprienc)
  • Added experimental support to train off-policy algorithms with multiple envs (note: HerReplayBuffer currently not supported)
  • Added Augmented Random Search (ARS) (@sgillen)

Others:

  • Improved test coverage for MaskablePPO

v1.3.0

2 years ago

WARNING: This version will be the last one supporting Python 3.6 (end of life in Dec 2021). We highly recommend upgrading to Python >= 3.7.

Breaking Changes:

  • Removed sde_net_arch
  • Upgraded to Stable-Baselines3 >= 1.3.0

New Features:

  • Added MaskablePPO algorithm (@kronion)
  • MaskablePPO Dictionary Observation support (@glmcdona)
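For readers new to invalid action masking, the idea behind the MaskablePPO entries above can be sketched in plain Python: invalid actions are excluded before the softmax, so they receive zero probability. This is a minimal illustration under that assumption, not the sb3_contrib implementation; the helper name masked_softmax is hypothetical.

```python
import math
from typing import List, Sequence


def masked_softmax(logits: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax over logits where entries with mask=False get probability 0.

    Hypothetical helper for illustration only; not part of sb3_contrib.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    max_valid = max(logit for logit, valid in zip(logits, mask) if valid)
    exps = [
        math.exp(logit - max_valid) if valid else 0.0
        for logit, valid in zip(logits, mask)
    ]
    total = sum(exps)
    return [e / total for e in exps]


# Action 1 is marked invalid, so all probability mass goes to actions 0 and 2.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

Sampling from the resulting distribution can never pick a masked action, which is the property the algorithm relies on.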

v1.2.0

2 years ago

Breaking Changes:

  • Upgraded to Stable-Baselines3 >= 1.2.0

Bug Fixes:

  • QR-DQN and TQC updated so that their policies are switched between train and eval mode at the correct time (@ayeright)

Others:

  • Fixed type annotation
  • Added Python 3.9 to CI

v1.1.0

2 years ago

Breaking Changes:

  • Added support for Dictionary observation spaces (cf. SB3 doc)
  • Upgraded to Stable-Baselines3 >= 1.1.0
  • Added proper handling of timeouts for off-policy algorithms (cf. SB3 doc)
  • Updated usage of logger (cf. SB3 doc)
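The timeout handling mentioned above can be illustrated with a small sketch: when an episode ends only because a time limit was hit, the value target should still bootstrap from the next state, since the environment did not truly terminate. The function below is a hypothetical helper written for this note, not code from SB3 or SB3 contrib, and the argument names are assumptions.

```python
def td_target(
    reward: float,
    next_value: float,
    done: bool,
    timed_out: bool,
    gamma: float = 0.99,
) -> float:
    """One-step TD target that treats timeouts as non-terminal.

    Hypothetical helper for illustration only.
    done:      the environment reported the end of the episode
    timed_out: the episode ended because of a time limit, not a real terminal state
    """
    # Only a true termination stops bootstrapping from the next state.
    terminal = done and not timed_out
    bootstrap = 0.0 if terminal else 1.0
    return reward + gamma * bootstrap * next_value


true_end = td_target(1.0, 10.0, done=True, timed_out=False, gamma=0.9)
timeout_end = td_target(1.0, 10.0, done=True, timed_out=True, gamma=0.9)
```

With a real terminal state the target is just the reward; with a timeout the discounted next-state value is kept in the target.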

Bug Fixes:

  • Removed unused code in TQC

Others:

  • SB3 docs and tests dependencies are no longer required for installing SB3 contrib

Documentation:

  • Fixed checkmark typo in the QR-DQN docs (@minhlong94)

v1.0

3 years ago

Blog post: https://araffin.github.io/post/sb3/

Breaking Changes:

  • Upgraded to Stable-Baselines3 v1.0

Bug Fixes:

  • Fixed a bug in the QR-DQN predict method when using deterministic=False with an image observation space

v1.0rc1

3 years ago

v0.11.1

3 years ago

Breaking Changes:

  • Upgraded to Stable-Baselines3 >= 0.11.1

New Features:

  • Added TimeFeatureWrapper to the wrappers
  • Added QR-DQN algorithm (@ku2482)
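As a rough illustration of what a time feature is, the sketch below appends one extra value to an observation that encodes how much of the episode remains. It is not the TimeFeatureWrapper code: the helper name append_time_feature is hypothetical, and the linear decay from 1 to 0 is an assumption made for this example.

```python
from typing import List, Sequence


def append_time_feature(obs: Sequence[float], step: int, max_steps: int) -> List[float]:
    """Return obs with one extra entry encoding the remaining time.

    Hypothetical helper, not the sb3_contrib wrapper: the feature is assumed
    to decay linearly from 1.0 (episode start) to 0.0 (time limit reached).
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    # Clamp so steps past the limit do not produce a negative feature.
    remaining = 1.0 - min(step, max_steps) / max_steps
    return list(obs) + [remaining]


start = append_time_feature([0.1, -0.2], step=0, max_steps=100)
halfway = append_time_feature([0.1, -0.2], step=50, max_steps=100)
```

Exposing the remaining time this way lets a policy distinguish otherwise identical states near the start and near the end of a fixed-length episode.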

Bug Fixes:

  • Fixed a bug in TQC when saving/loading only the policy with a non-default number of quantiles
  • Fixed bug in QR-DQN when calculating the target quantiles (@ku2482, @guyk1971)

Others:

  • Updated TQC to match new SB3 version
  • Moved quantile_huber_loss to common/utils.py (@ku2482)
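For context on the loss shared by QR-DQN and TQC, here is a plain-Python sketch of a quantile Huber loss. It is not the quantile_huber_loss function from the package: the function names, the use of quantile midpoints, and the reduction (sum over quantiles, mean over target samples) are assumptions made for this example.

```python
from typing import Sequence


def huber(u: float, kappa: float = 1.0) -> float:
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    abs_u = abs(u)
    if abs_u <= kappa:
        return 0.5 * u * u
    return kappa * (abs_u - 0.5 * kappa)


def quantile_huber_loss_sketch(
    current_quantiles: Sequence[float],
    target_samples: Sequence[float],
    kappa: float = 1.0,
) -> float:
    """Illustrative quantile Huber loss (hypothetical helper, not the library code).

    Assumed reduction: sum over quantiles, mean over target samples.
    """
    n = len(current_quantiles)
    total = 0.0
    for i, theta in enumerate(current_quantiles):
        tau = (i + 0.5) / n  # midpoint of the i-th quantile bin
        for target in target_samples:
            u = target - theta
            # Asymmetric weight: tau if the estimate is too low, 1 - tau if too high.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber(u, kappa) / kappa
    return total / len(target_samples)


single = quantile_huber_loss_sketch([0.0], [1.0])
perfect = quantile_huber_loss_sketch([2.0], [2.0])
```

The asymmetric weight is what pushes each estimate toward its own quantile of the target distribution rather than toward the mean.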