rlberry Versions

An easy-to-use reinforcement learning library for research and education.

v0.7.3

3 weeks ago

Version 0.7.3

PR #454

  • Remove unused libraries

PR #451

  • Moving UCBVI to rlberry_scool

PR #438

  • Move long tests to rlberry-research

PR #436 #444 #445 #447 #448 #455 #456

  • Update the user guide
  • Add tests on the user guide examples
  • Remove rlberry_research references as much as possible (doc and code)

v0.7.2

2 months ago

Relax dependencies

v0.7.1

2 months ago

Version 0.7.1

PR #411

  • Moving "rendering" to rlberry

PR #405 #406 #408

  • Fix plots

PR #404

  • Add AdaStop

v0.7.0

4 months ago

Release of version 0.7.0 of rlberry.

This is the first rlberry release since the major restructuring of rlberry into three repositories (PR #379):

  • rlberry (this repo): everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
  • rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks for tutorials for learning RL...
  • rlberry-research: repository of agents and environments used inside the Inria Scool team

Changes since last version.

PR #397

  • Automatic save after fit() in ExperimentManager

PR #396

  • Improve coverage and fix version workflow

PR #385 to #390

  • Switch from RTD to github page

PR #382

  • switch to poetry

PR #376

  • New plot_writer_data function that does not depend on seaborn and that can plot smoothed function and confidence band if scikit-fda is installed.
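
A hedged usage sketch of the reworked plot_writer_data: the toy RandomAgent, the "reward" tag, and the keyword arguments below are illustrative assumptions based on this entry, not a verbatim API reference.

    # Hedged sketch: train a tiny custom agent that logs a scalar to its
    # writer, then plot that tag. Agent, tag name and budget are illustrative.
    import numpy as np
    from rlberry.agents import Agent
    from rlberry.envs import gym_make
    from rlberry.manager import ExperimentManager, plot_writer_data

    class RandomAgent(Agent):
        """Toy agent that logs a random 'reward' scalar at every step."""
        name = "RandomAgent"

        def fit(self, budget, **kwargs):
            for step in range(budget):
                self.writer.add_scalar("reward", np.random.rand(), step)

        def eval(self, **kwargs):
            return 0.0

    manager = ExperimentManager(
        RandomAgent,
        (gym_make, dict(id="CartPole-v1")),  # environment constructor and kwargs
        fit_budget=200,
        n_fit=2,
    )
    manager.fit()

    # Plot the logged tag; a smoothed curve and a confidence band are drawn
    # when scikit-fda is installed, as described above.
    plot_writer_data(manager, tag="reward", show=True)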

v0.6.0

6 months ago

Release of version 0.6.0 of rlberry.

This is the last rlberry release before a major restructuring of rlberry into three repositories:

  • rlberry: everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
  • rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks for tutorials for learning RL...
  • rlberry-research: repository of agents and environments used inside the Inria Scool team

Changes since last version.

PR #276

  • Non-adaptive multiple tests for agent comparison.

PR #365

  • Fix Sphinx version to <7.

PR #350

  • Rename AgentManager to ExperimentManager.

PR #326

  • Moved SAC from experimental to torch agents. Tested and benchmarked.

PR #335

  • Upgrade from Python 3.9 to Python 3.10.

v0.5.0

10 months ago

Release of version 0.5.0 of rlberry.

With this release, rlberry switches to gymnasium!

New in version 0.5.0:

PR #281, #323

  • Merge gymnasium branch into main, make gymnasium the default library for environments in rlberry.

Remark: for now, Stable-Baselines3 has no stable release with gymnasium support. To use Stable-Baselines3 with gymnasium, install the main branch from GitHub:

pip install git+https://github.com/DLR-RM/stable-baselines3
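
For illustration, a gymnasium environment is used through gym_make as before; a hedged sketch, where the agent, environment id, and budget are arbitrary choices.

    # Hedged sketch: with gymnasium as the default backend, gym_make builds
    # gymnasium environments. Agent, environment id and budget are arbitrary.
    from rlberry.agents.torch import DQNAgent
    from rlberry.envs import gym_make
    from rlberry.manager import AgentManager  # renamed ExperimentManager in v0.6.0

    manager = AgentManager(
        DQNAgent,
        (gym_make, dict(id="CartPole-v1")),  # environment constructor and kwargs
        fit_budget=10_000,                   # training timesteps
        n_fit=2,                             # train two independent agents
    )
    manager.fit()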

v0.4.1

10 months ago

Release of version 0.4.1 of rlberry.

WARNING:
Before installing rlberry, please install the fork of gym 0.21: "gym[accept-rom-license] @ git+https://github.com/rlberry-py/gym_fix_021"

New in 0.4.1

PR #307

  • Create a fork of gym 0.21 to handle non-backward-compatible setuptools changes.

PR #306

  • Add a Q-learning agent in rlberry.agents.QLAgent and a SARSA agent in rlberry.agents.SARSAAgent.
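
As an illustration, the tabular Q-learning agent can be trained directly on a small finite MDP; a hedged sketch, where the GridWorld arguments and the budget are illustrative.

    # Hedged sketch: train the new tabular Q-learning agent on a small
    # GridWorld. Constructor arguments and budget are illustrative.
    from rlberry.agents import QLAgent
    from rlberry.envs import GridWorld

    env = GridWorld(nrows=3, ncols=3)   # small finite MDP
    agent = QLAgent(env)                # default hyperparameters
    agent.fit(budget=10_000)            # number of interaction steps
    print(agent.eval())                 # estimate of the learned policy's value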

PR #298

  • Move old scripts (jax agents, attention networks, old examples...) that we won't maintain from the main branch to an archive branch.

PR #277

  • Add and update code to support Atari game environments.

v0.4.0

1 year ago

Release of version 0.4.0 of rlberry.

New in 0.4.0

PR #273

  • Change the default behavior of plot_writer_data: if seaborn version >= 0.12.0 is installed, a 90% percentile interval is used instead of the standard deviation.

PR #269

PR #262

  • PPO can now handle continuous actions.

PR #261, #264

  • Implementation of Munchausen DQN in rlberry.agents.torch.MDQNAgent.

  • Comparison of MDQN with DQN agent in the long tests.

PR #244, #250, #253

  • Compress the pickles used to save the trained agents.

PR #235

PR #226, #227

  • Improve logging; the logging level can now be changed with rlberry.utils.logging.set_level() (see the sketch after this list).

  • Introduce smoothing in curves produced by plot_writer_data when only one seed is used.
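
A one-line sketch of the logging change; the level string follows standard Python logging level names.

    # Sketch: change rlberry's global logging verbosity. The function comes
    # from the entry above; the level string follows standard logging names.
    from rlberry.utils.logging import set_level

    set_level("WARNING")  # only warnings and errors from rlberry are shown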

PR #223

  • Moved PPO from experimental to torch agents. Tested and benchmarked.

v0.3.0

1 year ago

Release of version 0.3.0 of rlberry.

New in 0.3.0

PR #206

  • Creation of a Deep RL tutorial in the user guide.

PR #132

  • New tracker class rlberry.agents.bandit.tools.BanditTracker to track statistics to be used in Bandit algorithms.

PR #191

  • Possibility to generate a profile with rlberry.agents.manager.AgentManager.

PR #148, #161, #180

  • Misc improvements on A2C.
  • New StableBaselines3 wrapper rlberry.agents.stable_baselines.StableBaselinesAgent to import StableBaselines3 Agents.
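
A hedged sketch of the StableBaselines3 wrapper: the algo_cls keyword name is an assumption, and the algorithm, environment, and budget choices are arbitrary.

    # Hedged sketch: train a StableBaselines3 algorithm through the rlberry
    # wrapper. The `algo_cls` keyword name is an assumption; algorithm,
    # environment and budget are arbitrary.
    from stable_baselines3 import A2C

    from rlberry.agents.stable_baselines import StableBaselinesAgent
    from rlberry.envs import gym_make
    from rlberry.manager import AgentManager

    manager = AgentManager(
        StableBaselinesAgent,
        (gym_make, dict(id="CartPole-v1")),
        init_kwargs=dict(algo_cls=A2C),  # which SB3 algorithm to wrap
        fit_budget=10_000,
    )
    manager.fit()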

PR #119

  • Improving documentation for agents.torch.utils
  • New replay buffer rlberry.agents.utils.replay.ReplayBuffer, aiming to replace code in utils/memories.py
  • New DQN implementation, aiming to fix reproducibility and compatibility issues.
  • Implements Q(lambda) in DQN Agent.

Feb 22, 2022 (PR #126)

  • Setup rlberry.__version__ (currently 0.3.0dev0)
  • Record the rlberry version in an AgentManager attribute (used to check equality of AgentManagers).
  • Override the __eq__ method of the AgentManager class.

Feb 14-15, 2022 (PR #97, #118)

  • (feat) Add basic bandit environments and agents. See rlberry.agents.bandits.IndexAgent and rlberry.envs.bandits.Bandit.
  • Thompson Sampling bandit algorithm with Gaussian or Beta prior.
  • Base class for bandit algorithms with custom save & load functions (called rlberry.agents.bandits.BanditWithSimplePolicy).

Feb 11, 2022 (#83, #95)

  • (fix) Fixed bug in FiniteMDP.sample(): the terminal state was being checked with self.state instead of the given state.
  • (feat) Option to use 'fork' or 'spawn' in rlberry.manager.AgentManager.
  • (feat) AgentManager output_dir now has a timestamp and a short ID by default.
  • (feat) Gridworld can be constructed from a string layout.
  • (feat) max_workers argument for rlberry.manager.AgentManager to control the maximum number of processes/threads created by the fit method.

Feb 04, 2022

  • Add rlberry.manager.read_writer_data to load an agent's writer data from pickle files and make it simpler to customize in rlberry.manager.plot_writer_data (see the sketch after this list).
  • Fix bug: DQN should take a tuple as environment.
  • Add a quickstart tutorial in the docs (quick_start).
  • Add the tabular RLSVI algorithm rlberry.agents.RLSVIAgent.
  • Add the Posterior Sampling for Reinforcement Learning (PSRL) agent for tabular MDPs, rlberry.agents.PSRLAgent.
  • Add a page in the docs to help contributors (contributing).
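
A short sketch of read_writer_data; the path and the DataFrame columns mentioned in the comments are assumptions, not an exact schema reference.

    # Hedged sketch: load writer data previously saved by AgentManager into a
    # pandas DataFrame. The path and the column names below are assumptions.
    from rlberry.manager import read_writer_data

    df = read_writer_data("path/to/experiment_output")  # pickled writer data
    print(df.head())           # typically one row per logged scalar
    print(df["tag"].unique())  # tags available for plot_writer_data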

v0.2.1

2 years ago

New in v0.2

Improving interface and tools for parallel execution (#50)

  • AgentStats renamed to AgentManager.
  • AgentManager can handle agents that cannot be pickled.
  • Agent interface requires an eval() method instead of policy() to handle more general agents (e.g. reward-free, POMDPs, etc.); see the sketch after this list.
  • Multi-processing and multi-threading are now done with ProcessPoolExecutor and ThreadPoolExecutor (allowing nested processes for example). Processes are created with spawn (jax does not work with fork, see #51).
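
With this change, a custom agent roughly looks as follows; a minimal sketch of the fit/eval contract described above, with a hypothetical MyAgent.

    # Minimal sketch of the updated Agent interface: fit() trains for a given
    # budget and eval() returns a scalar evaluation instead of exposing policy().
    from rlberry.agents import Agent

    class MyAgent(Agent):
        name = "MyAgent"

        def fit(self, budget, **kwargs):
            # interact with self.env for `budget` steps and update the agent
            pass

        def eval(self, **kwargs):
            # return a scalar estimate of the trained agent's performance
            return 0.0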

New experimental features (see #51, #62)

  • JAX implementation of DQN and replay buffer using reverb.
  • rlberry.network: server and client interfaces to exchange messages via sockets.
  • RemoteAgentManager to train agents on a remote server and gather the results locally (using rlberry.network).

Logging and rendering:

  • Data logging with a new DefaultWriter and improved evaluation and plot methods in rlberry.manager.evaluation.
  • Fix rendering bug with OpenGL (bf606b44aaba1b918daf3dcc02be96a8ef5436b4).

Bug fixes.

New in v0.2.1 (#65)

Features:

  • Agent and AgentManager both have a unique_id attribute (useful for creating unique output files/directories).
  • DefaultWriter is now initialized in base class Agent and (optionally) wraps a tensorboard SummaryWriter.
  • AgentManager has an option enable_tensorboard that activates tensorboard logging in each of its Agents (with their writer attribute). The log_dirs of tensorboard are automatically assigned by AgentManager.
  • RemoteAgentManager receives tensorboard data created in the server, when the method get_writer_data() is called. This is done by a zip file transfer with rlberry.network.
  • BaseWrapper and gym_make now have an option wrap_spaces. If set to True, this option converts gym.spaces to rlberry.spaces, which provides classes with better seeding (using numpy's default_rng instead of RandomState).
  • AgentManager: new method get_agent_instances() that returns trained instances.
  • plot_writer_data: possibility to set xtag (tag used for the x-axis).
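
A hedged sketch combining two of the features above, enabling tensorboard logging and choosing the x-axis tag for plotting; the tag names, agent, and environment are illustrative.

    # Hedged sketch: enable tensorboard logging in each trained agent and plot
    # a writer tag against a custom x-axis. Tag names, agent and environment
    # are illustrative.
    from rlberry.agents.torch import DQNAgent
    from rlberry.envs import gym_make
    from rlberry.manager import AgentManager, plot_writer_data

    manager = AgentManager(
        DQNAgent,
        (gym_make, dict(id="CartPole-v1")),
        fit_budget=5_000,
        enable_tensorboard=True,  # each agent's writer also logs to tensorboard
    )
    manager.fit()

    # Plot episode rewards with the writer's global step on the x-axis.
    plot_writer_data(manager, tag="episode_rewards", xtag="global_step")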

Bug fixes:

  • Fixed agent initialization bug in AgentHandler (eval_env missing in kwargs for agent_class).