An easy-to-use reinforcement learning library for research and education.
Relax dependencies
Release of version 0.7.0 of rlberry.
This is the first rlberry release since we restructured rlberry into three repositories (PR #379):
rlberry (this repo): everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks and tutorials for learning RL...
rlberry-research: repository of agents and environments used inside the Inria Scool team.
Changes since the last version:
PR #397
PR #396
PR #385 to #390
PR #382
PR #376
Release of version 0.6.0 of rlberry.
This is the last rlberry release before the major restructuring of rlberry into three repositories (see the notes for version 0.7.0 above).
Changes since the last version:
PR #276
PR #365
PR #350
PR #326
PR #335
Release of version 0.5.0 of rlberry.
With this release, rlberry switches to gymnasium!
New in version 0.5.0:
PR #281, #323
Remark: for now, Stable Baselines3 has no stable release compatible with gymnasium. To use Stable Baselines3 with gymnasium, install the main branch from GitHub:
pip install git+https://github.com/DLR-RM/stable-baselines3
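For reference, here is a minimal sketch of the post-switch interaction loop, assuming rlberry's gym_make wraps gymnasium.make and exposes the gymnasium API (reset() returning (obs, info), step() returning five values); the environment id is only an example.

```python
# Minimal sketch, assuming rlberry.envs.gym_make mirrors the gymnasium API.
# "CartPole-v1" is only an example id.
from rlberry.envs import gym_make

env = gym_make("CartPole-v1")
obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()
    # gymnasium's step returns 5 values instead of gym's 4
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```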
Release of version 0.4.1 of rlberry.
:warning: WARNING :warning:
Before installing rlberry, please install the fork of gym 0.21: "gym[accept-rom-license] @ git+https://github.com/rlberry-py/gym_fix_021"
New in 0.4.1
PR #307
PR #306
Q-Learning agent in rlberry.agents.QLAgent and SARSA agent in rlberry.agents.SARSAAgent (see the sketch after this list). PR #298
PR #277
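A minimal training sketch for these tabular agents, assuming the 0.4.x-era AgentManager API and the GridWorld environment that shipped with rlberry at the time; the constructor arguments and budgets shown are illustrative.

```python
# Minimal sketch, assuming QLAgent, GridWorld and AgentManager live at
# these paths in the 0.4.x era; fit_budget/n_fit values are illustrative.
from rlberry.agents import QLAgent
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager

manager = AgentManager(
    QLAgent,
    (GridWorld, dict(nrows=5, ncols=5)),  # (env constructor, kwargs)
    fit_budget=10_000,  # training budget passed to fit()
    n_fit=4,            # train 4 independent instances with different seeds
)
manager.fit()
```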
Release of version 0.4.0 of rlberry.
New in 0.4.0
PR #273
PR #269
PR #262
PR #261, #264
Implementation of Munchausen DQN in rlberry.agents.torch.MDQNAgent (see the first sketch after this list).
Comparison of MDQN with the DQN agent in the long tests.
PR #244, #250, #253
PR #235
PR #226, #227
Improved logging: the logging level can now be changed with rlberry.utils.logging.set_level().
Introduced smoothing of the curves drawn by plot_writer_data when only one seed is used (see the second sketch after this list).
PR #223
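A minimal sketch of training the Munchausen DQN agent, assuming MDQNAgent follows the same AgentManager workflow as the other torch agents; the environment id and budget are illustrative.

```python
# Minimal sketch, assuming rlberry.agents.torch.MDQNAgent trains like any
# other rlberry agent; environment id and budget are illustrative.
from rlberry.agents.torch import MDQNAgent
from rlberry.envs import gym_make
from rlberry.manager import AgentManager

manager = AgentManager(
    MDQNAgent,
    (gym_make, dict(id="CartPole-v1")),
    fit_budget=50_000,  # number of environment steps
    n_fit=2,
)
manager.fit()
```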
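And a sketch combining the two quality-of-life changes above, assuming set_level accepts standard logging level names and that plot_writer_data takes a fitted AgentManager plus the tag to plot; the tag name is illustrative.

```python
# Minimal sketch; "episode_rewards" is an illustrative tag name and
# `manager` is a fitted AgentManager, e.g. from the sketch above.
from rlberry.utils.logging import set_level
from rlberry.manager import plot_writer_data

set_level("INFO")  # change rlberry's logging level globally

# With a single seed (n_fit=1), the plotted curve is now smoothed.
plot_writer_data(manager, tag="episode_rewards", show=True)
```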
Release of version 0.3.0 of rlberry.
New in 0.3.0
PR #206
PR #132
New rlberry.agents.bandit.tools.BanditTracker to track statistics to be used in bandit algorithms. PR #191
Improvements to rlberry.agents.manager.AgentManager. PR #148, #161, #180
New rlberry.agents.stable_baselines.StableBaselinesAgent to import StableBaselines3 agents (see the first sketch below). PR #119
New rlberry.agents.utils.replay.ReplayBuffer, aiming to replace the code in utils/memories.py (see the second sketch below).
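First, a minimal sketch of the StableBaselinesAgent wrapper, assuming it receives the SB3 algorithm class through init_kwargs; the algo_cls and policy keyword names are assumptions, not confirmed by this changelog.

```python
# Minimal sketch; algo_cls/policy keyword names are assumptions, and the
# environment id and budget are illustrative.
from stable_baselines3 import PPO
from rlberry.agents.stable_baselines import StableBaselinesAgent
from rlberry.envs import gym_make
from rlberry.manager import AgentManager

manager = AgentManager(
    StableBaselinesAgent,
    (gym_make, dict(id="CartPole-v1")),
    init_kwargs=dict(algo_cls=PPO, policy="MlpPolicy"),
    fit_budget=10_000,  # assumed to be forwarded to SB3's learn()
    n_fit=1,
)
manager.fit()
```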
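Second, a sketch of the new replay buffer, assuming a setup_entry / append / end_episode / sample interface; all names, dtypes and sizes are illustrative.

```python
# Minimal sketch; the setup_entry/append/end_episode/sample interface is
# an assumption about this class, and the dummy transitions are illustrative.
import numpy as np
from rlberry.agents.utils.replay import ReplayBuffer
from rlberry.seeding import Seeder

buffer = ReplayBuffer(max_replay_size=100_000, rng=Seeder(123).rng)
buffer.setup_entry("observations", np.float32)
buffer.setup_entry("actions", np.int32)
buffer.setup_entry("rewards", np.float32)

# Inside the interaction loop, append one transition at a time.
for step in range(100):
    buffer.append(
        dict(observations=np.zeros(4, dtype=np.float32), actions=0, rewards=1.0)
    )
buffer.end_episode()  # mark the episode boundary

# Sample fixed-length sub-trajectories once enough data is stored.
batch = buffer.sample(batch_size=32, chunk_size=1)
```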
Feb 22, 2022 (PR #126)
New rlberry.__version__ attribute (currently 0.3.0dev0).
Fix of the __eq__ method of the AgentManager class.
Feb 14-15, 2022 (PR #97, #118)
New bandit agents rlberry.agents.bandits.IndexAgent and bandit environments rlberry.envs.bandits.Bandit (see also rlberry.agents.bandits.BanditWithSimplePolicy).
Feb 11, 2022 (#83, #95)
Fixed a bug in FiniteMDP.sample(): the terminal state was being checked with self.state instead of the given state.
Changes to rlberry.manager.AgentManager.
New max_workers argument for rlberry.manager.AgentManager to control the maximum number of processes/threads created by the fit method (see the sketch further below).
Feb 04, 2022
New rlberry.manager.read_writer_data to load an agent's writer data from pickle files and to make it simpler to customize rlberry.manager.plot_writer_data.
Improvements to the quick_start documentation.
New agents rlberry.agents.RLSVIAgent and rlberry.agents.PSRLAgent.
Updates to the contributing guide.
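As referenced above, a sketch of the max_workers argument, assuming the AgentManager signature of this period; the agent, environment and all numbers are illustrative.

```python
# Minimal sketch; agent/env choices and all numbers are illustrative.
from rlberry.agents.dynprog import ValueIterationAgent
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager

manager = AgentManager(
    ValueIterationAgent,
    (GridWorld, {}),
    fit_budget=1_000,
    n_fit=8,        # train 8 independent instances...
    max_workers=2,  # ...but spawn at most 2 worker processes at a time
)
manager.fit()
```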
Improved interface and tools for parallel execution (#50):
AgentStats renamed to AgentManager.
AgentManager can handle agents that cannot be pickled.
The Agent interface requires an eval() method instead of policy(), to handle more general agents (e.g. reward-free, POMDPs, etc.).
Parallel execution now uses ProcessPoolExecutor and ThreadPoolExecutor (allowing nested processes, for example). Processes are created with spawn (jax does not work with fork, see #51).
New experimental features (see #51, #62):
rlberry.network: server and client interfaces to exchange messages via sockets.
RemoteAgentManager to train agents on a remote server and gather the results locally (using rlberry.network).
Logging and rendering:
New DefaultWriter, and improved evaluation and plot methods in rlberry.manager.evaluation.
Bug fixes.
Features:
Agent and AgentManager both have a unique_id attribute (useful for creating unique output files/directories).
DefaultWriter is now initialized in the base class Agent and (optionally) wraps a tensorboard SummaryWriter.
AgentManager has an option enable_tensorboard that activates tensorboard logging in each of its Agents (through their writer attribute). The tensorboard log_dirs are automatically assigned by AgentManager (see the first sketch below).
RemoteAgentManager receives the tensorboard data created on the server when the method get_writer_data() is called. This is done by a zip file transfer with rlberry.network.
BaseWrapper and gym_make now have an option wrap_spaces. If set to True, this option converts gym.spaces to rlberry.spaces, which provides classes with better seeding (using numpy's default_rng instead of RandomState); see the second sketch below.
AgentManager: new method get_agent_instances() that returns trained instances.
plot_writer_data: possibility to set xtag (the tag used for the x-axis).
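A sketch of the tensorboard integration and get_agent_instances(), assuming the keyword and method names listed above; the agent, environment and budgets are illustrative, and tensorboard is assumed to be installed.

```python
# Minimal sketch; agent/env choices and budgets are illustrative, and
# enable_tensorboard assumes tensorboard is installed.
from rlberry.agents.dynprog import ValueIterationAgent
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager

manager = AgentManager(
    ValueIterationAgent,
    (GridWorld, {}),
    fit_budget=1_000,
    n_fit=2,
    enable_tensorboard=True,  # each instance's writer also logs to tensorboard
)
manager.fit()

agents = manager.get_agent_instances()  # the trained Agent instances
print(agents[0].unique_id)  # unique identifier, handy for output paths
```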
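And a sketch of the wrap_spaces option described above; the environment id is illustrative.

```python
# Minimal sketch; "CartPole-v0" is an illustrative id for the gym era.
from rlberry.envs import gym_make

env = gym_make("CartPole-v0", wrap_spaces=True)
# The spaces are now rlberry.spaces classes, seeded with numpy's
# default_rng rather than gym's RandomState.
print(type(env.observation_space), type(env.action_space))
```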
Bug fixes:
Fix in AgentHandler (eval_env missing in kwargs for agent_class).