An easy-to-use reinforcement learning library for research and education.
Relax dependencies
Release of version 0.7.0 of rlberry.
This is the first rlberry release since we restructured rlberry into three repositories (PR #379):
rlberry (this repo): everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks and tutorials for learning RL...
rlberry-research: repository of agents and environments used inside the Inria Scool team.
Changes since the last version:
PR #397
PR #396
PR #385 to #390
PR #382
PR #376
Release of version 0.6.0 of rlberry.
This is the last rlberry release before the major restructuring of rlberry into three repositories (see the notes for version 0.7.0 above).
Changes since the last version:
PR #276
PR #365
PR #350
PR #326
PR #335
Release of version 0.5.0 of rlberry.
With this release, rlberry switches to gymnasium!
New in version 0.5.0:
PR #281, #323
Remark: for now, Stable Baselines3 has no stable release compatible with gymnasium. To use Stable Baselines3 with gymnasium, install the main branch from GitHub:
pip install git+https://github.com/DLR-RM/stable-baselines3
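For reference, here is a minimal sketch of the post-switch interaction loop, assuming rlberry's gym_make wraps gymnasium.make and exposes the gymnasium API (reset() returning (obs, info), step() returning five values); the environment id is only an example.

```python
# Minimal sketch, assuming rlberry.envs.gym_make mirrors the gymnasium API.
# "CartPole-v1" is only an example id.
from rlberry.envs import gym_make

env = gym_make("CartPole-v1")
obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()
    # gymnasium's step returns 5 values instead of gym's 4
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```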
Release of version 0.4.1 of rlberry.
:warning: WARNING :warning:
Before installing rlberry, please install the fork of gym 0.21: "gym[accept-rom-license] @ git+https://github.com/rlberry-py/gym_fix_021"
New in 0.4.1
PR #307
PR #306
Q-Learning agent in rlberry.agents.QLAgent and SARSA agent in rlberry.agents.SARSAAgent (see the sketch after this list). PR #298
PR #277
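A minimal training sketch for these tabular agents, assuming the 0.4.x-era AgentManager API and the GridWorld environment that shipped with rlberry at the time; the constructor arguments and budgets shown are illustrative.

```python
# Minimal sketch, assuming QLAgent, GridWorld and AgentManager live at
# these paths in the 0.4.x era; fit_budget/n_fit values are illustrative.
from rlberry.agents import QLAgent
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager

manager = AgentManager(
    QLAgent,
    (GridWorld, dict(nrows=5, ncols=5)),  # (env constructor, kwargs)
    fit_budget=10_000,  # training budget passed to fit()
    n_fit=4,            # train 4 independent instances with different seeds
)
manager.fit()
```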
Release of version 0.4.0 of rlberry.
New in 0.4.0
PR #273
PR #269
PR #262
PR #261, #264
Implementation of Munchausen DQN in rlberry.agents.torch.MDQNAgent (see the first sketch after this list).
Comparison of MDQN with the DQN agent in the long tests.
PR #244, #250, #253
PR #235
PR #226, #227
Improved logging: the logging level can now be changed with rlberry.utils.logging.set_level().
Introduced smoothing of the curves drawn by plot_writer_data when only one seed is used (see the second sketch after this list).
PR #223
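A minimal sketch of training the Munchausen DQN agent, assuming MDQNAgent follows the same AgentManager workflow as the other torch agents; the environment id and budget are illustrative.

```python
# Minimal sketch, assuming rlberry.agents.torch.MDQNAgent trains like any
# other rlberry agent; environment id and budget are illustrative.
from rlberry.agents.torch import MDQNAgent
from rlberry.envs import gym_make
from rlberry.manager import AgentManager

manager = AgentManager(
    MDQNAgent,
    (gym_make, dict(id="CartPole-v1")),
    fit_budget=50_000,  # number of environment steps
    n_fit=2,
)
manager.fit()
```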
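And a sketch combining the two quality-of-life changes above, assuming set_level accepts standard logging level names and that plot_writer_data takes a fitted AgentManager plus the tag to plot; the tag name is illustrative.

```python
# Minimal sketch; "episode_rewards" is an illustrative tag name and
# `manager` is a fitted AgentManager, e.g. from the sketch above.
from rlberry.utils.logging import set_level
from rlberry.manager import plot_writer_data

set_level("INFO")  # change rlberry's logging level globally

# With a single seed (n_fit=1), the plotted curve is now smoothed.
plot_writer_data(manager, tag="episode_rewards", show=True)
```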
Release of version 0.3.0 of rlberry.
New in 0.3.0
PR #206
PR #132
New rlberry.agents.bandit.tools.BanditTracker to track statistics to be used in bandit algorithms. PR #191
Improvements to rlberry.agents.manager.AgentManager. PR #148, #161, #180
New rlberry.agents.stable_baselines.StableBaselinesAgent to import StableBaselines3 agents (see the first sketch below). PR #119
New rlberry.agents.utils.replay.ReplayBuffer, aiming to replace the code in utils/memories.py (see the second sketch below).
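First, a minimal sketch of the StableBaselinesAgent wrapper, assuming it receives the SB3 algorithm class through init_kwargs; the algo_cls and policy keyword names are assumptions, not confirmed by this changelog.

```python
# Minimal sketch; algo_cls/policy keyword names are assumptions, and the
# environment id and budget are illustrative.
from stable_baselines3 import PPO
from rlberry.agents.stable_baselines import StableBaselinesAgent
from rlberry.envs import gym_make
from rlberry.manager import AgentManager

manager = AgentManager(
    StableBaselinesAgent,
    (gym_make, dict(id="CartPole-v1")),
    init_kwargs=dict(algo_cls=PPO, policy="MlpPolicy"),
    fit_budget=10_000,  # assumed to be forwarded to SB3's learn()
    n_fit=1,
)
manager.fit()
```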
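Second, a sketch of the new replay buffer, assuming a setup_entry / append / end_episode / sample interface; all names, dtypes and sizes are illustrative.

```python
# Minimal sketch; the setup_entry/append/end_episode/sample interface is
# an assumption about this class, and the dummy transitions are illustrative.
import numpy as np
from rlberry.agents.utils.replay import ReplayBuffer
from rlberry.seeding import Seeder

buffer = ReplayBuffer(max_replay_size=100_000, rng=Seeder(123).rng)
buffer.setup_entry("observations", np.float32)
buffer.setup_entry("actions", np.int32)
buffer.setup_entry("rewards", np.float32)

# Inside the interaction loop, append one transition at a time.
for step in range(100):
    buffer.append(
        dict(observations=np.zeros(4, dtype=np.float32), actions=0, rewards=1.0)
    )
buffer.end_episode()  # mark the episode boundary

# Sample fixed-length sub-trajectories once enough data is stored.
batch = buffer.sample(batch_size=32, chunk_size=1)
```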
Feb 22, 2022 (PR #126)
New rlberry.__version__ attribute (currently 0.3.0dev0).
Fix of the __eq__ method of the AgentManager class.
Feb 14-15, 2022 (PR #97, #118)
New bandit agents rlberry.agents.bandits.IndexAgent and bandit environments rlberry.envs.bandits.Bandit (see also rlberry.agents.bandits.BanditWithSimplePolicy).
Feb 11, 2022 (#83, #95)
Fixed a bug in FiniteMDP.sample(): the terminal state was being checked with self.state instead of the given state.
Changes to rlberry.manager.AgentManager.
New max_workers argument for rlberry.manager.AgentManager to control the maximum number of processes/threads created by the fit method (see the sketch further below).
Feb 04, 2022
New rlberry.manager.read_writer_data to load an agent's writer data from pickle files and to make it simpler to customize rlberry.manager.plot_writer_data.
Improvements to the quick_start documentation.
New agents rlberry.agents.RLSVIAgent and rlberry.agents.PSRLAgent.
Updates to the contributing guide.
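As referenced above, a sketch of the max_workers argument, assuming the AgentManager signature of this period; the agent, environment and all numbers are illustrative.

```python
# Minimal sketch; agent/env choices and all numbers are illustrative.
from rlberry.agents.dynprog import ValueIterationAgent
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager

manager = AgentManager(
    ValueIterationAgent,
    (GridWorld, {}),
    fit_budget=1_000,
    n_fit=8,        # train 8 independent instances...
    max_workers=2,  # ...but spawn at most 2 worker processes at a time
)
manager.fit()
```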
Improved interface and tools for parallel execution (#50):
AgentStats renamed to AgentManager.
AgentManager can handle agents that cannot be pickled.
The Agent interface requires an eval() method instead of policy(), to handle more general agents (e.g. reward-free, POMDPs, etc.).
Parallel execution now uses ProcessPoolExecutor and ThreadPoolExecutor (allowing nested processes, for example). Processes are created with spawn (jax does not work with fork, see #51).
New experimental features (see #51, #62):
rlberry.network: server and client interfaces to exchange messages via sockets.
RemoteAgentManager to train agents on a remote server and gather the results locally (using rlberry.network).
Logging and rendering:
New DefaultWriter, and improved evaluation and plot methods in rlberry.manager.evaluation.
Bug fixes.
Features:
Agent and AgentManager both have a unique_id attribute (useful for creating unique output files/directories).
DefaultWriter is now initialized in the base class Agent and (optionally) wraps a tensorboard SummaryWriter.
AgentManager has an option enable_tensorboard that activates tensorboard logging in each of its Agents (through their writer attribute). The tensorboard log_dirs are automatically assigned by AgentManager (see the first sketch below).
RemoteAgentManager receives the tensorboard data created on the server when the method get_writer_data() is called. This is done by a zip file transfer with rlberry.network.
BaseWrapper and gym_make now have an option wrap_spaces. If set to True, this option converts gym.spaces to rlberry.spaces, which provides classes with better seeding (using numpy's default_rng instead of RandomState); see the second sketch below.
AgentManager: new method get_agent_instances() that returns trained instances.
plot_writer_data: possibility to set xtag (the tag used for the x-axis).
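A sketch of the tensorboard integration and get_agent_instances(), assuming the keyword and method names listed above; the agent, environment and budgets are illustrative, and tensorboard is assumed to be installed.

```python
# Minimal sketch; agent/env choices and budgets are illustrative, and
# enable_tensorboard assumes tensorboard is installed.
from rlberry.agents.dynprog import ValueIterationAgent
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager

manager = AgentManager(
    ValueIterationAgent,
    (GridWorld, {}),
    fit_budget=1_000,
    n_fit=2,
    enable_tensorboard=True,  # each instance's writer also logs to tensorboard
)
manager.fit()

agents = manager.get_agent_instances()  # the trained Agent instances
print(agents[0].unique_id)  # unique identifier, handy for output paths
```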
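And a sketch of the wrap_spaces option described above; the environment id is illustrative.

```python
# Minimal sketch; "CartPole-v0" is an illustrative id for the gym era.
from rlberry.envs import gym_make

env = gym_make("CartPole-v0", wrap_spaces=True)
# The spaces are now rlberry.spaces classes, seeded with numpy's
# default_rng rather than gym's RandomState.
print(type(env.observation_space), type(env.action_space))
```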
Bug fixes:
Fix in AgentHandler (eval_env missing in kwargs for agent_class).