Yarll

Combining deep learning and reinforcement learning.

Yet Another Reinforcement Learning Library (YARLL)

Update 14/05/2021: Added PyTorch implementation of REINFORCE.
Update 11/05/2021: Added PyTorch implementation of SAC.
Update 13/04/2021: Converted DDPG to TensorFlow 2.

Status

The following algorithms are currently implemented (in no particular order):

Advantage Actor Critic
Asynchronous Advantage Actor Critic (A3C)
Deep Deterministic Policy Gradient (DDPG)
Proximal Policy Optimization (PPO)
Soft Actor-Critic (SAC) (TF2, PyTorch)
Trust Region Policy Optimization (TRPO)
REINFORCE (TF2, PyTorch) (convolutional neural network part has not been tested yet)
Cross-Entropy Method
Q-Learning
Deep Q-Learning
Fitted Q Iteration
Sarsa with function approximation and eligibility traces
(Sequential) knowledge transfer
Asynchronous knowledge transfer

Asynchronous Advantage Actor Critic (A3C)

The code for this algorithm can be found here. Example run after training using 16 threads for a total of 5 million timesteps on the PongDeterministic-v4 environment:

Pong example run

How to run

First, install the library using pip (you can first remove OpenCV from the setup.py file if it is already installed):


pip install yarll

To use the library on a specific branch or to use it while changing the code, you can add the path to the library to your $PYTHONPATH (e.g., in your .bashrc or .zshrc file):


export PYTHONPATH=/path/to/yarll:$PYTHONPATH

Alternatively, you can add a symlink from your site-packages to the yarll directory.
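
Whichever installation route you choose, it can help to confirm that Python actually finds the package before running anything. The following is a minimal sketch using only the standard library; the helper name is_importable is made up for this example and is not part of yarll.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path.

    Hypothetical helper for illustration; it only looks the module up
    and does not import it.
    """
    return importlib.util.find_spec(module_name) is not None


# Check the package name used in the pip command above.
if is_importable("yarll"):
    print("yarll is importable")
else:
    print("yarll was not found; directories searched:")
    for entry in sys.path:
        print("  ", entry)
```

If the package is not found after editing $PYTHONPATH, remember that the export only takes effect in shells started after the change (or after sourcing the edited file).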

Algorithms/experiments

You can run an algorithm by passing the path to an experiment specification (a file in JSON format) to main.py:


python yarll/main.py <path_to_experiment_specification>

You can see all the possible arguments by running python yarll/main.py -h.

Examples of experiment specifications can be found in the experiment_specs folder.
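
To run several specifications one after another, the command above can be wrapped in a small script. This is only a sketch under a few assumptions: it is run from the repository root, the specifications are .json files located directly inside the folder you pass in, and the helper names (build_command, find_specs, run_all) are invented for this example.

```python
import subprocess
import sys
from pathlib import Path


def build_command(spec_path, main_script="yarll/main.py"):
    """Build the command line documented above for one specification."""
    # sys.executable reuses the interpreter that runs this script.
    return [sys.executable, main_script, str(spec_path)]


def find_specs(spec_dir):
    """Return the .json files directly inside spec_dir, sorted by name.

    Assumption: specifications are not nested in subfolders.
    """
    return sorted(Path(spec_dir).glob("*.json"))


def run_all(spec_dir):
    """Run every specification in turn, stopping at the first failure."""
    for spec in find_specs(spec_dir):
        print("Running", spec)
        subprocess.run(build_command(spec), check=True)


# Example usage (uncomment to start the experiments):
# run_all("experiment_specs")
```

Because check=True is passed, a failing experiment raises subprocess.CalledProcessError instead of silently continuing with the next specification.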

Statistics

Statistics can be plotted using:


python -m yarll.misc.plot_statistics <path_to_stats>

<path_to_stats> can be one of 2 things:

A JSON file generated using gym.wrappers.Monitor, in which case it plots the episode lengths and total reward per episode.
A directory containing TensorFlow scalar summaries for different tasks, in which case all of the found scalars are plotted.
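
The two accepted kinds of input can be told apart with a simple check on the path. The sketch below illustrates that distinction only; it is not taken from yarll's plot_statistics module, and the function name and return labels are hypothetical.

```python
from pathlib import Path


def classify_stats_path(path):
    """Decide which of the two documented input kinds a path is.

    Returns "summary_dir" for a directory (assumed to hold scalar
    summaries) and "monitor_json" for a .json file (assumed to come
    from gym.wrappers.Monitor). The contents are not inspected.
    """
    p = Path(path)
    if p.is_dir():
        return "summary_dir"
    if p.is_file() and p.suffix.lower() == ".json":
        return "monitor_json"
    raise ValueError(f"Expected a .json file or a directory, got: {path}")
```

A path that is neither raises ValueError, which surfaces typos in the path early rather than producing an empty plot.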

Help on the other arguments (e.g., for smoothing) is available by running python -m yarll.misc.plot_statistics -h.
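
For readers unfamiliar with smoothing, the idea is to replace each point by an average over its recent neighbours so that noisy per-episode rewards become easier to read. The sketch below shows a trailing moving average as one common way to do this; which smoothing method plot_statistics actually uses is not stated here, so treat it purely as an illustration.

```python
def moving_average(values, window):
    """Smooth a sequence with a trailing moving average.

    Each output is the mean of the current value and up to
    (window - 1) preceding values, so early points average over
    fewer samples instead of being dropped.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Made-up episode rewards, only to show the effect.
rewards = [1, 2, 3, 4]
print(moving_average(rewards, window=2))  # [1.0, 1.5, 2.5, 3.5]
```

A larger window gives a smoother curve but makes it react more slowly to real changes in performance.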

Alternatively, you can use TensorBoard to show the statistics in the browser by passing the directory with the scalar summaries as the --logdir argument.

Open Source Agenda is not affiliated with "Yarll" Project. README Source: arnomoonens/yarll

Last Commit

2 years ago

Repository

arnomoonens/yarll

License

MIT
