Combining deep learning and reinforcement learning.
Update 14/05/2021: Added PyTorch implementation of REINFORCE.
Update 11/05/2021: Added PyTorch implementation of SAC.
Update 13/04/2021: Converted DDPG to TensorFlow 2.
Different algorithms have currently been implemented (in no particular order):
The code for this algorithm can be found here. Example run after training using 16 threads for a total of 5 million timesteps on the PongDeterministic-v4 environment:
First, install the library using pip (you can first remove OpenCV from the setup.py file if it is already installed):
pip install yarll
To use the library on a specific branch or to use it while changing the code, you can add the path to the library to your $PYTHONPATH (e.g., in your .bashrc or .zshrc file):
export PYTHONPATH=/path/to/yarll:$PYTHONPATH
Alternatively, you can add a symlink from your site-packages to the yarll directory.
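Whichever of these routes you take, it can help to confirm which copy of the package the interpreter actually resolves. The following is a minimal sketch using only the standard library; the helper name locate_package is hypothetical and not part of yarll.

```python
import importlib.util


def locate_package(name):
    """Return the path a top-level package would be imported from, or None.

    Hypothetical helper for illustration; it only inspects the import
    system and does not import the package itself.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Regular packages expose the path of their __init__.py in `origin`;
    # namespace packages only provide search locations.
    if spec.origin not in (None, "namespace"):
        return spec.origin
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


if __name__ == "__main__":
    print(locate_package("yarll") or "yarll is not importable from this interpreter")
```

If the printed path points into site-packages while you expected your working copy, the $PYTHONPATH entry or symlink is probably not being picked up.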
You can run algorithms by passing the path to an experiment specification (which is a file in JSON format) to main.py:
python yarll/main.py <path_to_experiment_specification>
You can see all the possible arguments by running python yarll/main.py -h.
Examples of experiment specifications can be found in the experiment_specs folder.
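When launching runs from a script, it may be convenient to check that a specification parses as JSON before handing it to main.py. This is a rough sketch: build_command is a hypothetical helper, and it assumes the command is run from the repository root so that yarll/main.py resolves. It does not check the contents of the specification, only that the file is well-formed JSON.

```python
import json
import sys
from pathlib import Path


def build_command(spec_path, main_script="yarll/main.py"):
    """Return the argv list for running main.py on a specification file.

    Hypothetical helper: raises json.JSONDecodeError (a ValueError) if the
    file is not valid JSON, and OSError if it cannot be opened.
    """
    spec_path = Path(spec_path)
    with spec_path.open() as f:
        json.load(f)  # parse only to validate; the result is discarded
    # Use the current interpreter so the same environment runs the experiment.
    return [sys.executable, main_script, str(spec_path)]
```

The returned list can be passed to subprocess.run(..., check=True) from the standard library to start the run.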
Statistics can be plotted using:
python -m yarll.misc.plot_statistics <path_to_stats>
<path_to_stats> can be one of 2 things:
gym.wrappers.Monitor, in which case it plots the episode lengths and total reward per episode.
Help about other arguments (e.g. for using smoothing) can be found by executing python -m yarll.misc.plot_statistics -h.
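To illustrate what smoothing does to a noisy per-episode reward curve, here is a small sketch of a trailing moving average. This is one common choice and an assumption on our part; it is not necessarily the method plot_statistics uses, and the reward values are made up.

```python
def moving_average(values, window):
    """Average each value with up to `window - 1` values preceding it."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just left the window.
            running_sum -= values[i - window]
        # Early entries are averaged over fewer than `window` values.
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


if __name__ == "__main__":
    episode_rewards = [-21.0, -19.0, -20.0, -15.0, -17.0, -9.0]  # made-up data
    print(moving_average(episode_rewards, window=3))
```

A larger window gives a flatter curve at the cost of reacting more slowly to real changes in performance.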
Alternatively, it is also possible to use TensorBoard to show statistics in the browser by passing the directory with the scalar summaries as the --logdir argument.