ChuaCheowHuan Reinforcement Learning

My reproductions of various reinforcement learning algorithms (DQN variants, A3C, DPPO, RND with PPO) in TensorFlow.

What's in this repository?

This repository contains code that I wrote while learning RL to reproduce various reinforcement learning algorithms. The code was tested on Colab.

If GitHub fails to load the Jupyter notebooks (a known GitHub issue), click here to view them on Jupyter's nbviewer.


Implemented Algorithms

| Algorithms | Discrete | Continuous | Multithreaded | Multiprocessing | Tested on |
| --- | --- | --- | --- | --- | --- |
| DQN | :heavy_check_mark: | | | | CartPole-v0 |
| Double DQN (DDQN) | :heavy_check_mark: | | | | CartPole-v0 |
| Dueling DDQN | :heavy_check_mark: | | | | CartPole-v0 |
| Dueling DDQN + PER | :heavy_check_mark: | | | | CartPole-v0 |
| A3C (1) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: (3) | CartPole-v0, Pendulum-v0 |
| DPPO (2) | | :heavy_check_mark: | | :heavy_check_mark: (3) | Pendulum-v0 |
| RND + PPO | | :heavy_check_mark: | | | MountainCarContinuous-v0 (4), Pendulum-v0 (5) |

(1): N-step returns are used for the critic's target.
(2): GAE is used to compute the TD(lambda) return (for the critic's target) and the policy's advantage.
(3): Distributed TensorFlow and Python's multiprocessing package are used.
(4): State featurization (approximating the feature map of an RBF kernel) is used.
(5): A fast-slow LSTM with an overly simplified, VAE-like "variational unit" (VU) is used.
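
To make notes (1) and (2) concrete, here is a minimal pure-Python sketch of both target computations. It is not the code from the notebooks: the function names `n_step_returns` and `gae` are hypothetical, and it assumes a single rollout segment with no terminal state inside it, bootstrapped from the critic's value estimate of the state after the last step.

```python
from typing import List, Sequence, Tuple


def n_step_returns(
    rewards: Sequence[float], bootstrap_value: float, gamma: float
) -> List[float]:
    """Discounted returns for one rollout segment, bootstrapped from the critic.

    G_t = r_t + gamma * r_{t+1} + ... + gamma^(T - t) * V(s_T)
    Assumption: the segment contains no terminal state.
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def gae(
    rewards: Sequence[float],
    values: Sequence[float],
    bootstrap_value: float,
    gamma: float,
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Generalized Advantage Estimation over one rollout segment.

    Returns (advantages, lambda_returns), where
    lambda_returns[t] = advantages[t] + values[t] is the critic's target.
    """
    advantages = [0.0] * len(rewards)
    next_value = bootstrap_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of the TD errors from t onwards.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    lambda_returns = [a + v for a, v in zip(advantages, values)]
    return advantages, lambda_returns


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    values = [0.5, 0.5, 0.5]
    print(n_step_returns(rewards, bootstrap_value=0.0, gamma=0.5))
    print(gae(rewards, values, bootstrap_value=0.0, gamma=0.5, lam=1.0))
```

With `lam=1.0` the lambda returns coincide with the bootstrapped n-step returns, and with `lam=0.0` each advantage reduces to the one-step TD error, which is a quick way to sanity-check an implementation.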


misc folder

The misc folder contains related example code that I put together while learning RL. See the README.md in the misc folder for more details.


Blog

Check out my blog for more information on my repositories.

README Source: ChuaCheowHuan/reinforcement_learning