Modularized Implementation of Deep RL Algorithms in PyTorch
The main update is that TensorBoard now fully replaces the plotting system from OpenAI baselines, which turned out to be a bad choice.
Previously, I kept an internal version for running experiments on servers, which included many helper scripts. I recently found it intractable to maintain two versions at the same time, so from now on they are merged.
I really don't like shell, so my philosophy is to use Python as much as possible. I don't like the common style of passing a loooooong argument list to a script to specify hyper-parameters.
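As a rough illustration of that philosophy, here is a minimal sketch of keeping hyper-parameters in a plain Python object instead of a long command line. The `Config` class, its field names, and the `dqn_pixel` entry point are hypothetical names chosen for this example, not necessarily what the repository uses, and the sketch assumes Python 3.7+ for `dataclasses`.

```python
from dataclasses import dataclass, replace


@dataclass
class Config:
    """Hypothetical hyper-parameter container; field names are illustrative."""
    game: str = "PongNoFrameskip-v4"
    learning_rate: float = 1e-4
    discount: float = 0.99
    rollout_length: int = 5


def dqn_pixel(config):
    """Placeholder entry point: a real one would build the agent and train it."""
    return "{} lr={} gamma={}".format(
        config.game, config.learning_rate, config.discount)


# A sweep is ordinary Python: copy the base config and override one field.
base = Config()
sweep = [replace(base, learning_rate=lr) for lr in (1e-3, 1e-4)]
summaries = [dqn_pixel(c) for c in sweep]
```

Because the configuration is just an object, sweeps, defaults, and per-experiment overrides need no argument parsing or shell quoting.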
Currently I cannot upgrade to PyTorch v1.0.1, as many of my ongoing projects are still based on v0.4.0. I will do this as soon as possible.
After this release, there is no official support for Python 2, although I expect most of the code will still work well in Python 2.
After this release, all the code is incompatible with PyTorch v0.3.x.
I found that the Atari wrapper I currently use is not fully compatible with the one in OpenAI baselines, resulting in lower reported performance for most games (except for Pong). So I plan to do a major update to fix this issue. (To be more specific, OpenAI baselines tracks the return of the original episode, which usually spans more than one life, whereas I track the return of an episode that only has one life.)
Moreover, asynchronous methods are falling out of favor nowadays, so I will remove them and switch to A2C-style algorithms in the next version.
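The difference between the two conventions can be sketched with a toy tracker. This is only an illustration under assumed names: `ReturnTracker` and the `(reward, life_lost, game_over)` transition format are made up for the example and are not the wrapper code from either project.

```python
class ReturnTracker:
    """Accumulates reward and records a return whenever an episode 'ends'.

    per_life=True treats every lost life as the end of an episode;
    per_life=False only ends the episode when the whole game is over.
    """

    def __init__(self, per_life):
        self.per_life = per_life
        self.running = 0.0
        self.finished = []

    def step(self, reward, life_lost, game_over):
        self.running += reward
        if game_over or (self.per_life and life_lost):
            self.finished.append(self.running)
            self.running = 0.0


# Hypothetical game with two lives: (reward, life_lost, game_over)
transitions = [
    (1.0, False, False),
    (2.0, True, False),   # first life lost, game continues
    (3.0, False, False),
    (4.0, True, True),    # second life lost, game over
]

per_life = ReturnTracker(per_life=True)
per_game = ReturnTracker(per_life=False)
for reward, life_lost, game_over in transitions:
    per_life.step(reward, life_lost, game_over)
    per_game.step(reward, life_lost, game_over)

print(per_life.finished)  # [3.0, 7.0]
print(per_game.finished)  # [10.0]
```

The same play-through is reported as two smaller returns under per-life tracking and as one larger return under whole-game tracking, which is why numbers logged under the two conventions are not directly comparable.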
I made this tag in case someone still wants some of the old stuff.
To be more specific, the following algorithms are implemented in this release:
Most of them are compatible with both Python 2 and Python 3; however, almost all the async methods only work in Python 2.