
PyTorch implementation of deep reinforcement learning algorithms


Deep Reinforcement Learning (DRL) Algorithms with PyTorch

This repository contains PyTorch implementations of deep reinforcement learning algorithms. It will soon be updated to include the PyBullet environments.

Algorithms Implemented

Deep Q-Network (DQN) (V. Mnih et al., 2015)
Double DQN (DDQN) (H. van Hasselt et al., 2015)
Advantage Actor Critic (A2C)
Vanilla Policy Gradient (VPG)
Natural Policy Gradient (NPG) (S. Kakade, 2002)
Trust Region Policy Optimization (TRPO) (J. Schulman et al., 2015)
Proximal Policy Optimization (PPO) (J. Schulman et al., 2017)
Deep Deterministic Policy Gradient (DDPG) (T. Lillicrap et al., 2015)
Twin Delayed DDPG (TD3) (S. Fujimoto et al., 2018)
Soft Actor-Critic (SAC) (T. Haarnoja et al., 2018)
SAC with automatic entropy adjustment (SAC-AEA) (T. Haarnoja et al., 2018)

Environments Implemented

Classic control environments (CartPole-v1, Pendulum-v0, etc.)
MuJoCo environments (Hopper-v2, HalfCheetah-v2, Ant-v2, Humanoid-v2, etc.)
PyBullet environments (HopperBulletEnv-v0, HalfCheetahBulletEnv-v0, AntBulletEnv-v0, HumanoidDeepMimicWalkBulletEnv-v1, etc.)

Results (MuJoCo, PyBullet)

MuJoCo environments

Hopper-v2

Observation space: 11
Action space: 3

HalfCheetah-v2

Observation space: 17
Action space: 6

Ant-v2

Observation space: 111
Action space: 8

Humanoid-v2

Observation space: 376
Action space: 17

PyBullet environments

HopperBulletEnv-v0

Observation space: 15
Action space: 3

HalfCheetahBulletEnv-v0

Observation space: 26
Action space: 6

AntBulletEnv-v0

Observation space: 28
Action space: 8

HumanoidDeepMimicWalkBulletEnv-v1

Observation space: 197
Action space: 36

Requirements

Usage

The repository's high-level structure is:

├── agents
│   └── common
├── results
│   ├── data
│   └── graphs
└── save_model

1) To train the agents on the environments

To train all the different agents on PyBullet environments, follow these steps:

git clone https://github.com/dongminlee94/deep_rl.git
cd deep_rl
python run_bullet.py

For other environments, replace run_bullet.py in the last line with run_cartpole.py, run_pendulum.py, or run_mujoco.py.

To change the configuration of the agents, pass the options on the command line, for example:

python run_bullet.py \
    --env=HumanoidDeepMimicWalkBulletEnv-v1 \
    --algo=sac-aea \
    --phase=train \
    --render=False \
    --load=None \
    --seed=0 \
    --iterations=200 \
    --steps_per_iter=5000 \
    --max_step=1000 \
    --tensorboard=True \
    --gpu_index=0
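
The flags in the command above mix strings, integers, booleans, and a literal None. As a minimal sketch of how a run script could parse them (this is not taken from the repository; the helper names `str2bool`, `none_or_str`, and `build_parser` are hypothetical, and the defaults are simply copied from the command above), note that passing `--render=False` through a plain `type=bool` would yield True, because any non-empty string is truthy:

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse 'True'/'False' style strings; bool('False') would be True."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def none_or_str(value: str):
    """Map the literal string 'None' to Python None; keep other strings."""
    return None if value == "None" else value


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the flags shown in the command above.
    # Defaults are copied from that command, not read from the repository.
    parser = argparse.ArgumentParser(description="Sketch of a run-script CLI")
    parser.add_argument("--env", type=str, default="HumanoidDeepMimicWalkBulletEnv-v1")
    parser.add_argument("--algo", type=str, default="sac-aea")
    parser.add_argument("--phase", type=str, default="train", choices=["train", "test"])
    parser.add_argument("--render", type=str2bool, default=False)
    parser.add_argument("--load", type=none_or_str, default=None)
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--iterations", type=int, default=200)
    parser.add_argument("--steps_per_iter", type=int, default=5000)
    parser.add_argument("--max_step", type=int, default=1000)
    parser.add_argument("--tensorboard", type=str2bool, default=True)
    parser.add_argument("--gpu_index", type=int, default=0)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--algo=sac-aea", "--render=False", "--load=None"])
print(args.algo, args.render, args.load)
```

With explicit converters, `--render=False` becomes the boolean False and `--load=None` becomes Python's None rather than the string "None".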

2) To watch the learned agents on the above environments

To watch all the learned agents on PyBullet environments, follow these steps:

python run_bullet.py \
    --env=HumanoidDeepMimicWalkBulletEnv-v1 \
    --algo=sac-aea \
    --phase=test \
    --render=True \
    --load=envname_algoname_... \
    --seed=0 \
    --iterations=200 \
    --steps_per_iter=5000 \
    --max_step=1000 \
    --tensorboard=False \
    --gpu_index=0

Copy the name of the saved model from save_model/envname_algoname_... and pass it as the --load value in place of envname_algoname_..., so that the saved model is loaded.
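
To avoid copying the name by hand, the candidates can be listed programmatically. The following is only a sketch under assumptions: it supposes that saved model names begin with the environment and algorithm names joined by underscores, as the envname_algoname_... pattern suggests, and the function `find_saved_models` is hypothetical rather than part of the repository.

```python
from pathlib import Path
from typing import List


def find_saved_models(save_dir: str, env: str, algo: str) -> List[str]:
    """Return file names in save_dir starting with '<env>_<algo>_', newest first."""
    prefix = f"{env}_{algo}_"  # assumed naming scheme, see the note above
    directory = Path(save_dir)
    if not directory.is_dir():
        return []
    matches = [
        p for p in directory.iterdir()
        if p.is_file() and p.name.startswith(prefix)
    ]
    # Sort by modification time so the most recently saved model comes first.
    matches.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.name for p in matches]


# Usage: print candidate names that could be passed to --load.
for name in find_saved_models("save_model", "HumanoidDeepMimicWalkBulletEnv-v1", "sac-aea"):
    print(name)
```

If the directory does not exist or nothing matches, the function returns an empty list and nothing is printed.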

Open Source Agenda is not affiliated with "Deep Rl" Project. README Source: dongminlee94/deep_rl


License

MIT
