Stefanbo92 A3C Continuous

A3C Continuous Reinforcement Learning

TensorFlow implementation of the asynchronous advantage actor-critic (A3C) reinforcement learning algorithm (paper) for continuous action spaces. The code is mostly based on the implementation by Morvan Zhou (github).

Components

ACNet: This class contains the actor-critic neural network, which outputs an action for a given state and a value estimate for that state. For continuous action spaces, the action is given as an expected value mu and a variance sigma.
Worker: The A3C algorithm employs multiple workers, each with its own environment and ACNet, which train asynchronously. Every few steps they synchronize their weights with the global ACNet.
Main: The main function creates the global ACNet and multiple workers. The workers train until a defined number of training episodes is reached. The reward is plotted over all steps.
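The way the three components fit together can be sketched in plain Python. This is not the repository's code: it uses no TensorFlow and no neural network, and it replaces the Pendulum environment with a hypothetical one-step task (`toy_reward`) whose best action is 2.0. The names `GlobalParams`, `worker`, `main` and `toy_reward` are made up for illustration, and `sigma` is treated here as a standard deviation and kept fixed, which is an assumption of the sketch.

```python
import random
import threading


class GlobalParams:
    """Hypothetical stand-in for the global ACNet: shared parameters behind a lock."""

    def __init__(self, mu=0.0, sigma=0.5):
        self.mu = mu              # mean of the Gaussian policy (learned)
        self.sigma = sigma        # spread of the policy (fixed in this sketch)
        self.episodes_done = 0    # episodes claimed by all workers so far
        self.rewards = []         # one average reward per finished episode
        self.lock = threading.Lock()

    def pull(self):
        """Copy the current global parameters into a worker."""
        with self.lock:
            return self.mu, self.sigma

    def push(self, grad_mu, episode_reward, lr=0.02):
        """Apply a worker's gradient estimate to the global parameters."""
        with self.lock:
            self.mu += lr * grad_mu
            self.rewards.append(episode_reward)


def toy_reward(action, target=2.0):
    """Assumed toy task: reward is highest when the action equals the target."""
    return -(action - target) ** 2


def worker(global_params, max_episodes, update_every, seed):
    rng = random.Random(seed)  # each worker has its own source of randomness
    while True:
        # Claim one episode; stop once the global episode budget is used up.
        with global_params.lock:
            if global_params.episodes_done >= max_episodes:
                return
            global_params.episodes_done += 1
        mu, sigma = global_params.pull()
        grad, total = 0.0, 0.0
        for _ in range(update_every):
            action = rng.gauss(mu, sigma)  # sample from the Gaussian policy
            reward = toy_reward(action)
            # The reward at the mean acts as a crude baseline (critic stand-in).
            advantage = reward - toy_reward(mu)
            # d/dmu of log N(action | mu, sigma) is (action - mu) / sigma**2
            grad += advantage * (action - mu) / sigma ** 2
            total += reward
        global_params.push(grad / update_every, total / update_every)


def main(num_workers=4, max_episodes=400, update_every=5):
    global_params = GlobalParams()
    threads = [
        threading.Thread(target=worker,
                         args=(global_params, max_episodes, update_every, i))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return global_params
```

Calling `main()` returns the shared parameters after all workers have finished. In this toy setup `mu` should drift toward the assumed target of 2.0, and `rewards` holds one entry per episode, which could be plotted in the same spirit as the reward plot described above.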

Results

Pendulum environment before training:

[image: before]

After 1500 episodes:

[image: after]

Open Source Agenda is not affiliated with "Stefanbo92 A3C Continuous" Project. README Source: stefanbo92/A3C-Continuous

Last Commit

6 years ago

Repository

stefanbo92/A3C-Continuous

License

MIT
