PyTorch implementation of some reinforcement learning algorithms: A2C, P...
Curiosity-driven Exploration by Self-supervised Prediction
Proximal Policy Optimization (PPO) with Intrinsic Curiosity Module (ICM)
Clean baseline implementation of PPO using an episodic TransformerXL memory
Baseline implementation of recurrent PPO using truncated BPTT
Code for the paper "Reinforced Curriculum Learning for Autonomous Drivin...
Implementing reinforcement-learning algorithms for the pysc2 environment
Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation ...
Implementation of Generative Adversarial Imitation Learning (GAIL) for cl...
This is a PyTorch implementation of Distributed Proximal Policy Optimiz...
PyTorch implementation of Proximal Policy Optimization
Deep Reinforcement Learning by using Proximal Policy Optimization and Ra...
JAX implementation of Proximal Policy Optimization (PPO) specifically tu...
Policy Optimization with Penalized Point Probability Distance: an Altern...