Clean baseline implementation of PPO using an episodic TransformerXL memory
Code for the paper "Episodic Memory Deep Q-Networks" (https://arxiv.org/abs/...