Emdp Versions

Easy MDPs and grid worlds with accessible transition dynamics to do exact calculations

4 years ago

Add the Cakeworld MDP from the action gap paper: Bellemare et al. 2015 #16
Add the two-circle MDP for off-policy learning from Zhang et al. 2019 #16
Change plotting functions so that the MDP is plotted in the same orientation as in the character matrix file #17
Fix the transition matrix builder: if the agent was manually placed in a wall state, it had a non-zero probability of escaping. This was fixed in #17, along with associated tests. Walls now behave as absorbing states. As a side effect, value functions for wall states should now be zero.
Fix a really nasty bug in utils.convert_one_hot_to_int that caused the returned action to be downcast. This function now returns a normal int instead of an np.int8: 59c20641eecfc5d23076f8bee6805f6101ba0a2d
All tests are now passing again.
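
To see why absorbing walls imply zero wall values, here is a minimal sketch in plain Python. It does not use emdp's API: the three-state layout, the `evaluate_policy` helper, and both transition matrices are assumptions made up for illustration. It evaluates a fixed policy on a matrix where the wall is absorbing, and on a "leaky" matrix where the wall can be escaped, which mimics the behaviour before the fix.

```python
def evaluate_policy(P, r, gamma=0.9, sweeps=500):
    """Iterative policy evaluation: repeatedly apply V <- r + gamma * P V."""
    n = len(r)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v

# Hypothetical layout: state 0 = free cell, state 1 = wall, state 2 = terminal cell.
# Reward is collected only in the free cell.
r = [1.0, 0.0, 0.0]

# Wall row is a pure self-loop, so the wall is absorbing.
P_absorbing = [
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Wall row leaks into the free cell with probability 0.5.
P_leaky = [
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]

v_absorbing = evaluate_policy(P_absorbing, r)
v_leaky = evaluate_policy(P_leaky, r)

print("absorbing wall value:", v_absorbing[1])
print("leaky wall value:", v_leaky[1])
```

With the absorbing matrix, the wall state never reaches a rewarding state, so its value stays at zero. With the leaky matrix, the wall inherits value from the free cell it can escape into.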

5 years ago

Add a new counterexample from the ACE paper. #13
Add gym support so that emdp works out of the box with algorithms that are compatible with OpenAI Gym. #12
Add torch-compatible analytic functions to make them differentiable. #14
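
As a rough illustration of what Gym compatibility means for a tabular MDP, the sketch below wraps a transition tensor in the classic `reset()` / `step()` interface. The `TabularEnv` class, its constructor arguments, and the two-state MDP are all hypothetical and are not emdp's actual classes. It follows the older Gym convention where `step` returns a 4-tuple `(observation, reward, done, info)`.

```python
import random


class TabularEnv:
    """Hypothetical Gym-style wrapper around a tabular MDP (not emdp's API)."""

    def __init__(self, P, R, start_state, terminal_states, seed=0):
        self.P = P  # P[s][a] is a list of next-state probabilities
        self.R = R  # R[s][a] is the reward for taking action a in state s
        self.start_state = start_state
        self.terminal_states = set(terminal_states)
        self.rng = random.Random(seed)
        self.state = start_state

    def reset(self):
        """Return the agent to the start state and report it as the observation."""
        self.state = self.start_state
        return self.state

    def step(self, action):
        """Sample the next state from P and return (obs, reward, done, info)."""
        probs = self.P[self.state][action]
        reward = self.R[self.state][action]
        next_state = self.rng.choices(range(len(probs)), weights=probs)[0]
        self.state = next_state
        done = next_state in self.terminal_states
        return next_state, reward, done, {}


# Two states, two actions: action 0 stays put, action 1 moves to terminal state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
R = [
    [0.0, 1.0],
    [0.0, 0.0],
]

env = TabularEnv(P, R, start_state=0, terminal_states=[1])
start = env.reset()
obs, reward, done, info = env.step(1)
print(start, obs, reward, done)
```

Because the transition probabilities are stored explicitly, the same `P` and `R` could feed both a sampling-based agent through `step` and an exact calculation, which is the kind of pairing the package description refers to.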

5 years ago