Emdp Versions

Easy MDPs and grid worlds with accessible transition dynamics to do exact calculations

0.0.5

4 years ago
  1. Add the Cakeworld MDP from the action gap paper: Bellemare et al. 2015 #16
  2. Add the two-circle off-policy MDP from Zhang et al. 2019 #16
  3. Change plotting functions so that MDPs are plotted in the same orientation as in the character matrix file #17
  4. Fix the transition matrix builder: if the agent was manually placed in a wall state, there was a non-zero probability of it escaping. This was fixed in #17 along with associated tests. Walls now behave as absorbing states. As a side effect, value functions for wall states should now be zero.
  5. Fix a really nasty bug in utils.convert_one_hot_to_int that caused the downcasting of an action. This function now returns a plain Python int rather than an np.int8: 59c20641eecfc5d23076f8bee6805f6101ba0a2d
  6. All tests are now passing again.
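
To illustrate item 4, here is a minimal sketch of why an absorbing wall ends up with a zero value. It does not use emdp itself: the 3-state chain, the matrices `P` and `P_leaky`, the reward vector `r`, and the discount `gamma` are all hypothetical, and it assumes the wall state carries zero reward.

```python
import numpy as np

# Hypothetical 3-state chain under a fixed policy (not emdp's API):
# state 0 = free cell, state 1 = wall, state 2 = absorbing goal.
gamma = 0.9
r = np.array([1.0, 0.0, 0.0])  # assumed: reward only in the free cell, none in the wall

# Fixed behaviour: the wall row keeps all probability mass in the wall.
P = np.array([
    [0.0, 0.0, 1.0],  # free cell moves to the goal
    [0.0, 1.0, 0.0],  # wall is absorbing
    [0.0, 0.0, 1.0],  # goal is absorbing
])

# Buggy behaviour for comparison: the wall leaks into the free cell.
P_leaky = P.copy()
P_leaky[1] = [0.5, 0.5, 0.0]

def evaluate(P, r, gamma):
    """Exact policy evaluation: solve (I - gamma * P) v = r."""
    return np.linalg.solve(np.eye(len(r)) - gamma * P, r)

v = evaluate(P, r, gamma)
v_leaky = evaluate(P_leaky, r, gamma)
print(v[1], v_leaky[1])
```

With the absorbing wall, the wall's value is exactly zero. With the leaky row it is positive, because the wall can reach the rewarding free cell.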

0.0.4

5 years ago
  • Add a new counterexample from the ACE paper. #13
  • Add gym support so that emdp works out of the box with algorithms that are compatible with OpenAI Gym. #12
  • Add torch-compatible analytic functions to make them differentiable. #14

v0.0.3-beta

5 years ago