Deep active inference agents using Monte-Carlo methods
This source code release accompanies the manuscript:
Z. Fountas, N. Sajid, P. A. M. Mediano and K. Friston, "Deep active inference agents using Monte-Carlo methods", Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
If you use this model or the dynamic dSprites environment in your work, please cite our paper.
For a quick overview, see this video. In this work, we propose the deep neural architecture illustrated below, which can be used to train scaled-up active inference agents for complex continuous environments. The architecture is based on amortized inference, Monte-Carlo tree search, Monte-Carlo dropouts and a top-down transition precision that encourages disentangled latent representations.
We test this architecture on two tasks from the Animal-AI Olympics and a new simple object-sorting task based on DeepMind's dSprites dataset.
(Animations: agent trained in the dynamic dSprites environment; agent trained in the Animal-AI environment.)
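As a rough illustration of one ingredient of this architecture, the sketch below shows how Monte-Carlo dropout can be used to draw samples from a learned latent transition model, giving the kind of state uncertainty that the tree search can exploit when evaluating expected free energy. This is a minimal sketch with assumed layer sizes and names, not the model defined in this repository.

import tensorflow as tf

latent_dim, action_dim = 10, 4

# Toy transition network s_{t+1} ~ f(s_t, a_t); the dropout layers are the
# source of the Monte-Carlo samples below. Sizes are illustrative only.
transition = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(latent_dim),
])

def mc_dropout_transitions(state, action, n_samples=20):
    # Keep dropout active at inference time (training=True), so that each
    # forward pass yields a different stochastic sample of the next latent state.
    x = tf.concat([state, action], axis=-1)
    samples = tf.stack([transition(x, training=True) for _ in range(n_samples)])
    return tf.reduce_mean(samples, axis=0), tf.math.reduce_std(samples, axis=0)

state = tf.zeros((1, latent_dim))        # current latent state (placeholder)
action = tf.one_hot([1], action_dim)     # one-hot encoding of a chosen action
mean_next, std_next = mc_dropout_transitions(state, action)
print(mean_next.shape, std_next.shape)   # (1, 10) (1, 10)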
To install the required Python dependencies, run:
pip install -r requirements.txt
Download the dSprites dataset, either with:
wget https://github.com/deepmind/dsprites-dataset/raw/master/dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz
or by manually visiting the above URL.
To train the agent, run:
python train.py
This script automatically generates checkpoints with the optimized parameters of the agent and stores these checkpoints in a different sub-folder every 25 training iterations. By default, all sub-folders are placed in figs_final_model_0.01_30_1.0_50_10_5. The script also generates a number of performance figures, stored in the same folder. You can stop the process at any point by pressing Ctrl+C.
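If you want to inspect the generated checkpoints programmatically before launching the demo below, a small helper along these lines can be used (illustrative only; the exact directory layout is an assumption based on the default output folder above):

from pathlib import Path

# Default output folder produced by train.py (see above); adjust the name if
# you changed the training hyper-parameters.
ckpt_dir = Path("figs_final_model_0.01_30_1.0_50_10_5") / "checkpoints"
if ckpt_dir.is_dir():
    print("Checkpoint contents:", sorted(p.name for p in ckpt_dir.iterdir()))
    print(f"Demo command: python test_demo.py -n {ckpt_dir}/ -m")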
To test a trained agent in the interactive demo, run:
python test_demo.py -n figs_final_model_0.01_30_1.0_50_10_5/checkpoints/ -m
This command will open a graphical interface that can be controlled by a number of keyboard shortcuts. In particular, press:

- q or esc to exit the simulation at any point.
- 1 to enable the MCTS-based full-scale active inference agent (enabled by default).
- 2 to enable the active inference agent that minimizes the expected free energy calculated only for a single time-step into the future.
- 3 to make the agent controlled entirely by the habitual network (see manuscript for explanation).
- 4 to activate manual mode, where the agents are disabled and the environment can be manipulated by the user. Use the keys w, s, a or d to move the current object up, down, left or right, respectively.
- 5 to enable an agent that minimizes the terms a and b of equation 8 in the manuscript.
- 6 to enable an agent that minimizes only the term a of the same equation (reward-seeking agent).
- m to toggle the use of sampling in calculating future transitions.

To cite this work, you can use the following BibTeX entry:

@inproceedings{fountas2020daimc,
author = {Fountas, Zafeirios and Sajid, Noor and Mediano, Pedro and Friston, Karl},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {11662--11675},
publisher = {Curran Associates, Inc.},
title = {Deep active inference agents using Monte-Carlo methods},
url = {https://proceedings.neurips.cc/paper/2020/file/865dfbde8a344b44095495f3591f7407-Paper.pdf},
volume = {33},
year = {2020}
}