Implementation of the paper [Using Fast Weights to Attend to the Recent Past](https://arxiv.org/abs/1610.06258)
Reproducing the associative retrieval experiment from the paper
Using Fast Weights to Attend to the Recent Past by Jimmy Ba et al. (Incomplete)
TensorFlow (version >= 0.8)
Generate a dataset
$ python generator.py
This script generates a file called associative-retrieval.pkl, which can be used for training.
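To check what the generator wrote before training, the pickle can be loaded and summarized. The following is a minimal sketch: the helper names `load_dataset` and `describe` are hypothetical and not part of this repository, and no assumption is made about the internal layout of the file; the code only reports whatever container types and sizes it finds.

```python
import pickle


def load_dataset(path="associative-retrieval.pkl"):
    """Load the pickled object written by generator.py (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj, name="dataset", depth=0):
    """Return one summary line per entry: name, type, and shape or length."""
    indent = "  " * depth
    shape = getattr(obj, "shape", None)
    if shape is not None:
        size = f"shape={tuple(shape)}"   # e.g. NumPy arrays
    elif hasattr(obj, "__len__"):
        size = f"len={len(obj)}"         # lists, tuples, dicts, strings
    else:
        size = ""
    lines = [f"{indent}{name}: {type(obj).__name__} {size}".rstrip()]
    # Only dictionaries are expanded; other containers are summarized as a whole.
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.extend(describe(value, str(key), depth + 1))
    return lines
```

After running `generator.py`, calling `print("\n".join(describe(load_dataset())))` prints the summary. If the file was written under Python 2 and is read under Python 3, `pickle.load(f, encoding="latin1")` may be needed instead.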
Run the model
$ python fw.py
The following are the accuracy and loss graphs for R=20. The experiments are barely tuned.
Layer Normalization is crucial for successful training.
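To illustrate where layer normalization enters the computation, here is a minimal NumPy sketch of one fast-weights recurrent step as described in the paper. It is not the code in `fw.py`: the function names, the ReLU nonlinearity, and the default values for `decay`, `eta`, and `inner_steps` are assumptions chosen for illustration.

```python
import numpy as np


def layer_norm(z, gain=1.0, bias=0.0, eps=1e-5):
    """Normalize a vector to zero mean and unit variance, then scale and shift."""
    return gain * (z - z.mean()) / np.sqrt(z.var() + eps) + bias


def fast_weights_step(h, x, A, W, C, decay=0.9, eta=0.5, inner_steps=1):
    """One recurrent step with a fast weight matrix A (illustrative sketch).

    h: previous hidden state, shape (R,)
    x: current input, shape (D,)
    A: fast weights, shape (R, R)
    W: slow recurrent weights, shape (R, R)
    C: slow input weights, shape (R, D)
    """
    # Fast weights: decay the old matrix and add the outer product of h.
    A_new = decay * A + eta * np.outer(h, h)
    # Slow-weight contribution, held fixed during the inner loop.
    boundary = W @ h + C @ x
    # Initial guess for the next hidden state (ReLU is an assumption here).
    h_inner = np.maximum(boundary, 0.0)
    for _ in range(inner_steps):
        # Layer normalization is applied to the summed pre-activation,
        # which keeps the A_new @ h_inner term from dominating the update.
        h_inner = np.maximum(layer_norm(boundary + A_new @ h_inner), 0.0)
    return h_inner, A_new
```

Without the `layer_norm` call, the fast-weight term grows with the squared norm of the hidden state, which is one plausible reason training is sensitive to it.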
Further improvements:
Using Fast Weights to Attend to the Recent Past. Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu.
Layer Normalization. Jimmy Ba, Ryan Kiros, Geoffrey Hinton.