asmith26/Dynamic-memory-networks-in-Theano

 
 

Dynamic memory networks in Theano

The aim of this repository is to implement Dynamic memory networks as described in the paper by Kumar et al. and to experiment with its various extensions.

We will cover the process in a series of blog posts.

Repository contents

| File | Description |
| --- | --- |
| main.py | The main entry point to train and test the available network architectures on bAbI-like tasks. |
| dmn_basic.py | Our baseline implementation. It follows the paper as closely as we could understand it, except that the number of steps in the main memory GRU is fixed. The attention module uses the T.abs_ function as a distance between two vectors, which causes gradients to become NaN randomly. The results reported in this blog post are based on this network. |
| dmn_smooth.py | Uses the square of the Euclidean distance instead of abs in the attention module. Training is very stable, and performance on bAbI is slightly better. |
| dmn_batch.py | dmn_smooth with minibatch training support. The batch size cannot be set to 1 because of a Theano bug. |
| dmn_qa_draft.py | Draft version of a DMN designed for answering multiple-choice questions. |
| utils.py | Tools for working with bAbI tasks and GloVe vectors. |
| nn_utils.py | Helper functions on top of Theano and Lasagne. |
| fetch_babi_data.sh | Shell script to fetch bAbI tasks (adapted from MemN2N). |
| fetch_glove_data.sh | Shell script to fetch GloVe vectors (by 5-vision). |
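The difference between the abs-based and squared-distance attention features can be illustrated with a small, framework-free sketch. This is not the code from dmn_basic.py or dmn_smooth.py: the function names are hypothetical, and the gradient formula d / |d| for the absolute value is an assumption about how a symbolic framework might differentiate abs, shown here only as one plausible reason for NaN gradients when two vector components coincide.

```python
import math


def abs_distance_features(a, b):
    """Element-wise |a - b| (the kind of feature dmn_basic is described as using)."""
    return [abs(x - y) for x, y in zip(a, b)]


def squared_distance_features(a, b):
    """Element-wise (a - b)^2; their sum is the squared Euclidean distance."""
    return [(x - y) ** 2 for x, y in zip(a, b)]


def abs_grad(a, b):
    """Gradient of |a - b| w.r.t. a, computed as d / |d| (assumed formula).

    Plain Python raises on 0.0 / 0.0, so the IEEE result (NaN) is emulated.
    """
    grads = []
    for x, y in zip(a, b):
        d = x - y
        grads.append(d / abs(d) if d != 0 else float("nan"))
    return grads


def squared_grad(a, b):
    """Gradient of (a - b)^2 w.r.t. a; well defined everywhere, including d == 0."""
    return [2 * (x - y) for x, y in zip(a, b)]


a = [0.5, -1.0, 2.0]
b = [0.5, 1.0, 0.0]

print(abs_distance_features(a, b))
print(sum(squared_distance_features(a, b)))
print([math.isnan(g) for g in abs_grad(a, b)])
print(squared_grad(a, b))
```

In this sketch the first components of a and b are equal, so the assumed abs gradient is NaN there, while the squared-distance gradient is simply zero.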

Usage

This implementation is based on Theano and Lasagne. One way to install them is:

pip install -r https://raw.githubusercontent.com/Lasagne/Lasagne/master/requirements.txt
pip install https://github.com/Lasagne/Lasagne/archive/master.zip

The following bash scripts will download bAbI tasks and GloVe vectors.

./fetch_babi_data.sh
./fetch_glove_data.sh

Use main.py to train a network:

python main.py --network dmn_basic --babi_id 1

Network states are saved in the states/ folder. One pretrained state for the 1st bAbI task is included. It should give 100% accuracy on the test set:

python main.py --network dmn_basic --mode test --babi_id 1 --load_state states/dmn_basic.mh5.n40.babi1.epoch4.test0.00033.state
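To run the same commands for several tasks, the argument lists can be built in Python. The sketch below is a hypothetical helper (build_command is not part of this repository) that uses only the flags shown above. It assumes that main.py accepts other --babi_id values and other network names from the table in the same way.

```python
import sys


def build_command(network, babi_id, mode=None, load_state=None):
    """Build an argument list for main.py using only the flags shown in this README."""
    cmd = [sys.executable, "main.py", "--network", network, "--babi_id", str(babi_id)]
    if mode is not None:
        cmd += ["--mode", mode]
    if load_state is not None:
        cmd += ["--load_state", load_state]
    return cmd


if __name__ == "__main__":
    # Print the training commands for the first three tasks (task ids are an assumption).
    for task in (1, 2, 3):
        print(" ".join(build_command("dmn_smooth", task)))
```

Each returned list can be passed to subprocess.check_call from the repository root to actually start a run.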

Roadmap

  • Mini-batch training (done, 08/02/2016)
  • Web interface (work in progress)
  • Visualization of episodic memory module (work in progress)
  • Regularization (work in progress, L2 doesn't help at all!)
  • Support for multiple-choice questions (work in progress: dmn_qa_draft.py)
  • Evaluation on more complex datasets
  • Import some ideas from Neural Reasoner

About

Implementation of Dynamic memory networks by Kumar et al. http://arxiv.org/abs/1506.07285
