FAST AND ACCURATE READING COMPREHENSION WITHOUT RECURRENT NETWORKS

A TensorFlow implementation of Google's Fast Reading Comprehension model from ICLR 2018. Without RNNs, the model runs considerably faster than R-net (about 5 times faster in a naive implementation). After 12 epochs of training, our model reaches dev EM/F1 = 57/72.
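For readers unfamiliar with the EM/F1 numbers quoted above, here is a minimal sketch of how the two metrics are usually computed for SQuAD answers. It follows the commonly used normalization (lowercase, drop punctuation and English articles, collapse whitespace) and is not necessarily identical to what evaluate.py in this repo does; the function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and ground truth."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / float(len(pred_tokens))
    recall = overlap / float(len(true_tokens))
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, the usual convention is to take the maximum score over the references and then average over all questions.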

Dataset

The dataset used for this task is the Stanford Question Answering Dataset (SQuAD). Pretrained GloVe embeddings trained on Common Crawl with 840B tokens are used for words.
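As an illustration of the GloVe text format (one token followed by its vector components, separated by spaces), the sketch below parses lines into a lookup table. It is not the code from process.py; the function names and the optional vocabulary filter are assumptions made for the example. Splitting from the right is a precaution, since a few tokens in the 840B file are reported to contain spaces.

```python
import numpy as np


def parse_glove_line(line, dim=300):
    """Split one GloVe text line into (token, vector)."""
    # rsplit keeps any spaces inside the token itself intact.
    parts = line.rstrip("\n").rsplit(" ", dim)
    if len(parts) != dim + 1:
        raise ValueError("expected a token followed by %d values" % dim)
    token = parts[0]
    vector = np.array([float(x) for x in parts[1:]], dtype=np.float32)
    return token, vector


def load_glove(lines, vocab=None, dim=300):
    """Build a token -> vector dict, optionally keeping only tokens in vocab."""
    table = {}
    for line in lines:
        token, vector = parse_glove_line(line, dim)
        if vocab is None or token in vocab:
            table[token] = vector
    return table
```

In practice `lines` would be an open file handle on the downloaded embeddings, so the file is streamed rather than read into memory at once.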

Requirements

Python 2.7
NumPy
tqdm
TensorFlow (1.2 or higher)
spacy

Downloads and Setup

The preprocessing step is identical to R-net. Once you clone this repo, run the following lines from bash just once to process the dataset (SQuAD).

$ pip install -r requirements.txt
$ bash setup.sh
$ python process.py --process True --reduce_glove True

Training / Testing / Debugging / Demo

You can change the hyperparameters in the params.py file to fit the model on your GPU. To train the model, run the following line. To test or debug your model after training, change the mode setting in params.py from "train" to the test or debug mode and run the model again.

$ python model.py

A working realtime demo is available in demo.py. To use the web interface for the live demo, set mode = "demo" and set batch_size to 1. (The code is taken from R-net.)
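The mode and batch_size settings mentioned above could be sanity-checked before launching the model. The sketch below uses a hypothetical `Params` class as a stand-in for params.py; the actual layout of that file and its default values are not shown here, so treat every name and value in the example as an assumption.

```python
class Params(object):
    """Hypothetical stand-in for the settings kept in params.py."""
    mode = "train"    # this README mentions "train" and "demo"
    batch_size = 32   # illustrative value, not the repository default


def check_params(params):
    """Return a list of human-readable problems with the chosen settings."""
    problems = []
    if params.mode == "demo" and params.batch_size != 1:
        problems.append('mode = "demo" expects batch_size = 1')
    return problems
```

An empty list means the combination is consistent with the instructions in this section.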

TODO's

Training and testing the model
Add trilinear function to Context-to-Query attention
Convergence testing
Apply dropouts + stochastic depth dropout
Realtime Demo
Query-to-context attention
Data augmentation by paraphrasing
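For reference, the trilinear function and the two attention directions listed above can be sketched in NumPy as follows. This is a simplified single-example version based on the description in the paper, not the code in layers.py: there is no batching, masking, or bias term, and the function names and the ordering of the concatenated weight vector are assumptions.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def trilinear_similarity(context, query, w):
    """S[i, j] = w . [c_i ; q_j ; c_i * q_j].

    context: (n, d), query: (m, d), w: (3 * d,)
    """
    d = context.shape[1]
    w_c, w_q, w_cq = w[:d], w[d:2 * d], w[2 * d:]
    # Splitting w avoids materializing the (n, m, 3d) concatenation.
    part_c = context.dot(w_c)[:, None]        # (n, 1)
    part_q = query.dot(w_q)[None, :]          # (1, m)
    part_cq = (context * w_cq).dot(query.T)   # (n, m)
    return part_c + part_q + part_cq


def attend(context, query, w):
    """Return context-to-query and query-to-context attention, both (n, d)."""
    s = trilinear_similarity(context, query, w)
    s_row = softmax(s, axis=1)   # normalize over query positions
    s_col = softmax(s, axis=0)   # normalize over context positions
    c2q = s_row.dot(query)
    q2c = s_row.dot(s_col.T).dot(context)
    return c2q, q2c
```

Here `n` and `m` are the context and query lengths and `d` is the hidden size; in a trained model `w` would be a learned parameter.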

Tensorboard

Run tensorboard for visualisation.

$ tensorboard --logdir=./

Note

2/02/18 The model quickly reaches EM/F1 = 55/69 on the dev set, but never gets beyond that even with strong regularization. Also, the training speed (1.8 batches per second on a GTX 1080) is slower than the paper suggests (3.2 batches per second on a P100).

28/01/18 The model reaches dev set performance of EM/F1 = 44/58 one hour into training without dropout. The next goal is to train with dropout every 2 layers.

04/11/17 Currently the model is not optimized and there is a memory leak, so I strongly suggest training only if you have more than 16 GB of memory. Also, I haven't done convergence testing yet. Training is 5 to 6 times faster in a naive implementation compared to R-net.

