Reading Comprehension on SQuAD

Note:

This is a near re-implementation of the paper: Machine Comprehension Using Match-LSTM and Answer Pointer
It's a simple baseline for the task; I get only F1: 0.643 and EM: 0.4975 on the dev set.
Since I only have access to an NVIDIA 1060 card, I did only a few training and evaluation runs on the dev set.
I changed the file structure slightly.
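
For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how token-level F1 and exact match (EM) can be computed for one predicted answer. It is a simplified illustration, not the repo's evaluate.py: the function names are made up for this example, and it only lowercases and splits on whitespace rather than also stripping punctuation and articles.

```python
from collections import Counter


def exact_match(prediction, truth):
    """Return 1.0 if the lowercased, whitespace-split answers are identical."""
    return float(prediction.lower().split() == truth.lower().split())


def f1_score(prediction, truth):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = prediction.lower().split()
    truth_tokens = truth.lower().split()
    # The multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / float(len(pred_tokens))
    recall = overlap / float(len(truth_tokens))
    return 2 * precision * recall / (precision + recall)
```

The dev-set score is then the average of these per-question values, which is why F1 (partial credit for overlapping tokens) is higher than EM (all or nothing).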

Train

See the CS224N Assignment 4 instructions for how to install the dependencies.

Set the parameters in Config.py, then run:

python train.py

Test

Note the "--" before the keyword.

If you have trained multiple models, you can run:

python eval_ensemble.py

to evaluate them as an ensemble.

You can also run:
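
This README does not describe how eval_ensemble.py combines the models, so the following is only a sketch of one common approach for boundary-style answer pointers: average the start and end distributions predicted by each model, then pick the span with the highest joint probability. The function names and the optional max_len cap are assumptions made for illustration.

```python
def average_distributions(distributions):
    """Average per-position probabilities predicted by several models."""
    n_models = float(len(distributions))
    return [sum(column) / n_models for column in zip(*distributions)]


def best_span(start_probs, end_probs, max_len=None):
    """Return (start, end) maximizing start_probs[s] * end_probs[e], s <= e."""
    best, best_score = (0, 0), -1.0
    for s, p_start in enumerate(start_probs):
        # Optionally cap the answer length (max_len is a hypothetical knob).
        last = len(end_probs) if max_len is None else min(len(end_probs), s + max_len)
        for e in range(s, last):
            score = p_start * end_probs[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best
```

For example, with two models one would call average_distributions on their start distributions and again on their end distributions, then pass both averages to best_span to get the ensemble answer's token indices.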

python eval_interactive.py --ckpt='path/to/ckpt' --vocab='path/to/vocab.bat' --embed='path/to/embedding'

to run an interactive test, where you can enter a context and a question and then see what answer the model gives you.

Takeaways

Randomly shuffling the training set is important.
Initialization of the RNN weights is crucial. I tried identity initialization but did not run experiments to see how it differs from Xavier or other schemes.
Regularization is badly needed.
My best result is F1: 0.643 and EM: 0.4975 on 4000 samples from the dev set.
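
The first three takeaways can be sketched in plain Python as follows. This is an illustration only, not code from train.py or qa_model.py: all function names are hypothetical, and the L2 penalty stands in for regularization in general, since the README does not say which regularizer was used.

```python
import math
import random


def shuffled_batches(examples, batch_size, rng):
    """Yield mini-batches from a freshly shuffled copy of the examples."""
    order = list(examples)
    rng.shuffle(order)  # call once per epoch so each epoch sees a new order
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


def identity_init(size):
    """Square recurrent weight matrix initialized to the identity."""
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]


def xavier_uniform_init(fan_in, fan_out, rng):
    """Weights drawn from U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def l2_penalty(weights, scale):
    """L2 regularization term (an assumed choice): scale * sum of squares."""
    return scale * sum(w * w for row in weights for w in row)
```

Identity initialization only applies to square (hidden-to-hidden) matrices, while the Xavier variant works for any shape, which is one practical difference to keep in mind when comparing the two.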

TO DO:

Programming Assignment 4 (by stanford cs224n)

Welcome to CS224N Project Assignment 4 Reading Comprehension. The project has several dependencies that have to be satisfied before running the code. You can install them using your preferred method -- we list here the names of the packages using pip.

Requirements

The starter code provided presupposes a working installation of Python 2.7, as well as TensorFlow 0.12.1.

It should also install all needed dependencies through pip install -r requirements.txt.

Running your assignment

Be aware that the file structure is a little bit different for this repo.

You can get started by downloading the datasets and doing some basic preprocessing:

$ code/get_started.sh

Note that you will always want to run your code from this assignment directory, not the code directory, like so:

$ python code/train.py

This ensures that any files created in the process don't pollute the code directory.

InnerPeace-Wu/reading_comprehension-cs224n
