Dialogue Model using NLP

This project tackles dialogue question answering: given a sequence of utterances, the model selects the best answer from 100 candidates. Three models are implemented: an RNN without attention, an RNN with attention, and the best-performing model. The report (report.pdf) compares different RNN variants (e.g. LSTM, GRU) and describes the implementation details.
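To make the task concrete, the following is a minimal sketch of candidate selection: each candidate is scored against a representation of the dialogue context and the candidates are ranked by that score. The fixed vectors and the cosine scoring are assumptions made for illustration; the models in this repository learn their representations with RNNs, and the function names `cosine` and `rank_candidates` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_candidates(context_vec, candidate_vecs):
    """Return candidate indices sorted from best to worst match."""
    scores = [cosine(context_vec, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy example with 3 candidates instead of 100 (made-up vectors).
context = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
ranking = rank_candidates(context, candidates)
best_index = ranking[0]  # index of the top-ranked candidate
```

In the real task the same ranking step would run over all 100 candidates of each sample.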

Requirements

    conda env create -f nlp.yml

Training

Prepare data

Put train.json, valid.json, and test.json in the data folder.
Download the English word vectors crawl-300d-2M.vec from fastText (https://fasttext.cc/docs/en/english-vectors.html) and put the file in the data folder as well.

The data folder should then contain the following files:

    ./data/config.json # config setting
    ./data/train.json # training data
    ./data/valid.json # validation data
    ./data/test.json # testing data
    ./data/crawl-300d-2M.vec # english word vectors
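Before preprocessing, it can help to confirm that the data folder is complete and that the word vectors are readable. The sketch below assumes the standard fastText .vec text layout (a header line with the vocabulary size and dimension, then one word and its values per line). The helper names `missing_data_files` and `read_vec` are hypothetical and are not part of this repository.

```python
import io
import os

# File names taken from the listing above.
REQUIRED_FILES = [
    "config.json",
    "train.json",
    "valid.json",
    "test.json",
    "crawl-300d-2M.vec",
]

def missing_data_files(data_dir):
    """Return the required file names that are absent from data_dir."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(data_dir, name))]

def read_vec(lines, limit=None):
    """Parse .vec text: header 'count dim', then 'word v1 ... v_dim' per line."""
    it = iter(lines)
    _count, dim = map(int, next(it).split())
    vectors = {}
    for line in it:
        if limit is not None and len(vectors) >= limit:
            break
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip malformed lines
        vectors[word] = [float(x) for x in values]
    return dim, vectors

# Tiny in-memory sample instead of the real 2M-word file.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
dim, vectors = read_vec(sample)
```

For the real file, `read_vec(open("data/crawl-300d-2M.vec", encoding="utf-8"), limit=1000)` would read only the first 1000 words, which keeps the check fast.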

Train the model

Prepare the models folder.
Create an experiment folder inside it, e.g. lstm.
Add a config.json containing the experiment settings to the experiment folder:

    ./models/lstm/config.json
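The folder setup above could be scripted roughly as follows. This is only a sketch: the helper `init_experiment` is hypothetical, and the settings dictionary is a placeholder, since the keys that the training code actually reads are not listed in this README.

```python
import json
import os
import tempfile

def init_experiment(models_dir, name, settings):
    """Create models_dir/name and write settings to its config.json."""
    exp_dir = os.path.join(models_dir, name)
    os.makedirs(exp_dir, exist_ok=True)
    config_path = os.path.join(exp_dir, "config.json")
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(settings, f, indent=2)
    return config_path

# Demonstration in a temporary directory; "rnn_type" is a made-up key.
demo_root = tempfile.mkdtemp()
config_path = init_experiment(os.path.join(demo_root, "models"), "lstm",
                              {"rnn_type": "lstm"})
with open(config_path, encoding="utf-8") as f:
    loaded = json.load(f)
```

In practice the settings should be copied from a configuration that the training script is known to accept.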

Run the training process:

    cd src
    bash preprocess.sh # convert the JSON data to pickle
    bash train.sh model_path cuda_device

Pre-trained model

Use the gdrive package (https://github.com/gdrive-org/gdrive) to download the pre-trained model:

    bash download.sh

Testing

Run one of rnn.sh, attention.sh, or best.sh with two arguments:

    bash rnn.sh ${1} ${2}

${1}: path to the test JSON file
${2}: path to the predictions file

Attention Score Plot

The models folder must contain a best folder.
The data must already be preprocessed into the pkl format.
An embedding.pkl file containing the English word embedding information is also required.

    cd src
    python visual.py data_path embed_path
    
    # example
    python visual.py ../data/valid.pkl ./embedding.pkl
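For readers unfamiliar with what such a plot shows, here is a minimal sketch of how attention scores can be computed: one query vector is compared with each key vector and the similarities are normalized with a softmax. Dot-product scoring is an assumption made for illustration; this README does not state which attention variant visual.py plots, and the function names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_scores(query, keys):
    """Dot-product attention weights of one query over a list of key vectors."""
    logits = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(logits)

# Toy vectors: the first key is aligned with the query, the others are not.
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
weights = attention_scores(query, keys)
```

The weights sum to 1, and the key most similar to the query receives the largest weight. A heat map of such weights over token pairs is what an attention score plot typically displays.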


vic85821/dialogue_model_nlp
