dsp6414/seq2seq
Seq2seq code in PyTorch

Built from Ruotian Luo's captioning code and Sandeep Subramanian's seq2seq code.

Data preprocessing:

The preprocessing steps are adapted from Alexandre Bérard's code:

> config/WMT14/download.sh    # download WMT14 data into raw_data/WMT14
> config/WMT14/prepare.sh     # preprocess the data, and copy the files to data/WMT14

Then run the following to save the preprocessed data in HDF5 files:

> python scripts/prepro_text.py 
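The core of this kind of text preprocessing is mapping words to integer indices before the arrays are written to HDF5. The sketch below illustrates that step only; the function names, special-token indices, and use of plain dictionaries are illustrative assumptions, not the actual prepro_text.py implementation.

```python
from collections import Counter

# Conventional special-token indices (an assumption, not taken from this repo).
PAD, UNK, BOS, EOS = 0, 1, 2, 3

def build_vocab(sentences, max_words=50000):
    """Count word frequencies and keep the most frequent words."""
    counts = Counter(w for s in sentences for w in s.split())
    vocab = {"<pad>": PAD, "<unk>": UNK, "<s>": BOS, "</s>": EOS}
    for word, _ in counts.most_common(max_words):
        vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Map a sentence to index lists, wrapped in BOS/EOS markers."""
    return [BOS] + [vocab.get(w, UNK) for w in sentence.split()] + [EOS]

corpus = ["the cat sat", "the dog sat"]
vocab = build_vocab(corpus)
print(encode("the cat ran", vocab))  # out-of-vocabulary "ran" maps to UNK
```

The resulting integer sequences are what would typically be padded to fixed length and stored as HDF5 datasets for training.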

Training:

Training requires directories for saving the model's snapshots and the TensorBoard events:

> mkdir -p save events

To train a model with the parameters defined in config.yaml:

> python nmt.py -c config.yaml 
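For orientation, a minimal config.yaml might look like the fragment below. Every key name here is a hypothetical placeholder for illustration; the authoritative option names and defaults are defined in options/opts.py.

```yaml
# Hypothetical example only -- see options/opts.py for the real option names.
modelname: save/wmt14_baseline   # where snapshots are written
batch_size: 64
learning_rate: 0.001
max_epochs: 20
```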

See options/opts.py for the full list of options.

To evaluate a model:

> python eval.py -c config

To submit jobs via OAR, use either train.sh or select_train.sh.
