lianqing01/transformer-cbn

The code is copied from https://github.com/sIncerass/powernorm.

Requirements and Installation

The fairseq library we use requires PyTorch version >= 1.2.0. Please follow the official PyTorch installation instructions.
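For example, one common way to get a compatible build is via conda (a sketch; pick the variant matching your CUDA version):

$ conda install pytorch torchvision -c pytorch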

After PyTorch is installed, you can install fairseq with:

conda env create --file env.yml
python setup.py build develop
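To sanity-check the editable install (an optional step, not part of the original instructions):

$ python -c "import fairseq; print(fairseq.__version__)"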

Reproduction

Prepare the training data:

The scripts for training and testing are located in the trans-scripts folder. To reproduce the results in Table 1, first preprocess the data into fairseq's binarized format (see the fairseq translation examples and the sketch below), or use the data we provide in the next section, then run the training commands that follow.
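A minimal data-preparation sketch, following the standard upstream fairseq IWSLT14 recipe (exact paths and script locations in this fork may differ):

# download, tokenize, and apply BPE (script from fairseq's examples/translation)
$ bash prepare-iwslt14.sh
# binarize for fairseq
$ TEXT=iwslt14.tokenized.de-en
$ fairseq-preprocess --source-lang de --target-lang en \
    --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
    --destdir data-bin/iwslt14.tokenized.de-en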

# IWSLT14 De-En
## To train the model
./trans-scripts/train/train-iwslt14.sh encoder_norm_self_attn encoder_norm_ffn decoder_norm_self_attn decoder_norm_ffn
Examples:
$ CUDA_VISIBLE_DEVICES=0 ./trans-scripts/train/train-iwslt14.sh power power layer layer
$ CUDA_VISIBLE_DEVICES=0 ./trans-scripts/train/train-iwslt14.sh batch batch layer layer
$ CUDA_VISIBLE_DEVICES=0 ./trans-scripts/train/train-iwslt14.sh layer layer layer layer

For CBN (I could only train CBN stably when it was used in the encoder self-attention sublayers, with layer norm everywhere else; any other placement led to model collapse even after tuning the gradient clip):
$ CUDA_VISIBLE_DEVICES=0 ./trans-scripts/train/train-iwslt14_cbn.sh cbn layer layer layer

To tune the CBN hyper-parameters, edit the script trans-scripts/train/train-iwslt14_cbn.sh, which sets:
  lr: 0.00015
  lr-cbn: 0.00015
  weight-decay_cbn: 1.
  cbn-loss-weight: 0.1
  gradient-clip: 0.1
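To evaluate a trained checkpoint, the standard fairseq generation command should work (a sketch; the test scripts under trans-scripts likely wrap something similar, and the checkpoint path here is illustrative):

$ fairseq-generate data-bin/iwslt14.tokenized.de-en \
    --path checkpoints/checkpoint_best.pt \
    --batch-size 128 --beam 5 --remove-bpe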
