CopyNet

This is an implementation of CopyNet, which extends encoder-decoder models so that they can generate output sequences containing "out of vocabulary" tokens that are present in the input sequence.
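To make the idea of copying out-of-vocabulary tokens concrete, here is a minimal sketch of one common way to index them: input tokens missing from the fixed vocabulary get temporary ids just past the end of the vocabulary, so a target token that also appears in the input can be expressed as a "copy" id. The function names, the toy vocabulary, and the assumption that vocabulary ids are contiguous from 0 are all hypothetical and for illustration only; this is not necessarily how the code in this repository represents them.

```python
def build_extended_ids(input_tokens, vocab):
    """Map input tokens to ids; tokens outside `vocab` get temporary ids
    starting at len(vocab) (assumes vocab ids are 0..len(vocab)-1)."""
    oov_ids = {}
    ids = []
    for tok in input_tokens:
        if tok in vocab:
            ids.append(vocab[tok])
        else:
            if tok not in oov_ids:
                oov_ids[tok] = len(vocab) + len(oov_ids)
            ids.append(oov_ids[tok])
    return ids, oov_ids


def encode_target(target_tokens, vocab, oov_ids, unk_id):
    """Encode the target, reusing temporary ids for tokens copied from the input.
    Tokens found in neither the vocabulary nor the input fall back to unk_id."""
    return [vocab.get(tok, oov_ids.get(tok, unk_id)) for tok in target_tokens]


# Hypothetical toy vocabulary; "Zorblat" is deliberately missing from it.
vocab = {"<unk>": 0, "my": 1, "name": 2, "is": 3, "hello": 4}

input_ids, oov_ids = build_extended_ids("my name is Zorblat".split(), vocab)
target_ids = encode_target("hello Zorblat".split(), vocab, oov_ids, vocab["<unk>"])
```

In this sketch "Zorblat" receives a temporary id from the input side, and the target encoding reuses it instead of collapsing the token to `<unk>`.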

Dependencies: pytorch, numpy, tensorboardX (for logging), tqdm (for logging), spacy (for tokenization)

The model is trained on sequence pairs. Create a directory to hold the training files. Each file should contain two lines of text: the first is the input sequence and the second is the target output sequence. The tokens in each sequence should be separated by spaces. I used spacy to tokenize the training data, so the SequencePairDataset class and the evaluation methods assume that spacy will be used. If you want to use a different tokenizer, be sure to update those files accordingly.
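As a rough illustration of the file layout described above, the sketch below writes one training file with two space-separated lines and reads it back. The helper names `write_pair_file` and `read_pair_file`, the file name `pair_0.txt`, and the temporary directory are assumptions made for this example; they are not part of this repository. The tokens are assumed to be already tokenized (for instance with spacy).

```python
import tempfile
from pathlib import Path


def write_pair_file(directory, name, input_tokens, target_tokens):
    """Write one training file: line 1 is the input, line 2 is the target."""
    path = Path(directory) / name
    text = " ".join(input_tokens) + "\n" + " ".join(target_tokens) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


def read_pair_file(path):
    """Read a training file back into (input_tokens, target_tokens)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    if len(lines) != 2:
        raise ValueError(f"expected exactly 2 lines in {path}, found {len(lines)}")
    return lines[0].split(" "), lines[1].split(" ")


# Hypothetical example pair written to a temporary directory.
data_dir = tempfile.mkdtemp()
path = write_pair_file(
    data_dir,
    "pair_0.txt",
    ["my", "name", "is", "Zorblat"],
    ["hello", "Zorblat"],
)
pair = read_pair_file(path)
```

Checking that each file has exactly two lines up front makes malformed training files fail loudly instead of producing misaligned pairs.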

Train the model using the train.py script. Most hyperparameters can be tuned with command line arguments, which are documented in the training script.

