PyTorch implementation of the NAACL 2018 paper "A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents" (original TensorFlow code).
The original authors provide two datasets of long documents:
ArXiv dataset: Download
PubMed dataset: Download
We also provide a small try_out.zip file for testing purposes.
- Python 3.7
- PyTorch 1.0.1
- pyrouge
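Before running anything, it can help to confirm that the listed packages are importable. The following is a minimal sketch, not part of this repository: the helper name `check_requirements` is hypothetical, and it assumes the packages are importable under the module names `torch` and `pyrouge`.

```python
import importlib


def check_requirements(modules=("torch", "pyrouge")):
    """Map each module name to its version string, "unknown", or None if missing.

    Hypothetical helper for illustration; the default module names are an
    assumption about how the requirements above are imported.
    """
    found = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module is not installed in the current environment.
            found[name] = None
            continue
        # Not every package exposes __version__, so fall back to "unknown".
        found[name] = getattr(module, "__version__", "unknown")
    return found
```

Calling `check_requirements()` returns a dictionary in which a `None` value marks a package that still needs to be installed.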
python src/main.py -data pubmed -save_dir SAVE_DIR
python src/decode.py -data pubmed -mode decode -train_from models/MODEL_PATH
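The two commands above (the first appears to train a model, the second to decode with a saved one) can also be launched from a Python script. This is a minimal sketch under stated assumptions: the helper names `build_train_cmd` and `build_decode_cmd` are hypothetical, only the flags shown above are used, and `pubmed` is the only `-data` value confirmed by this README.

```python
import subprocess
import sys


def build_train_cmd(data, save_dir):
    """Argument list for the src/main.py command shown above (hypothetical helper)."""
    return [sys.executable, "src/main.py", "-data", data, "-save_dir", save_dir]


def build_decode_cmd(data, model_path):
    """Argument list for the src/decode.py command shown above (hypothetical helper)."""
    return [
        sys.executable, "src/decode.py",
        "-data", data,
        "-mode", "decode",
        "-train_from", model_path,
    ]


def run(cmd):
    """Run a command from the repository root and raise if it exits with an error."""
    subprocess.run(cmd, check=True)
```

For example, `run(build_train_cmd("pubmed", "SAVE_DIR"))` issues the same invocation as the first command, using the current Python interpreter.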