Skip to content

MaksymDel/span_ae

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Span Based AutoEncoder for sentences

This repository implements Span Based AutoEncoder for text data.

The model is structured as follows:
For each sentence we:

  1. tokenize and embed source words
  2. compute contextualized word embeddings by feeding embedded sentence to bidirectional LSTM
  3. extract all possible spans of width up to N from the encoder outputs (embeddings from step 2)
  4. prune some spans based on FeedForward network scorer that trains end-to-end with the rest of the model
    (top K spans are left after this step)
  5. try to reconstruct the sentence with decoder LSTM which attends to the remaining spans from the step 4

The code is written using allennlp library. Follow the installation procedure from the allennlp website (available via pip)
To run this code simply execute:
python -m allennlp.run train configs/experiment_gpu.json --serialization-dir models/baseline --include-package span_ae

About

Span based AutoEncoder for sentences

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published