sustain-seq2seq

This repo is a playground for seq2seq models with PyTorch.


  • Tokenization that covers BPE and GPT-2 (from pytorch_transformers) in a single Lookup object. Full tests are required here, as many problems came from mismatched maps and out-of-range ints in the lookup.

    Encoding for BPE:

    • X and y are bordered with <BOS> and <EOS> and padded with <PAD> for the rest

    Encoding for GPT2:

    • X and y are bordered with the <|endoftext|> id on both ends (both bos and eos map to this string) and padded with <PAD> for the rest; the decoder should stop if <|endoftext|> is generated at index > 1. See the encoding sketch below.
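    A minimal sketch of the two conventions above, assuming integer token ids. The function and parameter names are illustrative, not the repo's actual Lookup API:

    ```python
    def encode_bpe(ids, max_len, bos_id, eos_id, pad_id):
        # BPE convention: border with <BOS>/<EOS>, then pad with <PAD> up to max_len.
        seq = [bos_id] + ids[: max_len - 2] + [eos_id]
        return seq + [pad_id] * (max_len - len(seq))

    def encode_gpt2(ids, max_len, endoftext_id, pad_id):
        # GPT-2 convention: <|endoftext|> serves as both bos and eos, then pad with <PAD>.
        seq = [endoftext_id] + ids[: max_len - 2] + [endoftext_id]
        return seq + [pad_id] * (max_len - len(seq))

    # encode_bpe([5, 6, 7], max_len=6, bos_id=1, eos_id=2, pad_id=0) -> [1, 5, 6, 7, 2, 0]
    ```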

Models that need to work:

  • LSTMEncoder + LSTMDecoder with Attention (an attention step is sketched after this list)
  • GPT2Encoder + LSTMDecoder with Attention
  • LSTMEncoder + LSTMDecoder with Attention, Pointer Generator & Coverage
  • GPT2Encoder + LSTMDecoder with Attention, Pointer Generator & Coverage
  • GPT2Encoder + GPT2Decoder with Pointer Generator & Coverage
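As an illustration only (not the repo's code), the attention step shared by the LSTM-decoder models could look roughly like this, assuming batch-first tensors and dot-product scoring:

```python
import torch
import torch.nn.functional as F

def attention_step(decoder_hidden, encoder_outputs, encoder_mask):
    """decoder_hidden: (batch, hidden); encoder_outputs: (batch, src_len, hidden);
    encoder_mask: (batch, src_len) bool, True for non-<PAD> positions."""
    # Dot-product scores between the decoder state and every encoder position.
    scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)
    scores = scores.masked_fill(~encoder_mask, float("-inf"))  # ignore <PAD> positions
    weights = F.softmax(scores, dim=1)                         # (batch, src_len)
    # Weighted sum of encoder outputs -> context vector fed to the decoder.
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
    return context, weights
```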

Other stuff that needs to be done:

  • Look at validation measures again (BLEU, METEOR, ROUGE)
  • Implement all attention types (low priority)
  • Experiment with multihead attention for RNNs
  • Beam search and/or top-k/top-p sampling as in pytorch_transformers (a filtering sketch follows this list)
  • Check attention masks are working everywhere
  • Optimizer: learning rate scheduling, superconvergence, warm restarts, and cyclical LR. Implement the scheduler (partially done, needs more testing).
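
For the beam search / sampling item, here is a minimal top-k / top-p filtering sketch over next-token logits, in the spirit of the generation utilities that ship with pytorch_transformers but not the library's code; names and defaults are illustrative:

```python
import torch
import torch.nn.functional as F

def filter_logits(logits, top_k=0, top_p=0.0, filter_value=float("-inf")):
    # logits: (vocab,) scores for the next token; returns filtered logits.
    if top_k > 0:
        # Keep only the top_k highest-scoring tokens.
        kth_best = torch.topk(logits, top_k)[0][-1]
        logits = torch.where(logits < kth_best, torch.full_like(logits, filter_value), logits)
    if top_p > 0.0:
        # Keep the smallest set of tokens whose cumulative probability exceeds top_p.
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        cum_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
        remove = cum_probs > top_p
        remove[1:] = remove[:-1].clone()  # shift right so the first token crossing the threshold is kept
        remove[0] = False
        logits[sorted_idx[remove]] = filter_value
    return logits
```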
