This repository contains an example implementation of the Transformer, as published in the paper "Attention Is All You Need". This implementation is based on the PyTorch implementation by OpenNMT.
The code for the transformer can be found in the `transformer` directory.
Apart from the transformer code, this repository contains an example that shows how to use the transformer in your own code. The example is a simple copy task that the transformer can learn. If you want to try the example, do the following:
- Create a new virtual environment using Python 3 (preferably > 3.6). Python 2 might also work, but this is not guaranteed.
- Install all requirements with `pip install -r requirements.txt`.
- Start the training with `python train_copy.py`. If you want to use a GPU for training, you can specify it with the `-g` flag. For further options, type `python train_copy.py -h`.
- Wait a few minutes and you should get a transformer that can copy digits.
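To make concrete what the copy task asks of the model, here is a minimal, hypothetical sketch of how such training data could be generated (the function name and shapes are illustrative assumptions, not code from this repository): the target sequence is simply the source sequence, so the model succeeds when it reproduces its input.

```python
import random

def make_copy_batch(batch_size, seq_len, vocab_size=10):
    """Hypothetical sketch: generate random digit sequences for a copy task.

    The target of each example is identical to its source, so a model
    trained on these batches learns to reproduce its input.
    """
    sources = [[random.randrange(vocab_size) for _ in range(seq_len)]
               for _ in range(batch_size)]
    targets = [list(seq) for seq in sources]  # copy task: target == source
    return sources, targets

src, tgt = make_copy_batch(batch_size=4, seq_len=8)
```

A trained transformer should map each sequence in `src` back to itself, which is easy to verify by comparing the model output against `tgt`.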