Transformer

This project implements the Transformer architecture from the "Attention Is All You Need" paper.

We present a detailed guide to the transformer concepts and the required Keras functionality/tricks (please see Guide.md and Walkthrough.md).

We hope that the software and the documentation will help the community create and automate custom Keras layers/models/solutions more robustly and quickly.

We also share a machine translation demo which can be set up via a DSL.
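As a rough illustration, a run might be declared along the following lines. This is a hypothetical sketch only: the keys below are invented for illustration and are not the repo's actual DSL schema (please see the demo's configuration files for the real format).

```python
# Hypothetical sketch -- these keys are illustrative, NOT the repo's real schema.
training_definition = {
    "model": {
        "num_layers": 2,
        "num_heads": 8,
        "d_model": 256,
        "dropout_rate": 0.1,
    },
    "training": {
        "epochs": 10,
        "batch_size": 64,
        "checkpoint_dir": "artifacts/checkpoints",
    },
    "decoding": {
        "strategy": "beam_search",
        "beam_width": 4,
    },
}
```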

We would appreciate any contribution :)

  • It would be lovely if large models could be trained and the results shared.
    • We couldn't test this ourselves due to limited computational resources.
  • Although we ran formal and informal tests during development, we would appreciate more testing.
    • Any suggestions and requests are also welcome.
  • Please also contribute to the docs if you have other Keras tricks or alternative approaches.

Key Contributions

  • Model training and decoding can be defined via a DSL; execution is therefore automated (see the configuration sketch above).
  • Abstracts training basics (checkpointing, artifact management, etc.) away from custom trainers.
  • Keeps a definite interface for the generators (the so-called inner generators) feeding the models (see the sketch after this list).
    • Users may provide their own custom generators (the so-called outer generators) to integrate any data source.
    • Thus, we separate the ingestion logic from the ML/DL architectures/models.
  • Abstraction (for training basics, generators, preprocessors, decoding, callbacks, etc.) yields definite interfaces, which eases automation.
  • Custom layers are reusable and clear.
  • Many interesting/hard (Keras) problems are solved, several of which are sketched after this list:
    • Parameter tying
      • Tying the embedders and the projector together.
      • Handling the save/load case!
      • Parametrized call usage, allowing a layer to behave conditionally.
    • Domain-agnostic training with inner/outer generators.
    • Model and callback serialization.
    • Multi-head attention on the same tensor.
    • Positional encoding (the interleaving trick).
    • Padding/additional masking when computing attention.
    • Custom loss with a custom padding mask.
    • Subword embeddings.
    • Beam search decoding.
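A minimal sketch of the inner/outer generator split, assuming a trivial preprocessor; the real interface in this repo is richer (batching, preprocessing, artifact handling):

```python
def outer_generator():
    # User-supplied: yields raw samples from any data source (files, DBs, APIs...).
    while True:
        yield {"source": "hello world", "target": "hallo welt"}

def inner_generator(outer_gen, preprocessor, batch_size):
    # Framework-side: a fixed interface that turns raw samples into
    # model-ready batches, independent of where the data comes from.
    while True:
        raw_batch = [next(outer_gen) for _ in range(batch_size)]
        yield preprocessor(raw_batch)

# Usage: any ingestion logic plugs in by swapping the outer generator.
batches = inner_generator(outer_generator(), preprocessor=lambda b: b, batch_size=32)
```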
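A minimal sketch of tying the embedder and the projector in a single layer, using a parametrized call to switch behavior. This illustrates the idea only; the repo's actual layer also handles the save/load case.

```python
import tensorflow as tf
from tensorflow import keras

class TiedEmbedderProjector(keras.layers.Layer):
    """One weight matrix serves both as token embedder and output projector."""

    def __init__(self, vocab_size, d_model, **kwargs):
        super().__init__(**kwargs)
        self.vocab_size = vocab_size
        self.d_model = d_model

    def build(self, input_shape):
        self.shared_weights = self.add_weight(
            name="shared_weights",
            shape=(self.vocab_size, self.d_model),
            initializer="glorot_uniform",
        )

    def call(self, inputs, mode="embed"):
        if mode == "embed":
            # Token ids -> embedding vectors.
            return tf.gather(self.shared_weights, tf.cast(inputs, tf.int32))
        # mode == "project": hidden states -> vocabulary logits (tied weights).
        return tf.matmul(inputs, self.shared_weights, transpose_b=True)
```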
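A minimal sketch of scaled dot-product attention with an additive padding mask; attending a tensor to itself simply passes that tensor as query, key, and value. This shows the underlying trick, not the repo's exact multi-head layer.

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    # mask: 1 where attending is allowed, 0 for padded (or otherwise blocked) keys.
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(d_k)
    if mask is not None:
        # A large negative bias drives masked positions to ~0 after the softmax.
        scores += (1.0 - tf.cast(mask, scores.dtype)) * -1e9
    weights = tf.nn.softmax(scores, axis=-1)
    return tf.matmul(weights, v)

# Self-attention: the same tensor plays query, key and value.
x = tf.random.normal((2, 5, 16))
mask = tf.constant([[1, 1, 1, 0, 0], [1, 1, 1, 1, 0]], dtype=tf.float32)  # 0 = padding
mask = mask[:, tf.newaxis, :]  # broadcast over query positions
out = scaled_dot_product_attention(x, x, x, mask)
```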
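A minimal sketch of the interleaving trick for sinusoidal positional encodings: compute sin/cos on half the channels, stack them on a new axis, and reshape so even channels carry sin and odd channels carry cos (see Guide.md for the repo's own discussion).

```python
import numpy as np

def positional_encoding(max_len, d_model):
    positions = np.arange(max_len)[:, np.newaxis]                    # (max_len, 1)
    rates = np.exp(-np.log(10000.0) * np.arange(0, d_model, 2) / d_model)
    angles = positions * rates                                       # (max_len, d_model/2)
    # Interleaving trick: stack sin/cos on a new last axis and flatten,
    # yielding [sin, cos, sin, cos, ...] along the channel dimension.
    interleaved = np.stack([np.sin(angles), np.cos(angles)], axis=-1)
    return interleaved.reshape(max_len, d_model)
```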
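A minimal sketch of a custom loss with a padding mask, assuming pad id 0 and logits as predictions; padded target positions contribute nothing to the loss.

```python
import tensorflow as tf

def masked_sparse_crossentropy(y_true, y_pred, pad_id=0):
    # Per-position cross-entropy: y_true (batch, time), y_pred (batch, time, vocab).
    loss = tf.keras.losses.sparse_categorical_crossentropy(
        y_true, y_pred, from_logits=True)
    mask = tf.cast(tf.not_equal(y_true, pad_id), loss.dtype)
    # Average only over the non-padded positions.
    return tf.reduce_sum(loss * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)
```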
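A minimal beam search sketch over an abstract `step_fn` (a hypothetical callable returning next-token log-probabilities for a given prefix); the repo's decoder is more elaborate.

```python
import numpy as np

def beam_search(step_fn, start_id, end_id, beam_width=4, max_len=50):
    # step_fn(prefix) -> log-probabilities over the vocabulary for the next token.
    beams = [([start_id], 0.0)]
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == end_id:
                candidates.append((prefix, score))   # finished beams carry over
                continue
            log_probs = step_fn(prefix)
            for token in np.argsort(log_probs)[-beam_width:]:
                candidates.append((prefix + [int(token)], score + log_probs[token]))
        # Keep only the best `beam_width` hypotheses by accumulated log-prob.
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
        if all(prefix[-1] == end_id for prefix, _ in beams):
            break
    return beams[0][0]
```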
