
vision-transformers-pytorch

Implementation of various Vision Transformers (and other vision models) I found interesting

Models

Currently I have implemented:

- Implemented.
- Implemented. Currently testing.
- NFNet: Tested and got 83.17 top-1 accuracy with NFNet-F0.
- Pyramid Vision Transformer (https://arxiv.org/abs/2102.12122): Tested and got 78.94 top-1 accuracy with PVT-Small.
- Tested and got 82.192 on top-1. Re-experimenting with random erasing.
- Implemented.
- Tested and got 82.862 on top-1 @ 300px, 83.2 on top-1 @ 380px.
- Implemented.

Usage

I'm currently using an LMDB version of the ILSVRC 2012 (ImageNet) dataset, which is built with:

python preprocess.py [IMAGENET_PATH] [train/val]

I think just using torchvision.datasets would be better; I will switch to it later.
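
For reference, a minimal sketch of what that torchvision.datasets alternative could look like; the transforms, batch size, and worker count here are assumptions for illustration, not this repo's actual training pipeline:

```python
# Hypothetical sketch of loading ImageNet with torchvision instead of LMDB.
# Transform values and DataLoader settings are assumptions, not this repo's defaults.
import torch
from torchvision import datasets, transforms

train_transform = transforms.Compose(
    [
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
        ),
    ]
)

# ImageNet laid out as [IMAGENET_PATH]/train/<class>/<image>.JPEG works with ImageFolder.
train_set = datasets.ImageFolder("[IMAGENET_PATH]/train", transform=train_transform)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=256, shuffle=True, num_workers=8, pin_memory=True
)
```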

Then you can run training:

python train.py --conf [CONFIG FILE] --n_gpu [NUMBER OF GPUS] [Config overrides in the form of key=value]
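
The overrides are plain key=value pairs that patch entries of the config file. As a rough illustration of the idea (this is not the actual config loader used by train.py, and the dotted keys below are made up), applying such overrides to a nested config could look like:

```python
# Hypothetical sketch of applying key=value overrides to a nested config dict;
# not the repo's actual loader, and the example keys are invented.
import ast

def apply_overrides(conf: dict, overrides: list) -> dict:
    for item in overrides:
        key, _, raw = item.partition("=")
        node = conf
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        try:
            node[leaf] = ast.literal_eval(raw)  # numbers, bools, lists, ...
        except (ValueError, SyntaxError):
            node[leaf] = raw  # leave anything else as a plain string
    return conf

conf = {"training": {"lr": 0.1, "batch_size": 128}}
print(apply_overrides(conf, ["training.lr=0.05", "training.batch_size=256"]))
# {'training': {'lr': 0.05, 'batch_size': 256}}
```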
