kokookok77/show-attend-and-tell-pytorch
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention in PyTorch

This repository contains a PyTorch implementation of Show, Attend and Tell.
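The core component of Show, Attend and Tell is soft (deterministic) attention: at each decoding step, the decoder scores every spatial location of the CNN feature map and takes a weighted average of the feature vectors as the context for the next word. A minimal NumPy sketch of one such step (all names and shapes below are illustrative, not taken from this repository's code):

```python
import numpy as np

def soft_attention(features, hidden, W_f, W_h, w_a):
    """One soft-attention step, as in Show, Attend and Tell.

    features : (L, D)  CNN feature vectors, one per spatial location
    hidden   : (H,)    current decoder hidden state
    W_f      : (D, A)  projects features into the attention space
    W_h      : (H, A)  projects the hidden state into the attention space
    w_a      : (A,)    scores each projected location
    Returns the attention weights (L,) and the context vector (D,).
    """
    # e_i = w_a . tanh(W_f f_i + W_h h): one scalar score per location
    scores = np.tanh(features @ W_f + hidden @ W_h) @ w_a   # (L,)
    # alpha = softmax(e): weights are positive and sum to 1
    scores -= scores.max()                                  # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()           # (L,)
    # z = sum_i alpha_i * f_i: expected feature vector, fed to the decoder LSTM
    context = alpha @ features                              # (D,)
    return alpha, context

rng = np.random.default_rng(0)
L, D, H, A = 196, 512, 256, 128   # e.g. a 14x14 grid of conv features
alpha, z = soft_attention(rng.standard_normal((L, D)),
                          rng.standard_normal(H),
                          rng.standard_normal((D, A)) * 0.1,
                          rng.standard_normal((H, A)) * 0.1,
                          rng.standard_normal(A))
print(alpha.shape, z.shape)
```

The attention weights `alpha` are also what the paper visualizes as heat maps over the image for each generated word.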

How to run

To train the model from scratch, use the following command.

python main.py

To resume training from an existing checkpoint, use the following command.

python main.py --model_path MODEL_PATH

To generate a caption for an image, use the following command.

python main.py --test --model_path MODEL_PATH --image_path IMAGE_PATH

Lastly, to download the required data (currently Flickr8k and GloVe), use the '--download' argument.

python main.py --download

Results

Flickr8k dataset

The following examples were generated after training on Google Colaboratory for less than 7 hours. The training captions are lemmatized, so the generated captions are lemmatized as well. As a result, the generated captions are not complete English sentences, but they are still interpretable. (Lemmatization helps training when resources are limited, because it reduces the vocabulary size.)
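To illustrate why lemmatization shrinks the vocabulary, here is a toy example with a hand-written lemma table (the repository presumably uses a real lemmatizer such as NLTK's; the table and captions below are purely illustrative):

```python
# Toy lemma table -- a real lemmatizer (e.g. NLTK's WordNetLemmatizer)
# covers far more inflected forms than this.
LEMMAS = {"dogs": "dog", "running": "run", "runs": "run",
          "ran": "run", "jumped": "jump", "jumping": "jump"}

def lemmatize(tokens):
    """Map each token to its lemma, leaving unknown words unchanged."""
    return [LEMMAS.get(t, t) for t in tokens]

captions = [
    "two dogs running on grass".split(),
    "a dog runs and jumped".split(),
    "dog jumping over a log".split(),
]

raw_vocab = {t for c in captions for t in c}
lem_vocab = {t for c in captions for t in lemmatize(c)}
print(len(raw_vocab), len(lem_vocab))  # the lemmatized vocabulary is smaller
```

Collapsing "dogs"/"dog" and "running"/"runs"/"jumped"/"jumping" into their lemmas shrinks even this tiny vocabulary, and the effect compounds on a full caption dataset.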

(A lot of examples have dogs, because dogs are cute!)

  • Correct examples

(images: correct1 – correct7)

  • Not 100% correct, but not totally wrong examples

(images: not_correct1 – not_correct3)

  • Wrong examples

(images: wrong, wrong2)
