GitHub - xiaonanChong96/image_captioning: Tensorflow implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

This neural image captioning system is roughly based on the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention" by Xu et al. (ICML 2015). It is implemented in TensorFlow and supports end-to-end training of both the CNN and RNN parts. To use it, you will need a TensorFlow version of the VGG16 or ResNet 50/101/152 model, which can be obtained with Caffe-to-Tensorflow.

The code is now compatible with TensorFlow r1.4.
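To make the "visual attention" idea concrete, here is a minimal sketch of one soft-attention step in the style described by Xu et al. It is written in plain Python for readability and is not taken from this repository's TensorFlow code. The parameter names (`W_a`, `W_h`, `v`), the single-layer tanh scoring function, and the toy values are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in M]


def soft_attention(features, hidden, W_a, W_h, v):
    """Return (weights, context) for one decoding step.

    features: L region vectors of size D (a flattened CNN feature map)
    hidden:   previous RNN state of size H
    W_a (K x D), W_h (K x H), v (K): hypothetical scoring parameters
    """
    h_proj = matvec(W_h, hidden)
    scores = []
    for a in features:
        a_proj = matvec(W_a, a)
        # Score each region against the current RNN state.
        act = [math.tanh(x + y) for x, y in zip(a_proj, h_proj)]
        scores.append(dot(v, act))
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the attention-weighted average of the regions.
    dim = len(features[0])
    context = [sum(w * a[d] for w, a in zip(weights, features))
               for d in range(dim)]
    return weights, context


# Toy example: 3 image regions, 2-d features, 2-d hidden state.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [0.5, -0.5]
W_a = [[1.0, 0.0], [0.0, 1.0]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
v = [1.0, 1.0]
weights, context = soft_attention(features, hidden, W_a, W_h, v)
```

At each step the RNN would consume `context` together with the previous word embedding. Because the weights are produced by a differentiable softmax, gradients can flow back into the CNN features, which is what makes end-to-end training of both parts possible.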

References

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio. ICML 2015.
The original implementation in Theano
An earlier implementation in Tensorflow
Microsoft COCO dataset
Caffe to Tensorflow

Repository contents (42 commits):

models
test/images
tfmodels
train
utils
val
words
LICENSE.md
README.md
base_model.py
base_model.pyc
dataset.py
dataset.pyc
eval.sh
main.py
model.py
model.pyc
