serhii-havrylov/ShowAndTell

"For millions of years mankind lived just like the animals. Then something happened which unleashed the power of our imagination: we learned to talk."

This project reproduces the model from the paper Show and Tell: A Neural Image Caption Generator.

Image features are the outputs of the relu7 layer of the VGG network, which you can download here. Remove the drop7, fc8, and prob layers from the .prototxt file so that relu7 is the last layer.
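
If you prefer not to edit the .prototxt by hand, the layer removal can be scripted. The following is only a minimal sketch, not part of this repository: the function name `strip_layers` is hypothetical, and it assumes the network definition is written as top-level `layer { ... }` or `layers { ... }` blocks, each with a `name: "..."` field. Check the result against your own file before using it.

```python
import re


def strip_layers(prototxt_text, names_to_remove):
    """Return the prototxt text without the layer blocks named in names_to_remove.

    Assumption: layers appear as top-level `layer {` or `layers {` blocks
    that each contain a `name: "..."` field.
    """
    block_start = re.compile(r'\blayers?\s*\{')
    name_field = re.compile(r'\bname\s*:\s*"([^"]+)"')
    kept = []
    pos = 0
    end = len(prototxt_text)
    while pos < end:
        match = block_start.search(prototxt_text, pos)
        if match is None:
            # No more layer blocks: keep the remainder unchanged.
            kept.append(prototxt_text[pos:])
            break
        # Keep everything between the previous block and this one.
        kept.append(prototxt_text[pos:match.start()])
        # Walk forward to the brace that closes this block.
        depth = 0
        close = match.end() - 1  # index of the opening '{'
        while close < end:
            char = prototxt_text[close]
            if char == '{':
                depth += 1
            elif char == '}':
                depth -= 1
                if depth == 0:
                    break
            close += 1
        block = prototxt_text[match.start():close + 1]
        name = name_field.search(block)
        if name is None or name.group(1) not in names_to_remove:
            kept.append(block)
        pos = close + 1
    return "".join(kept)
```

A call such as `strip_layers(text, {"drop7", "fc8", "prob"})` returns the definition with those three blocks dropped. Blank lines left behind by removed blocks are not cleaned up.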

You can download the prepared training and validation data from my google drive, or you can reproduce the image/text feature extraction pipeline as follows:

  1. Download the datasets
  2. Run the Python scripts that generate the files storing the image paths and their corresponding captions
    • run data_preparation/flickr/flickr8k/build_image_text_match.py
    • run data_preparation/flickr/flickr30k/build_image_text_match.py
    • run data_preparation/mscoco/build_image_text_match.py
  3. Run the Python scripts that generate the files storing the image features
    • run data_preparation/flickr/extract_features.py
    • run data_preparation/mscoco/extract_features.py
  4. Run the Python scripts that generate the training and validation data
    • run data_preparation/merge_all_data.py
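
Steps 2 to 4 above can be chained from a small driver script. This is a hedged sketch rather than something shipped with the repository: `run_pipeline` and its parameters are hypothetical, and it assumes that each script takes no command-line arguments, is launched from the repository root, and reads its dataset paths from its own source. By default it only prints the commands, so you can check the order before running anything.

```python
import subprocess
import sys

# Scripts in the order given by steps 2-4 of the list above.
PIPELINE = [
    "data_preparation/flickr/flickr8k/build_image_text_match.py",
    "data_preparation/flickr/flickr30k/build_image_text_match.py",
    "data_preparation/mscoco/build_image_text_match.py",
    "data_preparation/flickr/extract_features.py",
    "data_preparation/mscoco/extract_features.py",
    "data_preparation/merge_all_data.py",
]


def run_pipeline(interpreter=sys.executable, dry_run=True):
    """Print, and optionally execute, the data preparation scripts in order.

    Assumption: the scripts need no arguments. With dry_run=False, the first
    failing script raises CalledProcessError and stops the pipeline.
    """
    commands = [[interpreter, script] for script in PIPELINE]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
    return commands
```

Call `run_pipeline(dry_run=False)` once the datasets are downloaded. Pass a different `interpreter` if the scripts need a specific Python installation, for example the one that has pycaffe available.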

To train the model, run caption_generation_model/train.py, or download the pretrained model from my google drive.

To use the pretrained model, run the minimal Flask app caption_generation_server/app.py. (Note: it requires Caffe and its Python interface, pycaffe, to be installed.)
