
🎥Image2Caption🔤: Upload an image and let the model generate a caption for you🤖.


sd2001/Image2Caption


🤖Auto Caption Generation for Images📸


Image Captioning is the task of generating a textual description of an image. It combines Natural Language Processing and Computer Vision to produce the captions. A Deep Learning pipeline built from a CNN (to encode the image) and LSTMs (to generate the text) is a common way to solve this problem.
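A minimal sketch of such a CNN-LSTM "merge" captioning model in Keras. The sizes here (4096-dim image features, a 5000-word vocabulary, captions up to 34 tokens) are illustrative assumptions, not the repo's exact hyperparameters:

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

vocab_size, max_len = 5000, 34  # hypothetical sizes

# Image branch: pre-extracted CNN features -> dense projection.
img_in = Input(shape=(4096,))
img_feat = Dense(256, activation="relu")(Dropout(0.5)(img_in))

# Text branch: partial caption -> embedding -> LSTM state.
txt_in = Input(shape=(max_len,))
emb = Embedding(vocab_size, 256, mask_zero=True)(txt_in)
txt_feat = LSTM(256)(Dropout(0.5)(emb))

# Merge both branches and predict the next word of the caption.
merged = Dense(256, activation="relu")(add([img_feat, txt_feat]))
out = Dense(vocab_size, activation="softmax")(merged)
model = Model(inputs=[img_in, txt_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

At inference time the model is called repeatedly: the caption generated so far is fed back in until it emits an end-of-sequence token.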

⌛Output that we get💻

The captions show noticeable bias, since the training data wasn't big enough!

🖐️LET'S TAKE A QUICK DIVE INTO THIS BIT OF MAGIC!😇

Datasets commonly used for this task:

  • Flickr8k (8k images)
  • Flickr30k (30k images)
  • MS COCO (180k images), etc.

Point to Note:

Here I have used the Flickr8k dataset, based on the availability of standard computational resources. This dataset fits comfortably in 8GB of RAM and takes about 25 mins/epoch to train on a CPU. Flickr30k and MS COCO may need about 32GB-64GB of RAM depending on how they're processed. Consider using an AWS EC2 workstation for the best and fastest results. It's paid, though😞!

General Architecture

Model Architecture (VGG16 + LSTMs)

We remove the last 2 layers of VGG16 and feed the resulting image features into the network 👇
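This feature-extraction step can be sketched in Keras as follows (an assumption about the implementation; the repo's exact code may differ). `weights=None` merely keeps the sketch lightweight; in practice you would pass `weights="imagenet"` to get the pretrained features:

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model

# Build VGG16 and drop its last two layers (the 1000-way softmax and
# the final fully connected layer), so each image maps to a 4096-dim
# feature vector instead of a class prediction.
# Use weights="imagenet" in practice; weights=None avoids the large
# pretrained-weight download for this sketch.
base = VGG16(weights=None)
extractor = Model(inputs=base.inputs, outputs=base.layers[-3].output)
```

These 4096-dim vectors are typically precomputed once for the whole dataset and cached, since the CNN is frozen during caption training.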

📊Data that we feed into the Network!📁
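The network is trained on (image features, partial caption) → next-word pairs: every prefix of a caption becomes one training input whose target is the following word. A minimal sketch of expanding one caption into such pairs (the helper `make_training_pairs` and the toy vocabulary are illustrative, not from the repo):

```python
def make_training_pairs(caption_tokens, word2idx, max_len):
    """Expand one caption into (padded input sequence, next-word id) pairs."""
    seq = [word2idx[w] for w in caption_tokens]
    pairs = []
    for i in range(1, len(seq)):
        in_seq = seq[:i]
        # Pre-pad with 0 so every input has the same fixed length,
        # mirroring the default behaviour of Keras' pad_sequences.
        padded = [0] * (max_len - len(in_seq)) + in_seq
        pairs.append((padded, seq[i]))
    return pairs

# Toy vocabulary; index 0 is reserved for padding.
word2idx = {"startseq": 1, "a": 2, "dog": 3, "runs": 4, "endseq": 5}
pairs = make_training_pairs(
    ["startseq", "a", "dog", "runs", "endseq"], word2idx, max_len=5
)
print(pairs[0])  # ([0, 0, 0, 0, 1], 2): the start token alone predicts "a"
```

Each such pair is fed to the model alongside the image's CNN feature vector, with the target word one-hot encoded over the vocabulary.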


