RNN Caption Generator

This project was carried out by Arnab Karmakar, Abhinav Paul and Samvram Sahu in fulfilment of the course requirements of AV494 - Deep Learning for Computational Data Sciences, in the seventh semester at the Indian Institute of Space Science and Technology. The implementation and theory are discussed below.

The code was run in an online Colab notebook for faster computation and to save local resources.

Flow

We use the VGG16 model, which has been pre-trained for classifying images. Instead of using its last classification layer, we take the output of the layer before it. This gives us a vector with 4096 elements that summarizes the image contents, similar to how a "thought-vector" summarizes the contents of an input text in Tutorial #21 on language translation. We use this vector as the initial state of the Gated Recurrent Units (GRU). However, the internal state size of the GRU is only 512, so we need an intermediate fully-connected (dense) layer to map the vector with 4096 elements down to a vector with only 512 elements.
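The 4096-to-512 mapping described above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's code: the names `make_dense` and `dense_tanh` are hypothetical, the weights are random rather than trained, and the tanh activation is an assumption (chosen here because it keeps values in the range -1 to 1, like a GRU state).

```python
import math
import random

IMAGE_FEATURES = 4096  # length of the VGG16 feature vector described above
STATE_SIZE = 512       # internal state size of the GRU


def make_dense(n_in, n_out, seed=0):
    """Return random (weights, bias) for a dense layer; a trained model would learn these."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def dense_tanh(x, weights, bias):
    """Compute tanh(W x + b) for a single input vector x (tanh is an assumption)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


# Stand-in for the VGG16 output; real values would come from the pre-trained network.
rng = random.Random(1)
image_vector = [rng.random() for _ in range(IMAGE_FEATURES)]

weights, bias = make_dense(IMAGE_FEATURES, STATE_SIZE)
initial_state = dense_tanh(image_vector, weights, bias)
print(len(initial_state))  # 512
```

In a deep-learning framework this would be a single trainable dense layer placed between the image model and the GRU; the pure-Python version only makes the shapes explicit.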

The decoder then uses this initial state together with a start-marker "ssss" to begin producing output words. In the first iteration it will hopefully output the word "big". We then feed this word back into the decoder and hopefully get the word "brown" out, and so on. Eventually we have generated the text "big brown bear sitting eeee", where "eeee" marks the end of the text.
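The word-by-word loop described above can be sketched as follows. This is an illustrative sketch under stated assumptions: `toy_decoder_step` is a hypothetical stand-in that replays the example caption from a lookup table, whereas the real decoder is a trained GRU network, and `generate_caption` and `max_tokens` are names introduced here, not taken from the project.

```python
START_MARKER = "ssss"
END_MARKER = "eeee"


def toy_decoder_step(previous_word, state):
    """Hypothetical stand-in for one decoder step: returns (next_word, new_state)."""
    next_word = {
        "ssss": "big",
        "big": "brown",
        "brown": "bear",
        "bear": "sitting",
        "sitting": "eeee",
    }
    return next_word[previous_word], state


def generate_caption(decoder_step, initial_state, max_tokens=30):
    """Feed each predicted word back in until the end-marker or max_tokens is reached."""
    word, state = START_MARKER, initial_state
    words = []
    for _ in range(max_tokens):
        word, state = decoder_step(word, state)
        if word == END_MARKER:
            break
        words.append(word)
    return " ".join(words)


caption = generate_caption(toy_decoder_step, initial_state=None)
print(caption)  # big brown bear sitting
```

The sketch drops the "eeee" marker from the returned string, and the `max_tokens` cap guards against a decoder that never emits the end-marker.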

Results

Of the various input images used to test the trained model, we show six images taken in and around IIST. Screenshots of the results follow.

Future work

The dataset is specific to the North American continent, and the model is not an attention-based RNN. Future work could therefore train the model on a much larger, well-labelled dataset. We look forward to extending our horizons. We also acknowledge the guidance of Dr. Deepak Mishra, without which we could not have completed this assignment.
