katha-shah/iksha


Iksha - Surrounding Description Model for Visually Impaired

Describes what it sees

Introduction

Iksha is an application designed for people with visual impairments, who often need assistance to perceive their surroundings. Many conventional solutions already exist for this purpose. In this project we use A.I. to build a description model that describes the user's surroundings in a simple, plain-English sentence.

We approach the problem with machine learning and divide the solution into two parts: (i) feature extraction and (ii) a language model. For feature extraction, a neural network processes each image and produces its feature vector; we use transfer learning to apply a pre-trained model to our input set. The language model uses NLP techniques to generate meaningful sentences in plain English.

Dataset

FLICKR30k

  • Contains 30,000 images with English captions, of which 1,000 images are held out for validation and 1,000 for testing
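
The split described above can be sketched as follows. This is a minimal illustration, not the project's actual data loader: the function name `split_dataset`, the seeded random shuffle, and the placeholder file names are all assumptions, since the README does not say how the validation and test images were chosen.

```python
import random


def split_dataset(image_ids, n_val=1000, n_test=1000, seed=0):
    """Shuffle image IDs and hold out fixed-size validation and test sets."""
    ids = list(image_ids)
    if n_val + n_test > len(ids):
        raise ValueError("hold-out sets are larger than the dataset")
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(ids)
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]
    return train, val, test


# Placeholder IDs standing in for the real image file names (hypothetical).
image_ids = [f"img_{i:05d}.jpg" for i in range(30000)]
train_ids, val_ids, test_ids = split_dataset(image_ids)
```

With 30,000 images this leaves 28,000 for training. Splitting by image ID rather than by caption keeps every caption of an image in the same subset.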

Architecture

Encoder

• The CNN acts as the encoder.
• CNNs are widely used to extract image features for object detection and image classification. We use transfer learning with a pre-trained Inception v3 network to obtain features for the images in the dataset.
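
One way to obtain such features is sketched below, assuming TensorFlow/Keras is installed and ImageNet weights are acceptable. The helper names `build_encoder` and `encode_image` are hypothetical, and the choice of global average pooling is an assumption; the README does not state which Inception v3 layer the project reads its features from.

```python
INPUT_SIZE = (299, 299)  # input resolution Inception v3 expects
FEATURE_DIM = 2048       # length of the globally pooled Inception v3 output


def build_encoder():
    """Return Inception v3 without its classification head (assumed setup)."""
    import tensorflow as tf  # imported lazily; assumed to be installed

    return tf.keras.applications.InceptionV3(
        weights="imagenet", include_top=False, pooling="avg"
    )


def encode_image(encoder, image_path):
    """Load one image and return its feature vector of length FEATURE_DIM."""
    import tensorflow as tf

    img = tf.keras.utils.load_img(image_path, target_size=INPUT_SIZE)
    x = tf.keras.utils.img_to_array(img)[None, ...]  # add a batch dimension
    # Scale pixels the same way the pre-trained network was trained.
    x = tf.keras.applications.inception_v3.preprocess_input(x)
    return encoder.predict(x, verbose=0)[0]
```

Because the pre-trained weights stay fixed, the features for each image can be computed once and cached before the decoder is trained.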

Decoder

• The decoder is a deep bidirectional LSTM.
• Language modelling is done at the word level.
• The first time step receives the encoded output from the encoder and also the vector.
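
To illustrate what word-level language modelling means for the training data, the sketch below turns a caption into (previous words, next word) pairs. It is an assumption about the data preparation, not the project's code: the `<start>` and `<end>` marker tokens, whitespace tokenisation, and the function names are all hypothetical.

```python
from collections import Counter

START, END = "<start>", "<end>"  # assumed marker tokens


def tokenize(caption):
    """Lower-case a caption, split on whitespace, and add the markers."""
    return [START] + caption.lower().split() + [END]


def build_vocab(captions, min_count=1):
    """Map each sufficiently frequent word to an integer ID (0 is padding)."""
    counts = Counter(w for c in captions for w in tokenize(c))
    words = sorted(w for w, n in counts.items() if n >= min_count)
    return {w: i + 1 for i, w in enumerate(words)}


def training_pairs(caption, vocab):
    """Return (prefix IDs, next-word ID) pairs for one caption."""
    ids = [vocab[w] for w in tokenize(caption)]
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]


captions = ["a dog runs", "a dog sleeps"]
vocab = build_vocab(captions)
pairs = training_pairs("a dog runs", vocab)
```

Each pair asks the decoder to predict one word given the words before it, so a caption of three words yields four pairs once the end marker is included.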

Results

After training the model for 70 epochs, we attained a BLEU score of about 0.56, which is reasonable given the limited training dataset and computation power.
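
For readers unfamiliar with the metric, here is a minimal sentence-level BLEU sketch with clipped n-gram precision and a brevity penalty. It is not the evaluation script behind the score above: the README does not say which n-gram order, smoothing, or corpus-level averaging was used, so the uniform weights and lack of smoothing here are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform n-gram weights and no smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        total = sum(cand.values())
        # Clip each n-gram count by its highest count in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # The brevity penalty uses the reference length closest to the candidate.
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda n_ref: (abs(n_ref - c), n_ref))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)


candidate = "a dog runs in the park".split()
reference = "a dog runs through the park".split()
score = bleu(candidate, [reference], max_n=2)
```

A caption identical to its reference scores 1.0, and a caption sharing no words with any reference scores 0.0.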
Describes without error
image

Describes with minor errors
image

Somewhat related to the image
image

Completely unrelated
image
