Iksha - Surrounding Description Model for Visually Impaired

Describes what it sees

Introduction

An application designed for people with vision impairment, who often need assistance to perceive their surroundings. Several conventional solutions already exist to help them. In this project we use A.I. to build a description model that describes the user's surroundings in a simple, plain English sentence.

We use machine learning algorithms to solve this problem, dividing the solution into two parts: (i) feature extraction and (ii) a language model. For feature extraction, we use a neural network to process the images and obtain a feature vector for each one, applying transfer learning so that a pre-trained model can be reused on our input set. The language model uses NLP concepts to generate meaningful sentences in plain English.

Dataset*

• FLICKR30k

Contains 30,000 images with English captions. Of these, 1,000 images are held out for validation and 1,000 for testing.
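
The split described above can be sketched as follows. This is a minimal illustration, not the project's own loader: the function name `split_dataset`, the random shuffle, the seed, and the placeholder image ids are all assumptions, and the actual split may follow a fixed published list instead.

```python
import random

def split_dataset(image_ids, n_val=1000, n_test=1000, seed=0):
    """Shuffle image ids and hold out fixed-size validation and test sets."""
    ids = list(image_ids)
    if n_val + n_test > len(ids):
        raise ValueError("hold-out sets are larger than the dataset")
    # A seeded shuffle keeps the split reproducible across runs (assumption).
    random.Random(seed).shuffle(ids)
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]
    return train, val, test

# Hypothetical ids standing in for the 30,000 image file names.
image_ids = [f"img_{i:05d}" for i in range(30000)]
train_ids, val_ids, test_ids = split_dataset(image_ids)
print(len(train_ids), len(val_ids), len(test_ids))  # 28000 1000 1000
```

With these numbers, 28,000 images remain for training.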

Architecture

Encoder

• The CNN acts as the encoder.
• CNNs are widely used to extract image features for object detection and image classification. Transfer learning with a pre-trained Inception v3 network is used to obtain the features of the images in the dataset.

Decoder

• The decoder is a deep bidirectional LSTM.
• Language modelling is done at the word level.
• The first time step receives the encoded output from the encoder and also the vector.
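
Word-level language modelling means each caption is turned into a sequence of word indices before it reaches the decoder. The sketch below shows one common way to do that. The special tokens (`<pad>`, `<start>`, `<end>`, `<unk>`), the function names, and the sample captions are assumptions for illustration and are not taken from the project's `vocab.py`.

```python
from collections import Counter

# Special tokens are assumptions for this sketch, not taken from the project.
PAD, START, END, UNK = "<pad>", "<start>", "<end>", "<unk>"

def build_vocab(captions, min_count=1):
    """Map every word seen at least min_count times to an integer index."""
    counts = Counter(word for c in captions for word in c.lower().split())
    words = sorted(w for w, n in counts.items() if n >= min_count)
    return {w: i for i, w in enumerate([PAD, START, END, UNK] + words)}

def encode(caption, vocab):
    """Turn a caption into word indices wrapped in start and end markers."""
    body = [vocab.get(w, vocab[UNK]) for w in caption.lower().split()]
    return [vocab[START]] + body + [vocab[END]]

# Hypothetical captions; the word "sand" is unseen and maps to <unk>.
captions = ["A dog runs on the grass", "A child plays on the beach"]
vocab = build_vocab(captions)
print(encode("A dog plays on the sand", vocab))
```

Raising `min_count` drops rare words from the vocabulary, which keeps the decoder's output layer small at the cost of more `<unk>` tokens.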

Results

After training the model for 70 epochs, we attained a BLEU score of about 0.56, which is a good result given the limited training dataset and computation power.

The generated descriptions fall into four categories:
• Describes without error
• Describes with minor errors
• Somewhat related to the image
• Completely unrelated
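
For readers unfamiliar with the BLEU score reported above, the following sketch computes a simplified sentence-level BLEU against a single reference caption. It is an illustration only: the source does not say which BLEU variant, n-gram order, or number of references was used, so the 4-gram default, the lack of smoothing, and the sample sentences are assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, with no smoothing."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

# Hypothetical generated caption and reference caption.
print(round(bleu("a dog runs on the sand", "a dog runs on the grass"), 3))
```

A score of 1.0 means the candidate matches the reference exactly, while a caption sharing no words with the reference scores 0.0.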

