Code for master thesis on Zero-Shot Learning in multi-label scenarios


xuepo99/Msc_Multi_label_ZeroShot

 
 


Zero-Shot MultiLabel

Code for the master's thesis on zero-shot classification with multi-label data.

Abstract

Visual recognition systems are often limited to the object categories they were trained on and thus scale poorly. This is in part due to the difficulty of acquiring sufficient labeled images as the number of object categories grows. To address this, earlier research has presented models that use other sources, such as text data, to help classify object categories unseen during training. However, most of these models are limited to images with a single label, yet many images contain more than one object category, and therefore more than one label. This master's thesis implements a model capable of classifying unseen categories for both single- and multi-labeled images.

The architecture consists of several modules: a pre-trained neural network that generates image features for each image, a model trained on text that represents words as vectors, and a neural network that projects the image features into the dimension native to the vector representation of words. On this architecture, we compared two approaches to generating word vectors, GloVe and Word2vec, with different vector dimensions and on spaces containing different numbers of word vectors. The model was adapted to multi-label prediction by comparing three approaches for image box generation: YOLOv2, Faster R-CNN, and randomly generated boxes. Each box represents a cropped section of the image, and this approach was chosen to fit each label to one of these boxes.
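
As a rough illustration of the pipeline above (not the thesis code itself), the following NumPy sketch projects a CNN image feature into word-vector space and picks the closest label by cosine similarity. The dimensions, the label set, and the single linear projection layer are all illustrative assumptions; the thesis trains a neural network for this mapping.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: 2048-d image features (e.g. from a pre-trained CNN)
# projected down to 300-d word vectors (e.g. w2v_wiki_300D).
FEAT_DIM, WORD_DIM = 2048, 300

# Stand-in for the trained projection network (a single linear layer here).
W = rng.standard_normal((FEAT_DIM, WORD_DIM)) / np.sqrt(FEAT_DIM)

# Stand-in word-vector space: one vector per category label.
labels = ["dog", "cat", "car"]
word_vecs = rng.standard_normal((len(labels), WORD_DIM))

def predict_label(image_feature):
    """Project an image feature into word-vector space and return the
    label whose word vector is closest by cosine similarity."""
    v = image_feature @ W
    v = v / np.linalg.norm(v)
    wv = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    sims = wv @ v
    return labels[int(np.argmax(sims))]

feat = rng.standard_normal(FEAT_DIM)
print(predict_label(feat))  # one of the labels above
```

Because the word vectors for unseen categories exist in the same space, the same nearest-neighbor lookup extends to categories never shown during training, which is what makes the zero-shot setting possible.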

The results showed that increasing the word vector dimension increased accuracy, with Word2vec outperforming GloVe, while adding more words to the word vector space decreased accuracy. In the single-label scenario the model achieves results similar to existing models with comparable architectures. In the multi-label scenario, the model trained on boxes generated by Faster R-CNN and evaluated on randomly generated boxes achieved the highest accuracy, but was not able to outperform comparative alternatives. The architecture gives promising results, but more investigation is needed to determine whether the results can be improved further.

Dependencies

Usage

Object detection frameworks

Downloadables

Before training and testing

  • Download the pre-trained language model vectors.
  • Use py-faster-rcnn or YOLO to compute region-of-interest boxes.
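
For the first step, here is a minimal sketch of loading pre-trained vectors from a plain-text file in GloVe's layout (one word per line followed by its components). The helper name and the demo file are illustrative assumptions, not the repository's actual loader.

```python
import os
import tempfile

import numpy as np

def load_word_vectors(path):
    """Load word vectors from a whitespace-separated text file
    (one word per line followed by its vector components)."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

# Tiny demo file in the same plain-text layout.
demo = "dog 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(demo)
    tmp = f.name
vecs = load_word_vectors(tmp)
os.remove(tmp)
print(vecs["dog"])  # [0.1 0.2 0.3]
```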

Train Zero-Shot model

Single-label data

python tools/train_brute_force.py --imdb <dataset> --lm <language_model> (e.g. w2v_wiki_300D) --loss squared_hinge --iters 10000

Multi-label data

python tools/train_ml_brute_force.py --imdb <dataset> --lm <language_model> (e.g. w2v_wiki_300D) --loss squared_hinge --model <ZSL_model> (pre-trained on single-label data) --boxes (random, frcnn or yolo) --iters 10000

Test Zero-Shot model

Single-label data

python tools/test_brute_force.py --lm glove_wiki_300 --imdb imagenet_zs --ckpt output/train_bts/model_glove_wiki_300.hdf5 --singlelabel_predict

Multi-label data

python tools/test_brute_force.py --lm glove_wiki_300 --imdb imagenet_zs --ckpt output/train_bts/model_glove_wiki_300.hdf5 --boxes faster_rcnn
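
The multi-label setting described in the abstract fits each label to one of the generated boxes. The NumPy sketch below shows one plausible scoring rule: score each label by its best-matching box and keep labels above a similarity threshold. The threshold, shapes, and label set are assumptions for illustration, not the thesis implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
WORD_DIM = 300

labels = ["dog", "cat", "car"]
word_vecs = rng.standard_normal((len(labels), WORD_DIM))
word_vecs /= np.linalg.norm(word_vecs, axis=1, keepdims=True)

# Projected features for each candidate box (from Faster R-CNN, YOLO, or
# random box generation), already mapped into word-vector space.
box_vecs = rng.standard_normal((5, WORD_DIM))
box_vecs /= np.linalg.norm(box_vecs, axis=1, keepdims=True)

def multilabel_predict(box_vecs, word_vecs, labels, threshold=0.0):
    """Score each label by its best-matching box (max cosine similarity)
    and return all labels whose best score exceeds the threshold."""
    sims = word_vecs @ box_vecs.T   # (num_labels, num_boxes)
    best = sims.max(axis=1)         # best box per label
    return [l for l, s in zip(labels, best) if s > threshold]

print(multilabel_predict(box_vecs, word_vecs, labels))
```

Taking the maximum over boxes lets each label attach to whichever region supports it best, so an image can yield several labels at once, one per well-matched box.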
