Contains code for training and testing CNNs for multi-label image classification with various multi-label loss functions (Softmax, Sigmoid, Pairwise Ranking, WARP, and LSEP), implemented in TensorFlow. The codebase follows TensorFlow (v1.3)'s slim-based image classification tutorial and adds custom loss functions for multi-label targets.
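For reference, the pairwise ranking and LSEP objectives can be sketched in NumPy for a single image. This is a simplification: the repository's TensorFlow implementations operate on batches, and WARP additionally weights each pair by an estimated rank.

```python
import numpy as np

def pairwise_ranking_loss(scores, labels, margin=1.0):
    """Hinge loss over all (positive, negative) label pairs of one image."""
    pos = scores[labels == 1]                     # scores of relevant labels
    neg = scores[labels == 0]                     # scores of irrelevant labels
    diffs = margin + neg[None, :] - pos[:, None]  # margin + f_neg - f_pos per pair
    return np.maximum(0.0, diffs).sum()

def lsep_loss(scores, labels):
    """Log-sum-exp pairwise (LSEP) loss: a smooth surrogate of the above."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    return np.log1p(np.exp(neg[None, :] - pos[:, None]).sum())

scores = np.array([0.2, -0.1, 0.5, 0.9])   # classifier scores for 4 classes
labels = np.array([1, 0, 1, 0])            # ground-truth multi-label vector
print(pairwise_ranking_loss(scores, labels))  # ≈ 4.2
print(lsep_loss(scores, labels))
```

A perfectly ranked image (every positive scored above every negative by at least the margin) contributes zero to the hinge loss, while LSEP stays smooth and nonzero everywhere, which makes it easier to optimize.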
- TensorFlow 1.3
- TensorFlow Slim
- Python 2
The 'data' folder contains the train/test splits of the NUS-WIDE dataset as an example. For the images and other relevant details, please refer to each dataset's page.
Extract CNN features of images from models such as VGG, Inception, or ResNet and save them to a .mat file. Run the following, changing arguments as needed.
dataset_dir=/home/ayushi/Git/research/dataset/nuswide/images/Flickr
checkpoint_path=../data/pretrained/vgg_16.ckpt
eval_file_image_list=../data/nuswide/nus1_train_list.txt
eval_file_image_features=../data/nuswide/net-vgg16/nus1_train_vgg16.mat
python extract.py \
--dataset_dir=${dataset_dir} \
--model_name=vgg_16 \
--checkpoint_path=${checkpoint_path} \
--bottleneck_scope=PreLogitsFlatten \
--checkpoint_exclude_scopes=vgg_16/fc8 \
--eval_file_image_list=${eval_file_image_list} \
--eval_file_image_features=${eval_file_image_features} \
--num_classes=81 \
--bottleneck_shape=4096 \
--batch_size=10
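The saved features can then be loaded in Python with scipy.io.loadmat. The variable name used inside the .mat file ('features' in this round-trip sketch) is an assumption; inspect your file's keys to confirm it.

```python
import numpy as np
from scipy.io import loadmat, savemat

# Round-trip sketch: write features the way extract.py might (the key name
# 'features' is an assumption), then read them back for the classifier stage.
feats = np.random.rand(10, 4096).astype(np.float32)  # 10 images x vgg_16 fc7 dim
savemat('demo_features.mat', {'features': feats})

loaded = loadmat('demo_features.mat')['features']
print(loaded.shape)  # (10, 4096)
```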
where dataset_dir is the directory containing all the dataset images, checkpoint_path is the pretrained checkpoint file (downloadable from TensorFlow's checkpoint releases), eval_file_image_list is the list of image names, and eval_file_image_features is the .mat file where the extracted features will be saved.
To train only the classifier, a single fully connected (fc) layer can be trained on the extracted CNN features. Extract the CNN features as shown above, then run the following, changing arguments as needed.
DATASET_DIR=../data/coco/
TRAIN_DIR=../data/coco/caffe-res1-101/sigmoid_logits/
CHECKPOINT_PATH=../data/coco/caffe-res1-101/sigmoid_logits/
train_file_image_features=../data/coco/caffe-res1-101/coco_train_r101.mat
train_file_image_annotations=../data/coco/coco_train_annot.txt
eval_file_image_features=../data/coco/caffe-res1-101/coco_train_r101.mat
eval_file_image_annotations=../data/coco/coco_train_annot.txt
eval_file_image_scores=../data/coco/caffe-res1-101/sigmoid_logits/coco_train_r101_pred_1.mat
python logits.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=coco \
--dataset_split_name=train \
--bottleneck_shape=2048 \
--loss=sigmoid \
--train_file_image_features=${train_file_image_features} \
--train_file_image_annotations=${train_file_image_annotations} \
--eval_file_image_features=${eval_file_image_features} \
--eval_file_image_annotations=${eval_file_image_annotations} \
--eval_file_image_scores=${eval_file_image_scores} \
--run_opt=extract \
--max_number_of_epochs=20 \
--learning_rate=0.001 \
--weight_decay=0.0005 \
--batch_size=100 \
--optimizer=rmsprop \
--topK=3
where run_opt is train or extract for the training and testing modes respectively; loss can be any of the multi-label losses (softmax/sigmoid/ranking/warp/lsep); and eval_file_image_scores is the .mat file where the classifier predictions will be saved.
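With --topK=3, the classifier's top three scored labels per image serve as its predictions. A NumPy sketch of precision@K on a saved score matrix (the exact metric printed by the eval scripts may differ) could look like:

```python
import numpy as np

def precision_at_k(scores, labels, k=3):
    """Fraction of the k highest-scored labels per image that are correct."""
    topk = np.argsort(-scores, axis=1)[:, :k]        # indices of top-k scores
    hits = np.take_along_axis(labels, topk, axis=1)  # 1 where a top-k label is true
    return hits.mean()

scores = np.array([[0.9, 0.1, 0.8, 0.3],
                   [0.2, 0.7, 0.1, 0.6]])
labels = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0]])
print(precision_at_k(scores, labels, k=3))  # 0.5
```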
When testing the CNN, the performance metrics on the test dataset are printed. Refer to the 'eval' folder for the evaluation code and helper scripts.
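One standard multi-label metric is mean average precision (mAP): the per-class average precision, averaged over classes. A NumPy sketch (the eval scripts may compute additional or slightly different metrics):

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one class: mean precision at the rank of each true positive."""
    order = np.argsort(-scores)                 # rank images by score, descending
    hits = labels[order]
    precisions = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return precisions[hits == 1].mean()

# mAP averages AP over the class columns of an images x classes matrix.
scores = np.array([[0.9, 0.2], [0.4, 0.8], [0.6, 0.5]])
labels = np.array([[0, 1], [1, 1], [1, 0]])
mAP = np.mean([average_precision(scores[:, c], labels[:, c]) for c in range(2)])
print(mAP)  # 17/24 ≈ 0.708
```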
Following TensorFlow's conventions, the dataset images and corresponding labels are saved in .tfrecord format. Refer to the convert_nuswide.py script in the datasets folder for an example of how this is done for the NUS-WIDE dataset, and run:
python datasets/download_and_convert_data.py --dataset_name=nuswide --dataset_dir=./data/nuswide
To train, run the following, changing arguments as needed.
DATASET_DIR=../data/nuswide/
TRAIN_DIR=../data/nuswide/net-incep-v4/
CHECKPOINT_PATH=../data/pretrained/inception_v4.ckpt
python train.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=nuswide \
--dataset_split_name=train \
--model_name=inception_v4 \
--checkpoint_path=${CHECKPOINT_PATH} \
--checkpoint_exclude_scopes=InceptionV4/Logits,InceptionV4/AuxLogits \
--trainable_scopes=InceptionV4/Logits,InceptionV4/AuxLogits \
--batch_size=5 \
--loss=softmax
where dataset_dir is the directory containing the tfrecord subdirectory with all the tfrecord train and test files; train_dir is the directory where the trained models will be saved; and checkpoint_path is the pretrained checkpoint file (downloadable from TensorFlow's checkpoint releases). Which network nodes are finetuned can be controlled with trainable_scopes and checkpoint_exclude_scopes; loss can be any of the multi-label losses (softmax/sigmoid/ranking/warp/lsep).
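For reference, the softmax and sigmoid variants are commonly formulated for multi-label targets as below. This NumPy sketch assumes the usual definitions (softmax cross-entropy against the label vector normalized to sum to one, and independent per-class binary cross-entropy) rather than mirroring the repository's TensorFlow code line for line.

```python
import numpy as np

def softmax_multilabel_loss(logits, labels):
    """Cross-entropy against the normalized label distribution."""
    z = logits - logits.max(axis=1, keepdims=True)        # stabilize exponentials
    log_softmax = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    target = labels / labels.sum(axis=1, keepdims=True)   # [1,0,1,0] -> [.5,0,.5,0]
    return -(target * log_softmax).sum(axis=1).mean()

def sigmoid_multilabel_loss(logits, labels):
    """Independent per-class binary cross-entropy (numerically stable form)."""
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits)))).mean()

logits = np.array([[2.0, -1.0, 0.5, -0.5]])
labels = np.array([[1.0, 0.0, 1.0, 0.0]])
print(softmax_multilabel_loss(logits, labels))
print(sigmoid_multilabel_loss(logits, labels))
```

The sigmoid form treats each class as an independent binary problem, which is why it is a common default for multi-label data; the softmax form instead forces the predicted probabilities to compete across classes.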