deepcv

Can we make computer vision work like our eyes?

Prepare data

$ python main.py --config=config/dataset/pascal_voc.cfg \
                 --app=dataset \
                 --dataset_name=voc
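
The converted records should end up under the dataset cache that the training commands below read from (cache/dataset/ in this README's examples); a minimal sanity check, assuming that layout:

$ ls -R cache/dataset/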

Classification

classify

$ FILE_PATH=~/dataset/test.jpg
$ python main.py --config=config/vgg/vgg_16.cfg \
                 --gpu=True \
                 --app=classifier \
                 --task=classify \
                 --file=${FILE_PATH}    # or --file_url=<image URL>
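
For an image that is not on disk, the classifier also accepts --file_url in place of --file; a sketch with a placeholder URL:

$ python main.py --config=config/vgg/vgg_16.cfg \
                 --gpu=True \
                 --app=classifier \
                 --task=classify \
                 --file_url=http://example.com/test.jpg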

train a classification model from scratch

$ DATASET_DIR=cache/dataset/imagenet
$ LOG_DIR=cache/log/vgg_16
$ python train_classifier.py --model_name=vgg_16 \
                             --log_dir=${LOG_DIR} \
                             --dataset_dir=${DATASET_DIR} \
                             --dataset_name=imagenet \
                             --dataset_split_name=train

fine-tune a classification model from an existing checkpoint

$ DATASET_DIR=cache/dataset/imagenet
$ LOG_DIR=cache/log/vgg_16
$ CHECKPOINT_PATH=cache/weight/vgg_16.ckpt
$ python train_classifier.py --model_name=vgg_16 \
                             --log_dir=${LOG_DIR} \
                             --dataset_dir=${DATASET_DIR} \
                             --dataset_name=imagenet \
                             --dataset_split_name=train \
                             --checkpoint_path=${CHECKPOINT_PATH}
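
The vgg_16.ckpt referenced above is one of the TF-Slim checkpoints listed under Pre-trained Models below. A sketch for fetching it, assuming the standard TF-Slim download location and the cache/weight/ layout used in this example:

$ mkdir -p cache/weight
$ wget -P cache/weight http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz
$ tar -xzf cache/weight/vgg_16_2016_08_28.tar.gz -C cache/weight    # extracts vgg_16.ckpt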

Detection

protobuf compilation

From the deepcv/ root directory, run:

$ protoc model/detection/protos/*.proto --python_out=.
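
If the compilation succeeds, protoc writes a *_pb2.py module next to each .proto file, which gives a quick way to verify the step:

$ ls model/detection/protos/*_pb2.py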

detect

$ FILE_PATH=~/dataset/test.jpg
$ python main.py  --config=config/ssd/ssd_v1.cfg \
                  --app=detector \
                  --task=detect \
                  --file=$FILE_PATH

train a detection model

$ PIPELINE_CONFIG_PATH=config/train_detection/faster_rcnn_resnet101_voc007.config
$ python main.py --log_dir=cache/log/faster_rcnn/resnet/voc007 \
                 --pipeline_config_path=${PIPELINE_CONFIG_PATH}

Visualize

visualize training procedure

$ LOGDIR=cache/log/ssd/mobilenet_v1/pet
$ tensorboard --logdir=${LOGDIR}
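
TensorBoard serves on port 6006 by default; to pick the port explicitly, something like the following works (the log directory here is just the one from the detection training example above):

$ LOGDIR=cache/log/faster_rcnn/resnet/voc007
$ tensorboard --logdir=${LOGDIR} --port=6006

Then open http://localhost:6006 in a browser.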

Pre-trained Models

Model               | TF-Slim File | Checkpoint                            | Top-1 Accuracy | Top-5 Accuracy
--------------------|--------------|---------------------------------------|----------------|---------------
Inception V1        | Code         | inception_v1_2016_08_28.tar.gz        | 69.8           | 89.6
Inception V2        | Code         | inception_v2_2016_08_28.tar.gz        | 73.9           | 91.8
Inception V3        | Code         | inception_v3_2016_08_28.tar.gz        | 78.0           | 93.9
Inception V4        | Code         | inception_v4_2016_09_09.tar.gz        | 80.2           | 95.2
Inception-ResNet-v2 | Code         | inception_resnet_v2_2016_08_30.tar.gz | 80.4           | 95.3
ResNet 50           | Code         | resnet_v1_50_2016_08_28.tar.gz        | 75.2           | 92.2
ResNet 101          | Code         | resnet_v1_101_2016_08_28.tar.gz       | 76.4           | 92.9
ResNet 152          | Code         | resnet_v1_152_2016_08_28.tar.gz       | 76.8           | 93.2
ResNet V2 200       | Code         | TBA                                   | 79.9*          | 95.2*
VGG 16              | Code         | vgg_16_2016_08_28.tar.gz              | 71.5           | 89.8
VGG 19              | Code         | vgg_19_2016_08_28.tar.gz              | 71.1           | 89.8

Choose the MobileNet model that fits your latency and size budget. The size of the network in memory and on disk is proportional to the number of parameters, while latency and power usage scale with the number of Multiply-Accumulates (MACs), i.e. fused multiply-add operations. These MobileNet models were trained on the ILSVRC-2012-CLS image classification dataset, and the accuracies below were computed with a single image crop. For example, MobileNet_v1_0.50_160 needs less than half the MACs of MobileNet_v1_0.75_160 (77M vs. 162M) at the cost of roughly 5 points of Top-1 accuracy; a small selection sketch follows the table.

Model (Checkpoint)    | Million MACs | Million Parameters | Top-1 Accuracy | Top-5 Accuracy
----------------------|--------------|--------------------|----------------|---------------
MobileNet_v1_1.0_224  | 569          | 4.24               | 70.7           | 89.5
MobileNet_v1_1.0_192  | 418          | 4.24               | 69.3           | 88.9
MobileNet_v1_1.0_160  | 291          | 4.24               | 67.2           | 87.5
MobileNet_v1_1.0_128  | 186          | 4.24               | 64.1           | 85.3
MobileNet_v1_0.75_224 | 317          | 2.59               | 68.4           | 88.2
MobileNet_v1_0.75_192 | 233          | 2.59               | 67.4           | 87.3
MobileNet_v1_0.75_160 | 162          | 2.59               | 65.2           | 86.1
MobileNet_v1_0.75_128 | 104          | 2.59               | 61.8           | 83.6
MobileNet_v1_0.50_224 | 150          | 1.34               | 64.0           | 85.4
MobileNet_v1_0.50_192 | 110          | 1.34               | 62.1           | 84.0
MobileNet_v1_0.50_160 | 77           | 1.34               | 59.9           | 82.5
MobileNet_v1_0.50_128 | 49           | 1.34               | 56.2           | 79.6
MobileNet_v1_0.25_224 | 41           | 0.47               | 50.6           | 75.0
MobileNet_v1_0.25_192 | 34           | 0.47               | 49.0           | 73.6
MobileNet_v1_0.25_160 | 21           | 0.47               | 46.0           | 70.7
MobileNet_v1_0.25_128 | 14           | 0.47               | 41.3           | 66.2
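
As a rough way to shortlist variants for a given budget, the table can be filtered by MAC count and ranked by Top-1 accuracy. A minimal sketch over a few of the rows above (columns: model, million MACs, million parameters, Top-1, Top-5; the 200-million-MAC budget is only an example):

$ awk '$2 <= 200' <<'EOF' | sort -k4 -nr
MobileNet_v1_1.0_128 186 4.24 64.1 85.3
MobileNet_v1_0.75_160 162 2.59 65.2 86.1
MobileNet_v1_0.75_128 104 2.59 61.8 83.6
MobileNet_v1_0.50_224 150 1.34 64.0 85.4
MobileNet_v1_0.50_192 110 1.34 62.1 84.0
EOF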

About

DeepCV extends OpenCV.
