
ICNet Tensorflow

This is a fork of the https://github.com/hellochick/ICNet-tensorflow repository. Some features were added, some removed. I created this fork for myself, but you may find it useful too, because this version is geared more towards training from scratch, while the original repo is geared more towards inference. I tried to build a system that resembles the Tensorflow Models Object Detection API.

ICNet original paper: https://arxiv.org/abs/1704.08545. See also the citation at the end.

Features of this repository

This is only a list of changes; see below, or the code itself, for how to use them.

  • Works with Python 3 now.
  • Added class weights to the training procedure. You can disable them.
  • Added a "--not-restore-last" option to training, so you can now load a pre-trained .npy or .tf checkpoint model and use it with a different number of classes. Before, you could only use 19.
  • Added loss, learning rate and image summaries, so you can now monitor training with Tensorboard.
  • All hardcoded parameters, like input size, are now taken from train.py and not duplicated. Please note that for now inference.py works with a pre-defined input size and does not resize the network to the input image. That gives better performance.
  • You can now evaluate with the "--repeated-eval" option. In that case evaluation runs endlessly and writes mIOU results to the summary, so you can monitor them with Tensorboard.
  • You can now use "--best-models-dir". If it is set, the best model will be zipped on each iteration and saved to that path.
  • Added an "--ignore-zero" option to evaluation. With that option the zero class is ignored for mIOU (see the sketch after this list). It is very useful in tasks where most of the image is taken up by the zero class.
  • Some minor changes and improvements. For example, the size of the validation set is now calculated automatically from the size of list.txt; it's not good to set it by hand every time.
  • "--measure-time" for inference works fine now and gives fairly objective measurements.
  • Added scripts for exporting and quantizing the graph for production usage. This also works similarly to the Tensorflow Object Detection API pipeline.
  • Some features were removed; for example, you can't use evaluate.py with .npy models now. I don't think it's good to use two different formats. You can load from .npy only for training, and after that use only the .tf checkpoint format.
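
As a side note on "--ignore-zero": below is a minimal numpy sketch of how mIOU with an ignored zero class can be computed. It is illustrative only, with made-up names, not the repository's actual evaluation code.

import numpy as np

def mean_iou(pred, label, num_classes, ignore_zero=False):
    # Per-class intersection-over-union, averaged over classes; class 0
    # is skipped when ignore_zero is set (e.g. a dominant background class).
    ious = []
    for c in range(1 if ignore_zero else 0, num_classes):
        inter = np.logical_and(pred == c, label == c).sum()
        union = np.logical_or(pred == c, label == c).sum()
        if union > 0:
            ious.append(float(inter) / union)
    return np.mean(ious)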

Please note that for now this repo is quite imperfect, because it was created for one specific task; by that I mostly mean the interface, and I have tried to adapt it for common segmentation tasks with any number of classes. The main problem now is that half of the parameters are set in hyperparams.py, while the rest must be set via command-line arguments. So for now, please prefer hyperparams.py for changing hyperparameters, especially the input size. Also check the Change list section to be sure you know the latest changes.

Introduction

This is an implementation of ICNet in TensorFlow for semantic segmentation on my own data, so don't forget to change the labels and the number of classes. The weights were first converted from the original Caffe code using the caffe-tensorflow framework.

Change list

2017/12/10

  1. Moved the --measure-time option to the inference.py script; it is more suitable there.
  2. Added export_inference_graph.py, which creates a frozen model from checkpoints. It outputs a .pb file that can be used in production.
  3. Added an example of graph exporting; check export_model.sh. That script runs the whole pipeline: first it creates an unfrozen graph without weights, then a frozen graph with weights, and then, if you wish, a cleaned and quantized graph.

2017/12/01

  1. Added --measure-time for evaluation. Pass it to evaluation to get time measurements instead of validation.
  2. Fixed class weights; that feature should now work much better.
  3. Moved all hyperparameters to one place: hyperparams.py. Control them from this file now.
  4. Added the PSPNet model; you can use train_psp.py to train it. Not all features work for that model.
  5. Added --weighted for inference; with that option the output will be a blend of input and output, which is very illustrative.
  6. Inference now works on a directory (it runs on all images in the dir) if you pass a directory path in the --img-path argument.
  7. Random crop was fully rewritten. Random pad and random scale do not work now (coming soon), but crop works correctly now. In the original it worked only with random scale (incorrectly) and always with padding. You can control the crop rate with minval and maxval in image_reader.py:
h_rate = tf.random_uniform(shape = [1], minval = 0.5, maxval = 1.0, dtype = tf.float32)
w_rate = tf.random_uniform(shape = [1], minval = 0.5, maxval = 1.0, dtype = tf.float32)
  8. Added CROP_MUSTHAVE_CLASS_INDEX to hyperparams. If it is -1, nothing changes. But if you set it to some class index, the crop operation will return only crops where that class is present (at least one pixel, which is usually enough; in the future I will also implement a parameter to control the fraction of pixels belonging to the needed class).
  9. Removed excess image summaries for now (only the original image + network output remain, no branch outputs). I will think about it more in the future.
  10. Added --evaluate-once to evaluation.
  11. Fixed summaries writing; it was incorrect before.

2017/11/23:

  1. Forked by me. Initial changes were made (described above in the features section). Models are now stored in the repository.

2017/11/15:

  1. Training phase is supported; you can train on your own dataset. Please read the guide below.

2017/11/13:

  1. Added the bnnomerge model, which prepares for the training phase. Choose a model using the flag --model=train, train_bn, trainval, or trainval_bn (models uploaded to Google Drive).
  2. Changed tf.nn.batch_normalization to tf.layers.batch_normalization.

2017/11/07:

Every image size larger than 128x256 is now supported by changing the avg pooling ksize and strides in the pyramid module. If the input image size is not divisible by 32, it will be padded to a multiple of 32, as sketched below.
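
For illustration, a tiny sketch of that padding arithmetic (the helper name is mine, not the repository's):

import math

def padded_size(h, w, multiple=32):
    # Round each spatial dimension up to the next multiple of 32.
    return (int(math.ceil(h / float(multiple))) * multiple,
            int(math.ceil(w / float(multiple))) * multiple)

print(padded_size(500, 700))  # (512, 704)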

Install

Get the restore checkpoint from Google Drive and put it into the model directory.

Inference

To get results on your own images, use the following command:

python inference.py --img-path=./input/test.png

Inference time: ~0.02s. I have no idea why it's faster than the caffe implementation.

Evaluation

Performance of the single-scale model on the cityscapes validation dataset:

Model         Accuracy        Missing accuracy
train_30k     65.56% / 67.7%  2.14%
trainval_90k  78.44%          None

If you want to use it for validation during training, use the --repeated-eval parameter. In that case the model will run endlessly and will evaluate once per --eval-interval period whenever a new checkpoint appears. Also, if you want to select the best validated model, you need to set up --best-models-dir.

Example of usage:

python evaluate.py --model other --snapshot-dir=test --repeated-eval --data-list some_valid_list.txt --best-models-dir ./best_models

Image Result

[Input image / output image comparison shown in the repository]

Training on your own dataset

Note: this implementation differs from the details described in the ICNet paper, since I did not reproduce the model compression part. Instead, we train on the half kernels directly.

Step by Step

1. Change DATA_LIST_PATH in hyperparams.py and make sure the list contains the absolute paths of your data files. Each line of list.txt looks like:

/ABSOLUTE/PATH/TO/image /ABSOLUTE/PATH/TO/label
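
If it helps, here is a hypothetical snippet for generating list.txt, assuming images and labels share file names in parallel images/ and labels/ directories (all paths are placeholders):

import glob
import os

with open('list.txt', 'w') as f:
    for img in sorted(glob.glob('/ABSOLUTE/PATH/TO/images/*.png')):
        lbl = os.path.join('/ABSOLUTE/PATH/TO/labels', os.path.basename(img))
        f.write('{} {}\n'.format(img, lbl))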

2. Set Hyperparameters

In hyperparams.py:

#  Mean taken from the Mapillary Vistas dataset
IMG_MEAN = np.array((106.33906592, 116.77648721, 119.91756518), dtype = np.float32)

BATCH_SIZE = 8
DATA_LIST_PATH = '/mnt/Data/Datasets/Segmentation/vistas_no_pp/list.txt'
IGNORE_LABEL = 255
INPUT_SIZE = '800,800'
LEARNING_RATE = 5e-2
MOMENTUM = 0.95
NUM_CLASSES = 3
NUM_STEPS = 100000
POWER = 0.9
RANDOM_SEED = 1234
WEIGHT_DECAY = 0.000001
PRETRAINED_MODEL = './model/icnet_cityscapes_trainval_90k_bnnomerge.npy'
SNAPSHOT_DIR = './snapshots/'
SAVE_NUM_IMAGES = 8
SAVE_PRED_EVERY = 200

USE_CLASS_WEIGHTS = True

If you want to use class weights, you need to pass the --use-class-weights argument and set the class weight values in train.py:

CLASS_WEIGHTS = [1.0, 1.0, 2.0]
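
For intuition, this is roughly how per-class weights can enter a softmax cross-entropy loss in TF1; a sketch with placeholder tensors, not the exact code from train.py:

import tensorflow as tf

CLASS_WEIGHTS = [1.0, 1.0, 2.0]

# Flattened per-pixel ground-truth ids and logits (shapes are illustrative).
labels = tf.placeholder(tf.int32, shape=[None])
logits = tf.placeholder(tf.float32, shape=[None, len(CLASS_WEIGHTS)])

# One weight per pixel, looked up from its ground-truth class, scales the loss.
pixel_weights = tf.gather(tf.constant(CLASS_WEIGHTS), labels)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits,
                                              weights=pixel_weights)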

3. Run the following command and decide whether to update mean/var or train the beta/gamma variables:

python train.py --update-mean-var --train-beta-gamma --random-mirror --use-class-weights --not-restore-last

After training on the dataset, you can run the following command to get the result:

python inference.py --img-path=YOUR_OWN_IMAGE

Example of validation run:

python evaluate.py --snapshot-dir=snapshots --repeated-eval --best-models-dir ./best_models --eval-interval 120 --ignore-zero

Example of the training process in Tensorboard summaries: [Tensorboard screenshot in the repository]

You can also run the model on video with the utils.py script in the dataset directory, but it is a little hardcoded for my own use, so success is on your own.

Export model for production

Check the file export_model.sh for an example of exporting a model.

In that .sh file, please change all paths and names to yours.

Then run it with bash:

bash export_model.sh

The first script converts your model to an inference graph without weights, a so-called unfrozen graph. After that it is converted to a frozen graph, with weights.
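
For reference, freezing boils down to baking checkpoint variables into graph constants. A minimal sketch (the paths are placeholders; the output node name matches the one used in inference.py below):

import tensorflow as tf

with tf.Session() as sess:
    # Restore the trained graph and its weights from a checkpoint.
    saver = tf.train.import_meta_graph('model.ckpt.meta')
    saver.restore(sess, 'model.ckpt')
    # Replace variables with constants so the graph is self-contained.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ['conv6/BiasAdd'])
    with tf.gfile.GFile('frozen_icnet.pb', 'wb') as f:
        f.write(frozen.SerializeToString())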

For an example of how to run that model after exporting, see inference.py:

def load_from_pb(shape, path):
    """Load a frozen .pb graph and build input/output tensors around it."""
    segment_graph = tf.Graph()
    with segment_graph.as_default():
        # Read the serialized frozen graph from disk.
        seg_graph_def = tf.GraphDef()
        with tf.gfile.GFile(path, 'rb') as fid:
            serialized_graph = fid.read()
            seg_graph_def.ParseFromString(serialized_graph)

            # Feed the placeholder through the same preprocessing used at
            # training time (preprocess and check_input live in inference.py).
            x = tf.placeholder(dtype=tf.float32, shape=shape)
            img_tf = preprocess(x)
            img_tf, n_shape = check_input(img_tf)

            # Map the preprocessed image onto the frozen graph's input node.
            tf.import_graph_def(seg_graph_def, {'input:0': img_tf}, name='')

            # Upsample the raw logits back to the input resolution and take
            # the per-pixel argmax to get class ids.
            raw_output = segment_graph.get_tensor_by_name('conv6/BiasAdd:0')
            output = tf.image.resize_bilinear(raw_output, tf.shape(img_tf)[1:3])
            output = tf.argmax(output, axis=3)
            pred = tf.expand_dims(output, axis=3)

            # Limit GPU memory usage so other processes can share the card.
            config = tf.ConfigProto()
            config.gpu_options.per_process_gpu_memory_fraction = 0.4
            config.allow_soft_placement = True
            config.log_device_placement = False

            sess = tf.Session(graph=segment_graph, config=config)

    return sess, pred, x
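
A possible way to use the returned session and tensors (the input size and image loading details are my assumptions; shape must match the fixed size the model was exported with):

import cv2
import numpy as np

sess, pred, x = load_from_pb(shape=[1, 800, 800, 3], path='../frozen_icnet.pb')
img = cv2.resize(cv2.imread('../1.jpg'), (800, 800)).astype(np.float32)
labels = sess.run(pred, feed_dict={x: img[np.newaxis, ...]})
print(labels.shape)  # (1, 800, 800, 1) per-pixel class ids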

You can also run inference.py on that exported .pb model with the --pb-file flag:

python3 inference.py --img-path ../1.jpg --pb-file ../frozen_icnet.pb

You can also try to quantize the weights and transform the graph to a smaller size if you want. A sketch of one way to do that follows.
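
This sketch uses the Python wrapper of the TF1 Graph Transform Tool; the node names follow the frozen graph above, so check your own graph's actual inputs and outputs:

import tensorflow as tf
from tensorflow.tools.graph_transforms import TransformGraph

graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_icnet.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Quantize weights to 8 bits to shrink the .pb file.
quantized = TransformGraph(graph_def, ['input'], ['conv6/BiasAdd'],
                           ['quantize_weights'])

with tf.gfile.GFile('quantized_icnet.pb', 'wb') as f:
    f.write(quantized.SerializeToString())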

Citation

@article{zhao2017icnet,
  author = {Hengshuang Zhao and
            Xiaojuan Qi and
            Xiaoyong Shen and
            Jianping Shi and
            Jiaya Jia},
  title = {ICNet for Real-Time Semantic Segmentation on High-Resolution Images},
  journal={arXiv preprint arXiv:1704.08545},
  year = {2017}
}
