yuanzheng625/deepLab-for-video

DeepLab for Video Semantic Segmentation

sample

The benefit of video segmentation over single-image segmentation is that temporal information can be utilized to localize the parts of an object whose appearance is challenging but whose semantics are consistent. That is why two tires are detected in image E but only one in each of images C and D.

This code is edited from DeepLab (a Python wrapper version, https://github.com/TheLegendAli/DeepLab-Context) by Zheng Yuan. It enables the single-image semantic segmentation algorithm (the original DeepLab) to be applied to video.

Basically, video segmentation can be seen as segmenting two adjacent images together, with a temporal constraint in the CRF.

Single-image segmentation (in the original DeepLab) consists of two steps: CNN feature extraction (used as the unary term) and a CRF for localized segmentation.

For video segmentation, I still use the CNN feature extraction unit as before. Every video frame is fed into the CNN to extract its features/scores.
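As a rough illustration of how per-pixel CNN scores can serve as the unary term, the sketch below converts the raw class scores of one pixel into unary energies using the negative log of a softmax. This is a common convention for dense CRFs, but it is an assumption here, not necessarily what this repository does; the function name `scores_to_unary` is hypothetical.

```python
import math


def scores_to_unary(scores):
    """Turn raw per-class scores for one pixel into unary energies.

    Assumption: unary energy = -log(softmax(scores)). Lower energy means
    the class is more likely for this pixel.
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [-math.log(e / total) for e in exps]


# One pixel with two classes: the higher score gets the lower energy.
unary = scores_to_unary([2.0, 0.0])
print(unary[0] < unary[1])
```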

In the CRF, however, I feed two consecutive frames at once and segment them by graph-cutting the CRF for both frames simultaneously. Since the CRF is a fully connected network, the temporal/motion constraint between temporally neighboring pixels is taken into account. You can imagine that the number of nodes in this CRF is twice that of the CRF used in single-image segmentation.
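To make the doubled node count concrete, here is a minimal sketch of how the pixels of two consecutive frames could be stacked into one node set, with a frame index `t` added to each pixel position. The Gaussian affinity over space and time, its bandwidths `sigma_xy` and `sigma_t`, and both function names are illustrative assumptions; the actual kernels used in this code (which may also include appearance terms) may differ.

```python
import math


def build_joint_nodes(unary_a, unary_b):
    """Stack the per-pixel unaries of two consecutive frames into one node list.

    Each frame is an H x W grid (list of rows) whose cells hold per-class
    scores. Every node is returned as (x, y, t, scores), with t = 0 for the
    first frame and t = 1 for the second.
    """
    nodes = []
    for t, frame in enumerate((unary_a, unary_b)):
        for y, row in enumerate(frame):
            for x, scores in enumerate(row):
                nodes.append((x, y, t, scores))
    return nodes


def pairwise_weight(node_i, node_j, sigma_xy=3.0, sigma_t=1.0):
    """Hypothetical Gaussian affinity between two nodes over space and time."""
    xi, yi, ti, _ = node_i
    xj, yj, tj, _ = node_j
    dist_xy = (xi - xj) ** 2 + (yi - yj) ** 2
    dist_t = (ti - tj) ** 2
    return math.exp(-dist_xy / (2 * sigma_xy ** 2) - dist_t / (2 * sigma_t ** 2))


# Two tiny 2 x 3 frames with two classes per pixel.
frame_a = [[[0.9, 0.1] for _ in range(3)] for _ in range(2)]
frame_b = [[[0.2, 0.8] for _ in range(3)] for _ in range(2)]

nodes = build_joint_nodes(frame_a, frame_b)
print(len(nodes))  # 12 nodes: twice the 6 pixels of a single frame
```

Because every node is connected to every other node, the same pixel location in the two frames is linked directly, which is where the temporal constraint enters.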

The whole video segmentation process can be seen as sequential processing with a two-frame CRF. Suppose we have frames 1, 2, and 3. First, feed frames 1 and 2 into a CRF and get the segmentation results for both simultaneously. Then feed frame 2's result and frame 3's features into another CRF and get new results for frames 2 and 3. This time the result for frame 2 is final. In this way, each frame (e.g. frame 2) is segmented under temporal constraints with both its previous and its next frame.
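The sequential scheme described above can be sketched as a simple loop. This is only an outline under stated assumptions: `segment_video` and `toy_pair_crf` are hypothetical names, and `toy_pair_crf` is a numeric stand-in for the real two-frame CRF so that the control flow can be run on its own.

```python
def segment_video(features, pair_crf):
    """Run a two-frame CRF sequentially over a list of per-frame CNN outputs.

    pair_crf(a, b) is assumed to return (seg_a, seg_b) for the two frames.
    The earlier frame of each pair becomes final; the later frame's result
    is carried into the next pair together with the next frame's features.
    """
    if len(features) < 2:
        raise ValueError("need at least two frames")
    final = []
    carry = features[0]  # the first frame enters as raw CNN features
    for next_features in features[1:]:
        seg_prev, seg_next = pair_crf(carry, next_features)
        final.append(seg_prev)  # this frame has now seen both neighbors
        carry = seg_next        # refined again in the next pair
    final.append(carry)         # the last frame has no successor
    return final


def toy_pair_crf(a, b):
    """Stand-in for the real CRF: pull each frame's value toward the other."""
    return 0.75 * a + 0.25 * b, 0.25 * a + 0.75 * b


# Three "frames", each reduced to a single number for illustration.
print(segment_video([0.0, 1.0, 0.0], toy_pair_crf))
```

With N frames, the two-frame CRF is run N - 1 times, and every frame except the first and last takes part in two of those runs.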

See below for information on the original DeepLab by UCLA.

Introduction

DeepLab is a state-of-the-art deep learning system for semantic image segmentation built on top of Caffe.

It combines densely-computed deep convolutional neural network (CNN) responses with densely connected conditional random fields (CRF).

This distribution provides a publicly available implementation of the key model ingredients first reported in an arXiv paper, later accepted in revised form as a conference publication at ICLR 2015. It also contains implementations of methods supporting model learning using only weakly labeled examples, described in a second, follow-up arXiv paper. Please consult and consider citing the following papers:

@inproceedings{chen14semantic,
  title={Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs},
  author={Liang-Chieh Chen and George Papandreou and Iasonas Kokkinos and Kevin Murphy and Alan L Yuille},
  booktitle={ICLR},
  url={http://arxiv.org/abs/1412.7062},
  year={2015}
}

@article{papandreou15weak,
  title={Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation},
  author={George Papandreou and Liang-Chieh Chen and Kevin Murphy and Alan L Yuille},
  journal={arxiv:1502.02734},
  year={2015}
}

Note that if you use the densecrf implementation, please consult and cite the following paper:

@inproceedings{KrahenbuhlK11,
  title={Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials},
  author={Philipp Kr{\"{a}}henb{\"{u}}hl and Vladlen Koltun},
  booktitle={NIPS},
  year={2011}
}

Performance

DeepLab currently achieves 73.9% on the challenging PASCAL VOC 2012 image segmentation task -- see the leaderboard.

Pre-trained models

We have released several trained models and the corresponding prototxt files here. Please check them for more model details.

The best of the released models yields 73.6% on the PASCAL VOC 2012 test set.

Python wrapper requirements

  1. Install the wget library for Python:
sudo pip install wget
  2. Change DATA_ROOT to point to the PASCAL images.

  3. To use the mat_read_layer and mat_write_layer, please download and install matio.

Running the code

python run.py

FAQ

Check the FAQ if you have problems while using the code.
