syed-cbot/Image-segmentation-using-RGB-D

Image-segmentation-using-RGB-D

Learning depth-based semantic segmentation of street scenes

Abstract

This work addresses multi-class semantic segmentation of street scenes by combining depth information with RGB data. Our dataset comprises street images from Berlin taken from four different camera angles. The scenes were also scanned with a laser scanner, and the resulting 3D point clouds were projected to create the depth images. We propose an architecture consisting of a Residual Network encoder and a UNet decoder that learns good-quality feature representations on the Berlin set. We achieve a mean accuracy of 58.35%, a mean pixel accuracy of 94.36%, and a mean IoU (Intersection over Union) of 51.91% on the test set. We further analyze the benefits that the model exhibits on certain classes when trained on RGB plus depth, compared with a model trained on RGB information alone. We also study an alternative approach that feeds the depth information through a separate encoder, to see how segmentation performance varies and whether it brings any significant improvement in quality. Finally, we compare the performance of our network with one of the state-of-the-art models on our dataset.
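
The three metrics quoted in the abstract can all be derived from a per-class confusion matrix. The sketch below is a minimal pure-Python illustration, not the repository's evaluation code. It assumes that "mean accuracy" means per-class accuracy averaged over classes and that classes absent from the ground truth are skipped; the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def segmentation_metrics(cm):
    """Return (mean class accuracy, pixel accuracy, mean IoU)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    class_acc, ious = [], []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                          # pixels labelled as class i
        pred = sum(cm[r][i] for r in range(n))   # pixels predicted as class i
        union = gt + pred - tp
        if gt > 0:                               # assumption: skip absent classes
            class_acc.append(tp / gt)
        if union > 0:
            ious.append(tp / union)
    pixel_acc = correct / total if total else 0.0
    return _mean(class_acc), pixel_acc, _mean(ious)


# Flattened label maps for a toy 3-class example.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
mean_acc, pixel_acc, mean_iou = segmentation_metrics(
    confusion_matrix(y_true, y_pred, num_classes=3)
)
```

Pixel accuracy is dominated by frequent classes such as road or building, whereas mean accuracy and mean IoU weight every class equally. That is one plausible reason the reported pixel accuracy is much higher than the other two figures.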

Introduction

[Figure: image segmentation vs semantic segmentation]

[Figure: image segmentation using depth]

Motivation


Goals

  1. Achieve high-quality segmentation using RGB-D
  2. Compare RGB-D segmentation with RGB-only segmentation
  3. Explore an alternative approach to feeding depth information
  4. Compare our model with the state of the art

Approach (Data acquisition)
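
The abstract states that the depth images were created by projecting the laser-scanned 3D point clouds. The sketch below shows one common way to do this: a pinhole camera projection with a z-buffer that keeps the nearest point per pixel. It is an illustration under assumptions, not the pipeline used for the Berlin set. The intrinsics `fx`, `fy`, `cx`, `cy`, the function name, and the convention that 0.0 marks "no measurement" are all hypothetical, and the points are assumed to be already expressed in the camera frame.

```python
import math


def project_to_depth(points, fx, fy, cx, cy, width, height):
    """Project camera-frame 3D points (x, y, z) to a depth image.

    Pixels that receive no point keep the value 0.0 (no measurement).
    When several points fall on one pixel, the nearest one wins.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # point lies behind the camera
        u = int(math.floor(fx * x / z + cx))  # pixel column
        v = int(math.floor(fy * y / z + cy))  # pixel row
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth


# Toy example: a 3x3 image with unit focal length and the principal point at (1, 1).
points = [(0.0, 0.0, 2.0), (0.0, 0.0, 1.0), (4.0, 0.0, 4.0), (0.0, 0.0, -1.0)]
depth_image = project_to_depth(points, fx=1.0, fy=1.0, cx=1.0, cy=1.0, width=3, height=3)
```

Laser scans are sparse compared with the image grid, so a depth image produced this way usually contains many empty pixels, and real pipelines often add hole filling or interpolation afterwards.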

List of labels to work with

Approach (Architecture ResNet-34 fused with UNet decoder)

Experiments and Results

ResNet-34 vs ResNet-50 vs ResNet-101

ResNet-34 on testset

RGB-D vs RGB segmentation

Reproducing FuseNet approach on ResNet-34 fused with UNet decoder

Early Fusion vs Late Fusion
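
The two ways of feeding depth can be contrasted with a small pure-Python sketch. It assumes that early fusion means appending depth as a fourth input channel, and that the separate-encoder variant merges features FuseNet-style by element-wise summation. Neither detail is confirmed by this README, and the function names and the `max_depth` normalization are hypothetical.

```python
def early_fusion_input(rgb, depth, max_depth):
    """Stack a depth channel onto each RGB pixel, giving an H x W x 4 input.

    RGB is scaled from [0, 255] to [0, 1]; depth is scaled by max_depth
    and clipped to [0, 1].
    """
    fused = []
    for rgb_row, depth_row in zip(rgb, depth):
        row = []
        for (r, g, b), d in zip(rgb_row, depth_row):
            d_norm = min(max(d / max_depth, 0.0), 1.0)
            row.append((r / 255.0, g / 255.0, b / 255.0, d_norm))
        fused.append(row)
    return fused


def sum_fusion(rgb_features, depth_features):
    """Element-wise sum of two equally shaped 2D feature maps."""
    return [
        [a + b for a, b in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_features, depth_features)
    ]


# Early fusion: one 4-channel tensor goes into a single encoder.
rgb = [[(255, 0, 0), (0, 255, 0)]]
depth = [[5.0, 20.0]]
x = early_fusion_input(rgb, depth, max_depth=10.0)

# Separate encoders: feature maps from both branches are merged afterwards.
merged = sum_fusion([[1, 2], [3, 4]], [[10, 20], [30, 40]])
```

With early fusion only the first convolution has to change to accept four channels, whereas a second encoder roughly doubles the encoder parameters.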

Early Fusion (smaller model) vs Late Fusion

Early Fusion vs Late Fusion1 vs Late Fusion2

ResNet-34 vs DDNet

Incorporating dense connections in UNet decoder

Results

Swapping encoders and decoders of both architectures

DPDB-UNet vs Res-DDNet

DPDB-UNet variations

Results

Conclusions

  1. The proposed architecture learns good-quality feature representations
  2. Adding depth can improve segmentation performance
  3. Late fusion is counter-productive
  4. On the Berlin set, ResNet-34 performs better than DDNet
