This is my fork of Samarth Brahmbhatt's py-faster-rcnn, which implements StuffNet for joint object detection and semantic segmentation. This fork has been modified to support training on the Cityscapes dataset.
Samarth Brahmbhatt's repository is itself a fork of Ross Girshick's py-faster-rcnn and contains code and models for the WACV 2017 paper StuffNet: Using 'Stuff' to Improve Object Detection.
Please use this version of the repository. I've created `-seg` versions of the training and solver prototxt files, e.g. `experiments/scripts/faster_rcnn_end2end-seg.sh`.
The repository is configured by default for StuffNet-30, i.e. 30 segmentation classes. To switch to StuffNet-10, you will need to:
- Change the `num_output` parameter in the `train-seg.prototxt` and `test-seg.prototxt` files to 10.
- Change the `SEG_CLASSES` parameter in `experiments/cfgs/faster_rcnn_end2end-seg.yml` to 10.
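
The YAML half of this switch can be scripted. The following is a minimal sketch, assuming the config file contains a line of the form `SEG_CLASSES: <integer>`; that layout is an assumption and has not been checked against the actual file. The helper name `set_seg_classes` is hypothetical and not part of this repository. The `num_output` change in the prototxt files is left as a manual edit, since it depends on which layer holds the segmentation output.

```python
import re


def set_seg_classes(yaml_text, seg_classes):
    """Return yaml_text with the SEG_CLASSES value replaced.

    Hypothetical helper: assumes a line shaped like 'SEG_CLASSES: 30',
    optionally indented. Raises ValueError if no such line exists.
    """
    pattern = re.compile(r"^([ \t]*SEG_CLASSES[ \t]*:[ \t]*)\d+", flags=re.MULTILINE)
    # Keep the key and spacing (group 1) and swap only the number.
    new_text, count = pattern.subn(lambda m: m.group(1) + str(seg_classes), yaml_text)
    if count == 0:
        raise ValueError("no SEG_CLASSES entry found")
    return new_text


if __name__ == "__main__":
    cfg_path = "experiments/cfgs/faster_rcnn_end2end-seg.yml"
    with open(cfg_path) as f:
        text = f.read()
    with open(cfg_path, "w") as f:
        f.write(set_seg_classes(text, 10))
```

Running the script from the repository root rewrites the config in place, so it may be worth committing or backing up the file first.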
StuffNet models need segmentation images in addition to RGB images with bounding box annotations for training. You should generate them for your dataset using feature constraining (see paper for details) and put them in `DATA_PATH/context_images_SEG_CLASSES/*.ppm`. `SEG_CLASSES` is either 10 or 30. `DATA_PATH` for VOC 2007 is `VOCdevkit/VOC2007`, for VOC 2010 is `VOCdevkit/VOC2010`, and so forth. The names of the PPM files should match those of the corresponding RGB images exactly, apart from the extension. For example, if the RGB image is `DATA_PATH/JPEGImages/2010_006993.jpg`, the segmentation image for training StuffNet-10 should be `DATA_PATH/context_images_10/2010_006993.ppm`.
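
Before training, it can help to check that every RGB image has a matching segmentation image. The sketch below follows the naming convention described above; the function names `seg_image_path` and `missing_seg_images` are hypothetical helpers, not part of this repository, and the check assumes the RGB images are `.jpg` files under `DATA_PATH/JPEGImages`.

```python
import os


def seg_image_path(rgb_path, seg_classes):
    """Return the segmentation PPM path expected for an RGB image path."""
    if seg_classes not in (10, 30):
        raise ValueError("seg_classes must be 10 or 30")
    images_dir = os.path.dirname(rgb_path)    # DATA_PATH/JPEGImages
    data_path = os.path.dirname(images_dir)   # DATA_PATH
    stem = os.path.splitext(os.path.basename(rgb_path))[0]
    return os.path.join(data_path, "context_images_%d" % seg_classes, stem + ".ppm")


def missing_seg_images(data_path, seg_classes):
    """List RGB file names in DATA_PATH/JPEGImages that lack a matching PPM."""
    images_dir = os.path.join(data_path, "JPEGImages")
    missing = []
    for name in sorted(os.listdir(images_dir)):
        if not name.lower().endswith(".jpg"):
            continue  # skip anything that is not a JPEG image
        rgb_path = os.path.join(images_dir, name)
        if not os.path.isfile(seg_image_path(rgb_path, seg_classes)):
            missing.append(name)
    return missing
```

For example, `missing_seg_images("VOCdevkit/VOC2010", 10)` returns the names of the JPEG files that have no counterpart in `VOCdevkit/VOC2010/context_images_10`; an empty list means the segmentation set is complete.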