The inspiration for this project comes from ultralytics/yolov3. Thanks!
This project is a YOLOv3 object detection system implemented in PyTorch.
The goal of this implementation is to be simple, highly extensible, and easy to integrate into your own projects. This implementation is a work in progress -- new features are still being added.
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. As always, all the code is online at https://pjreddie.com/yolo/.
$ git clone https://github.com/Lornatang/YOLOv3-PyTorch.git
$ cd YOLOv3-PyTorch/
$ pip3 install -r requirements.txt
$ cd weights/
$ bash download_weights.sh
$ cd data/
$ bash get_coco_dataset.sh
usage: train.py [-h] [--epochs EPOCHS] [--batch-size BATCH_SIZE] [--accumulate ACCUMULATE]
[--cfg CFG] [--data DATA] [--multi-scale] [--img-size IMG_SIZE [IMG_SIZE ...]]
[--rect] [--resume] [--nosave] [--notest] [--evolve] [--cache-images]
[--weights WEIGHTS] [--arc ARC] [--name NAME] [--device DEVICE] [--adam]
[--single-cls] [--var VAR]
- Example (COCO2014)
To train on COCO2014 using a Darknet-53 backbone pretrained on ImageNet, run:
$ python3 train.py --cfg cfgs/yolov3.cfg --data cfgs/coco2014.data --weights weights/darknet53.conv.74 --multi-scale
- Example (VOC2007+2012)
To train on VOC07+12:
$ python3 train.py --cfg cfgs/yolov3-voc.cfg --data cfgs/voc2007.data --weights weights/darknet53.conv.74 --multi-scale
- Other training methods
Normal Training: run python3 train.py to begin training after downloading the COCO data with data/get_coco_dataset.sh. Each epoch trains on 117,263 images from the COCO train and validation sets, and tests on 5,000 images from the COCO validation set.
Resume Training: run python3 train.py --resume to resume training from weights/checkpoint.pth.
- mAP@0.5 is evaluated at --iou-threshold 0.5, mAP@0.5...0.95 at --iou-threshold 0.7.
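For reference, a minimal sketch of the box IoU computation these thresholds are applied to (standard intersection-over-union on [x1, y1, x2, y2] boxes, not the repository's internal implementation):

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    # Intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = sum of the two areas minus the intersection
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-16)

# A prediction counts as a true positive for mAP@0.5 when its IoU with a ground-truth box is >= 0.5
print(box_iou([0, 0, 10, 10], [5, 5, 15, 15]))  # ~0.143
```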
- Darknet results: https://arxiv.org/abs/1804.02767
Method | Size | COCO mAP @0.5...0.95 | COCO mAP @0.5 |
---|---|---|---|
YOLOv3-tiny | 320 | 14.0 | 29.1 |
YOLOv3 | 320 | 28.7 | 51.8 |
YOLOv3-SPP | 320 | 30.5 | 52.3 |
YOLOv3-tiny | 416 | 16.0 | 33.0 |
YOLOv3 | 416 | 31.2 | 55.4 |
YOLOv3-SPP | 416 | 33.9 | 56.9 |
YOLOv3-tiny | 512 | 16.6 | 34.9 |
YOLOv3 | 512 | 32.7 | 57.7 |
YOLOv3-SPP | 512 | 35.6 | 59.5 |
YOLOv3-tiny | 608 | 16.6 | 35.4 |
YOLOv3 | 608 | 33.1 | 58.2 |
YOLOv3-SPP | 608 | 37.0 | 60.7 |
$ python3 test.py --cfg cfgs/yolov3-spp.cfg --weights weights/yolov3-spp.pth --augment --save-json --image-size 608
Namespace(augment=True, batch_size=16, cfg='cfgs/yolov3-spp.cfg', confidence_threshold=0.001, data='data/coco2014.data', device='', image_size=608, iou_threshold=0.6, save_json=True, single_cls=False, task='eval', weights='weights/yolov3-spp.pth', workers=4)
Using CUDA
+ device:0 (name='TITAN RTX', total_memory=24190MB)
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.454
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.644
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.497
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.270
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.504
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.577
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.363
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.599
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.668
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.502
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.724
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.805
detect.py runs inference on a variety of sources:
$ python3 detect.py --source ...
- Image:
--source file.jpg
- Video:
--source file.mp4
- Directory:
--source dir/
- Webcam:
--source 0
- HTTP stream:
--source https://v.qq.com/x/page/x30366izba3.html
To run a specific model:
YOLOv3: python3 detect.py --cfg cfgs/yolov3.cfg --weights weights/yolov3.weights
YOLOv3-tiny: python3 detect.py --cfg cfgs/yolov3-tiny.cfg --weights weights/yolov3-tiny.weights
YOLOv3-SPP: python3 detect.py --cfg cfgs/yolov3-spp.cfg --weights weights/yolov3-spp.weights
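If you need to script inference over several inputs, one option is to drive detect.py from Python using only the flags shown above (a hedged sketch; it assumes you run it from the repository root, and the source values are placeholders):

```python
import subprocess

# Any of the source types listed above: image, video, directory, webcam index, or stream URL.
sources = ["file.jpg", "file.mp4", "dir/"]

for src in sources:
    # Mirrors the CLI usage: python3 detect.py --cfg ... --weights ... --source ...
    subprocess.run(
        [
            "python3", "detect.py",
            "--cfg", "cfgs/yolov3-spp.cfg",
            "--weights", "weights/yolov3-spp.weights",
            "--source", src,
        ],
        check=True,
    )
```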
In addition to the architectures provided by the original author, we also add several commonly used backbone architectures, which usually achieve better mAP with less computation than the original architecture.
- All models were trained and tested at an image size of 416 x 416 on a GeForce RTX 2080 Ti.
Note: All commands use the following parameters.
python3 train.py --cfg <cfg-path> --data cfgs/voc2007.data --multi-scale --cache-images --batch-size 8
Backbone | Train | Test | train time (s/iter) | inference time (ms/im) | train mem (GB) | mAP | Cfg | Weights |
---|---|---|---|---|---|---|---|---|
YOLOv3-tiny | VOC07+12 | VOC07 | 0.047 | 1.9 | 2.7 | 57.7 | Link | weights |
MobileNet-v1 | VOC07+12 | VOC07 | 0.056 | 2.4 | 2.9 | 65.2 | Link | weights |
MobileNet-v2 | VOC07+12 | VOC07 | 0.116 | 2.5 | 3.1 | 65.6 | Link | weights |
MobileNet-v3-small | VOC07+12 | VOC07 | 0.050 | 1.8 | 1.0 | 57.7 | Link | weights |
MobileNet-v3-large | VOC07+12 | VOC07 | 0.080 | 2.6 | 3.1 | 60.4 | Link | weights |
ShuffleNet-v1 | VOC07+12 | VOC07 | - | - | - | - | Link | - |
ShuffleNet-v2 | VOC07+12 | VOC07 | - | - | - | - | Link | - |
AlexNet | VOC07+12 | VOC07 | 0.065 | 2.5 | 1.5 | 55.2 | Link | weights |
VGG16 | VOC07+12 | VOC07 | 0.194 | 7.9 | 7.7 | 73.7 | Link | weights |
Run the commands below to create a custom model definition, replacing your-dataset-num-classes
with the number of classes in your dataset.
# move to configs dir
$ cd cfgs/
# create custom model 'yolov3-custom.cfg'. (In fact, it is OK to modify two lines of parameters, see `create_model.sh`)
$ bash create_model.sh your-dataset-num-classes
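If you would rather edit the cfg by hand, the two parameters the script touches are (for a standard YOLOv3 head with 3 anchors per detection scale) classes= in each [yolo] block and filters= in the convolutional layer immediately before it, which must equal (classes + 5) * 3. A small sketch of that arithmetic:

```python
# Per anchor, each YOLO output cell predicts 4 box coordinates + 1 objectness score + num_classes class scores.
# With 3 anchors per detection scale, the conv layer feeding each [yolo] block needs this many filters:
def yolo_head_filters(num_classes, anchors_per_scale=3):
    return anchors_per_scale * (num_classes + 5)

print(yolo_head_filters(80))  # 255 -> the value used in the stock COCO cfgs
print(yolo_head_filters(20))  # 75  -> e.g. a 20-class VOC-style dataset
```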
Add class names to data/custom/classes.names. This file should have one row per class name.
Move the images of your dataset to data/custom/images/.
Move your annotations to data/custom/labels/. The dataloader expects that the annotation file corresponding to the image data/custom/images/train.jpg has the path data/custom/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax label_idx x_center y_center width height. The coordinates should be scaled to [0, 1], and label_idx should be zero-indexed and correspond to the row number of the class name in data/custom/classes.names.
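As an illustration of that label format, here is a minimal sketch that converts a pixel-space box (x_min, y_min, x_max, y_max) into one annotation row (a hypothetical helper, not part of this repository):

```python
def to_yolo_row(label_idx, x_min, y_min, x_max, y_max, img_w, img_h):
    """Format a pixel-space box as 'label_idx x_center y_center width height', scaled to [0, 1]."""
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{label_idx} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 100x200 box with its top-left corner at (50, 80) in a 640x480 image, class index 0:
print(to_yolo_row(0, 50, 80, 150, 280, 640, 480))
# -> "0 0.156250 0.375000 0.156250 0.416667"
```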
In data/custom/train.txt and data/custom/valid.txt, add the paths to the images that will be used as training and validation data respectively.
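A hedged sketch for generating those two list files from data/custom/images/ (the 90/10 split and the .jpg extension are assumptions; adjust them to your dataset):

```python
import random
from pathlib import Path

# Collect all images and shuffle them reproducibly.
images = sorted(Path("data/custom/images").glob("*.jpg"))
random.seed(0)
random.shuffle(images)

split = int(0.9 * len(images))  # assumed 90/10 train/valid split
train, valid = images[:split], images[split:]

# One image path per line, as the dataloader expects.
Path("data/custom/train.txt").write_text("\n".join(str(p) for p in train) + "\n")
Path("data/custom/valid.txt").write_text("\n".join(str(p) for p in valid) + "\n")
```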
To train on the custom dataset run:
$ python3 train.py --cfg cfgs/yolov3-custom.cfg --data cfgs/custom.data --epochs 100 --multi-scale
Add --weights weights/darknet53.conv.74 to train using a backbone pretrained on ImageNet.
$ git clone https://github.com/Lornatang/YOLOv3-PyTorch && cd YOLOv3-PyTorch
# convert darknet cfgs/weights to pytorch model
$ python3 -c "from easydet.utils import convert; convert('cfgs/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
Success: converted 'weights/yolov3-spp.weights' to 'converted.pth'
# convert cfgs/pytorch model to darknet weights
$ python3 -c "from easydet.utils import convert; convert('cfgs/yolov3-spp.cfg', 'weights/yolov3-spp.pth')"
Success: converted 'weights/yolov3-spp.pth' to 'converted.weights'
Joseph Redmon, Ali Farhadi
Abstract
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at https://pjreddie.com/yolo/.
[Paper] [Project Webpage] [Authors' Implementation]
@article{yolov3,
title={YOLOv3: An Incremental Improvement},
author={Redmon, Joseph and Farhadi, Ali},
journal = {arXiv},
year={2018}
}