
Releases: ultralytics/yolov3

ONNX Export, Webcam Inference, MultiScale Training

11 Feb 17:54

This release requires PyTorch >= v1.0.0 to function properly. Please install the latest version from https://github.com/pytorch/pytorch/releases

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

  • N/A

Added Functionality

  • MultiScale Training #52: train.py --multi-scale will train each batch at a randomly selected image size from 320 to 608 pixels.
  • Webcam Inference #89: set webcam=True in detect.py.
  • Video Inference. Pass a video file to detect.py.
  • YOLOv3-tiny support #51: detect.py --cfg cfg/yolov3-tiny.cfg --weights weights/yolov3-tiny.pt
  • YOLOv3-spp support #16: detect.py --cfg cfg/yolov3-spp.cfg --weights weights/yolov3-spp.pt
  • ONNX Export #82: ONNX_EXPORT = True in models.py.
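As an illustration of the multi-scale training option above, each batch's image size is drawn at random from 320 to 608 pixels. A minimal sketch of such a size picker (a hypothetical helper, not the repository's exact implementation; it assumes sizes must be multiples of the network stride of 32):

```python
import random

def random_img_size(min_size=320, max_size=608, stride=32):
    """Pick a random training image size in [min_size, max_size],
    constrained to multiples of the network stride."""
    n_choices = (max_size - min_size) // stride + 1  # 10 sizes: 320, 352, ..., 608
    return min_size + stride * random.randrange(n_choices)

# Example: draw a new size for each of 5 batches
sizes = [random_img_size() for _ in range(5)]
```

Resizing every batch to a freshly drawn size makes the detector more robust to object scale at test time.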

Performance

  • 20% improvement in training speed via code optimization and removal of the redundant batch_report functionality. All of this functionality, including computation of TP, FP, FN, Precision, Recall, and mAP, is now done in test.py after each training epoch.
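For reference, the precision and recall now computed per epoch in test.py follow the standard definitions. A hedged sketch of those formulas only (illustrative, not the repository's code):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN).
    Returns 0.0 for an undefined ratio (empty denominator)."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Example: 8 true positives, 2 false positives, 4 false negatives
p, r = precision_recall(8, 2, 4)  # p = 0.8, r = 0.666...
```

Computing these once per epoch, rather than per batch, is what removes the per-batch reporting overhead described above.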

TODO (help and PRs welcome!)

  • Multi GPU support #21.
  • YAPF linting (including possible wrap to PEP8 79 character-line standard) #88.
  • test.py should ideally output text files in the official COCO mAP format as well for external mAP computation #2 (comment).

Initial Release

11 Dec 19:50

This is an initial release. This repository currently works well for inference, with no known issues; training is still under development. Current COCO training mAP using this repo is 0.522 (at 416 x 416) after 62 epochs (using all default training settings, simply running python3 train.py). We are exploring ways to improve this further.

Loss curves, Precision, Recall and mAP:
[Figure: coco_training_loss]

PyTorch >= v1.0.0 is recommended to run this repo, as it includes numerous bug fixes: https://github.com/pytorch/pytorch/releases

Inference

  • Inference appears to be working well, with no known issues.

Training

  • Training is still under development. v1.0 benchmark is currently 0.522 mAP at epoch 62 at 416 image size without multi-scale training.
  • Balancing of the various loss terms seems to have a great effect on the results; the current constants were tuned for training performance. Further study is needed here.
  • Augmentation may be a bit aggressive; experimentation at reduced levels is needed.
  • Training seems to suffer from jumps in the losses (visible in the plots above). This may be associated with restarting a stopped training session; the cause is unclear, and some users do not seem to experience it.

Validation

  • mAP calculation is now correct. mAP is calculated per class per image, and then averaged over images. Despite this, the current mAP calculation in test.py seems to be slightly different from the official COCO mAP code.
  • It would be very useful for test.py to additionally output text files in the official COCO mAP format.
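The official COCO results format mentioned above is a JSON list of detections, each with an image_id, category_id, a bbox in [x, y, width, height] form, and a score. A minimal sketch of such a converter (the function name and input tuple layout are assumptions for illustration, not test.py's actual interface):

```python
import json

def to_coco_results(detections):
    """Convert detections given as (image_id, category_id, x1, y1, x2, y2, conf)
    tuples into the official COCO results format (bbox as [x, y, w, h])."""
    results = []
    for image_id, category_id, x1, y1, x2, y2, conf in detections:
        results.append({
            "image_id": image_id,
            "category_id": category_id,
            "bbox": [x1, y1, x2 - x1, y2 - y1],  # xyxy -> xywh
            "score": conf,
        })
    return results

# Example: one detection on image 42, serialized for external mAP tools
dets = [(42, 1, 10.0, 20.0, 110.0, 220.0, 0.9)]
coco_json = json.dumps(to_coco_results(dets))
```

A file in this format can be loaded directly by the official COCO evaluation tools for external mAP computation.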

Multi-GPU

  • Multi GPU is currently not supported. PRs welcome!

Performance

  • Numerous commits addressed performance issues. The largest gains came from not reporting P, R, TP, FP, and FN every batch, and from reducing data movement between GPU and CPU. Current training speed is about 1 epoch per hour (16-image batches at 416 x 416 each take about 0.7 seconds) on a GCP P100.