Implementation of the paper "You Only Learn One Representation: Unified Network for Multiple Tasks" (YOLOR).
To reproduce the results in the paper, please use this branch.
Model | Test Size | AP<sup>test</sup> | AP<sub>50</sub><sup>test</sup> | AP<sub>75</sub><sup>test</sup> | AP<sub>S</sub><sup>test</sup> | AP<sub>M</sub><sup>test</sup> | AP<sub>L</sub><sup>test</sup> | batch1 throughput |
---|---|---|---|---|---|---|---|---|
YOLOR-P6 | 1280 | 52.6% | 70.6% | 57.6% | 34.7% | 56.6% | 64.2% | 49 fps |
YOLOR-W6 | 1280 | 54.1% | 72.0% | 59.2% | 36.3% | 57.9% | 66.1% | 47 fps |
YOLOR-E6 | 1280 | 54.8% | 72.7% | 60.0% | 36.9% | 58.7% | 66.9% | 37 fps |
YOLOR-D6 | 1280 | 55.4% | 73.3% | 60.6% | 38.0% | 59.2% | 67.1% | 30 fps |
YOLOv4-P5 | 896 | 51.8% | 70.3% | 56.6% | 33.4% | 55.7% | 63.4% | 41 fps |
YOLOv4-P6 | 1280 | 54.5% | 72.6% | 59.8% | 36.6% | 58.2% | 65.5% | 30 fps |
YOLOv4-P7 | 1536 | 55.5% | 73.4% | 60.8% | 38.4% | 59.4% | 67.7% | 16 fps |
To reproduce the inference speed, please see darknet.
Model | Test Size | AP<sup>val</sup> | AP<sub>50</sub><sup>val</sup> | AP<sub>75</sub><sup>val</sup> | AP<sub>S</sub><sup>val</sup> | AP<sub>M</sub><sup>val</sup> | AP<sub>L</sub><sup>val</sup> | batch1 throughput |
---|---|---|---|---|---|---|---|---|
YOLOv4-CSP | 640 | 49.1% | 67.7% | 53.8% | 32.1% | 54.4% | 63.2% | 76 fps |
YOLOR-CSP | 640 | 49.2% | 67.6% | 53.7% | 32.9% | 54.4% | 63.0% | weights |
YOLOR-CSP* | 640 | 50.0% | 68.7% | 54.3% | 34.2% | 55.1% | 64.3% | weights |
YOLOv4-CSP-X | 640 | 50.9% | 69.3% | 55.4% | 35.3% | 55.8% | 64.8% | 53 fps |
YOLOR-CSP-X | 640 | 51.1% | 69.6% | 55.7% | 35.7% | 56.0% | 65.2% | weights |
YOLOR-CSP-X* | 640 | 51.5% | 69.9% | 56.1% | 35.8% | 56.8% | 66.1% | weights |
Convert the trained weights to ONNX:

```shell
python convert_to_onnx.py --weights yolor_csp_x_star.pt --cfg cfg/yolor_csp_x.cfg --output yolor_csp_x_star.onnx
```
Run inference with the ONNX model:

```shell
python object_detector_onnx.py
```
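For orientation, here is a minimal ONNX Runtime sketch of what such a detector script does; the tensor name `input`, the image path, and the raw output layout are assumptions, not guaranteed by the export above:

```python
# Minimal ONNX Runtime inference sketch (assumptions: the exported graph has
# one input named "input" and its raw output is [batch, num_boxes, 5 + classes]).
import cv2
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "yolor_csp_x_star.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

img = cv2.imread("test.jpg")                      # hypothetical test image
blob = cv2.resize(img, (896, 896))[:, :, ::-1]    # BGR -> RGB; plain resize for brevity,
                                                  # the repo's detector letterboxes instead
blob = blob.transpose(2, 0, 1)[None].astype(np.float32) / 255.0  # NCHW, 0..1

pred = session.run(None, {"input": blob})[0]
print(pred.shape)  # e.g. (1, num_boxes, 85) for 80 COCO classes; NMS is still required
```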
Build an FP16 TensorRT engine from the ONNX model with trtexec:

```shell
/usr/src/tensorrt/bin/trtexec --onnx=yolor_csp_x_star.onnx \
    --saveEngine=yolor_csp_x_star-fp16.trt \
    --explicitBatch \
    --minShapes=input:1x3x416x416 \
    --optShapes=input:1x3x896x896 \
    --maxShapes=input:1x3x896x896 \
    --verbose \
    --fp16 \
    --device=0
```
Run inference with the TensorRT engine:

```shell
python object_detector_trt.py
```
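As a rough guide to what the engine loader has to do, here is a minimal sketch against the TensorRT 8.x Python bindings with pycuda; buffer handling in the repo's exec_backends/trt_loader.py may differ:

```python
# Minimal TensorRT inference sketch (assumes TensorRT 8.x + pycuda).
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("yolor_csp_x_star-fp16.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
context.set_binding_shape(0, (1, 3, 896, 896))  # resolve the dynamic input shape

# Allocate host/device buffers for every binding at the resolved shapes.
bindings, buffers = [], []
for i in range(engine.num_bindings):
    shape = context.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(int(np.prod(shape)), dtype)
    device = cuda.mem_alloc(host.nbytes)
    bindings.append(int(device))
    buffers.append((host, device, engine.binding_is_input(i)))

stream = cuda.Stream()
buffers[0][0][:] = np.zeros(1 * 3 * 896 * 896, dtype=np.float32)  # stand-in for a preprocessed image
for host, device, is_input in buffers:
    if is_input:
        cuda.memcpy_htod_async(device, host, stream)
context.execute_async_v2(bindings, stream.handle)
for host, device, is_input in buffers:
    if not is_input:
        cuda.memcpy_dtoh_async(host, device, stream)
stream.synchronize()
```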
Note that yolor_p6 has 4 detect layers; when using it, change the maximum number of boxes in exec_backends/trt_loader.py accordingly.
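The extra detect layer changes how many candidate boxes the network emits. As a back-of-the-envelope illustration (assuming the conventional YOLO layout of 3 anchors per level and strides 8/16/32, plus stride 64 for the P6 models; these values are not read from the repo's cfg files):

```python
# Rough candidate-box arithmetic under the assumptions stated above.
def num_boxes(h, w, strides, anchors_per_level=3):
    return sum(anchors_per_level * (h // s) * (w // s) for s in strides)

print(num_boxes(896, 896, (8, 16, 32)))        # 3 detect layers -> 49392
print(num_boxes(1280, 1280, (8, 16, 32, 64)))  # yolor_p6 at 1280 -> 102000
```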
For faster end-to-end processing on the GPU, we can integrate the BatchedNMSPlugin into the YOLOR model. First convert the model to ONNX as above, then follow the steps below:
```shell
pip install onnx-simplifier
python3 -m onnxsim yolor_csp_x_star.onnx yolor_csp_x_star-sim.onnx --dynamic-input-shape --input-shape 1,3,640,640
python3 add_nms_plugins.py --model yolor_csp_x_star-sim.onnx
```
If you hit an IR version checking error, try torch==1.8.0 and onnx==1.6.0 when converting the original model to ONNX, and then onnx==1.11.0 for this step.
The add_nms_plugins.py script performs the following stages (see the sketch after this list):
- Split the current output tensor into the bboxes and scores tensors that the BatchedNMSDynamic plugin requires as inputs
- Add post-processing to the current model
- Add the plugin on top of the post-processed model
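For orientation, a minimal onnx-graphsurgeon sketch of the final plugin-attachment stage might look like the following; the tensor names bboxes/scores, the class count, and the threshold values are illustrative assumptions, not necessarily what add_nms_plugins.py uses:

```python
# Sketch: attach a BatchedNMSDynamic_TRT node on top of a post-processed graph.
# Assumes the graph already exposes outputs "bboxes" [batch, num_boxes, 1, 4]
# and "scores" [batch, num_boxes, num_classes].
import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("yolor_csp_x_star-sim.onnx"))
bboxes = next(t for t in graph.outputs if t.name == "bboxes")
scores = next(t for t in graph.outputs if t.name == "scores")

keep_top_k = 100  # illustrative; should match the max boxes expected downstream
nms_outputs = [
    gs.Variable("num_detections", dtype=np.int32, shape=["batch", 1]),
    gs.Variable("nmsed_boxes", dtype=np.float32, shape=["batch", keep_top_k, 4]),
    gs.Variable("nmsed_scores", dtype=np.float32, shape=["batch", keep_top_k]),
    gs.Variable("nmsed_classes", dtype=np.float32, shape=["batch", keep_top_k]),
]
nms_node = gs.Node(
    op="BatchedNMSDynamic_TRT",  # resolved by TensorRT to the batchedNMSPlugin
    attrs={
        "shareLocation": True,
        "backgroundLabelId": -1,
        "numClasses": 80,        # COCO; adjust for custom datasets
        "topK": 1000,
        "keepTopK": keep_top_k,
        "scoreThreshold": 0.25,
        "iouThreshold": 0.45,
        "isNormalized": False,
        "clipBoxes": False,
    },
    inputs=[bboxes, scores],
    outputs=nms_outputs,
)
graph.nodes.append(nms_node)
graph.outputs = nms_outputs
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "yolor_csp_x_star-nms.onnx")
```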
Build the TensorRT engine with the NMS plugin baked in:

```shell
/usr/src/tensorrt/bin/trtexec --onnx=yolor_csp_x_star-nms.onnx \
    --saveEngine=yolor_csp_x_star.trt \
    --explicitBatch \
    --minShapes=input:1x3x416x416 \
    --optShapes=input:1x3x896x896 \
    --maxShapes=input:1x3x896x896 \
    --verbose \
    --device=0
```
Run inference with end-to-end NMS on the GPU:

```shell
python3 object_detector_trt_nms.py
```
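Since NMS now runs inside the engine, the host side only has to read the four plugin outputs. A small decoding helper, assuming the binding names from the sketch above:

```python
# Decode the BatchedNMS output bindings for one image in the batch.
def decode_detections(num_detections, nmsed_boxes, nmsed_scores, nmsed_classes, batch_idx=0):
    n = int(num_detections[batch_idx][0])  # valid detections for this image
    return [
        {
            "box": nmsed_boxes[batch_idx][i].tolist(),   # x1, y1, x2, y2 in network-input pixels
            "score": float(nmsed_scores[batch_idx][i]),
            "class_id": int(nmsed_classes[batch_idx][i]),
        }
        for i in range(n)
    ]
```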
Citation:

```
@article{wang2021you,
  title={You Only Learn One Representation: Unified Network for Multiple Tasks},
  author={Wang, Chien-Yao and Yeh, I-Hau and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2105.04206},
  year={2021}
}
```