forked from open-mmlab/mmdetection
Code Release of ECCV 2020 Spotlight paper for Side-Aware Boundary Localization for More Precise Object Detection (open-mmlab#3603)

* add sabl two stage
* add sabl retina
* ret cfg bug fix
* test bug fix
* minor update
* update
* add r101 two stage
* update
* add cfgs
* add cfgs
* update cfgs
* format
* format
* add readme
* fix isort
* update
* update readme
* add doc string for sabl retina head
* add doc string and rename some functions
* add docstring for bucketing coder
* update docstring
* bucket_num -> num_buckets
* bucket_pw -> bucket_w bucket_ph -> bucket_h
* update label2onehot
* update bucketing bbox coder doc
* update
* typo fix
* bboxes_ -> rescaled_bboxes
* rename some params in sabl head
* init with mmcv.cnn
* update doc
* rename pos->post
* update cfgs
* update test cfg
* update
* add unitest for sabl head
* add unitest for sabl retina
* rename
* minor rename
* minor update
* update docstring
* update
* use F.one_hot
* update docstring
* update test heads
* update ReadMe
* fix
1 parent 08d1402, commit 26562a1
Showing 22 changed files with 2,365 additions and 19 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Side-Aware Boundary Localization for More Precise Object Detection | ||
|
||
## Introduction | ||
|
||
We provide config files to reproduce the object detection results in the ECCV 2020 Spotlight paper [Side-Aware Boundary Localization for More Precise Object Detection](https://arxiv.org/abs/1912.04260). | ||
|
||
``` | ||
@inproceedings{Wang_2020_ECCV, | ||
title = {Side-Aware Boundary Localization for More Precise Object Detection}, | ||
author = {Wang, Jiaqi and Zhang, Wenwei and Cao, Yuhang and Chen, Kai and Pang, Jiangmiao and Gong, Tao and Shi, Jianping and Loy, Chen Change and Lin, Dahua}, | ||
booktitle = {ECCV}, | ||
year = {2020} | ||
} | ||
``` | ||
|
||
## Results and Models | ||
|
||
The results on COCO 2017 val are shown in the table below (results on test-dev are usually slightly higher than on val). | ||
Single-scale testing (1333x800) is adopted in all results. | ||
|
||
|
||
| Method | Backbone | Lr schd | ms-train | box AP | Download | | ||
| :----------------: | :-------: | :-----: | :------: | :----: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | | ||
| SABL Faster R-CNN | R-50-FPN | 1x | N | 39.9 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r50_fpn_1x_coco/sabl_faster_rcnn_r50_fpn_1x_coco-e867595b.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r50_fpn_1x_coco/20200830_130324.log.json) | | ||
| SABL Faster R-CNN | R-101-FPN | 1x | N | 41.7 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r101_fpn_1x_coco/sabl_faster_rcnn_r101_fpn_1x_coco-f804c6c1.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r101_fpn_1x_coco/20200830_183949.log.json) | | ||
| SABL Cascade R-CNN | R-50-FPN | 1x | N | 41.6 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r50_fpn_1x_coco/sabl_cascade_rcnn_r50_fpn_1x_coco-e1748e5e.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r50_fpn_1x_coco/20200831_033726.log.json) | | ||
| SABL Cascade R-CNN | R-101-FPN | 1x | N | 43.0 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r101_fpn_1x_coco/sabl_cascade_rcnn_r101_fpn_1x_coco-2b83e87c.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r101_fpn_1x_coco/20200831_141745.log.json) | | ||
|
||
| Method | Backbone | GN | Lr schd | ms-train | box AP | Download | | ||
| :------------: | :-------: | :---: | :-----: | :---------: | :----: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | | ||
| SABL RetinaNet | R-50-FPN | N | 1x | N | 37.7 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_1x_coco/sabl_retinanet_r50_fpn_1x_coco-6c54fd4f.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_1x_coco/20200830_053451.log.json) | | ||
| SABL RetinaNet | R-50-FPN | Y | 1x | N | 38.8 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_gn_1x_coco/sabl_retinanet_r50_fpn_gn_1x_coco-e16dfcf1.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_gn_1x_coco/20200831_141955.log.json) | | ||
| SABL RetinaNet | R-101-FPN | N | 1x | N | 39.7 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_1x_coco/sabl_retinanet_r101_fpn_1x_coco-42026904.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_1x_coco/20200831_034256.log.json) | | ||
| SABL RetinaNet | R-101-FPN | Y | 1x | N | 40.5 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_1x_coco/sabl_retinanet_r101_fpn_gn_1x_coco-40a893e8.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_1x_coco/20200830_201422.log.json) | | ||
| SABL RetinaNet | R-101-FPN | Y | 2x | Y (640~800) | 42.9 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco-1e63382c.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco/20200830_144807.log.json) | | ||
| SABL RetinaNet | R-101-FPN | Y | 2x | Y (480~960) | 43.6 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco-5342f857.pth) &#124; [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco/20200830_164537.log.json) | |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
_base_ = [ | ||
'../_base_/models/cascade_rcnn_r50_fpn.py', | ||
'../_base_/datasets/coco_detection.py', | ||
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py' | ||
] | ||
# model settings | ||
model = dict( | ||
pretrained='torchvision://resnet101', | ||
backbone=dict(depth=101), | ||
roi_head=dict(bbox_head=[ | ||
dict( | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, | ||
loss_weight=1.0)), | ||
dict( | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.5), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, | ||
loss_weight=1.0)), | ||
dict( | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.3), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, loss_weight=1.0)) | ||
])) |
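
The three `SABLHead` entries in this config are identical except for the `scale_factor` passed to `BucketingBBoxCoder` (1.7, 1.5, 1.3 for stages one to three). As a reading aid only, the sketch below builds the same list from one shared template. `SABL_HEAD_TEMPLATE` and `make_sabl_head` are hypothetical names that do not appear in the commit, which writes each stage out in full.

```python
import copy

# Hypothetical template: the fields shared by all three cascade stages,
# with values copied from the config above.
SABL_HEAD_TEMPLATE = dict(
    type='SABLHead',
    num_classes=80,
    cls_in_channels=256,
    reg_in_channels=256,
    roi_feat_size=7,
    reg_feat_up_ratio=2,
    reg_pre_kernel=3,
    reg_post_kernel=3,
    reg_pre_num=2,
    reg_post_num=1,
    cls_out_channels=1024,
    reg_offset_out_channels=256,
    reg_cls_out_channels=256,
    num_cls_fcs=1,
    num_reg_fcs=0,
    reg_class_agnostic=True,
    norm_cfg=None,
    # scale_factor is the only per-stage value; it is filled in below.
    bbox_coder=dict(
        type='BucketingBBoxCoder', num_buckets=14, scale_factor=None),
    loss_cls=dict(
        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
    loss_bbox_cls=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
    loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, loss_weight=1.0))


def make_sabl_head(scale_factor):
    """Return a deep copy of the template with the stage's scale_factor."""
    head = copy.deepcopy(SABL_HEAD_TEMPLATE)
    head['bbox_coder']['scale_factor'] = scale_factor
    return head


# One head per cascade stage; the search region shrinks stage by stage.
bbox_heads = [make_sabl_head(s) for s in (1.7, 1.5, 1.3)]
```

The deep copy matters here: the nested `bbox_coder` dict would otherwise be shared, and all three stages would end up with the last `scale_factor`.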
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
_base_ = [ | ||
'../_base_/models/cascade_rcnn_r50_fpn.py', | ||
'../_base_/datasets/coco_detection.py', | ||
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py' | ||
] | ||
# model settings | ||
model = dict( | ||
roi_head=dict(bbox_head=[ | ||
dict( | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, | ||
loss_weight=1.0)), | ||
dict( | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.5), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, | ||
loss_weight=1.0)), | ||
dict( | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.3), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, loss_weight=1.0)) | ||
])) |
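
To make `num_buckets=14` and `scale_factor` more concrete, here is a minimal pure-Python sketch of how a proposal might be enlarged and divided into buckets for one side (the left boundary). It is a simplified reading of the bucketing idea, not the `BucketingBBoxCoder` code added in this commit. The function names, the choice of `ceil(num_buckets / 2)` buckets per side, and the sign convention of the offset are all assumptions made for illustration.

```python
import math


def scale_box(box, scale_factor):
    """Enlarge an (x1, y1, x2, y2) box about its centre by scale_factor."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = (x2 - x1) * scale_factor, (y2 - y1) * scale_factor
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


def left_bucket_centers(box, num_buckets=14, scale_factor=1.7):
    """Centres of the buckets searched for the left boundary.

    Assumption: the scaled width is split into num_buckets equal buckets
    and each side only searches the ceil(num_buckets / 2) nearest to it.
    """
    x1, _, x2, _ = scale_box(box, scale_factor)
    bucket_w = (x2 - x1) / num_buckets
    side_num = math.ceil(num_buckets / 2)
    return [x1 + (i + 0.5) * bucket_w for i in range(side_num)]


def encode_left_side(box, gt_x1, num_buckets=14, scale_factor=1.7):
    """Return (bucket label, offset in bucket widths) for a ground-truth x1."""
    centers = left_bucket_centers(box, num_buckets, scale_factor)
    x1, _, x2, _ = scale_box(box, scale_factor)
    bucket_w = (x2 - x1) / num_buckets
    # Classification target: the bucket whose centre is nearest to gt_x1.
    label = min(range(len(centers)), key=lambda i: abs(centers[i] - gt_x1))
    # Regression target: signed distance to that centre, in bucket widths.
    offset = (centers[label] - gt_x1) / bucket_w
    return label, offset


proposal = (100.0, 100.0, 200.0, 200.0)
label, offset = encode_left_side(proposal, gt_x1=98.0)
```

With a smaller `scale_factor` (1.5, then 1.3 in later stages) the same 14 buckets cover a narrower region, so each bucket is finer.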
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
_base_ = [ | ||
'../_base_/models/faster_rcnn_r50_fpn.py', | ||
'../_base_/datasets/coco_detection.py', | ||
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py' | ||
] | ||
model = dict( | ||
pretrained='torchvision://resnet101', | ||
backbone=dict(depth=101), | ||
roi_head=dict( | ||
bbox_head=dict( | ||
_delete_=True, | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, | ||
loss_weight=1.0)))) |
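
The `_delete_=True` key in `bbox_head` tells the config system to discard the head inherited from `_base_` rather than merge into it key by key. The sketch below is a simplified model of that behaviour and is not mmcv's implementation. `merge_cfg`, `SomeBaseHead` and `SomeExtractor` are hypothetical names used only for illustration.

```python
def merge_cfg(child, base):
    """Recursively merge `child` into a copy of `base`.

    A child dict carrying `_delete_=True` replaces the base entry outright
    instead of being merged into it key by key.
    """
    merged = dict(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # copy so the caller's dict is not mutated
            delete = value.pop('_delete_', False)
            if not delete and isinstance(merged.get(key), dict):
                merged[key] = merge_cfg(value, merged[key])
            else:
                merged[key] = merge_cfg(value, {})
        else:
            merged[key] = value
    return merged


# Hypothetical stand-in for the inherited base model config.
base = dict(
    roi_head=dict(
        bbox_roi_extractor=dict(type='SomeExtractor'),
        bbox_head=dict(type='SomeBaseHead', fc_out_channels=1024)))

# Child config in the style of the file above (most fields omitted).
child = dict(
    roi_head=dict(
        bbox_head=dict(_delete_=True, type='SABLHead', num_classes=80)))

merged = merge_cfg(child, base)
```

Without `_delete_=True`, keys that only the base head defines (here the hypothetical `fc_out_channels`) would survive the merge and be passed on to `SABLHead`.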
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
_base_ = [ | ||
'../_base_/models/faster_rcnn_r50_fpn.py', | ||
'../_base_/datasets/coco_detection.py', | ||
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py' | ||
] | ||
model = dict( | ||
roi_head=dict( | ||
bbox_head=dict( | ||
_delete_=True, | ||
type='SABLHead', | ||
num_classes=80, | ||
cls_in_channels=256, | ||
reg_in_channels=256, | ||
roi_feat_size=7, | ||
reg_feat_up_ratio=2, | ||
reg_pre_kernel=3, | ||
reg_post_kernel=3, | ||
reg_pre_num=2, | ||
reg_post_num=1, | ||
cls_out_channels=1024, | ||
reg_offset_out_channels=256, | ||
reg_cls_out_channels=256, | ||
num_cls_fcs=1, | ||
num_reg_fcs=0, | ||
reg_class_agnostic=True, | ||
norm_cfg=None, | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7), | ||
loss_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), | ||
loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, | ||
loss_weight=1.0)))) |
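
The `loss_bbox_reg` entry uses `SmoothL1Loss` with `beta=0.1`. As a reminder of what `beta` controls, the sketch below evaluates the standard Smooth L1 formula on a single scalar residual. It is an illustration only: the real loss operates on tensors and applies a reduction and `loss_weight`, none of which is modelled here, and `smooth_l1` is a hypothetical helper name.

```python
def smooth_l1(diff, beta=0.1):
    """Standard Smooth L1 on one residual.

    Quadratic for |diff| < beta, linear otherwise; the two branches
    meet at |diff| == beta.
    """
    abs_diff = abs(diff)
    if abs_diff < beta:
        return 0.5 * abs_diff ** 2 / beta
    return abs_diff - 0.5 * beta


small = smooth_l1(0.05)  # falls in the quadratic region
large = smooth_l1(1.0)   # falls in the linear region
```

A small `beta` such as 0.1 keeps the quadratic region narrow, so most residuals are penalised linearly.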
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
_base_ = [ | ||
'../_base_/models/retinanet_r50_fpn.py', | ||
'../_base_/datasets/coco_detection.py', | ||
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py' | ||
] | ||
# model settings | ||
model = dict( | ||
pretrained='torchvision://resnet101', | ||
backbone=dict(depth=101), | ||
bbox_head=dict( | ||
_delete_=True, | ||
type='SABLRetinaHead', | ||
num_classes=80, | ||
in_channels=256, | ||
stacked_convs=4, | ||
feat_channels=256, | ||
approx_anchor_generator=dict( | ||
type='AnchorGenerator', | ||
octave_base_scale=4, | ||
scales_per_octave=3, | ||
ratios=[0.5, 1.0, 2.0], | ||
strides=[8, 16, 32, 64, 128]), | ||
square_anchor_generator=dict( | ||
type='AnchorGenerator', | ||
ratios=[1.0], | ||
scales=[4], | ||
strides=[8, 16, 32, 64, 128]), | ||
bbox_coder=dict( | ||
type='BucketingBBoxCoder', num_buckets=14, scale_factor=3.0), | ||
loss_cls=dict( | ||
type='FocalLoss', | ||
use_sigmoid=True, | ||
gamma=2.0, | ||
alpha=0.25, | ||
loss_weight=1.0), | ||
loss_bbox_cls=dict( | ||
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.5), | ||
loss_bbox_reg=dict( | ||
type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.5))) | ||
# training and testing settings | ||
train_cfg = dict( | ||
assigner=dict( | ||
type='ApproxMaxIoUAssigner', | ||
pos_iou_thr=0.5, | ||
neg_iou_thr=0.4, | ||
min_pos_iou=0.0, | ||
ignore_iof_thr=-1), | ||
allowed_border=-1, | ||
pos_weight=-1, | ||
debug=False) | ||
# optimizer | ||
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001) |
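
The `pos_iou_thr=0.5` and `neg_iou_thr=0.4` values in `train_cfg` set the IoU bands used to label anchors. The sketch below shows plain max-IoU thresholding under those values. It is deliberately simplified and is not `ApproxMaxIoUAssigner`: it leaves out the grouping of approximate anchors per square anchor, the `min_pos_iou` handling and `ignore_iof_thr`. The function names and the string labels are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign(anchors, gts, pos_iou_thr=0.5, neg_iou_thr=0.4):
    """Label each anchor from its best IoU with any ground-truth box.

    Returns a list of ('pos', gt_index), ('neg', None) or ('ignore', None).
    """
    results = []
    for anchor in anchors:
        overlaps = [iou(anchor, gt) for gt in gts]
        best = max(range(len(gts)), key=lambda i: overlaps[i]) if gts else None
        best_iou = overlaps[best] if best is not None else 0.0
        if best_iou >= pos_iou_thr:
            results.append(('pos', best))
        elif best_iou < neg_iou_thr:
            results.append(('neg', None))
        else:
            # Between the two thresholds: used as neither positive nor negative.
            results.append(('ignore', None))
    return results


gts = [(0.0, 0.0, 10.0, 12.0)]
anchors = [
    (0.0, 0.0, 10.0, 10.0),    # IoU 100/120, above pos_iou_thr
    (0.0, 0.0, 10.0, 27.0),    # IoU 120/270, between the thresholds
    (50.0, 50.0, 60.0, 60.0),  # no overlap
]
labels = assign(anchors, gts)
```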