commit 2373d1b
Showing 61 changed files with 119,035 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# output dir | ||
output* | ||
instant_test_output | ||
inference_test_output | ||
|
||
|
||
# *.png | ||
# *.json | ||
*.diff | ||
# *.jpg | ||
!/projects/DensePose/doc/images/*.jpg | ||
|
||
# compilation and distribution | ||
__pycache__ | ||
_ext | ||
*.pyc | ||
*.pyd | ||
*.so | ||
*.dll | ||
*.egg-info/ | ||
build/ | ||
dist/ | ||
wheels/ | ||
|
||
# pytorch/python/numpy formats | ||
*.pth | ||
*.pkl | ||
*.npy | ||
*.ts | ||
model_ts*.txt | ||
|
||
# ipython/jupyter notebooks | ||
*.ipynb | ||
**/.ipynb_checkpoints/ | ||
|
||
# Editor temporaries | ||
*.swn | ||
*.swo | ||
*.swp | ||
*~ | ||
|
||
# editor settings | ||
.idea | ||
.vscode | ||
_darcs | ||
|
||
# project dirs | ||
/detectron2/model_zoo/configs | ||
# /datasets/* | ||
!/datasets/*.* | ||
/projects/*/datasets | ||
/models | ||
/snippet | ||
res* | ||
/checkpoints | ||
detectron2 | ||
results | ||
checkpoints | ||
demo_images | ||
exps |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
## OpenDet | ||
|
||
<img src="./docs/opendet2.png" width="78%"/> | ||
|
||
> **Expanding Low-Density Latent Regions for Open-Set Object Detection (CVPR2022)**<br> | ||
> [Jiaming Han](https://csuhan.com), [Yuqiang Ren](https://github.com/Anymake), [Jian Ding](https://dingjiansw101.github.io), [Xingjia Pan](https://scholar.google.com.hk/citations?user=NaSU3eIAAAAJ&hl=zh-CN), Ke Yan, [Gui-Song Xia](http://www.captain-whu.com/xia_En.html).<br> | ||
> [arXiv preprint](https://csuhan.com/attaches/cvpr_3605_final.pdf). | ||
OpenDet2 is an implementation of OpenDet built on [detectron2](https://github.com/facebookresearch/detectron2). | ||
|
||
### Setup | ||
|
||
The code is based on [detectron2 v0.5](https://github.com/facebookresearch/detectron2/tree/v0.5). | ||
|
||
* **Installation** | ||
|
||
Here is a from-scratch setup script. | ||
|
||
``` | ||
conda create -n opendet2 python=3.8 -y | ||
conda activate opendet2 | ||
conda install pytorch=1.8.1 torchvision cudatoolkit=10.1 -c pytorch -y | ||
pip install detectron2==0.5 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.8/index.html | ||
git clone https://github.com/csuhan/opendet2.git | ||
cd opendet2 | ||
pip install -v -e . | ||
``` | ||
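The two detectron2 wheel index URLs that appear in this commit (`cu101/torch1.8` in the script above and `cu102/torch1.9` in the demo app) share one pattern. Below is a minimal sketch of building that URL for a given CUDA/PyTorch pair. `detectron2_wheel_index` is a hypothetical helper, the pattern is only inferred from those two URLs, and not every combination is necessarily published.

```python
def detectron2_wheel_index(cuda: str, torch: str) -> str:
    """Build a detectron2 wheel index URL, e.g. cuda="10.1", torch="1.8".

    Hypothetical helper: the URL pattern is inferred from the two URLs used
    in this repository and is not guaranteed for every version pair.
    """
    cuda_tag = "cu" + cuda.replace(".", "")  # "10.1" -> "cu101"
    return (
        "https://dl.fbaipublicfiles.com/detectron2/wheels/"
        f"{cuda_tag}/torch{torch}/index.html"
    )


if __name__ == "__main__":
    # URL to pass to `pip install detectron2==0.5 -f <url>`
    print(detectron2_wheel_index("10.1", "1.8"))
```

Keeping the CUDA and PyTorch versions in one place makes it harder to install a wheel that does not match the `conda install` line.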
|
||
* **Prepare datasets** | ||
|
||
Please follow [datasets/README.md](datasets/README.md) for dataset preparation. Then generate the VOC-COCO datasets: | ||
|
||
``` | ||
bash datasets/opendet2_utils/prepare_openset_voc_coco.sh | ||
# Use the data splits provided by us. | ||
cp -rf datasets/voc_coco_ann datasets/voc_coco | ||
``` | ||
|
||
### Model Zoo | ||
|
||
We report the results on VOC and VOC-COCO-20, and provide pretrained models. Please refer to the corresponding log file for full results. | ||
|
||
* **Faster R-CNN** | ||
|
||
| Method | backbone | mAP<sub>K↑</sub>(VOC) | WI<sub>↓</sub> | AOSE<sub>↓</sub> | mAP<sub>K↑</sub> | AP<sub>U↑</sub> | Download | | ||
|---------|:--------:|:--------------------------:|:-------------------:|:---------------------:|:---------------------:|:--------------------:|:------------:| | ||
| FR-CNN | R-50 | 80.06 | 19.50 | 16518 | 58.36 | 0 | [config](configs/faster_rcnn_R_50_FPN_3x_baseline.yaml) [model](https://drive.google.com/drive/folders/10uFOLLCK4N8te08-C-olRyDV-cJ-L6lU?usp=sharing) | | ||
| PROSER | R-50 | 79.42 | 20.44 | 14266 | 56.72 | 16.99 | [config](configs/faster_rcnn_R_50_FPN_3x_proser.yaml) [model](https://drive.google.com/drive/folders/1_L85gisyvDtBXPe2UbI49vrd5FoBIOI_?usp=sharing) | | ||
| ORE | R-50 | 79.80 | 18.18 | 12811 | 58.25 | 2.60 | - | | ||
| DS | R-50 | 79.70 | 16.76 | 13062 | 58.46 | 8.75 | [config](configs/faster_rcnn_R_50_FPN_3x_ds.yaml) [model](https://drive.google.com/drive/folders/1OWDjL29E2H-_lSApXqM2r8PS7ZvUNtiv?usp=sharing) | | ||
| OpenDet | R-50 | 80.02 | 12.50 | 10758 | 58.64 | 14.38 | [config](configs/faster_rcnn_R_50_FPN_3x_opendet.yaml) [model](https://drive.google.com/drive/folders/10uFOLLCK4N8te08-C-olRyDV-cJ-L6lU?usp=sharing) | | ||
| OpenDet | Swin-T | 83.29 | 10.76 | 9149 | 63.42 | 16.35 | [config](configs/faster_rcnn_Swin_T_FPN_3x_opendet.yaml) [model](https://drive.google.com/drive/folders/1j5SkEzeqr0ZnGVVZ4mzXSOvookHfvVvm?usp=sharing) | | ||
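To read the table at a glance, the relative reduction of the two open-set error metrics (WI and AOSE, lower is better) can be computed from the R-50 rows. This is a small arithmetic sketch using only the numbers in the table above; `relative_reduction` is our own illustrative helper, not part of the repository.

```python
# Values copied from the Faster R-CNN table above (R-50 backbone).
baseline = {"WI": 19.50, "AOSE": 16518}  # FR-CNN row
opendet = {"WI": 12.50, "AOSE": 10758}   # OpenDet row


def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


reductions = {
    name: relative_reduction(baseline[name], opendet[name]) for name in baseline
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower than FR-CNN")
```

Both metrics drop by roughly a third, while mAP<sub>K</sub> stays essentially unchanged (80.06 vs. 80.02 on VOC).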
|
||
* **RetinaNet** | ||
|
||
| Method | mAP<sub>K↑</sub>(VOC) | WI<sub>↓</sub> | AOSE<sub>↓</sub> | mAP<sub>K↑</sub> | AP<sub>U↑</sub> | Download | | ||
|----------------|:--------------------------:|:-------------------:|:---------------------:|:---------------------:|:--------------------:|:----------------:| | ||
| RetinaNet | 79.63 | 14.16 | 36531 | 57.32 | 0 | [config](configs/retinanet_R_50_FPN_3x_baseline.yaml) [model](https://drive.google.com/drive/folders/15fHfyA2HuXp6LfdTMBuHG6ZwtLcgvD-p?usp=sharing) | | ||
| Open-RetinaNet | 79.64 | 10.74 | 17208 | 57.32 | 10.55 | [config](configs/retinanet_R_50_FPN_3x_opendet.yaml) [model](https://drive.google.com/drive/folders/1uLRZ5bdGaoORWaP2huiyL_WyLicmWT4G?usp=sharing) | | ||
|
||
|
||
**Note**: | ||
* If you cannot access Google Drive, the models are also available on [BaiduYun](https://pan.baidu.com/s/1I4Pp40pM84aeYTNeGc0kPA) (extraction code: ABCD). | ||
* The results above come from our reimplementation, so they differ slightly from those reported in the paper. | ||
* The official code of ORE is at [OWOD](https://github.com/JosephKJ/OWOD). We do not plan to include ORE in our code. | ||
|
||
### Online Demo | ||
|
||
Try our online demo on [Hugging Face Spaces](https://huggingface.co/spaces/csuhan/opendet2). | ||
|
||
### Train and Test | ||
|
||
* **Testing** | ||
|
||
First, download pretrained weights from the model zoo, e.g., [OpenDet](https://drive.google.com/drive/folders/10uFOLLCK4N8te08-C-olRyDV-cJ-L6lU?usp=sharing). | ||
|
||
Then, run the following command: | ||
``` | ||
python tools/train_net.py --num-gpus 8 --config-file configs/faster_rcnn_R_50_FPN_3x_opendet.yaml \ | ||
--eval-only MODEL.WEIGHTS output/faster_rcnn_R_50_FPN_3x_opendet/model_final.pth | ||
``` | ||
|
||
* **Training** | ||
|
||
The training process is the same as in detectron2. | ||
``` | ||
python tools/train_net.py --num-gpus 8 --config-file configs/faster_rcnn_R_50_FPN_3x_opendet.yaml | ||
``` | ||
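The testing and training commands above differ only in the `--eval-only` flag and the trailing `MODEL.WEIGHTS` override. When launching several runs from a script, it can help to assemble the argument list programmatically. The following is a sketch under that assumption; `train_net_command` is a hypothetical helper and not part of this repository.

```python
import shlex


def train_net_command(config_file, num_gpus=8, eval_only=False, weights=None):
    """Assemble the argv list for tools/train_net.py (hypothetical helper)."""
    cmd = [
        "python", "tools/train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
    ]
    if eval_only:
        cmd.append("--eval-only")
    if weights is not None:
        # Trailing KEY VALUE pairs override config entries, as in the test command.
        cmd += ["MODEL.WEIGHTS", weights]
    return cmd


config = "configs/faster_rcnn_R_50_FPN_3x_opendet.yaml"
train_cmd = train_net_command(config)
eval_cmd = train_net_command(
    config,
    eval_only=True,
    weights="output/faster_rcnn_R_50_FPN_3x_opendet/model_final.pth",
)

print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```

Either list can be passed to `subprocess.run(cmd, check=True)` from the repository root.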
|
||
To train with the Swin-T backbone, please download [swin_tiny_patch4_window7_224.pth](https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth) and convert it to detectron2's format using [tools/convert_swin_to_d2.py](tools/convert_swin_to_d2.py). | ||
``` | ||
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth | ||
python tools/convert_swin_to_d2.py swin_tiny_patch4_window7_224.pth swin_tiny_patch4_window7_224_d2.pth | ||
``` | ||
|
||
|
||
### Citation | ||
|
||
If you find our work useful for your research, please consider citing: | ||
|
||
```BibTeX | ||
@InProceedings{han2022opendet, | ||
author = {Han, Jiaming and Ren, Yuqiang and Ding, Jian and Pan, Xingjia and Yan, Ke and Xia, Gui-Song}, | ||
title = {Expanding Low-Density Latent Regions for Open-Set Object Detection}, | ||
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, | ||
year = {2022} | ||
} | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
""" | ||
Online demo at huggingface. | ||
The link is: https://huggingface.co/spaces/csuhan/opendet2 | ||
""" | ||
import os | ||
os.system('pip install torch==1.9 torchvision') | ||
os.system('pip install detectron2==0.5 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.9/index.html') | ||
os.system('pip install timm opencv-python-headless') | ||
|
||
|
||
import gradio as gr | ||
|
||
from demo.predictor import VisualizationDemo | ||
from detectron2.config import get_cfg | ||
from opendet2 import add_opendet_config | ||
|
||
|
||
model_cfgs = { | ||
    "FR-CNN": ["configs/faster_rcnn_R_50_FPN_3x_baseline.yaml", "frcnn_r50.pth"], | ||
    "OpenDet-R50": ["configs/faster_rcnn_R_50_FPN_3x_opendet.yaml", "opendet2_r50.pth"], | ||
    "OpenDet-SwinT": ["configs/faster_rcnn_Swin_T_FPN_18e_opendet_voc.yaml", "opendet2_swint.pth"], | ||
} | ||
|
||
|
||
def setup_cfg(model): | ||
    cfg = get_cfg() | ||
    add_opendet_config(cfg) | ||
    model_cfg = model_cfgs[model] | ||
    cfg.merge_from_file(model_cfg[0]) | ||
    cfg.MODEL.WEIGHTS = model_cfg[1] | ||
    cfg.MODEL.DEVICE = "cpu" | ||
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5 | ||
    cfg.MODEL.ROI_HEADS.VIS_IOU_THRESH = 0.8 | ||
    cfg.freeze() | ||
    return cfg | ||
|
||
|
||
def inference(input, model): | ||
    cfg = setup_cfg(model) | ||
    demo = VisualizationDemo(cfg) | ||
    # use PIL, to be consistent with evaluation | ||
    predictions, visualized_output = demo.run_on_image(input) | ||
    output = visualized_output.get_image()[:, :, ::-1] | ||
    return output | ||
|
||
|
||
iface = gr.Interface( | ||
    inference, | ||
    [ | ||
        "image", | ||
        gr.inputs.Radio( | ||
            ["FR-CNN", "OpenDet-R50", "OpenDet-SwinT"], default='OpenDet-R50'), | ||
    ], | ||
    "image") | ||
|
||
iface.launch() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
_BASE_: "./Base-RCNN-FPN.yaml" | ||
MODEL: | ||
  MASK_ON: False | ||
  ROI_HEADS: | ||
    NAME: "OpenSetStandardROIHeads" | ||
    NUM_CLASSES: 81 | ||
    NUM_KNOWN_CLASSES: 20 | ||
  ROI_BOX_HEAD: | ||
    NAME: "FastRCNNSeparateConvFCHead" | ||
    OUTPUT_LAYERS: "OpenDetFastRCNNOutputLayers" | ||
    CLS_AGNOSTIC_BBOX_REG: True | ||
UPLOSS: | ||
  START_ITER: 100 | ||
  SAMPLING_METRIC: "min_score" | ||
  TOPK: 3 | ||
  ALPHA: 1.0 | ||
  WEIGHT: 1.0 | ||
ICLOSS: | ||
  OUT_DIM: 128 | ||
  QUEUE_SIZE: 256 | ||
  IN_QUEUE_SIZE: 16 | ||
  BATCH_IOU_THRESH: 0.5 | ||
  QUEUE_IOU_THRESH: 0.7 | ||
  TEMPERATURE: 0.1 | ||
  WEIGHT: 0.1 | ||
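The `_BASE_` key makes this file inherit from `./Base-RCNN-FPN.yaml` and override only the keys it lists. Below is a rough sketch of that override behaviour with plain dictionaries, assuming a recursive merge in which the child value wins. `merge_cfg` is a hypothetical helper and only a fragment of each file is reproduced, so this is an illustration rather than detectron2's actual loader.

```python
import copy


def merge_cfg(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base` (child wins)."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_cfg(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Fragment of Base-RCNN-FPN.yaml
base = {
    "MODEL": {
        "ROI_HEADS": {"NAME": "StandardROIHeads",
                      "IN_FEATURES": ["p2", "p3", "p4", "p5"]},
        "ROI_BOX_HEAD": {"NAME": "FastRCNNConvFCHead", "NUM_FC": 2},
    }
}
# Fragment of the OpenDet base config that inherits from it
child = {
    "MODEL": {
        "MASK_ON": False,
        "ROI_HEADS": {"NAME": "OpenSetStandardROIHeads",
                      "NUM_CLASSES": 81, "NUM_KNOWN_CLASSES": 20},
        "ROI_BOX_HEAD": {"NAME": "FastRCNNSeparateConvFCHead"},
    }
}

cfg = merge_cfg(base, child)
print(cfg["MODEL"]["ROI_HEADS"])
```

Keys that the child does not mention, such as `IN_FEATURES` and `NUM_FC`, keep their base values.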
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# The same as detectron2/configs/Base-RCNN-FPN.yaml | ||
MODEL: | ||
  META_ARCHITECTURE: "GeneralizedRCNN" | ||
  BACKBONE: | ||
    NAME: "build_resnet_fpn_backbone" | ||
  RESNETS: | ||
    OUT_FEATURES: ["res2", "res3", "res4", "res5"] | ||
  FPN: | ||
    IN_FEATURES: ["res2", "res3", "res4", "res5"] | ||
  ANCHOR_GENERATOR: | ||
    SIZES: [[32], [64], [128], [256], [512]] # One size for each in feature map | ||
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]] # Three aspect ratios (same for all in feature maps) | ||
  RPN: | ||
    IN_FEATURES: ["p2", "p3", "p4", "p5", "p6"] | ||
    PRE_NMS_TOPK_TRAIN: 2000 # Per FPN level | ||
    PRE_NMS_TOPK_TEST: 1000 # Per FPN level | ||
    # Detectron1 uses 2000 proposals per-batch, | ||
    # (See "modeling/rpn/rpn_outputs.py" for details of this legacy issue) | ||
    # which is approximately 1000 proposals per-image since the default batch size for FPN is 2. | ||
    POST_NMS_TOPK_TRAIN: 1000 | ||
    POST_NMS_TOPK_TEST: 1000 | ||
  ROI_HEADS: | ||
    NAME: "StandardROIHeads" | ||
    IN_FEATURES: ["p2", "p3", "p4", "p5"] | ||
  ROI_BOX_HEAD: | ||
    NAME: "FastRCNNConvFCHead" | ||
    NUM_FC: 2 | ||
    POOLER_RESOLUTION: 7 | ||
  ROI_MASK_HEAD: | ||
    NAME: "MaskRCNNConvUpsampleHead" | ||
    NUM_CONV: 4 | ||
    POOLER_RESOLUTION: 14 | ||
DATASETS: | ||
  TRAIN: ("coco_2017_train",) | ||
  TEST: ("coco_2017_val",) | ||
SOLVER: | ||
  IMS_PER_BATCH: 16 | ||
  BASE_LR: 0.02 | ||
  STEPS: (60000, 80000) | ||
  MAX_ITER: 90000 | ||
INPUT: | ||
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800) | ||
  MIN_SIZE_TEST: 800 | ||
VERSION: 2 | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# The same as detectron2/configs/Base-RetinaNet.yaml | ||
MODEL: | ||
  META_ARCHITECTURE: "RetinaNet" | ||
  BACKBONE: | ||
    NAME: "build_retinanet_resnet_fpn_backbone" | ||
  RESNETS: | ||
    OUT_FEATURES: ["res3", "res4", "res5"] | ||
  ANCHOR_GENERATOR: | ||
    SIZES: !!python/object/apply:eval ["[[x, x * 2**(1.0/3), x * 2**(2.0/3) ] for x in [32, 64, 128, 256, 512 ]]"] | ||
  FPN: | ||
    IN_FEATURES: ["res3", "res4", "res5"] | ||
  RETINANET: | ||
    IOU_THRESHOLDS: [0.4, 0.5] | ||
    IOU_LABELS: [0, -1, 1] | ||
    SMOOTH_L1_LOSS_BETA: 0.0 | ||
DATASETS: | ||
  TRAIN: ("coco_2017_train",) | ||
  TEST: ("coco_2017_val",) | ||
SOLVER: | ||
  IMS_PER_BATCH: 16 | ||
  BASE_LR: 0.01 # Note that RetinaNet uses a different default learning rate | ||
  STEPS: (60000, 80000) | ||
  MAX_ITER: 90000 | ||
INPUT: | ||
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800) | ||
VERSION: 2 | ||
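The `ANCHOR_GENERATOR.SIZES` entry in this file is a Python list comprehension embedded in the YAML. Evaluated on its own, it gives three anchor scales per pyramid level, each a factor of 2^(1/3) apart. The snippet below only reproduces that expression; the variable names are ours.

```python
# Base anchor size for each of the five feature levels, as in the config.
base_sizes = [32, 64, 128, 256, 512]

# Same expression as in ANCHOR_GENERATOR.SIZES: three scales per level.
anchor_sizes = [[x, x * 2 ** (1.0 / 3), x * 2 ** (2.0 / 3)] for x in base_sizes]

for level in anchor_sizes:
    print([round(size, 1) for size in level])
```

The third scale of one level times 2^(1/3) equals the first scale of the next level, so the scales cover each octave evenly.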
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
_BASE_: "./Base-RCNN-FPN-OPENDET.yaml" | ||
MODEL: | ||
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl" | ||
  RESNETS: | ||
    DEPTH: 50 | ||
  ROI_BOX_HEAD: | ||
    OUTPUT_LAYERS: "CosineFastRCNNOutputLayers" # the baseline uses a simple cosine FRCNN | ||
DATASETS: | ||
  TRAIN: ('voc_2007_train', 'voc_2012_trainval') | ||
  TEST: ('voc_2007_test', 'voc_coco_20_40_test', 'voc_coco_20_60_test', 'voc_coco_20_80_test', 'voc_coco_2500_test', 'voc_coco_5000_test', 'voc_coco_10000_test', 'voc_coco_20000_test') | ||
SOLVER: | ||
  STEPS: (21000, 29000) | ||
  MAX_ITER: 32000 | ||
  WARMUP_ITERS: 100 | ||
  AMP: | ||
    ENABLED: True | ||
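The `SOLVER` block above describes a step schedule: a short warmup, then learning-rate drops at iterations 21000 and 29000 until training stops at 32000. Below is a rough sketch of the resulting learning rate. It assumes the base rate of 0.02 inherited from `Base-RCNN-FPN.yaml`, plus a decay factor of 0.1 and a linear warmup starting at 0.001 of the base rate, which we take to be detectron2's defaults since this config does not set them. `lr_at` is a hypothetical helper, not detectron2's scheduler.

```python
from bisect import bisect_right


def lr_at(iteration, base_lr=0.02, steps=(21000, 29000), gamma=0.1,
          warmup_iters=100, warmup_factor=0.001):
    """Approximate learning rate at `iteration` for a warmup + multi-step schedule.

    gamma and warmup_factor are assumed defaults; they are not set in the config.
    """
    if iteration < warmup_iters:
        # Linear warmup from warmup_factor up to 1.0
        alpha = iteration / warmup_iters
        warmup = warmup_factor * (1 - alpha) + alpha
    else:
        warmup = 1.0
    # Number of milestones already reached determines the decay exponent
    return base_lr * warmup * gamma ** bisect_right(steps, iteration)


for it in (0, 50, 100, 20999, 21000, 29000, 31999):
    print(f"iter {it:>5}: lr = {lr_at(it):.6f}")
```

With these assumptions the rate is 0.02 for most of training, 0.002 after iteration 21000, and 0.0002 for the last 3000 iterations.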
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
_BASE_: "./Base-RCNN-FPN-OPENDET.yaml" | ||
MODEL: | ||
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl" | ||
  RESNETS: | ||
    DEPTH: 50 | ||
  ROI_HEADS: | ||
    NAME: "DropoutStandardROIHeads" | ||
  ROI_BOX_HEAD: | ||
    OUTPUT_LAYERS: "DropoutFastRCNNOutputLayers" | ||
DATASETS: | ||
  TRAIN: ('voc_2007_train', 'voc_2012_trainval') | ||
  TEST: ('voc_2007_test', 'voc_coco_20_40_test', 'voc_coco_20_60_test', 'voc_coco_20_80_test', 'voc_coco_2500_test', 'voc_coco_5000_test', 'voc_coco_10000_test', 'voc_coco_20000_test') | ||
SOLVER: | ||
  STEPS: (21000, 29000) | ||
  MAX_ITER: 32000 | ||
  WARMUP_ITERS: 100 | ||
  AMP: | ||
    ENABLED: True | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
_BASE_: "./Base-RCNN-FPN-OPENDET.yaml" | ||
MODEL: | ||
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl" | ||
  RESNETS: | ||
    DEPTH: 50 | ||
DATASETS: | ||
  TRAIN: ('voc_2007_train', 'voc_2012_trainval') | ||
  TEST: ('voc_2007_test', 'voc_coco_20_40_test', 'voc_coco_20_60_test', 'voc_coco_20_80_test', 'voc_coco_2500_test', 'voc_coco_5000_test', 'voc_coco_10000_test', 'voc_coco_20000_test') | ||
SOLVER: | ||
  STEPS: (21000, 29000) | ||
  MAX_ITER: 32000 | ||
  WARMUP_ITERS: 100 | ||
  AMP: | ||
    ENABLED: True | ||
|
||
# UPLOSS.WEIGHT: former two are 0.5, the last is 1.0 |