Commit 2373d1b (init)
csuhan committed Mar 22, 2022, 0 parents
Showing 61 changed files with 119,035 additions and 0 deletions.
60 changes: 60 additions & 0 deletions .gitignore
@@ -0,0 +1,60 @@
# output dir
output*
instant_test_output
inference_test_output


# *.png
# *.json
*.diff
# *.jpg
!/projects/DensePose/doc/images/*.jpg

# compilation and distribution
__pycache__
_ext
*.pyc
*.pyd
*.so
*.dll
*.egg-info/
build/
dist/
wheels/

# pytorch/python/numpy formats
*.pth
*.pkl
*.npy
*.ts
model_ts*.txt

# ipython/jupyter notebooks
*.ipynb
**/.ipynb_checkpoints/

# Editor temporaries
*.swn
*.swo
*.swp
*~

# editor settings
.idea
.vscode
_darcs

# project dirs
/detectron2/model_zoo/configs
# /datasets/*
!/datasets/*.*
/projects/*/datasets
/models
/snippet
res*
/checkpoints
detectron2
results
checkpoints
demo_images
exps
109 changes: 109 additions & 0 deletions README.md
@@ -0,0 +1,109 @@
## OpenDet

<img src="./docs/opendet2.png" width="78%"/>

> **Expanding Low-Density Latent Regions for Open-Set Object Detection (CVPR2022)**<br>
> [Jiaming Han](https://csuhan.com), [Yuqiang Ren](https://github.com/Anymake), [Jian Ding](https://dingjiansw101.github.io), [Xingjia Pan](https://scholar.google.com.hk/citations?user=NaSU3eIAAAAJ&hl=zh-CN), Ke Yan, [Gui-Song Xia](http://www.captain-whu.com/xia_En.html).<br>
> [arXiv preprint](https://csuhan.com/attaches/cvpr_3605_final.pdf).

**OpenDet2**: an implementation of OpenDet based on [detectron2](https://github.com/facebookresearch/detectron2).

### Setup

The code is based on [detectron2 v0.5](https://github.com/facebookresearch/detectron2/tree/v0.5).

* **Installation**

Here is a from-scratch setup script:

```
conda create -n opendet2 python=3.8 -y
conda activate opendet2
conda install pytorch=1.8.1 torchvision cudatoolkit=10.1 -c pytorch -y
pip install detectron2==0.5 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.8/index.html
git clone https://github.com/csuhan/opendet2.git
cd opendet2
pip install -v -e .
```

* **Prepare datasets**

Please follow [datasets/README.md](datasets/README.md) for dataset preparation, then generate the VOC-COCO datasets:

```
bash datasets/opendet2_utils/prepare_openset_voc_coco.sh
# or copy the data splits provided by us:
cp -rf datasets/voc_coco_ann datasets/voc_coco
```

### Model Zoo

We report the results on VOC and VOC-COCO-20, and provide pretrained models. Please refer to the corresponding log file for full results.

* **Faster R-CNN**

| Method | backbone | mAP<sub>K&uarr;</sub>(VOC) | WI<sub>&darr;</sub> | AOSE<sub>&darr;</sub> | mAP<sub>K&uarr;</sub> | AP<sub>U&uarr;</sub> | Download |
|---------|:--------:|:--------------------------:|:-------------------:|:---------------------:|:---------------------:|:--------------------:|:------------:|
| FR-CNN | R-50 | 80.06 | 19.50 | 16518 | 58.36 | 0 | [config](configs/faster_rcnn_R_50_FPN_3x_baseline.yaml) [model](https://drive.google.com/drive/folders/10uFOLLCK4N8te08-C-olRyDV-cJ-L6lU?usp=sharing) |
| PROSER | R-50 | 79.42 | 20.44 | 14266 | 56.72 | 16.99 | [config](configs/faster_rcnn_R_50_FPN_3x_proser.yaml) [model](https://drive.google.com/drive/folders/1_L85gisyvDtBXPe2UbI49vrd5FoBIOI_?usp=sharing) |
| ORE | R-50 | 79.80 | 18.18 | 12811 | 58.25 | 2.60 | [config]() [model]() |
| DS | R-50 | 79.70 | 16.76 | 13062 | 58.46 | 8.75 | [config](configs/faster_rcnn_R_50_FPN_3x_ds.yaml) [model](https://drive.google.com/drive/folders/1OWDjL29E2H-_lSApXqM2r8PS7ZvUNtiv?usp=sharing) |
| OpenDet | R-50 | 80.02 | 12.50 | 10758 | 58.64 | 14.38 | [config](configs/faster_rcnn_R_50_FPN_3x_opendet.yaml) [model](https://drive.google.com/drive/folders/10uFOLLCK4N8te08-C-olRyDV-cJ-L6lU?usp=sharing) |
| OpenDet | Swin-T | 83.29 | 10.76 | 9149 | 63.42 | 16.35 | [config](configs/faster_rcnn_Swin_T_FPN_3x_opendet.yaml) [model](https://drive.google.com/drive/folders/1j5SkEzeqr0ZnGVVZ4mzXSOvookHfvVvm?usp=sharing) |

* **RetinaNet**

| Method | mAP<sub>K&uarr;</sub>(VOC) | WI<sub>&darr;</sub> | AOSE<sub>&darr;</sub> | mAP<sub>K&uarr;</sub> | AP<sub>U&uarr;</sub> | Download |
|----------------|:--------------------------:|:-------------------:|:---------------------:|:---------------------:|:--------------------:|:----------------:|
| RetinaNet | 79.63 | 14.16 | 36531 | 57.32 | 0 | [config](configs/retinanet_R_50_FPN_3x_baseline.yaml) [model](https://drive.google.com/drive/folders/15fHfyA2HuXp6LfdTMBuHG6ZwtLcgvD-p?usp=sharing) |
| Open-RetinaNet | 79.64 | 10.74 | 17208 | 57.32 | 10.55 | [config](configs/retinanet_R_50_FPN_3x_opendet.yaml) [model](https://drive.google.com/drive/folders/1uLRZ5bdGaoORWaP2huiyL_WyLicmWT4G?usp=sharing) |


**Note**:
* If you cannot access Google Drive, a BaiduYun download link is available [here](https://pan.baidu.com/s/1I4Pp40pM84aeYTNeGc0kPA) (extraction code: ABCD).
* The above results come from our reimplementation, so they differ slightly from those reported in the paper.
* The official code of ORE is at [OWOD](https://github.com/JosephKJ/OWOD). We do not plan to include ORE in our code.
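The WI and AOSE columns in the tables above summarize open-set behavior: WI (Wilderness Impact) measures how much precision on known classes degrades once unknown-class images enter the test set, and AOSE (Absolute Open-Set Error) counts unknown objects misclassified as a known class. As a rough sketch only (function names and the matching format are illustrative, not part of this codebase; WI here follows the common definition WI = P_closed / P_open - 1):

```python
def wilderness_impact(p_closed: float, p_open: float) -> float:
    """WI = P_closed / P_open - 1, where P_closed is precision on known
    classes only and P_open is precision when unknown objects may be
    (wrongly) detected as known. 0 means unknowns cause no harm."""
    return p_closed / p_open - 1.0

def aose(pred_labels, gt_is_unknown) -> int:
    """Count unknown ground-truth objects whose matched prediction
    carries a known-class label (i.e., open-set errors)."""
    return sum(1 for label, unk in zip(pred_labels, gt_is_unknown)
               if unk and label != "unknown")

# example: precision drops from 0.80 to 0.72 once unknowns enter the test set
print(round(wilderness_impact(0.80, 0.72), 3))  # positive => unknowns hurt
print(aose(["cat", "unknown", "dog"], [True, True, False]))
```

Lower WI and AOSE are better, which is why the tables mark them with a down arrow.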

### Online Demo

Try our online demo at [Hugging Face Spaces](https://huggingface.co/spaces/csuhan/opendet2).

### Train and Test

* **Testing**

First, download the pretrained weights from the Model Zoo, e.g., [OpenDet](https://drive.google.com/drive/folders/10uFOLLCK4N8te08-C-olRyDV-cJ-L6lU?usp=sharing).

Then, run the following command:
```
python tools/train_net.py --num-gpus 8 --config-file configs/faster_rcnn_R_50_FPN_3x_opendet.yaml \
--eval-only MODEL.WEIGHTS output/faster_rcnn_R_50_FPN_3x_opendet/model_final.pth
```

* **Training**

Training follows the standard detectron2 workflow:
```
python tools/train_net.py --num-gpus 8 --config-file configs/faster_rcnn_R_50_FPN_3x_opendet.yaml
```

To train with the Swin-T backbone, please download [swin_tiny_patch4_window7_224.pth](https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth) and convert it to detectron2's format using [tools/convert_swin_to_d2.py](tools/convert_swin_to_d2.py).
```
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth
python tools/convert_swin_to_d2.py swin_tiny_patch4_window7_224.pth swin_tiny_patch4_window7_224_d2.pth
```
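The conversion step above remaps checkpoint keys into detectron2's naming scheme. As an illustration only (the actual mapping in `tools/convert_swin_to_d2.py` may differ, and the prefix below is an assumption), such converters typically just rename state-dict keys so detectron2's backbone wrapper can find them:

```python
def remap_keys(state_dict: dict, prefix: str = "backbone.bottom_up.") -> dict:
    """Prefix every parameter name for detectron2's FPN backbone wrapper.
    The prefix is a hypothetical example, not the script's actual choice."""
    return {prefix + k: v for k, v in state_dict.items()}

# a tiny stand-in for a real Swin checkpoint's state dict
ckpt = {"patch_embed.proj.weight": [0.1], "layers.0.blocks.0.norm1.weight": [1.0]}
print(sorted(remap_keys(ckpt)))
```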


### Citation

If you find our work useful for your research, please consider citing:

```BibTeX
@InProceedings{han2022opendet,
author = {Han, Jiaming and Ren, Yuqiang and Ding, Jian and Pan, Xingjia and Yan, Ke and Xia, Gui-Song},
title = {Expanding Low-Density Latent Regions for Open-Set Object Detection},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022}
}
```
56 changes: 56 additions & 0 deletions app.py
@@ -0,0 +1,56 @@
"""
Online demo at huggingface.
The link is: https://huggingface.co/spaces/csuhan/opendet2
"""
import os
os.system('pip install torch==1.9 torchvision')
os.system('pip install detectron2==0.5 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.9/index.html')
os.system('pip install timm opencv-python-headless')


import gradio as gr

from demo.predictor import VisualizationDemo
from detectron2.config import get_cfg
from opendet2 import add_opendet_config


model_cfgs = {
    "FR-CNN": ["configs/faster_rcnn_R_50_FPN_3x_baseline.yaml", "frcnn_r50.pth"],
    "OpenDet-R50": ["configs/faster_rcnn_R_50_FPN_3x_opendet.yaml", "opendet2_r50.pth"],
    "OpenDet-SwinT": ["configs/faster_rcnn_Swin_T_FPN_18e_opendet_voc.yaml", "opendet2_swint.pth"],
}


def setup_cfg(model):
    cfg = get_cfg()
    add_opendet_config(cfg)
    model_cfg = model_cfgs[model]
    cfg.merge_from_file(model_cfg[0])
    cfg.MODEL.WEIGHTS = model_cfg[1]
    cfg.MODEL.DEVICE = "cpu"
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
    cfg.MODEL.ROI_HEADS.VIS_IOU_THRESH = 0.8
    cfg.freeze()
    return cfg


def inference(image, model):
    cfg = setup_cfg(model)
    demo = VisualizationDemo(cfg)
    # use PIL, to be consistent with evaluation
    predictions, visualized_output = demo.run_on_image(image)
    # BGR -> RGB for display
    output = visualized_output.get_image()[:, :, ::-1]
    return output


iface = gr.Interface(
    inference,
    [
        "image",
        gr.inputs.Radio(
            ["FR-CNN", "OpenDet-R50", "OpenDet-SwinT"], default="OpenDet-R50"),
    ],
    "image")

iface.launch()
25 changes: 25 additions & 0 deletions configs/Base-RCNN-FPN-OPENDET.yaml
@@ -0,0 +1,25 @@
_BASE_: "./Base-RCNN-FPN.yaml"
MODEL:
MASK_ON: False
ROI_HEADS:
NAME: "OpenSetStandardROIHeads"
NUM_CLASSES: 81
NUM_KNOWN_CLASSES: 20
ROI_BOX_HEAD:
NAME: "FastRCNNSeparateConvFCHead"
OUTPUT_LAYERS: "OpenDetFastRCNNOutputLayers"
CLS_AGNOSTIC_BBOX_REG: True
UPLOSS:
START_ITER: 100
SAMPLING_METRIC: "min_score"
TOPK: 3
ALPHA: 1.0
WEIGHT: 1.0
ICLOSS:
OUT_DIM: 128
QUEUE_SIZE: 256
IN_QUEUE_SIZE: 16
BATCH_IOU_THRESH: 0.5
QUEUE_IOU_THRESH: 0.7
TEMPERATURE: 0.1
WEIGHT: 0.1
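The `_BASE_:` key above makes this file inherit everything from `Base-RCNN-FPN.yaml`, with local keys overriding inherited ones. A minimal sketch of that merge semantics (detectron2's actual config loader also resolves file paths, checks types, and supports deeper nesting):

```python
def deep_merge(base: dict, child: dict) -> dict:
    """Recursively overlay `child` on `base`; child wins on conflicts,
    while keys the child does not mention are inherited unchanged."""
    out = dict(base)
    for k, v in child.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = deep_merge(out[k], v)
        else:
            out[k] = v
    return out

base = {"MODEL": {"ROI_HEADS": {"NAME": "StandardROIHeads", "IN_FEATURES": ["p2"]}}}
child = {"MODEL": {"ROI_HEADS": {"NAME": "OpenSetStandardROIHeads"}}}
merged = deep_merge(base, child)
print(merged["MODEL"]["ROI_HEADS"])  # NAME overridden, IN_FEATURES inherited
```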
44 changes: 44 additions & 0 deletions configs/Base-RCNN-FPN.yaml
@@ -0,0 +1,44 @@
# The same as detectron2/configs/Base-RCNN-FPN.yaml
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  BACKBONE:
    NAME: "build_resnet_fpn_backbone"
  RESNETS:
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
  FPN:
    IN_FEATURES: ["res2", "res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: [[32], [64], [128], [256], [512]]  # One size for each in feature map
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]]  # Three aspect ratios (same for all in feature maps)
  RPN:
    IN_FEATURES: ["p2", "p3", "p4", "p5", "p6"]
    PRE_NMS_TOPK_TRAIN: 2000  # Per FPN level
    PRE_NMS_TOPK_TEST: 1000  # Per FPN level
    # Detectron1 uses 2000 proposals per-batch,
    # (See "modeling/rpn/rpn_outputs.py" for details of this legacy issue)
    # which is approximately 1000 proposals per-image since the default batch size for FPN is 2.
    POST_NMS_TOPK_TRAIN: 1000
    POST_NMS_TOPK_TEST: 1000
  ROI_HEADS:
    NAME: "StandardROIHeads"
    IN_FEATURES: ["p2", "p3", "p4", "p5"]
  ROI_BOX_HEAD:
    NAME: "FastRCNNConvFCHead"
    NUM_FC: 2
    POOLER_RESOLUTION: 7
  ROI_MASK_HEAD:
    NAME: "MaskRCNNConvUpsampleHead"
    NUM_CONV: 4
    POOLER_RESOLUTION: 14
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
SOLVER:
  IMS_PER_BATCH: 16
  BASE_LR: 0.02
  STEPS: (60000, 80000)
  MAX_ITER: 90000
INPUT:
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  MIN_SIZE_TEST: 800
VERSION: 2
26 changes: 26 additions & 0 deletions configs/Base-RetinaNet.yaml
@@ -0,0 +1,26 @@
# The same as detectron2/configs/Base-RetinaNet.yaml
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  BACKBONE:
    NAME: "build_retinanet_resnet_fpn_backbone"
  RESNETS:
    OUT_FEATURES: ["res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: !!python/object/apply:eval ["[[x, x * 2**(1.0/3), x * 2**(2.0/3) ] for x in [32, 64, 128, 256, 512 ]]"]
  FPN:
    IN_FEATURES: ["res3", "res4", "res5"]
  RETINANET:
    IOU_THRESHOLDS: [0.4, 0.5]
    IOU_LABELS: [0, -1, 1]
    SMOOTH_L1_LOSS_BETA: 0.0
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
SOLVER:
  IMS_PER_BATCH: 16
  BASE_LR: 0.01  # Note that RetinaNet uses a different default learning rate
  STEPS: (60000, 80000)
  MAX_ITER: 90000
INPUT:
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)
VERSION: 2
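The `!!python/object/apply:eval` tag in the ANCHOR_GENERATOR block computes RetinaNet's anchor sizes at config-load time: three scales per octave (2^0, 2^(1/3), 2^(2/3)) for each base size, one base size per FPN level. Evaluated by hand:

```python
# the same list comprehension the YAML eval tag runs
sizes = [[x, x * 2 ** (1.0 / 3), x * 2 ** (2.0 / 3)]
         for x in [32, 64, 128, 256, 512]]
print([round(s, 1) for s in sizes[0]])  # [32, 40.3, 50.8]
```

This yields five groups of three anchor sizes, e.g. roughly (32, 40.3, 50.8) at the finest level.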
16 changes: 16 additions & 0 deletions configs/faster_rcnn_R_50_FPN_3x_baseline.yaml
@@ -0,0 +1,16 @@
_BASE_: "./Base-RCNN-FPN-OPENDET.yaml"
MODEL:
WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
RESNETS:
DEPTH: 50
ROI_BOX_HEAD:
OUTPUT_LAYERS: "CosineFastRCNNOutputLayers" # baseline use a simple cosine FRCNN
DATASETS:
TRAIN: ('voc_2007_train', 'voc_2012_trainval')
TEST: ('voc_2007_test', 'voc_coco_20_40_test', 'voc_coco_20_60_test', 'voc_coco_20_80_test', 'voc_coco_2500_test', 'voc_coco_5000_test', 'voc_coco_10000_test', 'voc_coco_20000_test')
SOLVER:
STEPS: (21000, 29000)
MAX_ITER: 32000
WARMUP_ITERS: 100
AMP:
ENABLED: True
18 changes: 18 additions & 0 deletions configs/faster_rcnn_R_50_FPN_3x_ds.yaml
@@ -0,0 +1,18 @@
_BASE_: "./Base-RCNN-FPN-OPENDET.yaml"
MODEL:
WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
RESNETS:
DEPTH: 50
ROI_HEADS:
NAME: "DropoutStandardROIHeads"
ROI_BOX_HEAD:
OUTPUT_LAYERS: "DropoutFastRCNNOutputLayers"
DATASETS:
TRAIN: ('voc_2007_train', 'voc_2012_trainval')
TEST: ('voc_2007_test', 'voc_coco_20_40_test', 'voc_coco_20_60_test', 'voc_coco_20_80_test', 'voc_coco_2500_test', 'voc_coco_5000_test', 'voc_coco_10000_test', 'voc_coco_20000_test')
SOLVER:
STEPS: (21000, 29000)
MAX_ITER: 32000
WARMUP_ITERS: 100
AMP:
ENABLED: True
16 changes: 16 additions & 0 deletions configs/faster_rcnn_R_50_FPN_3x_opendet.yaml
@@ -0,0 +1,16 @@
_BASE_: "./Base-RCNN-FPN-OPENDET.yaml"
MODEL:
WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
RESNETS:
DEPTH: 50
DATASETS:
TRAIN: ('voc_2007_train', 'voc_2012_trainval')
TEST: ('voc_2007_test', 'voc_coco_20_40_test', 'voc_coco_20_60_test', 'voc_coco_20_80_test', 'voc_coco_2500_test', 'voc_coco_5000_test', 'voc_coco_10000_test', 'voc_coco_20000_test')
SOLVER:
STEPS: (21000, 29000)
MAX_ITER: 32000
WARMUP_ITERS: 100
AMP:
ENABLED: True

# UPLOSS.WEIGHT: former two are 0.5, the last is 1.0