(ICCV 2021) Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection (paper) (supp) (Zhihu explanation in Chinese)
Dark environments are challenging for computer vision algorithms owing to insufficient photons and undesirable noise. To enhance object detection in dark environments, we propose a novel multitask auto-encoding transformation (MAET) model that explores the intrinsic pattern behind illumination translation. In a self-supervised manner, MAET learns the intrinsic visual structure by encoding and decoding a realistic illumination-degrading transformation that accounts for the physical noise model and image signal processing (ISP). Based on this representation, we perform object detection by decoding the bounding box coordinates and classes. To avoid over-entanglement of the two tasks, MAET disentangles the object and degrading features by imposing an orthogonal tangent regularity. This forms a parametric manifold along which multi-task predictions can be geometrically formulated by maximizing the orthogonality between the tangents along the outputs of the respective tasks. Our framework can be implemented on top of mainstream object detection architectures and trained end-to-end directly on normal object detection datasets such as VOC and COCO. We achieve state-of-the-art performance on synthetic and real-world datasets.
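The idea of keeping two tasks' tangents orthogonal can be illustrated with a small sketch. This is not the paper's formulation or this repository's code. It assumes each task contributes a single tangent vector (plain Python lists with the hypothetical names `tangent_det` and `tangent_deg`) and penalizes the absolute cosine similarity between them, which is zero exactly when the two vectors are orthogonal.

```python
import math


def orthogonality_penalty(tangent_a, tangent_b, eps=1e-12):
    """Absolute cosine similarity between two tangent vectors.

    Illustrative assumption only: 0.0 means the vectors are orthogonal,
    values near 1.0 mean they are (anti-)parallel, i.e. entangled.
    """
    if len(tangent_a) != len(tangent_b):
        raise ValueError("tangent vectors must have the same length")
    dot = sum(a * b for a, b in zip(tangent_a, tangent_b))
    norm_a = math.sqrt(sum(a * a for a in tangent_a))
    norm_b = math.sqrt(sum(b * b for b in tangent_b))
    return abs(dot) / (norm_a * norm_b + eps)


# Hypothetical tangents for the detection and degradation-decoding tasks.
tangent_det = [1.0, 0.0, 2.0]
tangent_deg = [0.0, 3.0, 0.0]
print(orthogonality_penalty(tangent_det, tangent_deg))
```

Minimizing such a penalty alongside the task losses would push the two tangents toward orthogonality; how the tangents are actually obtained from the network is left out of this sketch.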
When Human Vision Meets Machine Vision (compare with enhancement methods):
Physics-based low-light degrading transformation (unprocess -- degradation -- ISP):
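The three stages named above can be sketched on a single pixel intensity. This is a toy illustration, not the repository's pipeline: the gamma value, the darkening factor, and the two noise parameters are all hypothetical, and a real ISP includes further steps (for example white balance and demosaicing) that are omitted here.

```python
import random


def degrade_pixel(value, dark_factor=0.1, shot_noise=0.0, read_noise=0.0,
                  gamma=2.2, rng=None):
    """Toy low-light degradation of one display-space intensity in [0, 1].

    All default parameters are illustrative assumptions.
    """
    rng = rng or random.Random(0)
    # 1) Unprocess: undo gamma compression to get a roughly linear signal.
    linear = value ** gamma
    # 2) Degradation: darken, then add Gaussian noise whose variance has a
    #    signal-dependent (shot) part and a signal-independent (read) part.
    dark = linear * dark_factor
    sigma = (shot_noise * dark + read_noise ** 2) ** 0.5
    noisy = min(max(dark + rng.gauss(0.0, sigma), 0.0), 1.0)
    # 3) ISP: re-apply gamma compression to return to display space.
    return noisy ** (1.0 / gamma)


# With the default zero noise the pixel is only darkened.
print(degrade_pixel(0.8))
```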
python 3.7
pytorch 1.6.0
mmcv 1.1.5 (for example, with CUDA 10.1 and torch 1.6.0: pip install mmcv-full==1.1.5 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html; for details, see https://github.com/open-mmlab/mmcv)
matplotlib opencv-python Pillow tqdm scipy
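The index URL in the mmcv entry is built from the CUDA and PyTorch versions. As a small sketch, the URL for another combination can be derived as below. The helper name `mmcv_index_url` is hypothetical, the pattern is inferred from the example URL above, and prebuilt wheels may not exist for every CUDA/PyTorch pair.

```python
def mmcv_index_url(cuda_version, torch_version):
    """Build the mmcv-full wheel index URL for a CUDA/PyTorch pair.

    The URL pattern is inferred from the example in this README.
    """
    cuda_tag = "cu" + cuda_version.replace(".", "")  # "10.1" -> "cu101"
    return ("https://download.openmmlab.com/mmcv/dist/"
            f"{cuda_tag}/torch{torch_version}/index.html")


url = mmcv_index_url("10.1", "1.6.0")
print(f"pip install mmcv-full==1.1.5 -f {url}")
```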
dataset | model | size | logs |
---|---|---|---|
MAET-COCO (ours) 80 class | (google drive) (baiduyun, passwd:1234) | 489.10 MB | - |
MAET-EXDark (ours) (77.7) 20 class | (google drive) (baiduyun, passwd:1234) | 470.26 MB | google drive |
EXDark (76.8) 20 class | (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
EXDark (MBLLEN) (76.3) 20 class | (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
EXDark (Kind) (76.3) 20 class | (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
EXDark (Zero-DCE) (76.9) 20 class | (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
MAET-UG2-DarkFace (ours) (56.2) 1 class | (google drive) (baiduyun, passwd:1234) | 469.81 MB | - |
Step-1:
For the MS COCO dataset (used for pre-training): Download the COCO 2017 dataset.
For the EXDark dataset (used for fine-tuning and evaluation): Download EXDark (including the EXDark images enhanced by MBLLEN, Zero-DCE, and KIND) in VOC format from google drive or baiduyun, passwd:1234. The EXDark dataset should look like this:
EXDark
│
│
└───JPEGImages
│ │───IMGS (original low light)
│ │───IMGS_Kind (images enhanced by [Kind, MM 2019])
│ │───IMGS_ZeroDCE (images enhanced by [Zero-DCE, CVPR 2020])
│ │───IMGS_MEBBLN (images enhanced by [MBLLEN, BMVC 2018])
│───Annotations
│───main
│───label
For the UG2-DarkFace dataset (used for fine-tuning and evaluation): Download UG2 in VOC format from google drive or baiduyun, passwd:1234. The UG2-DarkFace dataset should look like this:
UG2
│
└───main
│───xml
│───label
│───imgs
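Before training, it can help to confirm that the downloaded datasets match the two layouts above. The following is a minimal sketch: the helper name `missing_dirs` and the `data/...` root paths are hypothetical, and it assumes the listed folders sit directly under each dataset root as the trees suggest.

```python
from pathlib import Path

# Sub-directories taken from the two directory trees above.
EXDARK_DIRS = [
    "JPEGImages/IMGS",
    "JPEGImages/IMGS_Kind",
    "JPEGImages/IMGS_ZeroDCE",
    "JPEGImages/IMGS_MEBBLN",
    "Annotations",
    "main",
    "label",
]
UG2_DIRS = ["main", "xml", "label", "imgs"]


def missing_dirs(root, expected):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


# Hypothetical dataset locations; replace them with your own paths.
for name, root, expected in [("EXDark", "data/EXDark", EXDARK_DIRS),
                             ("UG2", "data/UG2", UG2_DIRS)]:
    missing = missing_dirs(root, expected)
    print(name, "OK" if not missing else f"missing: {missing}")
```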
Step-2: Clone the repository, cd into "your_project_path", and run the set-up process (see mmdetection for details):
git clone git@github.com:cuiziteng/ICCV_MAET.git
cd "your project path"
pip install -r requirements/build.txt
pip install -v -e . # or "python setup.py develop"
Step-3: Change the data paths in line1 and line2 to your own COCO and EXDark paths, and the path in line3 to your own UG2-DarkFace path.
Testing MAET-YOLOV3 on (low-light) COCO dataset
python tools/test.py configs/MAET_yolo/maet_yolo_coco_ort.py [COCO model path] --eval bbox --show-dir [save dir]
Testing MAET-YOLOV3 on EXDark dataset
python tools/test.py configs/MAET_yolo/maet_yolo_exdark.py [EXDark model path] --eval mAP --show-dir [save dir]
Testing MAET-YOLOV3 on UG2-DarkFace dataset
python tools/test.py configs/MAET_yolo/maet_yolo_ug2.py [UG2-DarkFace model path] --eval mAP --show-dir [save dir]
Comparative Experiment
Testing YOLOV3 on the EXDark dataset enhanced by MBLLEN / Kind / Zero-DCE
python tools/test.py configs/MAET_yolo/yolo_mbllen.py (yolo_kind.py, yolo_zero_dce.py) [MBLLEN/ Kind/ Zero-DCE model] --eval mAP --show-dir [save dir]
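Because the comparative tests differ only in the config file and the checkpoint, the three runs can be scripted. The sketch below only assembles the command lines from the config names given above. The checkpoint paths and output directories are hypothetical placeholders, and nothing is executed.

```python
# Config names come from the command above; checkpoint paths are placeholders.
COMPARISONS = {
    "yolo_mbllen.py": "checkpoints/exdark_mbllen.pth",
    "yolo_kind.py": "checkpoints/exdark_kind.pth",
    "yolo_zero_dce.py": "checkpoints/exdark_zero_dce.pth",
}


def build_test_command(config_name, checkpoint, show_dir):
    """Assemble the tools/test.py command for one enhancement baseline."""
    return [
        "python", "tools/test.py", f"configs/MAET_yolo/{config_name}",
        checkpoint, "--eval", "mAP", "--show-dir", show_dir,
    ]


for config_name, checkpoint in COMPARISONS.items():
    show_dir = "results/" + config_name[:-len(".py")]
    print(" ".join(build_test_command(config_name, checkpoint, show_dir)))
```

Each printed line can be pasted into a shell, or the list form can be passed to `subprocess.run` once the placeholder paths point at real files.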
Step-1: Pre-train the MAET-COCO model (273 epochs on 4 GPUs; if you use a different number of GPUs, please adjust the learning rate accordingly), or directly download our pre-trained COCO model (google drive) (baiduyun, passwd:1234).
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=[port number] bash ./tools/dist_train_maet.sh configs/MAET_yolo/maet_yolo_coco_ort.py 4
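When training on a different number of GPUs, one common convention is the linear scaling rule: scale the learning rate in proportion to the GPU count, assuming the per-GPU batch size stays the same. This README does not prescribe a specific rule, so treat the following as a sketch. The reference learning rate used here is hypothetical; the real value is in the config file.

```python
def scale_learning_rate(reference_lr, reference_gpus, new_gpus):
    """Linear scaling rule: the learning rate grows in proportion to the
    number of GPUs, assuming the per-GPU batch size is unchanged."""
    if reference_gpus <= 0 or new_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return reference_lr * new_gpus / reference_gpus


# Hypothetical reference value; read the real one from the config file.
reference_lr = 1e-3
for gpus in (1, 2, 8):
    print(gpus, scale_learning_rate(reference_lr, reference_gpus=4, new_gpus=gpus))
```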
Step-2 (EXDark): Fine-tune on the EXDark dataset (25 epochs on 1 GPU):
python tools/train.py configs/MAET_yolo/maet_yolo_exdark.py --gpu-ids [gpu id] --load-from [COCO model path]
Step-2 (UG2-DarkFace): Fine-tune on the UG2-DarkFace dataset (20 epochs on 1 GPU):
python tools/train.py configs/MAET_yolo/maet_yolo_ug2.py --gpu-ids [gpu id] --load-from [COCO model path]
Comparative Experiment
Fine-tune on the EXDark dataset enhanced by MBLLEN / Kind / Zero-DCE (25 epochs on 1 GPU), starting from a well-trained normal COCO model (608x608) for fairness
python tools/train.py configs/MAET_yolo/yolo_mbllen.py (yolo_kind.py, yolo_zero_dce.py) --gpu-ids [gpu id]
Baselines on the EXDark dataset (updated) with the YOLO-V3 object detector:
class | Bicycle | Boat | Bottle | Bus | Car | Cat | Chair | Cup | Dog | Motorbike | People | Table | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Baseline | 79.8 | 75.3 | 78.1 | 92.3 | 83.0 | 68.0 | 69.0 | 79.0 | 78.0 | 77.3 | 81.5 | 55.5 | 76.4 |
KIND (MM 2019) | 80.1 | 77.7 | 77.2 | 93.8 | 83.9 | 66.9 | 68.7 | 77.4 | 79.3 | 75.3 | 80.9 | 53.8 | 76.3 |
MBLLEN (BMVC 2018) | 82.0 | 77.3 | 76.5 | 91.3 | 84.0 | 67.6 | 69.1 | 77.6 | 80.4 | 75.6 | 81.9 | 58.6 | 76.8 |
Zero-DCE (CVPR 2020) | 84.1 | 77.6 | 78.3 | 93.1 | 83.7 | 70.3 | 69.8 | 77.6 | 77.4 | 76.3 | 81.0 | 53.6 | 76.9 |
MAET (ICCV 2021) | 83.1 | 78.5 | 75.6 | 92.9 | 83.1 | 73.4 | 71.3 | 79.0 | 79.8 | 77.2 | 81.1 | 57.0 | 77.7 |
DENet (ACCV 2022) | 80.4 | 79.7 | 77.9 | 91.2 | 82.7 | 72.8 | 69.9 | 80.1 | 77.2 | 76.7 | 82.0 | 57.2 | 77.3 |
IAT-YOLO (BMVC 2022) | 79.8 | 76.9 | 78.6 | 92.5 | 83.8 | 73.6 | 72.4 | 78.6 | 79.0 | 79.0 | 81.1 | 57.7 | 77.8 |
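The Total column appears to be the unweighted mean of the twelve per-class AP values. A quick check on two rows copied from the table above gives 76.4 for the baseline and 77.7 for MAET after rounding to one decimal, matching the reported totals:

```python
# Per-class AP values copied from the table above (12 EXDark classes).
ROWS = {
    "Baseline": [79.8, 75.3, 78.1, 92.3, 83.0, 68.0,
                 69.0, 79.0, 78.0, 77.3, 81.5, 55.5],
    "MAET (ICCV 2021)": [83.1, 78.5, 75.6, 92.9, 83.1, 73.4,
                         71.3, 79.0, 79.8, 77.2, 81.1, 57.0],
}


def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)


for name, aps in ROWS.items():
    print(f"{name}: {mean_ap(aps):.1f}")
```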
If our work helps your research, please cite our paper. Thanks.
@InProceedings{Cui_2021_ICCV,
author = {Cui, Ziteng and Qi, Guo-Jun and Gu, Lin and You, Shaodi and Zhang, Zenghui and Harada, Tatsuya},
title = {Multitask AET With Orthogonal Tangent Regularity for Dark Object Detection},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {2553-2562}
}
If you are also interested in low-light image enhancement and exposure correction, please refer to our BMVC 2022 project, Illumination Adaptive Transformer.
The code is largely borrowed from mmdetection and unprocess. Thanks to the authors for their wonderful work.
MMdetection: mmdetection (v2.7.0)
Unprocessing Images for Learned Raw Denoising: unprocess