commit e0fede0 (1 parent: 4d429e8)
Showing 35 changed files with 5,800 additions and 93 deletions.
@@ -0,0 +1,128 @@
# AutoNUE@CVPR 2021 Challenge
Implementation of the 1st-place solution for the AutoNUE@CVPR 2021 Challenge Semantic Segmentation Track, based on PaddlePaddle.

## Installation

#### Step 1. Install PaddlePaddle

System Requirements:
* PaddlePaddle >= 2.0.0
* Python >= 3.6

We highly recommend installing the GPU version of PaddlePaddle: segmentation models have a large memory overhead, and you may otherwise run out of memory while running them. For more detailed installation tutorials, please refer to the official [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/2.0/install/) website.

#### Step 2. Install PaddleSeg

Install PaddleSeg with the *API Calling* method for flexible development.

```shell
pip install paddleseg -U
```
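The version requirements from step 1 can be sanity-checked with a short script. This is a minimal sketch and not part of the repository: `parse_version`, `meets_minimum` and `check_environment` are hypothetical helper names, and the sketch assumes that `paddle.__version__` holds a dotted version string.

```python
import sys
from importlib import util


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Compare two version strings, padding the shorter one with zeros."""
    have, need = parse_version(version), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment():
    """Report whether Python and, if installed, PaddlePaddle meet the minimums."""
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    report = {"python": meets_minimum(python_version, "3.6")}
    if util.find_spec("paddle") is None:
        report["paddle"] = None  # PaddlePaddle is not installed here
    else:
        import paddle  # imported lazily so the script also runs without it
        report["paddle"] = meets_minimum(paddle.__version__, "2.0.0")
    return report


if __name__ == "__main__":
    print(check_environment())
```

A value of `None` for `paddle` means the package could not be found; `False` means it is installed but older than the stated minimum.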

## Data Preparation

First, download and convert the [India Driving Dataset](https://idd.insaan.iiit.ac.in/evaluation/autonue21/#bm5) following the instructions for the Segmentation Track. The IDD_Detection dataset is also needed for pseudo-labeling.

Then organize the data in the following structure:

    IDD_Segmentation
    |
    |--leftImg8bit
    |  |--train
    |  |--val
    |  |--test
    |
    |--gtFine
    |  |--train
    |  |--val
    |  |--test
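Before training, it can help to confirm that the directory tree above is in place. The following is a small sketch under the assumption that the dataset lives at a path you pass in (the configs in this commit use `data/IDD_Segmentation`); `missing_dirs` is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path

# Sub-directories expected by the layout shown above.
EXPECTED_SUBDIRS = [
    Path(kind) / split
    for kind in ("leftImg8bit", "gtFine")
    for split in ("train", "val", "test")
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [sub.as_posix() for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data/IDD_Segmentation")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

The check only looks at directory names; it does not validate the images or label files inside them.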

We make three contributions, which together ranked 1st in the challenge:
- Progressive segmentation
- Leveraging the IDD_Detection dataset to generate extra training samples by pseudo-labeling
- Decoder-enhanced Swin Transformer

## Training

### Baseline
1. Download the weights pretrained on Mapillary.

```shell
mkdir -p pretrain && cd pretrain
wget https://bj.bcebos.com/paddleseg/dygraph/cityscapes/ocrnet_hrnetw48_mapillary/pretrained.pdparams
cd ..
```
2. Set line 27 of `scripts/train.py` to `from core.val import evaluate`.
3. Run the training script.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -u -m paddle.distributed.launch train.py \
    --config configs/[email protected] --use_vdl \
    --save_dir saved_model/sscale_auto_nue_map+city@1920 --save_interval 2000 --num_workers 2 --do_eval
```

### Regional progressive segmentation
1. In `scripts/train.py`, replace line 27, `from core.val import evaluate`, with `from core.val_crop import evaluate`.
2. Run the training script.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -u -m paddle.distributed.launch train.py \
    --config configs/auto_nue_map+city_crop.yml --use_vdl \
    --save_dir saved_model/auto_nue_map+city_crop --save_interval 2000 --num_workers 2 --do_eval
```
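Switching between `core.val` and `core.val_crop` is a one-line edit that is easy to forget when moving between training modes. As a hedged sketch, the swap could be scripted as follows. `set_evaluate_import` is a hypothetical helper, and it locates the line by its pattern rather than by the line number 27, on the assumption that the script contains exactly one `from core.<module> import evaluate` line.

```python
import re
from pathlib import Path

# Matches e.g. "from core.val import evaluate" or "from core.val_crop import evaluate".
EVALUATE_IMPORT = re.compile(r"^from core\.\w+ import evaluate\s*$")


def set_evaluate_import(script_path, module):
    """Point the `evaluate` import in `script_path` at `core.<module>`."""
    path = Path(script_path)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    changed = 0
    for i, line in enumerate(lines):
        if EVALUATE_IMPORT.match(line):
            newline = "\n" if line.endswith("\n") else ""
            lines[i] = "from core.{} import evaluate{}".format(module, newline)
            changed += 1
    if changed != 1:
        # Refuse to write if the assumption of a single import line does not hold.
        raise ValueError("expected exactly one evaluate import, found {}".format(changed))
    path.write_text("".join(lines), encoding="utf-8")
```

For example, `set_evaluate_import("scripts/train.py", "val_crop")` would prepare the script for regional progressive segmentation, and `set_evaluate_import("scripts/train.py", "val")` would restore the baseline setting.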

### Pseudo-labeling
First, organize the IDD_Detection dataset as follows:

    IDD_Detection
    |
    |--JPEGImages
    |--Annotations

where `JPEGImages` and `Annotations` hold the images and XML files collected from the `IDD_Detection/FrontFar` and `IDD_Detection/FrontNear` folders.
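One possible way to gather the files is sketched below. Several details are assumptions rather than facts from this README: that images end in `.jpg`, that annotations end in `.xml`, and that flattening nested paths into prefixed file names is acceptable to the later labeling tools. `collect_detection_files` is a hypothetical helper, and the target directory should live outside the source folders.

```python
import shutil
from pathlib import Path


def collect_detection_files(source_dirs, target_root, image_suffix=".jpg"):
    """Copy images and XML annotations from `source_dirs` into a flat layout.

    Images go to <target_root>/JPEGImages and XML files to
    <target_root>/Annotations. Returns the number of copied files.
    """
    target_root = Path(target_root)
    image_dir = target_root / "JPEGImages"
    annotation_dir = target_root / "Annotations"
    image_dir.mkdir(parents=True, exist_ok=True)
    annotation_dir.mkdir(parents=True, exist_ok=True)

    copied = 0
    for source in map(Path, source_dirs):
        for path in sorted(source.rglob("*")):
            if not path.is_file():
                continue
            suffix = path.suffix.lower()
            if suffix == image_suffix:
                destination = image_dir
            elif suffix == ".xml":
                destination = annotation_dir
            else:
                continue
            # Prefix with the source folder and relative path so that
            # equally named files from different folders do not collide.
            flat_name = "_".join((source.name,) + path.relative_to(source).parts)
            shutil.copy2(path, destination / flat_name)
            copied += 1
    return copied
```

Because an image and its annotation share the same relative path, they receive the same flattened stem, which keeps each pair matched after copying.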

Then:
1. In `AutoNUE21/predict.py`, replace line 22, `from paddleseg.core import predict`, with `from core.predict_generate_autolabel import predictAutolabel`.
2. In `AutoNUE21/predict.py`, change `predict(` on line 156 to `predictAutolabel(`.
3. Run the prediction script.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m paddle.distributed.launch predict.py \
    --config configs/[email protected] \
    --model_path saved_model/sscale_auto_nue_map+city@1920/best_model/model.pdparams \
    --image_path data/IDD_Detection/JPEGImages \
    --save_dir detection_out --aug_pred --scales 1.0 1.5 2.0 --flip_horizontal
```
4. Automatically label the `traffic lights` and `traffic sign` classes from the bounding-box annotations by running `tools/IDD_labeling.py`.
5. Put the generated `pred_refine` folder under `data/IDD_Detection`.
6. Set line 27 of `scripts/train.py` to `from core.val import evaluate`.
7. Train on these pseudo-labels together with the finely annotated samples:
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -u -m paddle.distributed.launch train.py \
    --config configs/auto_nue_auto_label.yml --use_vdl \
    --save_dir saved_model/auto_nue_auto_label --save_interval 2000 --num_workers 2 --do_eval
```

### Decoder-enhanced Swin Transformer

1. Download the weights pretrained on Mapillary.

```shell
cd pretrain
wget https://bj.bcebos.com/paddleseg/dygraph/cityscapes/swin_mla_p4w7_mapillary/pretrained_swin.pdparams
cd ..
```

2. Run the training script.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -u -m paddle.distributed.launch train.py \
    --config configs/swin_transformer_mla_base_patch4_window7_160k_autonue.yml --use_vdl \
    --save_dir saved_model/swin_transformer_mla_autonue --save_interval 2000 --num_workers 2 --do_eval
```
3. Run the testing script.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m paddle.distributed.launch predict.py \
    --config configs/swin_transformer_mla_base_patch4_window7_160k_autonue.yml \
    --model_path saved_model/swin_transformer_mla_autonue/best_model/model.pdparams \
    --image_path data/IDD_Segmentation/leftImg8bit/test/ \
    --save_dir test_out_swin --aug_pred --scales 1.0 1.5 2.0 --flip_horizontal
```

## Ensemble Testing
We provide a prediction script for ensembling the `baseline`, `pseudo-labeling` and `rps` (regional progressive segmentation) models.
Run:
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m paddle.distributed.launch predict_ensemble_three.py \
    --config configs/[email protected] \
    --config_1 configs/auto_nue_auto_label.yml \
    --config_crop configs/auto_nue_map+city_crop.yml \
    --model_path saved_model/sscale_auto_nue_map+city@1920/best_model/model.pdparams \
    --model_path_1 saved_model/auto_nue_auto_label/best_model/model.pdparams \
    --model_path_crop saved_model/auto_nue_map+city_crop/best_model/model.pdparams \
    --image_path data/IDD_Segmentation/leftImg8bit/test/ \
    --save_dir test_out --aug_pred --scales 1.0 1.5 2.0 --flip_horizontal
```
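The internals of `predict_ensemble_three.py` are not shown in this part of the commit, so the following is only a generic illustration of what ensembling segmentation models usually means: average the per-class score maps of several models and pick the highest-scoring class per pixel. The function name `ensemble_labels` and the nested-list input format are assumptions made for the sketch; the actual script may combine the models differently, for instance by applying the cropped model only to a sub-region.

```python
def ensemble_labels(score_maps):
    """Average per-class scores over models and return a label map.

    `score_maps` is a list with one entry per model; each entry is a
    nested list indexed as [class][row][column] holding float scores.
    """
    num_models = len(score_maps)
    num_classes = len(score_maps[0])
    height = len(score_maps[0][0])
    width = len(score_maps[0][0][0])

    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Mean score of every class at this pixel across all models.
            mean_scores = [
                sum(model[c][y][x] for model in score_maps) / num_models
                for c in range(num_classes)
            ]
            # The predicted label is the class with the highest mean score.
            row.append(max(range(num_classes), key=mean_scores.__getitem__))
        labels.append(row)
    return labels
```

Averaging scores rather than voting on hard labels lets a confident model outweigh two uncertain ones, which is one common reason to ensemble at the score level.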
@@ -0,0 +1,74 @@
batch_size: 1
iters: 80000

model:
  type: MscaleOCRNet
  pretrained: pretrain/pretrained.pdparams
  n_scales: [1.0]
  backbone:
    type: HRNet_W48_NV
  num_classes: 26
  backbone_indices: [0]

train_dataset:
  type: AutoNueAutolabel
  dataset_root: data/IDD_Segmentation
  transforms:
    - type: Resize
      target_size: [1920, 1080]
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0
    - type: RandomPaddingCrop
      crop_size: [1920, 1080]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.25
      brightness_prob: 1
      contrast_range: 0.25
      contrast_prob: 1
      saturation_range: 0.25
      saturation_prob: 1
      hue_range: 63
      hue_prob: 1
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: train


val_dataset:
  type: AutoNueAutolabel
  dataset_root: data/IDD_Segmentation
  transforms:
    - type: Resize
      target_size: [1920, 1080]
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: val

optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 0.0001

learning_rate:
  value: 0.02
  decay:
    type: poly
    power: 2
    end_lr: 0.0

loss:
  types:
    - type: DiceLoss
    - type: DiceLoss
    - type: BootstrappedCrossEntropyLoss
      min_K: 50000
      loss_th: 0.05
    - type: BootstrappedCrossEntropyLoss
      min_K: 50000
      loss_th: 0.05
  coef: [0.4, 0.16, 1.0, 0.4]
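
The `learning_rate` block in the config above selects a `poly` decay with `power: 2` and `end_lr: 0.0`. As a rough sketch of what such a schedule computes, the standard polynomial decay formula is shown below. It assumes the decay runs over the full `iters` budget; `poly_lr` is a hypothetical helper, not the framework's own scheduler.

```python
def poly_lr(base_lr, step, total_steps, power=2.0, end_lr=0.0):
    """Polynomial learning-rate decay from `base_lr` down to `end_lr`."""
    step = min(step, total_steps)  # hold at end_lr once the budget is used up
    return (base_lr - end_lr) * (1.0 - step / total_steps) ** power + end_lr


if __name__ == "__main__":
    # Values taken from the config above: value 0.02, iters 80000, power 2.
    for step in (0, 20000, 40000, 60000, 80000):
        print(step, poly_lr(0.02, step, 80000, power=2.0, end_lr=0.0))
```

With `power: 2` the rate falls quickly early on: halfway through training it is already a quarter of its starting value.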
@@ -0,0 +1,68 @@
batch_size: 1
iters: 85000

model:
  type: MscaleOCRNet
  pretrained: pretrain/pretrained.pdparams
  n_scales: [1.0]
  backbone:
    type: HRNet_W48_NV
  num_classes: 26
  backbone_indices: [0]

train_dataset:
  type: AutoNueCrop
  dataset_root: data/IDD_Segmentation
  transforms:
    - type: Resize
      target_size: [3200, 1800]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.25
      brightness_prob: 1
      contrast_range: 0.25
      contrast_prob: 1
      saturation_range: 0.25
      saturation_prob: 1
      hue_range: 63
      hue_prob: 1
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: train


val_dataset:
  type: AutoNueCrop
  dataset_root: data/IDD_Segmentation
  transforms:
    - type: Resize
      target_size: [3200, 1800]
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: val

optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 0.0001

learning_rate:
  value: 0.005
  decay:
    type: poly
    power: 2
    end_lr: 0.0

loss:
  types:
    - type: DiceLoss
    - type: DiceLoss
    - type: BootstrappedCrossEntropyLoss
      min_K: 50000
      loss_th: 0.05
    - type: BootstrappedCrossEntropyLoss
      min_K: 50000
      loss_th: 0.05
  coef: [0.4, 0.16, 1.0, 0.4]
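
The `BootstrappedCrossEntropyLoss` entries above take `min_K` and `loss_th`. The loss implementation itself is not part of this section, so the sketch below only illustrates the usual idea behind bootstrapped (hard-pixel) cross-entropy: average the loss over the hardest pixels, keeping every pixel above the threshold but never fewer than `min_K`. `bootstrapped_mean` is a hypothetical name, and the input is assumed to be a flat list of per-pixel loss values.

```python
def bootstrapped_mean(pixel_losses, min_k, loss_th):
    """Mean loss over the hardest pixels.

    Keeps all pixels whose loss exceeds `loss_th` when there are more
    than `min_k` of them; otherwise keeps the `min_k` largest losses.
    """
    if min_k < 1:
        raise ValueError("min_k must be at least 1")
    ordered = sorted(pixel_losses, reverse=True)
    if not ordered:
        return 0.0
    if len(ordered) > min_k and ordered[min_k] > loss_th:
        # More than min_k pixels are above the threshold: keep all of them.
        kept = [value for value in ordered if value > loss_th]
    else:
        # Too few hard pixels: fall back to the top min_k losses.
        kept = ordered[:min_k]
    return sum(kept) / len(kept)
```

Pixels the model already classifies well contribute almost nothing to the average, so the gradient concentrates on difficult regions such as thin objects and class boundaries.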
@@ -0,0 +1,74 @@
batch_size: 1
iters: 80000

model:
  type: MscaleOCRNet
  pretrained: saved_model/sscale_ocr_auto_nue_map+city_ce+dice@1920/best_model/model.pdparams
  n_scales: [1.0, 1.5, 2.0]
  backbone:
    type: HRNet_W48_NV
  num_classes: 26
  backbone_indices: [0]

train_dataset:
  type: AutoNue
  dataset_root: data/IDD_Segmentation
  transforms:
    - type: Resize
      target_size: [1920, 1080]
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0
    - type: RandomPaddingCrop
      crop_size: [1920, 1080]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.25
      brightness_prob: 1
      contrast_range: 0.25
      contrast_prob: 1
      saturation_range: 0.25
      saturation_prob: 1
      hue_range: 63
      hue_prob: 1
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: train


val_dataset:
  type: AutoNue
  dataset_root: data/IDD_Segmentation
  transforms:
    - type: Resize
      target_size: [1920, 1080]
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: val

optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 0.0001

learning_rate:
  value: 0.005
  decay:
    type: poly
    power: 2
    end_lr: 0.0

loss:
  types:
    - type: DiceLoss
    - type: DiceLoss
    - type: BootstrappedCrossEntropyLoss
      min_K: 100000
      loss_th: 0.05
    - type: BootstrappedCrossEntropyLoss
      min_K: 100000
      loss_th: 0.05
  coef: [1, 0.4, 1, 0.4]
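
The `loss` block pairs four loss `types` with four `coef` weights. A plausible reading, sketched below, is that the total training loss is the weighted sum of the individual loss values. Which model output each loss is applied to is not visible in this section, so the sketch takes already-computed loss values as input; `combine_losses` is a hypothetical helper.

```python
def combine_losses(loss_values, coef):
    """Weighted sum of individual loss values, one weight per loss."""
    if len(loss_values) != len(coef):
        raise ValueError("need exactly one coefficient per loss value")
    return sum(weight * value for weight, value in zip(coef, loss_values))


if __name__ == "__main__":
    # Weights from the config above; the loss values are made-up examples.
    print(combine_losses([0.5, 0.5, 1.0, 1.0], [1, 0.4, 1, 0.4]))
```

Under this reading, the `0.4` weights would down-weight the second Dice and second bootstrapped cross-entropy terms relative to the first of each kind.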