
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation



[Figure: result]

This repo is the official implementation of the ECCV 2024 paper In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation.

Conda installation command

conda env create -f environment.yml --prefix $YOURPREFIX

$YOURPREFIX is typically /home/$USER/anaconda3
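
Once created, the environment can be activated by its prefix path:

conda activate $YOURPREFIX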

Dependencies

This repo is built on CLIP, SCLIP, and MMSegmentation.

mim install mmcv==2.0.1 mmengine==0.8.4 mmsegmentation==1.1.1
pip install ftfy regex yapf==0.40.1
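
The mim command comes from OpenMIM; if it is not installed yet:

pip install -U openmim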

Dataset preparation

Please prepare Pascal VOC 2012, Pascal Context, COCO-Stuff 164K, COCO-Object, and ADEChallengeData2016 following the MMSeg data preparation. The COCO-Object dataset can be converted from COCO-Stuff 164K by executing the following command:

python datasets/cvt_coco_object.py PATH_TO_COCO_STUFF164K -o PATH_TO_COCO164K

Place them under the $yourdatasetroot/ directory such that:

    $yourdatasetroot/
    ├── ADEChallengeData2016/
    │   ├── annotations/
    │   ├── images/
    │   ├── ...
    ├── VOC2012/
    │   ├── Annotations/
    │   ├── JPEGImages/
    │   ├── ...
    ├── coco_stuff164k/
    │   ├── annotations/
    │   ├── images/
    │   ├── ...
    ├── ...

1) Panoptic Cut for unsupervised object mask discovery

cd panoptic_cut
python predict.py \
    --logs panoptic_cut \
    --dataset {coco_object, coco_stuff, ade20k, voc21, voc20, context60, context59} \
    --datasetroot $yourdatasetroot
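
For example, to run mask discovery on Pascal VOC with the background-included 21-class setting, pick voc21 from the choices above:

python predict.py --logs panoptic_cut --dataset voc21 --datasetroot $yourdatasetroot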

The precomputed mask predictions from stage 1) can be downloaded from the Google Drive links below:

| mask prediction root after stage 1) | benchmark id | Google Drive link |
| --- | --- | --- |
| coco_stuff164k | coco_object, coco_stuff164k | link to download (84.5 MB) |
| VOC2012 | context59, context60, voc20, voc21 | link to download (66.7 MB) |
| ADEChallengeData2016 | ade20k | link to download (29.4 MB) |

Place them under the lavg/panoptic_cut/pred/ directory such that:

    lavg/panoptic_cut/pred/panoptic_cut/
    ├── ADEChallengeData2016/
    │   ├── ADE_val_00000001.pth
    │   ├── ADE_val_00000002.pth
    │   ├── ...
    ├── VOC2012/
    │   ├── 2007_000033.pth
    │   ├── 2007_000042.pth
    │   ├── ...
    ├── coco_stuff164k/
    │   ├── 000000000139.pth
    │   ├── 000000000285.pth
    │   ├── ...
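
Each .pth file stores the predicted masks for one image. As a quick sanity check after downloading, a file can be loaded with PyTorch (a minimal sketch; the exact structure of the saved object depends on the predict.py output format):

python -c "import torch; pred = torch.load('lavg/panoptic_cut/pred/panoptic_cut/VOC2012/2007_000033.pth'); print(type(pred))"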

2) Visual grounding & Segmentation evaluation

Update $yourdatasetroot in configs/cfg_*.py

cd lavg
python eval.py --config ./configs/{cfg_context59/cfg_context60/cfg_voc20/cfg_voc21}.py --maskpred_root VOC2012/panoptic_cut
python eval.py --config ./configs/cfg_ade20k.py --maskpred_root ADEChallengeData2016/panoptic_cut
python eval.py --config ./configs/{cfg_coco_object/cfg_coco_stuff164k}.py --maskpred_root coco_stuff164k/panoptic_cut

The evaluation is single-GPU compatible.
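
For example, evaluating the VOC21 benchmark (Pascal VOC with the background category) expands the brace notation above to:

python eval.py --config ./configs/cfg_voc21.py --maskpred_root VOC2012/panoptic_cut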

Quantitative performance (mIoU, %) on open-vocabulary semantic segmentation benchmarks

The first three benchmarks are evaluated with a background category, the last four without:

| Method | VOC21 | Context60 | COCO-Obj | VOC20 | Context59 | ADE | COCO-Stuff |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LaVG | 62.1 | 31.6 | 34.2 | 82.5 | 34.7 | 15.8 | 23.2 |

Related repos

Our project refers to and heavily borrows code from the following repos: CLIP, SCLIP, and MMSegmentation.

Acknowledgements

This work was supported by Samsung Electronics (IO201208-07822-01), the NRF grant (NRF-2021R1A2C3012728) (45%), and the IITP grants (RS-2022-II220959: Few-Shot Learning of Causal Inference in Vision and Language for Decision Making (50%); RS-2019-II191906: AI Graduate School Program at POSTECH (5%)) funded by the Ministry of Science and ICT, Korea. We also thank Sua Choi for her helpful discussion.
