Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos (CVPR 2021).
Project page: https://sijiesong.github.io/co-grounding/
Check out our paper on the project page above.
Requirements (a quick environment check is sketched below):
- Python 3
- PyTorch >= 0.4.1
- Others (OpenCV, SciPy, etc.)
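A minimal sketch to verify these requirements, assuming only the packages listed above:

```python
# Quick environment check for the requirements above (illustrative).
import torch
import cv2
import scipy

print("PyTorch:", torch.__version__)   # the commands below expect >= 0.4.1
print("OpenCV:", cv2.__version__)
print("SciPy:", scipy.__version__)
print("CUDA available:", torch.cuda.is_available())
```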
Data preparation (the expected layout is sketched below):
- Please download the files from BaiduYun (extraction code: p98p) or Google Drive [./data], then put the files under ./data.
- Download the VID dataset from this link and put it into ./data/VID.
- Download the RefCOCO dataset following this link and put it into ./data/unc.
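A minimal check of the resulting directory layout; the paths come directly from the steps above:

```python
import os

# Directories the training/testing commands below expect
# (from the data-preparation steps above).
for path in ["./data", "./data/VID", "./data/unc"]:
    print(path, "->", "ok" if os.path.isdir(path) else "MISSING")
```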
- Train CG-SL-Att. on VID
python train_semantic_attention_cogrounding.py --data_root ./data --dataset VID --gpu GPUID --savename SAVENAME_FOR_YOUR_MODEL --batch_size BATCHSIZE --lstm
- Test CG-SL-Att. on VID (model acc: 59.48):
Please download the trained models from BaiduYun (extraction code: p98p) or Google Drive [./saved_models], then put them under ./saved_models.
python test_semantic_attention_cogrounding.py --data_root ./data --dataset VID --gpu GPUID --savename VID_cogrounding_semantic_attn --batch_size BATCHSIZE --lstm --test --resume ./saved_models/VID_cogrounding_semantic_attn.pth.tar
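If loading with --resume fails, the downloaded checkpoint can be inspected on CPU first. The key names printed (e.g. 'state_dict', 'epoch') follow a common PyTorch convention and are an assumption, not confirmed for this repo:

```python
import torch

# Load the checkpoint on CPU so no GPU is needed for a quick look.
ckpt = torch.load("./saved_models/VID_cogrounding_semantic_attn.pth.tar",
                  map_location="cpu")
if isinstance(ckpt, dict):
    # Keys such as 'state_dict' or 'epoch' are a common convention (assumption).
    print(list(ckpt.keys()))
```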
- Post-processing based on CG-SL-Att. on VID (model acc: 60.03):
- Generate cache files (or download the file from BaiduYun (extraction code: p98p) / Google Drive [cache.zip])
python test_semantic_attention_cogrounding.py --data_root ./data --dataset VID --gpu GPUID --savename VID_cogrounding_semantic_attn --batch_size BATCHSIZE --lstm --cache --resume ./saved_models/VID_cogrounding_semantic_attn.pth.tar
- Post-process the initial results
python post_processing.py --data_root ./data/ --dataset VID --gpu GPUID --savename model_post_processing --batch_size 16 --lstm --test --cache_dir CACHE_FILE_PATH
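For intuition only: the actual logic of post_processing.py is not documented here. Below is a generic sketch of one common form of temporal post-processing over cached per-frame predictions (smoothing the box trajectory); it is an illustration, not the repo's method:

```python
import numpy as np

# Illustration only -- NOT the logic of post_processing.py. Moving-average
# smoothing of cached per-frame boxes in (x1, y1, x2, y2) format.
def smooth_boxes(boxes, window=3):
    """Smooth a (T, 4) array of per-frame boxes along the time axis."""
    boxes = np.asarray(boxes, dtype=np.float32)
    pad = window // 2
    padded = np.pad(boxes, ((pad, pad), (0, 0)), mode="edge")
    return np.stack([padded[t:t + window].mean(axis=0)
                     for t in range(len(boxes))])
```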
- Train SL-Att. on RefCOCO
python train_semantic_attention_visual_grounding.py --data_root ./data --dataset unc --gpu GPUID --savename SAVENAME_FOR_YOUR_MODEL --batch_size BATCHSIZE --lstm
- Test SL-Att. on RefCOCO (model acc: val/testA/testB 77.65/80.75/73.37):
Please download the trained models and put them under ./saved_models, as above.
python test_semantic_attention_visual_grounding.py --data_root ./data --dataset unc --gpu GPUID --savename Refcoco_grounding_semantic_attn --batch_size BATCHSIZE --lstm --test --resume ./saved_models/RefCOCO_semantic_attn.pth.tar
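The numbers above are accuracies on the val/testA/testB splits. By the standard referring-expression protocol (an assumption here, not stated in this README), a prediction counts as correct when its IoU with the ground-truth box exceeds 0.5:

```python
# Assumption: the usual IoU@0.5 correctness criterion for referring expressions.
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / max(union, 1e-9)

# e.g. correct = iou(pred_box, gt_box) > 0.5
```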
If you use this code for your research, please cite our paper:
@inproceedings{cogrounding_2021_CVPR,
  author    = {Song, Sijie and Lin, Xudong and Liu, Jiaying and Guo, Zongming and Chang, Shih-Fu},
  title     = {Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2021}
}