In this repo, we show how to train a self-supervised bimodal model with a Global Contrastive Loss (GCL) on CC3M, a widely used image-text dataset.
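For intuition, below is a minimal PyTorch sketch of a mini-batch two-way image-text contrastive loss. This is illustrative only: the function name is ours, and the SogCLR objective trained in this repo additionally replaces the per-batch denominator with a global moving-average estimate (controlled by `--sogclr_gamma` in the commands below), rather than using the plain form shown here.

```python
import torch
import torch.nn.functional as F

def two_way_contrastive_loss(image_feats, text_feats, tau=0.01):
    """Mini-batch image-text contrastive loss (illustrative sketch, not the repo's exact code).

    image_feats, text_feats: (B, D) L2-normalized embeddings of paired samples.
    tau: temperature; the training command below initializes it via --tau_init 0.01.
    """
    logits = image_feats @ text_feats.t() / tau                  # (B, B) pairwise similarities
    labels = torch.arange(logits.size(0), device=logits.device)  # matched pairs lie on the diagonal
    loss_i2t = F.cross_entropy(logits, labels)                   # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), labels)               # text -> image direction
    return (loss_i2t + loss_t2i) / 2
```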
Try in Colab: https://colab.research.google.com/drive/1FTF-cTcW11Gyrwu8uhTZOXgLsjp49Z9W?usp=sharing
Setting up a new virtual environment with Conda:

```bash
env_name='csce689_proj'
conda create -n "$env_name" python=3.10
conda activate "$env_name"
pip install -r requirements.txt
```
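After installation, a quick sanity check may save a failed run later. This sketch assumes `requirements.txt` installs PyTorch, which the `--use_amp` and `CUDA_VISIBLE_DEVICES` flags in the training command suggest:

```python
# Verify the environment before launching training.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # training below uses CUDA_VISIBLE_DEVICES=0
```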
- Download the data:
  - `cc3m_subset_100k.tar.gz`: a 100k subset of the Conceptual Captions dataset;
  - `mscoco_val.tar.gz`: a 5k subset of the COCO val2014 dataset;
  - `clip_train.tar.gz`: captions of the previous datasets;
  - `imagenet/val.tar`: the ImageNet validation set.

  The code and data should be structured as follows (see the extraction sketch after the tree):
  ```
  .
  +--bimodal_exps (code)
  |
  +--clip_train (captions)
  |  +--cc3m_train_subset.json
  |  +--coco_val.json
  |
  +--datasets (images)
  |  +--cc3m_subset_100k
  |  +--mscoco_val
  |  +--imagenet
  |  |  +--val
  ```
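A hedged sketch of unpacking the archives into that layout. It assumes the archives sit in the repo root and that each one unpacks into a directory of the same name; adjust paths to match where you actually downloaded them:

```bash
mkdir -p datasets
tar -xzf cc3m_subset_100k.tar.gz -C datasets/   # -> datasets/cc3m_subset_100k
tar -xzf mscoco_val.tar.gz -C datasets/         # -> datasets/mscoco_val
tar -xzf clip_train.tar.gz                      # -> clip_train (captions)
mkdir -p datasets/imagenet
tar -xf val.tar -C datasets/imagenet/           # -> datasets/imagenet/val (plain tar, no -z)
```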
- To train a model on CC3M, submit `run.slurm` if Slurm is available, or run:

```bash
export PYTHONPATH="$PYTHONPATH:./bimodal_exps"
export HUGGINGFACE_HUB_CACHE='./checkpoints/huggingface'

data_path=./datasets
ann_path=./clip_train
train_image_root=cc3m_subset_100k/
data=cc3m
train_file=${data}_train_subset.json
gamma=0.8
epochs=30
ita_type=sogclr

CUDA_VISIBLE_DEVICES=0 python ./bimodal_exps/clip.py \
    --data_path ${data_path} \
    --ann_path ${ann_path} \
    --train_file ${train_file} \
    --train_image_root ${train_image_root} \
    --output_dir output/${ita_type}_${data}_g${gamma}_e${epochs} \
    --init_model \
    --use_amp \
    --ita_type ${ita_type} \
    --tau_init 0.01 \
    --sogclr_gamma ${gamma} \
    --eta_init 0.03 \
    --sched cosine \
    --no-distributed \
    --epochs ${epochs}
```
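Checkpoints are written to `output/${ita_type}_${data}_g${gamma}_e${epochs}/`; the evaluation command below loads the final one, `checkpoint_30.pth`, from that directory.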
- To test the performance of a model on MSCOCO and ImageNet, submit `eval.slurm` if Slurm is available, or run:

```bash
export PYTHONPATH="$PYTHONPATH:./bimodal_exps"
export HUGGINGFACE_HUB_CACHE='./checkpoints/huggingface'

data_path=./datasets
ann_path=./clip_train
train_image_root=cc3m_subset_100k/
data=cc3m
train_file=${data}_train_subset.json
gamma=0.8
epochs=30
ita_type=sogclr

CUDA_VISIBLE_DEVICES=0 python ./bimodal_exps/clip.py \
    --data_path ${data_path} \
    --ann_path ${ann_path} \
    --train_file ${train_file} \
    --train_image_root ${train_image_root} \
    --output_dir output/eval_${ita_type}_${data}_g${gamma}_e${epochs} \
    --init_model \
    --use_amp \
    --ita_type ${ita_type} \
    --tau_init 0.01 \
    --sogclr_gamma ${gamma} \
    --eta_init 0.03 \
    --sched cosine \
    --no-distributed \
    --epochs ${epochs} \
    --evaluate \
    --checkpoint ./output/${ita_type}_cc3m_g0.8_e30/checkpoint_30.pth
```
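Retrieval performance on MSCOCO is typically reported as Recall@k. Below is a minimal sketch of that metric for reference; it is illustrative only (the function name is ours, and the repo's `clip.py` computes its own metrics):

```python
import torch

def recall_at_k(image_feats, text_feats, k=1):
    """Image -> text Recall@k over paired, L2-normalized embeddings (illustrative)."""
    sims = image_feats @ text_feats.t()                           # (N, N) similarity matrix
    topk = sims.topk(k, dim=1).indices                            # k best captions per image
    targets = torch.arange(sims.size(0), device=sims.device).unsqueeze(1)  # ground-truth index
    return (topk == targets).any(dim=1).float().mean().item()     # fraction of hits in top k
```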
If you find this tutorial helpful, please cite:
```bibtex
@inproceedings{qiu2023not,
  title={Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization},
  author={Qiu, Zi-Hao and Hu, Quanqi and Yuan, Zhuoning and Zhou, Denny and Zhang, Lijun and Yang, Tianbao},
  booktitle={International Conference on Machine Learning},
  pages={TBD},
  year={2023},
  organization={PMLR}
}
```