ICONS: Influence Consensus for Vision-Language Data Selection
Under construction 🚧
Authors: Xindi Wu, Mengzhou Xia, Rulin Shao, Zhiwei Deng, Pang Wei Koh, Olga Russakovsky
We propose ICONS, a method for vision-language data selection that improves training efficiency by identifying and prioritizing samples that are consistently valuable across multiple target tasks.
- [01/25] We released the LLAVA-ICONS-133K dataset on Hugging Face for public use.
- [12/24] We released the ICONS paper on arXiv.
To set up the environment for ICONS, use the provided environment.yml file to create a Conda environment:

```bash
conda env create -f environment.yml
conda activate icons
```
The ICONS pipeline consists of two main stages: a specialist stage that computes task-specific influence scores (steps 1–4) and a generalist stage that aggregates them into a consensus selection (step 5).

**1. Compute Training Data Gradients**

```bash
# Submit SLURM jobs for processing training data chunks.
# We use ckpt=500 as an example; other checkpoints work as well.
sbatch './scripts/0_slurm_train_grads.sh' 500
```
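The script's internals aren't reproduced here; conceptually, this step produces one gradient feature vector per training example. A minimal sketch, assuming a hypothetical `model` and per-example `loss_fn` (illustrative names only, not the repo's API; in practice the gradients are usually projected to a much lower dimension to keep storage tractable):

```python
import torch

def per_example_grad_features(model, examples, loss_fn):
    """Stack each example's flattened gradient into a feature matrix."""
    feats = []
    for example in examples:
        model.zero_grad()
        loss = loss_fn(model, example)  # forward pass on a single example
        loss.backward()                 # populate per-parameter gradients
        grads = [p.grad.flatten() for p in model.parameters() if p.grad is not None]
        feats.append(torch.cat(grads).detach().cpu())
    return torch.stack(feats)           # shape: (num_examples, feature_dim)
```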
**2. Merge Gradient Files**

```bash
bash ./scripts/1_merge_train_gradient.sh
```
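Each SLURM job from step 1 writes its own chunk, so this step concatenates them into a single tensor. A minimal sketch with hypothetical file paths (the repo's actual layout may differ):

```python
import glob
import torch

# Hypothetical layout: one gradient-feature tensor per SLURM chunk.
chunk_paths = sorted(glob.glob("grads/train_chunk_*.pt"))
merged = torch.cat([torch.load(p) for p in chunk_paths], dim=0)
torch.save(merged, "grads/train_grads_merged.pt")
print(f"Merged {len(chunk_paths)} chunks into a tensor of shape {tuple(merged.shape)}")
```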
**3. Process Validation Data**

```bash
bash ./scripts/2_get_val_data_grads_all.sh
```
**4. Compute Influence Matrices**

```bash
bash ./scripts/3_specialist.sh
```
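A common first-order way to realize this step (in the style of TracIn/LESS) scores each training example by the similarity between its gradient features and a task's validation gradients; the repo's exact scoring may differ. A minimal sketch with hypothetical file names:

```python
import torch
import torch.nn.functional as F

# Hypothetical file names; one validation-gradient file per target task.
train_g = F.normalize(torch.load("grads/train_grads_merged.pt"), dim=1)
val_g = F.normalize(torch.load("grads/val_grads_taskA.pt"), dim=1)

# Influence of each training example on the task: mean gradient similarity
# to the task's validation examples. Result shape: (num_train,)
influence = (train_g @ val_g.T).mean(dim=1)
torch.save(influence, "influence/taskA.pt")
```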
**5. Generate Consensus**

```bash
bash ./scripts/4_generalist.sh
```
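The consensus stage aggregates the per-task influence scores into a single selection. A minimal sketch of majority voting over per-task shortlists, with hypothetical file names and an illustrative shortlist size `k` (tie-breaking and exact parameters in the repo may differ):

```python
import torch

# Hypothetical per-task influence vectors, each of shape (num_train,).
tasks = ["taskA", "taskB", "taskC"]
task_scores = [torch.load(f"influence/{t}.pt") for t in tasks]

k = 20_000  # illustrative per-task shortlist size
votes = torch.zeros_like(task_scores[0])
for scores in task_scores:
    votes[scores.topk(k).indices] += 1  # one vote per task that ranks the example highly

budget = 133_000  # e.g., the size of LLAVA-ICONS-133K
selected = votes.topk(budget).indices  # examples valued across the most tasks
```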
If you find this repository useful for your research, please cite it with the following BibTeX entry:

```bibtex
@article{wu2024icons,
  title={ICONS: Influence Consensus for Vision-Language Data Selection},
  author={Wu, Xindi and Xia, Mengzhou and Shao, Rulin and Deng, Zhiwei and Koh, Pang Wei and Russakovsky, Olga},
  journal={arXiv preprint arXiv:2501.00654},
  year={2024}
}
```