ICONS: Influence Consensus for Vision-Language Data Selection
Under construction 🚧
Authors: Xindi Wu, Mengzhou Xia, Rulin Shao, Zhiwei Deng, Pang Wei Koh, Olga Russakovsky
We propose ICONS, a method for vision-language data selection that improves training efficiency by identifying and prioritizing samples that are consistently valuable across multiple target tasks.
- [01/25] We released the LLAVA-ICONS-133K dataset on Hugging Face for public use.
- [12/24] We released the ICONS paper on arXiv.
To set up the environment for ICONS, use the provided environment.yml file to create a Conda environment:

```bash
conda env create -f environment.yml
conda activate icons
```
The ICONS pipeline consists of two main stages: a specialist stage that computes task-specific influence scores (steps 1–4) and a generalist stage that aggregates them into a consensus selection (step 5).

**1. Compute Training Data Gradients**

```bash
# Submit SLURM jobs for processing training data chunks.
# We use ckpt=500 as an example; other checkpoints work as well.
sbatch './scripts/0_slurm_train_grads.sh' 500
```
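The script's internals aren't reproduced here; conceptually, this step produces one gradient feature vector per training example. A minimal sketch, assuming a hypothetical `model` and per-example `loss_fn` (illustrative names only, not the repo's API; in practice the gradients are usually projected to a much lower dimension to keep storage tractable):

```python
import torch

def per_example_grad_features(model, examples, loss_fn):
    """Stack each example's flattened gradient into a feature matrix."""
    feats = []
    for example in examples:
        model.zero_grad()
        loss = loss_fn(model, example)  # forward pass on a single example
        loss.backward()                 # populate per-parameter gradients
        grads = [p.grad.flatten() for p in model.parameters() if p.grad is not None]
        feats.append(torch.cat(grads).detach().cpu())
    return torch.stack(feats)           # shape: (num_examples, feature_dim)
```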
**2. Merge Gradient Files**

```bash
bash ./scripts/1_merge_train_gradient.sh
```
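Each SLURM job from step 1 writes its own chunk, so this step concatenates them into a single tensor. A minimal sketch with hypothetical file paths (the repo's actual layout may differ):

```python
import glob
import torch

# Hypothetical layout: one gradient-feature tensor per SLURM chunk.
chunk_paths = sorted(glob.glob("grads/train_chunk_*.pt"))
merged = torch.cat([torch.load(p) for p in chunk_paths], dim=0)
torch.save(merged, "grads/train_grads_merged.pt")
print(f"Merged {len(chunk_paths)} chunks into a tensor of shape {tuple(merged.shape)}")
```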
**3. Process Validation Data**

```bash
bash ./scripts/2_get_val_data_grads_all.sh
```
**4. Compute Influence Matrices**

```bash
bash ./scripts/3_specialist.sh
```
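A common first-order way to realize this step (in the style of TracIn/LESS) scores each training example by the similarity between its gradient features and a task's validation gradients; the repo's exact scoring may differ. A minimal sketch with hypothetical file names:

```python
import torch
import torch.nn.functional as F

# Hypothetical file names; one validation-gradient file per target task.
train_g = F.normalize(torch.load("grads/train_grads_merged.pt"), dim=1)
val_g = F.normalize(torch.load("grads/val_grads_taskA.pt"), dim=1)

# Influence of each training example on the task: mean gradient similarity
# to the task's validation examples. Result shape: (num_train,)
influence = (train_g @ val_g.T).mean(dim=1)
torch.save(influence, "influence/taskA.pt")
```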
**5. Generate Consensus**

```bash
bash ./scripts/4_generalist.sh
```
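The consensus stage aggregates the per-task influence scores into a single selection. A minimal sketch of majority voting over per-task shortlists, with hypothetical file names and an illustrative shortlist size `k` (tie-breaking and exact parameters in the repo may differ):

```python
import torch

# Hypothetical per-task influence vectors, each of shape (num_train,).
tasks = ["taskA", "taskB", "taskC"]
task_scores = [torch.load(f"influence/{t}.pt") for t in tasks]

k = 20_000  # illustrative per-task shortlist size
votes = torch.zeros_like(task_scores[0])
for scores in task_scores:
    votes[scores.topk(k).indices] += 1  # one vote per task that ranks the example highly

budget = 133_000  # e.g., the size of LLAVA-ICONS-133K
selected = votes.topk(budget).indices  # examples valued across the most tasks
```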
If you find this repository useful for your research, please cite it with the following BibTeX entry:

```bibtex
@article{wu2024icons,
  title={ICONS: Influence Consensus for Vision-Language Data Selection},
  author={Wu, Xindi and Xia, Mengzhou and Shao, Rulin and Deng, Zhiwei and Koh, Pang Wei and Russakovsky, Olga},
  journal={arXiv preprint arXiv:2501.00654},
  year={2024}
}
```