🧠 Robustness Impacts of Coreset Selection

Official implementation of our NeurIPS 2025 (Datasets and Benchmarks Track) paper “The Impact of Coreset Selection on Spurious Correlations and Group Robustness”.
This repository explores how different coreset selection strategies affect bias levels and group robustness across diverse datasets and architectures.

🧩 What’s inside

📊 Reproducible experiments and benchmarks
🧮 Evaluation and analysis scripts

📄 Paper: The Impact of Coreset Selection on Spurious Correlations and Group Robustness

Environment Setup

Create a conda environment:

conda create -n bias-select python=3.8
conda activate bias-select

Install PyTorch with CUDA support:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

Install the deepcore package and its dependencies:

# Install the package in development mode (make sure you're in the repository root directory)
pip install -e .

# Or install dependencies first, then the package
pip install -r requirements.txt
pip install -e .

The -e flag installs the package in "editable" mode, which means you can modify the code without reinstalling.
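
After installation, a quick sanity check can confirm that the packages are importable before launching any experiment. The snippet below is a minimal sketch, not part of the repository; the helper names `check_environment` and `cuda_available` are illustrative, and it assumes the editable install exposes a top-level module named `deepcore`.

```python
import importlib.util


def check_environment(packages=("torch", "torchvision", "deepcore")):
    """Return {package_name: bool} telling whether each package is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


def cuda_available():
    """Return True/False if torch is installed, or None if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check works without torch installed

    return torch.cuda.is_available()
```

Calling `check_environment()` from the repository root lists any package that still needs installing, and `cuda_available()` reports whether the installed PyTorch build can see a GPU.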

Dataset Preparation

The codebase supports the following datasets:

Cmnist - Colored MNIST with spurious correlations
waterbirds - Bird species classification with background bias
Urbancars_cooccur - Car classification with co-occurrence bias
Urbancars_bg - Car classification with background bias
Urbancars_both - Car classification with combined biases
Nico_95_spurious - Natural Image Classification with Context
MultiNLI - Natural Language Inference dataset
Metashift - Dataset for studying distribution shifts
Civilcomments - Toxic comment classification dataset
CelebAhair - CelebA dataset with hair color attributes

Dataset Download Instructions

Create a data directory:

mkdir -p data

Download and prepare each dataset:

1. Cmnist

The CMNIST dataset can be downloaded from Google Drive:

Download the dataset from CMNIST Google Drive Link
Extract the downloaded cmnist.zip file:

unzip cmnist.zip -d data/

The dataset will be extracted to data/cmnist/ with the following structure:

data/cmnist/
├── test/
│   ├── 0/
│   ├── 1/
│   └── ... (classes 2-9)
└── 5pct/
    ├── align/
    │   ├── 0/
    │   └── 1/
    ├── valid/
    │   ├── 0/
    │   └── 1/
    └── conflict/
        ├── 0/
        └── 1/
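
To verify the extraction, the layout above can be checked with a short script. This is a sketch under the assumption that the tree shown is accurate; `count_files` and `missing_dirs` are hypothetical helpers, not functions from the repository.

```python
from pathlib import Path

# Directories taken from the tree shown above, relative to data/cmnist/.
EXPECTED_DIRS = ["test", "5pct/align", "5pct/valid", "5pct/conflict"]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def count_files(root):
    """Count files per leaf directory, keyed by path relative to `root`."""
    root = Path(root)
    counts = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.parent.relative_to(root).as_posix()
            counts[key] = counts.get(key, 0) + 1
    return counts
```

For example, `missing_dirs("data/cmnist")` should return an empty list after a successful extraction, and `count_files("data/cmnist")` gives the number of files in each split/class folder.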

2. waterbirds

Follow the instructions at https://github.com/kohpangwei/group_DRO/tree/master to generate the waterbirds dataset.

3. Urbancars variants

Follow the instructions at https://github.com/facebookresearch/Whac-A-Mole/tree/main to create the Urbancars_cooccur, Urbancars_bg, and Urbancars_both datasets.

4. Metashift

Follow the instructions at https://github.com/YyzHarry/SubpopBench/tree/main to download the Metashift dataset and generate its metadata.

5. Civilcomments

Follow the instructions at https://github.com/izmailovpavel/spurious_feature_learning/tree/main to set up the dataset. You will need to install the WILDS package for this.

6. Nico_95_spurious

We follow https://github.com/yvsriram/FACTS to set up the dataset. Download the NICO++ dataset as specified there into './Data/NICO'.

7. MultiNLI

Follow the instructions at https://github.com/izmailovpavel/spurious_feature_learning/tree/main to set up the dataset.

8. CelebAhair

Follow the instructions at https://github.com/kohpangwei/group_DRO#celeba to download the dataset. To use hair color as the target attribute, use the metadata file provided at ./deepcore/datasets/metadata.csv.

Label Preparation

To prepare the labels for any dataset:

python scripts/save_dataset_labels.py <dataset_name>

For example:

# For CMNIST
python scripts/save_dataset_labels.py Cmnist
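
To prepare labels for several datasets in one go, the command above can be wrapped in a small loop. The sketch below assumes the script accepts dataset names exactly as spelled in the supported-datasets list (only `Cmnist` is confirmed by the example above); `label_commands` and `prepare_labels` are illustrative helpers, not part of the repository.

```python
import subprocess
import sys

# Names copied from the supported-datasets list; only "Cmnist" is shown
# as a working argument in the README, the rest are assumed to match.
DATASETS = [
    "Cmnist", "waterbirds", "Urbancars_cooccur", "Urbancars_bg",
    "Urbancars_both", "Nico_95_spurious", "MultiNLI", "Metashift",
    "Civilcomments", "CelebAhair",
]


def label_commands(datasets=DATASETS, script="scripts/save_dataset_labels.py"):
    """Build one argv list per dataset, using the current Python interpreter."""
    return [[sys.executable, script, name] for name in datasets]


def prepare_labels(datasets=DATASETS, runner=subprocess.run):
    """Run the label script for each dataset, stopping at the first failure."""
    for cmd in label_commands(datasets):
        runner(cmd, check=True)
```

Run `prepare_labels(["Cmnist", "waterbirds"])` from the repository root once the corresponding datasets are in place; `check=True` makes a failing dataset raise immediately instead of being silently skipped.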

## Running Experiments

### Sample Characterization Scores

To compute sample characterization scores for a dataset, run the corresponding script in the `scripts` directory:

```bash
# For CMNIST dataset
python scripts/run_cmnist.py

# For Waterbirds dataset
python scripts/run_waterbirds.py

# For CelebA dataset
python scripts/run_celeba.py
```

These scripts will:

Train models on the full dataset
Compute various sample characterization scores (EL2N, Forgetting, Uncertainty, etc.)
Save the scores in the results directory
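
As an illustration of what one of these scores measures, EL2N is the L2 norm of the difference between a model's softmax output and the one-hot label, typically averaged over several models early in training. The following is a minimal pure-Python sketch of that definition, not the repository's implementation; the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def el2n_score(logits, label):
    """L2 norm of (softmax(logits) - one_hot(label)) for a single sample."""
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs))
    )


def mean_el2n(logits_per_model, label):
    """Average the per-model EL2N scores of one sample."""
    scores = [el2n_score(logits, label) for logits in logits_per_model]
    return sum(scores) / len(scores)
```

A confidently correct prediction scores close to 0, while a confidently wrong one approaches √2, so higher values mark samples the model finds harder.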

Training Downstream Models

After computing the sample characterization scores, you can train downstream models on the selected coresets using the corresponding training scripts:

# For CMNIST dataset
python scripts/run_cmnist_train.py

# For Waterbirds dataset
python scripts/run_waterbirds_train.py

# For CelebA dataset
python scripts/run_celeba_train.py

These training scripts will:

Load the pre-computed sample characterization scores
Select coresets based on the specified selection method
Train models on the selected coresets
Save the trained models and results
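
The selection step can be pictured as ranking samples by their score and keeping a fixed fraction. The sketch below shows that idea only; it is an assumption about the general shape of score-based selection, not the repository's code, and `select_coreset` with its `keep_highest` flag is a hypothetical helper.

```python
def select_coreset(scores, fraction, keep_highest=True):
    """Return sorted indices of the samples kept in the coreset.

    scores: one score per training sample (e.g. EL2N).
    fraction: share of the dataset to keep, in (0, 1].
    keep_highest: keep the highest-scoring samples if True, lowest otherwise.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, int(round(fraction * len(scores))))
    # Rank sample indices by score; sorted() is stable, so ties keep dataset order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=keep_highest)
    return sorted(order[:k])


# Example: keep half of four samples.
scores = [0.1, 0.9, 0.5, 0.7]
hardest = select_coreset(scores, 0.5)                      # indices 1 and 3
easiest = select_coreset(scores, 0.5, keep_highest=False)  # indices 0 and 2
```

Whether high or low scores are kept depends on the selection method, which is why the direction is exposed as a parameter here.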

Example Workflow

Here's a complete example for the CMNIST dataset:

# Step 1: Compute sample characterization scores
python scripts/run_cmnist.py

# Step 2: Train models on selected coresets
python scripts/run_cmnist_train.py

The results will be saved in the results directory with appropriate naming conventions for each dataset and selection method.
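
The README does not describe the format of the saved results. Group robustness is commonly summarized by worst-group accuracy, the minimum accuracy over groups (for instance, each combination of class label and spurious attribute). The sketch below computes it from plain lists; the function names and the input format are assumptions for illustration, not the repository's evaluation code.

```python
from collections import defaultdict


def group_accuracies(preds, labels, groups):
    """Return {group: accuracy} given per-sample predictions, labels, and group ids."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label, group in zip(preds, labels, groups):
        total[group] += 1
        correct[group] += int(pred == label)
    return {g: correct[g] / total[g] for g in total}


def worst_group_accuracy(preds, labels, groups):
    """Return the lowest per-group accuracy."""
    return min(group_accuracies(preds, labels, groups).values())
```

A model can have high average accuracy yet a low worst-group accuracy when it relies on a spurious feature, which is why the two numbers are usually reported together.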

Acknowledgements

This codebase is based on DeepCore, a comprehensive library for coreset selection in deep learning. We extend their work to study the robustness impacts of coreset selection methods on various datasets with spurious correlations and distribution shifts.
