Xiwei Xuan, Ziquan Deng, Hsuan-Tien Lin, and Kwan-Liu Ma
Recent studies highlight that deep learning models often learn spurious features mistakenly linked to labels, compromising their reliability in real-world scenarios where such correlations do not hold. Despite the increasing research effort, existing solutions often face two main challenges: they either demand substantial annotations of spurious attributes, or they yield less competitive outcomes with expensive training when additional annotations are absent. In this paper, we introduce SLIM, a cost-effective and performance-targeted approach to reducing spurious correlations in deep learning. Our method leverages a human-in-the-loop protocol featuring a novel attention labeling mechanism with a constructed attention representation space. SLIM significantly reduces the need for exhaustive additional labeling, requiring human input for fewer than 3% of instances. By prioritizing data quality over complicated training strategies, SLIM curates a smaller yet more feature-balanced data subset, fostering the development of spuriousness-robust models. Experimental validations across key benchmarks demonstrate that SLIM competes with or exceeds the performance of leading methods while significantly reducing costs. The SLIM framework thus presents a promising path for developing reliable models more efficiently.
- Python >= 3.8
- PyTorch >= 2.0.1 (PyTorch Official - Get Started)
Other dependencies can then be installed using the following command:
pip install -r requirements.txt
Alternatively, if you use conda, you can create a conda environment named slim with all packages installed:
conda env create -f environment.yml -n slim
Please follow the link below to download and organize the datasets.
The code and datasets are organized as:
- datasets
  - celeba
    - img_align_celeba
    - metadata.csv
  - waterbirds
    - waterbird_complete95_forest2water2
    - metadata.csv
  - ...
- slim_code (this repository)
  - ...
To train a reference model:
python model_training.py --dataset waterbirds --data_dir [path-to-waterbirds-data-dir] --exp_info reference_model_training
This command trains a ResNet50 on the Waterbirds dataset. You can modify the command flags to run experiments with different hyperparameters or on other datasets.
To generate GradCAM visual explanation results and produce the attention evaluation score (AIOU):
python get_gradcam.py
Note that the data folder, model architecture, trained model path, and other settings can be modified to obtain GradCAM results on other datasets and models.
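As a sketch of how an attention-IoU score of this kind can be computed: the GradCAM heatmap is binarized and compared against a ground-truth object mask. The exact AIOU definition in get_gradcam.py may differ; the threshold value here is an assumption.

```python
import numpy as np

def attention_iou(cam, object_mask, threshold=0.5):
    """IoU between a binarized attention map and a ground-truth object mask.

    cam: GradCAM heatmap scaled to [0, 1], shape (H, W).
    object_mask: binary mask of the labeled object region, shape (H, W).
    threshold: cutoff for binarizing the heatmap (an assumed value).
    """
    attention = cam >= threshold
    mask = object_mask.astype(bool)
    intersection = np.logical_and(attention, mask).sum()
    union = np.logical_or(attention, mask).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

# Toy example: attention covers the object region plus an equal-sized spurious area.
cam = np.zeros((4, 4)); cam[:2, :] = 1.0
mask = np.zeros((4, 4)); mask[:2, :2] = 1
print(attention_iou(cam, mask))  # 0.5
```

A higher score indicates that the model's attention is concentrated on the labeled object rather than on background (potentially spurious) regions.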
To extract feature vectors from the trained model:
python get_feature_vectors.py
Similarly, the data folder, model architecture, trained model path, and other settings can be modified to extract feature vectors from other datasets and models.
This step involves a human-in-the-loop process of annotating sampled data. The details are provided in the notebook slim_data_sampling.ipynb.
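The notebook implements SLIM's actual sampling strategy over the constructed attention representation space. Purely as an illustrative sketch (not the method itself), one simple way to keep human annotation under a small budget is diversity-based selection, e.g. greedily picking the feature vectors farthest from those already chosen:

```python
import numpy as np

def farthest_point_sample(features, budget):
    """Greedily select `budget` diverse rows of `features` for human annotation.

    This is a generic farthest-point heuristic, shown only to illustrate
    budget-limited sampling; it is not the notebook's selection procedure.
    """
    selected = [0]  # start from an arbitrary point
    dists = np.linalg.norm(features - features[0], axis=1)
    for _ in range(budget - 1):
        idx = int(dists.argmax())  # point farthest from the current selection
        selected.append(idx)
        dists = np.minimum(dists, np.linalg.norm(features - features[idx], axis=1))
    return selected

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 16))         # stand-in feature vectors
to_label = farthest_point_sample(feats, 30)  # 30/1000 = 3% annotation budget
print(len(to_label))  # 30
```

The selected indices would then be surfaced to a human annotator, matching the paper's goal of requiring input on fewer than 3% of instances.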
After data curation, we can re-train the model on the curated subset and evaluate its performance by repeating the steps above: 1. Train a model, and 2. Generate GradCAM results and evaluate the model attention.
If you find our work useful, please cite it using the following BibTeX entry:
@InProceedings{xuan2024slim,
author = {Xuan, Xiwei and Deng, Ziquan and Lin, Hsuan-Tien and Ma, Kwan-Liu},
title = {{SLIM}: Spuriousness Mitigation with Minimal Human Annotations},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2024}
}
This GradCAM implementation is based on PyTorch-Grad-CAM. We also refer to the wonderful repositories of our related works, including GroupDRO, Correct-N-Contrast, and DFR.