ConceptExpress

This is the official PyTorch code for the paper:

ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction
Shaozhe Hao, Kai Han, Zhengyao Lv, Shihao Zhao, Kwan-Yee K. Wong
The University of Hong Kong
ECCV 2024 (Oral)

We present Unsupervised Concept Extraction (UCE), the unsupervised task of extracting multiple concepts from a single image.

Project Page

The dataset of input images used in our paper is now available at this link. All images in this dataset are sourced from Unsplash under a license that allows free download and use!

Set-up

Create a conda environment named uce and install the dependencies:

conda env create -f environment.yml
conda activate uce
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt
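
After installation, it can help to confirm that the pinned packages are the ones actually active in the environment. The following is a minimal sketch, not part of this repository: it assumes the packages expose standard metadata readable by importlib.metadata, and the helper name check_versions is hypothetical.

```python
from importlib import metadata

# Versions pinned by the conda install command above.
EXPECTED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "torchaudio": "0.13.1",
}


def check_versions(expected):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, pinned in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Some builds carry a local suffix such as "+cu117"; strip it before comparing.
        ok = installed is not None and installed.split("+")[0] == pinned
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} pinned={EXPECTED[name]} [{status}]")
```

Run it inside the activated uce environment; a MISMATCH line suggests that a later install step replaced one of the pinned packages.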

Training

Create a new folder that contains an image named img.jpg. For example, download our dataset and place it under the repository root. Then set --instance_data_dir in scripts/train.sh to uce_images/XX, or to any other image folder you like, and set --output_dir to the folder where the checkpoints should be saved.
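
If you start from your own photo, the folder layout described above can be prepared with a few lines of Python. This is a sketch under stated assumptions rather than a script shipped with the repository: the helper name prepare_instance_dir is hypothetical, and it assumes the source file is already a JPEG, since copying does not convert formats.

```python
import shutil
from pathlib import Path


def prepare_instance_dir(src_image, dest_dir):
    """Copy src_image into dest_dir as img.jpg and return the new path."""
    src = Path(src_image)
    if src.suffix.lower() not in {".jpg", ".jpeg"}:
        # Copying does not convert formats, so refuse non-JPEG inputs.
        raise ValueError(f"expected a JPEG file, got: {src.name}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    target = dest / "img.jpg"
    shutil.copyfile(src, target)
    return target
```

For example, prepare_instance_dir("photos/cat.jpeg", "my_images/cat") (both paths are placeholders) creates my_images/cat/img.jpg, and --instance_data_dir can then point at my_images/cat.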

When the above is ready, run the following to start training:

bash scripts/train.sh

The learned token embeddings of all concepts are saved to .bin files under your --output_dir.
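
To locate the saved files after training, a short helper can list every .bin file under the output folder. This is a minimal sketch: the name find_embedding_files is hypothetical, and the recursive search is an assumption, since the exact layout under --output_dir is not described here.

```python
from pathlib import Path


def find_embedding_files(output_dir):
    """Return all .bin files under output_dir (searched recursively), sorted by path."""
    return sorted(Path(output_dir).rglob("*.bin"))
```

Any of the returned paths can then be used as the embedding file at inference time.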

Inference

Once trained, the i-th concept is represented as <asset$i> in the tokenizer. We can then freely generate images using any concept token <asset$i> (replace $i with a valid concept index):

python infer.py \
  --embed_path $CKPT_BIN_FILE \
  --prompt "a photo of <asset$i> in the snow" \
  --save_path $SAVE_FOLDER \
  --seed 0

Set $CKPT_BIN_FILE to the path of the .bin file that holds your learned token embeddings, and $SAVE_FOLDER to the folder where the generated images should be saved. You can also find inference examples in scripts/infer.sh.
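
When generating images for several concepts, it may be convenient to build the infer.py command once per concept index. The sketch below only assembles the command shown above from its documented flags. The helper name build_infer_command, the example paths, and the indices 0 and 1 are placeholders, so substitute the concept indices that are valid for your checkpoint.

```python
import shlex


def build_infer_command(embed_path, concept_index, save_path,
                        prompt_template="a photo of {token} in the snow", seed=0):
    """Return the infer.py argument list for one concept token."""
    token = f"<asset{concept_index}>"
    return [
        "python", "infer.py",
        "--embed_path", str(embed_path),
        "--prompt", prompt_template.format(token=token),
        "--save_path", str(save_path),
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Hypothetical paths and indices; replace with your own values.
    for i in range(2):
        cmd = build_infer_command("path/to/embeds.bin", i, f"results/asset{i}")
        print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

The returned list can also be passed directly to subprocess.run(cmd, check=True) from the repository root, which avoids shell quoting issues with the angle brackets in the token.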

Citation

If you use this code in your research, please consider citing our paper:

@InProceedings{hao2024conceptexpress,
    title={Concept{E}xpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction}, 
    author={Shaozhe Hao and Kai Han and Zhengyao Lv and Shihao Zhao and Kwan-Yee~K. Wong},
    booktitle={ECCV},
    year={2024},
}

Acknowledgements

This code repository is based on the great work of Break-A-Scene. Thanks!
