Yiren Song · Xiaokang Liu · Mike Zheng Shou
National University of Singapore
- [Dec.19.2024] Released the DiffSim paper and source code.
Please download any of the 7 benchmarks below for evaluation:
Sref is a benchmark we propose. We have collected 508 styles, each hand-picked by human artists and featuring four thematically distinct reference images, created using Midjourney’s Sref mode. Midjourney’s style reference feature allows users to guide the style or aesthetic of generated images by using external pictures or style seeds in their prompt.
Please download the benchmark here.
IP is a benchmark we propose. We have collected images of 299 IP characters and used advanced Flux models and the IP-Adapter to create several variants of each character with different consistency weights.
Please download the benchmark here.
- Experiment System Info: Debian 5.4, CUDA 12.1
- Evaluation Requirements: GPU inference with about 10,000 MB (roughly 10 GB) of memory, or CPU inference
conda create -n "diffsim" python=3.12.7
conda activate diffsim
pip3 install -r requirements.txt
To evaluate a benchmark, please refer to the corresponding bash script. For example, you can evaluate DiffSim on the CUTE dataset by running `sh cute_main.sh`.
The script and arguments details are as follows:
python -u cute_main.py --image_path "/tiamat-NAS/songyiren/dataset/CUTE/" --image_size 512 --target_block "up_blocks" --target_layer 0 --target_step 600 --similarity "cosine" --seed 2334 --metric "diffsim"
- `--image_path`: Root path to the image dataset. Please change it to the folder where you downloaded the benchmark.
- `--image_size`: The image resolution.
- `--target_block`: Specifies which part of the UNet is used in the DiffSim setting ("up_blocks", "down_blocks" or "mid_blocks").
- `--target_layer`: Specifies which attention layer within `target_block` is used in the DiffSim setting.
- `--target_step`: Specifies the denoising step used in the DiffSim setting.
- `--metric`: Specifies which metric to evaluate ("diffsim", "clip_cross" or "dino_cross").
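As a reading aid, the flags above can be sketched as an `argparse` parser. This is a hypothetical illustration, not the parser shipped in `cute_main.py`. The function name `build_parser`, the default values (copied from the example command) and the `choices` lists (copied from the option descriptions) are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the repository's code)."""
    parser = argparse.ArgumentParser(description="Sketch of a DiffSim-style evaluation CLI")
    parser.add_argument("--image_path", type=str, required=True,
                        help="Root path to the downloaded image dataset")
    # Defaults below are assumptions taken from the example command in this README.
    parser.add_argument("--image_size", type=int, default=512,
                        help="Image resolution")
    parser.add_argument("--target_block", type=str, default="up_blocks",
                        choices=["up_blocks", "down_blocks", "mid_blocks"],
                        help="Part of the UNet to read features from")
    parser.add_argument("--target_layer", type=int, default=0,
                        help="Attention layer index within the target block")
    parser.add_argument("--target_step", type=int, default=600,
                        help="Denoising step")
    parser.add_argument("--similarity", type=str, default="cosine",
                        help="Similarity function")
    parser.add_argument("--seed", type=int, default=2334,
                        help="Random seed")
    parser.add_argument("--metric", type=str, default="diffsim",
                        choices=["diffsim", "clip_cross", "dino_cross"],
                        help="Metric to evaluate")
    return parser


# Usage: parse an explicit argument list instead of sys.argv.
# "/path/to/CUTE/" is a placeholder for your own download folder.
args = build_parser().parse_args(["--image_path", "/path/to/CUTE/", "--target_step", "600"])
print(vars(args))
```

Restricting `--target_block` and `--metric` with `choices` makes a mistyped value fail at startup rather than partway through a long evaluation run.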
Note: a complete evaluation run may take more than 30 minutes on some benchmarks.
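The example command passes `--similarity "cosine"`. For readers unfamiliar with the measure, here is a minimal pure-Python sketch of cosine similarity between two feature vectors, plus a mean over paired token features. The helper names and the averaging step are assumptions for illustration. This README does not specify how the repository extracts or aggregates attention features, so this is not the DiffSim implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_token_similarity(feats_a: Sequence[Sequence[float]],
                          feats_b: Sequence[Sequence[float]]) -> float:
    """Average cosine similarity over paired token features (an assumed aggregation)."""
    if len(feats_a) != len(feats_b) or len(feats_a) == 0:
        raise ValueError("feature lists must be non-empty and of equal length")
    scores = [cosine_similarity(a, b) for a, b in zip(feats_a, feats_b)]
    return sum(scores) / len(scores)


# Toy features: the first token pair is identical, the second is orthogonal.
print(mean_token_similarity([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]]))
```

Cosine similarity depends only on the direction of the vectors, not their magnitude, so scaling a feature vector does not change the score.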
If you find this codebase useful for your research, please cite it using the following entry.
TODO