
Learned representation-guided diffusion models for large-image generation

Official code for our CVPR 2024 publication Learned representation-guided diffusion models for large-image generation. This codebase builds heavily on CompVis/latent-diffusion and PathLDM.

teaser figure

Requirements

To install the Python dependencies:

conda env create -f environment.yaml
conda activate ldm

Downloading + Organizing Data

Due to storage limitations, we cannot upload the image patches and embeddings used for training. However, the training data can be curated by following these steps:

Download the WSIs

We train diffusion models on TCGA-BRCA, TCGA-CRC, and the Chesapeake Land Cover datasets. For BRCA and CRC, we used the DSMIL repository to extract 256 x 256 patches at 20x magnification and conditioned the diffusion models on embeddings from HIPT and iBOT, respectively. We also train a model on 5x BRCA patches conditioned on CTransPath embeddings. See Section 4.1 of our paper for more details.

Prepare the patches

Once you clone the DSMIL repository, you can use the following command to extract patches from the WSIs.

$ python deepzoom_tiler.py -m 0 -b 20
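
After tiling, it can help to sanity-check the extracted patches before computing embeddings. The sketch below is only illustrative; the output directory layout and file extension depend on how deepzoom_tiler.py was configured and are assumptions here.

from pathlib import Path
from PIL import Image

# Hypothetical location of the extracted patches; point this at the tiler's output.
patch_dir = Path("WSI/BRCA/single/patches")

patches = sorted(patch_dir.rglob("*.jpeg"))
print(f"Found {len(patches)} patches")

# Spot-check a few patches: we expect 256 x 256 RGB tiles at 20x magnification.
for p in patches[:5]:
    with Image.open(p) as img:
        assert img.size == (256, 256), f"Unexpected size {img.size} for {p}"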

SSL embeddings

Follow the instructions in the HIPT / iBOT repositories to extract an embedding for each patch.
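
As a rough illustration of what this step produces (one vector per patch, saved alongside the patches), the sketch below runs a generic ViT backbone from timm as a stand-in. The backbone, preprocessing, and patch file names are assumptions; the actual embeddings must come from the HIPT / iBOT checkpoints and preprocessing described in their repositories.

import timm
import torch
from PIL import Image
from torchvision import transforms

# Stand-in backbone only: replace with the pretrained HIPT / iBOT encoder.
encoder = timm.create_model("vit_small_patch16_224", pretrained=False, num_classes=0)
encoder.eval()

# Generic ImageNet-style preprocessing; the SSL repos may use different statistics.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

patch_paths = ["patch_0001.jpeg", "patch_0002.jpeg"]  # hypothetical patch files
batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in patch_paths])

with torch.no_grad():
    embeddings = encoder(batch)  # (N, 384) pooled features, one vector per patch

torch.save(embeddings, "patch_embeddings.pt")  # store for use as conditioning during training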

Pretrained models

We provide the following trained models:

Dataset   | # Training images | FID  | Conditioning | Download
BRCA 20x  | 15M               | 6.98 | HIPT         | link
CRC 20x   | 8M                | 6.78 | iBOT         | link
NAIP      | 667K              | 11.5 | ViT-B/16     | link
BRCA 5x   | 976K              | 9.74 | CTransPath   | link
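
The checkpoints follow the CompVis latent-diffusion format, so they can be instantiated from the matching config. A minimal loading sketch (the checkpoint path is a placeholder for wherever you saved the download; the config shown is the CRC 20x one used in the training command below):

import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config

config_path = "configs/latent-diffusion/crc/only_patch_20x.yaml"
ckpt_path = "checkpoints/crc_20x.ckpt"  # hypothetical download location

config = OmegaConf.load(config_path)
model = instantiate_from_config(config.model)

# PyTorch Lightning checkpoints store the weights under "state_dict".
state_dict = torch.load(ckpt_path, map_location="cpu")["state_dict"]
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"Missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")

model = model.cuda().eval()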

Training

Example training command:

python main.py -t --gpus 0,1 --base configs/latent-diffusion/crc/only_patch_20x.yaml
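
The training driver is the PyTorch Lightning main.py inherited from CompVis/latent-diffusion. Assuming its upstream flags are unchanged, an interrupted run can typically be resumed from its log directory (the directory name below is a placeholder):

python main.py -t --gpus 0,1 --resume logs/<run_directory>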

Sampling

Refer to the notebooks in this repository for generating images using the provided models.
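
If you prefer a script over the notebooks, the core sampling loop looks roughly like the sketch below: condition the loaded latent diffusion model on a per-patch SSL embedding and sample with DDIM. The latent shape and the way the embedding is wrapped into conditioning depend on the specific config, so treat those values as assumptions.

import torch
from ldm.models.diffusion.ddim import DDIMSampler

# `model` is a loaded LatentDiffusion model (see the loading sketch above);
# `ssl_embedding` is a (1, D) tensor from HIPT / iBOT for the patch to reproduce.
sampler = DDIMSampler(model)

with torch.no_grad():
    # How the embedding is mapped to conditioning depends on the config's cond_stage setup.
    cond = model.get_learned_conditioning(ssl_embedding.cuda())

    samples, _ = sampler.sample(
        S=50,                  # number of DDIM steps
        conditioning=cond,
        batch_size=1,
        shape=(3, 64, 64),     # (channels, h, w) of the latent space; check the config
        eta=0.0,
        verbose=False,
    )

    # Decode latents back to image space and map from [-1, 1] to [0, 1].
    images = model.decode_first_stage(samples)
    images = torch.clamp((images + 1.0) / 2.0, 0.0, 1.0)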

BibTeX

@inproceedings{graikos2024learned,
  title={Learned representation-guided diffusion models for large-image generation},
  author={Graikos, Alexandros and Yellapragada, Srikar and Le, Minh-Quan and Kapse, Saarthak and Prasanna, Prateek and Saltz, Joel and Samaras, Dimitris},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8532--8542},
  year={2024}
}
