Official code for our CVPR 2024 publication Learned representation-guided diffusion models for large-image generation. This codebase builds heavily on CompVis/latent-diffusion and PathLDM.
To install the Python dependencies:

```shell
conda env create -f environment.yaml
conda activate ldm
```
Due to storage limitations, we cannot upload the image patches and embeddings used for training. However, the training data can be curated by following these steps:
We train diffusion models on the TCGA-BRCA, TCGA-CRC, and Chesapeake Land Cover datasets. For BRCA and CRC, we used the DSMIL repository to extract 256 x 256 patches at 20x magnification and conditioned the diffusion models on embeddings from HIPT and iBOT, respectively. We also train a model on 5x BRCA patches conditioned on CTransPath embeddings. See Section 4.1 of our paper for more details.
Once you have cloned the DSMIL repository, you can use the following command to extract patches from the WSIs:

```shell
python deepzoom_tiler.py -m 0 -b 20
```
Follow the instructions in the HIPT / iBOT repositories to extract an embedding for each patch.
We provide the following trained models:
| Dataset | # Training images | FID | Conditioning | Download link |
|---|---|---|---|---|
| BRCA 20x | 15M | 6.98 | HIPT | link |
| CRC 20x | 8M | 6.78 | iBOT | link |
| NAIP | 667k | 11.5 | ViT-B/16 | link |
| BRCA 5x | 976k | 9.74 | CTransPath | link |
- Customization: Create a config file similar to ./configs/latent-diffusion/crc/only_patch_20x.yaml to train your own diffusion model.
- Sample Dataset: We provide a sample dataset here. Study it to understand the required data format.
- Loading Data: See ./ldm/data/hybrid_cond/crc_only_patch.py for an example of how to load data.
- Embedding Guidance: We feed the SSL embedding via cross-attention (See Line 52 of ./ldm/modules/encoders/modules.py).
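The cross-attention conditioning mechanism above can be sketched in isolation. This is a minimal illustration, not the repo's implementation: the dimensions (2048-d SSL embedding, 512-d context) and the single-token context are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EmbeddingConditioner(nn.Module):
    """Condition spatial features on an SSL embedding via cross-attention."""

    def __init__(self, embed_dim: int = 2048, context_dim: int = 512, n_heads: int = 8):
        super().__init__()
        # Project the SSL embedding into the UNet's context dimension.
        self.proj = nn.Linear(embed_dim, context_dim)
        self.attn = nn.MultiheadAttention(context_dim, n_heads, batch_first=True)

    def forward(self, feats: torch.Tensor, ssl_embed: torch.Tensor) -> torch.Tensor:
        # feats: (B, N, C) flattened spatial features; ssl_embed: (B, embed_dim)
        ctx = self.proj(ssl_embed).unsqueeze(1)              # (B, 1, C): one context token
        out, _ = self.attn(query=feats, key=ctx, value=ctx)  # each location attends to the embedding
        return feats + out                                   # residual connection, as in LDM blocks

feats = torch.randn(2, 16 * 16, 512)  # e.g. a 16x16 latent grid, flattened
embed = torch.randn(2, 2048)          # one SSL embedding per image
y = EmbeddingConditioner()(feats, embed)
```

With a single context token per image, cross-attention reduces to injecting the same conditioning signal at every spatial location, modulated by learned attention weights.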
Example training command:

```shell
python main.py -t --gpus 0,1 --base configs/latent-diffusion/crc/only_patch_20x.yaml
```
Refer to these notebooks for generating images using the provided models:
- Image Patches: ./notebooks/brca_patch_synthesis.ipynb
- Large Images: ./notebooks/large_image_generation.ipynb
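At a high level, the notebooks draw samples from the diffusion model with the SSL embedding as the conditioning signal. The toy DDPM-style ancestral sampling loop below illustrates the idea only: the linear noise schedule, 100 steps, and the zero-output `eps_model` stand-in are all assumptions, not the repo's LDM sampler.

```python
import torch

# Illustrative linear noise schedule.
T = 100
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
abar = torch.cumprod(alphas, dim=0)  # cumulative product of alphas

def eps_model(x, t, cond):
    # Stand-in for the trained conditional UNet, which predicts the noise
    # in x at timestep t given the SSL embedding `cond`.
    return torch.zeros_like(x)

@torch.no_grad()
def sample(cond: torch.Tensor, shape=(1, 3, 32, 32)) -> torch.Tensor:
    """Ancestral sampling: start from pure noise, denoise step by step."""
    x = torch.randn(shape)
    for t in reversed(range(T)):
        eps = eps_model(x, t, cond)
        # Posterior mean of the reverse step (standard DDPM update).
        mean = (x - betas[t] / torch.sqrt(1.0 - abar[t]) * eps) / torch.sqrt(alphas[t])
        # Add noise at every step except the last.
        x = mean + torch.sqrt(betas[t]) * torch.randn_like(x) if t > 0 else mean
    return x

img = sample(cond=torch.randn(1, 2048))  # one SSL embedding conditions the sample
```

In the actual codebase, sampling additionally runs in the latent space of a pretrained autoencoder, with the decoded latents producing the final image patches.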
```bibtex
@inproceedings{graikos2024learned,
  title={Learned representation-guided diffusion models for large-image generation},
  author={Graikos, Alexandros and Yellapragada, Srikar and Le, Minh-Quan and Kapse, Saarthak and Prasanna, Prateek and Saltz, Joel and Samaras, Dimitris},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8532--8542},
  year={2024}
}
```