Official code for our WACV 2024 publication, PathLDM: Text conditioned Latent Diffusion Model for Histopathology. This codebase builds heavily on CompVis/latent-diffusion.
💥 Check out our CVPR paper, Learned representation-guided diffusion models for large-image generation, where we train histopathology diffusion models without labeled data.
To install the Python dependencies:
conda env create -f environment.yaml
conda activate ldm
tl;dr: The TCGA-BRCA image patches, captions, and tumor/TIL probabilities used in our training can be downloaded from this link. See this file for the Dataset class we use during training.
We obtained machine-readable text reports for TCGA from this repo and used GPT-3.5 to summarize them. Summaries of all BRCA reports can be found at this link.
We used wsinfer to obtain tumor and TIL probabilities. wsinfer works directly on the WSI files and outputs a CSV with the probabilities for each patch, but its patch size and magnification may differ from those of the patches extracted by DSMIL. For each 10x patch, we use the average probability of the overlapping wsinfer patches.
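The averaging step can be sketched as follows. This is a minimal illustration, not the code used for the paper: the `Box` class, its field names, and the unweighted mean are assumptions, and both sets of boxes are assumed to be expressed in the same slide-level pixel coordinates. An area-weighted mean would be an equally reasonable choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    """Axis-aligned patch in slide pixel coordinates (hypothetical layout)."""
    minx: float
    miny: float
    width: float
    height: float
    prob: float = 0.0  # assumed: one wsinfer class probability, e.g. tumor


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share a region of non-zero area."""
    return (
        a.minx < b.minx + b.width and b.minx < a.minx + a.width
        and a.miny < b.miny + b.height and b.miny < a.miny + a.height
    )


def mean_overlapping_prob(patch: Box, wsinfer_boxes: List[Box]) -> Optional[float]:
    """Average the probability of every wsinfer box that overlaps `patch`.

    Returns None when no wsinfer box overlaps the patch.
    """
    probs = [b.prob for b in wsinfer_boxes if overlaps(patch, b)]
    return sum(probs) / len(probs) if probs else None


# Toy example: one 1024-pixel patch covered by four 512-pixel wsinfer boxes,
# plus one wsinfer box that lies outside the patch and is ignored.
wsinfer_boxes = [
    Box(0, 0, 512, 512, prob=0.25),
    Box(512, 0, 512, 512, prob=0.75),
    Box(0, 512, 512, 512, prob=0.5),
    Box(512, 512, 512, 512, prob=0.5),
    Box(2048, 2048, 512, 512, prob=1.0),
]
patch = Box(0, 0, 1024, 1024)
print(mean_overlapping_prob(patch, wsinfer_boxes))  # prints 0.5
```

For millions of patches, a spatial index or a grid lookup would be preferable to this linear scan over all wsinfer boxes.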
We used the DSMIL repository to extract 256 x 256 patches at 10x magnification, resulting in 3.2 million patches for TCGA-BRCA. The following steps are borrowed from the DSMIL repository.
Download the WSIs from the GDC data portal. You can use the GDC data portal with a manifest file and a configuration file. The raw WSIs take about 1 TB of disk space and may take several days to download. Please check the details regarding the use of the TCGA data portal. Alternatively, individual WSIs can be downloaded manually from the GDC data portal repository.
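Before starting a multi-day download, it can help to check how much disk space a manifest will need. The sketch below assumes the manifest is a tab-separated file with a `size` column in bytes (alongside `id`, `filename`, `md5`, and `state`); the two-entry manifest is made up for illustration.

```python
import csv
import io

# Hypothetical two-entry manifest; a real one is exported from the GDC portal.
EXAMPLE_MANIFEST = (
    "id\tfilename\tmd5\tsize\tstate\n"
    "uuid-1\tslide_a.svs\tabc\t1500000000\treleased\n"
    "uuid-2\tslide_b.svs\tdef\t500000000\treleased\n"
)


def manifest_total_bytes(manifest_file) -> int:
    """Sum the `size` column of a tab-separated manifest (assumed layout)."""
    reader = csv.DictReader(manifest_file, delimiter="\t")
    return sum(int(row["size"]) for row in reader)


total = manifest_total_bytes(io.StringIO(EXAMPLE_MANIFEST))
print(f"{total / 1e9:.1f} GB across the manifest")  # prints 2.0 GB across the manifest
```

To use it on a real manifest, pass an open file handle instead of the `io.StringIO` object.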
Once you clone the DSMIL repository, you can use the following command to extract patches from the WSIs.
$ python deepzoom_tiler.py -m 0 -b 10
We provide the following trained models:
| Conditioning network | Conditioning type | Modality | FID | Link |
|---|---|---|---|---|
| Class embedder | Tumor + TIL | Class label (4 classes) | 29.45 | link |
| OpenAI CLIP | Report + tumor + TIL | Text caption (154 tokens) | 10.64 | link |
| PLIP | Report + tumor + TIL | Text caption (154 tokens) | 7.64 | link |
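The table does not spell out how the four classes of the class-embedder model are defined. One plausible reading is that the tumor and TIL probabilities are each binarized and the two bits are combined; the sketch below illustrates that reading only. The function name, the 0.5 threshold, and the class numbering are all assumptions, so check the Dataset class in the repository for the actual encoding.

```python
def tumor_til_class(tumor_prob: float, til_prob: float, threshold: float = 0.5) -> int:
    """Map two probabilities to one of four class ids (hypothetical encoding).

    0: low tumor / low TIL     1: low tumor / high TIL
    2: high tumor / low TIL    3: high tumor / high TIL
    """
    high_tumor = tumor_prob >= threshold  # assumed threshold, not from the paper
    high_til = til_prob >= threshold
    return 2 * int(high_tumor) + int(high_til)


print(tumor_til_class(0.9, 0.1))  # prints 2
```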
To train a diffusion model, create a config file similar to this one and create or update the corresponding dataloader (e.g., this one). To download the frozen VAEs, follow the instructions in the original LDM repo.
Example training command:
python main.py -t --gpus 0,1 --base configs/latent-diffusion/text_cond/plip_imagenet_finetune.yaml
This notebook shows how to sample from the text conditioned diffusion model.
@InProceedings{Yellapragada_2024_WACV,
author = {Yellapragada, Srikar and Graikos, Alexandros and Prasanna, Prateek and Kurc, Tahsin and Saltz, Joel and Samaras, Dimitris},
title = {PathLDM: Text Conditioned Latent Diffusion Model for Histopathology},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2024},
pages = {5182-5191}
}