Version 2.3.0 #344

Merged 110 commits into master on Dec 21, 2023

Conversation

jamesdolezal (Collaborator)

Highlights

The highlight of Slideflow 2.3 is the introduction of whole-slide tissue segmentation. Both binary and multiclass tissue segmentation models can be trained from labeled ROIs and deployed for slide QC or used to generate ROIs. This release also adds CycleGAN-based stain normalization, as well as several smaller features and optimizations.

Table of Contents

  1. Tissue Segmentation
    a. Training segmentation models
    b. Using models for QC
    c. Generating ROIs
    d. Deploying in Studio
  2. CycleGAN Stain Normalization
  3. Other New Features
  4. Dependencies
  5. Known Issues

Tissue Segmentation

[Video demo: tissue_seg.mp4]

Slideflow now supports training and deploying tissue segmentation models, both via the programmatic interface and in Slideflow Studio. Tissue segmentation models can be trained in binary, multiclass, or multilabel mode using labeled ROIs. Segmentation operates at the whole-slide level: models are trained on randomly cropped sections of the slide thumbnail at a specified resolution.

Training segmentation models

Segmentation models are configured using SegmentConfig, which determines the segmentation architecture (U-Net, FPN, DeepLabV3, etc.), the image resolution for segmentation in microns-per-pixel (MPP), and other training parameters.

from slideflow import segment

# Create a config object
config = segment.SegmentConfig(mpp=20, mode='binary', arch='Unet')

Models can be trained with slideflow.segment.train(). The trained model is saved in the given destination directory as model.pth, alongside an auto-generated segment_config.json file describing the architecture and training parameters.

...

# Load a dataset
project = sf.Project(...)
dataset = project.dataset(...)

# Train the model
segment.train(config, dataset, dest='path/to/output')

Once trained, tissue segmentation models can either be used for slide-level QC or to generate ROIs.

Using models for QC

The new slideflow.slide.qc.Segment class provides an easy interface for generating QC masks from a segmentation model (e.g., a model trained to identify tumor regions, pen marks, etc.). This class takes a path to a trained segmentation model as an argument and can otherwise be used for QC as outlined in the documentation.

import slideflow as sf
from slideflow.slide import qc

# Load the slide
wsi = sf.WSI('/path/to/slide', ...)

# Create the QC algorithm
segmenter = qc.Segment('/path/to/model.pth')

# Apply QC
applied_mask = wsi.qc(segmenter)

For multiclass segmentation models, qc.Segment provides additional arguments to customize how the model should be used for QC.
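
For illustration only, a hedged sketch of what this might look like. The class_idx argument below is a hypothetical placeholder, not a confirmed parameter name; consult the qc.Segment documentation for the actual multiclass options.

...

# Hypothetical: restrict QC to a single class of a multiclass model.
# 'class_idx' is an assumed argument name, used here for illustration.
segmenter = qc.Segment('/path/to/multiclass_model.pth', class_idx=1)

# Apply QC as before
applied_mask = wsi.qc(segmenter)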

Generating ROIs

The same qc.Segment class can also be used to generate regions of interest (ROIs). Use Segment.generate_rois() to generate and apply ROIs to a single slide:

...

# Create the segmentation-based QC algorithm
segmenter = qc.Segment('/path/to/model.pth')

# Generate and apply ROIs to a slide
roi_outlines = segmenter.generate_rois(wsi)

Or use Dataset.generate_rois() to create ROIs for an entire dataset:

import slideflow as sf

# Load a project and dataset.
project = sf.load_project('path/to/project')
dataset = project.dataset()

# Generate ROIs for all slides in the dataset.
dataset.generate_rois('path/to/model.pth')

Deploying in Studio

The slide widget in Studio now has a "Segment" section. A trained segmentation model can be loaded and used for either QC or to generate ROIs. Further details regarding use are available in the documentation.

CycleGAN Stain Normalization

Slideflow now includes a CycleGAN-based stain normalizer, 'cyclegan'. Our implementation is based on the work by Zingman et al. The stain normalization algorithm is a two-step process using two separate GANs: the H&E image to be transformed is first converted by GAN-1 into Masson's Trichrome (MT), then converted back to H&E by GAN-2. By default, pretrained weights provided by Zingman will be used, although custom weights can also be provided.

At present, CycleGAN stain normalization requires PyTorch. If you would like us to port GAN normalizers to the Tensorflow backend, please head to our ongoing Discussion and let us know!

This method can be used like any other stain normalizer:

# Configure training parameters
# to use CycleGAN stain normalization
params = sf.ModelParams(..., normalizer='cyclegan')
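
Outside of model training, the normalizer can also be applied to images directly. A brief sketch, assuming the standard stain normalizer interface shown elsewhere in these notes (sf.norm.autoselect and .transform()); img is a uint8 RGB tile image loaded elsewhere:

import slideflow as sf

# Fetch the CycleGAN normalizer (requires the PyTorch backend)
normalizer = sf.norm.autoselect('cyclegan')

# Normalize a uint8 RGB image
normalized = normalizer.transform(img)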

Other New Features

  • Stain normalizers can now augment an image without also normalizing, using the new .augment() method.

    import slideflow as sf

    # Get a Macenko normalizer
    macenko = sf.norm.autoselect('macenko')

    # Perform stain augmentation on a uint8 RGB image
    img = macenko.augment(img)
  • Expanded support for tile aggregation methods, which reduce tile-level predictions to slide- or patient-level predictions. The reduce_method argument to Project.train() and .evaluate() now supports 'median', 'sum', 'min', and 'max' (in addition to the previously supported 'average' and 'proportion'), as well as arbitrary callable functions. For example, to define slide-level predictions as the 75th percentile of tile-level predictions:

    import numpy as np

    Project.train(
        ...
        reduce_method=lambda x: np.percentile(x, 75)
    )
  • New utility function Dataset.get_unique_roi_labels() for getting a list of all unique ROI labels in a dataset.

  • Improved inference speed of PyTorch feature extractors when called on uint8 images.

  • Much faster generation of tile-level predictions for MIL models.

  • Added sf.mil.get_mil_tile_predictions(), which works the same as sf.mil.save_mil_tile_predictions() but returns a pandas DataFrame.

  • Added the ability to calculate tile-level uncertainty for MIL models trained with UQ, by passing uq=True to sf.mil.get_mil_tile_predictions() (see the sketch after this list).
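
As referenced above, a minimal sketch of retrieving tile-level MIL predictions with uncertainty. Only the function name and the uq=True argument are confirmed by these notes; the remaining arguments (model path, dataset) are assumptions modeled on other sf.mil functions, so consult the API reference for the actual signature.

import slideflow as sf

# Load a project and dataset
project = sf.load_project('path/to/project')
dataset = project.dataset()

# Hypothetical call: positional arguments are assumed for illustration.
# Returns a pandas DataFrame; uq=True adds tile-level uncertainty
# columns for MIL models trained with UQ.
df = sf.mil.get_mil_tile_predictions(
    'path/to/mil_model',
    dataset,
    uq=True
)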

Dependencies

Dependencies are largely unchanged. Updates include:

  • Tissue segmentation requires the segmentation-models-pytorch package (install with pip install segmentation-models-pytorch).

Known Issues

  • Tissue segmentation is performed at the whole-slide level (based on cropped thumbnails) and performs best at lower magnifications (microns-per-pixel of 10 or greater). Attempting to train or deploy a tissue segmentation model at higher magnification may significantly increase memory requirements; optimization work is ongoing to reduce these requirements.

jamesdolezal and others added 30 commits on November 1, 2023
- Save a slide alignment with `WSI.alignment.save(path)`
- Re-apply a saved alignment with `WSI.load_alignment(path)` or `WSI.apply_alignment(Alignment.load(path))`
- If a PyTorch, GPU-enabled normalizer has a device set, use this device when calculating DatasetFeatures
- Move normalizers to a device when setting the `.device` attribute
- Change the preferred device for Reinhard PyTorch normalizer from 'gpu' to 'cuda'
- Not dropping this batch can lead to shape errors, issues with batch normalization, and other problems.
- Disable "use slide bounding boxes" option in Studio with cucim backend
- Alignment is now performed with reference to base slide dimensions (untransformed, without bounds or flipping/rotating). This is groundwork to allow alignments to be portable across tile sizes and slides (but potentially consistent between slide scanners)
- Reduce memory usage when generating images with CuCIM by using float32 instead of float64 during image conversion
- This initial support needs minor refactoring for efficiency and to ensure broader compatibility
- Add multiclass support to segmentation models
- Various bug fixes with segmentation training
- Multilabel segmentation support for qc.Segment
- Error was raised when a dataset has old/outdated index files that need to be regenerated. This fix circumvents the problem by regenerating index files before calculating/exporting features.
- Add segmentation documentation
- Switch from `loss_mode` to `mode`; add `lr` parameter
- Auto-detect `out_classes` from segmentation labels
- Fix bug with GPU stain augmentation in PyTorch (ValueError: Stain augmentation (n) requires a stain normalizer, which was not provided)
- Fix "AssertionError: Input tensor must be float" for some PyTorch models deployed in Studio
- Fix edge case where there is 1 tile in a slide
- Fix bug in Studio in instances where there are no tiles in a slide (e.g. a JPEG image smaller than the tile size)
- New refresh button for loading in user-trained cellpose models
- Fix error raised with whole-slide cell segmentation in Studio
- Fix inconsistent transparency issues with cell mask viewing in Studio
- Small fix: flip mask in Otsu QC if `roi_method == 'outside'`
- Reducing tile-level predictions into slide- and patient-level predictions can now be done using arbitrary callable functions, by passing a callable function (e.g. lambda) to the argument `reduce_method`. Additional supported functions now also include 'median', 'sum', 'min', and 'max'.
jamesdolezal merged commit b117406 into master on Dec 21, 2023