Source code used in the work:
"High-accuracy prostate cancer pathology using deep learning"
Nature Machine Intelligence
by Tolkach Y. et al.

### 1) System requirements ###
*Operating system
Ubuntu 16.04 / 18.04

*Python 3.6.5

*Libraries / Dependencies
tensorflow 1.x (GPU version); also works with 2.0; also tested with the CPU version
keras 2.2.5
staintools 2.1.2
openslide 3.4.1
Pillow (PIL) 7.0.0
Additional packages: numpy, opencv (latest versions); statistics and os are part of the Python standard library

### 2) Hardware requirements ###
The code was developed on a PC with an Nvidia GPU.
No non-standard hardware required.

### 5) Source code modules (correspond to folders) ###

List of modules
1 Training
2 Validation Tumor vs Benign
3 Validation Gleason score
4 WSI pipeline
5 Gleason score minimal tumor size

                        SOURCE CODE MODULES IN DETAIL

# 1 Training #
*Training of the models (model type 1: tumor vs benign, model type 2: Gleason patterns)
*For a detailed description of the training approach see the Manuscript.
*Source code for training is similar for both model types
(3 classes; see details in the Manuscript).

-Only the version for the NASNetLarge network is provided (magnification of approx. 23x)
-Representative images from the training dataset are included (stain normalized)
-Graph architecture of the network is provided (NASNetLarge with shapes.png).

# 2 Validation Tumor vs Benign #
*Validation of the model type 1 using two validation datasets
(Discrimination abilities between tumor and benign tissue).
*Validation datasets (patches of size 600px at 40x) are generated through
tiling of tumor and benign regions from pathologist pre-annotated whole slide images.

- Test can include one or more trained models
- Takes images (patches) from source folder and generates predictions
using model 1 (discrimination tumor vs benign tissue).
- Patches with predictions in the gray zone (see Methods)
are additionally analysed using the C8 approach (see Methods; in short:
analysis not only of the native patch but also of its derivatives: flips and rotations).
- Generated results are saved as a tab-delimited file
- The script can be adapted to any thresholds that trigger C8 testing (including
testing all images without any thresholds)
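
The C8 step described above can be sketched roughly as follows. This is an illustration only, not the repository's code: `predict_fn`, the gray-zone bounds `low` and `high`, and the averaging over the eight derivatives are assumptions; the actual thresholds and aggregation rule are those given in the Methods.

```python
import numpy as np


def c8_derivatives(patch):
    """Return the 8 flip/rotation variants of a patch (H x W x C array)."""
    variants = []
    for k in range(4):  # rotations by 0, 90, 180 and 270 degrees
        rotated = np.rot90(patch, k, axes=(0, 1))
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # mirrored copy of each rotation
    return variants


def predict_with_c8(patch, predict_fn, low=0.2, high=0.8):
    """Return a tumor probability; gray-zone patches are re-tested on all 8 variants.

    predict_fn, low and high are hypothetical placeholders (assumptions).
    """
    p = predict_fn(patch)
    if low < p < high:  # gray zone: the single prediction is treated as uncertain
        # Assumption: the final value is the mean over the 8 derivatives.
        p = float(np.mean([predict_fn(v) for v in c8_derivatives(patch)]))
    return p
```

Widening the interval (for example `low=0.0`, `high=1.0`) sends essentially every patch through the C8 analysis, which corresponds to running without thresholds.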

# 3 Validation Gleason Score #
*Validation of the model type 2 using three validation datasets (see Methods).
*Validation datasets for this purpose differ from the validation datasets for
the Tumor vs Benign setting.
*Validation datasets contain large tumor images of up to 4800 px and more
(not patches; saved at 40x magnification).
- Large images are split into patches of 600x600 px, which are then
reduced to the model working patch size.
- Every patch is analysed by model type 2 (predictions for Gleason patterns 3, 4 and 5)
- Final grading of the whole tumor area according to the developed algorithm
(output: percentages of Gleason patterns, which are subsequently transformed into Gleason scores)
- Results are saved to a tab-delimited file (to be merged later
with the clinical database for agreement and survival-based analysis)
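
The tiling and aggregation steps listed above might look roughly like this. It is a sketch under stated assumptions, not the grading algorithm from the Methods: dropping incomplete edge tiles and assigning each patch its most probable pattern are simplifications, and `tile_boxes` and `pattern_percentages` are hypothetical names.

```python
from collections import Counter


def tile_boxes(width, height, tile=600):
    """Yield (left, top, right, bottom) boxes of the full tiles inside an image."""
    # Assumption: incomplete tiles at the right and bottom edges are dropped.
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield (left, top, left + tile, top + tile)


def pattern_percentages(patch_probs, patterns=(3, 4, 5)):
    """Convert per-patch class probabilities into percentages of Gleason patterns.

    Assumption: every patch counts once, for its most probable pattern.
    """
    if not patch_probs:
        raise ValueError("at least one patch prediction is required")
    votes = Counter(
        patterns[max(range(len(probs)), key=probs.__getitem__)]
        for probs in patch_probs
    )
    total = sum(votes.values())
    return {g: 100.0 * votes.get(g, 0) / total for g in patterns}
```

Each box has the `(left, upper, right, lower)` layout expected by Pillow's `Image.crop`, so a tile can be cropped and then resized to the model working patch size with `Image.resize`.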

# 4 WSI pipeline #
A pipeline for processing of whole-slide images.

$ Functionality:
- Processes the whole slide image (tiling, background/tissue detection, classification,
creation of maps and overlays)
- Final outputs of the pipeline in the provided version are:
  *mathematical maps with predictions and classification results
  *binary maps with classification results: background/tissue AND tumor vs benign;
  Gleason grading maps
  *overlaid images: heatmaps overlaid on the reduced whole-slide images

$ Processes during implementation:
- cutting WSI parts with tissue into patches 600x600 px.
- reduction of patch size to model patch size (350x350 px).
- background / tissue detection (based on the presence of the staining patterns
typical for cell nuclei)

*if patch = non-background
- brightness and stain normalization of the patch (Macenko method)
- test of patch via model type 1 with generation of probabilities of classes
(tumor or benign)
- creation of mathematical map based on probabilities for tumor and benign tissue
using threshold 0.5
- implementing additional strategies to reduce false positive results
(Strategy C8 and single patch environment analysis; see Materials and methods
for description)
- generation of the final binary map of the tumor vs benign tissue
- generation of the image overlay: binary map on the reduced version of the
whole slide image.

- analysis of the tumor-classified patches with model type 2 for Gleason grading
(for details of the Gleason grading algorithm see Materials and Methods)
- generation of the Gleason grading map.
- generation of the overlay: binary Gleason grading map on the reduced version of the whole slide image.
- save binary maps, heatmaps and overlaid images
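
As a rough illustration of the map-building steps above, the sketch below thresholds a grid of tumor probabilities at 0.5 and then clears positive patches that have no positive neighbors. The neighborhood rule is only a stand-in for the single patch environment analysis; the actual strategy is described in Materials and Methods. `binary_map`, `drop_isolated` and `min_neighbors` are hypothetical names.

```python
def binary_map(prob_map, threshold=0.5):
    """Threshold a 2D grid of tumor probabilities; None marks background patches."""
    return [
        [int(p is not None and p >= threshold) for p in row]
        for row in prob_map
    ]


def drop_isolated(bmap, min_neighbors=1):
    """Clear positive patches with fewer than min_neighbors positive 8-neighbors.

    Assumption: isolated positives are treated as likely false positives.
    """
    rows, cols = len(bmap), len(bmap[0])
    cleaned = [row[:] for row in bmap]
    for r in range(rows):
        for c in range(cols):
            if not bmap[r][c]:
                continue
            neighbors = sum(
                bmap[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            )
            if neighbors < min_neighbors:
                cleaned[r][c] = 0
    return cleaned
```

In the pipeline itself the maps are numpy arrays; nested lists are used here only to keep the example free of dependencies.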

$ Code components and description.
images (folder):
- Contains chunk images for binary (heatmap) maps
- Contains standard image for stain normalization

*Files:
- Main script to execute the pipeline
(magnification of analysis is defined through the model patch size)
- Implementation of the C8 strategy (see Methods)
  *generation of patch flip and rotation derivatives
  *final prediction for the patch based on analysis of the derivatives
- Calculation of the tumor area in the slide
- Tissue / background detection functions
- Single patch Gleason grade prediction;
algorithm for Gleason scoring from pattern predictions;
binary Gleason score map generation
- Loading of images for the binary map/heatmap for different classes
- Creation of the mathematical binary map from predictions,
of the binary heatmap as an image, and of the overlaid image
- Main script to process a whole slide image, with separate
implementations for the C1 (native) and C8 strategies (see Methods)
  *background / tissue detection
  *brightness/stain normalization
  *creation of the mathematical map with predictions (numpy array)
- Implementation of the additional "environment" analysis
of single positive patches (see Methods)
- Retrieval of slide info from metadata
- Initialization of brightness and stain normalizers

- WSIs from different scanner vendors may require small modifications to cope with different methods of storing metadata

# 5 Gleason score minimal tumor size #

*Analysis of the minimal tumor size necessary for reliable Gleason Grading.
*Refers to the data presented in Figure 6.
*See Methods for description of principle.
-Gleason grading of subsampled ROIs from large tumor images
-Progressively increasing size of ROIs (in patches)
-Bootstrapping for every ROI size in patches (20 rounds).
-Comparison of the grading of subsampled ROIs and the
grading of the whole image.
-Main aim: to determine the smallest tumor size at which grading becomes reliable, i.e.
representative of the whole tumor area.

Principle (See also Methods):
- Dataset of large tumor images (Examples provided).
- Random extraction of ROIs with a size of 1, 2, 3, ..., n patches from large tumor images.
  *Random generation of coordinates.
- Bootstrapping with 20 iterations per ROI size (number of subsampled patches).
- Control for non-intersecting patch extraction.
- Control for maximal image size to exclude some images from analysis with large ROI sizes.
- Gleason grading according to algorithm described in Methods and implemented in the
WSI pipeline.
- Saving grading results for every bootstrapping iteration for further comparison to grading
results of the whole tumor area in the image.
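
The subsampling principle can be sketched as below. This is a simplified illustration and not the repository's implementation: patches are drawn from a regular 600 px grid so that they cannot intersect, whereas the original code generates random coordinates and checks for intersections. `sample_roi`, `bootstrap_rois` and the `seed` parameter are hypothetical.

```python
import random


def sample_roi(img_w, img_h, n_patches, tile=600, rng=random):
    """Pick n non-intersecting patch positions, or None if the image is too small."""
    cols, rows = img_w // tile, img_h // tile
    if n_patches > cols * rows:  # control for maximal image size
        return None
    # Distinct grid cells never overlap, so no intersection check is needed.
    cells = rng.sample(range(cols * rows), n_patches)
    return [((c % cols) * tile, (c // cols) * tile) for c in cells]


def bootstrap_rois(img_w, img_h, roi_sizes, rounds=20, seed=0):
    """Draw `rounds` random ROIs per ROI size; skip sizes the image cannot hold."""
    rng = random.Random(seed)
    result = {}
    for n in roi_sizes:
        draws = [sample_roi(img_w, img_h, n, rng=rng) for _ in range(rounds)]
        if draws[0] is not None:
            result[n] = draws
    return result
```

Each returned ROI is a list of top-left patch coordinates; grading every ROI and comparing the result with the grading of the whole image then gives the agreement per ROI size.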

