innovationcore/monai-mil

Multiple Instance Learning (MIL) Examples

This tutorial provides a baseline method for Multiple Instance Learning (MIL) classification of Whole Slide Images (WSI). The dataset comes from the Prostate cANcer graDe Assessment (PANDA) Challenge 2020, which targets cancer grade classification from prostate histology WSIs. The implementation is based on:

Andriy Myronenko, Ziyue Xu, Dong Yang, Holger Roth, Daguang Xu: "Accounting for Dependencies in Deep Learning Based Multiple Instance Learning for Whole Slide Imaging". In MICCAI (2021). arXiv

Requirements

The script is tested with:

  • Ubuntu 18.04 | Python 3.6 | CUDA 11.0 | PyTorch 1.10

  • The default pipeline requires about 16 GB of memory per GPU.

  • It was tested on a multi-GPU machine with 4 x 16 GB GPUs.

Dependencies and installation

MONAI

Please install the required dependencies:

pip install tifffile
pip install imagecodecs

For more information please check out the installation guide.

Data

The prostate biopsy WSI dataset can be downloaded from the Prostate cANcer graDe Assessment (PANDA) Challenge on Kaggle. This tutorial assumes it has been downloaded to the /PandaChallenge2020 folder.

Examples

List all available options:

python ./panda_mil_train_evaluate_pytorch_gpu.py -h

Train

Train in multi-GPU mode with AMP using all available GPUs. The command assumes the training images are in the /PandaChallenge2020/train_images folder and uses the pre-defined 80/20 data split in datalist_panda_0.json:

python -u panda_mil_train_evaluate_pytorch_gpu.py \
    --data_root=/PandaChallenge2020/train_images \
    --amp \
    --distributed \
    --mil_mode=att_trans \
    --batch_size=4 \
    --epochs=50 \
    --logdir=./logs

If you need to use only specific GPUs, add the prefix CUDA_VISIBLE_DEVICES=... to the command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u panda_mil_train_evaluate_pytorch_gpu.py \
    --data_root=/PandaChallenge2020/train_images \
    --amp \
    --distributed \
    --mil_mode=att_trans \
    --batch_size=4 \
    --epochs=50 \
    --logdir=./logs
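
The pre-defined 80/20 split mentioned above lives in datalist_panda_0.json. If you need a comparable split for your own slides, it could be produced along the following lines. This is only a minimal sketch: the "training"/"validation" keys, the "image"/"label" fields, the file names, and the helper name make_datalist are assumptions made for illustration, so check them against the datalist shipped with the tutorial before relying on them.

```python
import json
import random


def make_datalist(records, val_fraction=0.2, seed=0):
    """Shuffle records deterministically and split them into train/validation.

    The output keys ("training", "validation") are an assumed schema,
    not one confirmed by this README.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_val = int(round(len(shuffled) * val_fraction))
    return {"training": shuffled[n_val:], "validation": shuffled[:n_val]}


# Hypothetical records: file names and labels are placeholders.
records = [{"image": f"slide_{i:03d}.tiff", "label": i % 6} for i in range(10)]
datalist = make_datalist(records)

# Serialize the split; writing it to a file is left to the caller.
datalist_json = json.dumps(datalist, indent=2)
print(len(datalist["training"]), len(datalist["validation"]))
```

A fixed seed makes the split repeatable across runs, which matters when comparing checkpoints trained at different times.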

Validation

Run inference with the best checkpoint on the validation set:

# Validate checkpoint on a single gpu
python -u panda_mil_train_evaluate_pytorch_gpu.py \
    --data_root=/PandaChallenge2020/train_images \
    --amp \
    --mil_mode=att_trans \
    --checkpoint=./logs/model.pt \
    --validate

Inference

Run inference on a different dataset. This uses the same script as validation, with a different data_root and JSON list file:

python -u panda_mil_train_evaluate_pytorch_gpu.py \
    --data_root=/PandaChallenge2020/some_other_files \
    --dataset_json=some_other_files.json \
    --amp \
    --mil_mode=att_trans \
    --checkpoint=./logs/model.pt \
    --validate

Stats

Expected train and validation loss curves

Expected validation QWK (quadratic weighted kappa) metric
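
For readers unfamiliar with the QWK (quadratic weighted kappa) metric, the sketch below shows one way to compute it from integer grade labels. It is a plain-Python illustration, not the code used by the training script. The function name, the six-class setting (assuming grades 0 to 5), and the handling of the degenerate single-class case are assumptions made here.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic weights for integer labels in [0, num_classes)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Penalty grows with the squared distance between grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / n  # agreement expected by chance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # All labels and predictions fall in one class; returning 1.0 is a
        # convention chosen for this sketch, other implementations differ.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Perfect agreement scores 1.0; chance-level agreement scores 0.0.
print(quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], num_classes=6))
print(quadratic_weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], num_classes=2))
```

Because the weights are quadratic, predicting a grade far from the true one costs more than an off-by-one error, which suits ordinal targets such as cancer grades.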

Questions and bugs

  • For questions relating to the use of MONAI, please use our Discussions tab on the main repository of MONAI.
  • For bugs relating to MONAI functionality, please create an issue on the main repository.
  • For bugs relating to the running of a tutorial, please create an issue in this repository.
