We are one of the top teams in the AutoPET II challenge! I am happy to answer any questions, just write me an email.
Authors: Matthias Hadlich, Zdravko Marinov, Rainer Stiefelhagen
Link to the paper: https://arxiv.org/abs/2309.12114
Link to the weights of the submission: https://bwsyncandshare.kit.edu/s/Jg3K6Y35jXiBzpK
An extension / rework of the DeepEdit code. Most of the work went into moving the transforms to torch and thus onto the GPU, while preventing the common OOMs with MONAI. In addition, cupy-based distance transforms have been integrated to replace the old scipy-based ones. A lot of the improvements from this code have been integrated into MONAI 1.3.0.
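As a minimal sketch of what this enables (assuming MONAI >= 1.3, where the `distance_transform_edt` utility is available, plus cupy/cuCIM for the GPU path), the same call can run on a CPU tensor via scipy or stay entirely on the GPU:

```python
# Minimal sketch, not the exact code of this repository: it only illustrates keeping
# the distance transform on the GPU instead of falling back to the old scipy path.
# Assumes MONAI >= 1.3 and an installed cupy/cuCIM for the CUDA branch.
import torch
from monai.transforms.utils import distance_transform_edt

# channel-first dummy mask (C, H, W, D)
mask = torch.zeros(1, 64, 64, 64)
mask[0, 20:40, 20:40, 20:40] = 1.0

# CPU path: dispatches to scipy under the hood
edt_cpu = distance_transform_edt(mask)

# GPU path: dispatches to cupy/cuCIM and keeps the result on the device
if torch.cuda.is_available():
    edt_gpu = distance_transform_edt(mask.cuda())
    print(edt_gpu.device)  # cuda:0
```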
This code was extracted from MONAI and reworked by M.Sc. Zdravko Marinov, Karlsruhe Institute of Technology ([email protected]), and then made its way into my master's thesis, which is this repository. 2023, M.Sc. Matthias Hadlich, Karlsruhe Institute of Technology ([email protected])
Important: This code has only been tested on 3D PET images from AutoPET(II). 2D images are not supported; I think the code would have to be adapted for that.
Training for 200 epochs with 10 clicks on (224,224,224) patches, with validation on full volumes, finishes in under a week on a single Nvidia A6000 (48 GB). With 10 clicks / guidance during validation: 0.8715 Dice on the full volumes after 800 epochs of training. Without clicks / guidance during validation: 0.7407 Dice on the full volumes after 400 epochs of training.
The `SlidingWindowInferer` not only beats the `SimpleInferer` in terms of convenience (volumes of any size can be used), but also in terms of the Dice score: comparing both at a `val_crop_size` of (192,192,256), the sliding window approach yields a Dice of 0.8383 vs. 0.8102 for the `SimpleInferer`.
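The following sketch (illustrative only, with a stand-in `BasicUNet` and a random volume instead of this repository's actual network and data) shows what sliding-window validation over a full volume looks like with MONAI:

```python
# Minimal sketch: sliding-window inference over a full volume with MONAI.
# The network and volume are stand-ins, not the repository's actual setup.
import torch
from monai.inferers import SlidingWindowInferer
from monai.networks.nets import BasicUNet

net = BasicUNet(spatial_dims=3, in_channels=1, out_channels=2)
net.eval()

# full volume, (batch, channel, H, W, D); real PET volumes are larger
volume = torch.rand(1, 1, 128, 128, 160)

# the comparison above used a val_crop_size of (192,192,256);
# a smaller roi_size is used here to keep the sketch lightweight
inferer = SlidingWindowInferer(roi_size=(96, 96, 96), sw_batch_size=1, overlap=0.25, mode="gaussian")

with torch.no_grad():
    pred = inferer(inputs=volume, network=net)

print(pred.shape)  # prediction has the same spatial size as the input volume
```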
Use the `train.py` file for training. Use the `--resume_from` flag to resume training from an aborted previous experiment. Example usage:
python train.py -a -i /projects/mhadlich_segmentation/AutoPET/AutoPET --dataset AutoPET -o /projects/mhadlich_segmentation/data/20 -c /local/work/mhadlich/cache -ta -e 400
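Resuming the run above from a saved checkpoint might then look like this (the checkpoint path is a placeholder):

python train.py -a -i /projects/mhadlich_segmentation/AutoPET/AutoPET --dataset AutoPET -o /projects/mhadlich_segmentation/data/20 -c /local/work/mhadlich/cache -ta -e 400 --resume_from /path/to/checkpoint.pt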
For evaluation, also use the `train.py` file and simply add the `--eval_only` flag. The network will then only run the evaluator, which finishes after one epoch. Evaluation uses the images and the labels and thus prints a metric at the end.
Use the `--resume_from` flag to load previous weights.
Use `--save_pred` to save the resulting predictions.
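A hedged example of such an evaluation-only run (the paths and the checkpoint name are placeholders):

python train.py -a -i /projects/mhadlich_segmentation/AutoPET/AutoPET --dataset AutoPET -o /projects/mhadlich_segmentation/data/eval -c /local/work/mhadlich/cache -ta --eval_only --resume_from checkpoint.pt --save_pred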
For testing, use the `test.py` file. Example usage on the AutoPET test .mha files:
python test.py -i /input/images/pet/ -o /output/images/automated-petct-lesion-segmentation/ --non_interactive -a --resume_from checkpoint.pt -ta --dataset AutoPET2_Challenge --dont_check_output_dir --no_log --sw_overlap 0.75 --no_data --val_sw_batch_size 8
Also check out the Dockerfile for testing; it is configured to run on the AutoPET II challenge files.
There are multiple steps involved in getting this to run:

- Optional: Create a new conda environment.
- Install monailabel via `pip install monailabel`.
- Install the dependencies of this repository with `pip install -r requirements.txt`, then install this repository as a package via `pip install -e .`. Hopefully this step can be removed in the future once the code is integrated into MONAI.
- Download the radiology sample app: `monailabel apps --download --name radiology --output .` (alternative: download the entire monailabel repo and just launch monailabel from there).
- Copy the files from this repo under `monailabel/` to `radiology/lib/`, into the corresponding folders `infers/` and `configs/`.
- Download the weights from https://bwsyncandshare.kit.edu/s/Yky4x6PQbtxLj2H, rename the file to `pretrained_sw_fastedit.pt`, and put it into the (new) folder `radiology/model/`. This model was pretrained on tumor-only AutoPET volumes.
- Make sure your images follow the monailabel convention, e.g. all Nifti files in one folder `imagesTs`.
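After these steps, the layout should look roughly like this (a sketch only; the file names inside `infers/` and `configs/` depend on what you copied from this repo's `monailabel/` folder):

```
radiology/
├── lib/
│   ├── infers/    <- files copied from this repo's monailabel/ folder
│   └── configs/   <- files copied from this repo's monailabel/ folder
└── model/
    └── pretrained_sw_fastedit.pt
imagesTs/
└── <your Nifti images>
```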
You can then run the model with the following command (adapt the studies path to where your images lie):
monailabel start_server --app radiology --studies ../imagesTs --conf models sw_fastedit
Use `compute_metrics.py` to compute the metrics. Example usage:
python compute_metrics.py -l /projects/mhadlich_segmentation/AutoPET/AutoPET/labelsTs/ -p /projects/mhadlich_segmentation/user_study/baseline -o eval/
For the AutoPET II challenge this code has been dockerized. For details check out `build.sh`, `export.sh`, `test_autopet.sh`, and `test.sh`.
- AutoPET: Should not be used; this was our own remodelled tumor-only dataset. Consists of images in the folders `imagesTs`, `labelsTs`, `imagesTr`, `labelsTr`
- AutoPET II: Default for the AutoPET NIFTI structure; supply the path to the FDG-PET-CT-Lesions folder to start
- AutoPET2_Challenge: AutoPET II challenge mode, loads the .mha files. Look into `test.sh` for details on how to call it
- HECKTOR: Default for the HECKTOR NIFTI dataset, so supply the HECKTOR folder with the two subfolders `hecktor2022_training` and `hecktor2022_testing`
- MSD Spleen: Untested, still exists for legacy reasons
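For example, training on the AutoPET II NIFTI structure might look something like this (the dataset key and all paths are placeholders; check the argument parser for the exact names):

python train.py -a -i /path/to/FDG-PET-CT-Lesions --dataset AutoPET2 -o /path/to/output -c /path/to/cache -ta -e 200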
This code initially ran on an 11 GB GPU and should still run on 24 GB GPUs if you set `train_crop_size` (and maybe also `val_crop_size`) low enough (e.g., `--train_crop_size='(128,128,128)'`).
Most of this code runs on magic, since MONAI leaks memory in the same way PyTorch does. For more details look into Project-MONAI/MONAI#6626 - the problem described there is based on this code. In essence: usually the problem is not that torch uses too much memory, but rather that the garbage collector does not clean up the MONAI / torch objects often enough, so pointers to GPU memory stay alive for too long and hog GPU memory that would in theory already be free for reuse. Thus we need to encourage the gc to collect more often, which is done with the `GarbageCollector` handler.
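A minimal sketch of what this looks like (not this repository's actual training setup, just MONAI's `GarbageCollector` handler attached to a dummy Ignite engine):

```python
# Minimal sketch: force gc.collect() after every iteration via MONAI's
# GarbageCollector handler, so stale MONAI/torch objects that still hold
# GPU memory are released promptly. The dummy engine stands in for the
# actual interactive training loop of this repository.
from ignite.engine import Engine
from monai.handlers import GarbageCollector


def dummy_step(engine, batch):
    # placeholder step; the real step runs the click simulation + forward/backward pass
    return batch


trainer = Engine(dummy_step)

# trigger_event="iteration" collects after every iteration; "epoch" is cheaper
# but frees GPU memory less aggressively
GarbageCollector(trigger_event="iteration").attach(trainer)

trainer.run(range(4), max_epochs=1)
```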
If you run into unexpected OOMs (which sadly can still happen), try the following:

- Increase the frequency of the GarbageCollector collection steps
- Don't move any data onto the GPU during the pre-transforms
- There are a ton of debugging options implemented in this repository, starting with the GPU_Thread and transforms that print the GPU memory usage (note that the dataloader can spawn in a different process and thus use GPU memory independently of the main process; that memory won't show up in `torch.cuda.memory_summary()`). A minimal sketch of such a transform is shown after this list.
- Manually set `gpu_size` to small
- Send me an email and I'll try to help (matthiashadlichatyahoo.de)
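As referenced above, a minimal sketch of such a GPU-memory-printing transform could look like this (illustrative only, not the exact transform from this repository):

```python
# Illustrative sketch of a dictionary transform that logs CUDA memory usage,
# so you can see which step of the (pre-)transform pipeline allocates GPU memory.
# Remember that memory allocated in dataloader worker processes will not show up here.
import torch
from monai.config import KeysCollection
from monai.transforms import MapTransform


class PrintGPUMemoryd(MapTransform):
    """Hypothetical transform: prints allocated/reserved CUDA memory and passes data through."""

    def __init__(self, keys: KeysCollection, name: str = "", device: int = 0):
        super().__init__(keys, allow_missing_keys=True)
        self.name = name
        self.device = device

    def __call__(self, data):
        if torch.cuda.is_available():
            allocated = torch.cuda.memory_allocated(self.device) / 1024**3
            reserved = torch.cuda.memory_reserved(self.device) / 1024**3
            print(f"[{self.name}] CUDA memory: {allocated:.2f} GiB allocated, {reserved:.2f} GiB reserved")
        return data
```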
Not working, I think:

- When desperate enough, try: `PYTORCH_NO_CUDA_MEMORY_CACHING=1`
- In the past I restricted cupy with `CUPY_GPU_MEMORY_LIMIT="18%"`; however, this appears to pin the cupy memory and thus overall increase the memory cupy uses
- Not recommended: `PYTORCH_CUDA_ALLOC_CONF=backend:cudaMallocAsync`, breaks with cupy