
/****
 * Copyright (C) 2022-2024 IRCAD France - All rights reserved.
 * This file is part of Disrumpere.
 * Disrumpere can not be copied, modified and/or distributed without
 * the express permission of IRCAD France.
 ****/

trusted_datapaper_ds

Goal of the Python project

This repository contains the scripts developed to analyze the TRUSTED dataset, as well as to evaluate some baseline deep-learning models for segmentation and registration. Here is how to use it for reproducibility. Each point is organized as:

  • config file: the configuration file to set before running the specific commands. The main fields you will have to update are the paths; I recommend not changing them too much
  • config variables to set: the fields to check in the configuration file before running the specific commands
  • command: the specific command to run

Installation of the python project

1. Clone the repo:

Commands:

git clone https://git.ircad.fr/wndzimbong/trusted_datapaper_ds.git

cd trusted_datapaper_ds

2. Environment setting:

Commands:

conda create -n trusted_env python=3.9

conda activate trusted_env

pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

pip install monai[all]==0.9.0

pip install SimpleITK==2.3.0

pip install pandas==1.3.0

pip install lxml

pip install connected-components-3d==3.12.4

pip install statsmodels==0.14.1

pip install natsort==8.4.0

pip install vtk==9.1.0

pip install jupyterlab

pip install open3d==0.15.2

pip install opencv-contrib-python==4.9.0.80

pip install plyfile==1.0.3

pip install numpymaxflow==0.0.6

pip install hydra-core==1.3.1

pip install telegram-send==0.35

pip install batchgenerators==0.23

pip install nnunet==1.7.1

pip install seaborn==0.12.2

pip install -e .

Data checking

  • Download the data folder "TRUSTED_dataset_for_submission"
  • Check the folder structure, which is described in the README.txt file inside the folder "TRUSTED_dataset_for_submission"

Data processing and analysis

This step describes the operations to run to estimate the ground-truth (GT) annotations, to quantitatively compare the estimated GT with the annotations provided by human experts, and to prepare the data used to evaluate the baseline methods.

3. Notes:

  • config file: configs/anaconfig.yml
  • config variables to set: data_location
  • Check that the structure of the folder "TRUSTED_dataset_for_submission" and the naming convention used match the ones used in "configs/anaconfig.yml" for the "PROVIDED" subfolders
  • In our Python scripts, an annotator is represented by the values: "1" for annotator 1, "2" for annotator 2, "gt" for ground-truth segmentations, and "auto" for automatic segmentations
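
All the commands below follow the same pattern: a script receives a --config_path argument pointing to one of the YAML files in configs/. Roughly, that implies something like the following (a sketch, not the scripts' actual code; only the data_location field from above is assumed):

import argparse
import yaml  # PyYAML, pulled in by the dependencies above

parser = argparse.ArgumentParser()
parser.add_argument("--config_path", type=str, required=True)
args = parser.parse_args()

with open(args.config_path) as f:
    cfg = yaml.safe_load(f)

print(cfg["data_location"])  # e.g. the path to "TRUSTED_dataset_for_submission"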

4. To estimate the ground-truth masks

  • config file: configs/anaconfig.yml
  • config variables to set: myUS_fusedmasks_location, myCT_fusedmasks_location, fuse_USmask, fuse_CTmask
  • command:
python src/trusted_datapaper_ds/dataprocessing/estimate_gtmasks.py --config_path configs/anaconfig.yml

Note: in the file "src/trusted_datapaper_ds/dataprocessing/estimate_gtmasks.py", line 69, there are some resizing parameters meant to avoid memory overload. By default I chose [512, 384, 384], only for the US data, which are quite large. Depending on your memory, you can set different values.
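
For reference, this kind of resizing can be done with SimpleITK, which is in the dependencies. The sketch below is only illustrative (the file name is a placeholder and the axis order of the target size is an assumption), not the code used in estimate_gtmasks.py:

import SimpleITK as sitk

def resize_volume(img, new_size=(512, 384, 384)):
    # Adjust the spacing so the physical extent of the volume is preserved
    new_spacing = [
        sz * sp / nsz
        for sz, sp, nsz in zip(img.GetSize(), img.GetSpacing(), new_size)
    ]
    return sitk.Resample(
        img, new_size, sitk.Transform(), sitk.sitkLinear,
        img.GetOrigin(), new_spacing, img.GetDirection(),
    )

us_img = sitk.ReadImage("01_imgUS.nii.gz")  # placeholder US volume
print(resize_volume(us_img).GetSize())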

5. To estimate the ground-truth landmarks

  • config file: configs/anaconfig.yml
  • config variables to set: myUS_fusedlandmarks_location, myCT_fusedlandmarks_location, fuse_USlandmark, fuse_CTlandmark
  • command:
python src/trusted_datapaper_ds/dataprocessing/estimate_gtldks.py --config_path configs/anaconfig.yml

Note: To obtain CT landmarks in the CT device coordinate system (US landmarks are in the US device coordinate system):

  • config file: configs/anaconfig.yml
  • config variables to set: CTldks_over_images, annotator_mov_ldks
  • command:
python src/trusted_datapaper_ds/dataprocessing/put_landmarks_in_image_space.py --config_path configs/anaconfig.yml
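
Conceptually, moving landmarks into another device's coordinate system just applies a 4×4 homogeneous transform to each 3D point. A minimal numpy sketch (the file names and formats are hypothetical, not the script's actual I/O):

import numpy as np

T = np.loadtxt("CT_to_image_transform.txt")    # hypothetical 4x4 homogeneous matrix
ldks = np.loadtxt("01_ldkCT.txt")              # hypothetical (N, 3) landmark array
ldks_h = np.c_[ldks, np.ones(len(ldks))]       # homogeneous coordinates, shape (N, 4)
moved = (T @ ldks_h.T).T[:, :3]                # landmarks in the target system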

6. To convert masks to meshes

  • config file: configs/anaconfig.yml
  • config variables to set: myDATA, CTmask_to_mesh, USmask_to_mesh, annotator_mask_to_mesh
  • command:
python src/trusted_datapaper_ds/dataprocessing/convert_mask_to_mesh.py --config_path configs/anaconfig.yml

Note: To obtain CT meshes in the CT or US device coordinate systems:

  • config file: configs/anaconfig.yml
  • config variables to set: CTmesh_over_images, USmesh_over_images, annotator_mov_mesh
  • command:
python src/trusted_datapaper_ds/dataprocessing/put_meshes_in_image_space.py --config_path configs/anaconfig.yml
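
The mask-to-mesh conversion is based on a Marching Cubes implementation (see also the coordinate-system note in step 22). As an illustration only, a label mask can be meshed with VTK, which is pinned in the dependencies; file names and the label value are placeholders:

import vtk

reader = vtk.vtkNIFTIImageReader()
reader.SetFileName("01_maskCT.nii.gz")      # placeholder input mask

mc = vtk.vtkDiscreteMarchingCubes()         # Marching Cubes for label images
mc.SetInputConnection(reader.GetOutputPort())
mc.SetValue(0, 1)                           # extract the surface of label 1

writer = vtk.vtkSTLWriter()                 # or vtkPLYWriter for .ply output
writer.SetInputConnection(mc.GetOutputPort())
writer.SetFileName("01_meshCT.stl")
writer.Write()

Note that VTK image readers do not necessarily apply the NIfTI origin/orientation to the mesh coordinates, which is one reason meshes can end up in a different coordinate system than the images (hence steps like 22).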

7. To split CT masks

  • config file: configs/anaconfig.yml
  • config variables to set: myDATA, splitCTmask, annotator_splitCTmask
  • command:
python src/trusted_datapaper_ds/dataprocessing/splitCTmask.py --config_path configs/anaconfig.yml
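
Splitting a CT mask means separating the kidneys into individual components; the connected-components-3d dependency suggests a connected-component analysis, sketched below under that assumption (file names are placeholders):

import cc3d
import numpy as np
import SimpleITK as sitk

img = sitk.ReadImage("01_maskCT.nii.gz")        # placeholder mask of both kidneys
binary = sitk.GetArrayFromImage(img) > 0
labels = cc3d.connected_components(binary)      # one integer label per 3D blob

# Keep the two largest components, assumed to be the two kidneys
ids, counts = np.unique(labels[labels > 0], return_counts=True)
for k, cid in enumerate(ids[np.argsort(counts)[-2:]], start=1):
    out = sitk.GetImageFromArray((labels == cid).astype(np.uint8))
    out.CopyInformation(img)                    # restore spacing/origin/direction
    sitk.WriteImage(out, f"01_maskCT_part{k}.nii.gz")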

8. To compare ground-truth estimated masks with annotator segmentations

  • config file: configs/anaconfig.yml
  • config variables to set: US_analysis_folder, CT_analysis_folder, usdata_eval and ctdata_eval
  • command:
python src/trusted_datapaper_ds/dataprocessing/groundtruth_eval.py --config_path configs/anaconfig.yml
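
The usual agreement metrics for such a comparison are overlap and surface distances. For reference, the Dice score between two binary masks is a few numpy lines (a generic sketch, not necessarily the script's exact implementation):

import numpy as np

def dice(a, b):
    # a, b: binary masks with identical shape
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())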

9. To compute the statistical summary of the comparison in step 8

  • config file: configs/anaconfig.yml
  • config variables to set: US_analysis_folder, CT_analysis_folder, usdata_analysis and ctdata_analysis
  • command:
python src/trusted_datapaper_ds/dataprocessing/dataanalysis.py --config_path configs/anaconfig.yml

10. Data resizing

  • config file: configs/anaconfig.yml
  • config variables to set: usdata_resize, ctdata_resize, newsize, annotator_dataresize
  • command:
python src/trusted_datapaper_ds/dataprocessing/data_resizing.py --config_path configs/anaconfig.yml

Automatic segmentation

Segmentation processes in CT data

11. 3D UNet or VNet models training

  • config file: configs/ctsegconfig.yml
  • config variables to set:
    • data_location, the path to the folder "TRUSTED_dataset_for_submission"
    • img_location, the path to the folder containing the original CT images
    • output_location, the main folder where you want to save your different training outputs (model weights, training graphs, etc.), depending on the modality, the models, etc.
  • Note: the learning rate, weight_decay, and number of epochs are fixed in the file "src/trusted_datapaper_ds/segmentation/U_or_V_Net_training.py" (lines 82-84)
  • command:
python src/trusted_datapaper_ds/segmentation/U_or_V_Net_training.py  --config_path configs/ctsegconfig.yml
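
For orientation, instantiating the two baseline networks with MONAI 0.9 looks roughly like this; the channel/stride configuration shown is a common MONAI default, not necessarily the one fixed in U_or_V_Net_training.py:

from monai.networks.nets import UNet, VNet

# Assumed illustrative configuration; the project's actual hyperparameters
# are fixed inside U_or_V_Net_training.py
unet = UNet(
    spatial_dims=3,
    in_channels=1,
    out_channels=2,                    # background + kidney
    channels=(16, 32, 64, 128, 256),
    strides=(2, 2, 2, 2),
)
vnet = VNet(spatial_dims=3, in_channels=1, out_channels=2)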

12. 3D UNet or VNet inference

  • config file: configs/ctsegconfig.yml
  • config variables to set:
    • trained_models_location, where your training folders are located
    • list_UVnetmodels and list_training_target, depending on what you want to infer
    • mask_seglocation, where the output masks in the original size are saved, by model and training target
  • command:
python src/trusted_datapaper_ds/segmentation/U_or_V_Net_inference.py  --config_path configs/ctsegconfig.yml

The outputs of these inferences have their original sizes.
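
A common MONAI pattern for predicting full volumes at their original size is sliding-window inference; a hedged sketch, assuming a 2-class model (whether the project's inference script uses exactly this pattern is not guaranteed):

import torch
from monai.inferers import sliding_window_inference
from monai.networks.nets import UNet

model = UNet(spatial_dims=3, in_channels=1, out_channels=2,
             channels=(16, 32, 64, 128, 256), strides=(2, 2, 2, 2)).eval()
image = torch.zeros(1, 1, 160, 160, 160)   # placeholder (batch, channel, D, H, W) volume

with torch.no_grad():
    logits = sliding_window_inference(image, roi_size=(128, 128, 128),
                                      sw_batch_size=4, predictor=model)
    mask = logits.argmax(dim=1, keepdim=True)   # discrete label map, full volume size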

13. nnUNet or CoTr models training

  • Manually create a specific folder to train the model: ~/MedSeg/CT_DATA
  • Create 4 subfolders in ~/MedSeg/CT_DATA, named: CTUSimg_128_t, CTmask_a1_128, CTmask_a2_128 and CTmask_mf_128, containing respectively the resized images, resized masks from annotator1, resized masks from annotator2 and resized ground-truth masks
  • In the folder src/trusted_datapaper_ds/segmentation/nnunet_cotr/configs/dataset, there are four .yml files to set properly, depending on what you want to train. In particular, set the values of:
    • path.pth
    • cv, corresponding to the fold you want.
  • config file: src/trusted_datapaper_ds/segmentation/nnunet_cotr/configs/training/training_128_jz_v2.yaml
  • config variables to set:
    • only_val and checkpoint.load to False, to launch a training
    • pth, to set the training results location (remember to change it if you change the data modality)
  • Important note about only_val and checkpoint.load:
    • when only_val==False and checkpoint.load==False, a new training is launched
    • when only_val==False and checkpoint.load==True, the last training continues from "latest.pt", if it exists
    • when only_val==True, the evaluation is run using "best.pt", if it exists
  • command:
cd src/trusted_datapaper_ds/segmentation/nnunet_cotr

To launch the training of nnunet with single training target, with testing fold cv1, run the command:

python mainV2.py -m model=nnunet dataset=ct_128_simple_jz_v2 training=training_128_jz_v2 dataset.cv=cv1

To launch the training of nnunet with double training target, with testing fold cv1, run the command:

python mainDoubleV2.py -m model=nnunet dataset=ct_128_double_jz_v2 training=training_128_jz_v2 dataset.cv=cv1

In the commands above, use mainV2.py or mainDoubleV2.py, the model value nnunet or cotr, and the dataset value ct_128_simple_jz_v2 or ct_128_double_jz_v2, depending on what you need.

14. nnUNet or CoTr models inference

  • config file: /src/trusted_datapaper_ds/segmentation/nnunet_cotr/configs/training/training_128_jz_v2.yaml
  • config variables to set: only_val and checkpoint.load to True
  • command: same command as for training

The segmentation masks with the size 128-128-128 will be located in the folder "~/MedSeg/CT_DATA/medseg_results/". For evaluation, you should move them into the folder corresponding to "seg128location" in "src/trusted_datapaper_ds/segmentation/configs/ctsegconfig.yml" and organize them in the following structure:

├── CT_DATA/
│   ├── CToutput_mask128s/
│   │   ├── nnunet/
│   │   │   ├── single_target_training/
│   │   │   │   ├── 01_maskCT.nii.gz
│   │   │   │   ├── 02_maskCT.nii.gz
│   │   │   │   ├── ...
│   │   │   ├── double_targets_training/
│   │   │   │   ├── 01_maskCT.nii.gz
│   │   │   │   ├── 02_maskCT.nii.gz
│   │   │   │   ├── ...
│   │   ├── cotr/
│   │   │   ├── single_target_training/
│   │   │   │   ├── 01_maskCT.nii.gz
│   │   │   │   ├── 02_maskCT.nii.gz
│   │   │   │   ├── ...
│   │   │   ├── double_targets_training/
│   │   │   │   ├── 01_maskCT.nii.gz
│   │   │   │   ├── 02_maskCT.nii.gz
│   │   │   │   ├── ...

The outputs of these inferences have the size [128,128,128] and will be upsampled in the next step.

Note: After finishing training and/or inference with nnunet and cotr, go back to the project folder:

cd ../../../..

The current directory must be the project folder.

15. Automatic segmentation post-processing

Apply the different post-processing steps (upsampling, meshing, splitCT) to the masks obtained at inference.

  • config file: configs/ctsegconfig.yml
  • config variables to set: list_othermodels, list_training_target, upsampling, meshing, splitCT
  • command:
python src/trusted_datapaper_ds/segmentation/autoseg_postprocess.py  --config_path configs/ctsegconfig.yml
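
The upsampling maps each 128×128×128 prediction back onto the grid of its original image. With SimpleITK this is a resample onto a reference image using nearest-neighbour interpolation, so that label values stay discrete (file names are placeholders; the actual post-processing code may differ):

import SimpleITK as sitk

mask128 = sitk.ReadImage("01_maskCT.nii.gz")    # 128x128x128 prediction
reference = sitk.ReadImage("01_imgCT.nii.gz")   # original full-size image
upsampled = sitk.Resample(
    mask128,
    reference,                    # copy size, spacing, origin, direction
    sitk.Transform(),             # identity transform
    sitk.sitkNearestNeighbor,     # preserve discrete label values
    0,                            # default (background) pixel value
    mask128.GetPixelID(),
)
sitk.WriteImage(upsampled, "01_maskCT_fullsize.nii.gz")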

16. Segmentation evaluation

  • config file: configs/ctsegconfig.yml
  • config variables to set: segresults_folder, list_othermodels, list_UVnetmodels, list_training_target
  • command:
python src/trusted_datapaper_ds/segmentation/segmentation_evaluation.py  --config_path configs/ctsegconfig.yml

17. Segmentation results statistical analysis

  • config file: configs/seganalysis.yml
  • config variables to set: modality, refmodel, reftarget, list_segmodels, segresults_folder
  • command:
python src/trusted_datapaper_ds/segmentation/segmentation_analysis.py  --config_path configs/seganalysis.yml

Segmentation processes in US data

Follow the same steps, with the corresponding commands and config files obtained by changing "CT" to "US" or "ct" to "us".

Registration

18. Landmarks registration (global)

  • config file: configs/regconfig.yml
  • config variables to set: USldks_location, CTldks_location, ldks_model, transfo_location, noise_std, iternumb
  • command:
python src/trusted_datapaper_ds/registration/landmarks_registration.py  --config_path configs/regconfig.yml

It produces the global (or initial) registration matrices.
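
For intuition: landmark registration with known point correspondences has a closed-form least-squares solution (Kabsch/Umeyama). A generic numpy sketch of the rigid case; the project's script may differ, and presumably the noise_std and iternumb options repeat the fit with noise added to the landmarks:

import numpy as np

def rigid_landmark_fit(src, dst):
    """Least-squares rigid transform mapping src (N,3) onto dst (N,3)."""
    c_src, c_dst = src.mean(0), dst.mean(0)
    H = (src - c_src).T @ (dst - c_dst)          # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # avoid a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    T = np.eye(4)                                # 4x4 homogeneous matrix
    T[:3, :3], T[:3, 3] = R, t
    return T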

19. Landmarks registration (global) + Surface-based registration (refinement)

  • config file: configs/regconfig.yml
  • config variables to set: regmethod, ldks_model, USldks_location, CTldks_location, noise_std, iternumb, refine_model, USautomesh_location, CTautomesh_location, transfo_location, BCPD_temp_folder, regpack_dir
  • command:
python src/trusted_datapaper_ds/registration/surface_registration.py  --config_path configs/regconfig.yml

It produces the refinement registration matrices computed with surface-based methods.
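
For intuition, a surface-based refinement in the ICP family can be sketched with Open3D 0.15 (in the dependencies); BCPD, by contrast, is an external package (hence BCPD_temp_folder and regpack_dir). File names, point counts, and the distance threshold below are placeholders:

import numpy as np
import open3d as o3d

src = o3d.io.read_triangle_mesh("kidney_US.ply")   # placeholder moving mesh
dst = o3d.io.read_triangle_mesh("kidney_CT.ply")   # placeholder fixed mesh
src_pcd = src.sample_points_uniformly(5000)        # meshes -> point clouds
dst_pcd = dst.sample_points_uniformly(5000)

init = np.eye(4)  # e.g. the global matrix from landmark registration (step 18)
result = o3d.pipelines.registration.registration_icp(
    src_pcd, dst_pcd,
    max_correspondence_distance=5.0,               # placeholder threshold
    init=init,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(),
)
print(result.transformation)                       # refined 4x4 matrix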

20. Landmarks registration (global) + Intensity-based registration (refinement)

Install ImFusion Suite (you can use a trial version):

  • Get the installer file "BaseImFusionSuite-2.44.1-22.04.deb" and the license key(s) for Base ImFusion Suite from ImFusion
  • Open a terminal window and navigate to the directory containing the .deb file
  • Run the command: sudo apt-get install -f
  • Run the installation command: sudo dpkg --install BaseImFusionSuite-2.44.1-22.04.deb
  • Launch the ImFusion Suite interface with the command: ImFusionSuite
  • Activate your license by entering your license key
  • You can now use ImFusion Suite, by launching the interface (command: ImFusionSuite), with our provided Python script

Running:

  • config file: configs/regconfig.yml
  • config variables to set: similarity_metric, ldks_model, USldks_location, CTldks_location, noise_std, iternumb, refine_model, USimg_location, CTimg_location, transfo_location, regpack_dir, imf_temp_folder
  • command:
python src/trusted_datapaper_ds/registration/intensity_registration.py --config_path configs/regconfig.yml

It produces the refinement registration matrices computed with intensity-based methods.

21. Landmarks registration (global) + TransMorph (refinement)

For TransMorph_Affine and TransMorph_Rigid refinement (model training, then running on the test sets), refer to the description in our paper (the TRUSTED datapaper), and use the TransMorph authors' GitHub repo:
https://github.com/junyuchen245/TransMorph_Transformer_for_Medical_Image_Registration

22. Shift origin of CT images

The meshes and landmarks have been saved in different coordinate systems, depending on the algorithm used (the Marching Cubes implementation for meshes) and on the landmark annotation protocol. So, to have the CT images and their corresponding meshes and landmarks in the same coordinate system (here, the mesh coordinate system), and then be able to transfer the transforms estimated by landmark or point-cloud registration to the images, I chose the simple option of translating the CT images from their initial origins to [0,0,0] (the mesh coordinate system origin).

  • config file: configs/anaconfig.yml
  • config variables to set: shiftCTimg_origin, myDATA, data_location, CTimg_origin0_location, CT_tbackldk_transforms, CT_tbackmesh_transforms
  • command:
python src/trusted_datapaper_ds/dataprocessing/shiftCT.py  --config_path configs/anaconfig.yml

Note: There is certainly a more elegant way to manage this transfer of transforms between the different coordinate systems, for example using the provided transforms to express the landmarks and meshes in the US or CT device coordinate systems. The creation of those transforms is provided as an option in the origin-shifting code.
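
The origin shift itself is a small operation with SimpleITK, since the image origin is metadata (a sketch with a placeholder file name; the actual shiftCT.py also optionally creates the back-transforms, cf. CT_tbackldk_transforms and CT_tbackmesh_transforms):

import SimpleITK as sitk

img = sitk.ReadImage("01_imgCT.nii.gz")      # placeholder CT image
original_origin = img.GetOrigin()            # keep if you need the back-transform
img.SetOrigin((0.0, 0.0, 0.0))               # move to the mesh coordinate system origin
sitk.WriteImage(img, "01_imgCT_origin0.nii.gz")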

23. Registration evaluation

  • config file: configs/regconfig.yml
  • config variables to set: CTimg_origin0_location, refinement_methods, transform_models, std_cases, transfo_location, CTgtmesh_location, USgtmesh_location, USldks_location, CTldks_location
  • command:
python src/trusted_datapaper_ds/registration/registration_evaluation.py  --config_path configs/regconfig.yml

24. Registration results statistical analysis

  • config file: configs/reganalysis.yml
  • config variables to set: refmethod, reftransform, list_regmethods, list_regtransforms, regresults_folder
  • command:
python src/trusted_datapaper_ds/registration/registration_analysis.py  --config_path configs/reganalysis.yml

25. Boxplots of registration results statistical analysis

  • config file: configs/reganalysis.yml
  • config variables to set: regresults_folder, list_regmethods, list_regtransforms, list_std
  • command:
python src/trusted_datapaper_ds/registration/analysis_of_noising_on_registration.py  --config_path configs/reganalysis.yml
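
For reference, such boxplots reduce to a few seaborn lines (seaborn is in the dependencies). The CSV path and column names below are hypothetical, not the actual outputs of the analysis scripts:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("registration_errors.csv")        # hypothetical results table
ax = sns.boxplot(data=df, x="noise_std", y="error_mm", hue="method")
ax.set_xlabel("landmark noise std")
ax.set_ylabel("registration error (mm)")
plt.tight_layout()
plt.savefig("registration_boxplots.png", dpi=200)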

Project Organization

├── AUTHORS.md              <- List of developers and maintainers.
├── CHANGELOG.md            <- Changelog to keep track of new features and fixes.
├── CONTRIBUTING.md         <- Guidelines for contributing to this project.
├── Dockerfile              <- Build a docker container with `docker build .`.
├── LICENSE.txt             <- License as chosen on the command-line.
├── README.md               <- The top-level README for developers.
├── configs                 <- Directory for configurations of model & application.
├── data
│   ├── external            <- Data from third party sources.
│   ├── interim             <- Intermediate data that has been transformed.
│   ├── processed           <- The final, canonical data sets for modeling.
│   └── raw                 <- The original, immutable data dump.
├── docs                    <- Directory for Sphinx documentation in rst or md.
├── models                  <- Trained and serialized models, model predictions,
│                              or model summaries.
├── notebooks               <- Jupyter notebooks. Naming convention is a number (for
│                              ordering), the creator's initials and a description,
│                              e.g. `1.0-fw-initial-data-exploration`.
├── pyproject.toml          <- Build configuration. Don't change! Use `pip install -e .`
│                              to install for development or `tox -e build` to build.
├── references              <- Data dictionaries, manuals, and all other materials.
├── reports                 <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures             <- Generated plots and figures for reports.
├── scripts                 <- Analysis and production scripts which import the
│                              actual PYTHON_PKG, e.g. train_model.
├── setup.cfg               <- Declarative configuration of your project.
├── setup.py                <- [DEPRECATED] Use `python setup.py develop` to install for
│                              development or `python setup.py bdist_wheel` to build.
├── src
│   └── trusted_datapaper_ds <- Actual Python package where the main functionality goes.
├── tests                   <- Unit tests which can be run with `pytest`.
├── .coveragerc             <- Configuration for coverage reports of unit tests.
├── .isort.cfg              <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.

Note

This project has been set up using PyScaffold 4.5 and the dsproject extension 0.7.2.