Merge pull request #18 from neuropoly/nb/updated_all
Update pipeline for SCT's master branch version
NadiaBlostein authored Jul 6, 2023
2 parents 62f0c8e + f642ca6 commit 2f8973a
Showing 6 changed files with 1,029 additions and 798 deletions.
213 changes: 165 additions & 48 deletions README.md
# Spinal cord MRI template

Framework for creating MRI templates of the spinal cord. The framework has two distinct pipelines, which have to be run sequentially: [Data preprocessing](#data-preprocessing) and [Template creation](#template-creation).

> **Important**
> The framework has to be run independently for each contrast. In the end, the generated templates across contrasts should be perfectly aligned. This is what was done for the PAM50 template.

## Dependencies

### [Spinal Cord Toolbox (SCT)](https://spinalcordtoolbox.com/)

Installation instructions can be found [here](https://spinalcordtoolbox.com/user_section/installation.html).
For this repository, we used SCT in developer mode (commit `e740edf4c8408ffa44ef7ba23ad068c6d07e4b87`).

### [ANIMAL registration framework](https://github.com/vfonov/nist_mni_pipelines)

ANIMAL, part of the IPL longitudinal pipeline, is used for generating the template, using iterative nonlinear deformation.
The recommended pipeline for generating a template of the spinal cord is the [nonlinear symmetrical template model](https://github.com/vfonov/nist_mni_pipelines/blob/master/examples/synthetic_tests/test_model_creation/scoop_test_nl_sym.py).

Installation:
```
export PYTHONPATH="${PYTHONPATH}:path/to/nist_mni_pipelines/ipl/"
export PYTHONPATH="${PYTHONPATH}:path/to/nist_mni_pipelines/ipl"
```
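
To confirm that the pipelines are importable (a quick, hypothetical sanity check; `ipl` is the package directory at the root of `nist_mni_pipelines`):
```
python -c "import ipl"  # exits without error if PYTHONPATH is set correctly
```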

You will also need to install `scoop` with: `pip install scoop`

### [Minc Toolkit v2](http://bic-mni.github.io/)

The Minc Toolkit is a dependency of the template generation process.

On macOS, you may need to recompile the Minc Toolkit from source to make sure all libraries are linked correctly.

On Linux: TODO

### [minc2_simple](https://github.com/vfonov/minc2-simple)

Install this Python library in SCT's Python environment.
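
For example (a sketch, assuming SCT's `venv_sct` conda environment and the same `pip` URL used in the cluster setup below):
```
source ${SCT_DIR}/python/etc/profile.d/conda.sh
conda activate venv_sct
pip install "git+https://github.com/NIST-MNI/minc2-simple.git@develop_new_build#subdirectory=python"
python -c "import minc2_simple"  # verify the install
```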

## Dataset structure
The dataset should be arranged according to the BIDS convention. Using the example subjects listed in the `configuration.json` template file, this would be as follows:
```
dataset/
├── dataset_description.json
├── participants.tsv <-------------------------------- Metadata describing subject attributes e.g. sex, age, etc.
├── sub-01 <------------------------------------------ Folder enclosing data for subject 1
├── sub-02
├── sub-03
│   └── anat <---------------------------------------- `anat` can be replaced by the value of `data_type` in configuration.json
│       ├── sub-03_T1w.nii.gz <----------------------- MRI image in NIfTI format; `_T1w` can be replaced by the value of `suffix_image` in configuration.json
│       ├── sub-03_T1w.json <------------------------- Metadata including image parameters, MRI vendor, etc.
│       ├── sub-03_T2w.nii.gz
│       └── sub-03_T2w.json
└── derivatives
    └── labels
        └── sub-03
            └── anat
                ├── sub-03_T1w_label-SC_seg.nii.gz <-- Spinal cord segmentation; `_T1w` can be replaced by the value of `suffix_image` in configuration.json
                ├── sub-03_T1w_label-disc.nii.gz <---- Disc labels; `_T1w` can be replaced by the value of `suffix_image` in configuration.json
                ├── sub-03_T2w_label-SC_seg.nii.gz
                └── sub-03_T2w_label-disc.nii.gz
```
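
As a quick sanity check (a hypothetical snippet, assuming the layout above and the `_T1w` image suffix), you can verify that each subject has an image, a cord segmentation, and disc labels:
```
for sub in sub-01 sub-02 sub-03; do
  ls dataset/${sub}/anat/${sub}_T1w.nii.gz \
     dataset/derivatives/labels/${sub}/anat/${sub}_T1w_label-SC_seg.nii.gz \
     dataset/derivatives/labels/${sub}/anat/${sub}_T1w_label-disc.nii.gz
done
```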


## Step 1. Data preprocessing

This pipeline includes the following steps:\
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**1.1** Install SCT;\
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**1.2** Edit configuration file;\
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**1.3** Segment spinal cord and vertebral discs;\
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**1.4** Quality control (QC) labels;\
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**1.5** Normalize spinal cord across subjects;\
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**1.6** Quality control (QC) spinal cord normalization across subjects.

### 1.1 Install SCT

SCT is used for all preprocessing steps. The current version of the pipeline uses the SCT development version (commit `e740edf4c8408ffa44ef7ba23ad068c6d07e4b87`), as we prepare for the release of SCT 6.0.

Once SCT is installed, make sure to activate SCT's virtual environment because the pipeline will use SCT's API functions.

```
source ${SCT_DIR}/python/etc/profile.d/conda.sh
conda activate venv_sct
```
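
To confirm that the correct SCT is on your path, check the reported version (`sct_version` is SCT's version command; it should report the development commit noted above):
```
sct_version
```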

### 1.2 Edit configuration file

Copy the file `configuration_default.json`, rename it to `configuration.json`, and edit it according to your setup:

- `path_data`: Absolute path to the input [BIDS dataset](#dataset-structure); The path should end with `/`.
- `subjects`: Comma-separated list of subjects to include in the preprocessing.
- `data_type`: [BIDS data type](https://bids-standard.github.io/bids-starter-kit/folders_and_files/folders.html#datatype), same as subfolder name in dataset structure. Typically, it should be "anat".
- `contrast`: Contrast to be used by `sct_deepseg_sc` function.
- `suffix_image`: Suffix for image data, after subject ID but before file extension (e.g. `_rec-composed_T1w` in `sub-101_rec-composed_T1w.nii.gz`).
- `first_disc`: Integer value corresponding to the label of the first vertebral disc you want present in the template (see [spinalcordtoolbox labeling conventions](https://spinalcordtoolbox.com/user_section/tutorials/registration-to-template/vertebral-labeling/labeling-conventions.html)).
- `last_disc`: Integer value corresponding to the label of the last vertebral disc you want present in the template.
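
For example, a filled-in `configuration.json` for the example dataset above might look like this (the path is illustrative; adjust subjects and discs to your data):
```
{
  "path_data": "/home/user/dataset/",
  "subjects": "sub-01, sub-02, sub-03",
  "data_type": "anat",
  "contrast": "t1",
  "suffix_image": "_T1w",
  "first_disc": "1",
  "last_disc": "26"
}
```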

> **Note**
> SCT functions treat images with bright CSF as "T2w" (i.e. the `t2` option) and images with dark CSF as "T1w" (i.e. the `t1` option). You can therefore still use SCT even if your images are not actually T1w and T2w.

> **Note**
> If you wish to make a template that does not align discs across subjects, please open an [issue](https://github.com/neuropoly/template/issues) and we will follow up with you.

### 1.3 Segment spinal cord and vertebral discs

Run script:
```
sct_run_batch -jobs 6 -path-data "/PATH/TO/dataset" -script preprocess_segment.sh -path-output "/PATH/TO/results"
```

> **Note**
> Replace values appropriately based on your setup (e.g., `-jobs 6` means that 6 CPU cores are used; for more details, run `sct_run_batch -h`).
> If you wish to exclude subjects, add the flag `-exclude-list`. Example: `-exclude-list sub-107 sub-125`

### 1.4 Quality control (QC) labels

* Spinal cord segmentation (or centerlines) and disc labels can be displayed by opening: `/PATH/TO/results/qc/index.html`
* See [tutorial](https://spinalcordtoolbox.com/user_section/tutorials/registration-to-template/vertebral-labeling.html) for tips on how to QC and fix segmentation (or centerline) and/or disc labels manually.


### 1.5 Normalize spinal cord across subjects

`preprocess_normalize.py` contains several functions to normalize the spinal cord across subjects, in preparation for template generation. More specifically:
* Extracting the spinal cord centerline and computing the vertebral distribution along the spinal cord, for all subjects,
* Computing the average centerline, by averaging the position of each intervertebral disc. The average centerline of the spinal cord is straightened,
* Generating the initial template space, based on the average centerline and the positions of the intervertebral discs,
* Straightening all subjects' spinal cords onto the initial template space.

Run:
```
python preprocess_normalize.py configuration.json
```

### 1.6 Quality control (QC) spinal cord normalization across subjects

Once the preprocessing is complete, please check your data. The results should be a series of straightened images registered in the same space, with all vertebral levels aligned with each other.


## Step 2. Template creation

### Dependencies for template generation (see [Dependencies](#dependencies))
- [ANIMAL registration framework, part of the IPL longitudinal pipeline](https://github.com/vfonov/nist_mni_pipelines)
- `scoop` (PyPI)
- [Minc Toolkit v2](http://bic-mni.github.io/)
- [minc2_simple](https://github.com/vfonov/minc2-simple)

Now, you can generate the template using the IPL pipeline with the following command, where `N` has to be replaced by the number of subjects:
```
python -m scoop -n N -vvv generate_template.py
```

### Setting up on Canada's Alliance CPU cluster to generate the template

It is recommended to run the template generation on a large cluster. If you are in Canada, you can use [the Alliance](https://alliancecan.ca/en) (formerly Compute Canada), a network of CPU clusters available to researchers in Canada. **Once the preprocessing is complete**, you will generate the template with `generate_template.py`. This requires Minc Toolkit v2, minc2_simple, and nist_mni_pipelines. The easiest way to set up is to use the Alliance and create your virtual environment (without Spinal Cord Toolbox, since your data should already have been preprocessed by now) as follows:

a) Load the required modules and install packages with pip
```
module load StdEnv/2020 gcc/9.3.0 minc-toolkit/1.9.18.1 python/3.8.10
pip install --upgrade pip
pip install scoop
```

b) Set up NIST-MNI pipelines
```
git clone https://github.com/vfonov/nist_mni_pipelines.git
nano ~/.bashrc
```
Add the following lines:
```
export PYTHONPATH="${PYTHONPATH}:/path/to/nist_mni_pipelines"
export PYTHONPATH="${PYTHONPATH}:/path/to/nist_mni_pipelines/"
export PYTHONPATH="${PYTHONPATH}:/path/to/nist_mni_pipelines/ipl/"
export PYTHONPATH="${PYTHONPATH}:/path/to/nist_mni_pipelines/ipl"
```

Then reload:
```
source ~/.bashrc
```
c) Install minc2_simple
```
pip install "git+https://github.com/NIST-MNI/minc2-simple.git@develop_new_build#subdirectory=python"
```

d) Create `my_job.sh`
```
#!/bin/bash
python -m scoop -vvv generate_template.py
```

e) Submit the batch job on the Alliance
```
sbatch --time=24:00:00 --mem-per-cpu 4000 my_job.sh # will probably require batching several times, depending on number of subjects
```
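
Alternatively (a sketch, not part of the original instructions), the scheduler options and module loads can be placed inside the job script itself, so repeated submissions stay consistent; the `-n` worker count is an assumed value to match your allocation:
```
#!/bin/bash
#SBATCH --time=24:00:00
#SBATCH --mem-per-cpu=4000
# Loading modules inside the job script is recommended on Alliance clusters
module load StdEnv/2020 gcc/9.3.0 minc-toolkit/1.9.18.1 python/3.8.10
python -m scoop -n 8 -vvv generate_template.py  # -n: number of scoop workers (assumed)
```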


## Additional information

To have the generated template registered to an existing space (e.g., ICBM152), please open an [issue](https://github.com/neuropoly/template/issues) and we will follow up with you.


## Licence
This repository is under an MIT licence.
9 changes: 9 additions & 0 deletions configuration_default.json
```
{
    "path_data": "/path/to/data/",
    "subjects": "sub-001, sub-002, sub-003",
    "data_type": "anat",
    "contrast": "t1",
    "suffix_image": "_T1w",
    "first_disc": "1",
    "last_disc": "26"
}
```
41 changes: 0 additions & 41 deletions pipeline.py

This file was deleted.

