New dataset nih-ms-mp2rage #280

Closed
Nilser3 opened this issue Nov 11, 2023 · 50 comments

Comments

@Nilser3

Nilser3 commented Nov 11, 2023

Description:

Dataset of 200 MS patients from NIH, acquired at 3T with an MP2RAGE sequence. It contains the inv1, inv2, UNIT1 and T1map contrasts, with a large field of view (FOV) covering the head and the spinal cord from C1 to ~C5.
Resolution: 1.0 x 1.0 x 1.0 mm3

Contains no MS lesion or SC segmentations.

Details:

Current directory structure:

Transfer/
├── 1_inv1.nii.gz
├── 1_inv2.nii.gz
├── 1_t1_images.nii.gz
├── 1_uniden.nii.gz

Proposed BIDS format:

├── sub-NIH001
│   └── anat
│       ├── sub-NIH001_T1map.nii.gz
│       ├── sub-NIH001_UNIT1.nii.gz
│       ├── sub-NIH001_inv-1_MP2RAGE.nii.gz
│       └── sub-NIH001_inv-2_MP2RAGE.nii.gz

I was guided by Organization of qMRI data in BIDS, but it is not clear to me how to organise inv1 and inv2 (because I don't know if these images are phase or mag).

@jcohenadad
Member

jcohenadad commented Nov 11, 2023

For the name, always small letters, and i would start with the site, pathology and contrast, so: nih-ms-mp2rage. Can you please create the label for this issue as well? Thank you.

For subject name I would go with: sub-nih001

@jcohenadad
Member

it is not clear to me how to organise inv1 and inv2 (because I don't know if these images are phase or mag)

Look at them, you should be able to figure it out. Also read the MP2RAGE paper.

@Nilser3
Author

Nilser3 commented Nov 12, 2023

Thank you,
OK, looking at Marques et al., 2010, we have magnitude images for TI1 and TI2:

Then we should organise inv1 and inv2 in sub-nih001/anat/ as :

├── sub-nih001
│   └── anat
│       ├── sub-nih001_inv-1_part-mag_MP2RAGE.nii.gz
│       └── sub-nih001_inv-2_part-mag_MP2RAGE.nii.gz

and following qMRI data in BIDS, organise T1map and UNIT1 in derivatives/qMRI-software-name/sub-nih001/anat as :

├── derivatives
│   └── qMRI-software-name
│       ├── sub-nih001
│       │   └── anat
│       │       ├── sub-nih001_T1map.nii.gz
│       │       ├── sub-nih001_UNIT1.nii.gz

@Nilser3
Author

Nilser3 commented Nov 12, 2023

I have been exploring SC segmentation approaches

sct_deepseg_sc on T1map and UNIT1 images:

sct_deepseg_sc -i sub-nih001_UNIT1.nii.gz -c t1

image

sct_deepseg_sc -i sub-nih001_T1map.nii.gz -c t2

image

sct_deepseg_sc -i sub-nih001_T1map.nii.gz -c t2s

image

contrast-agnostic model on T1map and UNIT1 images:

Some differences are observed (after binarisation):
SC_CA
The most conservative SC segmentation is the one obtained from UNIT1 (blue):

image

Maybe we should keep the UNIT1 contrast-agnostic segmentation as a first SC segmentation ?
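
A comparison like the one above can be quantified with a short sketch. It assumes the two soft segmentations have already been loaded and flattened into equal-length sequences of values in [0, 1] (loading the NIfTI files is left out), and it uses a 0.5 threshold as an assumption. `binarize` and `dice` are illustrative helpers, not SCT functions.

```python
from typing import List, Sequence


def binarize(values: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn a soft segmentation into a binary mask (1 where value > threshold)."""
    return [1 if v > threshold else 0 for v in values]


def dice(a: Sequence[int], b: Sequence[int]) -> float:
    """Dice overlap between two binary masks of the same length."""
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2.0 * intersection / total if total else 1.0


# Toy values standing in for flattened soft segmentations (not real data).
seg_unit1 = binarize([0.9, 0.8, 0.2, 0.0])
seg_t1map = binarize([0.9, 0.8, 0.7, 0.0])

print("Dice:", dice(seg_unit1, seg_t1map))
print("Voxels in UNIT1 mask:", sum(seg_unit1), "| in T1map mask:", sum(seg_t1map))
```

The mask with fewer voxels is the more conservative one in this sense. The Dice score only says how much the two masks agree, not which one is right.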

@jcohenadad
Member

Maybe we should keep the UNIT1 contrast-agnostic segmentation as a first SC segmentation ?

👍
Excellent investigations @Nilser3

@Nilser3
Author

Nilser3 commented Nov 13, 2023

The data has been organized in BIDS :

├── sub-nih001
│   └── anat
│       ├── sub-nih001_inv-1_part-mag_MP2RAGE.nii.gz
│       └── sub-nih001_inv-2_part-mag_MP2RAGE.nii.gz

├── derivatives
│   ├── labels
│   │   ├── sub-nih001
│   │   │   └── anat
│   │   │       ├── sub-nih001_UNIT1_label-SC_seg.json
│   │   │       └── sub-nih001_UNIT1_label-SC_seg.nii.gz
│   └── qMRI-software-name
│       ├── sub-nih001
│       │   └── anat
│       │       ├── sub-nih001_T1map.nii.gz
│       │       └── sub-nih001_UNIT1.nii.gz

bids-validator
(base) nilaia@rosenberg:~/data_nvme_nilaia/NIH_mp2rage/DataForMontreal$ bids-validator nih-ms-mp2rage --verbose
[email protected]
bids-specification@disable
(node:1605962) Warning: Closing directory handle on garbage collection
(Use `node --trace-warnings ...` to show where the warning was created)
        1: [WARN] Not all subjects/sessions/runs have the same scanning parameters. (code: 39 - INCONSISTENT_PARAMETERS)
                ./sub-nih015/anat/sub-nih015_inv-1_part-mag_MP2RAGE.nii.gz
                         The most common set of dimensions is: 176,240,256 (voxels), This file has the dimensions: 20,240,256 (voxels).
                ./sub-nih029/anat/sub-nih029_inv-1_part-mag_MP2RAGE.nii.gz
                         The most common set of dimensions is: 176,240,256 (voxels), This file has the dimensions: 156,240,256 (voxels).
                ./sub-nih029/anat/sub-nih029_inv-2_part-mag_MP2RAGE.nii.gz
                         The most common set of dimensions is: 176,240,256 (voxels), This file has the dimensions: 150,240,256 (voxels).
                ./sub-nih093/anat/sub-nih093_inv-1_part-mag_MP2RAGE.nii.gz
                         The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.
                ./sub-nih093/anat/sub-nih093_inv-2_part-mag_MP2RAGE.nii.gz
                         The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.
                ./sub-nih099/anat/sub-nih099_inv-1_part-mag_MP2RAGE.nii.gz
                         The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.
                ./sub-nih099/anat/sub-nih099_inv-2_part-mag_MP2RAGE.nii.gz
                         The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.

        Please visit https://neurostars.org/search?q=INCONSISTENT_PARAMETERS for existing conversations about this issue.

        2: [WARN] The Authors field of dataset_description.json should contain an array of fields - with one author per field. This was triggered because there are no authors, which will make DOI registration from dataset metadata impossible. (code: 113 - NO_AUTHORS)

        Please visit https://neurostars.org/search?q=NO_AUTHORS for existing conversations about this issue.

        Summary:                 Available Tasks:        Available Modalities:
        404 Files, 3.81GB                                MRI
        200 - Subjects
        1 - Session


        If you have any questions, please post on https://neurostars.org/tags/bids.

@mguaypaq could you please create a repository for nih-ms-mp2rage

Thank you,
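
The INCONSISTENT_PARAMETERS warning above boils down to finding the files whose dimensions differ from the most common ones. A minimal sketch of that check is shown below, assuming the dimensions have already been collected into a dictionary. `find_outliers` is a hypothetical helper, and the dictionary only holds a few entries taken from the validator output.

```python
from collections import Counter
from typing import Dict, Hashable, List


def find_outliers(params: Dict[str, Hashable]) -> List[str]:
    """Return the names whose parameters differ from the most common value."""
    if not params:
        return []
    most_common, _ = Counter(params.values()).most_common(1)[0]
    return sorted(name for name, value in params.items() if value != most_common)


# Image dimensions in voxels, as reported by bids-validator above.
# The sub-nih001 and sub-nih002 entries assume these files have the most common dimensions.
dims = {
    "sub-nih001_inv-1": (176, 240, 256),
    "sub-nih002_inv-1": (176, 240, 256),
    "sub-nih015_inv-1": (20, 240, 256),
    "sub-nih029_inv-1": (156, 240, 256),
}

print(find_outliers(dims))
```

The same function works for voxel sizes, so the 1.20 mm files could be listed by passing a dictionary of resolutions instead.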

@mguaypaq
Member

@Nilser3 done, I set up the repository and gave you write access. Let me know once you have pushed your data to a branch.

@Nilser3
Author

Nilser3 commented Nov 13, 2023

Done @mguaypaq ,
I have pushed to my branch nlm/initial_data

@mguaypaq
Member

I added a commit to the branch nlm/initial_data to fix various small issues, but I still have some questions:

  • One of the derivatives folders is derivatives/qMRI-software-name/, which looks like a placeholder. Is there a better name for this derivative?

  • Also, this derivative doesn't contain any description or any JSON sidecars. What is in there?

  • I see that this is a qMRI file collection. According to this section of the BIDS standard, there are some metadata fields that should be present in some JSON files somewhere; can you add them:

    • FlipAngle
    • InversionTime
    • RepetitionTimeExcitation
    • RepetitionTimePreparation
    • NumberShots
    • MagneticFieldStrength
    • (Optionally:) EchoTime

    Looking at an example MP2RAGE dataset, it looks like the values which are the same for every subject can be in a file MP2RAGE.json at the root of the dataset, and the other values should be in JSON sidecars in the root dataset (not in the derivatives).

Other than that, git-annex and bids-validator are both happy with this dataset.
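
To see which of the fields listed above are still missing, a small check along these lines could be run on the JSON content. It is a sketch under assumptions: `missing_fields` is a hypothetical helper, sidecars are passed as already-loaded dictionaries from the most general (dataset root) to the most specific (per file), and the field names use the BIDS spelling `RepetitionTimePreparation`.

```python
from typing import Dict, List

# Metadata fields listed in this thread for an MP2RAGE file collection.
REQUIRED_MP2RAGE_FIELDS = (
    "FlipAngle",
    "InversionTime",
    "RepetitionTimeExcitation",
    "RepetitionTimePreparation",
    "NumberShots",
    "MagneticFieldStrength",
)


def missing_fields(*sidecars: Dict) -> List[str]:
    """Merge sidecars (root-level first, file-level last) and list absent fields."""
    merged: Dict = {}
    for sidecar in sidecars:
        # Later sidecars are closer to the image file and override earlier ones.
        merged.update(sidecar)
    return [field for field in REQUIRED_MP2RAGE_FIELDS if field not in merged]


# The only value known from the issue description is the field strength (3T).
root_sidecar = {"MagneticFieldStrength": 3}
print(missing_fields(root_sidecar, {}))
```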

@Nilser3
Author

Nilser3 commented Nov 17, 2023

I added a commit to the branch nlm/initial_data to fix various small issues, but I still have some questions:

Thank you @mguaypaq

  • One of the derivatives folders is derivatives/qMRI-software-name/, which looks like a placeholder. Is there a better name for this derivative?

I agree, we could put the name of the vendor, Siemens?

  • Also, this derivative doesn't contain any description or any JSON sidecars. What is in there?

  • I see that this is a qMRI file collection. According to this section of the BIDS standard, there are some metadata fields that should be present in some JSON files somewhere; can you add them:

    • FlipAngle
    • InversionTime
    • RepetitionTimeExcitation
    • RepetitionTimePreparation
    • NumberShots
    • MagneticFieldStrength
    • (Optionally:) EchoTime

@jcohenadad I think we should request the MP2RAGE acquisition parameters

@jcohenadad jcohenadad changed the title New dataset NIH-mp2rage New dataset nih-ms-mp2rage Nov 17, 2023
@jcohenadad
Member

One of the derivatives folders is derivatives/qMRI-software-name/, which looks like a placeholder. Is there a better name for this derivative?

I agree, we could put the name of the vendor, Siemens?

Is this folder empty? If so, why not simply get rid of it? @Nilser3 what are your arguments for calling it Siemens? What do you want to put inside it?

Also, this derivative doesn't contain any description or any JSON sidecars. What is in there?
@jcohenadad I think we should request the MP2RAGE acquisition parameters

This group has been kind enough to spend time gathering the data and sending them to us-- it is not appropriate to ask them (i.e. busy neuroradiologists) to spend more time reconverting the DICOM with dcm2niix to generate the sidecar JSON fields. We just need to deal with this dataset without the JSON. @Nilser3 there is already some info that you can retrieve from the NIfTI metadata and data.
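
As an illustration of what the NIfTI metadata can give without any JSON sidecar, the sketch below reads the matrix size and voxel size straight from a NIfTI-1 header using only the standard library. The byte offsets follow the NIfTI-1 header layout. The function names are made up for this example, and in practice a library such as nibabel or `sct_image -header` would do the same job.

```python
import gzip
import struct


def parse_nifti1_header(header: bytes) -> dict:
    """Extract shape, voxel size and data type from the first 348 bytes of a NIfTI-1 file."""
    if len(header) < 348:
        raise ValueError("a NIfTI-1 header is 348 bytes long")
    # sizeof_hdr must read 348 in the file's byte order.
    for endian in ("<", ">"):
        if struct.unpack_from(endian + "i", header, 0)[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    dim = struct.unpack_from(endian + "8h", header, 40)       # dim[0] = number of dimensions
    datatype, bitpix = struct.unpack_from(endian + "2h", header, 70)
    pixdim = struct.unpack_from(endian + "8f", header, 76)    # pixdim[1..3] = voxel size
    ndim = dim[0]
    return {
        "shape": dim[1 : 1 + ndim],
        "voxel_size": pixdim[1 : 1 + ndim],
        "datatype": datatype,
        "bitpix": bitpix,
    }


def read_nifti1_geometry(path: str) -> dict:
    """Read the header of a .nii or .nii.gz file and parse it."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        return parse_nifti1_header(f.read(348))
```

Acquisition parameters such as the inversion times or flip angles are not stored in these fields, so this only recovers geometry and data type.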

@mguaypaq
Member

Is this folder empty? If so, why simply not getting rid of it? @Nilser3 what are your arguments for calling it Siemens? What do you want to put inside it?

No, the folder isn't empty; it contains half of the image files: the _UNIT1.nii.gz files and the _T1map.nii.gz files.

(The other half of the image files is the _inv-1_part-mag_MP2RAGE.nii.gz files and the _inv-2_part-mag_MP2RAGE.nii.gz, as illustrated in @Nilser3's diagram above.)

@jcohenadad
Member

ok that's a problem-- i would put the uni and t1map under the source images as done for the marseille and basel data-- i know this is not bids compliant but otherwise it will screw up our training scripts which assume input data is under source

@Nilser3
Author

Nilser3 commented Nov 17, 2023

Thank you @jcohenadad and @mguaypaq
following the preceding comments, I have deleted the qMRI-software-name folder and
I present this diagram:

├── sub-nih001
│   └── anat
│       ├── sub-nih001_inv-1_part-mag_MP2RAGE.nii.gz
│       ├── sub-nih001_inv-2_part-mag_MP2RAGE.nii.gz
│       ├── sub-nih001_T1map.nii.gz
│       └── sub-nih001_UNIT1.nii.gz
├── derivatives
│   └── labels
│       ├── sub-nih001
│       │   └── anat
│       │       ├── sub-nih001_UNIT1_label-SC_seg.json
│       │       └── sub-nih001_UNIT1_label-SC_seg.nii.gz

I have worked in the same branch that @mguaypaq worked on: nlm/initial_data
I'm ready for PR

@jcohenadad
Member

great! let me know when i can review-- thx!

@mguaypaq
Member

Alright, I think you can review the branch nlm/initial_data now, @jcohenadad. All the files are in the right place, and git-annex and bids-validator are both happy.

@jcohenadad
Member

The UNI image (typically used for lesion segmentation) looks very different than the ones I've seen in the past. The background is black, and the contrast looks much closer to a T1w MPRAGE. See notably the contrast in the vertebrae and discs. See below a comparison of the UNI from NIH, Basel and Marseille:

Screenshot 2023-11-18 at 12 52 09 PM

Screenshot 2023-11-18 at 12 48 26 PM

Screenshot 2023-11-18 at 12 48 13 PM

@jcohenadad
Member

@Nilser3 the comparison you did #280 (comment) is irrelevant because sub-nih001 has excessive motion.

@Nilser3
Author

Nilser3 commented Nov 18, 2023

@jcohenadad
I was also surprised by the contrast of the images,
Originally the UNIT1 images were called XX_uniden.nii.gz; maybe they lose that background effect because of the denoising.

@jcohenadad
Member

Quick google search --> https://cbs-discourse.uwo.ca/t/removing-background-noise-from-mp2rage-images/101

The information 'den' should probably not have been removed. I've emailed the collaborators to see if they still have the UNI data. Beyond background, the contrast is questionable. @Nilser3 I encourage you to always dig (eg google search, etc.). These are important considerations which should not have gone unmentioned.

@jcohenadad
Member

we won't have the non-den UNI, therefore we move forward with these UNI-den images.

@Nilser3
Author

Nilser3 commented Nov 27, 2023

Thank you,
OK, according to the "Preprocessed or cleaned data" section of BIDS, we could add a description flag for denoised images.
it would look like this:

├── sub-nih001
│   └── anat
│       ├── sub-nih001_inv-1_part-mag_MP2RAGE.nii.gz
│       ├── sub-nih001_inv-2_part-mag_MP2RAGE.nii.gz
│       ├── sub-nih001_T1map.nii.gz
│       └── sub-nih001_desc-denoising_UNIT1.nii.gz

@jcohenadad @mguaypaq If you agree I can proceed

@mguaypaq
Member

Since _T1map and _UNIT1 files would normally be under derivatives/ (although we decided not to do it this way), it makes sense that we could apply derivative entities like _desc-<label> to those files. I would suggest _desc-denoised instead of _desc-denoising.
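
For illustration, adding the `_desc-<label>` entity to an existing file name can be sketched as below. `add_desc` is a hypothetical helper. It assumes the name has no desc entity yet and that the suffix is the last underscore-separated part, which holds for the `_UNIT1` and `_T1map` files in this dataset.

```python
def add_desc(filename: str, label: str) -> str:
    """Insert a desc-<label> entity right before the suffix of a BIDS file name."""
    if not label.isalnum():
        raise ValueError("BIDS labels must be alphanumeric")
    # 'sub-nih001_UNIT1.nii.gz' -> entities 'sub-nih001', suffix 'UNIT1.nii.gz'
    entities, _, suffix = filename.rpartition("_")
    if not entities:
        raise ValueError(f"no entities found in: {filename}")
    return f"{entities}_desc-{label}_{suffix}"


print(add_desc("sub-nih001_UNIT1.nii.gz", "denoised"))
```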

@Nilser3
Author

Nilser3 commented Nov 27, 2023

I agree @mguaypaq
then it would be denoised for UNIT1 and T1map?
like:

sub-nih001_desc-denoised_UNIT1.nii.gz
sub-nih001_desc-denoised_T1map.nii.gz

@jcohenadad
Member

I would suggest _desc-denoised instead of _desc-denoising

👍

if we go that route i suggest we also add this use-case to our internal SOP. Also tagging @valosekj

@jcohenadad
Member

then it would be denoised for UNIT1 and T1map?

look at the T1map and tell me if that makes sense...

@Nilser3
Author

Nilser3 commented Nov 27, 2023

then it would be denoised for UNIT1 and T1map?

look at the T1map and tell me if that makes sense...

ok, it doesn't make sense to put denoised flag on both images:

image

So, denoised flag just for UNIT1

@Nilser3
Author

Nilser3 commented Nov 28, 2023

OK, when organizing the data as in the schema :

├── sub-nih001
│   └── anat
│       └── sub-nih001_desc-denoised_UNIT1.nii.gz
├── derivatives
│   └── labels
│       ├── sub-nih001
│       │   └── anat
│       │       ├── sub-nih001_desc-denoised_UNIT1_label-SC_seg.json
│       │       └── sub-nih001_desc-denoised_UNIT1_label-SC_seg.nii.gz

and running bids-validator, I get an error.

bids-validator nih-ms-mp2rage/
[email protected]
bids-specification@disable
(node:48428) Warning: Closing directory handle on garbage collection
(Use `node --trace-warnings ...` to show where the warning was created)
        1: [ERR] Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder. (code: 1 - NOT_INCLUDED)
                ./sub-nih001/anat/sub-nih001_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih001_desc-denoised_UNIT1.nii.gz

@mguaypaq I don't know if this error can be resolved, or if we need a different naming scheme.

commit d8027709ba35c282cf155193d2f4cc0c65e9caaa
branch: nlm/initial_data

@jcohenadad
Member

good news! Haris said he can share the UNI (non-den) data:

I collected all uni images except for one (this is missing and we would have to generate it from scratch, this might take a bit more time to find the code for it) and shared them with you via NIH Box.

I will follow-up on this...

@jcohenadad
Member

jcohenadad commented Dec 5, 2023

@Nilser3 I copied the non-den data under: duke:temp/jcohen/20231205_NIH/Only_Uni.zip

I suggest keeping both the den and non-den images.

@NathanMolinier

I suggest that we keep the denoised images inside a derived folder since we have now access to non-denoised images.

Also, I'm not sure we should use the desc entity to characterize the denoised images since we are planning to use this same entity to differentiate MRI contrasts. What do you think @mguaypaq ?

@jcohenadad
Member

Also, I'm not sure we should use the desc entity to characterize the denoised images since we are planning to use this same entity to differentiate MRI contrasts. What do you think @mguaypaq ?

under discussion: #282 (comment)

@jcohenadad
Member

@Nilser3 I've also added the file ClinicalInfo_200_MP2RAGE.csv, which, as the name indicates, contains clinical info. This needs to be added to the BIDS folder. @valosekj and @mguaypaq have some insights on how to add these info in a BIDS dataset (we discussed it in the past-- it would be good to add whatever solution we came up with in the intranet)

@valosekj
Member

valosekj commented Dec 6, 2023

@Nilser3 I've also added the file ClinicalInfo_200_MP2RAGE.csv, which, as the name indicates, contains clinical info. This needs to be added to the BIDS folder. @valosekj and @mguaypaq have some insights on how to add these info in a BIDS dataset (we discussed it in the past-- it would be good to add whatever solution we came up with in the intranet)

For our other datasets, I have used two possible scenarios:

  1. Include clinical info directly into participants.tsv (and participants.json); see for example sci-colorado/participants.tsv:
valosek@macbook-pro:~/data/data.neuro.polymtl.ca/sci-colorado$ head -1 participants.tsv
participant_id	age	sex	level_of_injury	height	weight	bmi	time_between_mri_and_injury	able_to_walk	initial_LEMS_R ...
  2. Keep clinical info in a separate file(s) and put the file(s) under the phenotype folder; see for example dcm-zurich/phenotype :
valosek@macbook-pro:~/data/data.neuro.polymtl.ca/dcm-zurich$ tree phenotype                                             
phenotype
├── README.md
├── anatomical_data.xlsx
├── clinical_scores.xlsx
├── electrophysiological_measurements.xlsx
├── motion_data.xlsx
└── motion_data_maximum_stenosis.xlsx

Both scenarios have their pros and cons; summary here.
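
Scenario 1 could look roughly like the sketch below, which turns a clinical CSV into the content of a participants.tsv. Everything specific here is an assumption: the name of the ID column in ClinicalInfo_200_MP2RAGE.csv is not given in this thread, so it is a parameter, and the example rows are made up. Empty cells are written as `n/a`, the BIDS convention for missing values.

```python
import csv
import io


def clinical_csv_to_participants_tsv(csv_text: str, id_column: str) -> str:
    """Convert clinical CSV text into participants.tsv text (tab-separated)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = ["participant_id"] + [c for c in reader.fieldnames if c != id_column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Numeric source ID -> zero-padded BIDS subject label, e.g. 1 -> sub-nih001
        subject = f"sub-nih{int(row.pop(id_column)):03d}"
        cleaned = {key: (value if value != "" else "n/a") for key, value in row.items()}
        writer.writerow({"participant_id": subject, **cleaned})
    return out.getvalue()


# Made-up rows, only to show the transformation (hypothetical 'ID' column).
example = "ID,sex,EDSS\n1,F,2.5\n2,M,\n"
print(clinical_csv_to_participants_tsv(example, id_column="ID"))
```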

@Nilser3
Author

Nilser3 commented Dec 6, 2023

Thank you @valosekj

  1. Include clinical info directly into participants.tsv (and participants.json); see for example sci-colorado/participants.tsv:

This scenario has been done,

  2. Keep clinical info in a separate file(s) and put the file(s) under the phenotype folder; see for example dcm-zurich/phenotype :

I find it interesting to save the information separately for each subject (there are 200 patients); however, for this data there is only limited information, such as clinical scores (see: participants.json)

participants.json
{
    "participant_id": {
        "LongName": "Participant ID",
        "Description": "Unique ID"
    },
    "sex": {
        "LongName": "Participant sex",
        "Description": "Sex",
        "Levels": {
            "M": "male",
            "F": "female"
        }
    },
    "date_of_birth": {
        "LongName": "Date of birth",
        "Description": "yyyy-mm-dd"
    },
    "age_2023": {
        "LongName": "Participant age at 2023",
        "Description": "yy",
        "Units": "years"
    },
    "race": {
        "LongName": "Ethnic group / race",
        "Description": "race"
    },
    "date_of_scan": {
        "LongName": "Date of scan",
        "Description": "yyyy-mm-dd"
    },
    "pathology": {
        "LongName": "Pathology name",
        "Description": "The diagnosis of pathology of the participant",
        "Levels": {
            "MS Spectrum": "Multiple Sclerosis"
        }
    },
    "phenotype": {
        "LongName": "Phenotype name",
        "Description": "The MS phenotype of the participant",
        "Levels": {
            "RRMS": "Relapsing-Remitting Multiple Sclerosis",
            "PPMS": "Primary Progressive Multiple Sclerosis",
            "SPMS": "Secondary Progressive Multiple Sclerosis"
        }
    },
    "onset": {
        "LongName": "Disease onset date",
        "Description": "yyyy-mm-dd"
    },
    "Relapses_Past12m_OR_PreviousVisit": {
        "LongName": "Relapses in past 12 months or since previous visit",
        "Description": "Number"
    },
    "Current_DMT": {
        "LongName": "Current disease-modifying therapies",
        "Description": "Current therapy"
    },
    "Date_Testing_Correct": {
        "LongName": "Date testing correct",
        "Description": "yyyy-mm-dd"
    },
    "Scripps": {
        "LongName": "Scripps neurologic rating scale (SNRS)",
        "Description": "Scripps SCORE (Maximum = 100)"
    },
    "EDSS": {
        "LongName": "Expanded Disability Status Scale (EDSS)",
        "Description": "EDSS score (Maximum = 10)"
    },
    "Nine_HPT_Right": {
        "LongName": "Nine-Hole Peg Test on right hand",
        "Description": "Nine-Hole Peg Test score (measure finger dexterity in seconds)"
    },
    "Nine_HPT_Left": {
        "LongName": "Nine-Hole Peg Test on left hand",
        "Description": "Nine-Hole Peg Test score (measure finger dexterity in seconds)"
    },
    "Handedness": {
        "LongName": "Handedness",
        "Description": "Manual dominance",
        "Levels": {
            "Left": "Left-handed",
            "Right": "Right-handed"
        }
    },
    "T25FW": {
        "LongName": "Timed 25-Foot Walk (T25FW)",
        "Description": "T25FW score",
        "Units": "seconds or Not_Completed"
    },
    "Assistive_Device": {
        "LongName": "Assistive device",
        "Description": "Assistive device",
        "Levels": {
            "None": "None",
            "Unilateral": "Unilateral",
            "Bilateral": "Bilateral"
        }
    },
    "SDMT": {
        "LongName": "Symbol digit modalities test (SDMT)",
        "Description": "SDMT score",
        "Units": "number or Not_Completed"
    },
    "PASAT": {
        "LongName": "Paced Auditory Serial Addition Test (PASAT)",
        "Description": "PASAT score",
        "Units": "number or Not_Completed"
    }
}

So I don't know whether this second scenario is relevant here.
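
A quick consistency check between participants.tsv and participants.json can be sketched as follows. `check_sidecar` is a hypothetical helper that takes the TSV header as a list of column names and the JSON content as a dictionary, and reports columns without a description and descriptions without a column.

```python
from typing import Dict, List, Tuple


def check_sidecar(tsv_header: List[str], sidecar: Dict) -> Tuple[List[str], List[str]]:
    """Return (columns missing from the JSON, JSON keys missing from the TSV)."""
    undocumented = [column for column in tsv_header if column not in sidecar]
    unused = [key for key in sidecar if key not in tsv_header]
    return undocumented, unused


# Shortened example using a few of the keys from the participants.json above.
header = ["participant_id", "sex", "EDSS", "T25FW"]
sidecar = {"participant_id": {}, "sex": {}, "EDSS": {}, "PASAT": {}}
print(check_sidecar(header, sidecar))
```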

@valosekj
Member

valosekj commented Dec 6, 2023

I find it interesting to save the information separately for each subject (there are 200 patients)

Sorry, I probably expressed myself in an unclear way.

  2. Keep clinical info in a separate file(s) and put the file(s) under the phenotype folder; see for example dcm-zurich/phenotype :

By this, I meant to store clinical data for all subjects in a single file (not separately for each subject). By "separate file" I meant a file under phenotype, i.e., not integrating directly to participants.tsv.

@Nilser3
Author

Nilser3 commented Dec 6, 2023

Oh, thanks @valosekj , now I understand it better
Ok,
The participants.tsv file has 21 columns, so maybe I should just stick with scenario 1

@valosekj
Member

valosekj commented Dec 6, 2023

The participants.tsv file has 21 columns, so maybe I should just stick with scenario 1

That sounds reasonable to me!

@jcohenadad
Member

@Nilser3 please make sure that subject ID i in the CSV file corresponds to subject i in the BIDS dataset (and please describe the methodology so that we can validate that no error is possible).

Similarly, please make sure that the non-den UNI image of subject i corresponds to sub-00i (also describe the methodology).

Please note that one subject is missing on the non-den files.
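
One part of such a check can be automated: comparing the set of IDs in participants.tsv with the set of subject folders. The sketch below uses a hypothetical `compare_ids` helper. Note that it only catches missing or extra IDs. It cannot detect two rows swapped in the source CSV, which still requires checking the content (e.g. scan dates) against the source.

```python
from typing import Iterable, List, Tuple


def compare_ids(participant_ids: Iterable[str], subject_dirs: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (IDs in the table without a folder, folders without a table row)."""
    table = set(participant_ids)
    folders = set(subject_dirs)
    return sorted(table - folders), sorted(folders - table)


# Toy example with three subjects.
table_ids = ["sub-nih001", "sub-nih002", "sub-nih003"]
folder_names = ["sub-nih001", "sub-nih003", "sub-nih004"]
print(compare_ids(table_ids, folder_names))
```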

@Nilser3
Author

Nilser3 commented Dec 14, 2023

To ensure that the IDs of the source files are the same in the BIDS folder, I used this script.

script nih-bids
#!/bin/bash

bids_file="nih-ms-mp2rage"

# Create the BIDS folders (-p also creates parents and does not fail if they exist)
mkdir -p "$bids_file/derivatives/labels"

for sub in {1..200}; do
    # Zero-padded subject index, e.g. 1 -> 001
    printf -v j "%03d" "$sub"

    # Create the BIDS folders for this subject
    mkdir -p "$bids_file/sub-nih$j/anat"
    mkdir -p "$bids_file/derivatives/labels/sub-nih$j/anat"

    # Reorient the images to RPI
    sct_image -i "Transfer/${sub}_inv1.nii.gz" -setorient RPI -o "$bids_file/sub-nih$j/anat/sub-nih${j}_inv-1_MP2RAGE.nii.gz"
    sct_image -i "Transfer/${sub}_inv2.nii.gz" -setorient RPI -o "$bids_file/sub-nih$j/anat/sub-nih${j}_inv-2_MP2RAGE.nii.gz"
    sct_image -i "Only_Uni/${sub}_uni.nii.gz" -setorient RPI -o "$bids_file/sub-nih$j/anat/sub-nih${j}_UNIT1.nii.gz"
    sct_image -i "Transfer/${sub}_uniden.nii.gz" -setorient RPI -o "$bids_file/sub-nih$j/anat/sub-nih${j}_desc-denoised_UNIT1.nii.gz"
    sct_image -i "Transfer/${sub}_t1_images.nii.gz" -setorient RPI -o "$bids_file/sub-nih$j/anat/sub-nih${j}_T1map.nii.gz"

    # Contrast-agnostic model to obtain the SC segmentation from UNIT1
    python ../contrast-agnostic-softseg-spinalcord/monai/run_inference_single_image.py --path-img "/mnt/nvme/nilaia/$bids_file/sub-nih$j/anat/sub-nih${j}_UNIT1.nii.gz" --chkp-path /mnt/duke/temp/muena/contrast-agnostic/final_monai_model/nnunet_nf\=32_DS\=1_opt\=adam_lr\=0.001_AdapW_CCrop_bs\=2_64x192x320_20230918-2253/ --path-out "$bids_file/derivatives/labels/sub-nih$j/anat/" --device cpu

    # Binarize the SC mask
    sct_maths -i "$bids_file/derivatives/labels/sub-nih$j/anat/sub-nih${j}_UNIT1_pred.nii.gz" -bin 0.5001 -o "$bids_file/derivatives/labels/sub-nih$j/anat/sub-nih${j}_UNIT1_label-SC_seg.nii.gz"

    echo "$j"
done
  • Indeed, for sub-nih139, the "non-den" image is missing (bids-validator also reported it)
bids-validator
bids-validator nih-ms-mp2rage/
[email protected]
(node:18886) Warning: Closing directory handle on garbage collection
(Use `node --trace-warnings ...` to show where the warning was created)
        1: [ERR] Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder. (code: 1 - NOT_INCLUDED)
                ./sub-nih001/anat/sub-nih001_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih001_desc-denoised_UNIT1.nii.gz
                ./sub-nih002/anat/sub-nih002_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih002_desc-denoised_UNIT1.nii.gz
                ./sub-nih003/anat/sub-nih003_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih003_desc-denoised_UNIT1.nii.gz
                ./sub-nih004/anat/sub-nih004_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih004_desc-denoised_UNIT1.nii.gz
                ./sub-nih005/anat/sub-nih005_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih005_desc-denoised_UNIT1.nii.gz
                ./sub-nih006/anat/sub-nih006_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih006_desc-denoised_UNIT1.nii.gz
                ./sub-nih007/anat/sub-nih007_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih007_desc-denoised_UNIT1.nii.gz
                ./sub-nih008/anat/sub-nih008_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih008_desc-denoised_UNIT1.nii.gz
                ./sub-nih009/anat/sub-nih009_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih009_desc-denoised_UNIT1.nii.gz
                ./sub-nih010/anat/sub-nih010_desc-denoised_UNIT1.nii.gz
                        Evidence: sub-nih010_desc-denoised_UNIT1.nii.gz
                ... and 190 more files having this issue (Use --verbose to see them all).

        Please visit https://neurostars.org/search?q=NOT_INCLUDED for existing conversations about this issue.

        1: [WARN] Not all subjects contain the same files. Each subject should contain the same number of files with the same naming unless some files are known to be missing. (code: 38 - INCONSISTENT_SUBJECTS)
                ./sub-nih139/anat/sub-nih139_UNIT1.nii.gz
                        Evidence: Subject: sub-nih139; Missing file: sub-nih139_UNIT1.nii.gz

        Please visit https://neurostars.org/search?q=INCONSISTENT_SUBJECTS for existing conversations about this issue.

        Summary:                   Available Tasks:        Available Modalities:
        1003 Files, 12.89GB                                MRI
        200 - Subjects
        1 - Session


        If you have any questions, please post on https://neurostars.org/tags/bids.
  • bids-validator reports a problem with the _desc-denoised_UNIT1 images; maybe @mguaypaq can help me with this

branch: nlm/initial_data
commit: 8b9e4f471fc5d613eabe8850b5acc317fecb543b

ready for PR
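
A missing source file such as the non-den image of sub-nih139 could also be caught before conversion with a pre-flight check. The sketch below is an illustration only: `missing_sources` is a hypothetical helper, the expected file names are taken from the conversion script above, and the list of existing files is passed in rather than read from disk.

```python
from typing import Iterable, List

# Source files expected for each subject index, as used in the conversion script.
SOURCE_PATTERNS = (
    "Transfer/{i}_inv1.nii.gz",
    "Transfer/{i}_inv2.nii.gz",
    "Transfer/{i}_uniden.nii.gz",
    "Transfer/{i}_t1_images.nii.gz",
    "Only_Uni/{i}_uni.nii.gz",
)


def missing_sources(existing: Iterable[str], subjects: Iterable[int]) -> List[str]:
    """List the expected source files that are not in `existing`."""
    present = set(existing)
    return [
        pattern.format(i=i)
        for i in subjects
        for pattern in SOURCE_PATTERNS
        if pattern.format(i=i) not in present
    ]


# Toy example: three subjects, one non-den UNI image absent.
files = [p.format(i=i) for i in (1, 2, 3) for p in SOURCE_PATTERNS]
files.remove("Only_Uni/2_uni.nii.gz")
print(missing_sources(files, subjects=(1, 2, 3)))
```

On the real data, `existing` could be built by listing the two source folders, for example with `pathlib.Path.glob`.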

@jcohenadad
Member

The problem is that it is possible that the subject ID differs between the uni-den and uni-nonden. Therefore, to ensure this, pls check 10 subjects randomly to make sure the subjects share the same ID.
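
Drawing the 10 subjects with a fixed seed keeps the spot check reproducible. A minimal sketch, assuming 200 subjects numbered 1 to 200 and excluding sub-nih139, which has no non-den image. `pick_subjects` and its default seed are illustrative choices.

```python
import random
from typing import List, Sequence


def pick_subjects(n_subjects: int = 200, k: int = 10,
                  exclude: Sequence[int] = (139,), seed: int = 0) -> List[str]:
    """Return k distinct subject labels drawn at random, reproducibly."""
    candidates = [i for i in range(1, n_subjects + 1) if i not in exclude]
    rng = random.Random(seed)  # local generator, does not touch global random state
    return [f"sub-nih{i:03d}" for i in sorted(rng.sample(candidates, k))]


print(pick_subjects())
```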

@jcohenadad
Member

I suggest to also save the non-binarized mask, in case in the future we would like to re-train the contrast-agnostic model using soft masks.

On the other hand, generating two outputs will create confusion when the time comes to manually QC and correct predictions. Therefore, it might be advisable to instead only keep the binarized version, perform manual QC and push that to the database. Feedback needed @naga-karthik

@naga-karthik
Member

it might be advisable to instead only keep the binarized version, perform manual QC and push that to the database.

How about we do this for the soft seg from contrast-agnostic? i.e. get soft seg, perform QC and push that to labels_softseg ? IMO, having soft segs is much more useful than having the binarized segs (which can just be obtained by thresholding the soft seg with one SCT command)

OR, you think that manually correcting soft seg is not trivial? If that's the case then, I agree with having only the binarized ones pushed to the database, BUT, having a script/procedure that could generate soft seg based on the binarized ones (i know this part is not easy either)

@jcohenadad
Member

OR, you think that manually correcting soft seg is not trivial? If that's the case then, I agree with having only the binarized ones pushed to the database, BUT, having a script/procedure that could generate soft seg based on the binarized ones (i know this part is not easy either)

Indeed, I do think this is not trivial. In any case, we need to come up with a script that goes from hard to soft, because we already have existing hard seg from other databases. @Nilser3 is working on it, but this is so important (and tricky to do well) that other folks should co-develop with him, so please review his progress, comment, co-develop: sct-pipeline/contrast-agnostic-softseg-spinalcord#84 thanks

@mguaypaq
Member

For bids-validator, we can ignore the errors about the _desc_denoised_ file names by adding a .bidsignore file in the repository root, containing these patterns:

*_desc_denoised_UNIT1.json
*_desc_denoised_UNIT1.nii.gz

This should be fine, because we're consciously choosing to do things differently than strict BIDS (discussion).

For the warning about missing sub-nih139_UNIT1.nii.gz, I think it's good to have this warning visible. And we should mention the reason in README.md.

@Nilser3
Author

Nilser3 commented Dec 17, 2023

The problem is that it is possible that the subject ID differs between the uni-den and uni-nonden. Therefore, to ensure this, pls check 10 subjects randomly to make sure the subjects share the same ID.

To check that the uni-den and uni-nonden images share the same IDs, I computed the mutual information (MI) and cross-correlation (CC) between them for all 199 subjects.

image

I then checked the subjects with the smallest metrics, such as sub-nih186 (MI = 0.114997):

image

This confirms that the uni-den and uni-nonden images share the same subject IDs.
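
To make the two metrics concrete, here is a minimal pure-Python sketch of a Pearson correlation and a histogram-based mutual information on flattened intensity lists. This is not the ANTs implementation: ANTs applies its own conventions for sign, scaling and sampling, so the numbers are not comparable with the values reported here. The function names and the 32-bin default are assumptions for illustration.

```python
import math
from collections import Counter


def pearson_cc(a, b):
    """Pearson correlation coefficient between two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mutual_information(a, b, bins=32):
    """Histogram-based mutual information (in nats) between two equal-length sequences."""

    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    qa, qb = to_bins(a), to_bins(b)
    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram counts
    pa, pb = Counter(qa), Counter(qb)  # marginal histogram counts
    # sum over p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts
    return sum(
        (c / n) * math.log((c * n) / (pa[i] * pb[j]))
        for (i, j), c in joint.items()
    )


if __name__ == "__main__":
    original = [float(i % 17) for i in range(500)]
    shifted = [v + 0.1 for v in original]  # same pattern, small intensity offset
    unrelated = [float((i * 7919) % 13) for i in range(500)]
    print(pearson_cc(original, shifted), mutual_information(original, shifted))
    print(pearson_cc(original, unrelated), mutual_information(original, unrelated))
```

A pair of images from the same subject should score clearly higher on both metrics than a mismatched pair, which is why the lowest-scoring subjects are the ones worth inspecting visually.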

@Nilser3
Author

Nilser3 commented Dec 18, 2023

I just noticed that the UNIT1-nonden image of subject sub-nih193 is cropped (see image):

image

I also checked the original folder Only_Uni.zip, and the image is cropped there too.

Image headers

sub-nih193_UNIT1 header
sct_image -i sub-nih193_UNIT1.nii.gz -header

--
Spinal Cord Toolbox (git-master-c7a8072fd63a06a2775a74029c042833f0fce510)

sct_image -i sub-nih193_UNIT1.nii.gz -header
--

sizeof_hdr      348
data_type       INT16
dim             [3, 42, 240, 256, 1, 1, 1, 1]
vox_units       mm
time_units      Unknown
datatype        4
nbyper          2
bitpix          16
pixdim          [-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
vox_offset      352
cal_max         0.000000
cal_min         0.000000
scl_slope       1.000000
scl_inter       0.000000
phase_dim       0
freq_dim        0
slice_dim       0
slice_name      Unknown
slice_code      0
slice_start     0
slice_end       0
slice_duration  0.000000
toffset         0.000000
intent          Unknown
intent_code     0
intent_name
intent_p1       0.000000
intent_p2       0.000000
intent_p3       0.000000
qform_name      Scanner Anat
qform_code      1
qto_xyz:1       -0.997542 0.024984 0.065460 84.117424
qto_xyz:2       0.028418 0.998240 0.052062 -125.059563
qto_xyz:3       0.064044 -0.053794 0.996496 -137.364914
qto_xyz:4       0.000000 0.000000 0.000000 1.000000
qform_xorient   Right-to-Left
qform_yorient   Posterior-to-Anterior
qform_zorient   Inferior-to-Superior
sform_name      Scanner Anat
sform_code      1
sto_xyz:1       -0.997542 0.024984 0.065458 84.117424
sto_xyz:2       0.028418 0.998240 0.052062 -125.059563
sto_xyz:3       0.064042 -0.053794 0.996496 -137.364914
sto_xyz:4       0.000000 0.000000 0.000000 1.000000
sform_xorient   Right-to-Left
sform_yorient   Posterior-to-Anterior
sform_zorient   Inferior-to-Superior
file_type       NIFTI-1+
file_code       1
descrip
aux_file

sub-nih193_desc-denoised_UNIT1 header
sct_image -i sub-nih193_desc-denoised_UNIT1.nii.gz -header

--
Spinal Cord Toolbox (git-master-c7a8072fd63a06a2775a74029c042833f0fce510)

sct_image -i sub-nih193_desc-denoised_UNIT1.nii.gz -header
--

sizeof_hdr      348
data_type       INT16
dim             [3, 176, 240, 256, 1, 1, 1, 1]
vox_units       mm
time_units      Unknown
datatype        4
nbyper          2
bitpix          16
pixdim          [-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
vox_offset      352
cal_max         0.000000
cal_min         0.000000
scl_slope       1.000000
scl_inter       0.000000
phase_dim       0
freq_dim        0
slice_dim       0
slice_name      Unknown
slice_code      0
slice_start     0
slice_end       0
slice_duration  0.000000
toffset         0.000000
intent          Unknown
intent_code     0
intent_name
intent_p1       0.000000
intent_p2       0.000000
intent_p3       0.000000
qform_name      Scanner Anat
qform_code      1
qto_xyz:1       -0.997542 0.024984 0.065460 84.117424
qto_xyz:2       0.028418 0.998240 0.052062 -125.059563
qto_xyz:3       0.064044 -0.053794 0.996496 -137.364914
qto_xyz:4       0.000000 0.000000 0.000000 1.000000
qform_xorient   Right-to-Left
qform_yorient   Posterior-to-Anterior
qform_zorient   Inferior-to-Superior
sform_name      Scanner Anat
sform_code      1
sto_xyz:1       -0.997542 0.024984 0.065458 84.117424
sto_xyz:2       0.028418 0.998240 0.052062 -125.059563
sto_xyz:3       0.064042 -0.053794 0.996496 -137.364914
sto_xyz:4       0.000000 0.000000 0.000000 1.000000
sform_xorient   Right-to-Left
sform_yorient   Posterior-to-Anterior
sform_zorient   Inferior-to-Superior
file_type       NIFTI-1+
file_code       1
descrip
aux_file
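
The header fields in the two dumps above differ only in `dim`: 42 versus 176 voxels along the first axis. A minimal sketch that extracts and compares the shapes from such text dumps; the function and variable names are hypothetical, and the header strings are shortened excerpts of the dumps above.

```python
import ast
import re


def parse_dim(header_text):
    """Extract the image shape from the `dim` line of a header dump.

    NIfTI stores the number of dimensions in dim[0], followed by the size
    along each axis, so the shape is dim[1 : 1 + dim[0]].
    """
    # MULTILINE anchors ^ at line starts, so `pixdim` and `slice_dim` are not matched
    match = re.search(r"^dim\s+(\[.*\])\s*$", header_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'dim' line found in header text")
    dim = ast.literal_eval(match.group(1))
    return tuple(dim[1 : 1 + dim[0]])


nonden_header = """\
data_type       INT16
dim             [3, 42, 240, 256, 1, 1, 1, 1]
pixdim          [-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
"""
denoised_header = """\
data_type       INT16
dim             [3, 176, 240, 256, 1, 1, 1, 1]
pixdim          [-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
"""

nonden_shape = parse_dim(nonden_header)
denoised_shape = parse_dim(denoised_header)
print(nonden_shape, denoised_shape)  # (42, 240, 256) (176, 240, 256)
print(nonden_shape == denoised_shape)  # False
```

Running a shape comparison like this over all subjects before computing similarity metrics would flag cropped images that an overlap-only metric cannot detect.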

Similarity metric analysis

This issue was not detected by the MI and CC analysis of UNIT1-den vs. UNIT1-nonden, because the similarity metrics were computed only over the small common matrix.

MeasureImageSimilarity -d 3 -m MI[ nih-ms-mp2rage/sub-nih193/anat/sub-nih193_UNIT1.nii.gz, nih-ms-mp2rage/sub-nih193/anat/sub-nih193_desc-denoised_UNIT1.nii.gz ,1,32]
-0.504648

@Nilser3
Author

Nilser3 commented Dec 22, 2023

For bids-validator, we can ignore the errors about the _desc_denoised_ file names by adding a .bidsignore file in the repository root, containing these patterns:

Thanks @mguaypaq ,

I've added the .bidsignore and bids-validator seems to be happy!

For the warning about missing sub-nih139_UNIT1.nii.gz, I think it's good to have this warning visible. And we should mention the reason in README.md.

I have also added in the README.md file the issues for the subjects sub-nih139 (UNIT1 missing) and sub-nih193 (UNIT1 cropped)

branch: nlm/initial_data
commit : 7c6c0a3ee7341088c7d3c08fc7cd1c1439d02008

ready for PR

@Nilser3
Author

Nilser3 commented Jan 12, 2024

Hi @mguaypaq

I have renamed the JSON files (bids-validator seems to be happy)

branch: nlm/initial_data
commit : 89a57c57422148e2ea118ad7f99192ffc2941e7b

ready for PR

@mguaypaq
Member

Alright, everything looks good for bids-validator and git-annex, so I merged this into master and deleted the branch nlm/initial_data.
