New dataset nih-ms-mp2rage
#280
For the name, always use small letters, and I would start with the site, pathology and contrast. For the subject name I would go with:
Look at them, you should figure it out. Also read the MP2RAGE paper.
Thank you. Then we should organise the data following qMRI data in BIDS.
👍
The data has been organized in BIDS:
bids-validator output:
(base) nilaia@rosenberg:~/data_nvme_nilaia/NIH_mp2rage/DataForMontreal$ bids-validator nih-ms-mp2rage --verbose
[email protected]
bids-specification@disable
(node:1605962) Warning: Closing directory handle on garbage collection
(Use `node --trace-warnings ...` to show where the warning was created)
1: [WARN] Not all subjects/sessions/runs have the same scanning parameters. (code: 39 - INCONSISTENT_PARAMETERS)
./sub-nih015/anat/sub-nih015_inv-1_part-mag_MP2RAGE.nii.gz
The most common set of dimensions is: 176,240,256 (voxels), This file has the dimensions: 20,240,256 (voxels).
./sub-nih029/anat/sub-nih029_inv-1_part-mag_MP2RAGE.nii.gz
The most common set of dimensions is: 176,240,256 (voxels), This file has the dimensions: 156,240,256 (voxels).
./sub-nih029/anat/sub-nih029_inv-2_part-mag_MP2RAGE.nii.gz
The most common set of dimensions is: 176,240,256 (voxels), This file has the dimensions: 150,240,256 (voxels).
./sub-nih093/anat/sub-nih093_inv-1_part-mag_MP2RAGE.nii.gz
The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.
./sub-nih093/anat/sub-nih093_inv-2_part-mag_MP2RAGE.nii.gz
The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.
./sub-nih099/anat/sub-nih099_inv-1_part-mag_MP2RAGE.nii.gz
The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.
./sub-nih099/anat/sub-nih099_inv-2_part-mag_MP2RAGE.nii.gz
The most common resolution is: 1.00mm x 1.00mm x 1.00mm, This file has the resolution: 1.20mm x 1.00mm x 1.00mm.
Please visit https://neurostars.org/search?q=INCONSISTENT_PARAMETERS for existing conversations about this issue.
2: [WARN] The Authors field of dataset_description.json should contain an array of fields - with one author per field. This was triggered because there are no authors, which will make DOI registration from dataset metadata impossible. (code: 113 - NO_AUTHORS)
Please visit https://neurostars.org/search?q=NO_AUTHORS for existing conversations about this issue.
Summary: Available Tasks: Available Modalities:
404 Files, 3.81GB MRI
200 - Subjects
1 - Session
If you have any questions, please post on https://neurostars.org/tags/bids.
@mguaypaq could you please create a repository for this dataset? Thank you.
@Nilser3 done, I set up the repository and gave you write access. Let me know once you have pushed your data to a branch.
Done @mguaypaq,
I added a commit to the branch.
Other than that, git-annex and bids-validator are both happy with this dataset.
Thank you @mguaypaq. I agree, we could put the name of the vendor.
@jcohenadad I think we should request the MP2RAGE acquisition parameters.
Is this folder empty? If so, why not simply get rid of it? @Nilser3 what are your arguments for calling it Siemens? What do you want to put inside it?
This group has been kind enough to spend some time gathering the data and sending them to us-- it is not appropriate to ask them (ie busy neuroradiologists) to spend more time to reconvert the DICOM using dcm2niix to generate the sidecar JSON fields. We just need to deal with this dataset without the JSON. @Nilser3 there is already some info that you can retrieve from the NIfTI metadata and data.
No, the folder isn't empty; it contains half of the image files (the other half of the image files is in the other folder).
ok that's a problem-- i would put the uni and t1map under the source images as done for the marseille and basel data-- i know this is not bids compliant but otherwise it will screw up our training scripts which assume input data is under source
Thank you @jcohenadad and @mguaypaq,
I have worked in the same branch that @mguaypaq worked on:
great! let me know when i can review-- thx!
Alright, I think you can review the branch.
The UNI image (typically used for lesion segmentation) looks very different from the ones I've seen in the past. The background is black, and the contrast looks much closer to a T1w MPRAGE. See notably the contrast in the vertebrae and discs. See below a comparison of the UNI from NIH, Basel and Marseille:
@Nilser3 the comparison you did #280 (comment) is irrelevant because
@jcohenadad
Quick google search --> https://cbs-discourse.uwo.ca/t/removing-background-noise-from-mp2rage-images/101 The 'den' information should probably not have been removed. I've emailed the collaborators to see if they still have the UNI data. Beyond the background, the contrast is questionable. @Nilser3 I encourage you to always dig (eg google search, etc.). These are important considerations which should not have been silenced.
We won't have the non-den UNI, therefore we move forward with these UNI-den images.
Thank you.
@jcohenadad @mguaypaq if you agree I can proceed.
Since
I agree @mguaypaq
👍 If we go that route I suggest we also add this use-case to our internal SOP. Also tagging @valosekj
look at the T1map and tell me if that makes sense...
OK, when organizing the data as in the schema:
and checking with bids-validator, I have an ERROR.
@mguaypaq I don't know if this error can be resolved, or if we have to use another nomenclature. commit d8027709ba35c282cf155193d2f4cc0c65e9caaa
good news! Haris said he can share the UNI (non-den) data:
I will follow-up on this...
@Nilser3 I copied the non-den data under: I suggest keeping both the den and non-den.
I suggest that we keep the denoised images inside a derived folder since we now have access to non-denoised images. Also, I'm not sure we should use the
under discussion: #282 (comment)
@Nilser3 I've also added the file
For our other datasets, I have used two possible scenarios:
valosek@macbook-pro:~/data/data.neuro.polymtl.ca/sci-colorado$ head -1 participants.tsv
participant_id age sex level_of_injury height weight bmi time_between_mri_and_injury able_to_walk initial_LEMS_R ...
valosek@macbook-pro:~/data/data.neuro.polymtl.ca/dcm-zurich$ tree phenotype
phenotype
├── README.md
├── anatomical_data.xlsx
├── clinical_scores.xlsx
├── electrophysiological_measurements.xlsx
├── motion_data.xlsx
└── motion_data_maximum_stenosis.xlsx

Both scenarios have their pros and cons; summary here.
Thank you @valosekj,
this scenario has been done.
I find it interesting to save the information separately for each subject (there are 200 patients); however, for this data there is only some information such as clinical scores (see participants.json):
{
"participant_id": {
"LongName": "Participant ID",
"Description": "Unique ID"
},
"sex": {
"LongName": "Participant sex",
"Description": "Sex",
"Levels": {
"M": "male",
"F": "female"
}
},
"date_of_birth": {
"LongName": "Date of birth",
"Description": "yyyy-mm-dd"
},
"age_2023": {
"LongName": "Participant age at 2023",
"Description": "yy",
"Units": "years"
},
"race": {
"LongName": "Ethnic group / race",
"Description": "race"
},
"date_of_scan": {
"LongName": "Date of scan",
"Description": "yyyy-mm-dd"
},
"pathology": {
"LongName": "Pathology name",
"Description": "The diagnosis of pathology of the participant",
"Levels": {
"MS Spectrum": "Multiple Sclerosis"
}
},
"phenotype": {
"LongName": "Phenotype name",
"Description": "The MS phenotype of the participant",
"Levels": {
"RRMS": "Relapsing-Remitting Multiple Sclerosis",
"PPMS": "Primary Progressive Multiple Sclerosis",
"SPMS": "Secondary Progressive Multiple Sclerosis"
}
},
"onset": {
"LongName": "Disease onset date",
"Description": "yyyy-mm-dd"
},
"Relapses_Past12m_OR_PreviousVisit": {
"LongName": "Relapses in past 12 months or since previous visit",
"Description": "Number"
},
"Current_DMT": {
"LongName": "Current disease-modifying therapies",
"Description": "Current therapy"
},
"Date_Testing_Correct": {
"LongName": "Date testing correct",
"Description": "yyyy-mm-dd"
},
"Scripps": {
"LongName": "Scripps neurologic rating scale (SNRS)",
"Description": "Scripps SCORE (Maximum = 100)"
},
"EDSS": {
"LongName": "Expanded Disability Status Scale (EDSS)",
"Description": "EDSS score (Maximum = 10)"
},
"Nine_HPT_Right": {
"LongName": "Nine-Hole Peg Test on right hand",
"Description": "Nine-Hole Peg Test score (measure finger dexterity in seconds)"
},
"Nine_HPT_Left": {
"LongName": "Nine-Hole Peg Test on left hand",
"Description": "Nine-Hole Peg Test score (measure finger dexterity in seconds)"
},
"Handedness": {
"LongName": "Handedness",
"Description": "Manual dominance",
"Levels": {
"Left": "Left-handed",
"Right": "Right-handed"
}
},
"T25FW": {
"LongName": "Timed 25-Foot Walk (T25FW)",
"Description": "T25FW score",
"Units": "seconds or Not_Completed"
},
"Assistive_Device": {
"LongName": "Assistive device",
"Description": "Assistive device",
"Levels": {
"None": "None",
"Unilateral": "Unilateral",
"Bilateral": "Bilateral"
}
},
"SDMT": {
"LongName": "Symbol digit modalities test (SDMT)",
"Description": "SDMT score",
"Units": "number or Not_Completed"
},
"PASAT": {
"LongName": "Paced Auditory Serial Addition Test (PASAT)",
"Description": "PASAT score",
"Units": "number or Not_Completed"
}
}

So I don't know if it is pertinent to do this second scenario.
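Whichever scenario is chosen, the categorical columns in participants.tsv can be cross-checked against the Levels declared in participants.json before pushing. A minimal sketch, using only the standard library; the rows and the two columns below are illustrative stand-ins, not the real data:

```python
import csv
import io
import json

# Hypothetical excerpt of the "Levels" declarations from participants.json
sidecar = json.loads("""
{
  "sex": {"Levels": {"M": "male", "F": "female"}},
  "phenotype": {"Levels": {"RRMS": "Relapsing-Remitting Multiple Sclerosis",
                           "PPMS": "Primary Progressive Multiple Sclerosis",
                           "SPMS": "Secondary Progressive Multiple Sclerosis"}}
}
""")

# Hypothetical participants.tsv content (second subject has an invalid sex code)
tsv = ("participant_id\tsex\tphenotype\n"
       "sub-nih001\tM\tRRMS\n"
       "sub-nih002\tX\tSPMS\n")

def check_levels(tsv_text, sidecar):
    """Return (participant, column, value) triples whose value is not a declared Level."""
    bad = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        for col, meta in sidecar.items():
            levels = meta.get("Levels")
            if levels and row.get(col) not in levels:
                bad.append((row["participant_id"], col, row[col]))
    return bad

print(check_levels(tsv, sidecar))  # → [('sub-nih002', 'sex', 'X')]
```

Running this over the full 200-subject file would surface typos in coded columns (e.g. an unexpected phenotype abbreviation) before they reach the database.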
Sorry, I probably expressed myself in an unclear way.
By this, I meant to store clinical data for all subjects in a single file (not separately for each subject). By "separate file" I meant a file under
Oh, thanks @valosekj, now I understand it better.
That sounds reasonable to me!
@Nilser3 please make sure that the subject ID i in the CSV file corresponds to subject i in the BIDS dataset (and please describe the methodology for us to validate that no error is possible). Similarly, please make sure that the non-den UNI image of subject i corresponds to sub-00i (also describe the methodology). Please note that one subject is missing from the non-den files.
To ensure that the IDs of the source files are the same in the BIDS folder, I have applied this script (script nih-bids):
#!/bin/bash
bids_file="nih-ms-mp2rage"
#Creation of BIDS folders
mkdir $bids_file
mkdir $bids_file/derivatives
mkdir $bids_file/derivatives/labels
for sub in {1..200};
do
printf -v j "%03g" $sub ;
# Creation of BIDS folders per subject
mkdir $bids_file/sub-nih$j
mkdir $bids_file/sub-nih$j/anat
mkdir $bids_file/derivatives/labels/sub-nih$j
mkdir $bids_file/derivatives/labels/sub-nih$j/anat
# Images reorientation to RPI
sct_image -i Transfer/$sub"_inv1.nii.gz" -setorient RPI -o $bids_file/sub-nih$j/anat/sub-nih$j"_inv-1_MP2RAGE.nii.gz"
sct_image -i Transfer/$sub"_inv2.nii.gz" -setorient RPI -o $bids_file/sub-nih$j/anat/sub-nih$j"_inv-2_MP2RAGE.nii.gz"
sct_image -i Only_Uni/$sub"_uni.nii.gz" -setorient RPI -o $bids_file/sub-nih$j/anat/sub-nih$j"_UNIT1.nii.gz"
sct_image -i Transfer/$sub"_uniden.nii.gz" -setorient RPI -o $bids_file/sub-nih$j/anat/sub-nih$j"_desc-denoised_UNIT1.nii.gz"
sct_image -i Transfer/$sub"_t1_images.nii.gz" -setorient RPI -o $bids_file/sub-nih$j/anat/sub-nih$j"_T1map.nii.gz"
# Contrast agnostic model to obtain the SC from UNIT1
python ../contrast-agnostic-softseg-spinalcord/monai/run_inference_single_image.py --path-img /mnt/nvme/nilaia/$bids_file/sub-nih$j/anat/sub-nih$j"_UNIT1.nii.gz" --chkp-path /mnt/duke/temp/muena/contrast-agnostic/final_monai_model/nnunet_nf\=32_DS\=1_opt\=adam_lr\=0.001_AdapW_CCrop_bs\=2_64x192x320_20230918-2253/ --path-out $bids_file/derivatives/labels/sub-nih$j/anat/ --device cpu
# SC mask binarization
sct_maths -i $bids_file/derivatives/labels/sub-nih$j/anat/sub-nih$j"_UNIT1_pred.nii.gz" -bin 0.5001 -o $bids_file/derivatives/labels/sub-nih$j/anat/sub-nih$j"_UNIT1_label-SC_seg.nii.gz"
echo $j
done
bids-validator output:
bids-validator nih-ms-mp2rage/
[email protected]
(node:18886) Warning: Closing directory handle on garbage collection
(Use `node --trace-warnings ...` to show where the warning was created)
1: [ERR] Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder. (code: 1 - NOT_INCLUDED)
./sub-nih001/anat/sub-nih001_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih001_desc-denoised_UNIT1.nii.gz
./sub-nih002/anat/sub-nih002_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih002_desc-denoised_UNIT1.nii.gz
./sub-nih003/anat/sub-nih003_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih003_desc-denoised_UNIT1.nii.gz
./sub-nih004/anat/sub-nih004_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih004_desc-denoised_UNIT1.nii.gz
./sub-nih005/anat/sub-nih005_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih005_desc-denoised_UNIT1.nii.gz
./sub-nih006/anat/sub-nih006_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih006_desc-denoised_UNIT1.nii.gz
./sub-nih007/anat/sub-nih007_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih007_desc-denoised_UNIT1.nii.gz
./sub-nih008/anat/sub-nih008_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih008_desc-denoised_UNIT1.nii.gz
./sub-nih009/anat/sub-nih009_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih009_desc-denoised_UNIT1.nii.gz
./sub-nih010/anat/sub-nih010_desc-denoised_UNIT1.nii.gz
Evidence: sub-nih010_desc-denoised_UNIT1.nii.gz
... and 190 more files having this issue (Use --verbose to see them all).
Please visit https://neurostars.org/search?q=NOT_INCLUDED for existing conversations about this issue.
1: [WARN] Not all subjects contain the same files. Each subject should contain the same number of files with the same naming unless some files are known to be missing. (code: 38 - INCONSISTENT_SUBJECTS)
./sub-nih139/anat/sub-nih139_UNIT1.nii.gz
Evidence: Subject: sub-nih139; Missing file: sub-nih139_UNIT1.nii.gz
Please visit https://neurostars.org/search?q=INCONSISTENT_SUBJECTS for existing conversations about this issue.
Summary: Available Tasks: Available Modalities:
1003 Files, 12.89GB MRI
200 - Subjects
1 - Session
If you have any questions, please post on https://neurostars.org/tags/bids.
ready for PR
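The INCONSISTENT_SUBJECTS warning (sub-nih139 missing its UNIT1) can also be caught before re-running the validator with a quick completeness check. A sketch; the suffix list is taken from the conversion script above, and the dataset root path is whatever the script produced:

```python
from pathlib import Path

# File suffixes each subject's anat/ folder is expected to contain,
# matching the names produced by the bash conversion script.
EXPECTED = ["inv-1_MP2RAGE", "inv-2_MP2RAGE", "UNIT1",
            "desc-denoised_UNIT1", "T1map"]

def missing_files(bids_root, n_subjects=200):
    """Return {subject: [missing suffixes]} for subjects lacking expected files."""
    missing = {}
    for i in range(1, n_subjects + 1):
        sub = f"sub-nih{i:03d}"
        anat = Path(bids_root) / sub / "anat"
        absent = [s for s in EXPECTED
                  if not (anat / f"{sub}_{s}.nii.gz").exists()]
        if absent:
            missing[sub] = absent
    return missing
```

For this dataset, `missing_files("nih-ms-mp2rage")` should report exactly `{"sub-nih139": ["UNIT1"]}` if the validator warning is the only gap.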
The problem is that it is possible that the subject ID differs between the uni-den and uni-nonden. Therefore, to ensure this, please check 10 subjects randomly to make sure the subjects share the same ID.
I suggest to also save the non-binarized mask, in case in the future we would like to re-train the contrast-agnostic model using soft masks. On the other hand, generating two outputs will create confusion when the time comes to manually QC and correct predictions. Therefore, it might be advisable to instead only keep the binarized version, perform manual QC and push that to the database. Feedback needed @naga-karthik
How about we do this for the soft seg from contrast-agnostic? i.e. get soft seg, perform QC and push that to the database. OR, do you think that manually correcting soft seg is not trivial? If that's the case then I agree with having only the binarized ones pushed to the database, BUT having a script/procedure that could generate soft seg based on the binarized ones (i know this part is not easy either).
Indeed, I do think this is not trivial. In any case, we need to come up with a script that goes from hard to soft, because we already have existing hard seg from other databases. @Nilser3 is working on it, but this is so important (and tricky to do well) that other folks should co-develop with him, so please review his progress, comment, co-develop: sct-pipeline/contrast-agnostic-softseg-spinalcord#84 thanks
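One possible (unvalidated) starting point for such a hard-to-soft script is to blur the binary mask so that only the rim takes intermediate values while the interior stays at 1. A pure-numpy sketch; the 6-neighbourhood averaging, the iteration count, and the interior-restore step are my own choices, not the agreed method, and np.roll wraps around volume edges (harmless for a cord mask away from the image borders):

```python
import numpy as np

def hard_to_soft(mask, iterations=2):
    """Approximate a soft segmentation from a binary mask by repeated
    6-neighbourhood averaging, then restoring the original foreground to 1
    so only the rim is soft."""
    soft = mask.astype(float)
    for _ in range(iterations):
        # average each voxel with its 2 neighbours along every axis
        acc = soft.copy()
        for axis in range(soft.ndim):
            acc += np.roll(soft, 1, axis) + np.roll(soft, -1, axis)
        soft = acc / (1 + 2 * soft.ndim)
    soft[mask.astype(bool)] = 1.0  # keep the hard foreground fully "on"
    return np.clip(soft, 0.0, 1.0)
```

A quick sanity check: for a small cube-shaped mask, the cube itself stays at 1, voxels just outside it get values strictly between 0 and 1, and far-away background stays at 0.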
For bids-validator, we can ignore the errors about the
This should be fine, because we're consciously choosing to do things differently than strict BIDS (discussion). For the warning about missing
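For reference, the NOT_INCLUDED errors can be silenced with a .bidsignore file at the dataset root, which uses the same glob syntax as .gitignore (as pointed out in the validator's own error message). The exact patterns below are a suggestion for the desc-denoised files, not necessarily what was committed:

```text
sub-*/anat/sub-*_desc-denoised_UNIT1.nii.gz
sub-*/anat/sub-*_desc-denoised_UNIT1.json
```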
To ensure that the IDs match, I computed similarity metrics between both images. Then I checked the subjects with the smaller metrics. Thus it is verified that both images have the same IDs.
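The ID-matching check described above can be sketched as a Pearson cross-correlation restricted to the common matrix of the two volumes (the den and non-den images can differ along one dimension). This is a hypothetical numpy illustration, not the exact script used for the MI/CC analysis:

```python
import numpy as np

def common_matrix_cc(a, b):
    """Pearson cross-correlation of two volumes, cropped to their common
    matrix (per-axis minimum of the two shapes)."""
    shape = tuple(min(s, t) for s, t in zip(a.shape, b.shape))
    a = a[tuple(slice(s) for s in shape)].ravel().astype(float)
    b = b[tuple(slice(s) for s in shape)].ravel().astype(float)
    a -= a.mean()
    b -= b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```

Pairs with unexpectedly low CC would then be flagged for visual inspection, since a den/non-den pair from the same subject should correlate strongly even after denoising.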
I just verified the UNIT1-nonden image of subject sub-nih193. I have also verified it in the original folder.

Image headers

sub-nih193_UNIT1 header:
sct_image -i sub-nih193_UNIT1.nii.gz -header
--
Spinal Cord Toolbox (git-master-c7a8072fd63a06a2775a74029c042833f0fce510)
sct_image -i sub-nih193_UNIT1.nii.gz -header
--
sizeof_hdr 348
data_type INT16
dim [3, 42, 240, 256, 1, 1, 1, 1]
vox_units mm
time_units Unknown
datatype 4
nbyper 2
bitpix 16
pixdim [-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
vox_offset 352
cal_max 0.000000
cal_min 0.000000
scl_slope 1.000000
scl_inter 0.000000
phase_dim 0
freq_dim 0
slice_dim 0
slice_name Unknown
slice_code 0
slice_start 0
slice_end 0
slice_duration 0.000000
toffset 0.000000
intent Unknown
intent_code 0
intent_name
intent_p1 0.000000
intent_p2 0.000000
intent_p3 0.000000
qform_name Scanner Anat
qform_code 1
qto_xyz:1 -0.997542 0.024984 0.065460 84.117424
qto_xyz:2 0.028418 0.998240 0.052062 -125.059563
qto_xyz:3 0.064044 -0.053794 0.996496 -137.364914
qto_xyz:4 0.000000 0.000000 0.000000 1.000000
qform_xorient Right-to-Left
qform_yorient Posterior-to-Anterior
qform_zorient Inferior-to-Superior
sform_name Scanner Anat
sform_code 1
sto_xyz:1 -0.997542 0.024984 0.065458 84.117424
sto_xyz:2 0.028418 0.998240 0.052062 -125.059563
sto_xyz:3 0.064042 -0.053794 0.996496 -137.364914
sto_xyz:4 0.000000 0.000000 0.000000 1.000000
sform_xorient Right-to-Left
sform_yorient Posterior-to-Anterior
sform_zorient Inferior-to-Superior
file_type NIFTI-1+
file_code 1
descrip
aux_file

sub-nih193_desc-denoised_UNIT1 header:
sct_image -i sub-nih193_desc-denoised_UNIT1.nii.gz -header
--
Spinal Cord Toolbox (git-master-c7a8072fd63a06a2775a74029c042833f0fce510)
sct_image -i sub-nih193_desc-denoised_UNIT1.nii.gz -header
--
sizeof_hdr 348
data_type INT16
dim [3, 176, 240, 256, 1, 1, 1, 1]
vox_units mm
time_units Unknown
datatype 4
nbyper 2
bitpix 16
pixdim [-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
vox_offset 352
cal_max 0.000000
cal_min 0.000000
scl_slope 1.000000
scl_inter 0.000000
phase_dim 0
freq_dim 0
slice_dim 0
slice_name Unknown
slice_code 0
slice_start 0
slice_end 0
slice_duration 0.000000
toffset 0.000000
intent Unknown
intent_code 0
intent_name
intent_p1 0.000000
intent_p2 0.000000
intent_p3 0.000000
qform_name Scanner Anat
qform_code 1
qto_xyz:1 -0.997542 0.024984 0.065460 84.117424
qto_xyz:2 0.028418 0.998240 0.052062 -125.059563
qto_xyz:3 0.064044 -0.053794 0.996496 -137.364914
qto_xyz:4 0.000000 0.000000 0.000000 1.000000
qform_xorient Right-to-Left
qform_yorient Posterior-to-Anterior
qform_zorient Inferior-to-Superior
sform_name Scanner Anat
sform_code 1
sto_xyz:1 -0.997542 0.024984 0.065458 84.117424
sto_xyz:2 0.028418 0.998240 0.052062 -125.059563
sto_xyz:3 0.064042 -0.053794 0.996496 -137.364914
sto_xyz:4 0.000000 0.000000 0.000000 1.000000
sform_xorient Right-to-Left
sform_yorient Posterior-to-Anterior
sform_zorient Inferior-to-Superior
file_type NIFTI-1+
file_code 1
descrip
aux_file

Similarity metric analysis
This issue was not reported in the analysis of MI and CC on UNIT1-den VS UNIT1-nonden (because the similarity metrics were calculated only on the small common matrix).
Thanks @mguaypaq, I've added the
I have also added in the README.md file the issues for the subjects.
Branch: nlm/initial_data
ready for PR
Hi @mguaypaq, I have renamed the JSON files (bids-validator seems to be happy).
ready for PR
Alright, everything looks good for bids-validator and git-annex, so I merged this into master and deleted the branch.
Description:
Dataset of 200 MS patients from NIH, acquired at 3T with the MP2RAGE sequence; contains inv1, inv2, UNIT1 and T1map contrasts in a large field-of-view (FOV) covering the head and spinal cord from C1 to ~C5.
Resolution:
1.0 x 1.0 x 1.0 mm3
Contains no MS lesion or SC segmentations.
Details:
Current arborescence:
Proposed BIDS format:
I was guided by Organization of qMRI data in BIDS, but it is not clear to me how to organise inv1 and inv2 (because I don't know if these images are phase or mag).