Create a git-annex repo for data from Karolinska #16
Comments
Given that Karolinska has contributed (or will contribute) to multiple datasets from different studies, I think the name needs to specify which one this is. E.g., this one could be called: |
Thanks @jcohenadad! I am currently working on the BIDSification of the data and facing a few issues. I am following the dcm2bids tutorial and ran the following commands:
conda activate dcm2bids
mkdir bids_karo
dcm2bids_scaffold -o bids_karo
The config file I created is the following (feedback is welcome on the suffixes chosen). It is stored in
{
"descriptions": [
{
"datatype": "anat",
"suffix": "acq-MPRsag_T1w",
"criteria": {
"SeriesDescription": "t1_mpr_ns_sag_1mm_iso"
}
},
{
"datatype": "anat",
"suffix": "acq-sagDF_T2w",
"criteria": {
"SeriesDescription": "t2_space_dark-fluid_sag_REK_tra_3mm"
}
},
{
"datatype": "anat",
"suffix": "acq-sagDF_T2w",
"criteria": {
"SeriesDescription": "t2_space_dark-fluid_sag_REK_3mm_tra"
}
},
{
"datatype": "anat",
"suffix": "acq-tseSag_T2w",
"criteria": {
"SeriesDescription": "t2_tse_sag_MS"
}
},
{
"datatype": "anat",
"suffix": "acq-sagP2_T2w",
"criteria": {
"SeriesDescription": "t2_space_sag_p2_iso_REK_tra_3mm"
}
},
{
"datatype": "anat",
"suffix": "acq-sagP2_T2w",
"criteria": {
"SeriesDescription": "t2_space_sag_p2_iso_REK_3mm_tra"
}
},
{
"datatype": "anat",
"suffix": "acq-me2d_T2w",
"criteria": {
"SeriesDescription": "t2_me2d_tra_p2_3mm"
}
},
{
"datatype": "anat",
"suffix": "acq-tse_T2w",
"criteria": {
"SeriesDescription": "t2_tse_tra"
}
},
{
"datatype": "anat",
"suffix": "acq-MPRsag_T1w",
"criteria": {
"SeriesDescription": "t1_mpr_ns_sag_1mm_iso_REK_1mm_tra"
}
},
{
"datatype": "anat",
"suffix": "acq-MPRsag_T1w",
"criteria": {
"SeriesDescription": "t1_mpr_ns_sag_1mm_iso_MPR_3mm_tra"
}
},
{
"datatype": "anat",
"suffix": "acq-MPRsagDF_T2w",
"criteria": {
"SeriesDescription": "t2_space_dark-fluid_sag_MPR_3mm_tra"
}
},
{
"datatype": "anat",
"suffix": "acq-MPRsagP2_T2w",
"criteria": {
"SeriesDescription": "t2_space_sag_p2_iso_MPR_3mm_tra"
}
}
]
}
I created the following bash script to convert the DICOM images to BIDS:
#!/bin/bash
# Check if the correct number of arguments is provided
if [ "$#" -ne 3 ]; then
echo "Usage: $0 path/to/config.json path/to/output_dir path/to/dicom"
exit 1
fi
# Get the config file, output directory, and DICOM directory from the command line arguments
config_file="$1"
output_dir="$2"
dicom_path="$3"
# Iterate over each folder in the DICOM directory
for folder in "$dicom_path"/SW1-*; do
# Check if it is a directory
if [ -d "$folder" ]; then
# Extract participant and session info from the folder name
# The folder name is */SW1-1773_M0: the participant should be 1773 and the session M0
subfolder="${folder##*/}"
# Participant is the number after SW1- and before _
participant="${subfolder#SW1-}"
participant="${participant%%_*}"
# Session is the letter after the _
session="${subfolder##*_}"
echo "Converting participant $participant session $session"
# Define the DICOM directory
dicom_dir="$folder"
# Run dcm2bids
dcm2bids -d "$dicom_dir" -p "$participant" -s "$session" -c "$config_file" -o "$output_dir" --bids_validate
fi
done
echo "All conversions are done."
The script was run using the following command:
bids_karo/code/convert_dcm2bids.sh bids_karo/code/dcm2bids_config.json bids_karo/ 20200612_longitudinal/Karolinska_data.1/
However, some files don't have the field
INFO | SIDECAR PAIRING
INFO | No Pairing <- 001_SW1-1875_M12_0_i00001
INFO | No Pairing <- 001_SW1-1875_M12_0_i00004
INFO | No Pairing <- 002_SW1-1875_M12_0
INFO | No Pairing <- 002_SW1-1875_M12_0a
INFO | No Pairing <- 003_SW1-1875_M12_0
INFO | No Pairing <- 003_SW1-1875_M12_0a
INFO | No Pairing <- 004_SW1-1875_M12_0
WARNING | NO PAIRING WAS FOUND. BIDS FOLDER "BIDS_KARO/SUB-1875/SES-M12" WON'T BE CREATED. CHECK YOUR CONFIG FILE.
You can find these files and logs in the following folder:
What should I do? How should I modify my config file to work with these files? (I only showed one example, but there are more that didn't work.) @jcohenadad @valosekj @NathanMolinier Any ideas? |
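The participant/session extraction done by the bash parameter expansions above can be mirrored in Python, which makes it easy to check a folder name before launching a conversion. This is only a sketch: the function name `parse_folder_name` is made up for illustration, and it assumes folder names follow the `SW1-<participant>_<session>` pattern mentioned in the script's comment.

```python
from typing import Optional, Tuple


def parse_folder_name(name: str) -> Optional[Tuple[str, str]]:
    """Split a folder name such as 'SW1-1773_M0' into (participant, session).

    Mirrors the bash expansions in the script:
      participant = text after 'SW1-' and before the first '_'
      session     = text after the last '_'
    Returns None when the name does not follow the assumed pattern.
    """
    prefix = "SW1-"
    if not name.startswith(prefix) or "_" not in name:
        return None
    participant = name[len(prefix):].split("_", 1)[0]
    session = name.rsplit("_", 1)[1]
    if not participant or not session:
        return None
    return participant, session


if __name__ == "__main__":
    for folder in ("SW1-1773_M0", "SW1-1875_M12", "unexpected_folder"):
        print(folder, "->", parse_folder_name(folder))
```

Unlike the bash loop, this sketch returns `None` for names that do not fit the pattern instead of passing them on to `dcm2bids`.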
After some investigation, I found that using "SequenceName" would also work to create the file suffix.
{
"descriptions": [
{
"datatype": "anat",
"suffix": "acq-sagMprage_T1w",
"criteria": {
"SequenceName": "*tfl3d1_16ns"
}
},
{
"datatype": "anat",
"suffix": "acq-sagTse_T2w",
"criteria": {
"SequenceName": "*tseR2d1rr19"
}
},
{
"datatype": "anat",
"suffix": "acq-me2d_T2w",
"criteria": {
"SequenceName": "*me2d1r4"
}
},
{
"datatype": "anat",
"suffix": "acq-Tse_T2w",
"criteria": {
"SequenceName": "*tseR2d1rs17"
}
},
{
"datatype": "anat",
"suffix": "acq-sagMprageDf_T2w",
"criteria": {
"SequenceName": "*spcir_278ns"
}
},
{
"datatype": "anat",
"suffix": "acq-sagMprageP2_T2w",
"criteria": {
"SequenceName": "*spcR_282ns"
}
},
{
"datatype": "anat",
"suffix": "localiser",
"criteria": {
"SequenceName": "*fl2d1"
}
},
{
"datatype": "anat",
"suffix": "acq-sag_T1w",
"criteria": {
"SequenceName": "*spcir_257ns"
}
},
{
"datatype": "anat",
"suffix": "acq-epB0",
"criteria": {
"SequenceName": "*ep_b0"
}
},
{
"datatype": "anat",
"suffix": "acq-epB01000",
"criteria": {
"SequenceName": "*ep_b0_1000"
}
},
{
"datatype": "anat",
"suffix": "acq-epB1000t",
"criteria": {
"SequenceName": "*ep_b1000t"
}
},
{
"datatype": "anat",
"suffix": "acq-cor_T1w",
"criteria": {
"SequenceName": "*h2d1_205",
"ImageOrientationPatientDICOM": [1,0,0,0,0,-1]
}
},
{
"datatype": "anat",
"suffix": "acq-sag_T1w",
"criteria": {
"SequenceName": "*h2d1_205",
"ImageOrientationPatientDICOM": [0,1,0,0,0,-1]
}
},
{
"datatype": "anat",
"suffix": "acq-ax_T1w",
"criteria": {
"SequenceName": "*h2d1_205",
"ImageOrientationPatientDICOM": [1,0,0,0,1,0]
}
}
]
}
This should cover every case in the dataset. Only the following files were not transferred because they didn't look relevant:
Feedback on the chosen conventions would be appreciated. |
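One way to reason about which `criteria` entry a sidecar would pair with is to replay the comparison offline. The sketch below is not dcm2bids code. It assumes that string criteria are compared with shell-style wildcards (as the dcm2bids documentation describes) and that list values such as `ImageOrientationPatientDICOM` must be equal element by element; real sidecars may hold floats that are not exactly 0 or 1. The helper name `matches` is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(sidecar: dict, criteria: dict) -> bool:
    """Return True when every criterion is satisfied by the sidecar.

    Assumptions (not taken from the dcm2bids source):
      - string criteria are shell-style patterns matched against the whole value
      - non-string criteria (e.g. lists) are compared with plain equality
      - a key missing from the sidecar never matches
    """
    for key, expected in criteria.items():
        if key not in sidecar:
            return False
        actual = sidecar[key]
        if isinstance(expected, str):
            if not fnmatchcase(str(actual), expected):
                return False
        elif actual != expected:
            return False
    return True


if __name__ == "__main__":
    coronal = {
        "SequenceName": "*h2d1_205",
        "ImageOrientationPatientDICOM": [1, 0, 0, 0, 0, -1],
    }
    sidecar = {
        "SequenceName": "*h2d1_205",
        "ImageOrientationPatientDICOM": [1, 0, 0, 0, 0, -1],
    }
    print(matches(sidecar, coronal))
    print(matches({"SequenceName": "*ep_b0_1000"}, {"SequenceName": "*ep_b0"}))
```

Under these assumptions the leading `*` of the Siemens sequence names acts as a wildcard in the pattern, yet `*ep_b0` still does not match `*ep_b0_1000`, because the pattern has to cover the whole string. A sidecar without `SeriesDescription` can never satisfy a `SeriesDescription` criterion, which is consistent with the "No Pairing" messages reported earlier.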
The files contained in the 4 folders (
bids_karo/code/convert_dcm2bids.sh bids_karo/code/dcm2bids_config.json bids_karo/ 20200612_longitudinal/Karolinska_data.1
bids_karo/code/convert_dcm2bids.sh bids_karo/code/dcm2bids_config.json bids_karo/ 20200612_longitudinal/Karolinska_data.2
bids_karo/code/convert_dcm2bids.sh bids_karo/code/dcm2bids_config.json bids_karo/ 20200612_longitudinal/Karolinska_data.3
bids_karo/code/convert_dcm2bids.sh bids_karo/code/dcm2bids_config.json bids_karo/ 20200612_longitudinal/Karolinska_data.4
The metadata was added using the file
Everything is done and stored on
Waiting for review of the conventions and the creation of the git-annex repo. |
I created the repo and gave @plbenveniste write access: |
Some modifications were done in the |
The data was copied from the folder on romane to the git-annex folder using the following command:
cp -a bids-karo/. ms-karolinska-2020/
Useless files were removed (such as tmpdcm2bids). The changes were committed and then pushed to the remote branch. Now ready for review! |
I left some review comments on the pull request: |
@jcohenadad I am not sure if you are seeing the tags when I tagged you here.
|
Indeed, I had missed the tagging (please note that I don't receive any notification when you tag me with this username; because of an issue with my account, I have to log in with |
@mguaypaq The requested modifications were completed. 😃 |
I am currently reviewing the suffixes of the images in the dataset.
     SeriesDescription                    ScanningSequence  SequenceName
0    t2_space_sag_p2_iso_MPR_3mm_tra      SE                *spcR_282ns
1    t2_tse_sag_MS                        SE                *tseR2d1rr19
2    t2_me2d_tra_p2_3mm                   GR                *me2d1r4
3    t1_mpr_ns_sag_1mm_iso_MPR_3mm_tra    GR\IR             *tfl3d1_16ns
4    t2_tse_tra                           SE                *tseR2d1rs17
5    t2_space_dark-fluid_sag_MPR_3mm_tra  SE\IR             *spcir_278ns
8    t1_mpr_ns_sag_1mm_iso_REK_1mm_tra    GR\IR             *tfl3d1_16ns
10   t2_space_dark-fluid_sag_REK_3mm_tra  SE\IR             *spcir_278ns
11   t2_space_sag_p2_iso_REK_3mm_tra      SE                *spcR_282ns
18                                        GR\IR             *tfl3d1_16ns
19                                        GR                *me2d1r4
20                                        SE\IR             *spcir_278ns
21                                        SE                *tseR2d1rr19
22                                        SE                *spcR_282ns
23                                        SE                *tseR2d1rs17
36                                        GR                *fl2d1
72                                        SE\IR             *spcir_257ns
137                                       SE                *h2d1_205
380  t1_mpr_ns_sag_1mm_iso                GR\IR             *tfl3d1_16ns
384  t2_space_sag_p2_iso_REK_tra_3mm      SE                *spcR_282ns
385  t2_space_dark-fluid_sag_REK_tra_3mm  SE\IR             *spcir_278ns |
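A table like the one above can be rebuilt from the JSON sidecars themselves. The following is a minimal sketch, assuming the sidecars are plain JSON files somewhere under one directory (for example the temporary folder that dcm2bids leaves behind; the exact location is an assumption) and that missing fields should show up as empty strings. The function name `summarize_sidecars` is made up for illustration.

```python
import json
from collections import Counter
from pathlib import Path

# The three sidecar fields listed in the table.
FIELDS = ("SeriesDescription", "ScanningSequence", "SequenceName")


def summarize_sidecars(root) -> Counter:
    """Count each distinct (SeriesDescription, ScanningSequence, SequenceName) combination.

    A field that is absent from a sidecar is reported as an empty string,
    so sidecars without SeriesDescription stay visible in the summary.
    """
    counts = Counter()
    for path in sorted(Path(root).rglob("*.json")):
        try:
            sidecar = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed files
        if not isinstance(sidecar, dict):
            continue
        counts[tuple(str(sidecar.get(field, "")) for field in FIELDS)] += 1
    return counts


if __name__ == "__main__":
    import sys

    for combo, n in sorted(summarize_sidecars(sys.argv[1]).items()):
        print(n, *combo, sep="\t")
```

Listing the combinations with an empty `SeriesDescription` shows directly which acquisitions need a different criterion, such as `SequenceName`.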
Excellent, everything looks good, except "36: fl2d1" which is GR\IR instead of GR |
Just a reminder: this was extracted from the JSON sidecar (not written by me). EDIT: It cannot be MPRAGE since the images are anisotropic |
Here are the associated acq and suffix chosen for each:
     SeriesDescription                    ScanningSequence  SequenceName  suffix-acq      suffix-contrast
0    t2_space_sag_p2_iso_MPR_3mm_tra      SE                *spcR_282ns   acq-isoMPR      T2w
1    t2_tse_sag_MS                        SE                *tseR2d1rr19  acq-sagTse      T2w
2    t2_me2d_tra_p2_3mm                   GR                *me2d1r4      acq-ax          T2star
3    t1_mpr_ns_sag_1mm_iso_MPR_3mm_tra    GR\IR             *tfl3d1_16ns  acq-isoMpr      MPRAGE
4    t2_tse_tra                           SE                *tseR2d1rs17  acq-axTse       T2w
5    t2_space_dark-fluid_sag_MPR_3mm_tra  SE\IR             *spcir_278ns  acq-sagDfirMpr  T2w
8    t1_mpr_ns_sag_1mm_iso_REK_1mm_tra    GR\IR             *tfl3d1_16ns  acq-isoRek      MPRAGE
10   t2_space_dark-fluid_sag_REK_3mm_tra  SE\IR             *spcir_278ns  acq-sagDfirRek  T2w
11   t2_space_sag_p2_iso_REK_3mm_tra      SE                *spcR_282ns   acq-isoRek      T2w
18                                        GR\IR             *tfl3d1_16ns  acq-iso         MPRAGE
19                                        GR                *me2d1r4      acq-ax          T2star
20                                        SE\IR             *spcir_278ns  acq-sagDfir     T2w
21                                        SE                *tseR2d1rr19  acq-sagTse      T2w
22                                        SE                *spcR_282ns   acq-iso         T2w
23                                        SE                *tseR2d1rs17  acq-axTse       T2w
36                                        GR\IR             *fl2d1        acq-localizer   T1w
72                                        SE\IR             *spcir_257ns  acq-sagDfir257  T2w
137                                       SE                *h2d1_205                     T2w
380  t1_mpr_ns_sag_1mm_iso                GR\IR             *tfl3d1_16ns  acq-iso         MPRAGE
384  t2_space_sag_p2_iso_REK_tra_3mm      SE                *spcR_282ns   acq-isoRek      T2w
385  t2_space_dark-fluid_sag_REK_tra_3mm  SE\IR             *spcir_278ns  acq-sagDfirRek  T2w
For the Working on the code to implement this now. |
Modifications were done with the script |
Referencing this issue (issue 63) which is linked to the previous pre-processing of the karolinska data. |
Used the command line to rename files from MPRAGE to T1w:
find . -type f -name '*MPRAGE.*' -exec bash -c 'for file; do mv "$file" "${file/MPRAGE./T1w.}"; done' _ {} +
Pushed to git-annex |
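The same rename can be sketched in Python, which makes it possible to preview the changes before touching the dataset. This is an illustration rather than the command that was actually run: the function name `rename_mprage_to_t1w` and its `dry_run` parameter are hypothetical, and only the file name (not the directory part of the path) is rewritten.

```python
from pathlib import Path
from typing import List, Tuple


def rename_mprage_to_t1w(root, dry_run: bool = True) -> List[Tuple[Path, Path]]:
    """Rename files whose name contains 'MPRAGE.' so that it reads 'T1w.' instead.

    Returns the list of (old_path, new_path) pairs. With dry_run=True nothing
    is renamed, so the planned changes can be reviewed first.
    """
    planned = []
    # sorted() materialises the listing before any file is renamed
    for path in sorted(Path(root).rglob("*MPRAGE.*")):
        if not path.is_file():
            continue
        target = path.with_name(path.name.replace("MPRAGE.", "T1w.", 1))
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        planned.append((path, target))
        if not dry_run:
            path.rename(target)
    return planned


if __name__ == "__main__":
    for old, new in rename_mprage_to_t1w(".", dry_run=True):
        print(old, "->", new)
```

Running it first with `dry_run=True` prints the planned renames; passing `dry_run=False` applies them.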
Based on the comment neuropoly/data-management#63 (comment), I am reviewing the suffixes chosen for the Karolinska dataset.
     SeriesDescription                    ScanningSequence  SequenceName  suffix-acq      suffix-contrast
0    t2_space_sag_p2_iso_MPR_3mm_tra      SE                *spcR_282ns   acq-isoMpr      T2w
1    t2_tse_sag_MS                        SE                *tseR2d1rr19  acq-sagTse      T2w
2    t2_me2d_tra_p2_3mm                   GR                *me2d1r4      acq-ax          MEGRE            Changed
3    t1_mpr_ns_sag_1mm_iso_MPR_3mm_tra    GR\IR             *tfl3d1_16ns  acq-isoMpr      MPRAGE
4    t2_tse_tra                           SE                *tseR2d1rs17  acq-axTse       T2w
5    t2_space_dark-fluid_sag_MPR_3mm_tra  SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
8    t1_mpr_ns_sag_1mm_iso_REK_1mm_tra    GR\IR             *tfl3d1_16ns  acq-isoRek      MPRAGE
10   t2_space_dark-fluid_sag_REK_3mm_tra  SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
11   t2_space_sag_p2_iso_REK_3mm_tra      SE                *spcR_282ns   acq-isoRek      T2w
18                                        GR\IR             *tfl3d1_16ns  acq-iso         MPRAGE
19                                        GR                *me2d1r4      acq-ax          MEGRE            Changed
20                                        SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
21                                        SE                *tseR2d1rr19  acq-sagTse      T2w
22                                        SE                *spcR_282ns   acq-iso         T2w
23                                        SE                *tseR2d1rs17  acq-axTse       T2w
36                                        GR\IR             *fl2d1        acq-localizer   T1w
72                                        SE\IR             *spcir_257ns  acq-sag257      FLAIR            Changed
137                                       SE                *h2d1_205                     T2w
380  t1_mpr_ns_sag_1mm_iso                GR\IR             *tfl3d1_16ns  acq-iso         MPRAGE
384  t2_space_sag_p2_iso_REK_tra_3mm      SE                *spcR_282ns   acq-isoRek      T2w
385  t2_space_dark-fluid_sag_REK_tra_3mm  SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
Waiting for review from @jcohenadad and/or @NathanMolinier before applying this to the dataset. |
Shouldn't we use It actually looks like you already changed that here |
Great catch @NathanMolinier ! Thanks for the feedback ! Does the rest look good to you ? |
Maybe we should use |
I used |
And maybe |
Interesting! But If we agree to use |
I think we decided to use MEGRE in this case because of the specific sequence used: which is |
For this one, I kept |
Here is the final version of the suffixes chosen:
     SeriesDescription                    ScanningSequence  SequenceName  suffix-acq      suffix-contrast
0    t2_space_sag_p2_iso_MPR_3mm_tra      SE                *spcR_282ns   acq-isoMpr      T2w
1    t2_tse_sag_MS                        SE                *tseR2d1rr19  acq-sagTse      T2w
2    t2_me2d_tra_p2_3mm                   GR                *me2d1r4      acq-ax          MEGRE            Changed
3    t1_mpr_ns_sag_1mm_iso_MPR_3mm_tra    GR\IR             *tfl3d1_16ns  acq-isoMpr      T1w              Changed
4    t2_tse_tra                           SE                *tseR2d1rs17  acq-axTse       T2w
5    t2_space_dark-fluid_sag_MPR_3mm_tra  SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
8    t1_mpr_ns_sag_1mm_iso_REK_1mm_tra    GR\IR             *tfl3d1_16ns  acq-isoRek      T1w              Changed
10   t2_space_dark-fluid_sag_REK_3mm_tra  SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
11   t2_space_sag_p2_iso_REK_3mm_tra      SE                *spcR_282ns   acq-isoRek      T2w
18                                        GR\IR             *tfl3d1_16ns  acq-iso         T1w              Changed
19                                        GR                *me2d1r4      acq-ax          MEGRE            Changed
20                                        SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
21                                        SE                *tseR2d1rr19  acq-sagTse      T2w
22                                        SE                *spcR_282ns   acq-iso         T2w
23                                        SE                *tseR2d1rs17  acq-axTse       T2w
36                                        GR\IR             *fl2d1        acq-localizer   T1w
72                                        SE\IR             *spcir_257ns  acq-sag257      FLAIR            Changed
137                                       SE                *h2d1_205                     T2w
380  t1_mpr_ns_sag_1mm_iso                GR\IR             *tfl3d1_16ns  acq-iso         T1w              Changed
384  t2_space_sag_p2_iso_REK_tra_3mm      SE                *spcR_282ns   acq-isoRek      T2w
385  t2_space_dark-fluid_sag_REK_tra_3mm  SE\IR             *spcir_278ns  acq-sag         FLAIR            Changed
Thanks for the feedback, everybody! |
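To make the final convention concrete, the sketch below turns a few rows of the table into BIDS-style file names of the form `sub-<label>_ses-<label>[_acq-<label>]_<suffix>`. It is only an illustration: the names `FINAL_SUFFIXES` and `bids_name` are made up, only a subset of rows is copied in, and this is not the content of `update_anat_suffixes.py`. Because one `SequenceName` can map to several `acq` labels, the lookup is keyed on the pair (SeriesDescription, SequenceName), with an empty string when the description is missing.

```python
# (SeriesDescription, SequenceName) -> (acq label, contrast suffix),
# copied from a few rows of the final table; not the full mapping.
FINAL_SUFFIXES = {
    ("t2_space_sag_p2_iso_MPR_3mm_tra", "*spcR_282ns"): ("isoMpr", "T2w"),
    ("t2_me2d_tra_p2_3mm", "*me2d1r4"): ("ax", "MEGRE"),
    ("t1_mpr_ns_sag_1mm_iso_REK_1mm_tra", "*tfl3d1_16ns"): ("isoRek", "T1w"),
    ("t2_space_dark-fluid_sag_REK_3mm_tra", "*spcir_278ns"): ("sag", "FLAIR"),
    ("", "*spcir_257ns"): ("sag257", "FLAIR"),
    ("", "*h2d1_205"): ("", "T2w"),  # no acq label listed for this row
}


def bids_name(participant: str, session: str, acq: str, contrast: str,
              extension: str = ".nii.gz") -> str:
    """Assemble a BIDS-style file name; the acq entity is omitted when empty."""
    parts = [f"sub-{participant}", f"ses-{session}"]
    if acq:
        parts.append(f"acq-{acq}")
    parts.append(contrast)
    return "_".join(parts) + extension


if __name__ == "__main__":
    acq, contrast = FINAL_SUFFIXES[("t2_me2d_tra_p2_3mm", "*me2d1r4")]
    print(bids_name("1875", "M12", acq, contrast))
```

The participant `1875` and session `M12` are taken from the log shown earlier in the thread and serve only as example labels.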
Modifications done:
python code/update_anat_suffixes.py -i ~/update_gitannex/ms-karolinska-2020/
The PR is ready for review 😃 |
The data is stored in duke:mri/karo/20200612_longitudinal
The steps are the following:
@jcohenadad What name should be used for the git-annex repo ? ms-karolinska ?
@mguaypaq Could you create the corresponding git-annex repo ?
This is related to issue 76. Creating this issue here to centralize the work on MS dataset.