json creation when converting from nnUNetV2 to BIDS #20

Open
tzebre opened this issue Jun 27, 2023 · 8 comments

@tzebre
Member

tzebre commented Jun 27, 2023

Linked to PR #15.
The BIDS format contains a JSON file for each NIfTI file in the dataset.
The JSON file contains 2 fields (see our convention at the end of the Derivatives Structure section):

  • Author
  • Date of creation
{
    "Author": "Sandrine Bedard",
    "Date": "2021-06-08 15:05:54"
}

In the nnunetV2_to_bids.py script, we create a BIDS dataset from an nnUNet dataset.
We have 2 choices in the case of a script-generated BIDS dataset.
Either the script creates the JSON file:

{
    "Author": "nnUNetV2_to_bids.py from nnUNet dataset NAME OF DATASET",
    "Date": "2023-06-27 14:00:51"
}

Or the reviewer creates the JSON when they manually review the file.

Do you have any ideas on what choice we need to implement?
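
To make the first option concrete, here is a minimal sketch of how a conversion script could write the sidecar automatically. It is not taken from nnunetV2_to_bids.py: the function name `write_sidecar` and its parameters are hypothetical, and it assumes the sidecar sits next to the NIfTI file under the same base name.

```python
import json
from datetime import datetime
from pathlib import Path


def write_sidecar(nifti_path, dataset_name, now=None):
    """Write a two-field JSON sidecar next to a NIfTI file (hypothetical helper)."""
    nifti_path = Path(nifti_path)
    now = now or datetime.now()

    # Strip ".nii.gz" or ".nii" so that "x.nii.gz" maps to "x.json".
    name = nifti_path.name
    for ext in (".nii.gz", ".nii"):
        if name.endswith(ext):
            name = name[: -len(ext)]
            break
    sidecar = nifti_path.with_name(name + ".json")

    content = {
        "Author": f"nnUNetV2_to_bids.py from nnUNet dataset {dataset_name}",
        "Date": now.strftime("%Y-%m-%d %H:%M:%S"),
    }
    sidecar.write_text(json.dumps(content, indent=4))
    return sidecar
```

Passing `now` explicitly keeps the output reproducible; when it is omitted, the current local time is used.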

@jcohenadad
Member

jcohenadad commented Jun 27, 2023

Better to have it done automatically, no? Humans make mistakes.

@tzebre
Member Author

tzebre commented Jun 27, 2023

Yes, of course.

In the case the nnUNet dataset is the result of an nnUNet prediction:

  • Letting the reviewer create the JSON ensures that the file is reviewed before the BIDS dataset is complete (if we consider a BIDS dataset without its JSON file to be incomplete).
  • Automatically creating the JSON means that we consider the label from the nnUNet as truth. But if we say in the documentation that the label must be checked before the conversion script, this problem no longer exists.

@jcohenadad
Member

> Automatically creating the JSON means that we consider the label from the nnUNet as truth. But if we say in the documentation that the label must be checked before the conversion script, this problem no longer exists.

Not necessarily. It just means that the label was created by nnUNet. Then, if the label is reviewed by a human, we should concatenate the reviewer(s) to the "nnUNet" entry.

Ex: we could append another structure below:

{
    "Author": "nnUNetV2_to_bids.py from nnUNet dataset NAME OF DATASET",
    "Date": "2023-06-27 14:00:51"
}

{
    "Author": "Donald Trump",
    "Date": "2023-06-28 12:30:01"
}
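
A rough sketch of the appending idea follows. Two JSON objects written one after the other do not form a single valid JSON document, so this sketch stores the entries in a list; that layout, the function name `append_author`, and the placeholder reviewer name in the usage note are assumptions for illustration, not the lab convention.

```python
import json
from datetime import datetime
from pathlib import Path


def append_author(sidecar_path, author, now=None):
    """Append an Author/Date entry to a sidecar, keeping earlier entries."""
    sidecar_path = Path(sidecar_path)
    now = now or datetime.now()

    entries = []
    if sidecar_path.exists() and sidecar_path.read_text().strip():
        loaded = json.loads(sidecar_path.read_text())
        # Accept either a single object (the two-field form) or a list.
        entries = loaded if isinstance(loaded, list) else [loaded]

    entries.append({"Author": author, "Date": now.strftime("%Y-%m-%d %H:%M:%S")})
    sidecar_path.write_text(json.dumps(entries, indent=4))
    return entries
```

For example, `append_author("sub-01_T2w_seg.json", "Reviewer Name")` would keep the existing nnUNet entry first and add the reviewer after it, so the order of entries records the order of edits.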

@tzebre
Member Author

tzebre commented Jun 27, 2023

> Not necessarily. It just means that the label was created by nnUNet. Then, if the label is reviewed by a human, we should concatenate the reviewer(s) to the "nnUNet" entry.
>
> Ex: we could append another structure below:
>
> {
>     "Author": "nnUNetV2_to_bids.py from nnUNet dataset NAME OF DATASET",
>     "Date": "2023-06-27 14:00:51"
> }
>
> {
>     "Author": "Donald Trump",
>     "Date": "2023-06-28 12:30:01"
> }

Yes, really good idea.

@jcohenadad
Member

Tagging @valosekj @sandrinebedard because they are involved in the "neuropoly good-practice SOP for label creation", so that idea could be implemented in our SOP

@valosekj
Member

Thank you for coming up with this relevant question, @tzebre!

A bit of context: currently, we store JSON sidecars for manually corrected label files (SC seg, disc labeling, etc.) to track who performed the manual corrections. If multiple users made corrections, we store all of them (see here).
But when label files are generated purely automatically (for example, SC seg generated by sct_deepseg_sc), we either have no JSON sidecar or we use an empty one during dataset BIDSification.

I think this issue opens a broader question: do we want to generate JSON sidecars for automatically generated files? I would vote for it. We could store the algorithm (e.g., propseg vs deepseg_sc vs nnUNet, etc.) and its version.
I'll put this point on the agenda for the next SCT dev meeting.
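
As a starting point for that discussion, a sidecar entry for an automatically generated file could look like the sketch below. The `Version` field and the function name `provenance_entry` are hypothetical; nothing here reflects an agreed convention.

```python
from datetime import datetime


def provenance_entry(algorithm, version, now=None):
    """Build a sidecar entry that records which tool produced a label."""
    now = now or datetime.now()
    return {
        "Author": algorithm,   # e.g. "sct_deepseg_sc" or "nnUNet"
        "Version": version,    # hypothetical field: version of the tool
        "Date": now.strftime("%Y-%m-%d %H:%M:%S"),
    }
```

An entry built this way has the same `Author` and `Date` fields as a manual-correction entry, so both kinds could share one sidecar.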

@jcohenadad
Member

> I think this issue opens a broader question: do we want to generate JSON sidecars for automatically generated files?

Yes, definitely. This is extremely important for tracking provenance. Storing the algorithm and the version is important.

@valosekj
Member

> > I think this issue opens a broader question: do we want to generate JSON sidecars for automatically generated files?
>
> Yes, definitely. This is extremely important for tracking provenance. Storing the algorithm and the version is important.

Okay! I found that SCT already has an open issue (spinalcordtoolbox/spinalcordtoolbox#3394) about this topic, so the discussion is redirected there.
