[MISC] Generalize rules that labels must be used consistently #1328

Remi-Gau · 2022-10-17T14:43:59Z

relates to Clarification of task entity in the appendix #1314

src/schema/objects/common_principles.yaml

Remi-Gau · 2022-10-17T14:50:51Z

Something I am not sure about:

no idea how to validate for this
this rule may be too restrictive

For example if we have 2 anat files

sub-01_acq-highRes_T1w.nii
sub-01_acq-highRes_T2w.nii

It may be the case that highRes refers to different things for those different types of suffixes.

src/schema/objects/common_principles.yaml

Co-authored-by: Taylor Salo <[email protected]>

codecov · 2022-10-18T18:30:32Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.79%. Comparing base (1cb92eb) to head (fe5397b).

❗ Current head fe5397b differs from pull request most recent head b165609. Consider uploading reports for the commit b165609 to get more accurate results

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1328   +/-   ##
=======================================
  Coverage   87.79%   87.79%           
=======================================
  Files          16       16           
  Lines        1360     1360           
=======================================
  Hits         1194     1194           
  Misses        166      166

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

sappelhoff

I think we would make our lives easier to require that labels are unique throughout the dataset, not only in individual modalities 🤔 it seems like a huge source of confusion for little benefit otherwise

Remi-Gau · 2023-12-22T10:38:29Z

Probably a PR to revisit early 2024

sappelhoff

I stand by my point in #1328 (review) (require labels to be consistent throughout the dataset, rather than only modalities) ... BUT I think the present changes are more in the spirit of the current spec, and thus I deem this PR mergeable as a nice "refactoring".

(Note: not sure if the [ENH] is fitting, therefore ... maybe [MISC]?)

src/schema/objects/common_principles.yaml

Remi-Gau · 2024-04-19T07:11:21Z

I stand by my point in #1328 (review) (require labels to be consistent throughout the dataset, rather than only modalities) ... BUT I think the present changes are more in the spirit of the current spec, and thus I deem this PR mergeable as a nice "refactoring".

Updated to have consistency across datatypes too.

(Note: not sure if the [ENH] is fitting, therefore ... maybe [MISC]?)

Done

tsalo

The changes look good to me. One thing I do wonder, though, is if we should make it clear that "consistency" is at the discretion of the dataset curator. The distinction between lores and hires might be different for different curators.

For example, one might have T1ws at 1 mm3 and 0.8 mm3, and T2ws at 1 mm3 and 0.6 mm3, and I think it's up to the curator to decide if they was acq-lores (1 mm3), acq-hires (0.8 mm3), and acq-veryhires (0.6 mm3) or just acq-lores (1 mm3) and acq-hires (0.8 mm3 and 0.6 mm3). In both cases "lores" means low resolution and "hires" means high resolution, but the exact definitions differ.

effigies · 2024-04-19T15:30:46Z

I overall approve of the centralization and the spirit of this, but this can lead to a problem:

Validation of this would be computationally painful, even if it were well defined. However, without metadata tags tied to each entity, how would we validate this?
If a tool reads this "MUST", assumes it means something is validated and can be depended on, and decides to optimize by not rereading metadata it thinks it knows, you could end up with invalid processing.

Remi-Gau · 2024-04-19T15:39:25Z

Yeah I was not sure when rereading this about the MUST and also I had my doubt about to even validate this.

Maybe less constraining: turning this into an admonition to tell users they should be careful about consistency (so no SHOULD or MAY but try to bring attention to this in general).

effigies · 2024-08-16T13:57:37Z

Maybe less constraining: turning this into an admonition to tell users they should be careful about consistency (so no SHOULD or MAY but try to bring attention to this in general).

Rereading this after a couple months (looking through the 1.10.0 milestone), I think I agree with this.

yarikoptic · 2025-01-23T15:30:59Z

src/modality-specific-files/microscopy.md

@@ -102,7 +102,8 @@ Consistent with other data types in BIDS, the session entity is optional.

 The [`sample-<label>`](../appendices/entities.md#sample) entity is REQUIRED for
 Microscopy data and is used to distinguish between different samples from the same subject.
-The label MUST be unique per subject and is RECOMMENDED to be unique throughout the dataset.
+Contrary to other labels, the `sample-<label>` MUST be unique per subject
+and is RECOMMENDED to be unique throughout the dataset.


I do not see why I am to be disallowed to use sample-1, sample-2 and so on across my subjects if I do not have any other logical description for samples besides some index (label is superset of index AFAIK).
It seems that the intent here to demand some UUIDs, but I do not see really why here and why only for samples.

Moreover, as we ATM have mandatory sub- leading folder anyways, I do not understand really what "unique per subject" means really.

yarikoptic · 2025-01-23T15:37:01Z

src/schema/objects/common_principles.yaml

+    Labels MUST be consistent across subjects and sessions and data types.
+    For example, an `acq` entity label `hires` used on the `anat` data of a given subject
+    MUST mean the same thing as `acq-hires` in `func` files of any other subject.


MUST is IMHO too strong of a word here since we do not even define the meaning of "mean" in this case -- is it EXACTLY the same resolution? the same shimming? clearly it cannot be "the same everything" since then functional must be an anatomical ;) Other example could be _desc for which we provision descriptions.tsv per each subject/session hence allowing for different "meaning"s (e.g. exact filtering parameters) for the same overall descriptor filt.

Hence

Suggested change

Labels MUST be consistent across subjects and sessions and data types.

For example, an `acq` entity label `hires` used on the `anat` data of a given subject

MUST mean the same thing as `acq-hires` in `func` files of any other subject.

Labels SHOULD be consistent across subjects and sessions and data types.

For example, an `acq` entity label `hires` used on the `anat` data of a given subject

SHOULD mean the same thing as `acq-hires` in `func` files of any other subject.

might be additional factor -- IIRC bids-validator would issue a WARNING whenever "same" (e.g. _task-movie_run_1) files across subjects have different "qualities" (e.g. length), not ERROR. If changed to MUST, it then MUST issue an ERROR rendering some previously valid BIDS datasets invalid, breaking change.

yarikoptic · 2025-01-23T15:45:04Z

Coming late to this, but IMHO

we indeed should indeed strive to generalize e.g. all labels by default SHOULD to be consistent within dataset.
- but then, making some entities "special" adds cognitive burden/complexity/inflexibility etc. (referring to sample here)
BUT also to allow for flexibility to accommodate different "contexts" to allow explicitly define how any particular label/index should be treated

To address similar issue at the level of columns (in particular for those {entity}_id, like participant_id, session_id), I proposed

[ENH] Add ContextURI to allow to define the context for the entity values #1939

to allow for such explicit annotation.

In principle (TODO: find an issue we must have already), for every {entity_single} we could have {entity_plural}.[tsv,json] which would have that {entity_single}_id column (and at least "Description" column) with {entity_plural}.json providing such ContextURI to state that it is either this dataset unique (bids::), or may be even bids:raw: pointing that it is the same as of the original BIDS dataset, and then come up with URI identifiers for other contexts, e.g. to treat values within {} as entity names, hence bids::sub-{subject} could be used to state that it is per-subject entity and bids::sub-{subject}/ses-{session} and alike. And then with Description to actually describe the "meaning" behind the label (relating to my inline comment), even if just for a human -- is it "highres" as the same resolution, or "highres" flavor of T1w (0.2mm) or BOLD (1mm), and so on. task_id "rest" meaning "subjects laying down with eyes closed, allowed to mind wonder, etc", "movie" meaning "watching some movie" etc.

Remi-Gau added 4 commits October 17, 2022 16:29

add consistency definition

0f383a2

rm mention of consistency for other labels

f500e59

rm consistent from schema

0b26a61

update microscopy sample label paragraph

8bea7df

Remi-Gau requested review from tsalo and erdalkaraca as code owners October 17, 2022 14:44

Remi-Gau commented Oct 17, 2022

View reviewed changes

src/schema/objects/common_principles.yaml Outdated Show resolved Hide resolved

tsalo reviewed Oct 18, 2022

View reviewed changes

src/schema/objects/common_principles.yaml Outdated Show resolved Hide resolved

tsalo reviewed Oct 18, 2022

View reviewed changes

src/schema/objects/common_principles.yaml Outdated Show resolved Hide resolved

Remi-Gau and others added 2 commits October 18, 2022 20:24

Update src/schema/objects/common_principles.yaml

9d30948

Co-authored-by: Taylor Salo <[email protected]>

Update src/schema/objects/common_principles.yaml

e4b4365

Co-authored-by: Taylor Salo <[email protected]>

sappelhoff reviewed Oct 23, 2022

View reviewed changes

Remi-Gau added this to the 1.10.0 milestone Dec 22, 2023

Remi-Gau added the consistency Spec is (potentially) inconsistent label Dec 22, 2023

Remi-Gau added 2 commits April 15, 2024 12:51

Merge remote-tracking branch 'upstream/master' into consistency

a360f38

Merge branch 'master' into consistency

fe5397b

sappelhoff approved these changes Apr 18, 2024

View reviewed changes

Remi-Gau commented Apr 19, 2024

View reviewed changes

src/schema/objects/common_principles.yaml Outdated Show resolved Hide resolved

Update src/schema/objects/common_principles.yaml

c8a4844

Remi-Gau changed the title ~~[ENH] Generalize rules that labels must be used consistently~~ [MISC] Generalize rules that labels must be used consistently Apr 19, 2024

Merge branch 'master' into consistency

b165609

sappelhoff requested a review from effigies April 19, 2024 07:49

tsalo approved these changes Apr 19, 2024

View reviewed changes

effigies removed the request for review from erdalkaraca August 16, 2024 13:57

effigies modified the milestones: 1.10.0, 1.11.0 Aug 27, 2024

yarikoptic reviewed Jan 23, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MISC] Generalize rules that labels must be used consistently #1328

[MISC] Generalize rules that labels must be used consistently #1328

Remi-Gau commented Oct 17, 2022 •

edited

Loading

Remi-Gau commented Oct 17, 2022

codecov bot commented Oct 18, 2022 •

edited

Loading

sappelhoff left a comment

Remi-Gau commented Dec 22, 2023

sappelhoff left a comment

Remi-Gau commented Apr 19, 2024

tsalo left a comment •

edited

Loading

effigies commented Apr 19, 2024

Remi-Gau commented Apr 19, 2024

effigies commented Aug 16, 2024

yarikoptic Jan 23, 2025 •

edited

Loading

yarikoptic Jan 23, 2025

yarikoptic Jan 23, 2025

yarikoptic commented Jan 23, 2025

[MISC] Generalize rules that labels must be used consistently #1328

Are you sure you want to change the base?

[MISC] Generalize rules that labels must be used consistently #1328

Conversation

Remi-Gau commented Oct 17, 2022 • edited Loading

Remi-Gau commented Oct 17, 2022

codecov bot commented Oct 18, 2022 • edited Loading

Codecov Report

sappelhoff left a comment

Choose a reason for hiding this comment

Remi-Gau commented Dec 22, 2023

sappelhoff left a comment

Choose a reason for hiding this comment

Remi-Gau commented Apr 19, 2024

tsalo left a comment • edited Loading

Choose a reason for hiding this comment

effigies commented Apr 19, 2024

Remi-Gau commented Apr 19, 2024

effigies commented Aug 16, 2024

yarikoptic Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

yarikoptic Jan 23, 2025

Choose a reason for hiding this comment

yarikoptic Jan 23, 2025

Choose a reason for hiding this comment

yarikoptic commented Jan 23, 2025

Remi-Gau commented Oct 17, 2022 •

edited

Loading

codecov bot commented Oct 18, 2022 •

edited

Loading

tsalo left a comment •

edited

Loading

yarikoptic Jan 23, 2025 •

edited

Loading