Proposal: measurementLists Group as alternative to many measurementList1...measurementList2 indexed groups #115

Merged (12 commits) on Aug 23, 2024

Collaborator

@sstucker sstucker commented Jun 1, 2022

It has been highlighted by #103 that the Indexed Group, described as

Each element of the sub-group is uniquely identified by appending a string-formatted index (starting from 1, with no preceding zeros) in the name, for example, /.../name1 denotes the first sub-group of data element name, and /.../name2 denotes the 2nd element, and so on.

is a wildly inefficient way to structure an HDF5 file.

This draft adds an alternative encoding of the measurementList.

There are a few known issues at this point, such as defining a character to be NaN at the index of channels which lack a particular value.
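To make the two encodings concrete, here is a minimal sketch that collapses indexed groups into one array per field. It uses plain Python dictionaries as stand-ins for HDF5 groups; the function name `indexed_to_arrays`, the use of `detectorGain` as the optional field, and the choice of `math.nan` as the placeholder for a missing value are assumptions for illustration, not something this draft defines.

```python
import math

PREFIX = "measurementList"

def indexed_to_arrays(groups, fields, missing=math.nan):
    """Collapse indexed groups (name1, name2, ...) into one list per field.

    `groups` maps names such as "measurementList1" to a dict of scalar fields.
    `missing` is a hypothetical placeholder for channels that lack a field.
    """
    # Indices start at 1 with no preceding zeros, so sort them numerically.
    indices = sorted(int(name[len(PREFIX):]) for name in groups)
    if indices != list(range(1, len(indices) + 1)):
        raise ValueError("indexed groups must be numbered 1..N without gaps")
    return {
        field: [groups[f"{PREFIX}{i}"].get(field, missing) for i in indices]
        for field in fields
    }

# Three channels stored as indexed groups; the second lacks detectorGain.
indexed = {
    "measurementList1": {"sourceIndex": 1, "detectorIndex": 1, "detectorGain": 2.0},
    "measurementList2": {"sourceIndex": 1, "detectorIndex": 2},
    "measurementList3": {"sourceIndex": 2, "detectorIndex": 1, "detectorGain": 4.0},
}

arrays = indexed_to_arrays(indexed, ["sourceIndex", "detectorIndex", "detectorGain"])
```

With N channels, the indexed form needs N groups, while the array form needs one dataset per field regardless of N.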

@sstucker
Collaborator Author

sstucker commented Jun 1, 2022

I am not in love with "measurementLists" btw...

@samuelpowell
Collaborator

@sstucker thank you for tackling this; the draft captures the proposed changes. Is some more general opening text required to explain the two options?

@sstucker sstucker linked an issue Jun 21, 2022 that may be closed by this pull request
* **Type**: group
* **Location**: `/nirs(i)/data(j)/measurementLists`

The group for measurement list variables, which map the data array onto the probe geometry (sources and detectors), data type, and wavelength. This group's datasets are arrays of size `<number of channels>`, with each position describing the corresponding column in the data matrix (i.e., the values at `measurementLists/sourceIndex(3)` and `measurementLists/detectorIndex(3)` correspond to `dataTimeSeries(:,3)`).
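The position-to-column correspondence can be sketched as follows. The nested lists and the helper names `describe_channel` and `channel_series` are hypothetical in-memory stand-ins for the HDF5 datasets; the helpers take a 1-based column number to match the `(3)` notation in the text, whereas Python lists are 0-based.

```python
# Hypothetical in-memory stand-ins for the datasets under measurementLists.
measurement_lists = {
    "sourceIndex": [1, 1, 2, 2],
    "detectorIndex": [1, 2, 1, 2],
    "wavelengthIndex": [1, 1, 1, 1],
}

# dataTimeSeries is <time points> x <channels>: here 2 time points, 4 channels.
data_time_series = [
    [0.10, 0.20, 0.30, 0.40],
    [0.11, 0.21, 0.31, 0.41],
]

def describe_channel(lists, column):
    """Return the descriptor fields for a 1-based data column."""
    return {name: values[column - 1] for name, values in lists.items()}

def channel_series(data, column):
    """Return the time series stored in a 1-based data column."""
    return [row[column - 1] for row in data]

channel_3 = describe_channel(measurement_lists, 3)
series_3 = channel_series(data_time_series, 3)
```

Here `channel_3` holds the source, detector, and wavelength indices for the third channel, and `series_3` holds the samples from the third column of the data matrix.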
Collaborator

fix wording

@samuelpowell samuelpowell self-assigned this Apr 10, 2023
@Horschig
Collaborator

Given that it's > 1 year for this, should we assume it's not relevant anymore?

@samuelpowell
Collaborator

Issue #103 is still a problem and this solution is a sane way to handle it. I will return to this before the end of the year.

@samuelpowell
Collaborator

This is an old issue but it remains pertinent and it would be nice to get this merged.

Other than resolving the review comments, what needs to be done (beyond merging) to make sure that this is properly supported in, e.g., the validator?

The arrays of `measurementLists` are:

#### /nirs(i)/data(j)/measurementLists/sourceIndex
* **Presence**: required
Collaborator

for all the places in this chunk of code where we say "required", do we need to clarify that this is "required if measurementLists is utilized"?

Collaborator

@dboas dboas Apr 17, 2024

It seems like this could be merged... but @sstucker is not working on this anymore. Can someone else make the fixes in the review comments?

Also, we need a to-do list item to remind us that the validator needs to incorporate this. Do we just create a new issue for getting that done? But the validator is not in this GitHub repository... hmm... I forget who is managing the validator. Actually, my group is managing pysnirf2 at https://github.com/BUNPC/pysnirf2

Collaborator Author

@sstucker sstucker Apr 18, 2024

I can update the validator logic in pysnirf2 as I recall working with files like this and have some other pending work there I want to release. I have the bandwidth for this, no worries

I do not feel comfortable finishing this feature for the specification itself.

There are some issues with this feature as I drafted it here. For instance, how do we deal with the missing values in the table? Should we delineate a column order for all possible channel datasets? Are NaN values supported by HDF5 and portable to all commonly used interfaces?

A more minor note: the measurementLists name only differs by a letter and may be confusing... I still don't like it

Collaborator

@sstucker I am fine with measurementLists and am not confused by this being one letter different than measurementList.
I do wonder why we just didn't utilize the existing measurementList(1) for this as I described way back in #103 (comment), but I can live with measurementLists as I appreciate that it makes the intent of using arrays for describing the measurementList explicit.

@samuelpowell
Collaborator

@sstucker thank you for any help, much appreciated!

Could you clarify where the NaN value situation may arise? What required fields do you envisage being omitted in practice?

@dboas
Collaborator

dboas commented May 19, 2024

@samuelpowell I do hope we can make good progress when we meet in 10 days. It would be great to resolve this soon. I am also now hitting the same issue you described way back in #103 (comment).

@sstucker
Collaborator Author

sstucker commented May 20, 2024

> @sstucker thank you for any help, much appreciated!
>
> Could you clarify where the NaN value situation may arise? What required fields do you envisage being omitted in practice?

There is no rule enforcing that any of the optional fields must be present for all channels in the list. I don't know why this would occur, but it can, strictly speaking.

@samuelpowell
Collaborator

Following discussion in May 24 meeting, it was decided that this feature should be implemented as described.

The issue regarding missing fields, which can arise when, e.g., both processed and raw data are stored together, was also discussed. The conclusion is that a permissive approach is reasonable: when a data field is not required by a data type (e.g. a wavelength index for a chromophore), the value, if present, is ignored. This will be discussed further in #119.
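One way a reader might apply this permissive rule is sketched below. The table `FIELDS_USED_BY_KIND`, the "raw"/"processed" labels, and the function `effective_channel` are hypothetical and only illustrate the idea of ignoring a stored value that the channel's data type does not need; the actual rules are left to #119.

```python
# Hypothetical table: which fields are meaningful for each kind of channel.
FIELDS_USED_BY_KIND = {
    "raw": {"sourceIndex", "detectorIndex", "wavelengthIndex"},
    "processed": {"sourceIndex", "detectorIndex", "dataTypeLabel"},
}

def effective_channel(lists, k, kind):
    """Return only the fields meaningful for channel k (0-based).

    Values stored for fields the channel kind does not use are ignored,
    whatever placeholder the writer happened to put there.
    """
    used = FIELDS_USED_BY_KIND[kind]
    return {name: values[k] for name, values in lists.items() if name in used}

# Raw and processed channels stored together in the same arrays.
lists = {
    "sourceIndex": [1, 1, 1],
    "detectorIndex": [1, 1, 1],
    "wavelengthIndex": [1, 2, 1],       # third entry is a filler value
    "dataTypeLabel": ["", "", "HbO"],   # first two entries are filler values
}

raw_channel = effective_channel(lists, 0, "raw")
chromophore_channel = effective_channel(lists, 2, "processed")
```

The filler `wavelengthIndex` stored for the chromophore channel never reaches the caller, so no special NaN sentinel is needed in this sketch.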

@sstucker will this work as implemented, e.g., will the validator be able to parse **Presence**: required if measurementLists is not present, or is additional work necessary?

@sstucker
Collaborator Author

@samuelpowell It will require some work from me on the validator ahead of the next release, but I can do it. Let's get the ball rolling towards the release of the feature by merging this.

@samuelpowell
Collaborator

Okay, I've addressed one of David's comments, merged master and updated the spell check.

Before proceeding, what do you think of @dboas's question: "for all the places in this chunk of code where we say 'required', do we need to clarify that this is 'required if measurementLists is utilized'?"

@samuelpowell
Collaborator

Further to meeting of 10/07 it was determined that it would be good to include the text "required if measurementLists is present" in order to ensure the spec is literal, however the implementation overhead should be considered.

@sreekanthkura7 to check the validator to determine

  • if there is a required field nested underneath an optional group, is its presence still enforced in the absence of the group, and,
  • what is the complexity of supporting the condition "required if measurementLists is present" (note that in this case the constraint can be ignored if the validator does not enforce required fields under absent optional groups)
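The condition "required if measurementLists is present" can be sketched as a check that only looks at nested fields once the parent group exists. This is not the pysnirf2 implementation; `check_measurement_lists` is a hypothetical helper operating on plain dictionaries, and the tuple of required names is an illustrative subset rather than the full list from the specification.

```python
# Illustrative subset of fields marked "required" under measurementLists.
REQUIRED_IF_PRESENT = ("sourceIndex", "detectorIndex")

def check_measurement_lists(data_group):
    """Return error strings for one /nirs(i)/data(j) group, modelled as a dict."""
    group = data_group.get("measurementLists")
    if group is None:
        # Optional group absent: its nested required fields are not enforced.
        return []
    return [
        f"measurementLists/{name} is required when measurementLists is present"
        for name in REQUIRED_IF_PRESENT
        if name not in group
    ]

errors_absent = check_measurement_lists({"dataTimeSeries": [[0.1, 0.2]]})
errors_partial = check_measurement_lists({"measurementLists": {"sourceIndex": [1, 1]}})
errors_complete = check_measurement_lists(
    {"measurementLists": {"sourceIndex": [1, 1], "detectorIndex": [1, 2]}}
)
```

A validator that already skips required fields under an absent optional group behaves like the early `return []` branch, in which case the extra wording in the spec adds no implementation cost.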

Added to required fields of measurementLists
'required if measurementLists is present'
@dboas
Collaborator

dboas commented Jul 22, 2024

I modified the spec to indicate that required fields of measurementLists are required if measurementLists is present

Sreekanth and I discussed the validator changes needed. He will work with Stephen on the validator once the pull request is merged.

@samuelpowell
Collaborator

I'd like to get #151 merged first, then we can remove the same fields in this PR too. If anyone is able to review it, we can make progress.

@dboas
Collaborator

dboas commented Aug 10, 2024

@sreekanthkura7, #151 has been merged. Can you remove the same fields from this PR as Sam requested in the comment above?

@sreekanthkura7
Collaborator

@dboas, it looks like I don't have permission to edit this. Can someone please do this or give me access to do so?

@samuelpowell
Collaborator

@sreekanthkura7 you should now be able to push commits to this PR

Remove following fields that are removed in PR fNIRS#151.
data.measurementList.moduleIndex
data.measurementList.sourceModuleIndex
data.measurementList.detectorModuleIndex
probe.useLocalIndex
@sreekanthkura7
Collaborator

@samuelpowell The fields that were removed in #151 have now been removed in this PR as well

@samuelpowell
Collaborator

Thanks @sreekanthkura7 I will re-review by Monday

@samuelpowell samuelpowell marked this pull request as ready for review August 19, 2024 16:05
@@ -537,6 +537,119 @@ As described below, optional variables `probe.sourceLabels` and
`probe.detectorLabels` are provided for indicating the instrument specific
label for sources and detectors.

#### /nirs(i)/data(j)/measurementLists
* **Presence**: optional
Collaborator

Should this be 'required if measurementList is not present' for parity with @dboas changes to measurementList(k)?


Must be a 1-D array with length equal to the size of the second dimension of `/nirs(i)/data(j)/dataTimeSeries`. Units are optionally defined in `metaDataTags`.
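The length constraint lends itself to a simple consistency check, sketched here with nested lists in place of HDF5 datasets. The function name `mismatched_lengths` is hypothetical, and the sketch assumes `dataTimeSeries` is stored as `<time points>` rows by `<channels>` columns.

```python
def mismatched_lengths(data_time_series, lists):
    """Map each field whose length differs from the channel count to its length."""
    # The second dimension of dataTimeSeries is the number of channels.
    n_channels = len(data_time_series[0]) if data_time_series else 0
    return {
        name: len(values)
        for name, values in lists.items()
        if len(values) != n_channels
    }

data_time_series = [
    [0.10, 0.20, 0.30, 0.40],
    [0.11, 0.21, 0.31, 0.41],
]
lists = {
    "sourceIndex": [1, 1, 2, 2],
    "detectorIndex": [1, 2, 1],  # one entry short
}

problems = mismatched_lengths(data_time_series, lists)
```

An empty result means every array has one entry per data column.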

#### /nirs(i)/data(j)/measurementLists/moduleIndex
Collaborator

Remove field

@samuelpowell
Collaborator

I think there was some confusion here @sreekanthkura7, the request from myself / @dboas was not to remove the exact same fields as in #151, as these would be removed by merging master into this branch. It was instead to remove the associated fields under measurementLists which no longer existed in measurementList(k). Have added a review which I hope is clear.

@dboas
Collaborator

dboas commented Aug 20, 2024

@sreekanthkura7 I just noticed that the table of contents needs to be updated to link to measurementLists and all the sub-fields

Implement changes to the measurementLists similar to those made in change request #157 on the measurementList.
Update the presence of measurementLists.
Reorder the descriptions of measurementLists for consistency.
Revise the table of contents to include measurementLists.
Update the format summary to include measurementLists.
@sreekanthkura7
Collaborator

@samuelpowell @dboas The changes discussed above have been implemented. Additionally, I updated the measurementLists in the summary table. However, I am uncertain if the type column for the measurementLists was updated correctly. Please review and let me know if any further adjustments are needed.

@sreekanthkura7 sreekanthkura7 merged commit e279629 into fNIRS:master Aug 23, 2024
3 checks passed
@sreekanthkura7
Collaborator

@sstucker, now that this PR is merged, do you still have time to assist with the validator? I'm available to work with you on this as well.

@sstucker
Collaborator Author

@sreekanthkura7 Yes, I can start a draft release of the validator. I will want to align the validator to the entire draft SNIRF release, so it would be good to get a changelog and draft release notes for the next SNIRF online soon.

Successfully merging this pull request may close these issues.

Overhead of channel descriptor groups
5 participants