Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.2.0: switch to nested layout #40

Merged
merged 1 commit into from
Apr 24, 2021
Merged

0.2.0: switch to nested layout #40

merged 1 commit into from
Apr 24, 2021

Conversation

joshmoore
Copy link
Member

This defines the dimension separator (sometimes called
"key separator") for chunks in the NGFF file to be "/"
rather than ".". The benefit of storing hierarchically
is that for very large volumes the number of files in
a directory is limited to shape / chunk for a given
dimension.

close: #29

Review:

(Note: the review URLs use my branch name and therefore will be updated when I push)

Primary change:

Screen Shot 2021-03-27 at 20 25 26

This defines the dimension separator (sometimes called
"key separator") for chunks in the NGFF file to be "/"
rather than ".". The benefit of storing hierarchically
is that for very large volumes the number of files in
a directory is limited to `shape / chunk` for a given
dimension.

close: ome#29
@sbesson
Copy link
Member

sbesson commented Mar 29, 2021

The W3 services diff looks good and makes sense to me thanks @joshmoore.

From the upstream Zarr perspective, this matches the work that has been engaged to use / as the default rather than .. Our latest understanding is that this will becomes the default in Zarr v3 - see zarr-developers/zarr-specs#78 for the original discussion. So in additional to solving concrete use cases, this also brings us closer to the upcoming specification release.

On the implementation side, various efforts have started and will keep happening to prepare tools to update data writing and reading to handle the new separator- see glencoesoftware/bioformats2raw#80 for instance.

My main question for other implementations was the following: what is the most direct way to detect that an image has been stored according to version 0.2 (or later) of this specification and that / should be used as a separator? My expectation would be the version key in the multiscales specification is the most relevant key/value pair here, is that correct?. If so, would it make sense to pulling the subset of the changes in #39 into 0.2. that expands this section of the specification to include a clear "version": "0.2" statement?

@joshmoore
Copy link
Member Author

Practically, I think an initialization of the zarr store will happen to get access to the metadata and then if it's a version that requires special action (e.g. version == 0.1, therefore hard-code . as the separator) a new store will be created with a log statement to the user.

I thought about migrating @constantinpape's changes to this PR, but I wasn't completely sure on their status (i.e. if more are coming). Happy to add those on top of the 0.2/index.bs at any point in the future.

joshmoore added a commit to joshmoore/bioformats2raw that referenced this pull request Mar 29, 2021
This also updates the version added to the metadata
based on the nested flag. In the future, this logic
will need to be more sophisticated.

see: ome/ngff#40
@constantinpape
Copy link
Contributor

I thought about migrating @constantinpape's changes to this PR, but I wasn't completely sure on their status (i.e. if more are coming). Happy to add those on top of the 0.2/index.bs at any point in the future.

I think these changes are done, except for a decision about MUST vs SHOULD for the "axes" field.

I am happy to rebase #39 onto the changes here, or change #39 so that we merge it into this PR instead of master.

@joshmoore
Copy link
Member Author

Going to start merging all the related nested-chunk-storage PRs. In implementing zarr-developers/zarr-python#716, I think I've convinced myself that we will be able to drop this requirement soon, i.e. this specification should be agnostic to the underlying storage, but I'd propose getting this in. Future versions can describe the nested layout as a SHOULD.

@joshmoore joshmoore merged commit af5399f into ome:main Apr 24, 2021
@joshmoore joshmoore deleted the 0_2_0 branch April 24, 2021 09:42
@joshmoore
Copy link
Member Author

@constantinpape : happy to take your wording update on top of 0.2/* as you modify 0.3/*

github-actions bot added a commit that referenced this pull request Apr 24, 2021
0.2.0: switch to nested layout

SHA: af5399f
Reason: push, by @joshmoore

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@constantinpape
Copy link
Contributor

@constantinpape : happy to take your wording update on top of 0.2/* as you modify 0.3/*

Sorry, I was busy with other stuff last week...
I will refactor and split the PRs now, so that we can get the wording update into 0.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hierarchical chunk storage
3 participants