Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lift 5d requirement for images and move multiscales description into spec #39

Closed
wants to merge 4 commits into from

Conversation

constantinpape
Copy link
Contributor

As discussed in #35 this lifts the restriction for images to be 5-dimensional.

Also, I have added a description of the multiscales metadata based on https://forum.image.sc/t/multiscale-arrays-v0-1/37930.
This will allow us to keep track of changes to the metadata better.

Note that I have added the field "axes" in order to specify the names of axes in the dataset; this becomes necessary due to
images not having fixed dimension any more. Note that there is still an ongoing discussion about the allowed names for axes (and also about allowing more than 5 dimensions) in #35 and #28. I have not addressed these issues yet, so for now the names are restricted to "t", "c", "x", "y", "z". Also, this is a SHOULD requirement to not break compatibility with 0.1.

@joshmoore

  • I am still refering to the version with 0.1, should it be bumped due to the changes w.r.t. number of dims?
  • Do I need to render the changes locally or is there some CI that also renders PRs? (Would be very convenient :))

@joshmoore
Copy link
Member

joshmoore commented Mar 14, 2021

I am still refering to the version with 0.1, should it be bumped due to the changes w.r.t. number of dims?

Yes. You can bump to an 0.2.x but if we go through with #29 (comment) it'll likely become an 0.3.x.

  • is there some CI that also renders PRs? (Would be very convenient :))

I haven't build it into the GHA workflow(s) yet (wink, wink) but:

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @constantinpape. A few initial questions largely revolving around the contract for the axes metadata.

latest/index.bs Outdated
It SHOULD contain the field "version", which indicates the version of the
multiscale metadata of this image (current version is 0.1).

It SHOULD contain the field "axes", which is a list of dimension names of the axes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the default behavior if axes is not specified? In particular should the consumer have an expectation with regards to the number of underlying dimensions or try to detect it?

Alternatively, the lifting of the 5D requirement could lead to increasing the requirement for axes at a MUST level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, maybe it's better to change axes to MUST. I went for SHOULD to stay compatible with 0.1; but if hierarchical storage becomes MUST (see #29) 0.1 will be deprecated anyway and we can also promote axes to MUST.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts @joshmoore?

Copy link
Member

@joshmoore joshmoore Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No immediate objection to breaking now along with this loosening of the 5D/strict-ordering version, but it might be nice as a part of this effort to plan out the next 1 or 2 large changes based on #35 and see which are also going to be breaking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No immediate objection to breaking now along with this loosening of the 5D/strict-ordering version, but it might be nice as a part of this effort to plan out the next 1 or 2 large changes based on #35 and see which are also going to be breaking.

Good point. I will try to summarise the current discussion in there later so we can see what else down the line might be breaking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to MUST now, also summarised potential changes: #35 (comment). As far as I can see none of this would be breaking with the current proposal for 0.2.

latest/index.bs Show resolved Hide resolved
latest/index.bs Show resolved Hide resolved
@constantinpape
Copy link
Contributor Author

Yes. You can bump to an 0.2.x but if we go through with #29 (comment) it'll likely become an 0.3.x.

Ok, I will bump to 0.2 then. And looking forward to having hierarchies :).

I haven't build it into the GHA workflow(s) yet (wink, wink) but:

* http://api.csswg.org/bikeshed/?url=https://raw.githubusercontent.com/ome/ngff/9fae8451eba2e3dbec45fe22c4a74bc66750831e/latest/index.bs produces the HTML that merging this PR would generate

* while https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fngff.openmicroscopy.org%2Flatest%2F&doc2=http%3A%2F%2Fapi.csswg.org%2Fbikeshed%2F%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fome%2Fngff%2F9fae8451eba2e3dbec45fe22c4a74bc66750831e%2Flatest%2Findex.bs even produces a diff of the existing latest/index.html to the output of this PR

Cool, I will check it out!

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few thoughts on top of Seb's comments.

latest/index.bs Outdated Show resolved Hide resolved
│ ├── .zarray # All image arrays are 5-dimensional
│ │ # with dimension order (t, c, z, y, x).
│ ├── .zarray # All image arrays are up to 5-dimensional with dimension order (t, c, z, y, x).
│ │ # An image without time dimension has axes (c, z, y, x) etc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a 3D array interpreted as (z, y, x)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we promote "axes" to MUST, then the interpretation always depends on the metadata, i.e. on the values in "axes".

latest/index.bs Outdated Show resolved Hide resolved
latest/index.bs Outdated Show resolved Hide resolved
@jni
Copy link
Contributor

jni commented Mar 17, 2021

Hi all,

Sorry to go around in circles, but I'd prefer relaxing MUST to SHOULD here. In the absence of axis labels, consumers could process things as:

  • 5D: backwards-compatible behaviour, TCZYX
  • Anything else: all spatial dimensions unless time or channel axes are specified by the consumer.

Of course, the latter behaviour is asking for trouble, but that's what SHOULD means.

My two cents. Of course, since backwards compatibility is no object given the versioning scheme, I am not too stressed about this choice. 😊 Thanks @constantinpape for jumping in!

@constantinpape
Copy link
Contributor Author

@jni I would still opt for MUST, because not having "axes" is inviting undefined behaviour and we will also require to have this
in future iterations of the spec, see discussion in #35. So requiring it now is also more future proof.

* Anything else: all spatial dimensions unless time or channel axes are specified _by the consumer_.

Of course, the latter behaviour is asking for trouble, but that's what SHOULD means.

Consumer software can still treat this metadata as a hint and over-ride it or allow the user to override; or even dismiss it completely.

@joshmoore
Copy link
Member

FWIW xarray's use of zarr requires an _ARRAY_DIMENSIONS attribute: http://xarray.pydata.org/en/stable/internals.html#zarr-encoding-specification

see also: pydata/xarray#4118 (comment) cc: @rabernat

@jni
Copy link
Contributor

jni commented Mar 19, 2021

@constantinpape

Consumer software can still treat this metadata as a hint and over-ride it or allow the user to override; or even dismiss it completely.

I'm not worried about consumers, but writers/converters. Many image formats don't have well-defined axis names, so if I want to convert some TIFF or .npy to NGFFv0.2, I now have to make up some axes. Which I guess is fine, zyx is a reasonable guess as I point out above, but it feels yuckier at write time (I'm codifying my guess to disk) than at read time (my guess is ephemeral until I close the software).

@d-v-b
Copy link
Contributor

d-v-b commented Mar 19, 2021

this is a problem for xarray, which it solves by using default names dim_0, dim_1, etc. Those names seem a little more generic than x,y,z

@constantinpape
Copy link
Contributor Author

this is a problem for xarray, which it solves by using default names dim_0, dim_1, etc. Those names seem a little more generic than x,y,z

But the xarray scope is also a more general than NGFF. I think x, y, z, c, y will cover most of the use case AND these changes are just an intermediate step until we can agree on a more general solution (that may also involve allowing more than 5d), see discussion in #35.
We need to make sure though that what we decide here does not require breaking changes in a more general solution, see #35 (comment).

@joshmoore
Copy link
Member

@tlambert03 just pinged me elsewhere that this is a general improvement in terms of readability (e.g. "how does one store an image with only one resolution?" Answer: "as a multiscales of array length = 1")

@joshmoore
Copy link
Member

Are there any other high-level points, or is the discussion now primarily around whether this is a MUST or a SHOULD?

If so, I'm inclined to bump this to 0.3 (due to #40) and get it in as "latest" while we start on implementations.

@constantinpape
Copy link
Contributor Author

Are there any other high-level points, or is the discussion now primarily around whether this is a MUST or a SHOULD?

Yes, I think this is the only open point of discussion.

If so, I'm inclined to bump this to 0.3 (due to #40) and get it in as "latest" while we start on implementations.

👍

@constantinpape
Copy link
Contributor Author

Let's revitalize this PR. I think there are only 2 things to do before this can be merged:

@joshmoore
Copy link
Member

  • Do we classify axes as MUST or SHOULD?

I'll just add a data point to this discussion for now: https://twitter.com/shoyer/status/1379877702254034946

@constantinpape, if you can prepare a commit which is the information from zarr-developers/zarr-specs#50 without the specific bits of this PR, then I'm happy to cherry-pick and get 0.2.0 out the door. Then you can move this PR to 0.3.0.

@constantinpape
Copy link
Contributor Author

Thanks for the reviews everyone.
I will close this PR to not get confused; we will first incorporate the wording changes into 0.2 and then work on adding axes.

@joshmoore
Copy link
Member

This PR has been migrated to #46

Please join the conversation there. It would be good to get a first version of axes naming in place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants