-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lift 5d requirement for images and move multiscales description into spec #39
Conversation
Yes. You can bump to an 0.2.x but if we go through with #29 (comment) it'll likely become an 0.3.x.
I haven't build it into the GHA workflow(s) yet (wink, wink) but:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @constantinpape. A few initial questions largely revolving around the contract for the axes
metadata.
latest/index.bs
Outdated
It SHOULD contain the field "version", which indicates the version of the | ||
multiscale metadata of this image (current version is 0.1). | ||
|
||
It SHOULD contain the field "axes", which is a list of dimension names of the axes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the default behavior if axes
is not specified? In particular should the consumer have an expectation with regards to the number of underlying dimensions or try to detect it?
Alternatively, the lifting of the 5D requirement could lead to increasing the requirement for axes
at a MUST
level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, maybe it's better to change axes
to MUST
. I went for SHOULD to stay compatible with 0.1; but if hierarchical storage becomes MUST
(see #29) 0.1 will be deprecated anyway and we can also promote axes
to MUST
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any thoughts @joshmoore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No immediate objection to breaking now along with this loosening of the 5D/strict-ordering version, but it might be nice as a part of this effort to plan out the next 1 or 2 large changes based on #35 and see which are also going to be breaking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No immediate objection to breaking now along with this loosening of the 5D/strict-ordering version, but it might be nice as a part of this effort to plan out the next 1 or 2 large changes based on #35 and see which are also going to be breaking.
Good point. I will try to summarise the current discussion in there later so we can see what else down the line might be breaking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to MUST now, also summarised potential changes: #35 (comment). As far as I can see none of this would be breaking with the current proposal for 0.2.
Ok, I will bump to 0.2 then. And looking forward to having hierarchies :).
Cool, I will check it out! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few thoughts on top of Seb's comments.
│ ├── .zarray # All image arrays are 5-dimensional | ||
│ │ # with dimension order (t, c, z, y, x). | ||
│ ├── .zarray # All image arrays are up to 5-dimensional with dimension order (t, c, z, y, x). | ||
│ │ # An image without time dimension has axes (c, z, y, x) etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a 3D array interpreted as (z, y, x)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we promote "axes" to MUST, then the interpretation always depends on the metadata, i.e. on the values in "axes".
Hi all, Sorry to go around in circles, but I'd prefer relaxing MUST to SHOULD here. In the absence of axis labels, consumers could process things as:
Of course, the latter behaviour is asking for trouble, but that's what SHOULD means. My two cents. Of course, since backwards compatibility is no object given the versioning scheme, I am not too stressed about this choice. 😊 Thanks @constantinpape for jumping in! |
@jni I would still opt for MUST, because not having "axes" is inviting undefined behaviour and we will also require to have this
Consumer software can still treat this metadata as a hint and over-ride it or allow the user to override; or even dismiss it completely. |
FWIW xarray's use of zarr requires an _ARRAY_DIMENSIONS attribute: http://xarray.pydata.org/en/stable/internals.html#zarr-encoding-specification see also: pydata/xarray#4118 (comment) cc: @rabernat |
I'm not worried about consumers, but writers/converters. Many image formats don't have well-defined axis names, so if I want to convert some TIFF or .npy to NGFFv0.2, I now have to make up some axes. Which I guess is fine, zyx is a reasonable guess as I point out above, but it feels yuckier at write time (I'm codifying my guess to disk) than at read time (my guess is ephemeral until I close the software). |
this is a problem for xarray, which it solves by using default names |
But the xarray scope is also a more general than NGFF. I think x, y, z, c, y will cover most of the use case AND these changes are just an intermediate step until we can agree on a more general solution (that may also involve allowing more than 5d), see discussion in #35. |
@tlambert03 just pinged me elsewhere that this is a general improvement in terms of readability (e.g. "how does one store an image with only one resolution?" Answer: "as a multiscales of array length = 1") |
Are there any other high-level points, or is the discussion now primarily around whether this is a MUST or a SHOULD? If so, I'm inclined to bump this to |
Yes, I think this is the only open point of discussion.
👍 |
Let's revitalize this PR. I think there are only 2 things to do before this can be merged:
|
I'll just add a data point to this discussion for now: https://twitter.com/shoyer/status/1379877702254034946
@constantinpape, if you can prepare a commit which is the information from zarr-developers/zarr-specs#50 without the specific bits of this PR, then I'm happy to cherry-pick and get 0.2.0 out the door. Then you can move this PR to 0.3.0. |
Thanks for the reviews everyone. |
This PR has been migrated to #46 Please join the conversation there. It would be good to get a first version of axes naming in place. |
As discussed in #35 this lifts the restriction for images to be 5-dimensional.
Also, I have added a description of the multiscales metadata based on https://forum.image.sc/t/multiscale-arrays-v0-1/37930.
This will allow us to keep track of changes to the metadata better.
Note that I have added the field "axes" in order to specify the names of axes in the dataset; this becomes necessary due to
images not having fixed dimension any more. Note that there is still an ongoing discussion about the allowed names for axes (and also about allowing more than 5 dimensions) in #35 and #28. I have not addressed these issues yet, so for now the names are restricted to "t", "c", "x", "y", "z". Also, this is a SHOULD requirement to not break compatibility with 0.1.
@joshmoore