-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
move codecs into separate (versioned documents), update urls #187
Conversation
array-extensions/filters can be removed since that functionality is now covered by the |
As also mentioned in #183, I think there is value in keeping the metadata representation concise, because in some cases users may read and write it directly. For example, some zarr implementations (like tensorstore) may just rely on the metadata json representation itself for specifying codecs, rather than defining a separate programming API for each codec. I think we might look at HTML as a model: we have My own preference is that, for example, the gzip codec is specified with just plain "gzip". If we later need to make an incompatible change (unlikely in my opinion), we could then introduce a "gzip2" or "gzip/v2" codec as well. If a backwards-compatible change is added, such as introduction of a new optional field, then we could continue using just "gzip", and implementations that don't support the new field would either fail or ignore it, depending on how the codec is defined. I can see that there is one downside --- with an explicit version number, if you try to read "gzip/v1.3" and the implementation is only aware of "gzip/v1.2", then the error message can say "gzip/v1.3" not supported, latest version supported is "gzip/v1.2", as opposed to "Invalid field 'foobar', array may have been created with a newer version of the gzip codec.". Only codecs defined by external organizations, and that haven't been standardized as part of the zarr specification, would use a longer identifier (e.g. a full URL, possibly without the scheme). It still could make sense to have version numbers for the codec specification itself, so that implementations can specify in their documentation exactly which version of the codec specification they claim to support. The proposed mechanism here seems to be to duplicate the file under a new name with an incremented version number when any change is made. I wonder if we could instead just rely on git to track multiple versions; implementations could then just link to the older spec file on github. There are also various ways to manage multiple versions in the generated documentation. |
@jbms I agree with you in all points, but I still would prefer to split up the codec documents. I just reconciled the purl URLs and would like to change or update them, see #176. How about continuing the URL/id conversation there? I think it should not block this PR, this just cleans up lot's of things to have a clean overview now.
Good point, removing it in this PR as well now. @jbms Would you like to see any further changes except for an ongoing discussion about the purl urls? As explained, I'd rather defer that discussion to #176 for now. |
This PR
related ZEP PR: ZEP1: define which extensions are part of ZEP1 zeps#23
filters
extension which is superseded by thecodecs
list as of Support a list ofcodecs
in place of a singlecompressor
field #153Note: The text of the actual codec specifications is unchanged.