Currently, translating HDF5 to Zarr results in a Zarr store whose chunking is identical to the source's. If the source isn't chunked, slicing a small subset of the data performs poorly, since fsspec has to request the full byte range of the single chunk.
Having the flexibility to make smaller requests by splitting large ranges into separate chunks would be helpful, if it's feasible for the backend (which it should be for these large, contiguous buffers from HDF5).
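As a rough sketch of what "splitting large ranges into separate chunks" could mean at the reference level: kerchunk references are `[url, offset, length]` triples, and an uncompressed, contiguous HDF5 buffer can be tiled into smaller contiguous sub-ranges along the outermost axis. The helper name `split_reference` here is purely illustrative, not part of the kerchunk API.

```python
def split_reference(url, offset, length, n_parts):
    """Split one (url, offset, length) byte range into n_parts contiguous
    sub-ranges covering the same bytes.

    For the split to correspond to valid sub-chunks, length must divide
    evenly so each sub-range maps to a whole number of array rows.
    """
    assert length % n_parts == 0, "sub-chunks must tile the buffer exactly"
    part = length // n_parts
    return [[url, offset + i * part, part] for i in range(n_parts)]


# e.g. a dataset stored as one 4096-byte chunk, split into 4 sub-chunks
refs = split_reference("s3://bucket/file.h5", offset=2048, length=4096, n_parts=4)
for r in refs:
    print(r)
```

A reader of a single point would then fetch one 1024-byte sub-range instead of the full 4096 bytes.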
TomAugspurger changed the title from "Add the ability to split chunks" to "Add the ability to split large chunks" on Feb 4, 2022.
NOT when that chunk is compressed with a codec that lacks clear internal blocks (e.g., gzip). Zarr does not support streaming of any sort; it only knows which blocks you want, so you need to be able to cleanly subdivide the whole buffer. That is easy for uncompressed buffers, and possible for block-compressed ones (e.g., Zstd).
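A small stdlib illustration of the distinction above: a byte sub-range of a raw buffer is meaningful on its own, while a slice taken from the middle of a zlib/gzip stream is not a valid stream, so decoding must always start at the head.

```python
import zlib

raw = bytes(range(256)) * 16       # a 4096-byte uncompressed buffer
comp = zlib.compress(raw)

# Uncompressed: any sub-range can be fetched and used independently.
middle = raw[1024:2048]            # valid data, no other bytes needed

# Compressed: a mid-stream slice cannot be decoded on its own.
try:
    zlib.decompress(comp[len(comp) // 2:])
    print("unexpectedly decoded")
except zlib.error:
    print("mid-stream slice is undecodable")
```

Block-compressed codecs such as Zstd sit in between: if the compressor emits independent frames, each frame boundary is a point where the buffer can be cleanly split.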
I wonder whether, in your example, it is just as fast to get the last point of the array?
Here's a Kerchunked file
Timing small reads
Compared with the non-kerchunked version
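The original timing snippets appear to have been elided, but the effect they measured can be sketched with a toy model: a fake remote store that counts bytes transferred per request. Reading one element from a single 4096-byte chunk fetches the whole chunk, while the same read against split sub-chunks fetches only the sub-range needed. `CountingStore` is illustrative, not the fsspec API.

```python
class CountingStore:
    """Toy remote store that records how many bytes each request transfers."""

    def __init__(self, data):
        self.data = data
        self.bytes_fetched = 0

    def get_range(self, offset, length):
        self.bytes_fetched += length
        return self.data[offset:offset + length]


buf = bytes(4096)
store = CountingStore(buf)

# One chunk covering the whole buffer: a point read fetches all 4096 bytes.
store.get_range(0, 4096)
one_chunk = store.bytes_fetched

# Same buffer split into 8 sub-chunks of 512 bytes: a point read touches one.
store.bytes_fetched = 0
store.get_range(3 * 512, 512)
split = store.bytes_fetched

print(one_chunk, split)
```

The real numbers also include per-request latency, which is why splitting helps most when the saved transfer dwarfs the cost of extra HTTP round trips.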