Support Kerchunk indices embedded in STAC items #32

TomAugspurger · 2023-10-18T17:22:48Z

stac-utils/xstac#38 is prototyping how we might store Kerchunk indices in STAC items. Storing Kerchunk metadata in STAC items removes the need to put that metadata in some sidecar file: https://tomaugspurger.net/posts/stac-updates/#stac-and-kerchunk.

The high-level goal is to store the metadata needed for Kerchunk under the fields added by the datacube extension. This lets us deduplicate a few fields (like the attrs, and maybe others). I'm not sure whether this is worth doing, because you then need a function to translate between Kerchunk-in-STAC and plain Kerchunk references. But I don't think we should be putting JSON strings like .zarray in the STAC objects, so I think we'll need something like that anyway.
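To make the translation step concrete, here is a rough sketch of the round trip between plain Kerchunk references, where metadata such as .zarray is a JSON-encoded string, and a structured form that could sit in a STAC object. The helper names `decode_refs` and `encode_refs` and the sample reference set are made up for illustration; this is not the xstac API, and it does not attempt the mapping onto datacube extension fields.

```python
import json

# Zarr v2 metadata keys that Kerchunk stores as JSON-encoded strings.
METADATA_SUFFIXES = (".zgroup", ".zarray", ".zattrs")


def decode_refs(refs):
    """Parse JSON-string metadata values into plain dicts (hypothetical helper)."""
    out = {}
    for key, value in refs.items():
        if key.endswith(METADATA_SUFFIXES) and isinstance(value, str):
            out[key] = json.loads(value)
        else:
            out[key] = value  # chunk reference, e.g. [url, offset, length]
    return out


def encode_refs(decoded):
    """Inverse of decode_refs: serialize metadata dicts back to JSON strings."""
    out = {}
    for key, value in decoded.items():
        if key.endswith(METADATA_SUFFIXES) and isinstance(value, dict):
            out[key] = json.dumps(value)
        else:
            out[key] = value
    return out


# A tiny, made-up reference set; the URL and byte range are placeholders.
refs = {
    ".zgroup": json.dumps({"zarr_format": 2}),
    "tas/.zattrs": json.dumps({"_ARRAY_DIMENSIONS": ["time"], "units": "K"}),
    "tas/.zarray": json.dumps(
        {
            "shape": [4],
            "chunks": [4],
            "dtype": "<f4",
            "compressor": None,
            "fill_value": None,
            "filters": None,
            "order": "C",
            "zarr_format": 2,
        }
    ),
    "tas/0": ["s3://example-bucket/tas.nc", 1024, 16],
}

decoded = decode_refs(refs)      # structured metadata, no JSON strings
roundtrip = encode_refs(decoded)  # back to plain Kerchunk-style references
```

Only the metadata keys change shape; chunk references pass through untouched, which is why the translation function can stay small.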

Here's a hacky version of what I have in mind, using this item collection: https://gist.github.com/TomAugspurger/5b5f40c34212b8302e824e66b477062a.

import fsspec
import kerchunk.combine
import pystac
import xarray as xr
import xstac.kerchunk  # prototyped in stac-utils/xstac#38; import the submodule explicitly


class STACKerchunkBackend(xr.backends.BackendEntrypoint):
    open_dataset_parameters = ["filename_or_obj", "drop_variables"]

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        if isinstance(filename_or_obj, (list, pystac.ItemCollection)):
            # One reference set per item, combined along the time dimension.
            refs = [xstac.kerchunk.stac_to_kerchunk(item) for item in filename_or_obj]
            refs2 = kerchunk.combine.MultiZarrToZarr(refs, concat_dims=["time"]).translate()
        else:
            # A single item maps directly to one reference set.
            refs2 = xstac.kerchunk.stac_to_kerchunk(filename_or_obj)

        # Expose the references as a Zarr store through fsspec's reference filesystem.
        mapper = fsspec.filesystem("reference", fo=refs2).get_mapper()
        return xr.open_dataset(mapper, engine="zarr", consolidated=False, drop_variables=drop_variables)


ic = pystac.ItemCollection.from_file("item_collection.json")

ds = xr.open_dataset(list(ic), engine=STACKerchunkBackend, chunks={})
ds


jsignell · 2023-10-30T14:25:06Z

Thanks for pointing me to that, Tom. I guess I should be watching xstac!

jsignell linked a pull request Oct 30, 2023 that will close this issue

Support Kerchunk indices embedded in STAC items #33

Open

jsignell self-assigned this Oct 30, 2023

keewis mentioned this issue Nov 10, 2023

opening STAC elements with assets pointing to kerchunk files #34

Open
