Switch from stackstac to odc-stac? #12

Closed
jsignell opened this issue Mar 10, 2023 · 6 comments · Fixed by #26

Comments

@jsignell
Member

It seems like there is a shift towards using odc-stac rather than stackstac. I'm wondering if that needs to be configurable somehow or if this library should just pick one.

@gadomski
Member

IMO the community is still unsure of which (if either) is The One™. If you do integrate/switch, I would love to hear your thoughts on the comparison between the two w.r.t. ease of integration, ease of use, etc.

@jsignell
Member Author

Yeah, I read through opendatacube/odc-stac#54 and came out the other end thinking that odc-stac probably has more of a future. I'll see if I can come up with any ideas for how to make odc-stac more ergonomic.

@weiji14
Contributor

weiji14 commented Mar 21, 2023

Maybe have both? Currently, stackstac produces an xarray.DataArray whereas odc-stac produces an xarray.Dataset. An xr.DataArray is suited for 2D data + bands, whereas an xr.Dataset is suited for multi-dimensional datasets (e.g. climate model outputs), so slightly different use cases.

With xpystac=0.0.1, you have xr.open_dataset(item_collection, ...) using stackstac in the backend. But realistically, you could swap stackstac for odc-stac to remove the .to_dataset call here:

https://github.com/jsignell/xpystac/blob/65b08c26603b6f64d5fe388973b26c8a29bf16a9/xpystac/core.py#L30

In addition, you could register xr.open_dataarray() to use stackstac instead. Of course, this might need some documentation to be clear that STAC ItemCollections passed to xr.open_dataarray() are stacked using stackstac.stack while those passed to xr.open_dataset() are stacked with odc.stac.load.

@jsignell
Member Author

> In addition, you could register xr.open_dataarray() to use stackstac instead. Of course, this might need some documentation to be clear that STAC ItemCollections passed to xr.open_dataarray() are stacked using stackstac.stack while those passed to xr.open_dataset() are stacked with odc.stac.load.

Oh that is an interesting idea. I wonder if that would feel surprising to the user.

@maawoo

maawoo commented Sep 11, 2023

I just stumbled upon this discussion and wanted to add to @weiji14's comment: another major difference, and in my opinion an important one to consider, is how STAC metadata is parsed into xarray. Quoting from opendatacube/odc-stac#54 (comment):

Access to the original STAC metadata

  • odc-stac doesn't really expose any of that, and there is a fundamental design choice that makes it impossible to do in a general case, but we can certainly add it for special case data loading in the future.
  • stackstac exposes all the metadata fields in the returned xarray, combined with delayed computation enabled by Dask this can be very handy as you can leverage all the xarray conveniences to filter out unwanted data.

Here is an example of what it can look like in practice with a dataset created from https://github.com/SAR-ARD/S1_NRB :

image

Users can then easily filter the array based on the parsed STAC Item properties:

ds_filtered = ds.where((ds['sat:relative_orbit'] == 44), drop=True)

I am working a lot with local, static STAC Catalogs without using an API or database to do the querying beforehand.
@weiji14's suggestion is interesting and could be a bridge between both libraries. I don't think there is a shift to one or the other and I also don't think there will be The One™ anytime soon. I think it's best to not press forward too fast with #26.
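
For a loader that does not carry item properties into xarray, a comparable selection could in principle be made on the STAC Items before stacking. The following is only a minimal sketch under stated assumptions: the items are hand-written plain dictionaries shaped like STAC Item JSON (a `properties` mapping per item), and `filter_items_by_property` is a hypothetical helper, not part of stackstac, odc-stac, or xpystac.

```python
from typing import Any, Dict, Iterable, List


def filter_items_by_property(
    items: Iterable[Dict[str, Any]], key: str, value: Any
) -> List[Dict[str, Any]]:
    """Keep only the items whose ``properties[key]`` equals ``value``.

    Hypothetical helper: items are plain dicts shaped like STAC Item JSON.
    Items without the property are dropped rather than raising an error.
    """
    return [
        item for item in items
        if item.get("properties", {}).get(key) == value
    ]


# Hand-written example items; only the fields used here are included.
items = [
    {"id": "scene-a", "properties": {"sat:relative_orbit": 44}},
    {"id": "scene-b", "properties": {"sat:relative_orbit": 117}},
    {"id": "scene-c", "properties": {}},
]

selected = filter_items_by_property(items, "sat:relative_orbit", 44)
print([item["id"] for item in selected])  # ['scene-a']
```

Unlike the `ds.where(...)` filter above, this happens eagerly in plain Python and outside xarray, so it does not benefit from the xarray conveniences or Dask's delayed computation.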

@jsignell
Member Author

Thank you for commenting! I had reached a similar decision last week and updated #26 to make the stacking library configurable as suggested by @weiji14. I just renamed the PR to indicate that change in functionality.
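
As a rough illustration of what a configurable stacking library could look like, here is a minimal sketch. It is not the code from #26: the function `stack_items`, the `stacking_library` parameter name, and the `loaders` registry are all hypothetical. The only library calls it relies on are `stackstac.stack` and `odc.stac.load`, both named earlier in this thread, plus a `.to_dataset` conversion like the one in the linked `core.py` line. Imports are deferred so that only the chosen library needs to be installed.

```python
import importlib
from typing import Any, Callable, Dict


def _load_with_odc_stac(items: Any, **kwargs: Any) -> Any:
    # odc.stac.load already returns an xarray.Dataset.
    odc_stac = importlib.import_module("odc.stac")
    return odc_stac.load(items, **kwargs)


def _load_with_stackstac(items: Any, **kwargs: Any) -> Any:
    # stackstac.stack returns an xarray.DataArray; splitting it along the
    # "band" dimension is an assumption about how to get a Dataset back.
    stackstac = importlib.import_module("stackstac")
    return stackstac.stack(items, **kwargs).to_dataset(dim="band")


# Hypothetical registry mapping a user-facing name to a loader function.
LOADERS: Dict[str, Callable[..., Any]] = {
    "odc.stac": _load_with_odc_stac,
    "stackstac": _load_with_stackstac,
}


def stack_items(
    items: Any,
    stacking_library: str = "odc.stac",
    loaders: Dict[str, Callable[..., Any]] = LOADERS,
    **kwargs: Any,
) -> Any:
    """Stack STAC items with the requested library (hypothetical API)."""
    try:
        loader = loaders[stacking_library]
    except KeyError:
        raise ValueError(
            f"Unknown stacking library {stacking_library!r}; "
            f"expected one of {sorted(loaders)}"
        ) from None
    return loader(items, **kwargs)
```

Passing the registry as an argument keeps the choice explicit and makes it possible to exercise the dispatch logic with a stand-in loader, without either library installed.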
