Sparse Array in Labels? #691

Open
Mr-Milk opened this issue Aug 27, 2024 · 3 comments
Labels
element: labels 🏷️ (Regions as pixel masks), enhancement ✨ (New feature or request)

Comments

@Mr-Milk

Mr-Milk commented Aug 27, 2024

I wonder if it's possible to support sparse arrays to some degree in Labels. If the label element is a mask, a sparse format could save quite a lot of disk space.

Sparse array status in xarray: pydata/xarray#3213
Documentation: https://docs.xarray.dev/en/latest/user-guide/duckarrays.html
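
To give a rough sense of the potential savings, here is a minimal sketch that compares a dense label mask with a COO-style encoding (coordinates plus values of the non-background pixels). It uses plain NumPy only; the array shape, the two rectangular regions, and the variable names are made up for illustration. It is not a spatialdata or OME-NGFF format, and it measures uncompressed in-memory sizes, so on-disk sizes with chunk compression would differ.

```python
import numpy as np

# Hypothetical 2D label mask: 0 is background, positive integers are region ids.
labels = np.zeros((1000, 1000), dtype=np.uint32)
labels[100:120, 200:230] = 1  # region 1: 20 x 30 pixels
labels[500:510, 500:540] = 2  # region 2: 10 x 40 pixels

# COO-style encoding: keep only the coordinates and values of labelled pixels.
rows, cols = np.nonzero(labels)
rows = rows.astype(np.uint16)  # safe here because both dimensions are < 65536
cols = cols.astype(np.uint16)
values = labels[rows, cols]

dense_bytes = labels.nbytes
sparse_bytes = rows.nbytes + cols.nbytes + values.nbytes
print(dense_bytes, sparse_bytes)  # 4000000 8000

# Round trip: rebuild the dense mask from the sparse encoding.
restored = np.zeros_like(labels)
restored[rows, cols] = values
```

The saving depends entirely on how much of the mask is background; for densely labelled images the coordinate arrays can cost more than the dense array itself.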

@LucaMarconato
Member

Thank you for opening the discussion on sparse arrays for labels. Currently there is no plan to support this, mainly because sparse arrays are not part of the released OME-NGFF specifications (https://ngff.openmicroscopy.org/latest/), nor of the open enhancement proposals (https://github.com/ome/ngff/issues?q=sort%3Aupdated-desc+is%3Aissue+is%3Aopen).

As an alternative that is already available today, I suggest converting the labels to collections of multipolygons (a "shapes" object, represented as a geopandas.GeoDataFrame). You can use the functions to_polygons() and rasterize() from spatialdata for the conversions: to_polygons() to vectorize the labels, and rasterize() to convert the shapes back to a raster.

For a long-term approach, I kindly ask you to also open the discussion in the ngff repo (same link as above).

I hope this helps!

@Mr-Milk
Author

Mr-Milk commented Aug 29, 2024

May I ask why spatialdata needs to follow the OME-NGFF specification?

giovp added the labels enhancement ✨ (New feature or request) and element: labels 🏷️ (Regions as pixel masks) on Sep 6, 2024

@LucaMarconato
Member

We believe that standardizing file formats will, in the long term, make it easier to enable interoperable workflows. Vendors would also have an incentive to produce data directly in a standard format.

Currently, we are still not 100% NGFF compliant because we needed some additional features, but we are working on either contributing some of our ideas to NGFF, or clearly defining in a specification document how we complement features missing from NGFF using external technologies (e.g. GeoParquet).

Back to your question. I think that having sparse arrays in labels would benefit not only the spatial omics community but also the bioimaging community, so I would definitely consider starting the discussion within NGFF, where it can have more visibility.
