Having a custom engine
for open_mfdatatree
#55
Comments
Hi @mraspaud , thanks so much for your interest!
Some initial thoughts:
Tagging @jhamman for his backends expertise too! EDIT: Related to #51 |
Eg
|
+1 on this being the current recommendation. Hierarchical datasets conform to a number of semantic linking conventions and, at least at this point, I would recommend writing custom openers for each dataset/convention. I think we'll learn a lot from the implementation of these custom openers, and as @alexamici mentions in pydata/xarray#1982, there are some emerging standards that we may be able to leverage in some generic openers. |
Hey y'all (@TomNicholas) - we have some custom engines for radar data in our xradar package, where we can read data using the following:

    import xarray as xr
    import xradar
    ds = xr.open_dataset("radar_file.nc", group='sweep_0', engine='cfradial1')

but we cannot use this engine with datatree directly yet, since it is not one of the registered engines:

    import datatree as dt
    dt.open_datatree("radar_file.nc", engine='cfradial1')
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    Cell In [6], line 1
    ----> 1 dt.open_datatree(filename, engine='cfradial1')
    File ~/miniforge3/envs/xradar-dev/lib/python3.10/site-packages/datatree/io.py:60, in open_datatree(filename_or_obj, engine, **kwargs)
         58     return _open_datatree_netcdf(filename_or_obj, engine=engine, **kwargs)
         59 else:
    ---> 60     raise ValueError("Unsupported engine")
    ValueError: Unsupported engine

What is the best way of adding our new engines so we can load these datasets into a datatree? Here is a full example with our working functionality and API |
Hi @mgrover1! Quick Q: If the file is
The most general way would be to extend xarray's backend entrypoint system to support In the meantime I guess we could add another special case to |
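To make the entrypoint idea above a bit more concrete, here is a minimal, standard-library-only sketch of name-based dispatch, as opposed to a hard-coded if/else on the engine string. Everything in it is hypothetical and for illustration only: `register_tree_engine`, `open_tree`, and the placeholder `cfradial1` opener are not part of datatree, xarray, or xradar, and the opener returns plain dicts rather than real datasets.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: engine name -> callable that opens a file and
# returns a mapping of group path -> data. Not a datatree or xarray API.
TreeOpener = Callable[..., Dict[str, Any]]
_TREE_ENGINES: Dict[str, TreeOpener] = {}


def register_tree_engine(name: str) -> Callable[[TreeOpener], TreeOpener]:
    """Decorator that records an opener under the given engine name."""

    def decorator(opener: TreeOpener) -> TreeOpener:
        _TREE_ENGINES[name] = opener
        return opener

    return decorator


def open_tree(filename: str, engine: str, **kwargs: Any) -> Dict[str, Any]:
    """Look up the opener by name and delegate to it."""
    try:
        opener = _TREE_ENGINES[engine]
    except KeyError:
        known = ", ".join(sorted(_TREE_ENGINES)) or "none"
        raise ValueError(
            f"Unsupported engine {engine!r}; registered engines: {known}"
        ) from None
    return opener(filename, **kwargs)


@register_tree_engine("cfradial1")
def _open_cfradial1(filename: str, sweeps: int = 2) -> Dict[str, Any]:
    # Placeholder: a real opener would parse the file. This one only
    # returns a group layout so the dispatch path can be exercised.
    groups: Dict[str, Any] = {"/": {"source": filename}}
    for i in range(sweeps):
        groups[f"/sweep_{i}"] = {}
    return groups


result = open_tree("radar_file.nc", engine="cfradial1")
```

With a registry like this, a third-party package would only need to register its opener under a name; the error for an unknown engine can also list what is available, which is more helpful than a bare "Unsupported engine".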
@TomNicholas - though these files are netCDF, they are a specific type of netCDF (cfradial) that has additional hierarchical metadata, which we then use to parse into groups and such. Also, this is just one of the file types supported by the package. Other readers include |
This issue is related to adding a new backend to open cfradial files (weather radar files). I think there is an implementation here. @mgrover1 or @kmuehlbauer, do you think we can close this? |
This seems to have been resolved so I will go ahead and close this issue. |
Hi @TomNicholas !
I am one of the core devs of satpy (https://github.com/pytroll/satpy), which makes use of xarray/dask to handle satellite data for earth-observing satellites.
In this context, we often have satellite data with different resolutions within the same dataset, so xarray's Dataset can't really be used for these data, because the coords of the different variables don't match. DataTree makes a lot of sense for us.
The satellite data is, more often than not, in some binary format. We read it and convert it to xarray.DataArrays, and I have now started experimenting with placing them in a DataTree by hand.
So it would be really nice if there was an interface for adding custom engines to read that data (multiple files). Did you already consider that? Do you maybe already have an idea on how this would work?
We have been wanting to stick closer to the data model of xarray in our library, and datatree looks like something we could really use :) let's hope we can contribute here, at least with ideas in the future.