Representation of NetCDF data #298
Replies: 5 comments
-
Hi @sandorkertesz, thanks for starting the discussion thread. I want to say that this could be a higher-level discussion than just NetCDF representation. I think the conversation is also relevant for formats such as coverageJSON, which can contain gridded data and could therefore also be represented as a fieldlist. So, if the argument for converting things to fieldlists is consistency, then we should also be trying to convert coverageJSON to fieldlists. I do not think representing things as a fieldlist automatically is a good idea; I think it should be up to the user/application/downstream earthkit method to decide what the representation should be. Also, as you point out, the behaviour in response to NetCDF can differ depending on whether it is a fieldlist or not, which already breaks our consistency. If the consistency is along the lines of "gridded" vs "point/observations", I think this is a very messy (if-ey) game to code, and will not necessarily benefit users. I think we can expect a user to know whether they are handling observation or gridded data, and therefore to choose the visual representation/method of iteration. Similarly, the downstream code we write should also be able to make these decisions.
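As a rough illustration of "the caller decides the representation", here is a minimal sketch. Everything in it is hypothetical: RawSource, as_fieldlist and as_points are made-up names for this example and are not part of earthkit-data. The container picks no default view; the caller asks for the one that suits the data.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class RawSource:
    """Hypothetical container: holds decoded variables, picks no default view."""

    variables: dict[str, list[float]]

    def as_fieldlist(self) -> list[dict[str, Any]]:
        # Gridded view: one entry per variable, like a list of fields.
        return [{"name": name, "values": vals} for name, vals in self.variables.items()]

    def as_points(self) -> list[dict[str, float]]:
        # Observation view: one record per point, like rows of a table.
        names = list(self.variables)
        return [dict(zip(names, row)) for row in zip(*self.variables.values())]


# The caller, who knows what kind of data this is, chooses explicitly.
src = RawSource({"t": [280.0, 281.5], "p": [1000.0, 995.0]})
fields = src.as_fieldlist()
points = src.as_points()
```

The design choice here is only that no branch inside the container guesses "gridded" vs "point"; that decision stays with the calling code.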
-
Thanks @sandorkertesz for opening up this discussion. To play devil's advocate, do we need a default data representation for NetCDF - or any source type for that matter? Would we make things easier by simply not offering a default data representation at all in earthkit-data, or at least only offering something very basic at the top I don't think we can provide a one-size-fits-all data representation that will work for all of our users all of the time, but we can offer them the flexibility to choose the data representation that works for them. I know we already do this with all our Using NetCDF as an example, we know that the community tool of choice for working with NetCDF in Python is xarray, so a lot of users will want to call With this in mind, as suggested by @EddyCMWF, I think it's worth asking this question at a higher level - i.e. what should be the representation of data in earthkit-data? Should the
-
Hi @sandorkertesz, @EddyCMWF and @JamesVarndell! I just want to add my point of view here, but I don't know the internals of earthkit-data that well, so correct me if I'm wrong.
-
I agree with almost everything here, but I'll point out what I believe is significant but has been left out of the discussion so far. I don't see why GRIB being stricter than netCDF is a problem. I can imagine we could extend the concept of "gridspec" to describe gridded data coming from any format. Another significant "non-spatial" case is spectral data. Observations are generally (?) spatial data, unstructured in time and space; in physics they are "events", each describing a time and a place. It might be useful to have a hierarchy starting from Field (base), then SpatialField, SpectralField, ... (still abstract), with FieldLists able to collect base Fields of any type (just like a GRIB file; it is a bit different for netCDF, as the fields there mostly share the same descriptive metadata). Since metadata is easy to transport, it can be attached to every single field in a FieldList, so all formats I know of are taken care of.
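A minimal sketch of the hierarchy described above, to make the idea concrete. It is an illustration under assumptions, not earthkit-data code: the representation() method, the sel() helper and the metadata keys are invented here, and SpatialField and SpectralField are made concrete (rather than abstract, as proposed) only to keep the example short.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Field(ABC):
    """Base field: values plus the metadata that travels with them."""

    values: list[float]
    metadata: dict[str, object] = field(default_factory=dict)

    @abstractmethod
    def representation(self) -> str:
        """Name the space the values live in (hypothetical method)."""


@dataclass
class SpatialField(Field):
    def representation(self) -> str:
        return "spatial"


@dataclass
class SpectralField(Field):
    def representation(self) -> str:
        return "spectral"


class FieldList:
    """Collects base Fields of any concrete type."""

    def __init__(self, fields=()):
        self._fields = list(fields)

    def __len__(self):
        return len(self._fields)

    def __iter__(self):
        return iter(self._fields)

    def sel(self, **criteria):
        # Keep the fields whose own metadata matches every key/value given.
        return FieldList(
            f for f in self._fields
            if all(f.metadata.get(k) == v for k, v in criteria.items())
        )


# Mixed field types in one list, each carrying its own metadata.
fl = FieldList([
    SpatialField([1.0, 2.0], {"param": "t"}),
    SpectralField([0.5], {"param": "t"}),
    SpatialField([3.0], {"param": "u"}),
])
t_fields = fl.sel(param="t")
```

Because each field carries its own metadata, the list never needs to know which format a field came from; selection works the same way on spatial and spectral entries.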
-
Fully agree with you @pmaciel, especially on the SpatialField, SpectralField, etc. and the collection of Fields of any type.
-
This discussion is about the representation of NetCDF data in earthkit-data.
Currently, a NetCDF file is handled in the following way when loaded with from_source("file", path, ...): a NetCDFFieldList or NetCDFReader will be generated. Then we can call to_xarray() on it to convert it into a format we can actually work with.
There are various problems with the approach:
The questions we need to address: