fixes to CMIP6 Properties model to allow serialize #22

Merged (1 commit) on Sep 28, 2023

Collaborator

@fmigneault fmigneault commented Sep 27, 2023

Used and tested by crim-ca/weaver#567.

@huard Building on top of your modifications, these are a few patches I needed to apply to make the CMIP6 properties work.

@@ -118,7 +132,7 @@ def apply(self, attrs: Dict[str, Any]) -> None:
             variables : Dictionary mapping variable name to a :class:`Variable`
                 object.
         """
-        self.properties.update(**Properties(**attrs).model_dump_json())
+        self.properties.update(**Properties(**attrs).model_dump())
Collaborator

This one could be a problem, because the json serializer from pystac does not know how to serialize some of the content.
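
To make the two sides of this change concrete, here is a minimal standard-library sketch. It assumes the JSON-style dump returns a string and the dict-style dump returns a mapping that may still hold values such as `datetime`; the `dumped` values are made up for illustration and are not taken from the CMIP6 model. With pydantic v2, `model_dump(mode="json")` is one option for getting a dict of JSON-safe values, though it has not been tried against this code.

```python
import json
from datetime import datetime, timezone

# Hypothetical stand-in for what a dict-style dump might return.
dumped = {
    "cmip6:activity_id": "ScenarioMIP",
    "cmip6:creation_date": datetime(2019, 9, 25, 23, 1, 33, tzinfo=timezone.utc),
}

properties = {}

# 1. A JSON *string* cannot be unpacked with **, which is what the removed line attempted.
try:
    properties.update(**json.dumps({"cmip6:activity_id": "ScenarioMIP"}))
    unpack_error = None
except TypeError as exc:
    unpack_error = exc

# 2. A dict unpacks fine ...
properties.update(**dumped)

# ... but the default JSON encoder rejects the datetime value.
try:
    json.dumps(properties)
    encode_error = None
except TypeError as exc:
    encode_error = exc


def to_json_safe(value):
    """Convert values the default encoder cannot handle (only datetime here)."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


# A `default` hook is one way to make the dict serializable again.
serialized = json.dumps(properties, default=to_json_safe)
```

The same concern applies to any non-primitive value in the dumped properties, not only dates.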

Collaborator

@huard huard left a comment

The extensions/cmip6.py module is still in development and is not yet used in the ingestion mechanism. If you call the populator, it is the code in add_CMIP6.py that gets exercised.

@fmigneault
Collaborator Author

@huard @Nazim-crim @dchandan
We need to align our efforts to resolve the code duplication between https://github.com/crim-ca/stac-populator/blob/arch-changes/implementations/CMIP6-UofT/add_CMIP6.py and https://github.com/crim-ca/stac-populator/blob/arch-changes/STACpopulator/extensions/cmip6.py

IMO, add_CMIP6 is not properly named. It is more of an "NCML to STAC Item converter", because it adds not only the CMIP6 extension but also the CF and Datacube extensions (and maybe more in the future?). I believe the (currently named) add_CMIP6 script should only loop over every NCML to generate STAC Items and STAC Collections from an entrypoint THREDDS URL. To do so, it needs to shuffle around some data attributes, as I did in my sample script:
https://github.com/crim-ca/ncml2stac/blob/main/notebooks/ncml2stac.ipynb
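
A rough sketch of the converter loop described above, to show the shape rather than the real code. Every name here (`cmip6_properties`, `datacube_properties`, `ncml_to_item`, `convert_all`) and the layout of the `attrs` dictionary are assumptions made for illustration; none of them come from stac-populator or the notebook.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Attrs = Dict[str, Any]
Extension = Callable[[Attrs], Dict[str, Any]]


def cmip6_properties(attrs: Attrs) -> Dict[str, Any]:
    """Hypothetical CMIP6 step: prefix global attributes with 'cmip6:'."""
    return {f"cmip6:{key}": value for key, value in attrs.get("global", {}).items()}


def datacube_properties(attrs: Attrs) -> Dict[str, Any]:
    """Hypothetical Datacube step: expose variables and dimensions."""
    return {
        "cube:variables": attrs.get("variables", {}),
        "cube:dimensions": attrs.get("dimensions", {}),
    }


def ncml_to_item(item_id: str, attrs: Attrs, extensions: Iterable[Extension]) -> Dict[str, Any]:
    """Build a bare STAC-Item-like dict by applying each extension in turn."""
    properties: Dict[str, Any] = {}
    for extension in extensions:
        properties.update(extension(attrs))
    return {"type": "Feature", "id": item_id, "properties": properties}


def convert_all(records: Iterable[Tuple[str, Attrs]], extensions: List[Extension]) -> List[Dict[str, Any]]:
    """Loop over (id, attributes) pairs, one per NCML document."""
    return [ncml_to_item(item_id, attrs, extensions) for item_id, attrs in records]


# Made-up attributes standing in for one parsed NCML document.
records = [
    (
        "item-1",
        {
            "global": {"activity_id": "ScenarioMIP"},
            "variables": {"siconc": {"type": "data"}},
            "dimensions": {"time": {"type": "temporal"}},
        },
    )
]
items = convert_all(records, [cmip6_properties, datacube_properties])
```

Keeping each extension as a separate callable means the loop itself does not need to know about CMIP6, which matches the renaming argument made above.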

Preferably, the add_CMIP6 script should import the CMIP6 schemas/validation from extensions/cmip6.py.
This duplicate code should be removed:
https://github.com/crim-ca/stac-populator/blob/arch-changes/implementations/CMIP6-UofT/add_CMIP6.py#L45-L98

Finally, I think we should improve how the populator is called from the command line.
It is redundant to pass both the add_CMIP6.py CLI and the CMIP6.yaml file, since this YAML cannot really be used with any CLI other than add_CMIP6.py. I think there should be a generic CLI that can point at add_CMIP6.py to run a specific "populator".
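
One possible shape for such a generic CLI, as a sketch only. The program name `stac-populator`, the `POPULATORS` registry, the `register` decorator, and the two positional arguments are all hypothetical; the real populator may need different inputs.

```python
import argparse
from typing import Callable, Dict, List, Optional

# Hypothetical registry mapping a populator name to its entry point.
POPULATORS: Dict[str, Callable[[str, str], str]] = {}


def register(name: str):
    """Decorator that records a populator under the given name."""
    def decorator(func: Callable[[str, str], str]) -> Callable[[str, str], str]:
        POPULATORS[name] = func
        return func
    return decorator


@register("CMIP6")
def run_cmip6(catalog_url: str, stac_host: str) -> str:
    # Placeholder body: a real populator would crawl the catalog and post items.
    return f"CMIP6: {catalog_url} -> {stac_host}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="stac-populator")
    parser.add_argument("populator", choices=sorted(POPULATORS), help="which populator to run")
    parser.add_argument("catalog_url", help="entrypoint THREDDS catalog URL")
    parser.add_argument("stac_host", help="STAC API endpoint to populate")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    return POPULATORS[args.populator](args.catalog_url, args.stac_host)
```

Under these assumptions, a call would look like `stac-populator CMIP6 <thredds-catalog-url> <stac-host>`, and the populator-specific settings could live next to the implementation instead of being passed separately.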

@huard huard merged commit 6d32e14 into arch-changes Sep 28, 2023
1 check failed
@huard
Collaborator

huard commented Sep 28, 2023

Totally agree. I started to move things around to differentiate generic functionality from CMIP6 specific features, but didn't want to disrupt things too much without Deepak's input. The arch-changes branch is kind of a sandbox for now.

@fmigneault
Collaborator Author

@huard

I've updated my test notebook: https://github.com/crim-ca/ncml2stac/blob/main/notebooks/ncml2stac.ipynb
It shows a STAC Item JSON with the CMIP6 metadata applied using the patches from branch weaver-repo2cwl-ncml2stac + the additional data attribute shuffling in the notebook itself.

I would like to transfer the "data attribute shuffling" portion over to this repository (under https://github.com/crim-ca/stac-populator/blob/arch-changes/STACpopulator/extensions/cmip6.py), so that I don't have to do it myself.

Once this is all working, it is easy to deploy this operation directly in weaver via the notebook!
See: https://github.com/crim-ca/ncml2stac/blob/main/README.md

@huard
Collaborator

huard commented Sep 29, 2023

I have the feeling all the data attribute shuffling you're doing is already implemented when combining the cmip6 and datacube extensions, maybe with minor adjustments. I'll finish up and post a STAC item here so we can make those final adjustments.

@huard
Collaborator

huard commented Sep 29, 2023

{"type":"FeatureCollection","context":{"limit":10,"returned":1},"features":[{"id":"ScenarioMIP_CCCma_CanESM5_ssp245_r13i1p2f1_SImon_siconc_gn","bbox":[0.049800001084804535,-78.39350128173828,359.99493408203125,89.74176788330078],"type":"Feature","links":[{"rel":"collection","type":"application/json","href":"http://localhost:8880/stac/collections/CMIP6"},{"rel":"parent","type":"application/json","href":"http://localhost:8880/stac/collections/CMIP6"},{"rel":"root","type":"application/json","href":"http://localhost:8880/stac/"},{"rel":"self","type":"application/geo+json","href":"http://localhost:8880/stac/collections/CMIP6/items/ScenarioMIP_CCCma_CanESM5_ssp245_r13i1p2f1_SImon_siconc_gn"},{"rel":"source","href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/catalog/birdhouse/testdata/xclim/cmip6/catalog.xml","type":"text/html","title":"thredds:birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"}],"assets":{"ISO":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/iso/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"},"WCS":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/wcs/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"},"WMS":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/wms/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"},"NCML":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/ncml/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"},"UDDC":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/uddc/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"},"OPENDAP":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"},"HTTPServer":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/fileServer/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"},"NetcdfSubset":{"href":"https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/ncss/birdhouse/testdata/xclim/cmip6/sic_SImon_CCCma-CanESM5_ssp245_r13i1p2f1_2020.nc"}},"geometry":{"type":"Polygon","coordinates":[[[0.049800001084804535,-78.39350128173828],[0.049800001084804535,89.74176788330078],[359.99493408203125,89.74176788330078],[359.99493408203125,-78.39350128173828],[0.049800001084804535,-78.39350128173828]]]},"collection":"CMIP6","properties":{"cmip6:grid":"ORCA1 tripolar grid, 1 deg with refinement to 1/3 deg within 20 degrees of the equator; 361 x 290 longitude/latitude; 45 vertical levels; top grid cell 0-6.19 m","cmip6:realm":["seaIce"],"cmip6:source":"CanESM5 (2019): \naerosol: interactive\natmos: CanAM5 (T63L49 native atmosphere, T63 Linear Gaussian Grid; 128 x 64 longitude/latitude; 49 levels; top level 1 hPa)\natmosChem: specified oxidants for aerosols\nland: CLASS3.6/CTEM1.2\nlandIce: specified ice sheets\nocean: NEMO3.4.1 (ORCA1 tripolar grid, 1 deg with refinement to 1/3 deg within 20 degrees of the equator; 361 x 290 longitude/latitude; 45 vertical levels; top grid cell 0-6.19 m)\nocnBgchem: Canadian Model of Ocean Carbon (CMOC); NPZD ecosystem with OMIP prescribed carbonate chemistry\nseaIce: LIM2","end_datetime":"2020-11-04T12:00:00Z","cmip6:license":"CMIP6 model data produced by The Government of Canada (Canadian Centre for Climate Modelling and Analysis, Environment and Climate Change Canada) is licensed under a Creative Commons Attribution ShareAlike 4.0 International License (https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and at https:///pcmdi.llnl.gov/. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.","cmip6:mip_era":"CMIP6","cmip6:product":"model-output","cmip6:version":"v20190429","cmip6:table_id":"SImon","cube:variables":{"type":{"type":"data","dimensions":["maxStrlen64"],"description":"Sea Ice area type"},"siconc":{"type":"data","unit":"%","dimensions":["time","j","i"],"description":"Sea-Ice Area Percentage (Ocean Grid)"},"latitude":{"type":"auxiliary","unit":"degrees_north","dimensions":["j","i"],"description":"latitude"},"areacello":{"type":"data","unit":"m2","dimensions":["j","i"],"description":"Grid-Cell Area for Ocean Variables"},"longitude":{"type":"auxiliary","unit":"degrees_east","dimensions":["j","i"],"description":"longitude"},"time_bnds":{"type":"data","dimensions":["time","bnds"]},"vertices_latitude":{"type":"data","dimensions":["j","i","vertices"]},"vertices_longitude":{"type":"data","dimensions":["j","i","vertices"]}},"start_datetime":"2019-12-06T12:00:00Z","cmip6:frequency":"mon","cmip6:source_id":"CanESM5","cube:dimensions":{"i":{"axis":"x","type":"spatial","extent":[0,360],"description":["projection_x_coordinate","grid_longitude","projection_x_angular_coordinate"]},"j":{"axis":"y","type":"spatial","extent":[0,291],"description":["projection_y_coordinate","grid_latitude","projection_y_angular_coordinate"]},"time":{"axis":"t","type":"temporal","description":["time"]}},"cmip6:experiment":"update of RCP4.5 based on SSP2","cmip6:grid_label":"gn","cmip6:Conventions":"CF-1.7 CMIP-6.2","cmip6:activity_id":"ScenarioMIP","cmip6:institution":"Canadian Centre for Climate Modelling and Analysis, Environment and Climate Change Canada, Victoria, BC V8P 5C2, Canada","cmip6:source_type":["AOGCM"],"cmip6:tracking_id":"hdl:21.14100/9e4f804b-c161-44fa-acd1-c2e94e220c95","cmip6:variable_id":"siconc","cmip6:creation_date":"2019-09-25T23:01:33Z","cmip6:experiment_id":"ssp245","cmip6:forcing_index":1,"cmip6:physics_index":2,"cmip6:variant_label":"r13i1p2f1","cmip6:institution_id":"CCCma","cmip6:sub_experiment":"none","cmip6:further_info_url":"https://furtherinfo.es-doc.org/CMIP6.CCCma.CanESM5.ssp245.none.r13i1p2f1","cmip6:realization_index":13,"cmip6:sub_experiment_id":"none","cmip6:data_specs_version":"01.00.30","cmip6:nominal_resolution":"100 km","cmip6:initialization_index":1},"stac_version":"1.0.0","stac_extensions":["https://stac-extensions.github.io/datacube/v2.0.0/schema.json"]}]}
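
For anyone skimming the response above, a small sketch that groups an item's properties by their extension prefix makes it easier to see which extensions are actually populated. The `item` below is a hand-trimmed excerpt of the posted feature, and `properties_by_prefix` is a hypothetical helper, not part of stac-populator or pystac.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hand-trimmed excerpt of the feature posted above (most fields omitted).
item = {
    "id": "ScenarioMIP_CCCma_CanESM5_ssp245_r13i1p2f1_SImon_siconc_gn",
    "properties": {
        "cmip6:realm": ["seaIce"],
        "cmip6:variable_id": "siconc",
        "cube:variables": {"siconc": {"type": "data", "unit": "%"}},
        "cube:dimensions": {"time": {"axis": "t", "type": "temporal"}},
        "start_datetime": "2019-12-06T12:00:00Z",
        "end_datetime": "2020-11-04T12:00:00Z",
    },
    "stac_extensions": [
        "https://stac-extensions.github.io/datacube/v2.0.0/schema.json"
    ],
}


def properties_by_prefix(item: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group property names by the text before ':' ('' for core fields)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in item.get("properties", {}):
        prefix = key.split(":", 1)[0] if ":" in key else ""
        groups[prefix].append(key)
    return {prefix: sorted(keys) for prefix, keys in groups.items()}


grouped = properties_by_prefix(item)
```

In the posted item, both `cmip6:` and `cube:` properties are present, while `stac_extensions` lists only the datacube schema.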

@fmigneault
Collaborator Author

fmigneault commented Sep 29, 2023

@huard
Looks good. Only missing the cf and datacube extensions.
This is mostly what my edits for data shuffling were doing. The cmip6 extension was handled properly by your edits.

Edit: I didn't read the STAC Item properly. You got all of the datacube properly defined!
