Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collection enhancements #53

Merged
merged 12 commits into from
Feb 28, 2024
Merged

Collection enhancements #53

merged 12 commits into from
Feb 28, 2024

Conversation

dchandan
Copy link
Collaborator

  • Adding ability to add collection level assets
  • Adding collection assets to CMIP6_UofT
  • Adding an end date to CMIP6_UofT's temporal extent for better rendering in STAC Browser

Copy link
Collaborator

@fmigneault fmigneault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

STAC Assets are usually intended for referencing to data or metadata directly relating to the data (eg: some configuration file to interpret it, alternate representations, spec-sheet of the satellite or the convention names of the attributes).

For contextual metadata around the item, such as the CMIP6 project and license, the links section seems more appropriate, using standard IANA relation types known even by non-STAC JSON parsers.

The feature (STAC Assets in Collections) is valid by itself, but the CMIP6 collection config should instead use links with the specific changes provided.

@@ -3,6 +3,9 @@
## [Unreleased](https://github.com/crim-ca/stac-populator) (latest)

<!-- insert list items of new changes here -->
- Adding ability to add collection level assets
- Adding collection assets to `CMIP6_UofT`
- Adding an end date to `CMIP6_UofT`'s temporal extent for better rendering in STAC Browser
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show how 2500-01-01 makes it better for rendering?
The current one seems fine to me:

https://redoak.cs.toronto.edu/stac-browser/collections/CMIP6_UofT
image

If the final date of CMIP6 is unknown, it looks like the "until present" is ideal.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "until present" is not ideal because it's not true. The final date for CMIP6 data is known, the maximum a simulation in the archive goes up to is year 2500. When I originally had "null" I was expecting something like STAC browser to show an undetermined/open end date, not something wrong like "until present".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a specific date imposed by CMIP6, then it is fine. I'm not familiar with that granular level of details about it.

Comment on lines 15 to 19
title : License
href : https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/main/LICENSE-CC-BY
description: License for CMIP6 data
media_type: text/plain
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be added in links with rel: license instead of assets:
https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#license

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I took this from your example: https://hirondelle.crim.ca/stac/collections/EuroSAT-subset-train. You have license in assets. Is that wrong?

Copy link
Collaborator

@fmigneault fmigneault Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Looks like a misuse.

The license in links is specifically recommended by STAC (note that EuroSAT-subset-train uses both).
It is also commonly used, for example: https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now updated the assets to links.

Comment on lines 10 to 14
title : Project homepage
href : https://wcrp-cmip.org/cmip-phase-6-cmip6/
description: CMIP6 project homepage
media_type: text/html
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems more appropriate to use links with rel: about.
https://www.iana.org/assignments/link-relations/link-relations.xhtml

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again it seemed from my examination of your implementation that you were using about like links in assets.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fmigneault was this a misuse too?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This should also be in links.

Copy link
Collaborator

@huard huard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporal extent is good for me.
Don't feel qualified to weigh in on the asset vs link discussion.

Copy link
Collaborator

@fmigneault fmigneault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended change, but not blocking.

Comment on lines 145 to 147
collection_links = self.__make_collection_links()
# need to remove the links item other constructing a collection object fails
_ = self._collection_info.pop("links")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo "other" -> "otherwise" ?

I think it would make more sense to delegate all the checks to self.__make_collection_links to avoid spliting the logic in two places.
i.e.:

collection_links = self.__make_collection_links()  # called directly

def __make_collection_links(self):
    links = self._collection_info.pop("links", [])
    for link_name, link_info in self._collection_info["links"].items():
        links.append(pystac.Link(**link_info))
    return links

Same idea for "assets".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good idea. Let me rearrange it that way.

Comment on lines 9 to 19
homepage:
rel: about
title : Project homepage
target : https://wcrp-cmip.org/cmip-phase-6-cmip6/
media_type: text/html
license:
rel: license
title : License
target : https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/main/LICENSE-CC-BY
media_type: text/plain
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use a dict instead of a list if the link name is dropped anyway?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. Done.

@dchandan
Copy link
Collaborator Author

Merging, since it looks like we are good based on your earlier comment.

@dchandan dchandan merged commit d4bc6aa into master Feb 28, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants