Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update MLM JSON schema to enforce mlm:input band checks as intended #39

Merged
merged 10 commits into from
Sep 27, 2024

Conversation

fmigneault
Copy link
Collaborator

@fmigneault fmigneault commented Sep 20, 2024

Description

Warning

If an existing MLM definition was succeeding with https://github.com/crim-ca/mlm-extension/tree/v1.2.0, it might fail with this revision due to the stricter validation of bands. Older definitions could be considered "incomplete" from the missing "bands" reference if they were not employing other extensions properly (see https://github.com/crim-ca/mlm-extension/tree/fix-comments#bands-and-statistics for details).

Added

Changed

  • Split ModelBands and AnyBandsRef definitions in the JSON schema to allow them to be referenced individually.
  • Move AnyBandsRef definition explicitly to STAC Item JSON schema, rather than implicitly inferred via mlm:input.
  • Modified the JSON schema to use a if check of the type (STAC Item or Collection) prior to validating further
    properties. This allows some validators (e.g. pystac) to better report the real error that causes the schema
    to fail, rather than reporting the first mismatching type case with a poor error description to debug the issue.

Deprecated

  • n/a

Removed

  • Removed $comment entries from the JSON schema that are considered as invalid by some parsers.
  • When mlm:input objects do NOT define band references (i.e.: bands: [] is used), the JSON schema will not
    fail if an Asset with the mlm:model role contains a band definition. This is to allow MLM model definitions to
    simultaneously use some inputs with bands reference names while others do not.

Fixed

  • Band checks against eo, raster or STAC Core 1.1 bands when a mlm:input references names in bands are now properly validated.
  • Fix the examples using raster:bands incorrectly defined in STAC Item properties.
    The correct use is for them to be defined under the STAC Asset using the mlm:model role.
  • Fix the https://github.com/crim-ca/mlm-extension/blob/fix-comments/stac_model/examples.py that incorrectly referenced some bands in its mlm:input definition without providing any definition of those bands. The eo:bands properties have been added to the corresponding model Asset using the pystac.extensions.eo utilities.
  • Fix various STAC Asset definitions erroneously employing mlm:model role instead of the intended mlm:source_code.

Related Issue

Type of Change

  • 📚 Examples, docs, tutorials or dependencies update;
  • 🔧 Bug fix (non-breaking change which fixes an issue);
  • 🥂 Improvement (non-breaking change which improves an existing feature);
  • 🚀 New feature (non-breaking change which adds functionality);
  • 💥 Breaking change (fix or feature that would cause existing functionality to change);
  • 🔐 Security fix.

Checklist

  • I've read the CONTRIBUTING.md guide;
  • I've updated the code style using make check;
  • I've written tests for all new methods and classes that I created;
  • I've written the docstring in Google format for all the methods and classes that I used.

@fmigneault fmigneault self-assigned this Sep 20, 2024
@rbavery rbavery mentioned this pull request Sep 20, 2024
8 tasks
@fmigneault
Copy link
Collaborator Author

@rbavery FYI
I'm still working on this...

I've realized that by enabling the following:

{
"$comment": "Schema to validate cross-references of bands between MLM inputs and any 'bands'-compliant section describing them using another STAC definition.",
"$ref": "#/$defs/AnyBandsRef"
},

... many of the "band" checks across EO/Raster/STAC-core were actually incorrect, as well as some of the provided examples.

For example,

uses raster:bands at the Item properties level, when they MUST be under an Asset according to raster extension.

Similarly, https://github.com/crim-ca/mlm-extension/blob/v1.2.0/stac_model/examples.py defined mlm:input with some bands, but they were never actually validated correctly. I've added the https://github.com/stac-utils/pystac/blob/main/pystac/extensions/eo.py extension wrapper to it, to maek it work correctly with band checks.

@fmigneault fmigneault changed the title remove errorneous '$comment' under JSON object 'properties' Update MLM JSON schema to enforce mlm:input band checks as intended Sep 21, 2024
@fmigneault
Copy link
Collaborator Author

@rbavery
Forgot to ping you again. The PR is ready now.
Let me know if it works with your examples whenever you have some time.

@fmigneault fmigneault linked an issue Sep 26, 2024 that may be closed by this pull request
@fmigneault
Copy link
Collaborator Author

Merging and will pin as v1.3.0 to avoid dealing with the integration before move to https://github.com/stac-extensions/mlm.
If anything doesn't work, a v1.3.1 could be considered, or users can refer to the v1.2.0 in the meantime.

@fmigneault fmigneault merged commit d40ba53 into main Sep 27, 2024
8 checks passed
@fmigneault fmigneault deleted the fix-comments branch September 27, 2024 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

how to interpret pystac oneOf validation error
1 participant