CLI to validate a scenario data file #404

Open
danielhuppmann opened this issue Oct 2, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@danielhuppmann
Member

danielhuppmann commented Oct 2, 2024

To make it easier for non-expert users to check their data against a DataStructureDefinition, we need a simple CLI that takes an input file (IAMC scenario data) and validates it against the definitions.

Basically, this should be a reduced version of cli_run_workflow that takes only an input file (required) and a definitions folder (defaulting to 'definitions'). It should produce no output and only raise errors if validation fails.

I suggest calling the CLI nomenclature validate-scenarios (and the function cli_validate_scenarios).
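
To make the proposed interface concrete, here is a minimal stand-alone sketch of its shape. It uses argparse from the standard library purely for illustration; the package's actual CLI may be built differently. The helper run_validation is a hypothetical placeholder, not part of nomenclature, and it only checks that the paths exist.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Argument parser with the two inputs described in the proposal."""
    parser = argparse.ArgumentParser(prog="nomenclature validate-scenarios")
    parser.add_argument("input_file", type=Path, help="IAMC scenario data file")
    parser.add_argument(
        "--definitions",
        type=Path,
        default=Path("definitions"),
        help="folder with the definitions (default: 'definitions')",
    )
    return parser


def run_validation(input_file: Path, definitions: Path) -> None:
    """Hypothetical placeholder for the actual validation step."""
    if not input_file.is_file():
        raise FileNotFoundError(f"Input file not found: {input_file}")
    if not definitions.is_dir():
        raise NotADirectoryError(f"Definitions folder not found: {definitions}")
    # The real check of the data against the definitions would go here.


def cli_validate_scenarios(argv=None) -> None:
    """Parse the arguments and validate; silent on success, raises on failure."""
    args = build_parser().parse_args(argv)
    run_validation(args.input_file, args.definitions)
```

Calling cli_validate_scenarios(["scenario_data.xlsx"]) would look for a folder named definitions in the working directory and return nothing if all checks pass.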

@phackstock
Contributor

I find the name validate-scenarios a little misleading. I assumed the function validates the values in the scenario column rather than an entire data frame. I'd suggest calling it validate-data.
An optional dimension argument might also be useful: it would default to region, variable but could be extended to scenario, subannual, etc.
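
A rough sketch of how such an optional dimension argument could behave, again with argparse only for illustration. The option name --dimension and the helper parse_dimensions are assumptions. The sketch also assumes that passing the option replaces the defaults rather than adding to them; the comment above leaves that open.

```python
import argparse

# Dimensions validated when the user passes no --dimension option.
DEFAULT_DIMENSIONS = ("region", "variable")


def parse_dimensions(argv=None) -> list:
    """Return the dimensions to validate, falling back to the defaults."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--dimension",
        dest="dimensions",
        action="append",  # the option can be repeated
        default=None,  # None lets us tell "not given" apart from a user choice
        help="dimension to validate (repeatable)",
    )
    args = parser.parse_args(argv)
    # Assumption: explicit dimensions replace the defaults instead of extending them.
    return list(args.dimensions) if args.dimensions else list(DEFAULT_DIMENSIONS)
```

With this shape, --dimension region --dimension variable --dimension scenario would add the scenario column to the check.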

@danielhuppmann
Member Author

Sorry for not being sufficiently clear in my proposal - in my understanding, validate-scenarios should do the following:

  1. check that the scenario data conforms to the codelists (either specified as keyword argument, via nomenclature.yaml, or for all available sub-folders)
  2. check that the data-values satisfy any given DataValidator instances
  3. check that the meta-values satisfy any given MetaValidator instances

#419 implements step 1, with 2 and 3 to be added as we move forward...
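
The three steps above can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not nomenclature's API: rows and meta entries are plain dicts, codelists are lists of allowed names, and the validators are simple callables standing in for DataValidator and MetaValidator instances. All function names are hypothetical.

```python
def check_codelists(rows, codelists):
    """Step 1: every value of a validated dimension must appear in its codelist."""
    errors = []
    for dimension, allowed in codelists.items():
        unknown = sorted({row[dimension] for row in rows} - set(allowed))
        if unknown:
            errors.append(f"{dimension}: unknown items {unknown}")
    return errors


def apply_validators(items, validators):
    """Steps 2 and 3: each validator returns an error message or None."""
    errors = []
    for validator in validators:
        for item in items:
            message = validator(item)
            if message is not None:
                errors.append(message)
    return errors


def validate_scenarios(rows, meta, codelists, data_validators=(), meta_validators=()):
    """Run all three steps and raise a single error listing every failure."""
    errors = check_codelists(rows, codelists)           # step 1
    errors += apply_validators(rows, data_validators)   # step 2
    errors += apply_validators(meta, meta_validators)   # step 3
    if errors:
        raise ValueError("Validation failed:\n" + "\n".join(errors))
```

Collecting all failures before raising is a design choice of this sketch; it lets a user fix every problem in one pass instead of rerunning after each error.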

@phackstock
Contributor

Ah ok, thanks for the clarification. To avoid further misunderstandings: by "added as we move forward", do you mean that #419 will be merged and follow-up PRs will implement 2 and 3, or should this all be in #419?
I'm not fully happy with the name validate-scenarios but if you think it's good, I won't object.
Finally, even though rarely used, we could also include applying RequiredDataValidators.

@danielhuppmann
Member Author

Keep the PRs clean and simple, so I suggest merging #419. Not sure how soon we can move forward with the next steps.

@phackstock
Contributor

Agreed on the strategy of nice, small PRs.
@dc-almeida, in this case you should edit the description of #419 and remove the "closes" keyword, as otherwise this issue will be closed upon merging, which is not what we want.


3 participants