Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read dimensions other than variable and region from external repo #415

Conversation

korsbakken
Copy link
Collaborator

This PR fixes #414, as far as I can see. All tests pass, the error cited in #414 goes away, and the resulting DataStructureDefinition object has the expected dimensions.

If possible, it would be great if this PR could be merged as soon as possible, since it's a bug that can cause significant problems. Some possible items that might be considered in the future (none of which I have the capacity to work on in the next few months):

  • Add a test. A proper test for this issue would probably be an integration test, which tries to read from a test repo that has all the valid dimensions and a non-empty codelist for each them, and checks that the resulting DataStructureDefinition object has all the expected dimensions and the expected codes in each codelist. As far as I can see, there is no such test currently?
  • This PR still hardcodes the dimensions to be in {'region', 'variable', 'model', 'scenario'}. The question is whether other arbitrary dimensions should be allowed. But at the moment, I think that would cause issues multiple other places in the code, so probably goes far beyond the scope of this PR.

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised that line 139 doesn't have to be changed as well?

@korsbakken
Copy link
Collaborator Author

I'm surprised that line 139 doesn't have to be changed as well?

Yes, it does for crossing t's and dotting i's, thanks for flagging that. It doesn't really affect the final DataStructureDefinition object, but it causes the DataStructureConfig object to lack those dimensions in its repos attribute. It doesn't look like that issue propagates to anywhere in this case, but it might cause problems in other circumstances?

Also, I added the extra dimensions in the add_dimension field validator. Again, I don't have a lot of pydantic expertise. But I assume not including them there might mean that formatting errors under model and scenario in the yaml file might not be caught.

Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the useful addition @korsbakken.
To address the two points you mentioned:

  1. I agree that a test with all dimensions would be great. Normally I would insist on adding a test now, but since this feature expands on an existing, tested feature, it's fine to be merged from my side. We should open an issue though for adding a test so that we don't forget.
  2. In the DataStructureDefinition class we are actually using what you are suggesting in that we create "dimension" attributes on the fly to allow for arbitrary dimensions. I am not super happy with this since it's potentially confusing for people reading the code and it's not compatible with type checkers. I prefer the explicit listing of the dimensions in the class attributes.
    What you could do is replace the explicit listing in the validator with "*". Although I'm also not sure that's the best idea. Following the zen of python "explicit is better than implicit", I'd be fine to keep it.

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @korsbakken, looks good!

@danielhuppmann danielhuppmann merged commit 7913ff1 into IAMconsortium:main Oct 17, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dimensions other than region and variable are not read from repositories listed in nomenclature.yaml
3 participants