Dimensions other than region and variable are not read from repositories listed in nomenclature.yaml #414

Closed
korsbakken opened this issue Oct 15, 2024 · 2 comments · Fixed by #415
Labels
bug Something isn't working

Comments

@korsbakken
Collaborator

When creating a new DataStructureDefinition object from a path that includes external repositories in a nomenclature.yaml file, only the region and variable dimensions get read from the repository. If I list, e.g., model and scenario in the dimensions list in nomenclature.yaml, I get a ValueError with message "Empty codelist: model, scenario". Those dimensions are present with valid yaml files in the repository, and are read without any problems if I clone the repository to my computer and read its definitions folder locally.

I have pasted the nomenclature.yaml file at the bottom. It will reproduce the problem if put in a folder with empty definitions and mappings subdirectories, and read with dsd = nomenclature.DataStructureDefinition('./definitions').
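The folder layout described above can be set up with a small helper. This is only a sketch: `make_repro_dir` and `try_load` are hypothetical names, not part of nomenclature, and `yaml_text` is assumed to be the nomenclature.yaml content pasted at the bottom of this issue. The load step needs nomenclature installed and network access to the listed repository.

```python
from pathlib import Path


def make_repro_dir(base: Path, yaml_text: str) -> Path:
    """Create empty definitions/ and mappings/ folders plus nomenclature.yaml.

    Returns the path of the definitions folder, which is the path that
    gets passed to DataStructureDefinition in this issue.
    """
    (base / "definitions").mkdir(parents=True, exist_ok=True)
    (base / "mappings").mkdir(parents=True, exist_ok=True)
    (base / "nomenclature.yaml").write_text(yaml_text, encoding="utf-8")
    return base / "definitions"


def try_load(definitions_dir: Path):
    """Attempt the load that fails in this issue.

    nomenclature is imported lazily so that make_repro_dir can be used
    even where the package is not installed.
    """
    import nomenclature

    return nomenclature.DataStructureDefinition(str(definitions_dir))
```

With the affected version, calling `try_load(make_repro_dir(Path("repro"), yaml_text))` should end in the ValueError shown further down.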

I'm not fluent enough in pydantic to diagnose the problem with certainty or to know how to fix it, but I think the problem lies in the definition of config.DataStructureConfig, which defines the definitions field in config.NomenclatureConfig. The class appears to declare only region and variable as valid fields, which I suspect means that the other dimensions under definitions in nomenclature.yaml get discarded. Relevant part of the code here:

class DataStructureConfig(BaseModel):
    """A class for configuration of a DataStructureDefinition

    Attributes
    ----------
    region : RegionCodeListConfig
        Attributes for configuring the RegionCodeList

    """

    region: Optional[RegionCodeListConfig] = Field(default_factory=RegionCodeListConfig)
    variable: Optional[CodeListConfig] = Field(default_factory=CodeListConfig)

    @field_validator("region", "variable", mode="before")
    @classmethod
    def add_dimension(cls, v, info: ValidationInfo):
        return {"dimension": info.field_name, **v}

    @property
    def repos(self) -> dict[str, str]:
        return {
            dimension: getattr(self, dimension).repositories
            for dimension in ("region", "variable")
            if getattr(self, dimension).repositories
        }
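
To illustrate the suspicion, here is a minimal sketch of how pydantic v2 handles input keys that have no declared field. The classes are hypothetical stand-ins, not nomenclature's own, and the sketch assumes the real config classes keep pydantic's default `extra="ignore"` setting, which I have not verified in the source.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ToyCodeListConfig(BaseModel):
    # Hypothetical stand-in for nomenclature's CodeListConfig.
    repository: Optional[str] = None


class ToyDataStructureConfig(BaseModel):
    # Only two dimensions are declared, mirroring the class quoted above.
    region: Optional[ToyCodeListConfig] = Field(default_factory=ToyCodeListConfig)
    variable: Optional[ToyCodeListConfig] = Field(default_factory=ToyCodeListConfig)


class StrictDataStructureConfig(ToyDataStructureConfig):
    # Same fields, but undeclared keys raise instead of being dropped.
    model_config = ConfigDict(extra="forbid")


raw_definitions = {
    "variable": {"repository": "some-repo"},
    "region": {"repository": "some-repo"},
    "model": {"repository": "some-repo"},
    "scenario": {"repository": "some-repo"},
}

# With the default extra="ignore", validation succeeds and the keys
# without a matching field ("model", "scenario") are silently discarded.
parsed = ToyDataStructureConfig.model_validate(raw_definitions)
parsed_keys = sorted(parsed.model_dump())

# With extra="forbid", the same input is rejected, one error per extra key.
try:
    StrictDataStructureConfig.model_validate(raw_definitions)
    strict_error_count = 0
except ValidationError as exc:
    strict_error_count = exc.error_count()
```

If the assumption holds, this would explain why no error is raised at config-parsing time and the failure only shows up later as an empty codelist.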

I'd greatly appreciate it if someone who is familiar with the pydantic definitions and how they are processed could take a look at this relatively soon. At the moment it's a roadblock for deploying a validation tool in an ongoing project.

Here is the content of the nomenclature.yaml file I used:

repositories:
  iamcompact-nomenclature-definitions:
    url: https://github.com/ciceroOslo/iamcompact-nomenclature-definitions.git
    release: validation-ui
dimensions:
  - variable
  - region
  - model
  - scenario
definitions:
  variable:
    repository: iamcompact-nomenclature-definitions
  region:
    repository: iamcompact-nomenclature-definitions
  model:
    repository: iamcompact-nomenclature-definitions
  scenario:
    repository: iamcompact-nomenclature-definitions
mappings:
  repository: iamcompact-nomenclature-definitions

The error trace I get is the following:

File ~/src/repos/iamcompact/iamcompact-nomenclature/.venv/lib/python3.12/site-packages/nomenclature/definition.py:81, in DataStructureDefinition.__init__(self, path, dimensions)
     76     self.__setattr__(
     77         dim, codelist_cls.from_directory(dim, path / dim, self.config)
     78     )
     80 if empty := [d for d in self.dimensions if not getattr(self, d)]:
---> 81     raise ValueError(f"Empty codelist: {', '.join(empty)}")

ValueError: Empty codelist: model, scenario

If I enter debug mode, I get the following content in self.config.dict():

{'dimensions': ['variable', 'region', 'model', 'scenario'],
 'repositories': {'iamcompact-nomenclature-definitions': {'url': 'https://github.com/ciceroOslo/iamcompact-nomenclature-definitions.git',
   'hash': None,
   'release': 'validation-ui',
   'local_path': PosixPath('iamcompact-nomenclature-definitions')}},
 'definitions': {'region': {'dimension': 'region',
   'repositories': {'iamcompact-nomenclature-definitions'},
   'country': False,
   'nuts': None},
  'variable': {'dimension': 'variable',
   'repositories': {'iamcompact-nomenclature-definitions'}}},
 'mappings': {'repositories': {'iamcompact-nomenclature-definitions'}}}

In other words, only variable: and region: have been read from the definitions section of the config file.
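
The same comparison can be made programmatically. The sketch below uses an abridged copy of the dump above and a hypothetical helper, `dropped_dimensions`, that lists the dimensions with no parsed entry under definitions. It is only diagnostic when every dimension is meant to come from a repository, as in this config; a dimension defined purely in local files would also show up here.

```python
# Abridged copy of the self.config.dict() output shown above.
config_dump = {
    "dimensions": ["variable", "region", "model", "scenario"],
    "definitions": {
        "region": {
            "dimension": "region",
            "repositories": {"iamcompact-nomenclature-definitions"},
        },
        "variable": {
            "dimension": "variable",
            "repositories": {"iamcompact-nomenclature-definitions"},
        },
    },
}


def dropped_dimensions(config: dict) -> list[str]:
    """Return dimensions that have no entry in the parsed definitions."""
    return [d for d in config["dimensions"] if d not in config["definitions"]]


missing = dropped_dimensions(config_dump)
print(missing)  # ['model', 'scenario']
```

The result matches the dimensions named in the ValueError.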

@korsbakken korsbakken added the bug Something isn't working label Oct 15, 2024
korsbakken added a commit to ciceroOslo/iamcompact-nomenclature that referenced this issue Oct 15, 2024
This unfortunately leads to an error. Nomenclature thinks the model and
scenario codelists are empty. Possibly a bug in nomenclature, see
nomenclature-iamc issue #414
(IAMconsortium/nomenclature#414)
@danielhuppmann
Member

I think you are right, the current implementation is hard-coded to only import definitions for the region and variable dimensions. We haven't had this use case yet.

Not sure how quickly @dc-almeida or @phackstock can take a look to extend the feature.

@korsbakken
Collaborator Author

Actually, it was simple enough to just add the fields, copying the definition used for the variable field (which doesn't use a custom definition, unlike region). I submitted it as a PR, #415. It would be great if someone could take a look to make sure there isn't some issue here that I'm not aware of, but it seems to fully resolve the issue on my end.
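
For readers following along, the general shape of such a change can be sketched as below. This is not the code from #415. The classes are simplified, hypothetical stand-ins (`SketchCodeListConfig` replaces both CodeListConfig and RegionCodeListConfig), and extending the validator and the `repos` property to the new fields is my assumption about what is needed alongside the added fields.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationInfo, field_validator

# Assumption: these are the dimensions that should be readable from repositories.
DIMENSIONS = ("region", "variable", "model", "scenario")


class SketchCodeListConfig(BaseModel):
    # Simplified, hypothetical stand-in for nomenclature's CodeListConfig.
    dimension: Optional[str] = None
    repositories: set[str] = Field(default_factory=set)


class SketchDataStructureConfig(BaseModel):
    region: Optional[SketchCodeListConfig] = Field(default_factory=SketchCodeListConfig)
    variable: Optional[SketchCodeListConfig] = Field(default_factory=SketchCodeListConfig)
    # New fields, declared the same way as "variable".
    model: Optional[SketchCodeListConfig] = Field(default_factory=SketchCodeListConfig)
    scenario: Optional[SketchCodeListConfig] = Field(default_factory=SketchCodeListConfig)

    @field_validator(*DIMENSIONS, mode="before")
    @classmethod
    def add_dimension(cls, v, info: ValidationInfo):
        # Inject the field name as the dimension before the value is validated.
        return {"dimension": info.field_name, **v}

    @property
    def repos(self) -> dict[str, set[str]]:
        # Report repositories for every declared dimension that has any.
        return {
            dimension: getattr(self, dimension).repositories
            for dimension in DIMENSIONS
            if getattr(self, dimension).repositories
        }


cfg = SketchDataStructureConfig.model_validate(
    {dim: {"repositories": ["iamcompact-nomenclature-definitions"]} for dim in DIMENSIONS}
)
```

In this sketch, `cfg.repos` has an entry for all four dimensions, so model and scenario are no longer dropped at parsing time.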
