Add docstrings and docs pages for `RequiredDataValidator` and `DataValidator` (#470)

* Add docstrings and docs pages for `RequiredDataValidator` and `DataValidator`
* Update docs/user_guide/data-validation.rst
  Co-authored-by: Daniel Huppmann <[email protected]>
* Update docs/user_guide/data-validation.rst
  Co-authored-by: Daniel Huppmann <[email protected]>
* Update docs/user_guide/data-validation.rst
  Co-authored-by: Daniel Huppmann <[email protected]>
* Update docs/user_guide/data-validation.rst
  Co-authored-by: Daniel Huppmann <[email protected]>
* Create subpages and ToC
* Change title to Validation
* Update docs/user_guide/validation/data-validation.rst
  Co-authored-by: Daniel Huppmann <[email protected]>
* Update docs/user_guide/validation/required-data-validation.rst
  Co-authored-by: Daniel Huppmann <[email protected]>

---------

Co-authored-by: David Almeida <[email protected]>
Co-authored-by: Daniel Huppmann <[email protected]>
1 parent e3541cc, commit 4e5d9a9
Showing 7 changed files with 160 additions and 1 deletion.
@@ -0,0 +1,17 @@
.. _validation:

.. currentmodule:: nomenclature

Validation
==========

The **nomenclature** package allows users to validate IAMC data in several ways.

For this, validation requirements and criteria can be specified in YAML configuration
files.

.. toctree::
   :maxdepth: 1

   validation/data-validation
   validation/required-data-validation
@@ -0,0 +1,64 @@
.. _data-validation:

.. currentmodule:: nomenclature

Data validation
===============

**Data validation** checks if timeseries data values are within specified ranges.

Consider the example below:

.. code:: yaml

  - variable: Primary Energy
    year: 2010
    validation:
      - upper_bound: 5
        lower_bound: 1
      - warning_level: low
        upper_bound: 2.5
        lower_bound: 1
  - variable: Primary Energy|Coal
    year: 2010
    value: 5
    rtol: 2
    atol: 1

Each criteria item contains **data filter arguments** and **validation arguments**.

Data filter arguments include: ``model``, ``scenario``, ``region``, ``variable``,
``unit``, and ``year``.
For the first criteria item, the data is filtered for variable *Primary Energy*
and year 2010.

The ``validation`` arguments include: ``upper_bound``/``lower_bound`` *or*
``value``/``rtol``/``atol`` (relative tolerance, absolute tolerance). Only one
of the two forms can be set for each ``warning_level``.
The possible levels are: ``error``, ``high``, ``medium``, or ``low``.
Multiple warning levels, each with its own criteria, can be set for the same
data filters. These must be listed in descending order of severity, otherwise a
``ValidationError`` is raised.
In the example, the validation arguments of the first criteria item are set
for warning level ``error`` (the default if ``warning_level`` is omitted) and
``low``, using bounds.
Datapoints flagged at one level are skipped for lower-severity levels in the same
criteria item (e.g., if datapoints are flagged for the ``error`` level, they are
not checked again for ``low``).
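The rules above can be illustrated with a small, self-contained sketch in plain Python. It is not the implementation used by **nomenclature**: the ``Criterion`` class and the ``check_order`` and ``flag`` helpers are hypothetical, the sample values are made up, and a plain ``ValueError`` stands in for the ``ValidationError`` mentioned above. Two further assumptions are labeled in the code: bounds are treated as inclusive, and a ``value``/``rtol``/``atol`` entry is converted to bounds with a tolerance of ``|value| * rtol + atol``.

```python
from dataclasses import dataclass
from typing import Optional

# Severity order taken from the text above: most severe first.
SEVERITY = ["error", "high", "medium", "low"]


@dataclass
class Criterion:
    """One entry of a ``validation`` list (hypothetical helper)."""

    warning_level: str = "error"  # the text states that "error" is the default
    upper_bound: Optional[float] = None
    lower_bound: Optional[float] = None
    value: Optional[float] = None
    rtol: float = 0.0
    atol: float = 0.0

    def bounds(self):
        """Return (lower, upper); accept bounds or value/rtol/atol, not both."""
        has_bounds = self.upper_bound is not None or self.lower_bound is not None
        if has_bounds and self.value is not None:
            raise ValueError("set either bounds or value/rtol/atol, not both")
        if self.value is not None:
            # ASSUMPTION: the tolerance combines as |value| * rtol + atol.
            tolerance = abs(self.value) * self.rtol + self.atol
            return self.value - tolerance, self.value + tolerance
        lower = float("-inf") if self.lower_bound is None else self.lower_bound
        upper = float("inf") if self.upper_bound is None else self.upper_bound
        return lower, upper


def check_order(criteria):
    """Reject lists that are not in descending order of severity."""
    ranks = [SEVERITY.index(c.warning_level) for c in criteria]
    if ranks != sorted(ranks):
        raise ValueError("warning levels must be in descending order of severity")


def flag(values, criteria):
    """Map each out-of-range datapoint key to the most severe level it violates."""
    check_order(criteria)
    flagged = {}
    for criterion in criteria:
        lower, upper = criterion.bounds()
        for key, value in values.items():
            if key in flagged:
                continue  # already flagged at a more severe level
            # ASSUMPTION: bounds are inclusive.
            if not lower <= value <= upper:
                flagged[key] = criterion.warning_level
    return flagged


# Criteria of the first item in the YAML example above.
criteria = [
    Criterion(upper_bound=5, lower_bound=1),
    Criterion(warning_level="low", upper_bound=2.5, lower_bound=1),
]
# Made-up values for "Primary Energy" in 2010, keyed by scenario name.
values = {"scen_a": 6.0, "scen_b": 3.0, "scen_c": 2.0}
result = flag(values, criteria)
print(result)
```

In this sketch, ``scen_a`` violates the ``error`` bounds and is therefore not checked against the ``low`` bounds, while ``scen_b`` passes the ``error`` check and is flagged only at level ``low``.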

The second criteria item (for variable *Primary Energy|Coal*) uses the old notation.
This notation is deprecated because it is more verbose (each warning level requires
a separate criteria item) and slower to process.

Standard usage
--------------

Run the following in a Python script to check that an IAMC dataset has valid data.

.. code-block:: python

  from nomenclature.processor import DataValidator

  # ...setting directory/file paths and loading dataset

  DataValidator.from_file(data_val_yaml).apply(df)
@@ -0,0 +1,43 @@
.. _required-data-validation:

.. currentmodule:: nomenclature

Required data validation
========================

**Required data validation** checks if certain models, variables, regions and/or
periods of time are covered in the timeseries data.

For this, a configuration file specifies the model(s) and dimension(s) expected
in the dataset. These are ``variable``, ``region`` and/or ``year``.
Alternatively, instead of using ``variable``, it is possible to declare measurands,
which jointly specify variables and units.

.. code:: yaml

  description: Required variables for running MAGICC
  model: model_a
  required_data:
    - measurand:
        Emissions|CO2:
          unit: Mt CO2/yr
      region: World
      year: [2020, 2030, 2040, 2050]

In the example above, for *model_a*, the dataset must include datapoints of the
variable *Emissions|CO2* (measured in *Mt CO2/yr*), in the region *World*, for the
years 2020, 2030, 2040 and 2050.
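The idea of the coverage check can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the logic of ``RequiredDataValidator``: the ``missing_data`` helper, the dictionary layout and the sample datapoints are all hypothetical.

```python
from itertools import product

# Requirement from the YAML example above, written as plain Python data.
required = {
    "model": "model_a",
    "variable": "Emissions|CO2",
    "unit": "Mt CO2/yr",
    "region": "World",
    "year": [2020, 2030, 2040, 2050],
}

# Made-up datapoints as (model, variable, unit, region, year) tuples.
datapoints = {
    ("model_a", "Emissions|CO2", "Mt CO2/yr", "World", 2020),
    ("model_a", "Emissions|CO2", "Mt CO2/yr", "World", 2030),
    ("model_a", "Emissions|CO2", "Mt CO2/yr", "World", 2050),
}


def missing_data(required, datapoints):
    """Return the required combinations that have no datapoint, sorted."""
    expected = product(
        [required["model"]],
        [required["variable"]],
        [required["unit"]],
        [required["region"]],
        required["year"],
    )
    return sorted(combo for combo in expected if combo not in datapoints)


missing = missing_data(required, datapoints)
print(missing)
```

With the sample data, the only missing combination is the datapoint for the year 2040.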

Standard usage
--------------

Run the following in a Python script to check that an IAMC dataset contains the
required data.

.. code-block:: python

  from nomenclature import RequiredDataValidator

  # ...setting directory/file paths and loading dataset

  RequiredDataValidator.from_file(req_data_yaml).apply(df)
@@ -6,4 +6,3 @@ required_data:
        unit: Mt CO2/yr
    region: World
    year: [2020, 2030, 2040, 2050]