Merge branch 'better_regrid_time' into test_distance_metric
schlunma committed Jan 29, 2024
2 parents c8138ce + ddf1f9d commit d80a00d
Showing 6 changed files with 450 additions and 416 deletions.
63 changes: 54 additions & 9 deletions doc/recipe/preprocessor.rst
@@ -1223,8 +1223,7 @@ The ``_time.py`` module contains the following preprocessor functions:
 * resample_time_: Resample data
 * resample_hours_: Convert between N-hourly frequencies by resampling
 * anomalies_: Compute (standardized) anomalies
-* regrid_time_: Aligns the time axis of each dataset to have common time
-  points and calendars.
+* regrid_time_: Aligns the time coordinate of each dataset.
 * timeseries_filter_: Allows application of a filter to the time-series data.
 * local_solar_time_: Convert cube with UTC time to local solar time.

@@ -1617,13 +1616,59 @@ See also :func:`esmvalcore.preprocessor.anomalies`.
 ``regrid_time``
 ---------------

-This function aligns the time points of each component dataset so that the Iris
-cubes from different datasets can be subtracted. The operation makes the
-datasets time points common; it also resets the time
-bounds and auxiliary coordinates to reflect the artificially shifted time
-points. Current implementation for monthly and daily data; the ``frequency`` is
-set automatically from the variable CMOR table unless a custom ``frequency`` is
-set manually by the user in recipe.
+This function aligns the time points and bounds of an input dataset according
+to the following rules:
+
+* Decadal data: 1 January 00:00:00 for the given year.
+  Example: 1 January 2005 00:00:00 for given year 2005 (decade 2000-2010).
+* Yearly data: 1 July 00:00:00 for each year.
+  Example: 1 July 1993 00:00:00 for the year 1993.
+* Monthly data: 15th day 00:00:00 for each month.
+  Example: 15 October 1993 00:00:00 for the month October 1993.
+* Daily data: 12:00:00 for each day.
+  Example: 14 March 1996 12:00:00 for the day 14 March 1996.
+* `n`-hourly data where `n` is a divisor of 24: center of each time interval.
+  Example: 03:00:00 for interval 00:00:00-06:00:00 (6-hourly data), 16:30:00
+  for interval 15:00:00-18:00:00 (3-hourly data), or 09:30:00 for interval
+  09:00:00-10:00:00 (hourly data).
+
+The frequency of the input data is automatically determined from the CMOR table
+of the corresponding variable, but can be overwritten in the recipe if
+necessary.
+This function does not alter the data in any way.
+
+.. note::
+
+   By default, this preprocessor will not change the calendar of the input time
+   coordinate.
+   For decadal, yearly, and monthly data, it is possible to change the calendar
+   using the optional `calendar` argument.
+   Be aware that changing the calendar might introduce (small) errors to your
+   data, especially for extensive quantities (those that depend on the period
+   length).
+
+Parameters:
+    * `frequency`: Data frequency.
+      If not given, use the one from the CMOR tables of the corresponding
+      variable.
+    * `calendar`: If given, transform the calendar to the one specified
+      (examples: `standard`, `365_day`, etc.).
+      This only works for decadal, yearly and monthly data, and will raise an
+      error for other frequencies.
+      If not set, the calendar will not be changed.
+    * `units` (default: `days since 1850-01-01 00:00:00`): Reference time units
+      used if the calendar of the data is changed.
+      Ignored if `calendar` is not set.
+
+Examples:
+
+Change the input calendar to `standard` and use custom units:
+
+.. code-block:: yaml
+
+    regrid_time:
+      calendar: standard
+      units: days since 2000-01-01

 See also :func:`esmvalcore.preprocessor.regrid_time`.
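
The alignment rules in the new `regrid_time` documentation can be sketched with plain standard-library datetimes. This is a minimal illustration of the documented rules only, not the ESMValCore implementation: `aligned_time_point` is a hypothetical helper, it ignores non-standard calendars, and it accepts only plain frequency labels such as `mon`, `day`, `6hr` or `hr`.

```python
from datetime import datetime, timedelta


def aligned_time_point(date: datetime, frequency: str) -> datetime:
    """Return the aligned time point for ``date`` (illustrative helper only)."""
    if frequency == "dec":
        # Decadal data: 1 January 00:00:00 of the given year.
        return datetime(date.year, 1, 1)
    if frequency == "yr":
        # Yearly data: 1 July 00:00:00 of the year.
        return datetime(date.year, 7, 1)
    if frequency == "mon":
        # Monthly data: 15th day 00:00:00 of the month.
        return datetime(date.year, date.month, 15)
    if frequency == "day":
        # Daily data: 12:00:00 of the day.
        return datetime(date.year, date.month, date.day, 12)
    if frequency.endswith("hr"):
        # n-hourly data: 'hr' alone is treated as hourly, '6hr' gives n = 6.
        n_hours = int(frequency[:-2] or 1)
        if n_hours < 1 or 24 % n_hours != 0:
            raise ValueError(f"n must be a divisor of 24, got '{frequency}'")
        # Start of the n-hour interval that contains ``date``.
        start_hour = (date.hour // n_hours) * n_hours
        start = datetime(date.year, date.month, date.day, start_hour)
        # Center of that interval.
        return start + timedelta(hours=n_hours / 2)
    raise NotImplementedError(f"Unsupported frequency '{frequency}'")


print(aligned_time_point(datetime(1993, 10, 3), "mon"))
print(aligned_time_point(datetime(2000, 1, 1, 17, 45), "3hr"))
```

The two calls reproduce the documented examples for monthly data (15 October 1993) and 3-hourly data (16:30:00 for the interval 15:00:00-18:00:00).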

8 changes: 3 additions & 5 deletions esmvalcore/_recipe/recipe.py
@@ -147,13 +147,11 @@ def _update_target_grid(dataset, datasets, settings):
_spec_to_latlonvals(**target_grid)


-def _update_regrid_time(dataset, settings):
+def _update_regrid_time(dataset: Dataset, settings: dict) -> None:
     """Input data frequency automatically for regrid_time preprocessor."""
-    regrid_time = settings.get('regrid_time')
-    if regrid_time is None:
+    if 'regrid_time' not in settings:
         return
-    frequency = settings.get('regrid_time', {}).get('frequency')
-    if not frequency:
+    if 'frequency' not in settings['regrid_time']:
         settings['regrid_time']['frequency'] = dataset.facets['frequency']
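
The effect of the new key-based checks can be tried in isolation. The sketch below copies the new function body under a different name and uses a `SimpleNamespace` as a stand-in for the ESMValCore `Dataset` (an assumption for illustration; only its `facets` mapping is needed here).

```python
from types import SimpleNamespace


def update_regrid_time(dataset, settings: dict) -> None:
    """Copy of the new logic from the diff, renamed for this sketch."""
    if 'regrid_time' not in settings:
        return
    if 'frequency' not in settings['regrid_time']:
        settings['regrid_time']['frequency'] = dataset.facets['frequency']


# Stand-in for a Dataset object: only the `facets` mapping is used.
dataset = SimpleNamespace(facets={'frequency': 'mon'})

# Case 1: no regrid_time step, settings stay untouched.
no_step = {'regrid': {}}
update_regrid_time(dataset, no_step)

# Case 2: regrid_time without a frequency, filled from the dataset facets.
defaulted = {'regrid_time': {}}
update_regrid_time(dataset, defaulted)

# Case 3: frequency given explicitly in the recipe, kept as is.
explicit = {'regrid_time': {'frequency': 'day'}}
update_regrid_time(dataset, explicit)

print(no_step, defaulted, explicit)
```

Because the new check tests for the presence of the `frequency` key rather than its truthiness, a value that is present but falsy is no longer replaced by the dataset's facet.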


38 changes: 23 additions & 15 deletions esmvalcore/cmor/_fixes/shared.py
@@ -446,11 +446,11 @@ def get_time_bounds(time: Coord, freq: str) -> np.ndarray:
     """Get bounds for time coordinate.

     For monthly data, use the first day of the current month and the first day
-    of the next month. For yearly or decadal data, use 1 January of the current
-    year and 1 January of the next year or 10 years from the current year. For
-    other frequencies (daily, 6-hourly, 3-hourly, hourly), half of the
-    frequency is subtracted/added from the current point in time to get the
-    bounds.
+    of the next month. For yearly data, use 1 January of the current year and 1
+    January of the next year. For decadal data, use 1 January 5 years
+    before/after the current year. For other frequencies (daily or `n`-hourly,
+    where `n` is a divisor of 24), half of the frequency is subtracted/added
+    from the current point in time to get the bounds.

     Parameters
     ----------
@@ -475,36 +475,44 @@ def get_time_bounds(time: Coord, freq: str) -> np.ndarray:
     for step, date in enumerate(dates):
         month = date.month
         year = date.year
-        if freq in ['mon', 'mo']:
+        if 'mon' in freq or freq == 'mo':
             next_month, next_year = get_next_month(month, year)
             min_bound = date2num(datetime(year, month, 1, 0, 0),
                                  time.units, time.dtype)
             max_bound = date2num(datetime(next_year, next_month, 1, 0, 0),
                                  time.units, time.dtype)
-        elif freq == 'yr':
+        elif 'yr' in freq:
             min_bound = date2num(datetime(year, 1, 1, 0, 0),
                                  time.units, time.dtype)
             max_bound = date2num(datetime(year + 1, 1, 1, 0, 0),
                                  time.units, time.dtype)
-        elif freq == 'dec':
-            min_bound = date2num(datetime(year, 1, 1, 0, 0),
+        elif 'dec' in freq:
+            min_bound = date2num(datetime(year - 5, 1, 1, 0, 0),
                                  time.units, time.dtype)
-            max_bound = date2num(datetime(year + 10, 1, 1, 0, 0),
+            max_bound = date2num(datetime(year + 5, 1, 1, 0, 0),
                                  time.units, time.dtype)
         else:
-            delta = {
+            deltas = {
                 'day': 12.0 / 24,
+                '12hr': 6.0 / 24,
+                '8hr': 4.0 / 24,
                 '6hr': 3.0 / 24,
+                '4hr': 2.0 / 24,
                 '3hr': 1.5 / 24,
+                '2hr': 1.0 / 24,
                 '1hr': 0.5 / 24,
+                'hr': 0.5 / 24,
             }
-            if freq not in delta:
+            for (freq_str, delta) in deltas.items():
+                if freq_str in freq:
+                    point = time.points[step]
+                    min_bound = point - delta
+                    max_bound = point + delta
+                    break
+            else:
                 raise NotImplementedError(
                     f"Cannot guess time bounds for frequency '{freq}'"
                 )
-            point = time.points[step]
-            min_bound = point - delta[freq]
-            max_bound = point + delta[freq]
         bounds.append([min_bound, max_bound])

     return np.array(bounds)
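
The substring matching introduced in this hunk can be tried without Iris or cf-units. The sketch below is a simplified stand-in, not the function from the commit: `half_width_in_days` and `calendar_bounds` are hypothetical names, and `calendar_bounds` returns standard-library datetimes instead of numbers produced by `date2num`.

```python
from datetime import datetime

# Half-widths in days, in the same order as the dictionary in the diff.
DELTAS = {
    'day': 12.0 / 24,
    '12hr': 6.0 / 24,
    '8hr': 4.0 / 24,
    '6hr': 3.0 / 24,
    '4hr': 2.0 / 24,
    '3hr': 1.5 / 24,
    '2hr': 1.0 / 24,
    '1hr': 0.5 / 24,
    'hr': 0.5 / 24,
}


def half_width_in_days(freq: str) -> float:
    """Return half the interval length for a daily or n-hourly frequency."""
    # The first key contained in ``freq`` wins, so iteration order matters.
    for freq_str, delta in DELTAS.items():
        if freq_str in freq:
            return delta
    raise NotImplementedError(
        f"Cannot guess time bounds for frequency '{freq}'"
    )


def calendar_bounds(date: datetime, freq: str):
    """Return (lower, upper) bounds for monthly, yearly or decadal data."""
    if 'mon' in freq or freq == 'mo':
        # First day of the current month and first day of the next month.
        if date.month == 12:
            upper = datetime(date.year + 1, 1, 1)
        else:
            upper = datetime(date.year, date.month + 1, 1)
        return datetime(date.year, date.month, 1), upper
    if 'yr' in freq:
        # 1 January of the current year and of the next year.
        return datetime(date.year, 1, 1), datetime(date.year + 1, 1, 1)
    if 'dec' in freq:
        # 1 January five years before and five years after the current year.
        return datetime(date.year - 5, 1, 1), datetime(date.year + 5, 1, 1)
    raise NotImplementedError(
        f"Cannot guess time bounds for frequency '{freq}'"
    )


print(half_width_in_days('3hrPt'))
print(calendar_bounds(datetime(2005, 1, 1), 'dec'))
```

Dictionaries preserve insertion order, which the lookup relies on: `'2hr'` is a substring of `'12hr'` and `'hr'` is a substring of every hourly key, so the longer keys have to be checked first.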