Feature request: add support for noon-centered times to copernicusmarine.open_dataset()
#271
Comments
We will discuss it internally because as you said:
That could also be the position of the toolbox. But we will get back to you!
The easiest way to clarify the offset that I described in the issue description is with a sine wave with a period of several days, which is also the timescale on which processes in zos/temperature/salinity can take place. This is purely for illustration purposes: it is not actual data, but it is much easier to produce:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
plt.close("all")
# synthetic 10-minute signal: a slow sine wave covering one month
x = pd.date_range("2020-01-01", "2020-02-01", freq="10min")
xrange = np.arange(len(x))
y = np.sin(5/len(x) * np.pi * xrange)
ser = pd.Series(y, index=x)
fig, ax = plt.subplots(figsize=(12,6))
ax.plot(ser.index, ser, label="original data")
# daily means stamped at the start of each day (midnight)
daymean_start = ser.groupby(pd.PeriodIndex(ser.index, freq="D")).mean()
daymean_start.index = daymean_start.index.to_timestamp()
# the same daily means stamped at the center of each day (noon)
daymean_mid = daymean_start.copy()
daymean_mid.index = daymean_mid.index + pd.Timedelta(hours=12)
ax.plot(daymean_start.index, daymean_start, label='mean start-of-interval')
ax.plot(daymean_mid.index, daymean_mid, label='mean center-of-interval')
ax.legend()
Usecase:
In the meantime I was working on a workaround on our side and hoped to be able to use the dataset_id string, since it contains things like "P1D-m" for daily means. However, I realized this is not always consistent; for instance, the dataset_id
About this last comment, there is nothing we can do on the toolbox side 🤔
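The dataset_id workaround discussed in the comments above could be sketched roughly as follows. This is only an illustration: `half_interval_from_dataset_id` is a hypothetical helper, not part of copernicusmarine, the regular expression assumes tokens of the form "P1D-m" or "PT1H-m", and the dataset_id strings are used purely as example input. Because the naming is not always consistent, the helper returns None instead of guessing.

```python
import re
from datetime import timedelta

# Assumed naming pattern: "_P<n>D-m" (n-daily mean) or "_PT<n>H-m" (n-hourly mean)
_MEAN_TOKEN = re.compile(r"_P(?:(\d+)D|T(\d+)H)-m(?:_|$)")

def half_interval_from_dataset_id(dataset_id):
    """Return half the averaging interval, or None if it cannot be derived."""
    match = _MEAN_TOKEN.search(dataset_id)
    if match is None:
        # No recognizable mean token (e.g. instantaneous or monthly data)
        return None
    days, hours = match.groups()
    interval = timedelta(days=int(days)) if days else timedelta(hours=int(hours))
    return interval / 2

print(half_interval_from_dataset_id("cmems_mod_glo_phy_my_0.083deg_P1D-m"))
```

Monthly means are deliberately left out, since a month has no fixed length and the center would have to be computed per timestamp.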
Motivation
Since the introduction of ARCO, all time-averaged datasets now have start-of-interval instead of center-of-interval time samples as documented in https://help.marine.copernicus.eu/en/articles/8656000-differences-between-netcdf-and-arco-formats. It makes complete sense to harmonize everything across datasets and I realize the ARCO structure was necessary to prepare for the impressive performance we see in the copernicusmarine toolbox these days.
However, in our institute we use the data as model forcing and we also compare it to measurements. What we see is that the new time administration shows a clear offset, 12 hours in the case of daily-averaged data. This makes sense, but it is quite inconvenient. Of course, it can be manually shifted in our workflows, but I would avoid this if possible.
Furthermore, the PUM states that the daily-averaged products are centered at noon, not at midnight. So the current behaviour could be confusing for users.
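For reference, the manual shift mentioned above amounts to adding half the averaging interval to every timestamp. A minimal sketch using only the standard library; `to_center_of_interval` is a hypothetical helper name, not an existing toolbox function:

```python
from datetime import datetime, timedelta

def to_center_of_interval(start_times, interval):
    """Shift start-of-interval timestamps by half the averaging interval."""
    half = interval / 2
    return [t + half for t in start_times]

# Daily means stamped at midnight (start of interval)
starts = [datetime(1993, 1, 1) + timedelta(days=i) for i in range(3)]
centered = to_center_of_interval(starts, timedelta(days=1))
print(centered[0])  # 1993-01-01 12:00:00
```

The shift itself is trivial; the inconvenience is that every workflow has to know the averaging interval of each dataset in order to apply it.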

I have been in touch with the helpdesk before, I think the original request was filed under MDSOP-179, if that helps.
Expected behavior
Support in copernicusmarine.open_dataset() for getting noon-centered times, for instance via an additional keyword.
Actual behavior
Midnight (start-of-interval) times, so the reproducible code prints:
1993-01-01 00:00:00 2021-06-30 00:00:00
Steps to reproduce
Alternatives
Alternatively, it would at least be helpful if averaging metadata were present in the returned dataset, for instance as attributes of the time variable: the fact that the data is daily averaged and that the times are start-of-interval times. I can imagine that this is not easy to implement across all averaged datasets, but it would make it easier to apply the correct time corrections on our side (Deltares/dfm_tools#878).
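To illustrate how such metadata could be used on our side, here is a small sketch. The attribute names `averaging_interval_hours` and `time_reference` are purely hypothetical; nothing like them exists in the returned datasets today, and a plain dict stands in for the attributes of the time variable.

```python
from datetime import datetime, timedelta

def centering_offset(time_attrs):
    """Return the shift that moves timestamps to the center of the interval."""
    hours = time_attrs.get("averaging_interval_hours")  # hypothetical attribute
    reference = time_attrs.get("time_reference")        # hypothetical attribute
    if hours is None or reference != "start_of_interval":
        # Not averaged, or already centered: leave the times untouched
        return timedelta(0)
    return timedelta(hours=hours) / 2

attrs = {"averaging_interval_hours": 24, "time_reference": "start_of_interval"}
print(datetime(1993, 1, 1) + centering_offset(attrs))  # 1993-01-01 12:00:00
```

With attributes like these, the correction would no longer depend on parsing the dataset_id.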
Environment