discrepancies between (HRV) zarr files on gcp and downloaded satellite data #191
Labels
bug
good first issue
help wanted
Describe the bug
I came across a discrepancy between the public HRV dataset on GCP and data downloaded directly from the EUMETSAT API.
To Reproduce
Steps to reproduce the behavior:
import gcsfs
import xarray as xr

# Open the public 2020 HRV Zarr store on GCP
gcs = gcsfs.GCSFileSystem()
zstore = 'gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_hrv.zarr'
mapper = gcs.get_mapper(zstore)
ds = xr.open_zarr(mapper, consolidated=True)
Then plot the data with coastlines:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np

# Geostationary projection parameters of the satellite data
projection = {'proj': 'geos', 'lon_0': 9.5, 'h': 35785831, 'x_0': 0, 'y_0': 0, 'a': 6378169, 'rf': 295.488065897014}

fig = plt.figure(figsize=(20, 20))
crs = ccrs.Geostationary(central_longitude=projection['lon_0'], satellite_height=projection['h'])
ax = plt.axes(projection=crs)
ax.coastlines(resolution='10m', alpha=0.5, color='blue')
# Draw the HRV channel for a single timestamp on the same axes as the coastlines
ds['data'].sel(time=np.datetime64('2020-07-02T07:00:00'), variable='HRV').plot(ax=ax, cmap='gray', add_colorbar=False)
The resulting plot clearly shows that the coastlines are offset from the satellite observation data (have a look at Libya).
On the other hand, downloading the same data with the eumdac CLI (
eumdac download -c EO:EUM:DAT:MSG:MSG15-RSS --start 2020-07-02T06:45 --end 2020-07-02T07:15
) and combining the *.NAT files with the methods in scripts/extend_gcp_zarr.py
(temporary link here) removes the discrepancy between the coastlines and the satellite observations. Hence, it appears that something is incorrect about the satellite data in the public GCP bucket.
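To put a number on the offset rather than eyeballing it against the coastlines, the two frames for the same timestamp could be compared directly. Below is a rough sketch, assuming both frames are available as equally sized 2D arrays (plain nested lists here). `estimate_shift` is a hypothetical helper, not part of any existing script, and it only finds whole-pixel shifts by brute force:

```python
def estimate_shift(reference, candidate, max_shift=3):
    """Return (dy, dx) such that candidate[y + dy][x + dx] best matches reference[y][x].

    Both inputs are equally sized 2D lists of numbers. Every integer shift in
    [-max_shift, max_shift] is tried, and the one with the smallest mean
    absolute difference over the overlapping pixels wins.
    """
    rows, cols = len(reference), len(reference[0])
    best = None  # (score, dy, dx)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for y in range(rows):
                yy = y + dy
                if not 0 <= yy < rows:
                    continue
                for x in range(cols):
                    xx = x + dx
                    if not 0 <= xx < cols:
                        continue
                    total += abs(reference[y][x] - candidate[yy][xx])
                    count += 1
            if count == 0:
                continue  # no overlap for this shift
            score = total / count
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]
```

A result of (0, 0) would mean the two frames line up. Anything else gives the row and column offset of the public frame relative to the freshly built one, in pixels.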
I am guessing here, but could this be because the public Zarr file lumps a full year of observations together, while the spatial extent of the observations shifts during the year? If so, the data should be split in time across more Zarr files.
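If that guess is right, one way to split the archive would be to start a new Zarr store every time the spatial extent changes. Below is a minimal sketch of that grouping step, assuming an extent descriptor (for example the x/y bounds read from each .NAT file) is available per timestamp. `split_by_extent` and the extent tuples are hypothetical, not something the current scripts provide:

```python
from itertools import groupby


def split_by_extent(timestamps, extents):
    """Group consecutive timestamps that share the same spatial extent.

    timestamps: list of sortable time labels (e.g. ISO 8601 strings)
    extents:    list of hashable extent descriptors, one per timestamp,
                e.g. (x_min, x_max, y_min, y_max)
    Returns a list of (extent, [timestamps]) segments in time order.
    """
    if len(timestamps) != len(extents):
        raise ValueError("timestamps and extents must have the same length")
    # Sort by time so that each segment is a contiguous time range
    pairs = sorted(zip(timestamps, extents), key=lambda pair: pair[0])
    segments = []
    for extent, group in groupby(pairs, key=lambda pair: pair[1]):
        segments.append((extent, [time for time, _ in group]))
    return segments
```

Each segment could then be written to its own store, so that every store has a single fixed grid.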
Best regards,
Tomas