NaNs in 6-hourly analysis-ready dataset for 2m temperature #62

Closed
tom-andersson opened this issue Oct 30, 2023 · 5 comments

tom-andersson commented Oct 30, 2023

Hi there! I've come across NaNs in the 2m_temperature variable in the 6-hourly analysis-ready dataset -- MWE below -- does this reproduce for you?

Three strange observations:

  • I've tried the 1-hourly dataset and couldn't see any NaNs in any of the 24 hours for this date (2015-06-28).
  • No NaNs in neighbouring dates.
  • No NaNs in some other variables (u- and v-wind) in the same 6-h dataset.
```python
import xarray as xr

source = "gs://gcp-public-data-arco-era5/ar/1959-2022-full_37-6h-0p25deg-chunk-1.zarr-v2"
era5_zarr = xr.open_zarr(source, consolidated=True, chunks={"time": 48})
era5_zarr["2m_temperature"].sel(time="2015-06-28").load()
```

Returns

```
<xarray.DataArray '2m_temperature' (time: 4, latitude: 721, longitude: 1440)>
array([[[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]],
       [[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]],
       [[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]],
       [[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]]], dtype=float32)
Coordinates:
  * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
  * longitude  (longitude) float32 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
  * time       (time) datetime64[ns] 2015-06-28 ... 2015-06-28T18:00:00
Attributes:
    long_name:   2 metre temperature
```
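
To narrow down which time steps are affected without printing whole arrays, one option is to reduce each field to a NaN fraction per time step. The sketch below is a minimal illustration on synthetic data; `nan_fraction_per_time` is a hypothetical helper name, not part of xarray or of this repository.

```python
import numpy as np
import xarray as xr


def nan_fraction_per_time(da: xr.DataArray) -> xr.DataArray:
    """Return the fraction of NaN grid cells at each time step."""
    # Average the boolean NaN mask over every dimension except time.
    other_dims = [d for d in da.dims if d != "time"]
    return da.isnull().mean(dim=other_dims)


# Synthetic stand-in (assumption) for one day of 6-hourly data on a 3 x 5 grid.
times = np.array(
    ["2015-06-28T00", "2015-06-28T06", "2015-06-28T12", "2015-06-28T18"],
    dtype="datetime64[ns]",
)
data = np.ones((4, 3, 5), dtype="float32")
data[1] = np.nan        # second step: every cell missing
data[2, 0, 0] = np.nan  # third step: one cell out of 15 missing

da = xr.DataArray(
    data,
    dims=("time", "latitude", "longitude"),
    coords={"time": times},
    name="2m_temperature",
)

frac = nan_fraction_per_time(da)
affected = frac["time"].values[frac.values > 0]  # time stamps with any NaN
print(frac.values)
print(affected)
```

On the real store, the same helper could be applied to `era5_zarr["2m_temperature"].sel(time="2015-06-28")`; any value above zero flags an affected step.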
@dabhicusp
Collaborator

Yes @tom-andersson, it's reproducible for us too.

Just FYI: we are only maintaining gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3/, and we will deprecate all other files in the future.
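
For anyone who needs 6-hourly fields from the maintained 1-hourly store, a minimal sketch follows. It assumes the 6-hourly dataset holds the instantaneous fields at 00, 06, 12 and 18 UTC, as the time coordinate in the output above suggests; `select_synoptic_hours` is a hypothetical helper name, and the data here are synthetic.

```python
import numpy as np
import xarray as xr


def select_synoptic_hours(da, hours=(0, 6, 12, 18)):
    """Keep only the time steps whose hour of day is in `hours`."""
    hour_of_day = da["time"].dt.hour.values
    # Integer positions of the matching steps along the time axis.
    positions = np.flatnonzero(np.isin(hour_of_day, hours))
    return da.isel(time=positions)


# Synthetic stand-in (assumption) for two days of hourly data on a 2 x 2 grid.
times = np.datetime64("2015-06-28T00", "ns") + np.arange(48) * np.timedelta64(1, "h")
hourly = xr.DataArray(
    np.zeros((48, 2, 2), dtype="float32"),
    dims=("time", "latitude", "longitude"),
    coords={"time": times},
    name="2m_temperature",
)

six_hourly = select_synoptic_hours(hourly)
print(six_hourly.sizes["time"])
```

On the real store, the selection could be applied to the lazily opened variable before calling `.load()`, so that in principle only the selected steps need to be downloaded.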

@tom-andersson
Author

Thanks for confirming @dabhicusp - I'll switch to the 1-hourly dataset in my application.

I was using the 6-hourly dataset partly out of laziness: it reduces download size and duration when I only want daily averages for testing environmental ML (see #61).
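
The daily averages mentioned here can also be computed from hourly data. The sketch below is one way to do it with xarray's `resample`, shown on synthetic data; `daily_mean` is a hypothetical helper name, and passing `skipna=False` is a design choice made for this example so that gaps like the ones reported in this issue stay visible.

```python
import numpy as np
import xarray as xr


def daily_mean(da, skipna=False):
    """Average over each calendar day along the time dimension."""
    # skipna=False lets a missing hour turn the daily mean into NaN
    # instead of being silently ignored.
    return da.resample(time="1D").mean(skipna=skipna)


# Synthetic stand-in (assumption): 48 hourly steps whose value is the step index.
times = np.datetime64("2015-06-28T00", "ns") + np.arange(48) * np.timedelta64(1, "h")
values = np.arange(48, dtype="float64")
data = np.broadcast_to(values[:, None, None], (48, 2, 2)).copy()
data[30, 0, 0] = np.nan  # one missing hour on the second day, at one grid cell

hourly = xr.DataArray(
    data,
    dims=("time", "latitude", "longitude"),
    coords={"time": times},
    name="2m_temperature",
)

daily = daily_mean(hourly)
print(daily.isel(latitude=1, longitude=1).values)  # cell without gaps
print(daily.isel(latitude=0, longitude=0).values)  # cell with one missing hour
```

With the default `skipna` behaviour for floating-point data, the missing hour would be dropped from the average and the gap would go unnoticed.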

tom-andersson added a commit to alan-turing-institute/deepsensor that referenced this issue Nov 1, 2023
@dabhicusp
Collaborator

Hello @tom-andersson, if your tasks have been completed successfully, could we proceed with closing this issue?

@tom-andersson
Author

Hi @dabhicusp, yes, since the dataset with NaNs isn't being maintained, feel free to close this. It could still be useful to make this clearer in the docs or to remove the dataset from the cloud bucket (if you haven't already).

@dabhicusp
Collaborator

I'm closing this issue because we're only keeping the files that are mentioned in the readme.md file. Any files that aren't listed there will be deprecated in the future.
