Inconsistent Loading Results for netCDF4 version #1232

Closed
@powellb

Description

I am trying to load a variable of data from a THREDDS URL; however, I get different results depending on the netCDF4/Python version.

The code is simply:

import netCDF4
file="https://www.star.nesdis.noaa.gov/thredds/dodsC/swathNPPVIIRSNRTL2PWW00/2023/041/20230210123000-STAR-L2P_GHRSST-SSTsubskin-VIIRS_NPP-ACSPO_V2.80-v02.0-fv01.0.nc"
nc=netCDF4.Dataset(file)
err = nc.variables["sses_standard_deviation"][:]

On Linux with Python 3.6.3 and netCDF4 1.2.4, this results in a masked array with valid entries:

masked_array(data =
 [[[0.36000001430511475 0.36000001430511475 0.36000001430511475 ..., -- --
   --]
  [0.36000001430511475 0.36000001430511475 0.36000001430511475 ..., -- --
   --]
  [0.36000001430511475 0.36000001430511475 0.36000001430511475 ..., -- --
   0.36000001430511475]
  ..., 
  [0.3100000023841858 0.3400000333786011 0.3799999952316284 ...,
   0.25999999046325684 0.25999999046325684 0.25999999046325684]
  [0.33000004291534424 0.3400000333786011 0.3799999952316284 ...,
   0.25999999046325684 0.25999999046325684 0.25999999046325684]
  [0.3100000023841858 0.2800000309944153 0.36000001430511475 ...,
   0.25999999046325684 0.25999999046325684 0.25999999046325684]]],
             mask =
 [[[False False False ...,  True  True  True]
  [False False False ...,  True  True  True]
  [False False False ...,  True  True False]
  ..., 
  [False False False ..., False False False]
  [False False False ..., False False False]
  [False False False ..., False False False]]],
       fill_value = -128)

However, running the same code on several Linux and macOS configurations installed via conda-forge (Python 3.10.8/netCDF4 1.6.2 and Python 3.11.0/netCDF4 1.6.2, each in a separate virtual environment) produces invalid results: the err variable is entirely masked.

masked_array(
  data=[[[--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --],
         ...,
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --]]],
  mask=[[[ True,  True,  True, ...,  True,  True,  True],
         [ True,  True,  True, ...,  True,  True,  True],
         [ True,  True,  True, ...,  True,  True,  True],
         ...,
         [ True,  True,  True, ...,  True,  True,  True],
         [ True,  True,  True, ...,  True,  True,  True],
         [ True,  True,  True, ...,  True,  True,  True]]],
  fill_value=128,
  dtype=float32)

The variable of interest is defined as an int8 with a scale_factor of 0.01. Entries outside of ±127 would be invalid for a signed 8-bit number.

nc.variables["sses_standard_deviation"] is:

<class 'netCDF4._netCDF4.Variable'>
int8 sses_standard_deviation(time, nj, ni)
    _Unsigned: false
    add_offset: 1.0
    comment: Standard deviation of sea_surface_temperature from SST measured by drifting buoys. Further information at (Petrenko et al., JTECH, 2016; doi:10.1175/JTECH-D-15-0166.1)
    coordinates: lon lat
    long_name: SSES standard deviation
    scale_factor: 0.01
    units: kelvin
    valid_max: 127
    valid_min: -127
    _FillValue: -128
    coverage_content_type: qualityInformation
    _ChunkSizes: [   1 1536 3200]
unlimited dimensions: 
current shape = (1, 5392, 3200)
filling off
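
For reference, here is a minimal sketch of what those attributes imply, assuming the usual CF packing formula (unpacked = raw * scale_factor + add_offset). The helper name `unpack` is mine for illustration and is not part of netCDF4.

```python
import numpy as np

# Attribute values copied from the variable listing above.
scale_factor = 0.01
add_offset = 1.0
valid_min, valid_max = -127, 127


def unpack(raw):
    """Hypothetical helper: raw * scale_factor + add_offset (CF packing)."""
    return np.asarray(raw, dtype=np.float64) * scale_factor + add_offset


# Range of physical values that the valid int8 range can represent.
lo = float(unpack(valid_min))
hi = float(unpack(valid_max))
print(f"representable range: {lo:.2f} K to {hi:.2f} K")
```

Under that assumption, any stored value above 127 falls outside valid_max, which may be why everything ends up masked when the bytes are read as unsigned.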

Examining err.data in the non-working cases reveals values greater than 127 throughout, which implies that netCDF4 is not treating the variable as a signed int8. However, since all of the values listed below are greater than 127, reading them as signed would also mean that every stored value is negative. As shown by the working version at the top, I would expect the int8 values for this dataset to typically be in the 30-40 range prior to scale_factor.

>>> err.data
array([[[192., 192., 192., ..., 128., 128., 128.],
        [192., 192., 192., ..., 128., 128., 128.],
        [192., 192., 192., ..., 128., 128., 192.],
        ...,
        [187., 190., 194., ..., 182., 182., 182.],
        [189., 190., 194., ..., 182., 182., 182.],
        [187., 184., 192., ..., 182., 182., 182.]]], dtype=float32)
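
As a quick check (a sketch only, not a diagnosis of where the problem lies), the values above can be reinterpreted as signed 8-bit integers with NumPy and then unpacked by hand using the attributes from the variable listing. The sample values are copied from the first and last rows of the err.data output; the variable names are mine.

```python
import numpy as np

# Sample values copied from the err.data output above.
as_reported = np.array([192., 192., 128., 187., 184., 182.], dtype=np.float32)

# Reinterpret the same 8-bit patterns as signed integers (uint8 -> int8 view).
raw = as_reported.astype(np.uint8).view(np.int8)

# Attributes copied from the variable listing.
scale_factor, add_offset, fill_value = 0.01, 1.0, -128

# Mask the fill value, then apply raw * scale_factor + add_offset.
decoded = np.ma.masked_equal(raw, fill_value) * scale_factor + add_offset

print(raw)
print(decoded)
```

The unmasked results (0.36, 0.31, 0.28, 0.26) agree with the entries in the netCDF4 1.2.4 output, and 128 maps to the _FillValue of -128. Because add_offset is 1.0, the stored int8 values come out negative (for example -64 for 0.36) even though the unpacked standard deviations are positive. To inspect the stored values directly, `set_auto_maskandscale(False)` on the variable should return them without masking or scaling.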

Is it possible that there is an issue in 1.6.2 that mishandles type int8?
