Description
I am trying to load a variable from a THREDDS URL; however, I get different results depending on the netCDF4/Python version.
The code is simply:
import netCDF4
file="https://www.star.nesdis.noaa.gov/thredds/dodsC/swathNPPVIIRSNRTL2PWW00/2023/041/20230210123000-STAR-L2P_GHRSST-SSTsubskin-VIIRS_NPP-ACSPO_V2.80-v02.0-fv01.0.nc"
nc=netCDF4.Dataset(file)
err = nc.variables["sses_standard_deviation"][:]
On Linux with Python 3.6.3 and netCDF4 1.2.4, this results in a masked array with valid entries:
masked_array(data =
[[[0.36000001430511475 0.36000001430511475 0.36000001430511475 ..., -- --
--]
[0.36000001430511475 0.36000001430511475 0.36000001430511475 ..., -- --
--]
[0.36000001430511475 0.36000001430511475 0.36000001430511475 ..., -- --
0.36000001430511475]
...,
[0.3100000023841858 0.3400000333786011 0.3799999952316284 ...,
0.25999999046325684 0.25999999046325684 0.25999999046325684]
[0.33000004291534424 0.3400000333786011 0.3799999952316284 ...,
0.25999999046325684 0.25999999046325684 0.25999999046325684]
[0.3100000023841858 0.2800000309944153 0.36000001430511475 ...,
0.25999999046325684 0.25999999046325684 0.25999999046325684]]],
mask =
[[[False False False ..., True True True]
[False False False ..., True True True]
[False False False ..., True True False]
...,
[False False False ..., False False False]
[False False False ..., False False False]
[False False False ..., False False False]]],
fill_value = -128)
However, running the same code on several Linux and macOS configurations installed via conda-forge (Python 3.10.8/netCDF4 1.6.2 and Python 3.11.0/netCDF4 1.6.2, in separate virtual environments) produces invalid results: the err variable is entirely masked.
masked_array(
data=[[[--, --, --, ..., --, --, --],
[--, --, --, ..., --, --, --],
[--, --, --, ..., --, --, --],
...,
[--, --, --, ..., --, --, --],
[--, --, --, ..., --, --, --],
[--, --, --, ..., --, --, --]]],
mask=[[[ True, True, True, ..., True, True, True],
[ True, True, True, ..., True, True, True],
[ True, True, True, ..., True, True, True],
...,
[ True, True, True, ..., True, True, True],
[ True, True, True, ..., True, True, True],
[ True, True, True, ..., True, True, True]]],
fill_value=128,
dtype=float32)
The variable of interest is defined as an int8 with a scale_factor of 0.01 and an add_offset of 1.0. Its valid range is -127 to 127 with a _FillValue of -128, so raw values above 127 cannot occur in a signed 8-bit number.
nc.variables["sses_standard_deviation"] is:
<class 'netCDF4._netCDF4.Variable'>
int8 sses_standard_deviation(time, nj, ni)
_Unsigned: false
add_offset: 1.0
comment: Standard deviation of sea_surface_temperature from SST measured by drifting buoys. Further information at (Petrenko et al., JTECH, 2016; doi:10.1175/JTECH-D-15-0166.1)
coordinates: lon lat
long_name: SSES standard deviation
scale_factor: 0.01
units: kelvin
valid_max: 127
valid_min: -127
_FillValue: -128
coverage_content_type: qualityInformation
_ChunkSizes: [ 1 1536 3200]
unlimited dimensions:
current shape = (1, 5392, 3200)
filling off
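For reference, the attributes above imply the usual packing rule: unpacked = raw * scale_factor + add_offset, with raw values equal to _FillValue (or outside valid_min/valid_max) masked. Below is a minimal NumPy sketch of that rule. The helper name unpack is hypothetical (it is not part of netCDF4), and the sample raw values are hand-written for illustration rather than read from the server.

```python
import numpy as np

# Attribute values copied from the variable metadata above.
SCALE_FACTOR = 0.01
ADD_OFFSET = 1.0
FILL_VALUE = -128
VALID_MIN, VALID_MAX = -127, 127


def unpack(raw):
    """Hypothetical helper: apply scale/offset unpacking to raw int8 values."""
    raw = np.asarray(raw, dtype=np.int8)
    # Mask the fill value and anything outside the declared valid range.
    invalid = (raw == FILL_VALUE) | (raw < VALID_MIN) | (raw > VALID_MAX)
    values = raw.astype(np.float32) * np.float32(SCALE_FACTOR) + np.float32(ADD_OFFSET)
    return np.ma.masked_array(values, mask=invalid)


# Illustrative raw values (assumed, not read from the file).
sample = unpack([-64, -74, -128])
print(sample)
```

With these attributes, a raw value of -64 unpacks to roughly 0.36 and -128 is masked as fill.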
Examining err.data in the non-working cases reveals values greater than 127, which implies netCDF4 is not treating the variable as a signed int8. Read as signed 8-bit numbers, every value listed below would be negative. That is in fact what the working version at the top implies: with scale_factor 0.01 and add_offset 1.0, a result of 0.36 corresponds to a raw int8 value of -64 (-64 * 0.01 + 1.0 = 0.36), and -64 is 192 when the same byte is read as unsigned.
>>> err.data
array([[[192., 192., 192., ..., 128., 128., 128.],
[192., 192., 192., ..., 128., 128., 128.],
[192., 192., 192., ..., 128., 128., 192.],
...,
[187., 190., 194., ..., 182., 182., 182.],
[189., 190., 194., ..., 182., 182., 182.],
[187., 184., 192., ..., 182., 182., 182.]]], dtype=float32)
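A quick check on a handful of the values above suggests they are the same bytes read as unsigned. This is only a sketch on numbers copied from the two printouts, not a statement about where in netCDF4 the conversion goes wrong.

```python
import numpy as np

# A few entries copied from the err.data printout above.
bad = np.array([192, 128, 187, 190, 194, 182], dtype=np.float32)

# Reinterpret the same 8 bits as signed: 192 -> -64, 128 -> -128, ...
signed = bad.astype(np.uint8).view(np.int8)

# Apply the variable's scale_factor (0.01) and add_offset (1.0) by hand,
# masking the _FillValue (-128).
decoded = np.ma.masked_equal(signed, -128).astype(np.float32) * 0.01 + 1.0

print(signed)
print(decoded)
```

The decoded values (about 0.36, 0.31, 0.34, 0.38 and 0.26) match entries in the netCDF4 1.2.4 output, and 128 maps to the fill value -128.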
Is it possible that there is an issue in 1.6.2 that mishandles type int8?