Checking out #674, related to _Unsigned = "true" from #656 and #671, I suppose the following is a question of semantics.
Should Variable.dtype return the data type on disk, or the data type that the user will receive from __getitem__? Right now this test fails. I think it should pass, reasoning that:
- The real data type on disk should be completely abstracted away, similar to endianness.
- Users can expect the type of the returned data to be consistent with the result of file.variables["var"].dtype.
If this test should indeed pass, I would be happy to open a PR with this test and a solution. The solution that came to mind is to change references to self.dtype inside the Variable class to self._dtype and then expose dtype as a property. Thoughts?
import unittest
import netCDF4
import tempfile
import numpy as np
import os


class TestUnsignedMasking(unittest.TestCase):
    def setUp(self):
        # mkstemp returns an open file descriptor; close it and keep only the path.
        fd, self.filename = tempfile.mkstemp()
        os.close(fd)
        with netCDF4.Dataset(self.filename, "w") as nc_file:
            nc_file.createDimension("time", None)
            nc_x = nc_file.createVariable("x", np.int8, dimensions=("time",), fill_value=np.int8(-1))
            nc_x._Unsigned = "true"  # int8 on disk, to be read back as unsigned
            x = np.arange(10, dtype=np.int8)
            nc_file.variables["x"][:] = np.ma.masked_where(x > 5, x)

    def tearDown(self):
        os.remove(self.filename)

    def test_unsigned(self):
        with netCDF4.Dataset(self.filename, "r") as nc_file:
            self.assertTrue(nc_file.variables["x"].dtype == nc_file.variables["x"][:].dtype)


if __name__ == '__main__':
    unittest.main()
Results:
FAIL: test_unsigned (__main__.TestUnsignedMasking)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tst_Unsigned_masking.py", line 25, in test_unsigned
    self.assertTrue(nc_file.variables["x"].dtype == nc_file.variables["x"][:].dtype)
AssertionError: False is not true
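For illustration, the proposed change (keep the on-disk type in self._dtype and expose dtype as a property) could look roughly like the sketch below. VariableSketch is a hypothetical stand-in written only for this example; it is not netCDF4's Variable class, and the view-based conversion in __getitem__ is an assumption about how the unsigned data might be produced.

```python
import numpy as np


class VariableSketch:
    """Hypothetical stand-in for a netCDF variable; not the library's code."""

    def __init__(self, data, unsigned=False):
        self._data = np.asarray(data)
        # On-disk data type, kept private for internal use.
        self._dtype = self._data.dtype
        # Mirrors the effect of an _Unsigned = "true" attribute.
        self._unsigned = unsigned

    @property
    def dtype(self):
        # Report the type the caller will receive from __getitem__.
        if self._unsigned and self._dtype.kind == "i":
            return np.dtype("u%d" % self._dtype.itemsize)
        return self._dtype

    def __getitem__(self, key):
        # Reinterpret the stored bytes as the user-facing type.
        return self._data[key].view(self.dtype)


var = VariableSketch(np.array([-1, 0, 1], dtype=np.int8), unsigned=True)
print(var._dtype, var.dtype, var[:].dtype)
```

With this layout, internal code that needs the on-disk type would read self._dtype, while callers comparing var.dtype with var[:].dtype would see the same type.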
Right now Variable.dtype has to be the datatype on disk; the module relies on that assumption internally. It could be changed to _dtype, as you suggest. However, I'm not sold on this: the case could be made that dtype should reflect the netCDF data type. Variable.dtype has always differed from the numpy dtype of the returned data when scale_factor/add_offset is set. I view _Unsigned as another sort of rescaling.
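To make that comparison concrete, here is a small numpy-only sketch (no netCDF file involved) of the two conversions side by side: scale_factor/add_offset unpacking and the _Unsigned reinterpretation. The array values and the scale_factor/add_offset numbers are made up for illustration, and reinterpreting the bytes with view() is an assumption about how unsigned data might be produced, not a quote of the library's code.

```python
import numpy as np

# Packed values as they might sit on disk (made-up example data).
packed = np.array([0, 100, 200], dtype=np.int16)

# scale_factor/add_offset unpacking: the result is floating point,
# so it no longer matches the on-disk int16 type.
scale_factor = np.float64(0.1)
add_offset = np.float64(5.0)
unpacked = packed * scale_factor + add_offset

# _Unsigned reinterpretation: same bytes, read as an unsigned type.
stored = np.array([-1, 0, 1], dtype=np.int8)
as_unsigned = stored.view(np.uint8)

print(packed.dtype, "->", unpacked.dtype)
print(stored.dtype, "->", as_unsigned.dtype)
```

In both cases the type of the returned data differs from the type on disk, which is the sense in which _Unsigned resembles a rescaling.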