Use NDAstroData.unit #140

saimn · 2021-03-15T18:58:22Z

This was suggested by @chris-simpson in Trac#828, cc @KathleenLabrie as well.

This PR reads the unit from BUNIT and sets it on NDAstroData.unit, and does the opposite when writing files.
However, there is a problem with this. When an NDData object has a unit, it uses that unit in computations (through Astropy Quantity objects), which means that all operands must have a unit. Adding or subtracting a scalar or a Numpy array without a unit raises a UnitConversionError.

There are, I think, 3 solutions:

forget about this change ;)
update the code everywhere to use units ...
or we could probably customize the arithmetic code in NDAstroData so we don't require the operand to have a unit. But then we lose the benefits of having units, so I'm not sure about this.

Another thing to mention here is that currently there is a small performance overhead when doing arithmetic with NDData, because it uses self.data * self.unit to attach the unit which makes a copy of the array. This could be fixed by changing the code in Astropy to use self.data << self.unit which does not make a copy: astropy/astropy#11107

chris-simpson · 2021-03-15T19:14:09Z

Will we have data without units after this is merged, given that units are added when the AD is created? The addition/subtraction operations we do are things like dark or bias removal, and those will now have units, so it's not clear your "problem" is actually a problem.
Is there a performance difference between these (if gain is a scalar):

ext.multiply(gain * u.electron / u.adu)

and

ext.multiply(gain)
ext.unit = u.electron

saimn · 2021-03-15T19:38:18Z

You mean the .data attribute? It will still be a plain Numpy array, so indeed the problem may not be that bad. None of the places where we use ext.data would be impacted; the problem only arises when using .add() or .subtract(). As I just pushed this branch, we will have a better view once Jenkins has run.
If ext has a unit, performance will be the same for both, since ext.multiply(gain) does the multiplication with gain * u.dimensionless_unscaled and self.data * self.unit, which currently makes a copy.

saimn · 2021-03-16T18:28:26Z

Tests are passing after setting the unit to ADU in _append_nddata and with GeminiDRSoftware/AstroFaker#9. For .append I'm not sure what the best way is: we could add a unit argument that defaults to ADU and is set on the NDData object if it doesn't have a unit?

chris-simpson · 2021-03-16T18:34:18Z

Yes. Just as we assume that data in a FITS file are in ADU if there's no BUNIT, I think we should assume they're in ADU when created from an NDData with no unit.

saimn · 2021-03-17T22:15:06Z

Ok, it works and I replaced various places where BUNIT was parsed. I'm not completely happy with the changes in fluxCalibrate but it's a bit tricky because .uncertainty has its own .unit attribute which isn't updated if e.g. ext.unit is set directly.

chris-simpson · 2021-03-17T22:54:37Z

I'm not clear why the code at the end of fluxCalibrate() has had to change the way it has. What's wrong with the existing code that multiplies ext rather than ext.nddata?

saimn · 2021-03-18T14:05:50Z

Yes, that's the part I'm not happy with, but the problem with the previous version is that, as it does the computation without units, it doesn't update .uncertainty.unit. So I chose to copy .convert_unit_to from Astropy because it may be useful in other cases, and then I need to update ext.nddata in place.
Another option would be to use the former code and update .uncertainty.unit with .uncertainty._data_unit_to_uncertainty_unit(ext.unit), but that's a private method.
Or we could directly set data and uncertainty with what's done in .convert_unit_to, e.g. ext.data = ext.unit.to(final_unit, ext.data, equivalencies=equivalencies), and the same for uncertainty.
The last option would be to allow setting the nddata attribute, ext.nddata = new_nddata. That would be the best option, but replacing the nddata object which is stored in the parent AstroData object may be a problem if there are other references to it. But now I'm thinking that I could do the in-place replacement, as done here, in the nddata setter. That would be a good compromise and could be useful in other places.

chris-simpson · 2021-03-18T18:52:13Z

But the previous version used ext.multiply, which calls nddata.multiply so is the failure to update units an omission from NDData? If the only operation that's missing from the previous code is the failure to update .uncertainty.unit then can't we keep the previous code and just update the unit manually as one additional line?

Although a quick look through nduncertainty.py suggests that the uncertainty propagation is OK if .uncertainty.unit is None. So should we simply not worry about updating it?

saimn · 2021-03-18T19:03:28Z

In the previous version, ext.multiply was called with a Numpy array, without units, so NDData was not updating the unit. ext.unit was updated manually, but ext.uncertainty.unit was not, so it was still in electron^2 / s^2, which would break if other operations were done on the spectrum afterwards. So yes, we could also update ext.uncertainty.unit manually, which would be the simplest, but I was thinking that it would be nicer if we could use NDData directly with a quantity.

saimn · 2021-03-18T19:11:41Z

So the other option I was thinking of is ae59e96, which I just pushed. I'm happy to revert to the original version and set ext.uncertainty.unit manually, if you prefer that. I'm not sure which is best; being able to set a new NDData object on ext.nddata may be more generally useful?

chris-simpson · 2021-03-18T19:59:11Z

NDAstroData is hardcoded to use VarianceUncertainty (actually ADVarianceUncertainty), so if you want to set the .nddata directly then you either need to raise an error if not isinstance(nddata.uncertainty, VarianceUncertainty) or convert the array of the uncertainty subclass into variance (which I don't think is possible).

Because we use ADVarianceUncertainty, the units are always determined by the units of the data array, so can we just define the unit property in that class:

@property
def unit(self):
    return self.parent_nddata.unit ** 2

saimn · 2021-03-18T22:12:32Z

That's also an option, but then something must be done for the setter method as well.

chris-simpson · 2021-03-18T23:05:24Z

Can't you simply raise an exception in the ADVarianceUncertainty.unit.setter? Or maybe have unit return _unit if that's set, and what I said above if not? It looks like NDUncertainty is written to check compatibility of units, so you can have data in W/m^2/nm and uncertainty in (W/m^2/Hz) ** 2, but why would you do that? And, even though it might be allowed by NDData, we don't really allow it in NDAstroData, since we have a variance attribute that gets set without concern about units and is assumed to have units equal to the square of the data units. If we genuinely allow this sort of unit mixing (and support it, rather than just not worry about it), then the .variance property needs to look at .uncertainty.unit and do a unit conversion so that a calculation like data / sqrt(variance) really does return the signal-to-noise ratio. At the moment it just returns uncertainty.array, and perhaps we want our ADVarianceUncertainty subclass to keep things tightly constrained.

Sorry this seems to have opened a can of worms but I think it's clearly the correct idea to use .unit rather than have it in the BUNIT keyword of the metadata.

saimn added 3 commits March 15, 2021 15:19

Set NDData.unit when reading FITS files

193c6ca

Add setter for unit

7bdfee6

Write unit to BUNIT

4c5ea65

saimn added 6 commits March 15, 2021 18:05

Fix getting unit from AstroData for Spek1D

b430235

Update unit in ADUToElectrons

dcf3ae2

Set default unit for .append and fix unit tests

90b9688

Fix test_stack

16cbd34

Fix test_spectral

7ec18cd

Use AstroFaker branch

e2f48c6

saimn added 10 commits March 16, 2021 18:36

Add unit argument for .append

513d0ad

Revert to use master branch of AstroFaker

a146920

Add test for ADUToElectrons

d9797d4

Add .unit attribute for AstroData

ffd90ec

Use .unit where possible

6a1c010

Add comment when setting BUNIT

20d6b43

Test that BUNIT is correctly written after ADUToElectrons

b99352c

Fix reading file

aadc1b3

Fix fluxCalibrate to have correct unit for uncertainty

393e938

Avoid running test twice

c72a7c2

saimn marked this pull request as ready for review March 17, 2021 22:15

Allow to set a new NDData

ae59e96
