
Add row-by-row extraction and CTE correction modules #2144

Merged — 9 commits merged into main on Jan 9, 2024

Conversation

@schlafly (Contributor) commented Dec 1, 2023

This PR adds two components to the pipeline:

  • an "optimal" row-by-row extraction routine
  • a CTE correction routine

These are connected because the CTE correction routine uses the row-by-row extraction to model images. Spectroperfectionism would be better but is slower (?); the extraction runs in a loop ~5-10 times for each CTE-affected frame, so speed was desirable. I also experimented with the simpler aperture qproc.extract routine, which was plenty fast but did a poor job of preventing light from neighboring spectra from contaminating target spectra. That was particularly a problem on bright tiles, where we want good (faint) sky spectra but the target spectra are rather bright. We settled on row-by-row extraction as a compromise.

The main CTE correction algorithm looks like this:

  • construct an image model (extract 1D spectra from 2D frame I'; filter & compute sky; project back to 2D) to get M(I')
  • apply CTE to get CTE(M(I'))
  • Improve the 2D image: I' <- I - (CTE(M(I')) - M(I')), where I is the original observed image
  • Repeat.
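The loop above can be sketched with toy stand-ins for the CTE and image-model steps (the function names and functional forms here are illustrative only, not the PR's actual API): the key point is that the distortion is estimated from a smooth model of the image, so read noise does not leak into the correction.

```python
import numpy as np

def toy_cte(img, amplitude=0.2):
    # Toy stand-in for the real trap model: each pixel passes a fixed
    # fraction of its charge one pixel further along the serial register.
    img = np.asarray(img, dtype=float)
    leaked = amplitude * img[:, :-1]
    out = img.copy()
    out[:, :-1] -= leaked
    out[:, 1:] += leaked
    return out

def toy_model(img):
    # Stand-in for the image model M: fit a quadratic to each row, playing
    # the role of "extract 1D spectra, filter, project back to 2D".
    x = np.arange(img.shape[1])
    out = np.empty_like(img, dtype=float)
    for i, row in enumerate(img):
        out[i] = np.polyval(np.polyfit(x, row, 2), x)
    return out

def correct_via_model(observed, niter=5):
    # The iteration from the PR description: estimate the CTE distortion
    # from a *model* of the current image, subtract it from the observed
    # image, and repeat until the corrected image converges.
    corrected = observed.copy()
    for _ in range(niter):
        model = toy_model(corrected)
        corrected = observed - (toy_cte(model) - model)
    return corrected
```

At the fixed point, CTE applied to the model of the corrected image reproduces the observed image, which is the defining property of a good correction.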

The real challenges here are getting a good model of what CTE does to an image, and correcting for it in a way that doesn't correlate the read noise. With Nicolas Regnault's help we have a pretty decent model (though we could study this more...), and the row-by-row extraction modeling handles the second part well enough.

Here are some QA comparisons. The following PDFs have interleaved daily & this-PR-on-z1-only reductions of a lot of exposures from 20230825 - 20230906. The first in each pair of pages is daily, the second is the new reduction.
https://data.desi.lbl.gov/desi/users/schlafly/misc/schlafly-cte-5-tileqa.pdf
https://data.desi.lbl.gov/desi/users/schlafly/misc/schlafly-cte-5-skyzfiber.pdf
To drill down more, look at the nightqa in
https://data.desi.lbl.gov/desi/spectro/redux/schlafly-cte-5/nightqa/
But here's at least one comparison plot to show that when things are bad, the new code makes a big improvement:
Daily:
[image]
New:
[image]
(only z1 has the new code active)

And finally, one plot to show that we're not increasing the noise.
[image]
This shows the median absolute deviation, converted to an equivalent standard deviation, of the sky fibers in all of the exposures I've run, as a function of exposure index, for z1, in the CTE-affected region only. The new code is slightly better than the old code. I think this understates the improvement---most of the issues in daily are on sky lines or individual fibers, which the clipping does a good job of removing---but it at least says that things aren't worse.
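The robust scatter statistic described here is presumably the usual MAD scaled by the Gaussian consistency factor; a minimal sketch (my own helper, not a function from the PR):

```python
import numpy as np

def mad_std(x):
    # Median absolute deviation scaled to match the standard deviation
    # for Gaussian noise: 1 / Phi^-1(3/4) ~= 1.4826.
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))
```

Unlike a plain standard deviation, this statistic is insensitive to the sky-line and single-fiber outliers mentioned above.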

The optimal row-by-row extraction routine is intended to be the usual thing:

  • we have some traces and a PSF
  • we get the PSF value in each pixel (rowbyrowextract.pgh; copy of version in specter.psf.gausshermite, but with some broadcasting changes)
  • we get the 1D cross-dispersion profile in each pixel for each trace (rowbyrowextract.onedprofile_gh)
  • we model each spectral bundle / CCD row by linear least squares (build_atcinva, build_atcinvb, extract).
  • that's it?
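For one CCD row, the least-squares step above amounts to solving the weighted normal equations (A^T C^-1 A) f = A^T C^-1 b, mirroring build_atcinva / build_atcinvb / extract. Here is a self-contained sketch with a plain Gaussian cross-dispersion profile standing in for onedprofile_gh (the profile shape and function names are illustrative assumptions):

```python
import numpy as np

def onedprofile(centers, npix, sigma=1.2):
    # Hypothetical stand-in for rowbyrowextract.onedprofile_gh: a Gaussian
    # cross-dispersion profile for each trace, normalized to sum to 1.
    x = np.arange(npix)
    prof = np.exp(-0.5 * ((x[None, :] - np.asarray(centers, dtype=float)[:, None]) / sigma) ** 2)
    return prof / prof.sum(axis=1, keepdims=True)

def extract_row(pixels, ivar, centers):
    # Solve (A^T C^-1 A) f = A^T C^-1 b for the fluxes f in one row.
    pixels = np.asarray(pixels, dtype=float)
    ivar = np.asarray(ivar, dtype=float)
    # Regularize fully masked pixels (ivar = 0) with a tiny weight so the
    # normal-equations matrix stays non-degenerate.
    ivar = np.where(ivar == 0, 1e-10, ivar)
    A = onedprofile(centers, len(pixels)).T       # npix x ntrace design matrix
    atcinva = A.T @ (ivar[:, None] * A)           # A^T C^-1 A
    atcinvb = A.T @ (ivar * pixels)               # A^T C^-1 b
    flux = np.linalg.solve(atcinva, atcinvb)
    return flux, atcinva                          # atcinva is the inverse covariance
```

Because neighboring profiles overlap, the solve deblends adjacent traces jointly, which is exactly what the simple aperture extraction failed to do.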

Maybe a few things worth discussing in the extraction:

  • I worked pretty hard to make this fast. Most of the time is now in the PSF creation (onedprofile_gh) and matrix solution (linalg.solve, linalg.inv(atcinva)). It's possible there are more improvements there, but I didn't immediately find them. Extracting 500 spectra on a frame takes ~10 s.
  • I got annoyed about degenerate matrices and replaced all of the ivar = 0 pixels with ivar = 10^-10. This means that I get an answer for every pixel, though the errors will be very large for fully masked pixels. There's probably something I should be doing about flagging those very uncertain masked pixels.
  • It looks to me like we could work harder to get the traces perfectly centered---imperfect centering is the feature that jumps out in the 2D residuals---but it doesn't make a big difference. I'm just assuming here that the traces have already been perfectly centered.

The CTE correction module does a few different CTE-related things.

Applying CTE to images:

  • apply_multiple_cte_effects: loop through the possible multiple traps on each amp, apply them in the right order to the right pixels. Currently we only have at most one per amp, but...
  • add_cte: adds CTE to the whole image it's given
  • simplified_regnault: this is the function that I actually ended up using that determines how electrons flow into and out of the trap according to how many electrons are currently in the trap, how many are in the pixel coming through, and the trap parameters.
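The exact functional form lives in simplified_regnault; as a loose illustration only (the equilibrium law, parameter names, and rate equation below are my assumptions, not the PR's model), a trap with a given amplitude and leakiness might behave like:

```python
import numpy as np

def toy_trap_transfer(pixval, trapped, amplitude, leakiness):
    # Illustrative only; the real form is desispec.correct_cte.simplified_regnault.
    # The trap's equilibrium occupancy grows with the charge passing through,
    # saturating at `amplitude`; the occupancy relaxes toward that equilibrium
    # at a rate set by `leakiness`. Charge is conserved: whatever the trap
    # captures is removed from the pixel, and whatever it releases is added.
    target = amplitude * pixval / (amplitude + np.maximum(pixval, 0.0))
    new_trapped = trapped + leakiness * (target - trapped)
    out = pixval - (new_trapped - trapped)
    return out, new_trapped
```

Applied pixel by pixel in the read-out direction, a transfer function like this produces the characteristic CTE trailing of charge behind bright features.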

Correcting CTE:

  • correct_amp: does the core loop: adding CTE to an image, treating the difference with respect to the original image as a CTE correction to be applied, and iterating. This does not use a model image and so correlates the noise; we don't actually use this mode in practice.
  • correct_image: This loops over correct_amp for each amplifier, and also flips the read-out direction so that correct_amp doesn't need to think about this.
  • get_image_model: gets the model for an image used in the image-model based CTE correction. This version uses the old aperture-extraction based approach.
  • get_rowbyrow_image_model: gets the model for an image used in the image-model based CTE correction. This calls the row-by-row extraction routine after updating the traces. It then flattens, computes and subtracts a sky model, median filters the sky-subtracted fluxes, and constructs model extracted fluxes as sky + median filtered fluxes times the flat field. These model spectra are then converted back into a 2D image via rowbyrowextract.model.
  • correct_image_via_model: this does the new CTE correction that uses an image model.

Fitting CTE parameters:

  • chi_simplified_regnault: computes chi for a CTE model from some flat field images. Assumes we have some clean traces and some CTE-affected traces and that the two should match when the right model is applied to the clean traces.
  • fit_cte: calls chi_simplified_regnault using scipy's nonlinear least-squares routines, and sets up the appropriate clean and CTE-affected traces to compare. This only works if we don't have CTE affecting both an amp and its mirror amp. I'm sure it won't work for two-amp mode, etc.
  • get_cte_images: tries to grab the images we need for fitting CTE for a particular night. Right now, grabs the 1s flat and the previous image.
  • fit_cte_night: fits the CTE parameters for a particular camera and night.
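The fitting step can be sketched as follows, with a toy single-trap model and scipy.optimize.least_squares standing in for the real chi_simplified_regnault / fit_cte machinery (all names and the trap law here are illustrative assumptions): push the clean trace through the trap model and adjust the parameters until it matches the CTE-affected trace.

```python
import numpy as np
from scipy.optimize import least_squares

def apply_trap(row, amplitude, leakiness):
    # Push charge through a single toy trap, pixel by pixel, in the
    # read-out direction (illustration only, not the PR's exact model).
    out = np.empty_like(row, dtype=float)
    trapped = 0.0
    for i, pix in enumerate(row):
        target = amplitude * pix / (amplitude + max(pix, 0.0))
        new_trapped = trapped + leakiness * (target - trapped)
        out[i] = pix - (new_trapped - trapped)
        trapped = new_trapped
    return out

def fit_trap(clean, affected, p0=(60.0, 0.25)):
    # Find (amplitude, leakiness) so the clean trace, pushed through the
    # trap model, matches the CTE-affected trace.
    def chi(params):
        return apply_trap(clean, params[0], params[1]) - affected
    return least_squares(chi, p0, bounds=([0.0, 0.0], [np.inf, 1.0]))
```

This mirrors the assumption stated above: a clean trace and a CTE-affected trace should agree once the right trap model is applied to the clean one.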

Metadata:

  • get_amps_and_cte: looks up the amp definitions, the CTE trap locations, and the CTE parameters for each trap. This should someday consult the calibrations derived for the particular day, but right now it looks the information up in a hard-coded dictionary.
  • ctefun_dict: the hard-coded dictionary in which the CTE transfer-function information is currently stored. This should eventually be deleted.

@schlafly (Contributor, Author) commented Dec 4, 2023

Maybe one more bit of code that could be useful but isn't in the PR proper... this is the hack I used to call my CTE correction code on existing preproc files.

I don't think we really want this---we should build it into the preproc code---but this is the easiest way to run the current code on an existing preproc image.

import desispec.io
import desispec.io.util
from desispec.io import read_fibermap
from desispec import correct_cte

def process(x):
    im = desispec.io.read_image('/global/homes/s/schlafly/desiproject/spectro/redux/daily/preproc/'+x)
    if im.meta['OBSTYPE'] != 'SCIENCE':
        return
    fmap = read_fibermap('/global/homes/s/schlafly/desiproject/spectro/redux/daily/preproc/'+x)
    desispec.io.util.addkeys(fmap.meta, im.meta, skipkeys=('BZERO', 'BSCALE'))
    im.fibermap = fmap
    newim = correct_cte.correct_image_via_model(im, niter=5)
    desispec.io.write_image(x, newim)

@schlafly (Contributor, Author) commented Dec 5, 2023

Here are a couple of example images showing how the trap parameters have evolved with time, on z1 and z3.

Note that when the trap amplitudes are small (e.g., less than ~50), the existing corrections are adequate and we probably don't need to turn on this special mode. Moreover, the uncertainty in the leakiness isn't a big deal as the amplitude gets lower, since the trap is just not trapping as many electrons and the correction becomes quite small.

[images: fitted trap-parameter trends vs. time for z1 and z3]

Stephen expressed some concern about the single point on z1 with a highly outlying amplitude of ~40. I haven't looked at that one case, but mostly I'm amazed at how well this worked given the simple scheme of just grabbing one 1 s flat and the previous image for each night. As a reminder, a lot can go wrong with the cals. It may make sense to add the fit parameters to nightly QA, or to compare them to the previous night's values, to help us identify when something has gone wrong with the nightly cals. I don't know what the usual mechanisms are there.

There is also room to be concerned about the higher chi^2 / dof on pre-wildfire nights on z1. The suggestion is that the CTE looked different then and the existing model can't fit it well. I don't think we should apply the correction to that data and can certainly imagine that we'll need to develop new prescriptions in the future to address them. But the existing large CTE effects on z1, z3, r6 that are poorly corrected are all well modeled by this parameterization.

@schlafly (Contributor, Author) commented:
Stephen asked me to diagnose the one night where z1 shows dramatically lower CTE than usual, night index ~550 or so. It turns out that night is 20230822, the night when we experimented with different voltage settings on z1. We ended up not keeping the new voltages because they increased the read noise, but they did dramatically improve the CTE. That is, this point is not an outlier but a reflection of a real change in the CTE on that night.

As part of diagnosing this I computed appropriate CTE model parameters for all of the traps I know about through today and stored them here:
https://data.desi.lbl.gov/desi/users/schlafly/cteparams/cteparamsall.fits
These are pretty fast to compute; doing all nights from 2022/1 through today took O(15 min) on one interactive node. There's a corresponding PDF here:
https://data.desi.lbl.gov/desi/users/schlafly/cteparams/cteparamtrends.pdf

It looks like at least one of the outliers on z5 is due to a particularly vertical CR. I should be able to do better with the masking there.

@julienguy (Contributor) left a review:

  • Ran a successful test (see other comment)
  • The code in this PR is just the functions, so we have to work on scripts, calibration data I/O, and integration in the pipeline. This will be done in another PR. I will look into it.
  • 2 requests:
    • please remove the hardcoded z1 CTE parameters from the library and have the parameter dictionary as an argument to the correction function
    • please add a unit test for the row-by-row Horne extraction function

@julienguy (Contributor) commented:

Test report:
I ran the CTE parameter fit on the night 20231111 for z1 and z3.

export SPECPROD=daily

then in python

from desispec.correct_cte import fit_cte_night
fit_cte_night(20231111,"z1")
# {'C': {'543:2057': (array([139.49198384,   0.20785951]), 0.5409279496512583)}}
fit_cte_night(20231111,"z3")
# {'C': {'1668:2057': (array([91.33659997,  0.22315425]), 1.0845442585069123)}}

I edited the library code to add those parameters in order to run the correction, then wrote a script with the following lines and ran it on exposure NIGHT=20231111 EXPID=00204371, which is a 1 s LED exposure.

# imports for the snippet; `args` comes from the script's argparse namespace
from desispec.io import read_image, read_fibermap, write_image
from desispec.io.util import addkeys
from desispec.correct_cte import correct_image_via_model

im = read_image(args.infile)
fmap = read_fibermap(args.infile)
addkeys(fmap.meta, im.meta, skipkeys=('BZERO', 'BSCALE'))
im.fibermap = fmap
newim = correct_image_via_model(im, niter=args.niter)
write_image(args.outfile, newim)

z1 before:
preproc-z1-00204371-before
z1 after:
preproc-z1-00204371-after
z3 before:
preproc-z3-00204371-before
z3 after:
preproc-z3-00204371-after

The correction is almost perfect for z1 and pretty good for z3. We may need to add more calibration data for the modeling.

@schlafly (Contributor, Author) commented Jan 9, 2024

I have added some simple unit tests; take a look. They were a bit harder to write than I wanted because it doesn't look easy to construct a simple specter PSF object, so I used the test PSF object that specter itself uses for its tests. That object is rather more detailed than I wanted for unit tests; I would have been happy with straight traces, sum(profile) = 1, and no negative pixels in the profile, for example. Those details make sanity checks harder.

Of particular note is that the 1D profile in specter contains "tails" which are usually zero in real DESI data. I spent a while not realizing that my model & extraction were inconsistent because I had used tails=True (the default) in onedprofile_gh and tails=False (the default) in extract, and so I was failing these tests and trying to figure out how I had screwed up so badly! This doesn't matter in real data since the tails are zero there, but it does matter in the test data. I don't think there are any real bugs here. It would be nice if the defaults for extract(...) and onedprofile_gh(...) matched---that would certainly minimize surprise---but on the other hand it feels like the default for onedprofile_gh should be to do everything and for extract(...) to do the efficient thing, so I haven't changed the behavior.

That said, here's what the checks do:

  • when the image is zero, the fluxes are zero.
  • the one-d profiles are not completely crazy (the center is more than the edges, the sums are near 1 in the median and don't go terribly far from 1).
  • We can make an image where the fluxes are all 1 at each row, extract it, and get fluxes that are all close to 1; i.e., extract(...) and model(...) are inverses.
  • We can make an image where the fluxes are 10 + random noise at each row, extract it, and get fluxes that are close to the injected random spectrum. i.e., extract and model are inverses for a somewhat more complicated case.
  • We can add noise and get a sensible reduced chi^2. I'm kind of surprised at how well that worked; I haven't tried to get the reduction in the degrees of freedom quite right.

Does that sound adequate?

@julienguy (Contributor) commented Jan 9, 2024

The test looks great. I would suggest encapsulating the functions in a class inheriting from unittest.TestCase, and adding the convenience block

if __name__ == '__main__':
    unittest.main()

as is done, for instance, in desispec/test/test_coadd.py.

Please also remove the hardcoded set of CTE params from the code and add them as an argument to the main function correct_image_via_model. I will then merge the PR and start from there to integrate this code into the pipeline processing.

@weaverba137 (Member) commented:

@julienguy, @schlafly, I'm actually going to suggest the opposite: do not add the extra if __name__... boilerplate to the test. That is a leftover from the days of python setup.py test. The current best practice is to run:

pytest py/desispec/test/test_rowbyrowextract.py

You can still optionally wrap the tests in unittest.TestCase though for consistency with other tests in the package.

@schlafly (Contributor, Author) commented Jan 9, 2024

In my own testing I ran just this module with Ben's invocation, and the unit tests do get run as written:
https://github.com/desihub/desispec/actions/runs/7463894705/job/20309610108?pr=2144#step:5:50
i.e., I can wrap it for consistency, but it's not clear to me that that buys anything.

I have a version locally that I haven't pushed yet that allows the arguments to be passed by hand but falls back to the dictionary, which we can delete later. If I totally delete it I won't be able to run nights anymore.

@schlafly (Contributor, Author) commented Jan 9, 2024

I went ahead and removed the hard-coded dictionary; I had forgotten that my current mechanism for running nights involved hacking the preproc files, so my objection wasn't relevant.

@julienguy julienguy self-requested a review January 9, 2024 19:29
@julienguy (Contributor) commented:

OK, thanks Ben. pytest py/desispec/test/test_rowbyrowextract.py is working fine. I see the hardcoded params are removed. I am going to merge after sorting out the documentation error.

@sbailey (Contributor) commented Jan 9, 2024

I think the failing documentation test is because the new modules haven't been added to doc/api.rst . I think the procedure is to run the following from the top-level desispec directory:

desi_api_file --overwrite

and then commit the updated doc/api.rst file.

@julienguy julienguy merged commit a05cbcc into main Jan 9, 2024
24 checks passed
@schlafly schlafly deleted the cte-modeling branch January 10, 2024 16:31
@sbailey sbailey removed their assignment Feb 6, 2024