-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rerun Redrock with dynamic priors based on QN redshift #2298
Conversation
- Previously code implicitly computed the prior in two different places. This commit reduces it to 1 so we can calculate it directly from the QN grid.
- Previously stored a magic number, now calculate it from the QN model grid assuming logarithmic grid.
Thanks @dylanagreen. This PR looks good to me. @julienguy is also going to run it through his simulation study as a final check, then hopefully we can merge. My main take away is that expanding the priors to be the width of the QN bins is the right thing to do, but isn't going to recover us a bunch of good targets. Those specific cases you showed were mostly pretty for both of the before/after fits. |
sel_QSO_RR_with_no_z_pb[sel_QN] &= np.abs(redrock['Z'][sel_QN] - z_QN[sel_index_with_QN]) <= 0.05 | ||
log.info(f"Remove {sel_QSO_RR_with_no_z_pb[sel_QN].sum()} objetcs with SPECTYPE==QSO and |z_RR - z_QN| < 0.05 (since even with the prior, RR will give the same result)") | ||
prior = a * (z_QN + 1) # Analytic prior width from QN box width | ||
sel_QSO_RR_with_no_z_pb[sel_QN] &= np.abs(redrock['Z'][sel_QN] - z_QN[sel_index_with_QN]) <= (prior[sel_index_with_QN] / 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the prior width is the width of the wavelength bin used by QN to identify the lines, shouldn't the delta_z selection cut be the full prior width and not half of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the same convention here as was used in the previous version, which also cut on prior_width / 2 (when prior_width = 0.1).
Redrock accepts an input prior width and a central redshift value for the prior. The prior is then centered at the given center with prior_width / 2 on either side of the central redshift given, which is why I (and whoever wrote the previous version) cut on abs(z_rr - z_qn) <= prior_width / 2, since those are the bounds used by the redrock convention.
The prior width in this PR is calculated such that if the Z_QN value is exactly in the middle of a box, the prior width is the entire box, i.e. the prior extends to the edge of the QN box on either side of the redshift, so it is correct as far as I'm aware to cut on prior_width / 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it true that the prior_width should then be twice the possible range of redshifts defined by the logarithmic wavelength bin, because we don't want to discard solutions where for instance RR redshift is at the lower bound and QN redshift at the upper bound? I am asking because it seems that the scale factor 'a' should then be twice as large as the value used in the code (but I may be wrong).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose that is a possibility, but it seems to me to be unlikely. QuasarNET tends to clump estimates in the middle of the boxes, it is true, but the converse of that is that when QuasarNET is at the edge of the box it tends to be somewhat sure that it's actually there, or alternatively if a line is at the edge of the plot QuasarNET's most confident line will probably be a different line in the middle of that line's box.
Based on my confusion plot we can probably get to 2x prior width and still be ok relative to line confusions. Of the 1440 that got rerun in my initial sampling in this branch, only 3 of them have RR redshifts that lie at or slightly beyond the boundary of the current prior. Only 23 are within 3000 km/s of the either edge of the current prior boundary. The overwhelming majority of reruns find a redshift within the prior range.
It might be useful to compare to data we have redshifts on:
Based on this version of the prior width plot in redshift space rather than velocity space, and comparing it to the simulations you presented at desi-data would indicate that most of the points you plotted comparing the QN redshift to the true simulated redshift would fall within this new prior without the need to double it, although it's hard to say without including the actual data. Comparing the v1.3 weights to the validation dataset I have held out there there are some spectra to be gained outside the current prior_width (y axis bounds here set to 2*prior_width for comparison):
The gain going from current width -> width * 2 seems small, compared to the gain from main_width -> branch_width. Actually the new weights significantly improve on this aspect...
I could be convinced the double the prior width, but I don't see a compelling reason to. If there's a desire though I can double it and rerun the same 20 tiles as before and see what the results look like.
I tested this branch with the QSO sims from the Lyman-alpha WG. For QN v1.3 (latest version), the results are the same as with the main branch, but for the QN version used in Y1, it is an improvement, with 0.09% outliers (|delta z|>0.05) compared to 0.12% in the final catalog. The improvement is probably more significant with real data where there is more diversity in the quasar spectra. |
A few thoughts from the brink of vacation:
However, I remain confused about a more basic point: A logarithmic wavelength binning is a constant step in I am disappearing on vacation so won't re-engage with this for a bit, but please do double check if the logarithmic bins are indeed a constant On a purely pragmatic level, the simulation study from Julien does show the scatter constant in velocity (not |
@sbailey I can try and provide some clarity at least on what we use in the code. The equation I use in the code is derived in the following way, considering the edges of two adjacent bins as where I rewrite so that and evidently from logarithmic properties: Using log base 10, which is what QuasarNET uses. A constant logartihmic grid means that I've done a little abuse of notation to remove the subscript on the redshift used here, since the Incidentally, this rewrite of Equation (3) shows why the I think the power of a log-lambda grid is not that it's constantly spaced in
I probably have a misunderstanding of my own, since I'm not sure where this value comes from myself, so some clarity would be appreciated. |
Per discussion with @julienguy I've run main, branch, and branch (2x prior size) on the VI'd QSO tiles: 80605, 80607, and 80609. Here's a table of results, and a few related plots:
I include here a few examples of the 12 that differ (with the VI quality cut) with VI redshifts plotted in the chi^2 scan. If there's no VI redshift in the scan that means it was outside the range of any of the priors: It seems that overall not much really changes between the two in terms of triggered reruns, although 10% of the reruns triggered in this PR change when we double the width of the prior. In most cases these are comparably low SNR fits, although in the examples two of them recovers the VI redshift with the 2x width prior. |
Merging after deciding to double the width of the prior because we have two clear case (with large delta_chi2) where QN got it wrong by more than half the width and redrock got the correct solution. |
This PR addresses #2293 and updates the QuasarNET afterburner to run using a prior defined based on the QuasarNET redshift rather than a constant width prior across the entire redshift range.
Using the width of the QN output grid boxes in redshift space we determine that the width of the prior goes as$\Delta z = a (1 + z)$ . We can calculate the necessary value of $a$ from the QN output grid bin widths. The following snippet of code will calculate the width necessary for the new RR priors, assuming QuasarNET uses 13 output bins (which any and all weights files we've used thus far in DESI do) and a logarithmic scaled grid:
desispec/py/desispec/scripts/qsoqn.py
Lines 312 to 322 in 07bdb74
This value is about
0.0814
Note that this value differs from that fit and shown in the above linked issue slightly due to a bug in my plot related to which x coordinate I used to plot the box widths.For validation I have run 20 pseudorandomly chosen tiles from jura/cumulative:
25574 83024 24020 42016 24421 41937 8829 24633 41917 8981 20113 41761 11089 22610 3036 9378 7219 3543 42115 1000
and compared the results with the old and new priors, using the iron weights file for QN. In these 20 tiles I get the following statistics (excluding sky fibers, some of which did trigger a rerun):I've orgnized the data into two columns based on whether a rerun is triggered by the MTL confidence cut (C_MAX > 0.5) or the more restrictive QSO Catalog confidence cut (C_MAX > 0.95). Note that it is expected that N_RERUN_BRANCH$\leq$ N_RERUN_MAIN, because we do not rerun any spectra where the redrock redshift is already within the prior width because redrock would get the same result. In this case, all spectra that were rerun in main but not rerun in this branch fall within the updated prior widths so were not rerun in branch.
In summary ~2% of spectra trigger a rerun, and of those ~5-10% receive a different redshift with the new priors.
The following plot shows these TARGETIDS who received different redshifts due to the new priors, using the MTL confidence cut of C_MAX > 0.5:
For every orange point within the old prior there is a corresponding blue point outside the old prior with the new redshift.
In order to validate that this width does not make us susceptible to any line confusions I have produced a version of the above plot with the necessary prior width that would give some common Quasar line confusions validating that our prior width does not introduce the possibility of RR line confusions.
I have manually inspected the$\chi^2$ scans of some of these differences to validate that at least from a redrock perspective these changes in redshift seem reasonable. The following are a few examples validating that the new redshifts fall into $\chi^2$ minima that are either outside or at the edge of the old prior but are now captured accurately in the new prior. I include only a few examples, more can be provided on request.
Fits (C_MAX > 0.5)
Most of these are low SNR, for most of these it is clear that the new priors have a better minima in redrock but it's not necessarily clear that it's a better fit.
FIts (C_MAX > 0.95)
These tend to be higher SNR. These are only a few examples, some mixed and some a success for the new priors.