-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test candidates for replacing tiles-daily.csv and exposures-daily.fits #2345
Comments
PS, the patching code is currently in this file. |
I m poking around a bit with the files. maybe that is out of scope of this exercise here, I don t know. example of one check, which looks for replaced non-zero, finite values (for float/int keys only):
returns
for the
and from the data header, the 0.933 value looks right:
|
Ah, I neglected to mention: the only values that are replaced are NaN or empty. So any valid value is better than the existing value. |
thanks for that precision. |
That's unexpected. OK, I will take a look at that in detail when I get a chance. |
I think the issue comes from this line:
shouldn t be replaced only the 'masked' indexes here? (which can change for each column)
|
Another subtlety here (I am not sure if it matters or not, but I mention in case): it can happen that a column is masked in |
@weaverba137 : I ended up coding some modifications in
This is not a complete fix, but I share in case it is useful. I tried to:
If useful, I ve made a run of this script in that same folder; in particular, I ve dumped the prompted output in a log file ( |
And an aside-remark: if you re ok to bring in a
That |
@araichoor I think that you are correct that the replacement is happening over all rows instead of the daily rows with invalid values. That somehow got dropped when moving the code from my notebook. However, I won't have time to look at this in detail until next week. |
@araichoor I've implemented most of your suggestions, or otherwise found similar solutions, so please examine these new files:
|
thanks! I ve given a very quick look at the EXPOSURES extension of the exposures*fits files: this first sanity check looks good, ie changed values are maybe it s because of these lines: https://github.com/desihub/specprod-db/blob/0b3168e3d6e364f4781ab8c719391627a449bd52/py/specprodDB/patch.py#L219-L227?
returns
|
Thank you, the numbers in your table agree with the numbers logged by the patch script. @sbailey should comment on this, but yes, we decided that any remaining NaN, even after patching, should be set to zero. |
ok great, if that s expected; let s wait Stephen s confirmation here. now, if I run the same code snippet on the FRAMES extension of exposures*fits, I do see remaining
also, I notice that e.g. no
=>
wouldn t we run the same patching function for both extensions? |
The patching code still has vestiges of patching only tiles/exposures/frames that would ever be loaded into a database, rather than attempting to patch all possible tiles/exposures/frames. That's what we're seeing here. I'll take another pass at the frames. |
re-thanks. if relevant, in the case we end up replacing |
Again, vestiges... |
@araichoor, I've applied your suggestions to the FRAMES hdu, so please test with
|
Also note: |
@sbailey @akremin @araichoor could you please review these latest files with an eye toward making the final patch and substitution of the files this week:
|
@weaverba137 summary of findings: tiles-daily-patched-with-jura-20240913.csv
exposures-daily-patched-with-jura-20240913.fits
For both tiles and exposures
|
That's unexpected, I'll have to do more debugging.
Maybe they are NaN in Jura. Again more debugging.
Acknowledged. |
Also needs more debugging. Didn't want to forget that one. |
@sbailey, here's a quick analysis.
This one was easy, as it turns out. Tile 41685 is not in
For reasons I can no longer recall, I was not actually patching that column. I must have thought there were no NaN in that column. It may be a technical issue. Normally I would detect columns that contain NaN because the Table column has a
I'm still investigating this one, but I'm pretty sure that those exposures are not in |
@sbailey, I've fixed the two issues with the tiles file, and the issue with the exposures file seems to have gone away. Please take a look:
|
It looks like the SURVEY, PROGRAM patching was done for tiles but not exposures, leading to inconsistencies in these columns for the same TILEID. Admittedly, daily also has inconsistencies on these columns, but let's get that fixed now too while we're patching these files. For the record, production reruns like Kibo do not have inconsistencies on these columns for the tiles that they cover. Otherwise this looks good to proceed with making an updated patch and putting it into place, coordinated with @akremin . |
@sbailey I will have to run some tests to ensure that the value of PS, I'm @weaverba137. I know, it's weird. |
And good thing I checked, because I found inconsistencies between exposures and frames in both |
@sbailey the most recent SURVEY, PROGRAM issue is now fixed, please test:
|
With the updates 20241001 files, the updates to SURVEY, PROGRAM, FAPROGRAM, and FAFLAVOR look good, but GOALTYPE might be a mix, e.g.
there are still 1050 cmx and sv1 exposures entries with FAPRGRM, FAFLAVOR, or GOALTYPE = 'unknown'. @araichoor could you help identify what those should be?
Especially for GOALTYPE the value may not matter, but something like 'n/a' or 'na' (not applicable) or "none" could be better than "unknown". |
Is that in the tiles file or the corresponding exposures in the exposures file? |
PS, with the latest update, the goal was to enforce consistency, which is my interpretation of your previous request. This is done by first performing all the patching and then "back-patching" the exposures and frames to match the tiles file. The "back-patching" is done on these columns: |
My latest statements were about the exposures file, and I agree that they are now consistent with the tiles file. What I was raising with the "unknown" was a newly discovered issue that I hope we can fix (or otherwise decide that that some remaining unknowns are ok). |
@akremin @araichoor please see the comments above about GOALTYPE. In the hopes that this will be wrapped up relatively soon, @sbailey & I had a short offline discussion about what
where 20201214 is the earliest night in However, there are tiles which satisfy this condition that have the following anomaly: when you examine the corresponding exposures for the tile, they have To identify these, I used the following snippet: tiles = Table.read('tiles-daily-patched...csv'). # Adjust path as needed
exposures = Table.read('exposures-daily-patched...fits', hdu='EXPOSURES')
candidate_tiles = tiles[(tiles['LASTNIGHT'] >= 20201214) & (tiles['EFFTIME_SPEC'] > 0)]
for new_tile in candidate_tiles:
row_index = np.where((exposures['TILEID'] == new_tile['TILEID']) & (exposures['EFFTIME_SPEC'] > 0))[0]
if len(row_index) == 0:
print("ERROR: No valid exposures found for tile {0:d}, even though EFFTIME_SPEC == {1:f}!".format(new_tile['TILEID'], new_tile['EFFTIME_SPEC']))
bad_index = np.where((exposures['TILEID'] == new_tile['TILEID']))[0]
print(exposures[['TILEID', 'EXPID', 'NIGHT', 'SURVEY', 'PROGRAM', 'FAPRGRM', 'FAFLAVOR', 'EFFTIME_SPEC']][bad_index]) @sbailey & I agree on the following:
One option would be to:
In any case, we should at least try to understand what happened to create the anomaly and finalize a patching decision based on that. |
In fact, many, but not all, of the |
@sbailey, about: if I m correct, the goaltype keyword in the fiberassign header has been introduced through this PR merged on Mar. 22, 2021: desihub/fiberassign#319 so I guess there s no "correct" value here, it s just what do fill those up with. if informative, here s the
|
re- "tiles 83024 and 83004 went from GOALTYPE=bright/dark to other. That might be right, but @araichoor please check." btw, I notice that those "other" values are partially there in the daily, there in jura, but not in kibo, i.e. it is a bit messy: in daily:
in jura:
in kibo:
|
Summarizing the outstanding issues with action items:
Outstanding issue to understand:
|
Thanx @sbailey, I'll get started on those as soon as I can. |
Summarizing analog conversation with @akremin about the exposures vs. tiles EFFTIME_SPEC inconsistencies:
|
@sbailey, one point of clarification: the above doesn't cover the situation where all exposures have EFFTIME_SPEC = 0 even after patching with |
I think we should leave those as EFFTIME_SPEC=0 in exposures and EFFTIME_SPEC>0 in tiles, i.e. leave them as glaringly inconsistent pending further study. If/when we do update them again, the new tiles UPDATED column will help trigger a refresh of the DB. And after further inspection, @akremin recommends that "For 21273 and 1825 I think we want both to be EFFTIME=0" |
In that case I suggest we proceed as follows: for tiles in this situation, they will appear in the |
So it looks like my concern may be moot. After excluding tiles 1825, 21273, all other tiles can be at least partially patched with |
@sbailey, @araichoor, @akremin, I've implemented the latest round of suggested patches:
|
@weaverba137 thanks. This looks good to me. No further requests. For the record: there are still discrepancies where exposures sum(EFFTIME_SPEC) != tiles EFFTIME_SPEC which we are punting to study later(never?). I have checked and agree that we don't have any more cases where tiles EFFTIME_SPEC>0 but exposures sum(EFFTIME_SPEC)==0. They are always at least semi-close. I will leave it to @weaverba137 and @akremin to coordinate the final integration with dailyops: patch+install these into the actual daily prod, plus merge #2773 and update at NERSC so that the next update will include the UPDATED timestamp column. I suggest that you do this on Monday so that if there are any unexpected side-effects, you are debugging them during the work week not during the weekend. |
Yes, that agrees with my findings. If there are additional cases with this situation, they are prior to 20201214.
Monday would be fine with me. |
PS, and to be clear, we would re-patch the files to ensure that any tiles that came in over the weekend would be included. |
Perfect, thank you Ben. Yes, on Monday we would want to do the following:
|
Closed by #2373 and desihub/specprod-db#16. |
I've prepared some patched versions of the top-level
tiles-daily.csv
andexposures-daily.(fits|csv)
files.The patch data come from:
jura
, in the case of exposures and theFRAMES
HDU inexposures-daily.fits
.faflavor2program()
, in the case of tiles with a missingPROGRAM
.SURVEY
== 'unknown' to 'cmx', essentially by hand.Please take a look at:
${DESI_ROOT}/users/bweaver/exposures-daily-patched-with-jura-20240826.fits
${DESI_ROOT}/users/bweaver/tiles-daily-patched-with-jura-20240826.csv
For the best possible comparison, I made copies of
tiles-daily.csv
andexposures-daily.fits
at the time they were patched:${DESI_ROOT}/users/bweaver/exposures-daily-original-20240826.fits
${DESI_ROOT}/users/bweaver/tiles-daily-original-20240826.csv
The patching only takes ~1 minute, so we can iterate on any problems found. Then, when we're ready, we can choose a "quiet" time, possibly after
kibo
to permanently replace the top-leveldaily
files, which will then be appended to as usual, i.e.desi_tsnr_afterburner
.The text was updated successfully, but these errors were encountered: