Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix missing zall TSNR2 columns #2368

Merged
merged 1 commit into from
Sep 11, 2024
Merged

fix missing zall TSNR2 columns #2368

merged 1 commit into from
Sep 11, 2024

Conversation

sbailey
Copy link
Contributor

@sbailey sbailey commented Sep 11, 2024

This PR fixes the missing Kibo zall TSNR2_LYA and TSNR2_QSO columns reported in #2363 . This is required for making the final kibo/zcatalogs/v1/zall*.fits files.

The underlying issue is that desispec.zcatalog.update_table_columns makes some assumptions about column order of the input files while trying to standardize the output column order. The input order changed in Kibo (for reasons I haven't investigated) which broke those assumptions.

This is a minimal fix to get things to work for Kibo while still being backwards compatible with Iron etc. This update is only slightly less fragile than the original code, but while wrapping up Kibo I'm also trying to make the minimal working change and save deeper fixes for later.

The change: instead of assuming that TNSR2_LRG is the last TSNR2 column, check for for the last column that starts with TSNR2.

Test outputs using this branch are in $CFS/desi/spectro/redux/kibo/zcatalog/v1/zall-*-v1a.fits . These have the missing columns TSNR2_QSO and TSNR2_LYA at the intended location (after the other TSNR2* columns, and before the *_NSPEC and *_PRIMARY columns). Due to the size of the files and Perlmutter being out today, I have not checked that every row of every other column is unchanged, but I did check the first million rows and they exactly match as expected.

Copy link
Member

@akremin akremin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, for the quick fix. It looks like this fixes the current issue. It does appear that this script has several baked-in assumptions about column ordering that could lead to more hiccups down the road. It may be better to e.g. use the new tsnr = np.where(np.char.startswith(np.array(tab.colnames), 'TSNR2_'))[0] to select all of the TSNR columns rather than just the last one. Other columns would have to be hard-coded though, which is presumably why that wasn't done initially.

This PR fixes the issue at hand, so I am happy to merge as-is, but I would appreciate a comment on the point above before we do so. That way our future selves have a better understanding of the choices.

I'm not worried about row changes. If the first million rows are identical and your code is only adding columns, I'm not sure how the rows could be impacted (unless there are NaN's, which would be a bigger problem than this change).

@sbailey
Copy link
Contributor Author

sbailey commented Sep 11, 2024

I agree that this PR still has fragile assumptions, especially about the location of the NUMOBS_INIT and PLATE_RA columns relative to the *_TARGET and TSNR2_* columns. I don't think it is worth further generalizing the TSNR2 columns without also addressing those issues. I opened #2369 to track this after this PR is merged.

@akremin akremin merged commit 37e454d into main Sep 11, 2024
26 checks passed
@akremin
Copy link
Member

akremin commented Sep 11, 2024

Thanks. We will address further robustness improvements in that PR. Merging this as good enough for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants