Metadata caching runs slow (specially at Utah) #123

Closed
ajmejia opened this issue Jul 25, 2024 · 5 comments · Fixed by #146

Comments

@ajmejia
Contributor

ajmejia commented Jul 25, 2024

Caching the header metadata is slow in general, but it gets worse when header fixes are applied. At Utah this can take ~1 min per camera frame.

@ajmejia
Contributor Author

ajmejia commented Sep 4, 2024

Hi @havok2063, can you take a look at this one? I think the bottleneck is around the call to apply_hdrfix here:

https://github.com/sdss/lvmdrp/blob/master/python/lvmdrp/utils/metadata.py#L579

I don't think we need to read the header fixes every time, especially if we have already passed in a header object.

@havok2063
Collaborator

If there are fixes to the header that need to be applied, then the header fix file does need to be read in every time we run the reductions. The header fix needs to be applied once at the beginning of the reduction, to the header of the raw sdR file, before the header gets propagated down to the other products.

Because you're also extracting raw sdR header information into the raw_metadata file, which we also use in the pipeline, the header fix needs to be applied here as well.

@ndrory
Contributor

ndrory commented Sep 4, 2024 via email

@havok2063
Collaborator

Then can you provide more context? Which MJDs are you testing with? Is it the same over every MJD? Does the number of raw exposures taken in a given MJD night make a difference? Is it this slow for MJDs that don't have header fix files but do have a large number of raw frames? This will help determine what kind of solution we might need.

@ajmejia
Contributor Author

ajmejia commented Sep 5, 2024

This happens with all MJDs that have header fixes. As a test, I turned off the header fixes and metadata caching went from ~40 s/frame to ~5 frames/s.

I think a possible solution would be to change how we apply the header fixes during metadata caching. Instead of reading and resolving header fixes for each camera frame header, we could read the header fixes for the target MJD once, resolve all the camera frames that need fixes, and feed that as input to the metadata caching routine. This way the header fixes are read only once per MJD, which is more compatible with the "per MJD" design of the HdrFix files.
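
A minimal sketch of that per-MJD approach, to make the idea concrete. Everything here is a hypothetical stand-in rather than the lvmdrp API: the function names (read_hdrfix, resolve_fixes, cache_metadata), the pattern/keyword/value layout of a fix row, and the in-memory table used in place of real HdrFix files are all assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the per-MJD HdrFix files (assumed layout:
# a frame-name pattern plus the keyword/value to override).
HDRFIX_TABLE = {
    60255: [
        {"pattern": "sdR-s-*-00001234", "keyword": "IMAGETYP", "value": "flat"},
    ],
}

# Counts reads so it is easy to check that fixes are loaded once per MJD.
READ_CALLS = {"n": 0}


def read_hdrfix(mjd):
    """Hypothetical reader: return the list of fix rows for one MJD."""
    READ_CALLS["n"] += 1
    return HDRFIX_TABLE.get(mjd, [])


def resolve_fixes(fix_rows, frame_names):
    """Map each frame name to the {keyword: value} overrides that match it."""
    resolved = {}
    for name in frame_names:
        for row in fix_rows:
            if fnmatchcase(name, row["pattern"]):
                resolved.setdefault(name, {})[row["keyword"]] = row["value"]
    return resolved


def cache_metadata(mjd, headers):
    """Build metadata rows for all frames of one MJD.

    `headers` maps frame name -> header (a plain dict in this sketch).
    Fixes are read and resolved once, then applied to a copy of each header.
    """
    fixes = resolve_fixes(read_hdrfix(mjd), headers)
    return [
        {"frame": name, **header, **fixes.get(name, {})}
        for name, header in headers.items()
    ]


# Example: three camera frames, two of which match the fix pattern.
headers = {
    "sdR-s-b1-00001234": {"IMAGETYP": "object", "EXPTIME": 900.0},
    "sdR-s-r1-00001234": {"IMAGETYP": "object", "EXPTIME": 900.0},
    "sdR-s-b1-00001235": {"IMAGETYP": "object", "EXPTIME": 900.0},
}
rows = cache_metadata(60255, headers)
```

The point of the sketch is only the call pattern: one read and one resolve step per MJD, followed by cheap dictionary lookups per frame, instead of a file read per camera frame.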

We could do something similar when applying header fixes to reductions. I suspect it is not critical for single-MJD reductions, but for long reductions this optimization could make a difference in speed.

