Potential Performance Improvements for LST binning #913

Open
steven-murray opened this issue Sep 20, 2023 · 0 comments

This issue is just to keep a little log of performance for the lst binner as it stands, and to serve as a jumping-off point for future improvements, if necessary.

When run on redundantly-averaged H6C data (1760 baseline-pols), the lst_bin_files function takes 337 seconds (~5.6 min) per output file (with two 10-second LST bins per file).

The major breakdown of time is:

  • 1 min [~20%] -- reading the LST bin configuration YAML
  • 64 sec [~20%] -- reduce_lst_bins (i.e. doing the averaging itself)
  • 51 sec [~18%] -- HERAData.read() (no, not using the Fast reader yet)
  • 2 min [~40%] -- keyed_on_bls() (i.e. finding the unique baselines that exist in the data)

Of these, I think the LST bin configuration YAML reading and the unique baseline keying should be the simplest to trim down. I see no reason why either couldn't be made roughly negligible, which would let us ~halve the time of LST binning.
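As a rough check on the "~halve" estimate, the timings listed above can be tallied directly. This is only back-of-the-envelope arithmetic on the numbers quoted in this issue: the dictionary keys are descriptive labels rather than hera_cal API names, and it assumes the two steps in question could be reduced to effectively zero.

```python
# Timings (seconds) quoted above for the redundantly-averaged H6C case.
total = 337.0
stages = {
    "read LST-bin config YAML": 60.0,
    "reduce_lst_bins": 64.0,
    "HERAData.read": 51.0,
    "keyed_on_bls": 120.0,
}

# Assumption: the YAML read and the baseline keying can be made negligible.
trimmable = ["read LST-bin config YAML", "keyed_on_bls"]
saved = sum(stages[name] for name in trimmable)
remaining = total - saved

print(f"accounted for: {sum(stages.values()):.0f} s of {total:.0f} s")
print(f"potential saving: {saved:.0f} s ({saved / total:.0%} of total)")
print(f"remaining: {remaining:.0f} s")
```

On these figures the two steps account for 180 s of the 337 s, so removing them would leave roughly 157 s per output file. The listed stages sum to 295 s, so about 40 s of the total is not itemized in the breakdown.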

For non-redundantly-averaged H6C data, the lst_bin_files function takes 5150 seconds (~1.4 hours) per output file.
The major breakdown of time is:

  • 64% in lst_bin_files_for_baselines, of which
    • 73% in reading data (HERAData.read), of which
      • 50% in actual uvdata read
      • 20% in blt slice determination
      • 30% in build_datacontainers
    • 10% calibrate-in-place
    • 6% in lst-align
  • 32% in reduce_lst_bins, of which
    • 37% in lst_average
    • 60% in getting MED/MAD (which was supposed to be turned off...)
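To put the nested percentages in absolute terms, they can be multiplied down the tree. This is a minimal sketch using only the figures quoted above. The tuple layout and the `flatten` helper are illustrative, not part of hera_cal, and the resulting seconds inherit the rounding of the quoted percentages.

```python
TOTAL_SECONDS = 5150.0

# Each node is (label, fraction of parent's time, children).
tree = [
    ("lst_bin_files_for_baselines", 0.64, [
        ("HERAData.read", 0.73, [
            ("uvdata read", 0.50, []),
            ("blt slice determination", 0.20, []),
            ("build_datacontainers", 0.30, []),
        ]),
        ("calibrate-in-place", 0.10, []),
        ("lst-align", 0.06, []),
    ]),
    ("reduce_lst_bins", 0.32, [
        ("lst_average", 0.37, []),
        ("MED/MAD", 0.60, []),
    ]),
]


def flatten(nodes, parent_seconds, depth=0):
    """Yield (depth, label, seconds) for every node, depth-first."""
    for label, fraction, children in nodes:
        seconds = parent_seconds * fraction
        yield depth, label, seconds
        yield from flatten(children, seconds, depth + 1)


rows = list(flatten(tree, TOTAL_SECONDS))
for depth, label, seconds in rows:
    print(f"{'  ' * depth}{label}: {seconds:.0f} s")
```

Read this way, data reading comes to roughly 2400 s (about 47% of the total) and the MED/MAD step to roughly 990 s (about 19%), which suggests those two are the largest single targets in the non-redundantly-averaged case.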
steven-murray self-assigned this Sep 20, 2023
steven-murray added the "type: cpu-performance" (An enhancement to wall-time performance) label Sep 20, 2023