1.1 ref sim ci workflow #514
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Coverage Diff (main vs. #514):
  Coverage   100.00%   100.00%
  Files           10        10
  Lines         2024      2024
  Hits          2024      2024

View full report in Codecov by Sentry.
This is looking great, thank you @burdorfmitchell!
Implemented a new CI workflow that regression-tests and benchmarks the less intensive reference simulations using pytest and pytest-benchmark. The historical simulation output is stored on the Brown Digital Repository. Benchmarking data is stored on the gh-pages branch.
Description
Added to src/pyuvsim/data the files needed to run many of the first-generation reference simulations (an illustrative obsparam sketch follows the list):
src/pyuvsim/data/baseline_lite.csv
src/pyuvsim/data/bl_lite_gauss.yaml
src/pyuvsim/data/bl_lite_uniform.yaml
src/pyuvsim/data/mwa88_nocore_config_MWA.yaml
src/pyuvsim/data/mwa88_nocore_config_gauss.yaml
src/pyuvsim/data/test_catalogs/letter_R_12pt_2458098.38824015.txt
src/pyuvsim/data/test_catalogs/mock_catalog_heratext_2458098.38824015.txt
src/pyuvsim/data/test_catalogs/two_distant_points_2458098.38824015.txt
src/pyuvsim/data/test_config/obsparam_ref_1.1_gauss.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.1_mwa.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.1_uniform.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.2_gauss.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.2_uniform.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.3_gauss.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.3_uniform.yaml
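For orientation, a pyuvsim obsparam file ties together a telescope layout, a telescope config, and a catalog. The following is a minimal sketch in the spirit of obsparam_ref_1.1_uniform.yaml, not the actual file from this PR; all values and relative paths are illustrative placeholders:

```yaml
filing:
  outdir: '.'
  outfile_name: 'ref_1.1_uniform'   # placeholder output name
  output_format: 'uvh5'
freq:
  Nfreqs: 1
  channel_width: 80000.0            # Hz
  start_freq: 100000000.0           # Hz
sources:
  catalog: 'test_catalogs/two_distant_points_2458098.38824015.txt'
telescope:
  array_layout: 'baseline_lite.csv'
  telescope_config_name: 'bl_lite_uniform.yaml'
time:
  Ntimes: 1
  integration_time: 11.0            # seconds
  start_time: 2458098.38824015      # JD, matching the catalog epoch
```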
These files are used to run the reference simulations via the test_run_sim pytest method in the new tests/test_run_ref.py, which parametrizes a command-line input to select which reference simulation to run. The method also benchmarks the reference simulation and downloads older reference simulation output from the Brown Digital Repository for regression testing.
The pytest method is called from the command line by a new CI workflow, currently named "Run Compare Post Benchmark", which runs each reference simulation on a single core in parallel (subject to the number of available GitHub runners). The benchmark output for each reference simulation is stored as an artifact. Another job in the workflow is then run, which loads all the stored artifacts from the current workflow, concatenates the benchmarking data into a single file, and then runs github-action-benchmark.
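Schematically, the per-simulation job looks something like the following. This is a sketch, not the actual workflow file: the job and matrix names are invented, and the --refsim option stands in for whatever command-line input the test actually parametrizes.

```yaml
jobs:
  benchmark:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # one entry per reference simulation (placeholder names)
        refsim: [ref_1.1_uniform, ref_1.1_gauss, ref_1.1_mwa]
    steps:
      - uses: actions/checkout@v4
      # (environment setup steps omitted)
      - name: Run and benchmark one reference simulation
        run: |
          pytest tests/test_run_ref.py --refsim=${{ matrix.refsim }} \
            --benchmark-only --benchmark-json=output_${{ matrix.refsim }}.json
      - name: Save the benchmark output for the aggregation job
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-${{ matrix.refsim }}
          path: output_${{ matrix.refsim }}.json
```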
The github-action-benchmark action loads previous benchmark data from gh-pages, compares it with the current benchmark output, and raises an alert if a performance regression has occurred. The action then pushes the latest benchmark data to gh-pages only if the trigger is a push to main.
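The steps of the aggregation job would then look roughly like this; again a sketch, with a placeholder alert threshold and a hypothetical concatenation helper script:

```yaml
      - name: Download the per-simulation benchmark artifacts
        uses: actions/download-artifact@v4
      - name: Concatenate the benchmark JSON files
        run: python ci/concat_benchmarks.py --output output.json   # hypothetical helper
      - name: Compare against the benchmark history on gh-pages
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'pytest'                   # pytest-benchmark JSON format
          output-file-path: output.json
          github-token: ${{ secrets.GITHUB_TOKEN }}
          alert-threshold: '200%'          # alert if a benchmark slows down by 2x (placeholder)
          comment-on-alert: true
          # only update the stored history on a push to main
          auto-push: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}
```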
Two new dependencies, pytest-benchmark and requests, have been added to environment.yaml and mentioned in the README. The docs have not been updated, but the README clarifies that serious benchmarking should be done via the benchmarking directory and that this workflow is a single-core regression test.
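In environment.yaml this amounts to something like the following (a sketch; the surrounding entries and any version pins are omitted):

```yaml
dependencies:
  # ...existing dependencies...
  - pytest-benchmark   # provides the benchmark fixture used in tests/test_run_ref.py
  - requests           # downloads historical output from the Brown Digital Repository
```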
Currently the CI workflow runs on push and pull request, because the workflow_run webhook event did not seem to trigger appropriately; after merging with main it is likely this can be switched to running after a successful completion of the Tests workflow. There appear to be roughly enough available runners to manage either way.
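If that switch is made, the trigger would look something like this (a sketch, assuming the existing test workflow is named "Tests"):

```yaml
on:
  workflow_run:
    workflows: ["Tests"]
    types: [completed]

jobs:
  benchmark:
    # only proceed when the Tests workflow actually succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    # ...steps as above...
```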
Motivation and Context
The reference simulations have had changes in output over time -- to the extent that some simulations stopped passing an equality check with their own older output. As the output change occurred over many years with minimal oversight, a significant amount of effort was required to confirm the physical consistency of pyuvsim simulations. By incorporating basic regression testing of the reference simulations into the ci workflow, we can identify any time a change in simulation output occurs, even if the cause of the change is external to pyuvsim. We can additionally maintain code performance through benchmarking.