1.1 ref sim ci workflow #514
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Coverage Diff (main vs. #514):
  Coverage   100.00%   100.00%
  Files           10        10
  Lines         2024      2024
  Hits          2024      2024

View full report in Codecov by Sentry.
This is looking great, thank you @burdorfmitchell!
Implemented a new CI workflow that regression-tests and benchmarks the less intensive reference simulations using pytest and pytest-benchmark. The historical simulation output is stored on the Brown Digital Repository. Benchmarking data is stored on the gh-pages branch.
Description
Added to src/pyuvsim/data the files needed to run many of the first-generation reference simulations (an illustrative obsparam sketch follows the list):
src/pyuvsim/data/baseline_lite.csv
src/pyuvsim/data/bl_lite_gauss.yaml
src/pyuvsim/data/bl_lite_uniform.yaml
src/pyuvsim/data/mwa88_nocore_config_MWA.yaml
src/pyuvsim/data/mwa88_nocore_config_gauss.yaml
src/pyuvsim/data/test_catalogs/letter_R_12pt_2458098.38824015.txt
src/pyuvsim/data/test_catalogs/mock_catalog_heratext_2458098.38824015.txt
src/pyuvsim/data/test_catalogs/two_distant_points_2458098.38824015.txt
src/pyuvsim/data/test_config/obsparam_ref_1.1_gauss.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.1_mwa.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.1_uniform.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.2_gauss.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.2_uniform.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.3_gauss.yaml
src/pyuvsim/data/test_config/obsparam_ref_1.3_uniform.yaml
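For orientation, a pyuvsim obsparam file ties together a telescope layout, a telescope config, and a catalog. The following is a minimal sketch in the spirit of obsparam_ref_1.1_uniform.yaml, not the actual file from this PR; all values and relative paths are illustrative placeholders:

```yaml
filing:
  outdir: '.'
  outfile_name: 'ref_1.1_uniform'   # placeholder output name
  output_format: 'uvh5'
freq:
  Nfreqs: 1
  channel_width: 80000.0            # Hz
  start_freq: 100000000.0           # Hz
sources:
  catalog: 'test_catalogs/two_distant_points_2458098.38824015.txt'
telescope:
  array_layout: 'baseline_lite.csv'
  telescope_config_name: 'bl_lite_uniform.yaml'
time:
  Ntimes: 1
  integration_time: 11.0            # seconds
  start_time: 2458098.38824015      # JD, matching the catalog epoch
```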
These files are used to run the reference simulations via the test_run_sim pytest method in the new tests/test_run_ref.py, which parametrizes a command-line input to select which reference simulation to run. The method also benchmarks the reference simulation and downloads older reference simulation output from the Brown Digital Repository for regression testing.
The pytest method is called from the command line by a new CI workflow, currently named "Run Compare Post Benchmark", which runs each reference simulation on a single core in parallel (subject to the number of available GitHub runners). The benchmark output for each reference simulation is stored as an artifact. Another job in the workflow is then run, which loads all the stored artifacts from the current workflow, concatenates the benchmarking data into a single file, and then runs github-action-benchmark.
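Schematically, the per-simulation job looks something like the following. This is a sketch, not the actual workflow file: the job and matrix names are invented, and the --refsim option stands in for whatever command-line input the test actually parametrizes.

```yaml
jobs:
  benchmark:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # one entry per reference simulation (placeholder names)
        refsim: [ref_1.1_uniform, ref_1.1_gauss, ref_1.1_mwa]
    steps:
      - uses: actions/checkout@v4
      # (environment setup steps omitted)
      - name: Run and benchmark one reference simulation
        run: |
          pytest tests/test_run_ref.py --refsim=${{ matrix.refsim }} \
            --benchmark-only --benchmark-json=output_${{ matrix.refsim }}.json
      - name: Save the benchmark output for the aggregation job
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-${{ matrix.refsim }}
          path: output_${{ matrix.refsim }}.json
```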
The github-action-benchmark action loads previous benchmark data from gh-pages, compares it with the current benchmark output, and raises an alert if a performance regression has occurred. The action then pushes the latest benchmark data to gh-pages only if the trigger is a push to main.
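The steps of the aggregation job would then look roughly like this; again a sketch, with a placeholder alert threshold and a hypothetical concatenation helper script:

```yaml
      - name: Download the per-simulation benchmark artifacts
        uses: actions/download-artifact@v4
      - name: Concatenate the benchmark JSON files
        run: python ci/concat_benchmarks.py --output output.json   # hypothetical helper
      - name: Compare against the benchmark history on gh-pages
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'pytest'                   # pytest-benchmark JSON format
          output-file-path: output.json
          github-token: ${{ secrets.GITHUB_TOKEN }}
          alert-threshold: '200%'          # alert if a benchmark slows down by 2x (placeholder)
          comment-on-alert: true
          # only update the stored history on a push to main
          auto-push: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}
```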
Two new dependencies, pytest-benchmark and requests, have been added to environment.yaml and mentioned in the README. The docs have not been updated, but the README clarifies that serious benchmarking should be done via the benchmarking directory and that this workflow is a single-core regression test.
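In environment.yaml this amounts to something like the following (a sketch; the surrounding entries and any version pins are omitted):

```yaml
dependencies:
  # ...existing dependencies...
  - pytest-benchmark   # provides the benchmark fixture used in tests/test_run_ref.py
  - requests           # downloads historical output from the Brown Digital Repository
```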
Currently the CI workflow runs on push and pull request, because the workflow_run webhook event did not seem to trigger appropriately; after merging with main it is likely this can be switched to running after a successful completion of the Tests workflow. There appear to be roughly enough available runners to manage either way.
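If that switch is made, the trigger would look something like this (a sketch, assuming the existing test workflow is named "Tests"):

```yaml
on:
  workflow_run:
    workflows: ["Tests"]
    types: [completed]

jobs:
  benchmark:
    # only proceed when the Tests workflow actually succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    # ...steps as above...
```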
Motivation and Context
The reference simulations have had changes in output over time -- to the extent that some simulations stopped passing an equality check with their own older output. As the output change occurred over many years with minimal oversight, a significant amount of effort was required to confirm the physical consistency of pyuvsim simulations. By incorporating basic regression testing of the reference simulations into the ci workflow, we can identify any time a change in simulation output occurs, even if the cause of the change is external to pyuvsim. We can additionally maintain code performance through benchmarking.