Addressing Test Failures

Roscoe A. Bartlett edited this page Jan 16, 2023 · 8 revisions

Through various mechanisms, test failures (and, rarely, build failures) can land on the 'develop' branch and then show up in the Trilinos Pull Request (PR) builds, breaking many PRs that are tested against 'develop'. These failures must be addressed ASAP so that innocent, unrelated PRs can pass their builds and merge. (If the failures are not identified and addressed quickly, PRs back up and create a logjam that takes a long time and a lot of effort to clear.)

In all cases, the first step is to create a Trilinos GitHub Issue that lists the full names for the failing tests and what builds they are occurring in (e.g. Issue #10847). (This allows any Trilinos developer to search the GitHub issues to see if an issue already exists for the failure.)

Once the Trilinos GitHub issue has been created, the test failures can be addressed by one of the following processes:

A) Fix the failure (if the defect can be fixed quickly):

  1. The package team or some other Trilinos developer will fix the problem.
  2. Example: Issue #2247

B) Temporarily disable the failing code or test (if the defect is not critical or if it is a defective test):

  1. Add test disables <FullTestName>_DISABLE=ON to the Trilinos file packages/framework/ini-files/config-specs.ini under the appropriate section (e.g. see commit 6e315322af for PR #10871)
  2. Verify that the tests are properly disabled with a local configure on the targeted builds and machines (see Reproducing PR Testing Errors).
  3. Create a Trilinos GitHub PR (referencing the Trilinos GitHub Issue ID) for the new commits and get the PR reviewed and merged (e.g. PR #10871).
  4. Add a comment to the associated Trilinos GitHub issue about the test(s) being disabled and reference the Trilinos GitHub PR ID that disables the tests (e.g. commit 6e315322af from PR #10871).
  5. After the PR disabling the test(s) has been merged, add the GitHub label Disabled Tests to the associated Trilinos GitHub Issue.
  6. Do NOT close the issue!
  7. Someone offline reproduces and fixes the failing test(s) by configuring the impacted PR builds with -D<FullTestName>_DISABLE=OFF.
  8. One or more PRs are created to fix the failing test(s)
  9. After the test(s) are fixed, create a PR that removes the cache variable setting <FullTestName>_DISABLE=ON so the test(s) run again in PR builds.
  10. Remove the Disabled Tests label from the GitHub Issue
  11. Verify over a long enough time that the test is passing in future PR builds
  12. Close the Trilinos GitHub Issue
  13. Examples: Trilinos GitHub #2410, Trilinos GitHub #10847
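
Steps 1 and 7 of process B use the same naming pattern: the full test name plus the suffix _DISABLE, set to ON to disable and to OFF to re-enable in a local configure. The following is only a minimal sketch of that pattern. The test names are hypothetical placeholders, the helper functions are not part of Trilinos, and the exact syntax expected inside config-specs.ini is not shown here and should be copied from existing entries in that file.

```python
def disable_settings(test_names):
    """Return the '<FullTestName>_DISABLE=ON' settings named in step 1."""
    return [f"{name}_DISABLE=ON" for name in test_names]


def reenable_flags(test_names):
    """Return the '-D<FullTestName>_DISABLE=OFF' arguments named in step 7."""
    return [f"-D{name}_DISABLE=OFF" for name in test_names]


# Hypothetical full test names, for illustration only.
tests = ["PackageA_SomeTest_MPI_4", "PackageB_OtherTest_MPI_1"]

settings = disable_settings(tests)
flags = reenable_flags(tests)

# Print the settings to add (step 1) and the flags for a local configure (step 7).
print("\n".join(settings))
print(" ".join(flags))
```

The printed flags would be appended to the configure command of the impacted PR build when reproducing the failure locally.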

C) Back out the merge commit that caused the failures (if it is a build failure or some other critical defect):

  1. Confirm this is a significant failure that can't be fixed quickly or where a temporary disable of some tests will not address the problem in the short term.
  2. Figure out which merge commit(s) caused the new failures.
  3. Add a comment to the Trilinos GitHub Issue about the need to back out the merge commit and @mention the package team.
  4. Create a new branch, create revert commit(s) and test locally.
  5. Create a new Trilinos GitHub PR (referencing the Trilinos GitHub Issue ID) with new revert commits and get the PR reviewed and merged.
  6. Do NOT close the issue.
  7. Example: Trilinos GitHub #2650 and PR #2653
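
Step 4 of process C can be sketched as a short sequence of git commands. This is only an illustration under stated assumptions: the offending commits are merge commits whose first parent is the 'develop' mainline (hence `-m 1`), and the branch name and SHA below are hypothetical placeholders. The helper only builds and prints the commands; it does not run them.

```python
import shlex


def revert_commands(merge_shas, branch, base="develop"):
    """Build git commands that create a branch off `base` and revert merge commits.

    Pass `merge_shas` newest first so that later merges are reverted
    before the earlier ones they may depend on.
    """
    cmds = [["git", "checkout", "-b", branch, base]]
    for sha in merge_shas:
        # -m 1 keeps the first parent (assumed to be the mainline) and
        # undoes the changes brought in by the merged branch.
        cmds.append(["git", "revert", "--no-edit", "-m", "1", sha])
    return cmds


# Hypothetical branch name and merge commit SHA, for illustration only.
commands = revert_commands(["abc1234"], "revert-bad-merge")
for cmd in commands:
    print(shlex.join(cmd))
```

After running the printed commands by hand, build and test locally before opening the PR described in step 5.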