
The IDAES Check Suite

About this document

Goals

  • Addressing issues or questions commonly encountered by IDAES developers, such as:
    • What do I need to know about the IDAES check suite when I propose changes to the IDAES codebase?
    • One or more checks are failing. How should I proceed to make them pass?
  • Documenting the check suite, including:
    • The intended behavior of the check suite: this allows maintainers to detect possible issues with the check suite itself if the checks behave differently from what is described here
    • The decisions or tradeoffs motivating the design or implementation choices in the check suite

Structure

This document (somewhat loosely) follows the Diátaxis system for organizing documentation as one of four distinct forms:

  • Tutorials (learning-oriented)
  • How-to guides (task-oriented)
  • Reference (information-oriented)
  • Explanation (understanding-oriented)

Whenever possible, the Diátaxis category that is closest to the intended purpose of a section is indicated, along with the intended target audience for whom the information contained in that section is most likely to be useful or relevant.

Legend

| Symbol | Meaning |
| --- | --- |
| ✅ | Yes; True; OK |
| ❌ | No; False; not OK |
| ⚠️ | Yes with caveats; Warning |
| PR | Pull Request |
| CI | Continuous Integration |
| GHA | GitHub Actions |
| 🔎 | Detail (implementation or otherwise); can be safely skipped in most cases |
| ⚙️ | Note for check suite maintainers; can be skipped by most developers |
| ⚒️ | Work In Progress (WIP); may be subject to changes soon |
| ❓ | An open question or doubt while drafting this document |

The IDAES Check Suite for Continuous Integration

Explanation: Purpose of the IDAES check suite

The purpose of the IDAES check suite is to ensure that the IDAES codebase continues to behave as expected as it undergoes any type of change. Changes are typically explicit, such as modifications to the codebase submitted for consideration through a Pull Request (PR), but can also be implicit, e.g. a library on which IDAES depends being updated at some point in time.

Each individual check can be thought of as verifying that the terms of a contract are still upheld, by comparing the observed behavior of the codebase with the expected behavior described by that contract.

Tutorial: What does the check suite do during a pull request?

Target audience: developers

This tutorial gives a high-level view of what to expect from the IDAES check suite during the typical workflow of proposing changes to the IDAES codebase by submitting a pull request (PR) in the GitHub repository. Normally, the checks run without requiring manual intervention from the developer, who will mostly be concerned with making the necessary changes to the PR's code in response to failing checks and/or reviewers' comments.

The following assumes that a user myuser is opening a PR from a branch named myfeature on their fork, with the PR having the number 1234.

Creating a PR

  • Go to https://github.com/IDAES/idaes-pse/pulls and create a pull request, selecting myfeature as the head branch
  • As soon as the PR is created, the check suite will be triggered, and a subset of checks (comprising the Core checks) will be dispatched for running
    • The Core checks will also be run whenever the code in the PR's branch is updated (see below)
  • Wait for the checks to complete. Depending on the check, this should take from 5 up to about 20 minutes.

Ensuring that all of the Core checks are passing

  • If any of the Core checks fails, the failure will be displayed in the main PR view, under the "Checks" section.
    • Click on the details link to access the logs for that check run
    • At any time, the status of the latest run of the check suite for a PR can be accessed from the PR's main view under the "Checks" tab (at the top, immediately below the title)
  • If the probable cause for the check failing can be seen from the check run log, implement the necessary changes on your local clone of the myfeature branch, commit, and push to the myuser fork
  • Whenever a commit is pushed to the PR's head branch, the Core checks in the check suite will be triggered again
  • When all of the Core checks are passing, the PR is ready for review.

Requesting a review and implementing requested changes

  • Assign two or more reviewers using the "Reviewers" box in the top-right of the PR's main view
  • Each reviewer will submit a review, which will likely ask for changes to be made to the code being submitted in the PR. Once the changes are implemented, commit and push the changes to the myfeature branch as described above, which will once again cause the Core checks to be dispatched
  • When the PR has two or more approving reviews, it is considered approved by reviewers
  • Whenever a PR is approved, or the code in an already approved PR is updated by pushing to the myfeature branch, the remaining subset of checks (comprising the Extended checks) will be triggered

Ensuring that all of the checks pass

  • For a PR to be merged, it needs to pass all of the checks in both the Core and Extended categories
  • It might happen that a PR passes all of the Core checks, but fails one or more of the Extended checks. This is expected (and, indeed, the reason why the Extended checks exist in the first place)
  • If any of the Extended checks fail, follow the process described above for the Core checks and push the committed changes to the myfeature branch
  • Depending on the nature and amount of changes requested, existing reviews might be invalidated (or automatically "dismissed" in GitHub's terminology), requiring reviewers to submit a review for the latest version of the PR's code. If this happens, repeat the steps for requesting a review as described before
  • Once all of the checks in both the Core and Extended categories pass, the PR is ready to be merged

Reference: Available checks

Categories of checks

Throughout the lifetime of a PR, there are two main sets of checks that are run as part of the CI pipeline:

  • Core checks are run frequently, essentially whenever a PR is updated, by either:
    • Updating the code (by pushing to the PR's head branch); or
    • Updating the PR itself (for example, by converting a Draft PR to "Ready for review")
  • Extended checks are run after the PR is approved by reviewers
    • ⚠️ as of the first draft of this document (Jul 15, 2021), the Extended GHA workflow is still named Integration. This inconsistency is temporary and the workflow will be renamed soon in a dedicated PR so that all names match those used in this document
    • For both IDAES/idaes-pse and IDAES/examples-pse, a PR is considered approved if two or more reviews submitted by reviewers who have write access to the repository are present
    • After each valid PR event, a utility job checks if the PR has been approved by reviewers. If not, the rest of the workflow (responsible for running the actual checks) is skipped
    • 🔎 One exception to this is manual triggering: if the Run integration label is added to the PR, the Extended checks will run regardless of whether the PR has been approved
      • Normally, it shouldn't be necessary to perform this step manually as the automated triggers cover most of the desired cases already

An additional category is external checks, i.e. checks that are not run directly by the IDAES CI pipeline, but typically by third-party services in response to repository events

  • 🔎 As these checks are implemented as Software-as-a-Service (SaaS), it is generally not possible (or at least not practically feasible) to run them locally, or more generally outside of the context of a PR opened against one of the IDAES upstream repositories

Summary of checks

| Check | Required for PRs | Run by | GHA workflow | Can be run locally |
| --- | --- | --- | --- | --- |
| Core / pytest | ✅ | GHA | Core | ✅ |
| Core / pylint | ✅ | GHA | Core | ✅ |
| Core / build-docs | ✅ | GHA | Core | ✅ |
| readthedocs | | readthedocs.org | - | ❌ |
| coverage: patch | | codecov.io | Core (*) | ❌ |
| coverage: project | | codecov.io | Core (*) | ❌ |
| Extended / pytest-integration | ✅ | GHA | Extended | ✅ |
| Extended / Run examples | ✅ | GHA | Extended | ⚠️ |

(*): Triggered as part of a GitHub Actions workflow but run externally

Summary of events and conditions triggering check runs

| Event | Branch on which the checks are run | # of existing approving reviews | Core checks | Extended checks |
| --- | --- | --- | --- | --- |
| A new PR is created | PR head | - | ✅ | ❌ |
| A draft PR is marked as "Ready for review" | PR head | - | ✅ | ❌ |
| A commit is pushed to the head branch | PR head | 0-1 | ✅ | ❌ |
| A commit is pushed to the head branch | PR head | 2 or more | ✅ | ✅ |
| The base branch is merged into the head branch | PR head | 0-1 | ✅ | ❌ |
| The base branch is merged into the head branch | PR head | 2 or more | ✅ | ✅ |
| An approving review is left | PR head | 0 | | ❌ |
| An approving review is left | PR head | 1 or more | | ✅ |
| A comment review is left | PR head | Any | | ❌ |
| A "changes requested" review is left | PR head | Any | | ❌ |
| The Run integration[1] label is added | PR head | Any | | ✅ |
| A PR is merged into main | main | - | ✅ | |
| A commit is pushed directly to the main branch | main | - | ✅ | |
| A commit is pushed to any of the <x>.<y>_rel branches | <x>.<y>_rel | - | | |
| Scheduled (daily) | main | - | | |
| Triggered on-demand (via GHA dashboard) | main | - | | |
| Triggered on demand (via GitHub API) | main | - | | |

[1]: ⚒️ should be renamed to Run extended

List of checks

The following describes each of the available checks in the IDAES check suite. For each check, the information is divided into these categories:

| Section | Form of documentation | Target audience | Description |
| --- | --- | --- | --- |
| Contract | Explanation | Everyone | A summary of the expected behavior that the check is supposed to verify. In other words, code that passes the check satisfies the conditions formulated in the contract |
| CI implementation | Reference | Maintainers | Describes how the check is implemented in the IDAES CI pipeline |
| Running locally | How-To | Developers | Describes the steps needed to run the check (or an approximation of it) in a local developer environment |
| Understanding and addressing failures | How-To | Developers | A roughly checklist-like series of known issues and possible causes of failure for the check, with a summary of the recommended steps to take if that issue occurs |
| Gotchas/deep dives/etc | Explanation | Maintainers | Any number of design considerations, known limitations, etc. providing additional context related to the check |

pytest

Contract

  • The runtime behavior of the IDAES codebase matches the expected behavior, as implemented in a subset of the IDAES pytest test suite comprising tests with short durations (< 2 s)
  • In particular, modifications to the IDAES codebase must:
    • Not cause changes in the expected behavior of other code (i.e. code that is not directly affected by the proposed modification)
    • Be accompanied by additional tests (conforming to the Testing Standards described here) that encode the expected behavior of the codebase following those modifications (see the sketch below)
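
A minimal sketch of what such an accompanying test might look like. The function mixing_ratio() is a hypothetical stand-in for the code modified in the PR (in a real PR it would be imported from the IDAES codebase, not defined in the test file); the unit mark is one of the marks used in the IDAES test suite (see the coverage gotchas below):

    import pytest


    # hypothetical stand-in for the function modified in the PR
    def mixing_ratio(fraction_a=1, fraction_b=1):
        return fraction_a / (fraction_a + fraction_b)


    @pytest.mark.unit
    def test_mixing_ratio_with_default_arguments():
        # encodes the expected behavior of the code after the modification
        assert mixing_ratio() == 0.5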

Reference: CI Implementation

  • This check is implemented as a job in the Core workflow that runs a subset of the IDAES pytest test suite
    • 🔎 The subset is chosen so that it runs quickly: currently, this is set by excluding any test marked as integration
Steps
  • Clone repo
  • Set up Python version
  • Set up IDAES
    • Install IDAES + dependencies: from the PR head using pip install -r requirements-dev.txt
    • 🔎 For the complete steps, see .github/action/setup-idaes
  • Run pytest -m 'not integration' --cov
  • Upload coverage report with CodeCov
Runs on
  • Python: 3.6, 3.7, 3.8, 3.9
  • OS: Linux (ubuntu-latest), Windows (windows-latest)

How To: run locally

  • Set up an IDAES developer environment following the installation instructions for developers
  • Run pytest -m 'not integration'
  • To generate a coverage report locally:
    • Run pytest -m 'not integration' --cov --cov-report html:.idaes-coverage
    • This will create a directory named .idaes-coverage in the local directory
    • The report can be browsed interactively by opening the .idaes-coverage/index.html file in any web browser

How To: understand and address failures

  • A change in the PR breaks one or more of the tests in the IDAES pytest test suite
    • What to do?
      • Fix the code until all the tests pass
  • A test involving a numerical computation fails with small, platform- or environment-dependent differences in the result
    • What to do?
      • Compare the result within a tolerance rather than for exact equality (see the sketch after this list)
  • One or more tests fail because of reasons unrelated to the IDAES code being tested
    • A typical cause of this is network failures, which can cause this check to fail for multiple reasons:
      • Packages cannot be fetched from remote repositories (e.g. PyPI) during the setup/installation
      • IDAES-specific commands fail (e.g. get-extensions)
      • Tests requiring network access (e.g. test_fetch_from_github_api()) fail
    • ⚙️ Overall this shouldn't happen too frequently: if tests are flaky because of external conditions, they probably shouldn't belong to the unit tests category
    • What to do?
      • Re-running the check is potentially enough to fix one-off errors (or, conversely, can be used to show that the failure is persistent rather than one-off)
        • In GitHub Actions, this can be done from the Checks or Actions dashboards using the Rerun jobs > Run all jobs button
          • 🔎 Currently, re-running only failed jobs is not supported by GitHub Actions
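
For the numerical-computation case mentioned above, one common mitigation (a general pytest technique, not an IDAES-specific prescription) is to compare results within a tolerance rather than for exact equality, e.g. with pytest.approx:

    import pytest


    def test_computed_value():
        # hypothetical stand-in for the result of a numerical computation
        computed = 0.1 + 0.2
        # an exact comparison (computed == 0.3) can fail due to floating-point
        # round-off; pytest.approx compares within a (relative) tolerance
        assert computed == pytest.approx(0.3)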

build-docs

Contract

  • The IDAES documentation is up to date and matches the runtime behavior of the codebase it describes
  • Docstrings in the Python codebase are well-formed and compatible with Sphinx requirements (see the sketch below)
  • Code examples in the documentation are runnable without errors
  • Developers are able to generate the HTML documentation from the IDAES repository: i.e., the Sphinx build process succeeds without errors or warnings
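
As an illustration of the docstring requirement, a minimal sketch of a Sphinx-friendly docstring. The function, its parameters, and the Google docstring style shown here are assumptions for illustration (Google-style sections require the sphinx.ext.napoleon extension; the exact style used in IDAES may differ):

    def heat_duty(flow_rate, delta_t, specific_heat=4.18):
        """Compute the heat duty for a hypothetical stream.

        Args:
            flow_rate: mass flow rate of the stream (kg/s).
            delta_t: temperature change across the unit (K).
            specific_heat: specific heat capacity (kJ/(kg K)).

        Returns:
            The heat duty (kW).
        """
        return flow_rate * specific_heat * delta_t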

Reference: CI Implementation

  • This check is implemented as a job in the Core workflow that runs the process of building the docs, i.e. generating the IDAES HTML documentation from the .rst files in docs/, plus information collected from the Python code (such as docstrings), using the Sphinx documentation tool (https://sphinx-doc.org)
Steps
  • Clone repo
  • Same as pytest: essentially, installing IDAES in developer mode (pip install -r requirements-dev.txt)
  • Run python build.py from the docs/ directory

How To: run locally

  • Set up an IDAES developer environment
  • Run python build.py from the docs/ directory

How To: understand and address failures

  • Sphinx errors are reported in one or more of the .rst files modified or added by the PR
    • Check failures can also be caused by (a subset of?) warnings emitted by Sphinx ("strict mode")
  • Sphinx errors are reported, but they are not associated with any .rst files; or, no .rst files were modified in the PR
    • In this case, the cause of the error might be in a Python docstring defined in a .py file

pylint

Contract

  • The IDAES Python codebase:
    • Does not contain code that is likely to cause runtime errors, or unexpected runtime behavior
    • Matches the agreed-upon standards for code quality, formatting, and conventions

Explanation: how pylint works

  • Static analysis: unlike unit tests, pylint analyzes the code without running it; rather, it collects information and performs checks on the code at the syntax level (similarly to what a compiler for a compiled language such as C, C++, Java, or Go would do)

Reference: CI implementation

  • This check is implemented as a job in the Core workflow running pylint with the required configuration to make it compatible with the IDAES codebase
    • 🔎 this includes the pylint transform plugins used to prevent constructs commonly found in the IDAES codebase from being erroneously flagged as errors by pylint (false positives, FP)
Steps
  • Clone repo
  • Setup Python
  • Install IDAES with developer dependencies
  • Run pylint using:
    • Errors only, i.e. disable all other types of checks (Warning, Convention, ...)
    • The configuration specified in the .pylint/pylintrc file
      • 🔎 A setting in the RC file also appends the .pylint directory to sys.path, allowing pylint to register plugins and Python modules contained therein
  • 🔎 For the complete steps, see .github/action/pylint

How to: run locally
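
  • Set up an IDAES developer environment following the installation instructions for developers
  • Run pylint from the root of the local repository clone using the same configuration as the CI check. A sketch of the invocation (an approximation; the authoritative steps are defined in .github/action/pylint): pylint -E --rcfile=.pylint/pylintrc idaes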

How to: understand and address failures

  • pylint flags a piece of code as an error, and that code would indeed cause an error at runtime
    • a.k.a. true positive (TP)
    • If the pytest check passes for that code, this most likely means that the code is not covered by the test suite
    • What to do?
      • Write one or more unit tests covering that part of the code
      • Ensure the tests fail
      • Fix the error
      • Ensure that both the tests and pylint pass
  • pylint flags a piece of code as an error, but that code runs just fine at runtime
    • a.k.a. false positive (FP)
    • What to do?
      • Refer to the section "pylint: dealing with false positives" for more information on how to deal with this
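
One general pylint mechanism for silencing a confirmed false positive (shown here as a sketch; refer to the section mentioned above for the project's preferred approach) is an inline pragma scoped to the offending line:

    # hypothetical sketch: attributes created dynamically at runtime can trigger
    # pylint's no-member check even though the code runs correctly
    class DynamicConfig:
        def __init__(self):
            setattr(self, "display_name", "my flowsheet")


    config = DynamicConfig()
    # suppress the confirmed false positive for this line only
    print(config.display_name)  # pylint: disable=no-member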

Explanation: pylint gotchas

  • The behavior of pylint can (and will, see issue #407) change dramatically following an update, so it is important to pin the precise version
    • For the same reason, the IDAES-specific transform plugins are not guaranteed to be either backwards- or forwards-compatible across pylint versions; they perform a rudimentary check at runtime and emit a warning if the detected pylint version doesn't match the reference version
  • Even though pylint runs as a static analysis tool (which, in principle, only needs to parse files, rather than running (importing) them), some packages need to be imported as part of pylint's analysis
    • This includes IDAES (plus its dependencies, most notably Pyomo), since a few parts of the code are imported as part of the custom plugins developed to eliminate false positives
      • See .pylint/idaes_plugins.py for more information
    • pylint will report an error if the code contains a module import but the module is not found by pylint
      • i.e. anything that is imported in any of the files being checked by pylint (i.e. anything in idaes/**.py) should be installed in the environment where pylint is running
        • Unlike what would happen at runtime, transitive dependencies (i.e. module C in the following scenario: module idaes.A imports module B, which imports module C) are not needed, since pylint only checks imports within the files that it is told to check

pytest-integration

Contract

  • The runtime behavior of the IDAES codebase matches the expected behavior, as implemented in the subset of the IDAES pytest test suite identified by the integration mark
  • Tests are marked with integration if they take longer than a few seconds to run, as described in detail by the IDAES Model Testing Standards
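
In code, the mark is applied with the standard pytest decorator; a minimal sketch (the test name and body are hypothetical):

    import pytest


    @pytest.mark.integration
    def test_full_flowsheet_solve():
        # hypothetical stand-in for a test taking longer than a few seconds;
        # selected by `pytest -m integration`, excluded by `pytest -m 'not integration'`
        ...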

Reference: CI implementation

  • This check is implemented as the pytest-integration job in the Extended GHA workflow
Steps
  • Same as the pytest check, with two differences:
    • pytest is called with the -m integration flag (i.e. the tests being run as part of pytest and pytest-integration by definition form two disjoint sets)
    • pytest is called without the --cov flag (as the integration tests should not count for computing coverage)

How To: understand and address failures

  • Broadly speaking, the same reasons as pytest above

Run examples

Contract

  • IDAES should be backwards-compatible with the existing IDAES examples (in examples-pse), i.e. changes to IDAES should not cause any of the examples to fail or otherwise deviate from their expected behavior

Reference: CI implementation

Steps
  • Clone idaes-pse from PR head branch
  • Set up Python version
  • Install and set up IDAES + dependencies using the developer workflow
    • IDAES: from the PR head using pip install -r requirements-dev.txt
    • 🔎 For the complete steps, see .github/action/setup-idaes
  • Download a stable version of the IDAES examples
    • Currently, this is done by git clone https://github.com/IDAES/examples-pse (i.e. the default main branch)
  • Set up examples build configuration
    • 🔎 See .github/actions/run-examples/build-config-for-idaes-ci.yml
  • Run the examples using build.py --test
Runs on
  • Python: 3.6, 3.7, 3.8
  • OS: Linux (ubuntu-latest), Windows (windows-latest)

How-To: understand and address failures

  • Something in IDAES causes one or more examples to fail
    • This can mean that a change in the PR is breaking backwards compatibility with the examples, which can itself mean one of two things
      • The errors are expected: the examples are supposed to fail given the IDAES version being tested
        • What to do?
          • Document the fact that a breaking change has been made to IDAES
            • Possibly, including the before/after changes needed to make the code work with the new IDAES version
          • Submit a PR to examples-pse with the changes needed to make the examples compatible with the IDAES version being tested
            • ⚠️ WARNING This requires careful coordination between idaes-pse and examples-pse as the changes in examples-pse might break compatibility with existing versions of IDAES, causing unwanted failures in:
              • The Run examples checks in any other open PR in idaes-pse that doesn't include the changes in the original PR
              • The checks in examples-pse, since they use the latest stable version of IDAES, which would typically be IDAES/idaes-pse:main
            • Depending on the PR status between the two repos, it might be easier to ignore the failing checks and have someone with big-scary-red-button privileges merge the idaes-pse PR
      • The errors are unexpected and unwanted
        • What to do?
          • Fix the errors in the PR until the Run examples check is successful
  • Changes in IDAES uncover a previously unknown issue in examples-pse
    • i.e. the error is in examples-pse
    • This can mean either one or more of the notebooks have an error, or there is a bug in the examples build tools (build.py & friends)
    • What to do?
      • Create an issue in examples-pse describing the problem
      • Open a PR fixing the issue (or ask someone to do so)
      • Re-run the checks in idaes-pse
        • If no other commits were pushed to the PR branch, the Run integration label can be added to the PR to trigger the Extended checks workflow

Explanation

  • Currently, the check uses a configuration file for building the examples that is different from the config files that come with examples-pse
    • 🔎 See PR #215 for more details on the rationale for this choice
  • We test an unstable version of idaes-pse with a stable version of examples-pse to see if the changes in idaes-pse break something when running the examples
    • In this context, stable means "known to be working" and unstable means "unsure if working or not" (which is why it's being tested)
  • ⚙️ Currently Python 3.9 is excluded from the tested combinations as the combo Windows + Jupyter + recent versions of Python is a pretty reliable source of headaches

coverage

Contract

  • The test coverage of the IDAES codebase, i.e. the fraction of it that is run as part of the test suite, should not decrease (and should ideally increase) following any change

Reference: CI implementation

  • Coverage reports are generated and submitted to CodeCov as part of the pytest jobs in the Core GHA workflow
  • The coverage report appears as a single comment in a PR's conversation and is managed by CodeCov
    • The same comment will be updated multiple times whenever an updated report is created by CodeCov after a check run

Explanation: coverage concepts

Target audience: everyone

  • "Test coverage" refers to the fraction of a codebase (expressed in number of lines of code or, more accurately, number of statements) that is covered by a test suite.
    • A line (or statement) is covered by a test suite if it is executed at least once when running the test suite. This is also known as a hit
    • A miss is a line or statement that is not executed when running the test suite
    • A partial miss can happen in a few cases, e.g. when a statement contains an if clause, but only one of the possible branches is encountered by the test run (see the sketch below)
  • A coverage check for a PR is used to quantify the effect that the changes being proposed in the PR would have on the coverage
  • When analyzing coverage for the set of changes introduced by a PR, two complementary analyses are possible:
    • Patch coverage: What is the coverage considering only the lines of code that were changed?
      • i.e. without considering the rest of the project when calculating the coverage, where "the rest of the project" refers to the lines that are not affected by the changes in the PR
    • Project coverage: If this PR was merged in, how would the coverage of the entire project change?
Example scenarios
  • A user adds unit tests covering code in idaes/my_module.py; however, the PR contains no changes to idaes/my_module.py
    • The patch coverage delta would be 0, since there are no lines of code under test being added by the PR
    • The project coverage delta would be positive (= the coverage would increase), since merging the PR would cause more lines of code to be covered by tests compared to the base branch
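
As a hypothetical illustration of hits, misses, and partial misses:

    def clamp(value, lower=0, upper=10):
        if value < lower:  # a hit: executed by the test below
            return lower   # a miss: never executed by the test below
        if value > upper:  # a hit, but a partial (branch) miss:
            return upper   # only the False branch is ever taken
        return value       # a hit


    def test_clamp_within_bounds():
        # running only this test leaves the two early returns as misses
        # and both if statements as partial (branch) misses
        assert clamp(5) == 5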

How To: understand and address failures

  • The coverage check fails claiming that a PR would cause the coverage to decrease; however, that PR does not affect the Python codebase at all (e.g. on a documentation-only PR affecting exclusively .rst files)
    • Short answer: this is a known issue and can be safely ignored as long as the coverage delta is small (~< 10)
  • The coverage report is missing for one or more PRs
    • Short answer: this is a known issue, which seems to occur sporadically, and is currently under (equally sporadic) investigation
    • To address this, developers can:
      • Wait until all checks are complete (including checks that are not directly involved with generating and uploading a coverage report)
      • Try to push a small update to the code (possibly involving adding a single statement such that the coverage would change)
      • Contact one of the core developers and/or maintainers assigned to this issue (currently @lbianchi-lbl)

Explanation: coverage gotchas

  • Since the coverage is calculated from the subset of the pytest test suite not having an integration mark, changing a test's mark from unit or component to integration will cause a decrease in reported coverage corresponding to the amount of coverage contributed by that test
  • The coverage delta requires a comparison between two versions of the codebase: for PRs, these are the head of the PR branch and the base branch (typically main). Therefore, to have an accurate result, it is important that:
    • The same tests are run on both the PR and the base branch
    • Up-to-date coverage reports for the main branch are available (ideally, for every commit on the main branch)
      • This is done in the Core workflow in GHA by adding a trigger so that the pytest job runs whenever a commit is pushed to the main branch

Reference: misc CodeCov topics

  • TL;DR when the check runs in which the coverage is calculated finish running
  • ⚒️ TODO

Deep dive

This section contains detailed descriptions of specific topics.

pylint

  • Since pylint analyzes the code without running it, and since in Python one can easily do all sorts of meta-programming at runtime, pylint's ability to understand what a piece of code is doing is limited

  • Non-exhaustive list of code constructs/functionality that are likely to trip up pylint

    # defining variables at runtime in non-standard ways using globals(), eval(), or exec()
    globals()['var'] = 123
    print(var)  # var will be defined at runtime but pylint cannot know it
    
    could_be_anything = eval('var * 2')
    
    def_template = """
    def do_{what}():
        "This function is very useful to do {what}."
        return "I did {what}!"
    """
    for thing in ['this', 'that']:
        exec(def_template.format(what=thing))
    
    result = do_that()
    
    # instantiating objects with conditional logic implemented in a class' `__new__()` method or in a metaclass
    class MyClass:
        def __new__(cls, *args, **kwargs):
            # dispatch to a subclass depending on the number of arguments
            if len(args) == 1:
                return object.__new__(MyScalarSubclass)
            return object.__new__(MyVectorSubclass)
    
    class MyScalarSubclass(MyClass):
        def do_scalar(self): ...
    
    class MyVectorSubclass(MyClass):
        def do_vector(self): ...
    
    my_obj = MyClass(1, 2, 3)
    my_obj.do_vector()  # pylint assumes that type(my_obj) is MyClass and will report an attribute error for this statement