Use unittest XML files to parse PyTorch test results #3633

Open: wants to merge 6 commits into develop
Conversation

@Flamefire Flamefire (Contributor) commented Feb 21, 2025

(created using eb --new-pr)

Requires

Some explanations:

  • As discussed in a PyTorch issue, the only machine-readable output is the set of test XML files, which are only generated on (their) CI
  • The easyblock applies patches that allow enabling test reports by setting an EasyBuild-specific environment variable.
    • There is an option for run_test.py that is supposed to enable that, but it isn't passed on to the subprocess and is hence not reliable
    • Another bug results in test reports not being generated outside CI even when the option is passed
    • As the patched file gets installed, we shouldn't change its (default) behavior in case users run it themselves, hence the environment variable
  • The PyTorch test suite uses Python unittest, pytest and, since 2.3, custom logic to rerun failed tests. This generates XML result files in different formats and potentially with duplicates
    • Successful reruns may be reported alongside their previous failures, but in different files, so "merging" is required to keep only the successful results
    • The implemented parser collects all results and attributes them to their "test suite" (usually the Python file executed, which might include or run other files)
    • Some tests are run multiple times in different configurations, i.e. the same test file is executed multiple times with an environment variable set to choose e.g. the distributed backend. Those need to be treated as separate tests
    • Afterwards all results are combined/merged: a test from the same test suite that is found multiple times is considered successful if at least one of the duplicates was successful
  • I used Python type hints to make the code a bit easier to follow
  • In many places assumptions are verified by raising a descriptive error. This should make it possible to detect changes in PyTorch that affect the logic
  • The "old" (current) parsing of the stdout is still used
    • The new logic is only enabled when the PyTorch easyconfig has the required xmlrunner Python package as a direct or transitive dependency. We have unittest-xml-reporting ECs for that
    • For PyTorch < 2.3 the results found by both parsers are compared and differences are shown in the logfile (since 2.3 the stdout parser isn't really useful anymore). They should match, of course, but in the end the result from the XML files is used
    • The final output of PyTorch's run_test.py contains a list of failed test suites. We match against that, as before, to detect when we missed something.
    • That also detects test suites that failed to start, e.g. due to syntax errors introduced by our patches. In that case no XML file is generated and we would otherwise miss the failure, but we should handle all such cases by fixing the issue or skipping the test
  • I considered verifying the found suites against the list of suites to run as printed by run_test.py, but some of the test files are missing the code required to start the tests and hence show up in that list while producing no output at all
  • The easyblock file can be run directly and accepts:
    • An EasyBuild log file: parses the stdout of run_test as found in the log, to test the old parser. This mode exists already
    • A directory: runs the new (XML) parser on a test-results folder containing the XML reports
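The merge rule described above (a duplicated test counts as passed if at least one of its runs passed, e.g. a successful rerun after a failure) can be sketched roughly like this. This is a minimal illustration, not the easyblock's actual code; the tuple format and all names are invented:

```python
# Sketch of the "at least one duplicate passed" merge rule.
# Results are grouped by (suite, test); a rerun that passed
# overrides an earlier recorded failure of the same test.
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def merge_results(results: Iterable[Tuple[str, str, bool]]) -> Dict[Tuple[str, str], bool]:
    """Merge (suite, test_name, passed) tuples; OR together duplicates."""
    merged: Dict[Tuple[str, str], bool] = defaultdict(bool)
    for suite, test, passed in results:
        merged[(suite, test)] = merged[(suite, test)] or passed
    return dict(merged)

# Invented example data: one failure fixed by a rerun, one persistent failure
runs = [
    ('test_nn', 'test_conv', False),    # initial failure ...
    ('test_nn', 'test_conv', True),     # ... passed on rerun -> counts as pass
    ('test_jit', 'test_trace', False),  # failed in every run -> failure
]
merged = merge_results(runs)
failed = sorted(key for key, ok in merged.items() if not ok)
print(failed)  # [('test_jit', 'test_trace')]
```

Because the merge is a plain OR over duplicates, it is order-independent, which matters since the rerun results arrive in separate XML files.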
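The dual-mode invocation (log file vs. directory) could be dispatched roughly as follows; the function name and return values are invented for illustration and do not reflect the easyblock's real interface:

```python
# Sketch of dispatching on the CLI argument: a directory selects the
# new XML parser, an existing file selects the old stdout/log parser.
import os
import sys

def select_parser(path: str) -> str:
    """Return which (hypothetical) parser mode would handle the argument."""
    if os.path.isdir(path):
        return 'xml'     # directory: parse the XML reports it contains
    if os.path.isfile(path):
        return 'stdout'  # log file: parse the run_test stdout from the log
    raise ValueError('no such file or directory: %s' % path)

if __name__ == '__main__' and len(sys.argv) > 1:
    print(select_parser(sys.argv[1]))
```

Keeping both entry points in the same file makes it easy to compare the two parsers on the same test run.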

I prepared PRs for new and old PyTorch ECs to include the dependency required for the XML reporting. Those can be used to test this PR:
