Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework CI & Tests Handling #1137

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

Rework CI & Tests Handling #1137

wants to merge 2 commits into from

Conversation

DyXel
Copy link
Contributor

@DyXel DyXel commented Jun 23, 2024

This PR reworks the Github Actions workflow files and its associated code, so that there's only 1 pipeline that does all that we might need:

  • Build cppfront
  • Run passthrough tests
  • Run regressions tests
  • ... more in the future? see below

Currently, the pass-through tests are not executed in the CI, and regressions test have duplicated logic to build cppfront (granted, its not that complex anyways).

The idea is that whatever cppfront lowers or outputs should be platform agnostic, likewise for the built executables, and so, for most tests there should be only a single output with which to test against. Additionally, plenty of tests do not output anything when executed, so rather than creating empty files, we should check that the executables under test didn't output anything as well.

Let's take as an example this recent commit, which added 1 test: 54edaba. It created the following files (related to testing):

  • regression-tests/pure2-trailing-comma-assert.cpp2
  • regression-tests/test-results/pure2-trailing-comma-assert.cpp
  • regression-tests/test-results/pure2-trailing-comma-assert.cpp2.output
  • regression-tests/test-results/pure2-trailing-comma-assert.out
  • regression-tests/test-results/apple-clang-14-c++2b/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/apple-clang-14-c++2b/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/apple-clang-15-c++2b/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/apple-clang-15-c++2b/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/clang-12-c++20/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/clang-12-c++20/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/clang-15-c++20-libcpp/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/clang-15-c++20-libcpp/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/clang-15-c++20/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/clang-15-c++20/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/clang-18-c++20/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/clang-18-c++20/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/gcc-10-c++20/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/gcc-10-c++20/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/gcc-13-c++2b/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/gcc-13-c++2b/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/gcc-14-c++2b/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/gcc-14-c++2b/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/msvc-2022-c++20/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/msvc-2022-c++20/pure2-trailing-comma-assert.cpp.output
  • regression-tests/test-results/msvc-2022-c++latest/pure2-trailing-comma-assert.cpp.execution
  • regression-tests/test-results/msvc-2022-c++latest/pure2-trailing-comma-assert.cpp.output

Other than the test itself, it added 25 new files: Two per each target, plus the lowered output, plus the output of cppfront. 23 of those files are empty.

After the rework, the added files would look like this instead:

  • test/regression/pure2-trailing-comma-assert.cpp2
  • test/regression/results/pure2-trailing-comma-assert/00-lowered.cpp
  • test/regression/results/pure2-trailing-comma-assert/01-cppfront.output

In general, the new format for regression tests would be:

  • test/regression/test-name.cpp2
  • test/regression/results/test-name/00-lowered.cpp (if cppfront doesn't throw an error)
  • test/regression/results/test-name/01-cppfront.output
  • test/regression/results/test-name/02-(ctx.target)$-compiler.output (only if compiler outputted something)
  • test/regression/results/test-name/03-executable.output (only if executable outputted something)

Where (ctx.target)$ would be the specified target (e.g. x64-linux-g++-c++20), again, only if the compiler outputted something. This reorganization makes it much easier to find results of a test and decreases considerably the amount of files needed per test, it also makes adding new targets trivial as most tests should not output anything when compiling.

The regression runner itself is written in Cpp2 as opposed to being a shell script. Ideally we treat testing just like we treat code, and re-use most logic of this file when implementing other types of test.

I started this as a side-quest, to pave up the way to test the reflection API (the functions intended to be used by end-users when they author their own metafunctions), and because it seemed to me that adding like 20~ files per new additional regression test is excessive.

@DyXel
Copy link
Contributor Author

DyXel commented Jun 23, 2024

Opened as a draft, as this is not yet ready for merge. I would say its like 80% there, but that remaining 20% could prove to be a lot of effort, so I first want to know if there's interest in moving this forward, otherwise I am happy to shelve the work so far. I had a lot of fun writing the runner itself, and I think we need to rework the way we do tests anyway as we add other types of tests (for example reflection api tests as mentioned above, or unit tests for the compiler code).

@hsutter
Copy link
Owner

hsutter commented Jun 23, 2024

Thanks! Hot take: This could be good, but it will also take me a lot to absorb. I'm used to the current workflow, but that doesn't mean I can't get used to a better one. Is the idea to streamline the CI test pipeline, or also the checked-in tests for each platform? (The answer is probably in the "changed files" but there are over 3,000 so I didn't even try to look.)

The idea is that whatever cppfront lowers or outputs should be platform agnostic,

Yes, corresponds to that there's one /test-results directory independent of the Cpp1 compilers.

likewise for the built executables,

If you mean cppfront.exe, yes. If you mean an executable built for one of the test cases, those will vary considerably by compiler based on mesasges/quirks/bugs. A few of the tests aren't supported by GCC 10 for example but work fine on 11+, or hit a Clang 12 bug but work fine in 13+.

for most tests there should be only a single output with which to test against.

Wait, you mean per Cpp1 compiler? I currently keep one set of outputs (.execution and .output) per Cpp1 compiler because they all have different messages/quirks/bugs.

Additionally, plenty of tests do not output anything when executed, so rather than creating empty files, we should check that the executables under test didn't output anything as well.

That seems reasonable, replacing an empty file with no file. The potential danger is not detecting when a test failed to run (e.g., the Cpp1 compiler crashed)... that's one reason I keep the file even if it's empty.

Thanks for looking into this, though. I'm now leaving for the ISO C++ meeting so I'll be much slower to respond for the next week, but I appreciate all the input.

@DyXel
Copy link
Contributor Author

DyXel commented Jun 23, 2024

Is the idea to streamline the CI test pipeline, or also the checked-in tests for each platform? (The answer is probably in the "changed files" but there are over 3,000 so I didn't even try to look.)

There's no way I am making someone review that many files 😂 So here the idea would be to first enhance the pipeline, without changing any tests (unless we notice some previously broken tests).

If you mean an executable built for one of the test cases, those will vary considerably by compiler based on mesasges/quirks/bugs. A few of the tests aren't supported by GCC 10 for example but work fine on 11+, or hit a Clang 12 bug but work fine in 13+.
Wait, you mean per Cpp1 compiler? I currently keep one set of outputs (.execution and .output) per Cpp1 compiler because they all have different messages/quirks/bugs.

Yes, I am aware some tests have the compiler output something, that should be caught with this new system as well. The main difference would be that their outputs will sit on the same folder.

The potential danger is not detecting when a test failed to run (e.g., the Cpp1 compiler crashed)... that's one reason I keep the file even if it's empty.

I think we can easily detect if something like that happened (for example by checking the error code and making sure there was actually some output in stdout/stderr). But yes, the main idea is to reduce the filecount to something reasonable, while giving a better organization to the test suite. I actually have an item in my todo list to create a "control test" for each case that a test can result in.

Thanks for looking into this, though. I'm now leaving for the ISO C++ meeting so I'll be much slower to respond for the next week, but I appreciate all the input.

Thanks for the heads-up, see you around!

@DyXel
Copy link
Contributor Author

DyXel commented Jun 23, 2024

To Dos if we decide to move this forward (in no particular order):

  • Make regression-runner run on Windows
  • Clean-up regression-runner code a bit
  • Add all previous targets/configs to the new pipeline file
  • Add extra target that compiles in debug (as opposed to O2 build)
  • Add some "control cases" to make sure the logic on the test runner behaves as expected (and fix bugs along the way of course)
  • Create a script that will rename test files to the new structure
    • We shouldn't actually create the new test results with the new runner, because that would break the "trust chain" established by the current pipeline.
  • Execute compiler and cppfront version tests
  • Execute passthrough test on Pipeline
  • Copy patch upload from existing regression test pipeline
  • Add option to the test runner to overwrite test results

Wishlist (not necessary for merge IMO but would be really cool to have):

  • Add pretty/colored output
  • Run tests in parallel (multi-threaded)
    • The problematic bit seems to be launching programs (which changes work directory), maybe a global lock + std::async ought to work.
  • Write some guide on how to use the regression-runner
  • Allow adding specific settings to a test within the file itself as a comment
    • For example, different flags for cppfront, in case we want to test the flag itself

@DyXel
Copy link
Contributor Author

DyXel commented Jun 23, 2024

Removed all the test changes for now.

@MaxSagebaum
Copy link
Contributor

Opened as a draft, as this is not yet ready for merge. I would say its like 80% there, but that remaining 20% could prove to be a lot of effort, so I first want to know if there's interest in moving this forward, otherwise I am happy to shelve the work so far. I had a lot of fun writing the runner itself, and I think we need to rework the way we do tests anyway as we add other types of tests (for example reflection api tests as mentioned above, or unit tests for the compiler code).

I think this would be really nice and it would also simplify updating the results.

@MaxSagebaum
Copy link
Contributor

I have not looked at the MR yet. But one addition we currently do not have is parallel evaluation. Is this now possible would it be "easy" to add with the new framework?

@DyXel

This comment was marked as resolved.

@DyXel DyXel force-pushed the rework-ci branch 5 times, most recently from 0e2a0a0 to 08cb411 Compare June 25, 2024 13:59
@DyXel DyXel mentioned this pull request Jun 25, 2024
@DyXel DyXel force-pushed the rework-ci branch 2 times, most recently from df829f1 to e42cce5 Compare June 25, 2024 19:54
@DyXel
Copy link
Contributor Author

DyXel commented Jun 25, 2024

I have not looked at the MR yet. But one addition we currently do not have is parallel evaluation. Is this now possible would it be "easy" to add with the new framework?

Just pushed the changes to run things in parallel, super simple implementation with std::async, take the results with a grain of salt, but on my machine running all tests went roughly from 1m23.470s to 0m14.370s. 🤯

@MaxSagebaum
Copy link
Contributor

This sound nice. I will have closer look next week. Maybe I can already use this to improve testing for the regexes.

@DyXel
Copy link
Contributor Author

DyXel commented Aug 28, 2024

So, I wrote a small Python script to start count (and accounting) for all the tests to see where to go next. A ballpark of the redundant files is about 1415, that is, duplicated output files. I say an estimate because some compiler outputs between versions are the same, but we would still want to check in case their output changes. MSVC outputs the filename even if success and there are no warnings, we can probably teach the test suite to account for that and save plenty of file count there as well.

I have pushed the script in case anybody wants to play around with it.

@DyXel
Copy link
Contributor Author

DyXel commented Aug 28, 2024

@jarzec while testing this script I found some extraneous files, could you look into it?

copy.execution files:

regression-tests/test-results/clang-18-c++20/pure2-bugfix-for-ufcs-arguments.cpp copy.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-bugfix-for-ufcs-arguments.cpp copy.execution

pure2-regex_13_posessive_modifier only has test results:

regression-tests/test-results/apple-clang-14-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/apple-clang-15-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/gcc-13-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.output

@jarzec
Copy link
Contributor

jarzec commented Aug 28, 2024

@jarzec while testing this script I found some extraneous files, could you look into it?

copy.execution files:

regression-tests/test-results/clang-18-c++20/pure2-bugfix-for-ufcs-arguments.cpp copy.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-bugfix-for-ufcs-arguments.cpp copy.execution

pure2-regex_13_posessive_modifier only has test results:

regression-tests/test-results/apple-clang-14-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/apple-clang-15-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/gcc-13-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.output

Looks like accidentally committed. I can remove them in #1263.

@hsutter
Copy link
Owner

hsutter commented Oct 31, 2024

Hi! Just checking in as I clean up the PRs with the relicensing (see #1322), is still intended to be still a draft PR? If so perhaps I'll close it for now and you can reopen this or a new PR once it's ready to look at?

As described in #1322, this didn't make release 0.8, so I will need a new one-time CLA that will cover all your future contributions. I've emailed you the new CLA, and one it's completed I can look at PRs.

@DyXel
Copy link
Contributor Author

DyXel commented Oct 31, 2024

Hi, I signed the CLA, but at least for now I don't plan to work on this PR unless its a priority to get it merged. I am currently focused on my job and other projects. 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants