Rework CI & Tests Handling #1137

DyXel · 2024-06-23T02:47:51Z

This PR reworks the Github Actions workflow files and its associated code, so that there's only 1 pipeline that does all that we might need:

Build cppfront
Run passthrough tests
Run regressions tests
... more in the future? see below

Currently, the pass-through tests are not executed in the CI, and regressions test have duplicated logic to build cppfront (granted, its not that complex anyways).

The idea is that whatever cppfront lowers or outputs should be platform agnostic, likewise for the built executables, and so, for most tests there should be only a single output with which to test against. Additionally, plenty of tests do not output anything when executed, so rather than creating empty files, we should check that the executables under test didn't output anything as well.

Let's take as an example this recent commit, which added 1 test: 54edaba. It created the following files (related to testing):

regression-tests/pure2-trailing-comma-assert.cpp2
regression-tests/test-results/pure2-trailing-comma-assert.cpp
regression-tests/test-results/pure2-trailing-comma-assert.cpp2.output
regression-tests/test-results/pure2-trailing-comma-assert.out
regression-tests/test-results/apple-clang-14-c++2b/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/apple-clang-14-c++2b/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/apple-clang-15-c++2b/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/apple-clang-15-c++2b/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/clang-12-c++20/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/clang-12-c++20/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/clang-15-c++20-libcpp/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/clang-15-c++20-libcpp/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/clang-15-c++20/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/clang-15-c++20/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/clang-18-c++20/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/clang-18-c++20/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/gcc-10-c++20/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/gcc-10-c++20/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/gcc-13-c++2b/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/gcc-13-c++2b/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/gcc-14-c++2b/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/gcc-14-c++2b/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/msvc-2022-c++20/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-trailing-comma-assert.cpp.output
regression-tests/test-results/msvc-2022-c++latest/pure2-trailing-comma-assert.cpp.execution
regression-tests/test-results/msvc-2022-c++latest/pure2-trailing-comma-assert.cpp.output

Other than the test itself, it added 25 new files: Two per each target, plus the lowered output, plus the output of cppfront. 23 of those files are empty.

After the rework, the added files would look like this instead:

test/regression/pure2-trailing-comma-assert.cpp2
test/regression/results/pure2-trailing-comma-assert/00-lowered.cpp
test/regression/results/pure2-trailing-comma-assert/01-cppfront.output

In general, the new format for regression tests would be:

test/regression/test-name.cpp2
test/regression/results/test-name/00-lowered.cpp (if cppfront doesn't throw an error)
test/regression/results/test-name/01-cppfront.output
test/regression/results/test-name/02-(ctx.target)$-compiler.output (only if compiler outputted something)
test/regression/results/test-name/03-executable.output (only if executable outputted something)

Where (ctx.target)$ would be the specified target (e.g. x64-linux-g++-c++20), again, only if the compiler outputted something. This reorganization makes it much easier to find results of a test and decreases considerably the amount of files needed per test, it also makes adding new targets trivial as most tests should not output anything when compiling.

The regression runner itself is written in Cpp2 as opposed to being a shell script. Ideally we treat testing just like we treat code, and re-use most logic of this file when implementing other types of test.

I started this as a side-quest, to pave up the way to test the reflection API (the functions intended to be used by end-users when they author their own metafunctions), and because it seemed to me that adding like 20~ files per new additional regression test is excessive.

DyXel · 2024-06-23T02:50:30Z

Opened as a draft, as this is not yet ready for merge. I would say its like 80% there, but that remaining 20% could prove to be a lot of effort, so I first want to know if there's interest in moving this forward, otherwise I am happy to shelve the work so far. I had a lot of fun writing the runner itself, and I think we need to rework the way we do tests anyway as we add other types of tests (for example reflection api tests as mentioned above, or unit tests for the compiler code).

test/regression/results/mixed-allcpp1-hello/02-x64-linux-g++-c++20-compiler.output

test/regression/results/mixed-bugfix-for-ufcs-non-local/02-x64-linux-g++-c++20-compiler.output

test/regression/results/mixed-intro-example-three-loops/02-x64-linux-g++-c++20-compiler.output

.../regression/results/mixed-postexpression-with-capture/02-x64-linux-g++-c++20-compiler.output

hsutter · 2024-06-23T03:15:21Z

Thanks! Hot take: This could be good, but it will also take me a lot to absorb. I'm used to the current workflow, but that doesn't mean I can't get used to a better one. Is the idea to streamline the CI test pipeline, or also the checked-in tests for each platform? (The answer is probably in the "changed files" but there are over 3,000 so I didn't even try to look.)

The idea is that whatever cppfront lowers or outputs should be platform agnostic,

Yes, corresponds to that there's one /test-results directory independent of the Cpp1 compilers.

likewise for the built executables,

If you mean cppfront.exe, yes. If you mean an executable built for one of the test cases, those will vary considerably by compiler based on mesasges/quirks/bugs. A few of the tests aren't supported by GCC 10 for example but work fine on 11+, or hit a Clang 12 bug but work fine in 13+.

for most tests there should be only a single output with which to test against.

Wait, you mean per Cpp1 compiler? I currently keep one set of outputs (.execution and .output) per Cpp1 compiler because they all have different messages/quirks/bugs.

Additionally, plenty of tests do not output anything when executed, so rather than creating empty files, we should check that the executables under test didn't output anything as well.

That seems reasonable, replacing an empty file with no file. The potential danger is not detecting when a test failed to run (e.g., the Cpp1 compiler crashed)... that's one reason I keep the file even if it's empty.

Thanks for looking into this, though. I'm now leaving for the ISO C++ meeting so I'll be much slower to respond for the next week, but I appreciate all the input.

DyXel · 2024-06-23T03:37:35Z

Is the idea to streamline the CI test pipeline, or also the checked-in tests for each platform? (The answer is probably in the "changed files" but there are over 3,000 so I didn't even try to look.)

There's no way I am making someone review that many files 😂 So here the idea would be to first enhance the pipeline, without changing any tests (unless we notice some previously broken tests).

If you mean an executable built for one of the test cases, those will vary considerably by compiler based on mesasges/quirks/bugs. A few of the tests aren't supported by GCC 10 for example but work fine on 11+, or hit a Clang 12 bug but work fine in 13+.
Wait, you mean per Cpp1 compiler? I currently keep one set of outputs (.execution and .output) per Cpp1 compiler because they all have different messages/quirks/bugs.

Yes, I am aware some tests have the compiler output something, that should be caught with this new system as well. The main difference would be that their outputs will sit on the same folder.

The potential danger is not detecting when a test failed to run (e.g., the Cpp1 compiler crashed)... that's one reason I keep the file even if it's empty.

I think we can easily detect if something like that happened (for example by checking the error code and making sure there was actually some output in stdout/stderr). But yes, the main idea is to reduce the filecount to something reasonable, while giving a better organization to the test suite. I actually have an item in my todo list to create a "control test" for each case that a test can result in.

Thanks for looking into this, though. I'm now leaving for the ISO C++ meeting so I'll be much slower to respond for the next week, but I appreciate all the input.

Thanks for the heads-up, see you around!

DyXel · 2024-06-23T03:48:41Z

DyXel · 2024-06-23T04:00:29Z

Removed all the test changes for now.

MaxSagebaum · 2024-06-24T09:07:15Z

Opened as a draft, as this is not yet ready for merge. I would say its like 80% there, but that remaining 20% could prove to be a lot of effort, so I first want to know if there's interest in moving this forward, otherwise I am happy to shelve the work so far. I had a lot of fun writing the runner itself, and I think we need to rework the way we do tests anyway as we add other types of tests (for example reflection api tests as mentioned above, or unit tests for the compiler code).

I think this would be really nice and it would also simplify updating the results.

MaxSagebaum · 2024-06-24T09:08:36Z

I have not looked at the MR yet. But one addition we currently do not have is parallel evaluation. Is this now possible would it be "easy" to add with the new framework?

DyXel · 2024-06-25T20:08:34Z

I have not looked at the MR yet. But one addition we currently do not have is parallel evaluation. Is this now possible would it be "easy" to add with the new framework?

Just pushed the changes to run things in parallel, super simple implementation with std::async, take the results with a grain of salt, but on my machine running all tests went roughly from 1m23.470s to 0m14.370s. 🤯

MaxSagebaum · 2024-06-26T12:40:09Z

This sound nice. I will have closer look next week. Maybe I can already use this to improve testing for the regexes.

DyXel · 2024-08-28T22:09:17Z

So, I wrote a small Python script to start count (and accounting) for all the tests to see where to go next. A ballpark of the redundant files is about 1415, that is, duplicated output files. I say an estimate because some compiler outputs between versions are the same, but we would still want to check in case their output changes. MSVC outputs the filename even if success and there are no warnings, we can probably teach the test suite to account for that and save plenty of file count there as well.

I have pushed the script in case anybody wants to play around with it.

DyXel · 2024-08-28T22:10:20Z

@jarzec while testing this script I found some extraneous files, could you look into it?

copy.execution files:

regression-tests/test-results/clang-18-c++20/pure2-bugfix-for-ufcs-arguments.cpp copy.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-bugfix-for-ufcs-arguments.cpp copy.execution

pure2-regex_13_posessive_modifier only has test results:

regression-tests/test-results/apple-clang-14-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/apple-clang-15-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/gcc-13-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.output

jarzec · 2024-08-28T22:45:45Z

@jarzec while testing this script I found some extraneous files, could you look into it?

copy.execution files:

regression-tests/test-results/clang-18-c++20/pure2-bugfix-for-ufcs-arguments.cpp copy.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-bugfix-for-ufcs-arguments.cpp copy.execution

pure2-regex_13_posessive_modifier only has test results:

regression-tests/test-results/apple-clang-14-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/apple-clang-15-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-15-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/clang-18-c++23-libcpp/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/gcc-13-c++2b/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.execution
regression-tests/test-results/msvc-2022-c++20/pure2-regex_13_posessive_modifier.cpp.output

Looks like accidentally committed. I can remove them in #1263.

hsutter · 2024-10-31T02:46:51Z

Hi! Just checking in as I clean up the PRs with the relicensing (see #1322), is still intended to be still a draft PR? If so perhaps I'll close it for now and you can reopen this or a new PR once it's ready to look at?

As described in #1322, this didn't make release 0.8, so I will need a new one-time CLA that will cover all your future contributions. I've emailed you the new CLA, and one it's completed I can look at PRs.

DyXel · 2024-10-31T09:20:25Z

Hi, I signed the CLA, but at least for now I don't plan to work on this PR unless its a priority to get it merged. I am currently focused on my job and other projects. 😅

DyXel commented Jun 23, 2024

View reviewed changes

test/regression/results/mixed-allcpp1-hello/02-x64-linux-g++-c++20-compiler.output Outdated Show resolved Hide resolved

DyXel commented Jun 23, 2024

View reviewed changes

test/regression/results/mixed-bugfix-for-ufcs-non-local/02-x64-linux-g++-c++20-compiler.output Outdated Show resolved Hide resolved

DyXel commented Jun 23, 2024

View reviewed changes

test/regression/results/mixed-intro-example-three-loops/02-x64-linux-g++-c++20-compiler.output Outdated Show resolved Hide resolved

DyXel commented Jun 23, 2024

View reviewed changes

.../regression/results/mixed-postexpression-with-capture/02-x64-linux-g++-c++20-compiler.output Outdated Show resolved Hide resolved

DyXel force-pushed the rework-ci branch from b8f70e9 to 73d5e51 Compare June 23, 2024 03:59

DyXel force-pushed the rework-ci branch from 73d5e51 to b5ec5c0 Compare June 23, 2024 11:59

This comment was marked as resolved.

Sign in to view

DyXel force-pushed the rework-ci branch 5 times, most recently from 0e2a0a0 to 08cb411 Compare June 25, 2024 13:59

DyXel mentioned this pull request Jun 25, 2024

[CI] Fix regression tests #1140

Merged

DyXel force-pushed the rework-ci branch 2 times, most recently from df829f1 to e42cce5 Compare June 25, 2024 19:54

DyXel force-pushed the rework-ci branch from e42cce5 to 612a2eb Compare June 25, 2024 20:20

jarzec mentioned this pull request Jul 15, 2024

CI Remove empty test reference files #1164

Merged

CI Rework WIP

9ac6069

DyXel force-pushed the rework-ci branch from bca31fb to 9ac6069 Compare August 28, 2024 16:53

add python script to count and later rename tests

3d150db

DyXel force-pushed the rework-ci branch from 09c7717 to 3d150db Compare August 28, 2024 22:38

hsutter added defer pri-5: other ideas and suggestions labels Sep 24, 2024

hsutter added the awaiting-cla label Oct 31, 2024

hsutter removed the awaiting-cla label Nov 1, 2024

hsutter closed this Mar 1, 2025

Rework CI & Tests Handling #1137

Rework CI & Tests Handling #1137

Uh oh!

Conversation

DyXel commented Jun 23, 2024

Uh oh!

DyXel commented Jun 23, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

hsutter commented Jun 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DyXel commented Jun 23, 2024

Uh oh!

DyXel commented Jun 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DyXel commented Jun 23, 2024

Uh oh!

MaxSagebaum commented Jun 24, 2024

Uh oh!

MaxSagebaum commented Jun 24, 2024

Uh oh!

This comment was marked as resolved.

DyXel commented Jun 25, 2024

Uh oh!

MaxSagebaum commented Jun 26, 2024

Uh oh!

DyXel commented Aug 28, 2024

Uh oh!

DyXel commented Aug 28, 2024

Uh oh!

jarzec commented Aug 28, 2024

Uh oh!

hsutter commented Oct 31, 2024

Uh oh!

DyXel commented Oct 31, 2024

Uh oh!

Uh oh!

hsutter commented Jun 23, 2024 •

edited

Loading

DyXel commented Jun 23, 2024 •

edited

Loading