
Support Maven Surefire Plugin rerunning flaky tests #539

Open

fibbers opened this issue Dec 14, 2023 · 21 comments

@fibbers

fibbers commented Dec 14, 2023

Hi,

First of all, I think this plugin works great and provides many GitHub reporting options. 👌

I'm currently in the process of running my tests in parallel to speed up my test run. Some of my tests now fail because of concurrency issues, which is why I also configured the Maven Surefire Plugin to rerun failed tests.

However, I notice a few things:

  1. my regular mvn test command ultimately succeeds (i.e. after rerunning some tests, all my tests pass):
    [screenshot]

  2. however, publish-unit-test-result-action creates a check with a failed status (❌ Maven Test Results)

  3. the annotations show failed tests, since the first attempt they did indeed fail (however, succeeding attempts pass):
    [screenshot]
    Note that my mvn test command completes with the following output (Flakes signifies the reruns):

    2023-12-14T17:23:48.1537837Z [WARNING] Tests run: 5587, Failures: 0, Errors: 0, Skipped: 116, Flakes: 15
    2023-12-14T17:23:48.1538689Z [INFO] 
    2023-12-14T17:23:48.1539320Z [INFO] ------------------------------------------------------------------------
    2023-12-14T17:23:48.1540050Z [INFO] BUILD SUCCESS
    2023-12-14T17:23:48.1540749Z [INFO] ------------------------------------------------------------------------
    2023-12-14T17:23:48.1541494Z [INFO] Total time:  21:03 min
    2023-12-14T17:23:48.1542240Z [INFO] Finished at: 2023-12-14T18:23:48+01:00
    2023-12-14T17:23:48.1543112Z [INFO] ------------------------------------------------------------------------
    
  4. the duration shown in the summary (57m) seems to be the sum of the parallel runs instead of the wall clock time


My simplified workflow:

- name: Run tests with Maven
  run: mvn --file pom.xml test
- name: Publish Test Results
  uses: EnricoMi/publish-unit-test-result-action@v2
  if: always()
  with:
    files: |
      example/target/surefire-reports/*.xml
    check_name: "Maven Test Results"
    report_individual_runs: "true"
    action_fail_on_inconclusive: "true"

Log output of the Publish Test Results step (sensitive parts removed):

2023-12-14 18:23:52 +0100 - publish -  INFO - Available memory to read files: 14.5 GiB
2023-12-14 18:23:52 +0100 - publish -  INFO - Reading files example/target/surefire-reports/*.xml (448 files, 40.5 MiB)
2023-12-14 18:23:53 +0100 - publish -  INFO - Detected 448 JUnit XML files (40.5 MiB)
2023-12-14 18:23:53 +0100 - publish -  INFO - Finished reading 448 files in 0.64 seconds
2023-12-14 18:23:54 +0100 - publish -  INFO - Publishing failure results for commit abcdef123
2023-12-14 18:23:55 +0100 - publish -  INFO - Created check https://github.com/example/runs/12649858973
2023-12-14 18:23:55 +0100 - publish -  INFO - Created job summary
2023-12-14 18:24:01 +0100 - publish -  INFO - Edited comment for pull request #...: https://github.com/example/pull/123#issuecomment-123

Example of annotations of 2 different tests where in both cases only the first attempt failed:
[screenshot]

Just for completeness, but probably not necessary, relevant parts of pom.xml:

	<plugin>
		<groupId>org.apache.maven.plugins</groupId>
		<artifactId>maven-surefire-plugin</artifactId>
		<version>3.2.3</version>
		<configuration>
			<rerunFailingTestsCount>3</rerunFailingTestsCount>
		</configuration>
	</plugin>

Example of (part of) an xml report file that has a passing test after a rerun:

<testcase name="testSomething" classname="example.SomeTest" time="0.001">
    <flakyFailure message="[someAssertionHere] &#10;Expecting actual not to be empty" type="java.lang.AssertionError">
        <stackTrace><![CDATA[java.lang.AssertionError: 
[someAssertionHere] 
Expecting actual not to be empty
at ...
]]></stackTrace>
        <system-out><![CDATA[... 
]]></system-out>
        <system-err><![CDATA[... 
]]></system-err>
    </flakyFailure>
</testcase>

Is there a configuration parameter I missed that would create a succeeding check (✅ Maven Test Results) and cause the report to show only succeeding tests (optionally after retries)? Or is rerunning simply not supported?

The repository I'm working on is private, so unfortunately I can't give any direct links, but let me know if more information is needed.

@EnricoMi
Owner

Thanks for the detailed description of your use case. There is no option available currently. Rather than changing the logic, I would specifically handle the "flaky" annotations in the test result files: https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html
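
Roughly what I have in mind, as a minimal sketch (in Python, the action's implementation language; the helper and its logic are illustrative, not the action's actual code):

import xml.etree.ElementTree as ET

FLAKY_TAGS = {"flakyFailure", "flakyError"}

def flaky_test_cases(report_path):
    # Yield (classname, name) of test cases that failed at least one
    # attempt but eventually passed, i.e. carry only <flaky*> children.
    # <failure>/<error> are only written when all attempts failed.
    root = ET.parse(report_path).getroot()
    for case in root.iter("testcase"):
        tags = {child.tag for child in case}
        if tags & FLAKY_TAGS and not tags & {"failure", "error"}:
            yield case.get("classname"), case.get("name")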

Can you provide an example test result with one failure and success reruns, as well as one result file with all rerun failures?

@fibbers
Author

fibbers commented Dec 15, 2023

Sure, for completeness, I made 4 tests fail/error by either throwing an exception or doing a false assertion:

  • throwsOnlyFirstAttempt()
  • throwsAlways()
  • failsAssertionOnlyFirstAttempt()
  • failsAssertionAlways()

Output of mvn test command with rerunFailingTestsCount=3:

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] example.ExampleTests.failsAssertionAlways
[ERROR]   Run 1: ExampleTests.failsAssertionAlways:40 Always fails assertion ==> expected: <1> but was: <2>
[ERROR]   Run 2: ExampleTests.failsAssertionAlways:40 Always fails assertion ==> expected: <1> but was: <2>
[ERROR]   Run 3: ExampleTests.failsAssertionAlways:40 Always fails assertion ==> expected: <1> but was: <2>
[ERROR]   Run 4: ExampleTests.failsAssertionAlways:40 Always fails assertion ==> expected: <1> but was: <2>
[INFO]
[ERROR] Errors:
[ERROR] example.ExampleTests.throwsAlways
[ERROR]   Run 1: ExampleTests.throwsAlways:26 Runtime Always throws exception
[ERROR]   Run 2: ExampleTests.throwsAlways:26 Runtime Always throws exception
[ERROR]   Run 3: ExampleTests.throwsAlways:26 Runtime Always throws exception
[ERROR]   Run 4: ExampleTests.throwsAlways:26 Runtime Always throws exception
[INFO]
[WARNING] Flakes:
[WARNING] example.ExampleTests.failsAssertionOnlyFirstAttempt
[ERROR]   Run 1: ExampleTests.failsAssertionOnlyFirstAttempt:32 First attempt fails assertion ==> expected: <1> but was: <2>
[INFO]   Run 2: PASS
[INFO]
[WARNING] example.ExampleTests.throwsOnlyFirstAttempt
[ERROR]   Run 1: ExampleTests.throwsOnlyFirstAttempt:18 Runtime First attempt throws exception
[INFO]   Run 2: PASS
[INFO]
[INFO]
[ERROR] Tests run: 4, Failures: 1, Errors: 1, Skipped: 0, Flakes: 2
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  24.788 s
[INFO] Finished at: 2023-12-15T10:22:37+01:00
[INFO] ------------------------------------------------------------------------

TEST-example.ExampleTests.xml:

<?xml version="1.0" encoding="UTF-8"?>
<testsuite xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://maven.apache.org/surefire/maven-surefire-plugin/xsd/surefire-test-report-3.0.xsd" version="3.0" name="example.ExampleTests" time="0.002" tests="2" errors="1" skipped="0" failures="1">
  <testcase name="throwsAlways" classname="example.ExampleTests" time="0.01">
    <error message="Always throws exception" type="java.lang.RuntimeException"><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:26)
]]></error>
    <rerunError message="Always throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:26)
]]></stackTrace>
    </rerunError>
    <rerunError message="Always throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:26)
]]></stackTrace>
    </rerunError>
    <rerunError message="Always throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:26)
]]></stackTrace>
    </rerunError>
  </testcase>
  <testcase name="failsAssertionAlways" classname="example.ExampleTests" time="0.003">
    <failure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError"><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:40)
]]></failure>
    <rerunFailure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:40)
]]></stackTrace>
    </rerunFailure>
    <rerunFailure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:40)
]]></stackTrace>
    </rerunFailure>
    <rerunFailure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:40)
]]></stackTrace>
    </rerunFailure>
  </testcase>
  <testcase name="failsAssertionOnlyFirstAttempt" classname="example.ExampleTests" time="0.0">
    <flakyFailure message="First attempt fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: First attempt fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionOnlyFirstAttempt(ExampleTests.java:32)
]]></stackTrace>
    </flakyFailure>
  </testcase>
  <testcase name="throwsOnlyFirstAttempt" classname="example.ExampleTests" time="0.001">
    <flakyError message="First attempt throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: First attempt throws exception
	at example.ExampleTests.throwsOnlyFirstAttempt(ExampleTests.java:18)
]]></stackTrace>
    </flakyError>
  </testcase>
</testsuite>

example.ExampleTests.txt:

-------------------------------------------------------------------------------
Test set: example.ExampleTests
-------------------------------------------------------------------------------
Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 0.002 s <<< FAILURE! -- in example.ExampleTests
example.ExampleTests.failsAssertionAlways -- Time elapsed: 0.001 s <<< FAILURE!
org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:40)

example.ExampleTests.throwsAlways -- Time elapsed: 0.001 s <<< ERROR!
java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:26)

@EnricoMi EnricoMi changed the title Maven Surefire Plugin reruns tests and completes successfully, but the plugin marks the test results as failed Support Maven Surefire Plugin rerunning flaky tests Dec 15, 2023
@EnricoMi
Owner

Thanks for the input. With the provided XML the action reports one test as error, one as failure and the last two as pass.

Can you add another two tests that fail / error twice and pass on third attempt?

@fibbers
Author

fibbers commented Dec 15, 2023

Sure, see below for these 7 tests so far:

  • throwsOnlyFirstAttempt()
  • throwsFirstTwoAttempts() <-- added
  • throwsAlways()
  • failsAssertionOnlyFirstAttempt()
  • failsAssertionFirstTwoAttempts() <-- added
  • failsAssertionAlways()
  • alwaysPasses() <-- added, to be sure
Output of mvn test:

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] example.ExampleTests.failsAssertionAlways
[ERROR]   Run 1: ExampleTests.failsAssertionAlways:60 Always fails assertion ==> expected: <1> but was: <2>
[ERROR]   Run 2: ExampleTests.failsAssertionAlways:60 Always fails assertion ==> expected: <1> but was: <2>
[ERROR]   Run 3: ExampleTests.failsAssertionAlways:60 Always fails assertion ==> expected: <1> but was: <2>
[ERROR]   Run 4: ExampleTests.failsAssertionAlways:60 Always fails assertion ==> expected: <1> but was: <2>
[INFO]
[ERROR] Errors:
[ERROR] example.ExampleTests.throwsAlways
[ERROR]   Run 1: ExampleTests.throwsAlways:37 Runtime Always throws exception
[ERROR]   Run 2: ExampleTests.throwsAlways:37 Runtime Always throws exception
[ERROR]   Run 3: ExampleTests.throwsAlways:37 Runtime Always throws exception
[ERROR]   Run 4: ExampleTests.throwsAlways:37 Runtime Always throws exception
[INFO]
[WARNING] Flakes:
[WARNING] example.ExampleTests.failsAssertionFirstTwoAttempts
[ERROR]   Run 1: ExampleTests.failsAssertionFirstTwoAttempts:52 First two attempts fail assertion ==> expected: <1> but was: <2>
[ERROR]   Run 2: ExampleTests.failsAssertionFirstTwoAttempts:52 First two attempts fail assertion ==> expected: <1> but was: <2>
[INFO]   Run 3: PASS
[INFO]
[WARNING] example.ExampleTests.failsAssertionOnlyFirstAttempt
[ERROR]   Run 1: ExampleTests.failsAssertionOnlyFirstAttempt:43 First attempt fails assertion ==> expected: <1> but was: <2>
[INFO]   Run 2: PASS
[INFO]
[WARNING] example.ExampleTests.throwsFirstTwoAttempts
[ERROR]   Run 1: ExampleTests.throwsFirstTwoAttempts:29 Runtime First two attempts throw exception
[ERROR]   Run 2: ExampleTests.throwsFirstTwoAttempts:29 Runtime First two attempts throw exception
[INFO]   Run 3: PASS
[INFO]
[WARNING] example.ExampleTests.throwsOnlyFirstAttempt
[ERROR]   Run 1: ExampleTests.throwsOnlyFirstAttempt:20 Runtime First attempt throws exception
[INFO]   Run 2: PASS
[INFO]
[INFO]
[ERROR] Tests run: 7, Failures: 1, Errors: 1, Skipped: 0, Flakes: 4
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  23.284 s
[INFO] Finished at: 2023-12-15T14:58:05+01:00
[INFO] ------------------------------------------------------------------------

TEST-example.ExampleTests.xml:

<?xml version="1.0" encoding="UTF-8"?>
<testsuite xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://maven.apache.org/surefire/maven-surefire-plugin/xsd/surefire-test-report-3.0.xsd" version="3.0" name="example.ExampleTests" time="0.001" tests="2" errors="1" skipped="0" failures="1">
  <testcase name="throwsAlways" classname="example.ExampleTests" time="0.01">
    <error message="Always throws exception" type="java.lang.RuntimeException"><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:37)
]]></error>
    <rerunError message="Always throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:37)
]]></stackTrace>
    </rerunError>
    <rerunError message="Always throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:37)
]]></stackTrace>
    </rerunError>
    <rerunError message="Always throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:37)
]]></stackTrace>
    </rerunError>
  </testcase>
  <testcase name="throwsFirstTwoAttempts" classname="example.ExampleTests" time="0.0">
    <flakyError message="First two attempts throw exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: First two attempts throw exception
	at example.ExampleTests.throwsFirstTwoAttempts(ExampleTests.java:29)
]]></stackTrace>
    </flakyError>
    <flakyError message="First two attempts throw exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: First two attempts throw exception
	at example.ExampleTests.throwsFirstTwoAttempts(ExampleTests.java:29)
]]></stackTrace>
    </flakyError>
  </testcase>
  <testcase name="failsAssertionAlways" classname="example.ExampleTests" time="0.003">
    <failure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError"><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:60)
]]></failure>
    <rerunFailure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:60)
]]></stackTrace>
    </rerunFailure>
    <rerunFailure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:60)
]]></stackTrace>
    </rerunFailure>
    <rerunFailure message="Always fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:60)
]]></stackTrace>
    </rerunFailure>
  </testcase>
  <testcase name="failsAssertionOnlyFirstAttempt" classname="example.ExampleTests" time="0.001">
    <flakyFailure message="First attempt fails assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: First attempt fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionOnlyFirstAttempt(ExampleTests.java:43)
]]></stackTrace>
    </flakyFailure>
  </testcase>
  <testcase name="alwaysPasses" classname="example.ExampleTests" time="0.001"/>
  <testcase name="throwsOnlyFirstAttempt" classname="example.ExampleTests" time="0.001">
    <flakyError message="First attempt throws exception" type="java.lang.RuntimeException">
      <stackTrace><![CDATA[java.lang.RuntimeException: First attempt throws exception
	at example.ExampleTests.throwsOnlyFirstAttempt(ExampleTests.java:20)
]]></stackTrace>
    </flakyError>
  </testcase>
  <testcase name="failsAssertionFirstTwoAttempts" classname="example.ExampleTests" time="0.0">
    <flakyFailure message="First two attempts fail assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: First two attempts fail assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionFirstTwoAttempts(ExampleTests.java:52)
]]></stackTrace>
    </flakyFailure>
    <flakyFailure message="First two attempts fail assertion ==&gt; expected: &lt;1&gt; but was: &lt;2&gt;" type="org.opentest4j.AssertionFailedError">
      <stackTrace><![CDATA[org.opentest4j.AssertionFailedError: First two attempts fail assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionFirstTwoAttempts(ExampleTests.java:52)
]]></stackTrace>
    </flakyFailure>
  </testcase>
</testsuite>

example.ExampleTests.txt:

(this shows Tests run: 2, I'm not sure why, though this file is probably insignificant)

-------------------------------------------------------------------------------
Test set: example.ExampleTests
-------------------------------------------------------------------------------
Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 0.001 s <<< FAILURE! -- in example.ExampleTests
example.ExampleTests.failsAssertionAlways -- Time elapsed: 0.001 s <<< FAILURE!
org.opentest4j.AssertionFailedError: Always fails assertion ==> expected: <1> but was: <2>
	at example.ExampleTests.failsAssertionAlways(ExampleTests.java:60)

example.ExampleTests.throwsAlways -- Time elapsed: 0 s <<< ERROR!
java.lang.RuntimeException: Always throws exception
	at example.ExampleTests.throwsAlways(ExampleTests.java:37)

@EnricoMi
Owner

EnricoMi commented Dec 15, 2023

The action only considers known tags like <failure> and <error>. Given these only exist when all attempts fail / error, the reporting behaviour is correct: a test passes if not all attempts fail / error. Do you agree this is the current behaviour?

The 1 out of 2 here is because you ran your test suite twice: once the test failed all attempts, once it succeeded on at least one attempt. The reruns are in the same file, but your report includes tests from two files.
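
For reference, per the Surefire rerun-failing-tests docs, the tag combinations map to outcomes roughly like this:

  • <failure> or <error> (plus <rerunFailure> / <rerunError> children): every attempt failed, a real failure / error
  • only <flakyFailure> / <flakyError> children: a later attempt passed, a flaky but ultimately successful test
  • no such children: passed on the first attempt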

@fibbers
Author

fibbers commented Dec 18, 2023

I think you're right and I agree with the current behaviour.

After reading your comment, I've dug a bit deeper and found that I had not accurately reproduced the actual problem.

Some context: we run ~5000 tests spread over many classes, and each class gets its own TEST-fqdnClassNameHere.xml report generated (i.e. the 2 TEST-*.xml files in my report are from 2 different classes; I'm assuming each class is considered a test suite).

Though our mvn test command exited successfully and reported that it reran some tests, the resulting XML reports from GitHub Actions still contained <failure>/<error> tags (even without any <flaky*> sibling tags 🤔). This probably means my problem lies in generating the XML reports rather than in how publish-unit-test-result-action analyzes and reports the failed tests. This action indeed seems to behave correctly.

I'll focus my efforts now on how the XML reports are generated; it might be related to the fact that we're running our tests in parallel.
But that's all unrelated to this action 🙂 so I'll close this issue.

Thanks for helping along and the quick responses.

@fibbers fibbers closed this as completed Dec 18, 2023
@EnricoMi
Owner

Thanks for the clarification and for raising this!

@EnricoMi
Owner

Let's reopen this to track supporting the flaky annotation for eventually successful tests. That would be a great addition to the summary.

@EnricoMi EnricoMi reopened this Aug 18, 2024
@LorenzoBettini

So, currently, the only way is to use "fail_on: nothing" and rely on the failure of the build job, right?

@EnricoMi
Owner

Depends on what you want. Do you want to fail on flakiness? Or do you want to fail on consistent test failure? This action fails on the latter (with default settings).

@LorenzoBettini

We don't want it to fail on flakiness, but it does. It looks like it interprets the first failed attempts of flaky tests as failures, and the corresponding job is marked as failed. We don't want that.

@LorenzoBettini

@EnricoMi I've also tried with fail_on: nothing (eclipse-xtext/xtext#3164), but the job for publishing test results is still marked with failure (due to flakes). Am I missing something?

@EnricoMi
Owner

We don't want it to fail on flakiness, but it does. It looks like it interprets the first failed attempts of flaky tests as failures

Can you point me to an example workflow run?

@HannesWell

We don't want it to fail on flakiness, but it does. It looks like it interprets the first failed attempts of flaky tests as failures

Can you point me to an example workflow run?

Many of the latest runs in https://github.com/eclipse/xtext/ are affected.
One specific example is https://github.com/eclipse/xtext/actions/runs/10538099413
where the test results are collected in https://github.com/eclipse/xtext/actions/runs/10538099413/job/29201077926 which leads to this check result:
https://github.com/eclipse/xtext/pull/3154/checks?check_run_id=29201077926

@EnricoMi
Owner

In file org.eclipse.xtext.ui.tests/target/surefire-reports/TEST-org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJdtImplTest.xml in artifact test-results-Linux-21 of workflow run https://github.com/eclipse/xtext/actions/runs/10538099413 the suite claims 11 tests and 1 failure, but there are only 10 <testcase>s. Looks like the last test is flaky and was rerun (which should not increment the tests counter).

Looking at [1], a <flakyFailure> indicates the first run failed, but the second succeeded. So this test case should not be considered a failure. If all runs of the test had failed, we would see a <failure> tag for the first run and a <rerunFailure> for each rerun.

The XML files are inconsistent with [1].

Footnotes

  [1]: https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html

@LorenzoBettini

We use maven-surefire, and for Eclipse tests (the flaky ones) we use tycho-surefire, which I think uses maven-surefire under the hood. So we don't do anything specific...

@HannesWell

HannesWell commented Aug 26, 2024

In file org.eclipse.xtext.ui.tests/target/surefire-reports/TEST-org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJdtImplTest.xml in artifact test-results-Linux-21 of workflow run https://github.com/eclipse/xtext/actions/runs/10538099413 the suite claims 11 tests and 1 failure, but there are only 10 <testcase>s. Looks like the last test is flaky and was rerun (which should not increment the tests counter).

Looking at [1], a <flakyFailure> indicates the first run failed, but the second succeeded. So this test case should not be considered a failure. If all runs of the test had failed, we would see a <failure> tag for the first run and a <rerunFailure> for each rerun.

Thank you for investigating this. And although it is correct that Xtext uses the tycho-surefire-plugin, that plugin 'only' calls the functionality of the maven-surefire-plugin from an Eclipse/OSGi runtime.

And indeed I can reproduce the described behavior with a very simple Maven project that only has the following test class:

package bar;
import static org.junit.Assert.assertEquals;
import java.util.concurrent.atomic.AtomicInteger;
import org.junit.Test;

public class FailOnFirstExecutionTest {
	private static final AtomicInteger EXECUTIONS = new AtomicInteger(0);
	@Test
	public void testName() throws Exception {
		assertEquals(2, EXECUTIONS.incrementAndGet());
	}
}
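
(Presumably the rerun passes because Surefire reruns the test in the same forked JVM, so the static AtomicInteger keeps its value across attempts.)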

and a very simple pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>foo</groupId>
	<artifactId>bar</artifactId>
	<version>0.0.1-SNAPSHOT</version>

	<properties>
		<maven.compiler.release>17</maven.compiler.release>
	</properties>

	<dependencies>
		<dependency>
			<groupId>junit</groupId>
			<artifactId>junit</artifactId>
			<version>4.13.2</version>
		</dependency>
	</dependencies>
	<build>
		<pluginManagement>
			<plugins>
				<plugin>
					<groupId>org.apache.maven.plugins</groupId>
					<artifactId>maven-surefire-plugin</artifactId>
					<version>3.4.0</version>
					<configuration>
						<rerunFailingTestsCount>3</rerunFailingTestsCount>
					</configuration>
				</plugin>
			</plugins>
		</pluginManagement>
	</build>
</project>

The resulting test report XML has the described content:

<?xml version="1.0" encoding="UTF-8"?>
<testsuite xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://maven.apache.org/surefire/maven-surefire-plugin/xsd/surefire-test-report.xsd" version="3.0.2" name="bar.FailOnFirstExecutionTest" time="0.042" tests="2" errors="0" skipped="0" failures="1">
  <properties>
...
  </properties>
  <testcase name="testName" classname="bar.FailOnFirstExecutionTest" time="0.0">
    <flakyFailure message="expected:&lt;2&gt; but was:&lt;1&gt;" type="java.lang.AssertionError">
      <stackTrace><![CDATA[java.lang.AssertionError: expected:<2> but was:<1>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)
	at bar.FailOnFirstExecutionTest.testName(FailOnFirstExecutionTest.java:14)
...
]]></stackTrace>
    </flakyFailure>
  </testcase>
</testsuite>

The XML files are inconsistent with [1].

I cannot see that this page makes any statements regarding the summary in the headline. But I agree with you that the content is indeed surprising. I would also expect the tests attribute to count the number of actual tests and not the number of runs, and I would expect failures/errors to only count eventual failures/errors.

I'll ask the Maven devs if this is intended. I don't see anything regarding flakes in JUnit so I assume it's a Maven-Surefire feature.

@michael-o

https://issues.apache.org/jira/browse/SUREFIRE-1751

@EnricoMi
Owner

EnricoMi commented Aug 28, 2024

I think the <testsuite> counts all flaky failures / errors in addition to test case outcomes:

<testsuite xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="https://maven.apache.org/surefire/maven-surefire-plugin/xsd/surefire-test-report.xsd"
           version="3.0"
           name="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest"
           time="8.454" tests="8" errors="0" skipped="1" failures="1">
  <properties>
    ...
  </properties>
  <testcase name="testOnRemoveTwoProjects" classname="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest" time="0.177"/>
  <testcase name="testBug574908_Performance" classname="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest" time="3.674"/>
  <testcase name="testBug574908_Performance_500" classname="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest" time="0.0">
    <skipped message="Can be enabled for performance test purposes"/>
  </testcase>
  <testcase name="testGetStorages_Performance" classname="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest" time="3.771"/>
  <testcase name="testOnClasspathChange" classname="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest" time="0.452"/>
  <testcase name="testBug574908" classname="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest" time="0.107"/>
  <testcase name="testOnCloseOpenRemoveProject" classname="org.eclipse.xtext.ui.tests.core.resource.Storage2UriMapperJavaImplTest" time="0.152">
    <flakyFailure message="{} expected:&lt;1&gt; but was:&lt;0&gt;" type="java.lang.AssertionError">
      <stackTrace><![CDATA[...]]></stackTrace>
    </flakyFailure>
  </testcase>
</testsuite>

We have 7 <testcase>s, so you would expect <testsuite tests="7" ...>. Further, the failure is a flaky one, so the test case is considered successful. Still, the suite counts a failure. It looks like all attempts / runs of a test and their outcomes are aggregated into the <testsuite> counters, which is not what you would expect. The test result report / this action has to ignore the <testsuite> attributes and process the individual <testcase>s to distinguish successful from flaky tests.

I have a workaround to derive suite counts from test cases, which will make flaky tests count as successful tests and not over-count the number of tests.
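
As a rough sketch of that workaround (hypothetical Python helper, not the actual implementation), the counters can be recomputed from the <testcase> children, ignoring the attributes on <testsuite>:

import xml.etree.ElementTree as ET

def recomputed_suite_counts(report_path):
    # Derive tests/failures/errors/skipped from the <testcase> elements,
    # counting flaky tests (only <flaky*> children) as successes.
    root = ET.parse(report_path).getroot()
    tests = failures = errors = skipped = 0
    for case in root.iter("testcase"):
        tags = {child.tag for child in case}
        tests += 1
        if "failure" in tags:      # all attempts failed an assertion
            failures += 1
        elif "error" in tags:      # all attempts raised an exception
            errors += 1
        elif "skipped" in tags:
            skipped += 1
        # only <flakyFailure>/<flakyError> children: eventually passed
    return tests, failures, errors, skipped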

Supporting flaky tests natively will require extending JunitParser. Then this action will mark flaky tests as ❄️ instead of ✅.

@HannesWell

https://issues.apache.org/jira/browse/SUREFIRE-1751

Thanks for the pointer! But from the report I'm not sure if it's just about processing a given test-report XML or if it's about the creation of that file. I have the impression it's about the former.

I think the <testsuite> counts all flaky failures / errors in addition to test case outcomes:

We have 7 testcases, so you would expect <suite tests="7" ...>. Further, the failure is a flaky one, so the test case is considered successful. Still the suite counts a failure. Looks like all attempts / runs of a test and their outcomes are provided in the <suite>, which is not what you would expect. The test result report / this action has to ignore the <suite> and process the individual <testcase>s to distinguish successful from flaky tests.

Yes, I think that's exactly what's happening.

I have a workaround to derive suite counts from test cases, which will make flaky tests count as successful tests and not over-count the number of tests.

Yes, I wanted to suggest that as well, but first wanted to know whether the current content is intended or not.
Actually, the individual testcase elements only have to be inspected for real error or failure children if the errors or failures attribute of the testsuite element is non-zero. If both are zero, everything passed on the first run, the tests attribute is exactly as expected, and you could just rely on the summary.
Or is the entire XML parsed in any case anyway?
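
A sketch of that short-circuit (assuming a helper like the recomputed_suite_counts sketched above, and relying on the observed behaviour that flaky runs bump the failures / errors attributes):

import xml.etree.ElementTree as ET

def suite_counts(report_path):
    suite = ET.parse(report_path).getroot()  # <testsuite> is the root element
    if suite.get("failures") == "0" and suite.get("errors") == "0":
        # nothing failed on any attempt, so the summary attributes are reliable
        return int(suite.get("tests")), 0, 0, int(suite.get("skipped", "0"))
    return recomputed_suite_counts(report_path)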

Supporting flaky tests natively will require extending JunitParser. Then this action will mark flaky test as ❄️ instead of ✅.

That would be nice :) Thank you!

@michael-o

https://issues.apache.org/jira/browse/SUREFIRE-1751

Thanks for the pointer! But from the report I'm not sure if it's just about processing a given test-report XML or if it's about the creation of that file. I have the impression it's about the former.

Just wanted to share that this issue exists already in some form.
