test_runner: add bail out #56490

pmarchini · 2025-01-06T19:15:47Z

Following up on the previous attempt (#48919), this is another try at introducing the bail out feature.
I'm opening this PR as a draft to discuss the implementation, and because refactoring may be needed if the approach is well received by the community.

Note: In some tests, I had to enforce a concurrency=1 setting because testing the bailout feature across multiple files concurrently proved to be extremely flaky.

nodejs-github-bot · 2025-01-06T19:15:54Z

Review requested:

@nodejs/test_runner

atlowChemi · 2025-01-06T19:50:41Z

lib/internal/test_runner/runner.js

+      this.root.harness.testsProcesses.forEach((child) => {
+        child.kill();
+      });
+      return;


replace with an abort signal?

hey @atlowChemi, sure!

We already have an existing signal passed to the spawned process, but that is a signal at the Test level IIRC. Perhaps we can add a new controller on root.harness and spawn with an AbortSignal.any?

Done!
I moved the AbortSignal.any directly into createTestTree for consistency
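
A minimal sketch of the idea discussed in this thread, assuming a harness-level `AbortController` whose signal is combined with the per-test signal through `AbortSignal.any()`. The names `combineSignals`, `spawnTestFile`, `harnessController`, and `testController` are illustrative and are not taken from the PR. Only `AbortController`, `AbortSignal.any()` (available in recent Node.js releases), and the `signal` option of `child_process.spawn()` are existing Node.js APIs.

```js
'use strict';
const { spawn } = require('node:child_process');

// Hypothetical helper: returns a signal that aborts as soon as either the
// harness-level (bail out) signal or the test-level signal aborts.
function combineSignals(harnessSignal, testSignal) {
  return AbortSignal.any([harnessSignal, testSignal]);
}

// Hypothetical helper: spawns a test file so that aborting `signal`
// terminates the child process. Not called in this sketch.
function spawnTestFile(file, signal) {
  const child = spawn(process.execPath, [file], { signal });
  // An aborted spawn emits 'error' with an AbortError; rethrow anything else.
  child.on('error', (err) => {
    if (err.name !== 'AbortError') throw err;
  });
  return child;
}

const harnessController = new AbortController();
const testController = new AbortController();
const signal = combineSignals(harnessController.signal, testController.signal);

// Bailing out aborts the harness controller once; every combined signal
// derived from it is aborted, while the test-level signal is left untouched.
harnessController.abort();
console.log(signal.aborted); // true
console.log(testController.signal.aborted); // false
```

With this shape, a bail out needs a single `abort()` call instead of iterating over the child processes and calling `kill()` on each one.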

cjihrig · 2025-01-06T19:59:57Z

In some tests, I had to enforce a concurrency=1 setting because testing the bailout feature across multiple files concurrently proved to be extremely flaky.

Can you explain more about what was flaky? I'm guessing you mean the tests were at different points of execution when they received the bail out signal. I think the best way to work around this is to use test fixtures that never finish.

pmarchini · 2025-01-06T20:08:53Z

Can you explain more about what was flaky? I'm guessing you mean the tests were at different points of execution when they received the bail out signal. I think the best way to work around this is to use test fixtures that never finish.

Hey @cjihrig, you're guessing right!

Your proposed solution sounds good.

I think we should add a test as follows:

First file test: A test that fails after a long timeout (maybe 5-10 seconds) to allow other file test processes to be spawned correctly.
Second file test: A test with an infinite loop.

WDYT?
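
A rough sketch of what the two fixtures could look like, written here as strings that a helper saves to a temporary directory. The file names, the 5 second delay, and the helper `writeBailFixtures` are assumptions made for illustration. The second fixture uses a pending promise kept alive by a timer rather than a literal busy loop; that is a choice of this sketch, not something settled in the thread.

```js
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Fixture 1: fails after a delay so the other file has time to be spawned.
const failingFixture = `'use strict';
const { test } = require('node:test');
const { setTimeout: sleep } = require('node:timers/promises');

test('fails after a delay', async () => {
  await sleep(5000);
  throw new Error('intentional failure');
});
`;

// Fixture 2: never finishes. The interval keeps the event loop alive, so the
// process only ends when the runner cancels it.
const hangingFixture = `'use strict';
const { test } = require('node:test');

test('never finishes', () => new Promise(() => {
  setInterval(() => {}, 1000);
}));
`;

// Hypothetical helper: writes both fixtures into `dir` and returns their paths.
function writeBailFixtures(dir) {
  const failing = path.join(dir, 'failing.test.js');
  const hanging = path.join(dir, 'hanging.test.js');
  fs.writeFileSync(failing, failingFixture);
  fs.writeFileSync(hanging, hangingFixture);
  return { failing, hanging };
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'bail-fixtures-'));
const fixtures = writeBailFixtures(dir);
console.log(fixtures.failing, fixtures.hanging);
```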

cjihrig · 2025-01-06T20:16:20Z

doc/api/test.md

+By enabling this flag, the test runner will exit the test suite early
+when it encounters the first failing test, preventing
+the execution of subsequent tests.
+Already running tests will be canceled, and no further tests will be started.


I'm not sure if we need to get into this small of a detail, but technically there will be a window of time where new tests may be started before being cancelled. Maybe we can say something like "The test runner will cancel all remaining tests."

cjihrig · 2025-01-06T20:18:04Z

doc/api/test.md

@@ -1483,6 +1512,11 @@ changes:
  does not have a name.
 * `options` {Object} Configuration options for the test. The following
  properties are supported:
+  * `bail` {boolean}


I don't think this should be an option to test(). Should it be part of run() instead?

Yes, my bad, the bail option is part of run(). I'll update the doc ASAP!

cjihrig · 2025-01-06T20:38:28Z

test/parallel/test-runner-run.mjs

+    it('should only allow a boolean in options.bail', async () => {
+      [Symbol(), {}, [], () => { }, 0, 1, 0n, 1n, '', '1', Promise.resolve([])]
+        .forEach((bail) => assert.throws(() => run({ bail }), {
+          code: 'ERR_INVALID_ARG_TYPE'


Just out of curiosity, are we sure that these errors actually correspond to the use of bail? For example, it's possible that ERR_INVALID_ARG_TYPE or ERR_INVALID_ARG_VALUE are thrown by run() completely unrelated to the bail option.

I've written the test following a red-green cycle.
I'll take another look and I'll add a comment here 😁

Another option is to update the assertion to check the code and the message.

I agree, we should probably do the same for the other "validation" tests as well

cjihrig · 2025-01-06T20:40:59Z

test/parallel/test-runner-run.mjs

+
+  // TODO(pmarchini): Bailout is not supported in watch mode yet but it should be.
+  // We should enable this test once it is supported.
+  it.todo('should handle the bail option with watch mode');


Should this be documented?

cjihrig · 2025-01-06T20:52:06Z

I think we should add a test as follows:

First file test: A test that fails after a long timeout (maybe 5-10 seconds) to allow other file test processes to be spawned correctly.

Second file test: A test with an infinite loop.

There are all sorts of annoying edge cases to account for here, and it might be worth doing a survey of how the tap, Mocha, and Vitest runners handle bailing out when things are running in parallel. It's much more straightforward when only one thing is running. But, for example, if the very first test in the first process fails, do we bother spawning the other child processes at all? Or, in a bail out situation, how important is it to have an accurate summary at the end of the test run with correct counts for total tests, cancelled tests, etc.?
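
To make the edge cases concrete, here is a simplified model of one possible answer, not a description of what the PR does. `runWithBail` and `fakeRunFile` are hypothetical names: the scheduler checks a shared abort signal before starting each file, so files queued behind the first failure are never spawned, and in-flight files are recorded as cancelled.

```js
'use strict';

// Hypothetical scheduler: runs `files` with at most `concurrency` in flight.
// `runFile(file, signal)` is an assumed callback that resolves to true on pass.
async function runWithBail(files, runFile, { concurrency = 2 } = {}) {
  const controller = new AbortController();
  const results = new Map(files.map((file) => [file, 'not started']));
  let next = 0;

  async function worker() {
    // Checking the signal before each start means nothing new is spawned
    // once a failure has triggered the bail out.
    while (next < files.length && !controller.signal.aborted) {
      const file = files[next++];
      results.set(file, 'running');
      const passed = await runFile(file, controller.signal);
      if (controller.signal.aborted) {
        results.set(file, 'cancelled'); // was in flight when the bail out hit
      } else if (passed) {
        results.set(file, 'passed');
      } else {
        results.set(file, 'failed');
        controller.abort(); // first failure: bail out
      }
    }
  }

  const workers = Math.min(concurrency, files.length);
  await Promise.all(Array.from({ length: workers }, () => worker()));
  return Object.fromEntries(results);
}

// Simulated file runner: the first file fails at once, any other file hangs
// until the bail out signal fires.
function fakeRunFile(file, signal) {
  if (file === 'a.test.js') return Promise.resolve(false);
  return new Promise((resolve) => {
    signal.addEventListener('abort', () => resolve(false), { once: true });
  });
}

runWithBail(['a.test.js', 'b.test.js', 'c.test.js'], fakeRunFile)
  .then((summary) => console.log(summary));
```

In this model the summary distinguishes failed, cancelled, and never-started files, which is one way to keep the final counts meaningful after a bail out.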

pmarchini · 2025-01-12T10:38:57Z

@cjihrig I'm still checking different runners, and it seems that the behaviour is mostly "non-standard".
Even after bailing out, Vitest reports the list of all skipped test files.

Mocha stops execution and returns a partial result, without reporting the full list of cancelled tests.

      const result = await runMochaAsync('options/parallel/test-*', [
        '--parallel',
        '--bail'
      ]);
      // we don't know _exactly_ how many tests will be skipped here
      // due to the --bail, but the number of tests completed should be
      // less than the total, which is 5.
      return expect(
        result.passing + result.pending + result.failing,
        'to be less than',
        5
      );

Looking at tests like this one, I have the impression that more than one tool tests the feature itself on a "best effort" basis.

While I'm still checking other examples, I think the most common use case for bail out is to stop execution as soon as possible in a CI/automation environment.
In this scenario, IMHO, all that matters is "fail fast" and "fail exit".
I'm not even sure it makes sense to provide a report after a bail out; if a report is provided, I'd prefer output that gives an exact trace of the actual (and therefore partial) run.
WDYT?

Regarding the tests, I was thinking about:

- sequential with a single file
- sequential with isolation none
- parallel with 2 test files (1 with some "loading time" and 1 with an infinite waiting loop)
- parallel with 2+x test files with concurrency fixed to 2, to cover the behaviour of a test file that should not even start its run
- order of the hooks during a bail out, to ensure the hooks run as expected

Do you have any suggestions / other behaviours you think we should ensure?

codecov · 2025-01-12T23:11:14Z

Codecov Report

Attention: Patch coverage is 94.11765% with 6 lines in your changes missing coverage. Please review.

Project coverage is 89.19%. Comparing base (062ae6f) to head (39a21ba).
Report is 43 commits behind head on main.

Files with missing lines	Patch %	Lines
lib/internal/test_runner/runner.js	86.36%	6 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #56490      +/-   ##
==========================================
+ Coverage   89.12%   89.19%   +0.07%     
==========================================
  Files         662      662              
  Lines      191555   191854     +299     
  Branches    36859    36935      +76     
==========================================
+ Hits       170722   171133     +411     
+ Misses      13694    13565     -129     
- Partials     7139     7156      +17

Files with missing lines	Coverage Δ
lib/internal/test_runner/harness.js	`92.83% <100.00%> (+0.16%)`	⬆️
lib/internal/test_runner/reporter/spec.js	`96.29% <100.00%> (+0.10%)`	⬆️
lib/internal/test_runner/reporter/tap.js	`95.43% <100.00%> (+0.06%)`	⬆️
lib/internal/test_runner/reporter/utils.js	`96.84% <100.00%> (+0.13%)`	⬆️
lib/internal/test_runner/test.js	`96.98% <100.00%> (+0.05%)`	⬆️
lib/internal/test_runner/tests_stream.js	`92.13% <100.00%> (+0.41%)`	⬆️
lib/internal/test_runner/utils.js	`58.79% <100.00%> (+2.58%)`	⬆️
src/node_options.cc	`87.94% <100.00%> (+0.01%)`	⬆️
src/node_options.h	`98.32% <100.00%> (+<0.01%)`	⬆️
lib/internal/test_runner/runner.js	`89.88% <86.36%> (+0.37%)`	⬆️

... and 55 files with indirect coverage changes

nodejs-github-bot added the lib / src and needs-ci labels Jan 6, 2025

pmarchini force-pushed the test_runner/bail-out branch from 89f64db to 55874e8 Compare January 6, 2025 19:18

pmarchini requested review from cjihrig, MoLow, marco-ippolito and jakecastelli January 6, 2025 19:18

atlowChemi reviewed Jan 6, 2025

cjihrig reviewed Jan 6, 2025

test_runner: add bail out

f1c269c

pmarchini force-pushed the test_runner/bail-out branch from 55874e8 to f1c269c Compare January 10, 2025 17:36

pmarchini added 2 commits January 11, 2025 11:58

test_runner: replace process.kill with harness abort controller

1d68db4

test_runner: integrate abort signal handling in test tree creation

e691cd6

pmarchini added 6 commits January 12, 2025 19:32

test_runner: add bail out tests

d8efaec

test_runner: update bail-out documentation

e213dbf

test: lint fixtures

9514d89

test: fix assertion

a97c98b

test_runner: add bailout tests for long-running and multi-file cases

4d85df5

test: remove comments

39a21ba

pmarchini marked this pull request as ready for review January 12, 2025 21:51
