Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: split up test-async-context-frame into individual tests #54818

Closed
wants to merge 1 commit into from

Conversation

jasnell
Copy link
Member

@jasnell jasnell commented Sep 6, 2024

Using the test runner on these appears to be flaky on windows. This is an experiment to see if splitting these up shows the same flakiness in order to help determine if the issue is with the async-content-frame implementation or the test runner itself.

This PR is opened for testing purposes. It may or may not land. I'm opening it so that I can test the flakiness of these changes in CI.

I do not expect this PR to land as is, and there are several steps to this test meant to use a process of elimination to identify the specifc issue at play. There are three distinct components here that need to be evaluated:

  1. Use of experimental-async-context-frame
  2. Spawning multiple python test runner instances in parallel
  3. Use of node:test to structure the test

Each of these individually or in combination could be leading to the access violation on windows and I've been unable to reproduce the problem locally, so this experiment will work through each using stress tests on the CI

Refs: #54808

Using the test runner on these appears to be flaky on windows.
This is an experiment to see if splitting these up shows the
same flakiness in order to help determine if the issue is with
the async-content-frame implementation or the test runner itself.
@jasnell jasnell requested review from Qard and anonrig September 6, 2024 17:15
@nodejs-github-bot nodejs-github-bot added async_hooks Issues and PRs related to the async hooks subsystem. needs-ci PRs that need a full CI run. test Issues and PRs related to the tests. labels Sep 6, 2024
@jasnell
Copy link
Member Author

jasnell commented Sep 6, 2024

Stress Test 1: https://ci.nodejs.org/job/node-stress-single-test/528/ (hopefully this still works)

This comment was marked as outdated.

@Qard
Copy link
Member

Qard commented Sep 6, 2024

Seems like the stress test thing isn't working anymore. Seems to have a bunch of unrelated visual studio project file errors. I'll just kick off a regular CI and see how it goes.

@Qard Qard added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 6, 2024
@cjihrig
Copy link
Contributor

cjihrig commented Sep 6, 2024

I didn't scroll through all of the logs, so maybe I missed something, but it looks like it ran successfully:

17:06:10 TAP version 13
17:06:10 1..2
17:06:10 ok 1 async-hooks/test-async-local-storage-socket-frame
17:06:10   ---
17:06:10   duration_ms: 259.01700
17:06:10   ...
17:06:10 ok 2 async-hooks/test-async-local-storage-socket
17:06:10   ---
17:06:10   duration_ms: 310.98800
17:06:10   ...
17:06:10 
17:06:10 All tests passed.
17:06:10 1000   OK: 1000   NOT OK: 0   TOTAL: 1000

@jasnell
Copy link
Member Author

jasnell commented Sep 6, 2024

I was able to get at least part of it working: https://ci.nodejs.org/job/node-stress-single-test/528/ ... with only a couple of modified tests (running the *-socket and *-socket-frame variants 1000 times each. Zero flakiness reported over 1000 runs, which is a good sign that the issue is not with the async-context-frame changes. I'm really thinking the issue lies more with the spawning of the parallel python test-runner processes, in which case tracking down the specific problem is likely going to be difficult. I propose that if the regular CI run on this PR is good, we go ahead and merge this PR with the individual tests split out into individual files. We can separate keep investigating as time allows what may be happening with the python runner. Any objections to that approach? @Qard @cjihrig @anonrig ?

@anonrig
Copy link
Member

anonrig commented Sep 6, 2024

Sounds good to me 👍

@Qard
Copy link
Member

Qard commented Sep 6, 2024

I'm not a huge fan of the doubled tests. Can we first try eliminating the parallelism of the original test and see if that is sufficient? I suspect it's a matter of we're spinning up a ton of different tests at the same time and perhaps just running out of memory from so many things running at once. 🤔

@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 6, 2024
@nodejs-github-bot
Copy link
Collaborator

@Qard
Copy link
Member

Qard commented Sep 6, 2024

I've opened #54823 to test my assumption that it might just be the concurrency that's problematic.

@jasnell jasnell closed this Sep 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem. needs-ci PRs that need a full CI run. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants