GH-35089: [CI][C++][Flight] Test failures in macos release verification nightlies #35090
zeroshade commented Apr 12, 2023
- Closes: [CI][C++][Flight] Test failures in macos release verification nightlies #35089
|
@github-actions crossbow submit verify-rc-source-cpp-macos-* |
Revision: 01d81d6 Submitted crossbow builds: ursacomputing/crossbow @ actions-6610cb8aee
|
@github-actions crossbow submit verify-rc-source-cpp-macos-arm64 |
Revision: 76d6522 Submitted crossbow builds: ursacomputing/crossbow @ actions-73cf368dff
|
76d6522 to 3ab5a0f
@github-actions crossbow submit verify-rc-source-cpp-macos-* |
Revision: 3ab5a0f Submitted crossbow builds: ursacomputing/crossbow @ actions-d506a04ac8
|
I've reduced the failures at least, but I can't seem to figure out the cause of these Segfaults in the macos-amd64-conda release verification. Any assistance here would be amazing. Thanks! |
Weston spun up a macOS environment that I plan to look into tonight |
However we can/should get this merged |
Ok, I checked out the instance that Weston set up. (Thanks a lot Weston!) The test fails probabilistically if I just try to run
Meanwhile, one of the gRPC server threads is executing the same method:
That said, this is supposed to be a thread local, so I don't see how they'd trample each other here. I also can't get lldb to print out the thread-local value so I can't check if it's initialized or not. |
Ah, this gets lldb to recognize it:
lldb says they're the same:
But I wonder if it recognizes thread locals properly. That said, this seems to mostly be a gRPC problem... |
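To make the thread-local reasoning above concrete, here is a minimal sketch of what "supposed to be a thread local" implies in standard C++: each live thread gets its own copy of the variable at its own address, and a write in one thread is not visible in another. The names `tls_slot`, `CountDistinctTlsAddresses`, and `WritesStayThreadLocal` are hypothetical and are not taken from gRPC; the `int` is only a stand-in for whatever gRPC actually stores per thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-thread slot; NOT gRPC's real declaration.
thread_local int tls_slot = 0;

// Starts `n` threads, keeps all of them alive at the same time, and returns
// how many distinct addresses of `tls_slot` they observed.
std::size_t CountDistinctTlsAddresses(int n) {
  assert(n > 0);
  std::mutex mu;
  std::condition_variable cv;
  std::set<const void*> addresses;
  int arrived = 0;

  std::vector<std::thread> threads;
  threads.reserve(static_cast<std::size_t>(n));
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([&] {
      std::unique_lock<std::mutex> lock(mu);
      // `tls_slot` here names the copy owned by the executing thread.
      addresses.insert(&tls_slot);
      ++arrived;
      cv.notify_all();
      // Wait until every thread has recorded its address, so that no
      // thread exits early and lets its storage be reused by another.
      cv.wait(lock, [&] { return arrived == n; });
    });
  }
  for (auto& t : threads) t.join();
  return addresses.size();
}

// Returns true if a write made by a worker thread leaves the calling
// thread's copy of `tls_slot` untouched.
bool WritesStayThreadLocal() {
  tls_slot = 1;
  std::thread worker([] { tls_slot = 2; });
  worker.join();
  return tls_slot == 1;
}
```

Calling `CountDistinctTlsAddresses(4)` should report four distinct addresses, and `WritesStayThreadLocal()` should return true. If a debugger shows the same address for two concurrently live threads, one possible explanation is that it resolved the expression in a single thread's context rather than that the storage is actually shared, which fits the doubt raised above about whether lldb handles thread locals properly.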
I do see that grpc-cpp is 1.51.1 in the env and in the crashing CI build. (The brew CI build fails, but doesn't crash.) But conda-forge has 1.52 now - maybe we should try that? |
Aha, and we were artificially on an old version because the conda package was renamed |
Unfortunately, gRPC 1.52.1 has the same issue. |
Correction: the main thread has a different stack trace, so it seems less likely that two threads are trampling on ExecCtx and more that there's something being tickled in gRPC:
|
I suppose:
|
@lidavidm When I followed the crossbow builds back, I couldn't see an environmental change. The last successful commit before this started failing was f2d632e...1d74483 on Feb 20th (which is before the last update to I'm currently building grpc from source on the macOS instance that @westonpace spun up to compare against the conda env. If I'm able to figure out anything, I'll comment back here. |
@lidavidm I was able to reproduce these crashes by building grpc v1.51.1 from source on the macOS box that @westonpace spun up, while keeping the rest of the dependencies coming from Conda. In addition, Weston found this: conda-forge/grpc-cpp-feedstock#281, which could likely explain the issue (assuming we're not missing something). I'm currently building v1.54.0 from source to confirm for myself that I didn't screw up anything in the environment, but it looks like this explains the problem. Afterwards, I can try pinning conda to v1.50.0 or something instead and see if that alleviates the issue. |
@github-actions crossbow submit verify-rc-source-cpp-macos-conda-amd64 |
Revision: 4ef0c66 Submitted crossbow builds: ursacomputing/crossbow @ actions-18a35681a6
|
Pinning the conda version of grpc-cpp to <=1.50.1 looks like it worked on the macosx ec2 box. So here's hoping it works for crossbow also, if so we can call this done and set! 😄 |
Wow, that's an incredible find. Thanks for figuring this out! |
Wow, amazing job! |
…on nightlies (#35090) * Closes: #35089 Authored-by: Matt Topol <[email protected]> Signed-off-by: Matthew Topol <[email protected]>
Benchmark runs are scheduled for baseline = 3ff3cc8 and contender = 2daa0c3. 2daa0c3 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
…fication nightlies (apache#35090) * Closes: apache#35089 Authored-by: Matt Topol <[email protected]> Signed-off-by: Matthew Topol <[email protected]>