UDPSocket#close hangs #368
Comments
Managed to repro 🎉 It seems to be caused by a combination of sync and async invocations, at least with 3.4.1. https://github.com/jscheid/socketry-async-368 Please let me know what else I can do to help find a fix. I could go deep myself, but I'm hoping that you, being way more familiar with all the internals, can pinpoint the problem much more quickly than I can.
I'll investigate. cc @KJTsanaktsidis - do you have any thoughts?
The test case (https://github.com/jscheid/socketry-async-368) appears to fail regardless of selector implementation (tested with select, epoll, uring, kqueue).
When it hangs, I can observe the following behaviour in
With this my latest theory is: when there is a write in progress in one thread as the fd is closed in another thread, the close operation will wait for the write to complete. This mechanism, used to notify of write completion, is aware of fibers but appears to be prone to race conditions involving the special
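To make the theory above a bit more concrete, here is a minimal sketch of what closing an IO does while another thread is blocked on it, in plain Ruby threads with no fiber scheduler involved. It is only an illustration under assumptions: a blocking read on a pipe stands in for the in-progress write on the UDP socket, and close_while_blocked is a hypothetical helper name, not something from async or io-event.

```ruby
require "timeout"

# Hypothetical helper: block one thread on an IO, close that IO from the
# calling thread, and return whatever the blocked thread ended up with.
def close_while_blocked
  reader, writer = IO.pipe

  blocked = Thread.new do
    begin
      reader.read # blocks: nothing is written and the writer stays open
      :returned
    rescue IOError => error
      error # the blocked operation is interrupted when the IO is closed
    end
  end

  sleep 0.1     # give the thread time to enter the blocking read
  reader.close  # close from another thread while the read is in progress
  result = blocked.value
  writer.close
  result
end

# The timeout is only a safety net so the sketch fails loudly instead of hanging.
result = Timeout.timeout(5) { close_while_blocked }
puts result.class
```

Without a scheduler, the blocked thread should be woken with an IOError and the close should return promptly; the hang discussed here would correspond to that wake-up never arriving.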
Hi, I'm trying to debug an issue where a call to statsd.timing (see https://github.com/reinh/statsd) sometimes hangs when invoked from within an Async task, with latest everything (see environment below). I've tracked it down to the following Fiber backtrace:
(This is the last line executed in "user" code.)
It just sits there indefinitely. I'm not sure that this happens every time that close is called in an Async block, but I can reproduce it reliably. Outside of Async it never hangs.
The UDPSocket connects to a port without a listener. Looking at the statsd source, the sequence of events should be something like:
I've tried creating a test case based on the above but so far I can't reproduce it in isolation.
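For reference, this is roughly the shape of the scenario outside Async, where it never hangs: a UDPSocket connected to a port without a listener, one datagram sent, then close. It is a sketch, not the statsd code; unused_udp_port and send_and_close are hypothetical helpers, and the payload is made up.

```ruby
require "socket"
require "timeout"

# Hypothetical helper: pick a local UDP port that (most likely) has no
# listener by binding to an ephemeral port and closing it again.
def unused_udp_port
  probe = UDPSocket.new
  probe.bind("127.0.0.1", 0)
  port = probe.local_address.ip_port
  probe.close
  port
end

# Hypothetical helper: connect, send one datagram, close.
# Returns true once the socket reports itself closed.
def send_and_close(host, port, payload)
  socket = UDPSocket.new
  socket.connect(host, port)
  begin
    socket.send(payload, 0)
  rescue Errno::ECONNREFUSED
    # Can surface on a connected UDP socket when nothing listens on the port.
  end
  socket.close
  socket.closed?
end

# The timeout is only a safety net so a hang would raise instead of blocking forever.
closed = Timeout.timeout(5) do
  send_and_close("127.0.0.1", unused_udp_port, "example.timing:1|ms")
end
puts "closed: #{closed}"
```

Wrapping the same send_and_close call in an Async block would be the obvious next step, though as noted that alone did not reproduce the hang.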
I'm a bit lost now - I might keep trying to reduce it to an isolated repro. In the meantime, do you have any ideas that could point me in the right direction?
My environment:
async (2.21.1)
io-event (1.7.5) (also tested with 1.7.4)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [aarch64-linux] (also tested with ruby 3.3.1) - via ruby:3.4.1-bullseye Docker image
Docker version 27.3.1, build ce12230
Linux df0dd8712c2d 6.12.5-orbstack-00287-gf8da5d508983 #19 SMP Tue Dec 17 08:07:20 UTC 2024 aarch64 GNU/Linux