Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethrow async exception #1013

Merged
merged 6 commits into from
Nov 19, 2024

Conversation

kazu-yamamoto
Copy link
Contributor

@Vlix This is the first step to fix the stuffs in #1012.

@kazu-yamamoto kazu-yamamoto force-pushed the rethrow-async-exception branch from 7313278 to 88b67b4 Compare November 14, 2024 02:03
@kazu-yamamoto kazu-yamamoto requested a review from Vlix November 14, 2024 03:25
@Vlix
Copy link
Contributor

Vlix commented Nov 14, 2024

I'm a bit busy this week, but will try to take a look next week 👍

We now have listThreads.
So, it would be nice to revisit the old idea.
@kazu-yamamoto
Copy link
Contributor Author

time-manager is leaked sometime:

  1. In this pattern, time-manager is solely leaked:
20 QUIC dispatcher: ThreadBlocked BlockedOnSTM
24 WAI timeout manager (Reaper): ThreadBlocked BlockedOnMVar
3462 Warp HTTP/1.1 [::ffff:81.161.238.88]:53801: ThreadBlocked BlockedOnMVar
3464 Warp HTTP/1.1 [::ffff:81.161.238.88]:53803: ThreadBlocked BlockedOnMVar
3467 Warp HTTP/1.1 [::ffff:81.161.238.88]:53831: ThreadBlocked BlockedOnMVar
3570 Warp HTTP/1.1 [::ffff:81.161.238.88]:56968: ThreadBlocked BlockedOnMVar
3575 Warp HTTP/1.1 [::ffff:81.161.238.88]:57059: ThreadBlocked BlockedOnMVar
  1. In this pattern, time-manager leaks other workers by holding ThreadId. I can tell this because they are in ThreadFinished. af20093 tries to fix this.
10150 WAI timeout manager (Reaper): ThreadBlocked BlockedOnMVar
10190 Warp HTTP/1.1 [::ffff:103.8.117.203]:43390: ThreadFinished
10191 Warp HTTP/1.1 [::ffff:103.8.117.203]:43392: ThreadFinished
10193 Warp HTTP/1.1 [::ffff:103.8.117.203]:43404: ThreadFinished
10202 Warp HTTP/1.1 [::ffff:103.8.117.203]:43406: ThreadFinished
10204 Warp HTTP/1.1 [::ffff:103.8.117.203]:41256: ThreadFinished

@kazu-yamamoto
Copy link
Contributor Author

Both Reaper and TimeManager do not use MVar at all.
Why BlockedOnMVar?

@kazu-yamamoto
Copy link
Contributor Author

kazu-yamamoto commented Nov 19, 2024

I hope that 1400f27 fixes the leak of TimeManager.

@kazu-yamamoto
Copy link
Contributor Author

TimeManager is still leaking, sigh.

23 WAI timeout manager (Reaper): ThreadBlocked BlockedOnMVar

@kazu-yamamoto
Copy link
Contributor Author

The 23 thread is gone. Why did it took so a long time?

@kazu-yamamoto
Copy link
Contributor Author

OK. I understand.
threadDelay is implemented with MVar.
So, it can be in the BlockedOnMVar state.
IOManager keeps working if connections come repeatedly.
So, it can have a long life and is finished when all connections are finished.
Now I think TimeManager is not leaking.

TimeManager is leaking because of "BlockedOnMVar".
Both TimeManger and Reaper do not use MVar at all.
TLS "bye" is the only registered function which uses MVar.
When TimeManager kills Warp, "bye" is called twice.
One is via "bracket", the other is by TimeManager.
This ensures that "bye" is called only once.
@kazu-yamamoto kazu-yamamoto force-pushed the rethrow-async-exception branch from 489c063 to 1400f27 Compare November 19, 2024 20:51
@kazu-yamamoto
Copy link
Contributor Author

I'm going to merge this PR and release new versions as this PR fixes some thread leaks.
Of course, comments after releasing are welcome.

@kazu-yamamoto kazu-yamamoto merged commit 5736771 into yesodweb:master Nov 19, 2024
13 of 14 checks passed
@kazu-yamamoto kazu-yamamoto deleted the rethrow-async-exception branch November 19, 2024 20:54
@Vlix
Copy link
Contributor

Vlix commented Nov 20, 2024

Few things I've found:

  • onException in httpOverTLS is not masked, so the close s can be interrupted by any exception. (do we want that?)
  • this catch handler is interruptible now, is that ok?
  • the withFdCache's after is now not uninterruptibleMasked. Though I think that's probably fine?
  • getAndRegisterInfo will now also catch AsyncExceptions and negative turns them into a syncronous Exception.
  • this terminate will not be uninterruptibleMasked. Is that ok? How important is reaperStop?
    Same with this close, this stopManager, this cleanUp, this cancel, this onClose, this teardown, this close
  • time-manager now seems to also eat up AsyncExceptions here and here

The difference between uninterruptibleMask and regular mask is only that AsyncExceptions can come through with mask in case the thread is blocked, right? Which would be when waiting for an MVar, or a threadDelay too? (I forgot how it goes with FFI calls 🤔)

@kazu-yamamoto
Copy link
Contributor Author

From the doc of Control.Exception:

If the target thread is currently making a foreign call, then the exception will not be raised (and hence throwTo will not return) until the call has completed.

That is, asynchronous exceptions cannot interrupt foreign calls.

@kazu-yamamoto
Copy link
Contributor Author

ThreadIds in auto-update are not disclosed.
Nobody can throw asynchronous exceptions to threads spawned by auto-update, I believe.

@Vlix
Copy link
Contributor

Vlix commented Nov 21, 2024

ThreadIds in auto-update are not disclosed. Nobody can throw asynchronous exceptions to threads spawned by auto-update, I believe.

StackOverflow and HeapOverflow are also asynchronous, so I don't think you'd want to assume anything (though the chances are super small)

@kazu-yamamoto
Copy link
Contributor Author

StackOverflowandHeapOverflow` are also asynchronous, so I don't think you'd want to assume anything (though the chances are super small)

Good point!
I will think your review results again.

@Vlix
Copy link
Contributor

Vlix commented Nov 21, 2024

OK. I understand. threadDelay is implemented with MVar. So, it can be in the BlockedOnMVar state. IOManager keeps working if connections come repeatedly. So, it can have a long life and is finished when all connections are finished. Now I think TimeManager is not leaking.

How did you figure this out, btw? I can only narrow it down to an internal built-in delay# function 🤔

@kazu-yamamoto kazu-yamamoto mentioned this pull request Nov 21, 2024
@kazu-yamamoto
Copy link
Contributor Author

How did you figure this out, btw? I can only narrow it down to an internal built-in delay# function 🤔

See: https://www.stackage.org/haddock/lts-22.42/base-4.18.2.1/src/GHC.Conc.IO.html#threadDelay

Our RTS is threaded. So, we should follow Event.threadDelay

ghc/libraries/base/GHC/Event/Thread.hs:threadDelay is defined as follows:

threadDelay :: Int -> IO ()
threadDelay usecs = mask_ $ do
  mgr <- getSystemTimerManager
  m <- newEmptyMVar
  reg <- TM.registerTimeout mgr usecs (putMVar m ())
  takeMVar m `onException` TM.unregisterTimeout mgr reg

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants