Long running ResponseBody async Task never gets canceled #618

Open
vanwagonet opened this issue Nov 24, 2024 · 5 comments

@vanwagonet

I was able to write an event source text stream as a ResponseBody, but I have found that when the client disconnects, the task continues running indefinitely, even when the server tries to gracefully shut down.

I had assumed that the client disconnecting from the request would cause the body sending task to cancel. I tried diving through the code, but it got pretty deep into NIO, and well beyond my depth pretty quickly. Should the body async task become cancelled when the client disconnects?

Is there some other signal that such a long-running response might tap into for client disconnection, or graceful shutdown?

Here's a response that reproduces the issue:

Response(
    status: .ok,
    headers: [
        .cacheControl: "no-store",
        .connection: "keep-alive",
        .contentType: "text/event-stream",
    ],
    body: ResponseBody(asyncSequence: AsyncStream { stream in
        Task {
            stream.yield(ByteBuffer(string: "event: ping\ndata: \n\n"))
            while !Task.isCancelled {
                if case .terminated = stream.yield(ByteBuffer(string: "event: ping\ndata: \n\n")) {
                    break
                }
                try? await Task.sleep(for: .seconds(20))
            }
            // Note that more useful events would also be sent as applicable.
        }
    })
)
@vanwagonet
Author

Note that I also tried something like the example below, in case my extra layer of Task was causing the issue. However, this also did not become canceled when the client disconnected, nor during graceful shutdown.

ResponseBody { writer in
    for await buffer in stream where !Task.isCancelled {
        try await writer.write(buffer)
    }
    try await writer.finish(nil)
}

@adam-fowler
Member

There are two issues here:

  1. Your unstructured task will cause issues, as cancellation is not propagated to unstructured tasks: they are not the child of another task (see the sketch at the end of this comment).

  2. There is currently no easy way to catch the client closing its input and pass that on to the HTTP2 handler. If the client completely closes the connection, the writes will fail and you will exit the write function. But if it does a clean close, it only closes its side of the connection; the server is still allowed to continue writing, and there is currently no way to pass the inputClosed event to the HTTP handler. The way the NIO concurrency types are structured makes this hard to do.

It is perfectly valid for a client to send a request and close its side of the connection before getting a response, so I can't cancel all requests based off the client closing input, but I need to be able to pass that information to users so they can make that call. Just haven't worked out exactly how to do that yet.

One possibility is to add an option to disable half closure and then writes should fail as soon as the input is closed.

I will speak with NIO team as well to see if they have any ideas.
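
To illustrate point 1, here's a minimal sketch (illustrative only, reusing the ping stream from the original report): keep a handle to the unstructured producer task and cancel it from the stream's onTermination handler, so the producer at least stops once the sequence itself is torn down.

ResponseBody(asyncSequence: AsyncStream { continuation in
    // The producer is unstructured, so cancellation of the request task will never
    // reach it automatically; keep a handle so it can be cancelled explicitly.
    let producer = Task {
        while !Task.isCancelled {
            if case .terminated = continuation.yield(ByteBuffer(string: "event: ping\ndata: \n\n")) {
                break
            }
            try? await Task.sleep(for: .seconds(20))
        }
        continuation.finish()
    }
    // When the consumer stops iterating (or the stream is dropped), cancel the producer.
    continuation.onTermination = { _ in producer.cancel() }
})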

@vanwagonet
Author

Thanks for the quick reply! Yeah, this sounds like a tough one. For most normal requests I anticipate the body completes quickly, and actual task canceling may cause more trouble than it's worth. However, for these long-running event streams, it can become a memory leak. My unstructured tasks don't help either. I'll see if I can clean that up.

I did find the cancelOnGracefulShutdown helper from swift-service-lifecycle, which gives a signal for when the server is shutting down. I see they are using some @TaskLocal machinery for that. Perhaps a similar API could be made for inputClosed? If there were a cancelOnRequestDisconnect, that would be enough for this case.
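
For reference, here's a rough sketch of how that helper could slot into the writer-based body from my earlier comment (a sketch only, assuming the same `stream` of ByteBuffers and `import ServiceLifecycle`):

ResponseBody { writer in
    // cancelOnGracefulShutdown() (from swift-service-lifecycle) ends the sequence
    // when the server begins graceful shutdown, so the loop exits and the
    // response can be finished instead of hanging forever.
    for await buffer in stream.cancelOnGracefulShutdown() where !Task.isCancelled {
        try await writer.write(buffer)
    }
    try await writer.finish(nil)
}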

@adam-fowler
Member

> Thanks for the quick reply! Yeah, this sounds like a tough one. For most normal requests I anticipate the body completes quickly, and actual task canceling may cause more trouble than it's worth. However, for these long-running event streams, it can become a memory leak. My unstructured tasks don't help either. I'll see if I can clean that up.

Yeah, I understand this is an issue and want to provide a solution. I've a few ideas to try out, but it might require a special HTTP1 handler for servers that have long-running cancellable requests.

> I did find the cancelOnGracefulShutdown helper from swift-service-lifecycle, which gives a signal for when the server is shutting down. I see they are using some @TaskLocal machinery for that. Perhaps a similar API could be made for inputClosed? If there were a cancelOnRequestDisconnect, that would be enough for this case.

Yes, a TaskLocal might be a solution for passing inputClosed state down the call stack.
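
Purely as an illustration of that idea (every name here is hypothetical; nothing like this exists in Hummingbird today), the handler could bind a task-local probe that long-running response writers check alongside Task.isCancelled:

// Hypothetical sketch only: a task-local the HTTP handler could bind per request.
enum RequestConnection {
    @TaskLocal static var isInputClosed: @Sendable () -> Bool = { false }
}

// A long-running writer loop could then poll it:
// while !Task.isCancelled && !RequestConnection.isInputClosed() { ... }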

On the subject of graceful shutdown: when the server receives a graceful shutdown message, it stops accepting new requests and waits until all in-flight requests have been responded to before shutting down. In the case of your long-running SSE request it will never shut down, because you never return a finished response, so you need to handle graceful shutdown within your response writer to end the sending of server-sent events and finish the response.

@vanwagonet
Author

I'd like to emphasize that this is not an urgent issue.

I was able to get my SSE response shutting down gracefully with cancelOnGracefulShutdown. I don't have anything heading to production right now; I'm just learning and trying to understand server-side Swift for future recommendations. For my toy app, once I got graceful shutdown handled, my responses do seem to terminate a few minutes after a disconnect, probably because the other half of the connection finally closes and the writes fail.

In a future production app, I'm not sure how concerning it would be to have these streams hanging out longer than needed, but it definitely would give more confidence if there was a mechanism to clean them up immediately after disconnecting.
