Process requests after write EOF #209

Open
wants to merge 11 commits into master from complete-when-writer-closed

Conversation

dhouse-js (Contributor)

In many runtimes, there are separate reader and writer threads that
drive the reading and writing from httpaf independently. So a thing
that can happen is:

  • A request arrives.
  • The response handler begins but does not finish responding to it.
  • The writer thread calls Server_connection.report_write_result `Closed.
  • The reader thread delivers another request.

In this case, httpaf will never deliver the second request to the
handler, because the first request's output_state never gets to
Complete. We have no hope of responding to the second request, but we
should still deliver it to the handler in case the request has
side-effects (e.g. as many POST requests do).

This PR fixes this by noticing when the writer is closed in
Reqd.output_state and just always returning Complete in this case,
as no more output progress can be made.
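
In code, the change amounts to something like the following hedged sketch; the type, field, and constructor names here are illustrative assumptions, not httpaf's actual definitions:

(* Hedged sketch only: [t], [writer_closed], and the state names are
   illustrative assumptions, not httpaf's actual definitions. *)
type output_state = Waiting | Ready | Complete

type response_state = Rs_waiting | Rs_streaming | Rs_complete

type t =
  { writer_closed : unit -> bool (* e.g. the writer reported `Closed *)
  ; response_state : response_state
  }

let output_state t =
  if t.writer_closed () then
    (* the writer saw EOF: no more output progress is possible, so
       report Complete even though the response never finished *)
    Complete
  else
    match t.response_state with
    | Rs_waiting -> Waiting
    | Rs_streaming -> Ready
    | Rs_complete -> Complete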

dpatti and others added 5 commits April 22, 2021 15:27
This is a prelude to inhabitedtype#159, which introduces upgrade requests, with a few
major changes in `Server_connection`.

The goal here is to make queue management easier to reason about
by folding bits of logic from `advance_request_queue_if_necessary` into
`next_read_operation` and `next_write_operation` such that we only
perform side-effects when the operation in question demands it.

One of the ways I tried to make this easier to reason about was to make
the `next_<read|write>_operation` functions very parallel. Getting the
read operation starts with a short-circuit that shuts the server down
when it can no longer make progress (the reader is closed and the queue
is empty). This doesn't feel like it belongs here; perhaps this check
should be part of `advance_request_queue`, with some extra logic
triggering in `shutdown_reader`? After that, the next-operation
functions use some very simple probing of the input/output state of
`Reqd` to determine what to do next. Only in the case of `Complete` do
we move into a separate function (to make it easier to read):
`_final_<read|write>_operation`.

In these functions, we decide if we should shutdown the respective
reader/writer or consider the `reqd` complete and move it off the queue.
What's happening is that we don't know if the write action or read
action will be last, so each function checks the state of the other to
see if they're both complete. When we do shift it off, we recursively
ask for the next operation given the new queue state.

In the case of the writer triggering the advancing, before we return the
result, we wakeup the reader so that it can evaluate the next operation
given the new queue state.

Note that in the case of a non-persistent connection, the queue is never
advanced and the connection is shut down when both sides are done.
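
To make that concrete, here is a rough sketch of the write-side shape; every helper passed in below is an assumed stand-in for httpaf internals, not verbatim source:

(* Hedged sketch of the _final_write_operation shape described above. *)
let final_write_operation ~persistent ~input_complete ~shutdown
    ~advance_request_queue ~wakeup_reader ~next_write_operation =
  if not persistent then
    (* non-persistent connection: never advance the queue; shut the
       connection down once both sides are done *)
    (shutdown (); `Close 0)
  else if input_complete then begin
    (* both read and write sides are complete: shift the reqd off the
       queue, wake the reader so it can re-evaluate, then ask for the
       next operation given the new queue state *)
    advance_request_queue ();
    wakeup_reader ();
    next_write_operation ()
  end
  else
    (* the read side isn't done yet; let it decide what happens next *)
    `Yield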

Though on the surface, these pieces feel fairly straightforward, there
are still a slew of re-entrancy bugs to consider. I think there are two
things that we can do to make this drastically easier to manage:

1. We call `t.request_handler` in two places, and this is mostly because
   we want to keep the invariant that the head of the request queue has
   already been passed off to the handler. I feel like splitting this up
   into a simple queue of unhandled requests and a [Reqd.t option] that
   represents the current request would be easier to manage.

2. It would be nice to schedule calls. Things like waking up the writer
   before you let the read loop know its next operation immediately
   make my mind fall apart and lose track of state. There's a fairly
   obvious solution of asking for a `schedule : (unit -> unit) -> unit`
   function from the runtime that promises not to call the thunk
   synchronously, but rather waits until it is outside of the read and
   write loops (a rough sketch follows this list). But maybe we can
   solve it with what we have now, like establishing a contract that
   when the reader/writer is woken up, it must schedule its work for a
   fresh call stack and not immediately ask for operations.
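
For what it's worth, a minimal sketch of what that runtime-side contract could look like; this is entirely hypothetical, nothing like it exists in httpaf today:

(* Hypothetical runtime hook: thunks are deferred instead of being run
   synchronously, and only run once we are back outside the read/write
   loops, on a fresh call stack. *)
type scheduler = { mutable pending : (unit -> unit) list }

let create () = { pending = [] }

let schedule t thunk = t.pending <- thunk :: t.pending

(* The runtime calls this after each turn of its read/write loops. *)
let run_pending t =
  let thunks = List.rev t.pending in
  t.pending <- [];
  List.iter (fun thunk -> thunk ()) thunks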
This is because the writer is always woken by the appropriate calls that
push chunks onto the body or writer, or by calls that close the body.

Had to import an additional line from a recent band-aid fix regarding
setting the flag on non-chunked streaming responses. It feels like we
should find an alternative means of maintaining this piece of
information.
We basically never want to call `Queue.clear` because the head of the
queue has special semantic meaning. Instead, we never try to clear the
queue and rely on the fact that the queue will never be advanced. This
is easy to reason about because the only time we advance the request
queue is when the current request is not persistent. I added an explicit
test of this situation to build confidence.

Additionally, there was an incorrect assertion that you couldn't finish
a write with reads still pending. A test was added upstream and it no
longer fails with this fix.

The final change removed some subtle but unused code. In the write loop,
we had something that decided to shut down the connection if the reader
was closed, parallel to the next read operation. But this felt weird:
the reader should always be awake in the case that it is closed, which
means that either (1) it will shut down the connection, or (2) it will
wait for the writer, which will wake the reader once it has advanced the
request queue, and then the reader will shut down the connection.
@dhouse-js changed the title from "Complete when writer closed" to "Process requests after write EOF" on Apr 29, 2021
@seliopou (Member)

Can you rebase this?

@dhouse-js (Contributor, Author)

Done.

The diff still has spurious stuff in it. I think this is because refactor-request-queue is behind master by a few commits in the main httpaf repo. Perhaps that branch could be merged with httpaf/master?

@seliopou (Member)

Sorry, I thought this was based off master but it's based off the #172 branch. I'll talk to @dpatti to see if that is good, and then all the other PRs will follow.

@dpatti force-pushed the complete-when-writer-closed branch from 004daf2 to e5fd51f on May 22, 2021
@dpatti changed the base branch from refactor-request-queue to master on May 22, 2021
@dpatti (Collaborator) commented May 22, 2021

Rebased to master

@dpatti (Collaborator) commented May 22, 2021

I think I'm seeing a problem here. Consider a change like this:

diff --git a/lib_test/test_server_connection.ml b/lib_test/test_server_connection.ml
index d47c62a..f55af7a 100644
--- a/lib_test/test_server_connection.ml
+++ b/lib_test/test_server_connection.ml
@@ -913,6 +913,8 @@ let test_can_read_more_requests_after_write_eof () =
   reqd := None;
   reader_ready t;
   read_request t request;
+  Reqd.respond_with_streaming (Option.get !reqd) response ~flush_headers_immediately:true
+  |> (ignore : [ `write ] Body.t -> unit);
   Alcotest.(check bool) "request handler fired" true (Option.is_some !reqd)
 ;;

Roughly, imagine that this is called in every request handler, like it would be in a real application. This will raise with [failure] cannot write to closed writer, which comes from Faraday, because respond_with_streaming (and likewise respond_with_string) both try to write to t.writer, which is already closed. I think we could already repro this trivially (edit: yes): if your request handler is asynchronous, and something shuts the writer down before it runs (which I think can only happen with an explicit shutdown call), then it would raise for the same reasons. In this case, though, we'd always raise because we're willingly calling the request handler even though the writer is closed.

I'm not sure what the ideal behavior here is.

@dhouse-js (Contributor, Author) commented May 24, 2021

FWIW, when developing my application I ran into the exact case you're describing -- a response handler needs some time to decide how to respond, and gets an exception when it finally does respond because the writer has been closed in the meantime.

It seems to me that the only options are:

  1. Keep the current behaviour: raise if the writer has already been closed when a response is attempted.
  2. Silently drop responses if the writer is already closed.
  3. Return a result type that indicates whether the response was successfully written to the writer or not.

I went with (2) in my application -- i.e., I wrapped all of the respond functions to catch this exception. This felt acceptable to me because servers typically don't care whether their responses were actually sent to the client. And of course, even if we do successfully put the response into the writer, there is no guarantee that the bytes will actually be delivered to the network and received by the client.

(2) does have the disadvantage that it destroys information, but perhaps we could expose a Reqd.writer_is_open function that allows the caller to recover it should they wish.

It's worth noting that, even with (2), you have a further choice to make when it comes to respond_with_streaming: do you still hand a Body.t back to the caller, even though no content written to that body will ever reach the writer? In my application, I drew the line here, and instead wrapped respond_with_streaming to have a return type of [`write] Body.t option.
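
For concreteness, a minimal sketch of such a wrapper, assuming (per the report above) that writing to a closed writer surfaces as a Faraday Failure; the exact exception may differ:

(* Hedged application-side sketch of the wrapper described above. *)
open Httpaf

let respond_with_streaming_opt ?flush_headers_immediately reqd response =
  match
    Reqd.respond_with_streaming ?flush_headers_immediately reqd response
  with
  | body -> Some body
  | exception Failure _ -> None (* writer already closed: drop the response *)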

I can write a separate PR for this should people wish.
