Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connections leaks in server #57

Closed
paurkedal opened this issue Jun 21, 2018 · 2 comments
Closed

Connections leaks in server #57

paurkedal opened this issue Jun 21, 2018 · 2 comments

Comments

@paurkedal
Copy link

While running benchmarks for #53, I noticed that there were connections left after wrk2 was done. I think the issue is in the generic httpaf code, since the async benchmark also reports remaining connections with the same test. Here is an example output from the benchmark:

2018-06-21 09:27:38.670988+02:00 conns: 0
2018-06-21 09:27:39.174527+02:00 conns: 0
2018-06-21 09:27:39.674665+02:00 conns: 0
2018-06-21 09:27:40.174861+02:00 conns: 0
2018-06-21 09:27:40.675038+02:00 conns: 0
2018-06-21 09:27:41.220915+02:00 conns: 284
2018-06-21 09:27:41.721138+02:00 conns: 684
2018-06-21 09:27:41.908716+02:00 Error (monitor.ml.Error (Unix.Unix_error "Connection reset by peer" read "")
  ("Raised at file \"src/import0.ml\" (inlined), line 351, characters 22-32"
    "Called from file \"src/result.ml\" (inlined), line 168, characters 17-26"
    "Called from file \"src/raw_fd.ml\", line 272, characters 4-60"
    "Called from file \"src/raw_fd.ml\", line 265, characters 10-26"
    "Re-raised at file \"async/httpaf_async.ml\", line 72, characters 6-15"
    "Called from file \"src/deferred0.ml\", line 61, characters 64-69"
    "Called from file \"src/job_queue.ml\", line 159, characters 6-47"
    "Caught by monitor (id 26)"))
2018-06-21 09:27:41.908801+02:00 Error (monitor.ml.Error (Unix.Unix_error "Connection reset by peer" read "")
  ("Raised at file \"src/import0.ml\" (inlined), line 351, characters 22-32"
    "Called from file \"src/result.ml\" (inlined), line 168, characters 17-26"
    "Called from file \"src/raw_fd.ml\", line 272, characters 4-60"
    "Called from file \"src/raw_fd.ml\", line 265, characters 10-26"
    "Re-raised at file \"async/httpaf_async.ml\", line 72, characters 6-15"
    "Called from file \"src/deferred0.ml\", line 61, characters 64-69"
    "Called from file \"src/job_queue.ml\", line 159, characters 6-47"
    "Caught by monitor (id 20)"))
2018-06-21 09:27:42.221287+02:00 conns: 2
2018-06-21 09:27:42.721474+02:00 conns: 2
2018-06-21 09:27:43.221622+02:00 conns: 2
2018-06-21 09:27:43.721809+02:00 conns: 2
2018-06-21 09:27:44.221947+02:00 conns: 2
^C

My wrk2 command line for this was

../wrk2/wrk --rate 1K --connections 1K --timeout 5m --duration 1s --threads 4 --latency http://127.0.0.1:8080

Running multiple times, usually leaving more open connections, I can verify that the number of exceptions reported is always the same as the number of connections left.

@paurkedal
Copy link
Author

I investigated this a bit further, and the remaining connections handlers after wrk exists seems to be stuck in read. This is likely the issue described e.g. in Detection of Half-Open (Dropped) Connections, which explicitly mentions timeout as the solution for HTTP servers which support persistent connections. So, it's not a bug.

For a production-ready server we would need a timeout. I don't know Async well, but Lwt this can be easily done by joining the handle thread with an Lwt_unix.timeout, but this is not fully correct, since a big transfer could be cancelled even though it's still streaming. So, maybe the handler constructors should take a ~idle_timout or ~read_idle_timeout parameter? (Same semantics, different name suggestions.)

@seliopou
Copy link
Member

Thanks for the report, Closing in favor of #86.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants