-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-45908: Fix client-side HTTP timeouts when communicating with Butler server #1065
Conversation
6b79966
to
b0fab7c
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1065 +/- ##
=======================================
Coverage 89.54% 89.54%
=======================================
Files 359 359
Lines 46473 46509 +36
Branches 9558 9566 +8
=======================================
+ Hits 41612 41648 +36
Misses 3511 3511
Partials 1350 1350 ☔ View full report in Codecov by Sentry. |
The default HTTPX timeout of 5 seconds is much too low when communicating with Butler server -- many requests can exceed that while waiting for the database to respond.
Fetching database results for Butler queries can take an unpredictable amount of time, in some cases several minutes. To ensure that clients do not time out while waiting, send a keep-alive message to the stream every fifteen seconds.
Remove the GzipMiddleware for two reasons: 1. It was preventing "keep-alive" messages from working in streamed responses, because it was batching them until it had a full gzip chunk. 2. It was slowing down queries by about 10% at the RSP -- it turns out that compressing/decompressing is expensive, and there is lots of bandwidth available within Google.
0c1baf5
to
3d29584
Compare
@@ -54,7 +53,6 @@ def create_app() -> FastAPI: | |||
config = load_config() | |||
|
|||
app = FastAPI() | |||
app.add_middleware(GZipMiddleware, minimum_size=1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was preventing the keep-alives from working because it would buffer/compress them and never send them.
I tested it and it turns out that gzipping was slowing down queries by 10% on the RSP -- there's oodles of bandwidth there and apparently the compression overhead is non-trivial. So I just removed this, since there are no configuration options to make it only apply to certain content-types or routes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Great test coverage.
Fixed an issue where slow queries would result in
httpx
timeouts on the client side. The server now sends keep-alives from slow-running queries to prevent a timeout. The timeout duration has been increased to 2 minutes on the client side -- previously it was 5 seconds.Checklist
doc/changes
configs/old_dimensions