-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize StandaloneAhcWSRequest.add* methods. #301
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good to me, thanks for adding the jmh tests, @htmldoug.
See the other review comments.
@@ -480,6 +478,24 @@ lazy val `integration-tests` = project.in(file("integration-tests")) | |||
) | |||
.disablePlugins(sbtassembly.AssemblyPlugin) | |||
|
|||
//--------------------------------------------------------------- | |||
// Benchmarks (run manually) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Thanks for setting this up!
@@ -496,13 +512,15 @@ lazy val root = project | |||
.settings(formattingSettings) | |||
.settings(disableDocs) | |||
.settings(disablePublishing) | |||
.settings(crossScalaVersions := Seq(scala212)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pulled it up from disablePublishing. integration-tests
overrode it to include all 3 versions (the ThisBuild default), and bench
needs to do the same things or has build failures from the dependencies on the other projects. Values of the crossScalaVersions
settingKey should be equivalent to what was there before.
git blame points to #244. No idea what's happening there, although I assume Eugene knows what he's doing.
@BenchmarkMode(Array(Mode.AverageTime)) | ||
@Fork(jvmArgsAppend = Array("-Xmx350m", "-XX:+HeapDumpOnOutOfMemoryError"), value = 1) | ||
@State(Scope.Benchmark) | ||
class AddQueryParamsBench { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also have a test for the header changes? It would be good if we have tests for both with*
and add*
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added benchmarks for addHttpHeaders
but stopped short of adding them for with*
since it doesn't do anything interesting.
Since these are run manually, they're not going to catch perf regressions. For that, you'd need a bigger investment to hook them into the build and have consistent (ideally dedicated) hardware. Personally, I like micro-benchmarks for validating before/after changes to interesting parts.
I'll defer and add them to with*
if you really want to, but I'm not convinced there's much to be gained from it. An E2E benchmark for typical request construction might be interesting to further micro-optimize the library, but I'd suggest it's beyond the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, we are not running them in dedicated hardware yet. But we have this for playframework, running nightly. We can later add a nightly script to run the benchmarks here, but I agree with you that we can add them to with*
methods then.
Summary ------- `StandaloneAhcWSRequest` stores query params and headers as immutable maps that support structural sharing, yet the implementation of `addQueryStringParameters((String, String)*)` and `addHttpHeaders(hdrs: (String, String)*)` methods rebuild the entire map from scratch on each call. That's an `O(n)` implementation over an `O(eC)` data structure. Inserting *n* elements is *O(n²)* complexity. Benchmarks ---------- I put together a benchmark to add *size* query params, then measure adding the next one. Before ``` [info] Benchmark (size) Mode Cnt Score Error Units [info] AddQueryParamsBench.addParams 1 avgt 10 417.620 ± 17.237 ns/op [info] AddQueryParamsBench.addParams 10 avgt 10 2257.053 ± 76.440 ns/op [info] AddQueryParamsBench.addParams 100 avgt 10 18839.600 ± 801.003 ns/op [info] AddQueryParamsBench.addParams 1000 avgt 10 224491.116 ± 12133.870 ns/op [info] AddQueryParamsBench.addParams 10000 avgt 10 3065837.189 ± 82797.559 ns/op ``` After ``` [info] Benchmark (size) Mode Cnt Score Error Units [info] AddQueryParamsBench.addParams 1 avgt 10 47.978 ± 0.388 ns/op [info] AddQueryParamsBench.addParams 10 avgt 10 76.495 ± 0.663 ns/op [info] AddQueryParamsBench.addParams 100 avgt 10 118.205 ± 1.072 ns/op [info] AddQueryParamsBench.addParams 1000 avgt 10 141.506 ± 1.681 ns/op [info] AddQueryParamsBench.addParams 10000 avgt 10 151.452 ± 2.560 ns/op ```
5d4e00c
to
1cd3384
Compare
@marcospereira @wsargent do you guys need anything else from me for this to be mergeable? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thank you, @htmldoug. And sorry for the long time to give you a proper response. We were too focused on other tasks.
Summary ------- `StandaloneAhcWSRequest` stores query params and headers as immutable maps that support structural sharing, yet the implementation of `addQueryStringParameters((String, String)*)` and `addHttpHeaders(hdrs: (String, String)*)` methods rebuild the entire map from scratch on each call. That's an `O(n)` implementation over an `O(eC)` data structure. Inserting *n* elements is *O(n²)* complexity. Benchmarks ---------- I put together a benchmark to add *size* query params, then measure adding the next one. Before ``` [info] Benchmark (size) Mode Cnt Score Error Units [info] AddQueryParamsBench.addParams 1 avgt 10 417.620 ± 17.237 ns/op [info] AddQueryParamsBench.addParams 10 avgt 10 2257.053 ± 76.440 ns/op [info] AddQueryParamsBench.addParams 100 avgt 10 18839.600 ± 801.003 ns/op [info] AddQueryParamsBench.addParams 1000 avgt 10 224491.116 ± 12133.870 ns/op [info] AddQueryParamsBench.addParams 10000 avgt 10 3065837.189 ± 82797.559 ns/op ``` After ``` [info] Benchmark (size) Mode Cnt Score Error Units [info] AddQueryParamsBench.addParams 1 avgt 10 47.978 ± 0.388 ns/op [info] AddQueryParamsBench.addParams 10 avgt 10 76.495 ± 0.663 ns/op [info] AddQueryParamsBench.addParams 100 avgt 10 118.205 ± 1.072 ns/op [info] AddQueryParamsBench.addParams 1000 avgt 10 141.506 ± 1.681 ns/op [info] AddQueryParamsBench.addParams 10000 avgt 10 151.452 ± 2.560 ns/op ```
Backport to 2.0.x: 5bd1434 |
Pull Request Checklist
Purpose
Optimizes
StandaloneAhcWSRequest.add*
methods from O(n) down to O(eC) by avoiding reconstructing maps from scratch.Background Context
I spotted this while benching #288.
StandaloneAhcWSRequest
uses immutable maps that support structural sharing, yet the implementations ofaddQueryStringParameters((String, String)*)
andaddHttpHeaders(hdrs: (String, String)*)
rebuild the entire map on each call. For a request with n elements, appending a new one is an O(n) operation over an O(eC) data structure. Inserting n elements is O(n²).Play 2.5
WSRequest
has an O(eC) implementation, so this seems like new behavior in the pre-release of play-ahc-ws-standalone v2.Benchmarks
AddQueryParamsBench
measures adding a single query param to a request that already has size query params.Before
We cross the 1s/call threshold somewhere between the 1,000 and 10,000th param. At 10k query params, constructing the whole request takes 16 seconds.
Admittedly 10k query params is on the high end of what we're likely to see in real apps, but we can also bring it down to nanoseconds pretty trivially.
After
Caveats
There is a default implementation of these methods in the
trait StandaloneWSRequest
. I don't see any harm in leaving them there.References
No other issues or PRs that I'm aware of.