ddtrace/tracer: optimize span tags storage in the hot path #2799
Benchmarks
Benchmark execution time: 2024-12-02 16:08:30
Comparing candidate commit 51cf835 in PR branch
Found 0 performance improvements and 8 performance regressions! Performance is the same for 50 metrics, 1 unstable metrics.
scenario:BenchmarkPartialFlushing/Disabled-24
scenario:BenchmarkPartialFlushing/Enabled-24
scenario:BenchmarkSetTagMetric-24
scenario:BenchmarkSetTagString-24
scenario:BenchmarkSetTagStringer-24
scenario:BenchmarkSingleSpanRetention/no-rules-24
scenario:BenchmarkSingleSpanRetention/with-rules/match-all-24
scenario:BenchmarkSingleSpanRetention/with-rules/match-half-24
…to test realistic scenarios
… tests and production code
Releasing the map after encoding the span for submission doesn't play well with our current codebase
… the Meta map at the right size impacts performance
This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.
What does this PR do?
Implements an automated optimizer that selects the best initial size for the maps that store metas and metrics (also known as tags). It is going to be based on sketches-go and it will be configurable, so customers can choose the percentile used to initialize the Meta map size.
Also considering whether there is any benefit in pooling maps that are returned when the span is finished.
Motivation
Data gathered from our intake shows that each service has different usage patterns and requirements for tags. While some services may generate spans with a minimal number of tags (around 5 meta tags with no metrics), others may generate thousands of meta tags or thousands of metric tags.
In general, our internal data offers insights like an 80/20 proportion between meta tags (strings) and metric tags (float64).
This suggests that needs may differ wildly among services within the same organization. So, this proposal tries to offer a solution that adapts better than applying some universal or average size that doesn't fit all services.
Pros
This approach has multiple advantages:
Cons
Some disadvantages are:
Benchmarks
Original: map-based
The following results show the impact that map allocations have on performance, and support the idea behind this change:
The average number of tags is based on the distribution obtained from our intake. Baseline is just setting a single tag over and over (b.N times) to a map with a size of 5. Allocations are caused by the map's regrowth, and the average time corresponds to setting from 5 to 70 tags per span - selected randomly by the distribution-based number generator - on b.N spans.
The upcoming Go 1.24 shows different results from previous versions due to swiss maps being enabled by default.
Other implementations considered
Other approaches have been explored:
unsafe. Reference implementation here.
These approaches were discarded because of their performance, complexity or type safety.
Reviewer's Checklist
Unsure? Have a question? Request a review!