add matcher cache for proxy store and tsdb store API #111
Conversation
Nice work!
}

func NewMatcherConverter(cacheCapacity int, reg prometheus.Registerer) (*MatcherConverter, error) {
	c, err := cache.New2Q[LabelMatcher, *labels.Matcher](cacheCapacity)
Nice findings. The TwoQueue cache fits our use case very well.
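For context, here is a minimal sketch of what such a converter could look like, assuming the hashicorp/golang-lru/v2 TwoQueue cache (the diff's cache package looks like a wrapper around something similar) and a MatchersToPromMatchers method mirroring the package-level function; the field names, metric names, and error handling below are illustrative, not the PR's exact code:

// Minimal sketch, not the PR's code: memoize compiled Prometheus matchers
// keyed by the comparable storepb.LabelMatcher value.
package storepb

import (
	lru "github.com/hashicorp/golang-lru/v2"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/prometheus/model/labels"
)

type MatcherConverter struct {
	cache        *lru.TwoQueueCache[LabelMatcher, *labels.Matcher]
	hits, misses prometheus.Counter
}

func NewMatcherConverter(cacheCapacity int, reg prometheus.Registerer) (*MatcherConverter, error) {
	c, err := lru.New2Q[LabelMatcher, *labels.Matcher](cacheCapacity)
	if err != nil {
		return nil, err
	}
	mc := &MatcherConverter{
		cache:  c,
		hits:   prometheus.NewCounter(prometheus.CounterOpts{Name: "matcher_converter_cache_hits_total"}),
		misses: prometheus.NewCounter(prometheus.CounterOpts{Name: "matcher_converter_cache_misses_total"}),
	}
	if reg != nil {
		reg.MustRegister(mc.hits, mc.misses)
	}
	return mc, nil
}

// MatchersToPromMatchers converts matchers, reusing cached results so that
// expensive regex compilation only happens on a cache miss.
func (mc *MatcherConverter) MatchersToPromMatchers(ms ...LabelMatcher) ([]*labels.Matcher, error) {
	res := make([]*labels.Matcher, 0, len(ms))
	for _, m := range ms {
		if pm, ok := mc.cache.Get(m); ok {
			mc.hits.Inc()
			res = append(res, pm)
			continue
		}
		mc.misses.Inc()
		pm, err := MatcherToPromMatcher(m) // the single-matcher helper added in this PR
		if err != nil {
			return nil, err
		}
		mc.cache.Add(m, pm)
		res = append(res, pm)
	}
	return res, nil
}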
	NoCache CacheAction_Type = 2
)

func TestMatcherConverter_MatchersToPromMatchers(t *testing.T) {
Awesome job benchmarking this, would encourage submitting a PR to OSS as well :)
pkg/store/storepb/custom.go (Outdated)
@@ -381,31 +384,112 @@ func PromMatchersToMatchers(ms ...*labels.Matcher) ([]LabelMatcher, error) {
	return res, nil
}

func MatcherToPromMatcher(m LabelMatcher) (*labels.Matcher, error) {
nit: lower-case this function if we don't expose it outside the package.
	}
})

b.Run("With Cache", func(b *testing.B) {
Could we possibly add some random cache-miss cases, e.g. query with regex values like randomkey[0-4] while the generated test data spans randomkey[0-9], so the benchmark runs at roughly a 50% hit/miss ratio?
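One way such a mixed-workload case could look (a sketch under assumptions: the benchmark name and label values are illustrative, and the ~50% miss rate is sustained by giving the cache room for only half of the distinct regex values, with the exact ratio depending on 2Q eviction):

package storepb

import (
	"fmt"
	"math/rand"
	"testing"

	"github.com/prometheus/client_golang/prometheus"
)

// Sketch: the converter caches at most 5 entries while the workload draws
// from 10 distinct regex values, so roughly half of the lookups miss.
func BenchmarkMatcherConverter_HalfHitRate(b *testing.B) {
	mc, err := NewMatcherConverter(5, prometheus.NewRegistry())
	if err != nil {
		b.Fatal(err)
	}
	r := rand.New(rand.NewSource(42))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		m := LabelMatcher{
			Type:  LabelMatcher_RE,
			Name:  "pod",
			Value: fmt.Sprintf("randomkey%d.*", r.Intn(10)), // 10 distinct regex values
		}
		if _, err := mc.MatchersToPromMatchers(m); err != nil {
			b.Fatal(err)
		}
	}
}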
Awesome job @yuchen-db, I highly encourage proposing this optimization to OSS once we've tested it in our setup. A few comments on styling and code structure.
@@ -488,8 +488,14 @@ func (p *PrometheusStore) startPromRemoteRead(ctx context.Context, q *prompb.Que

 // matchesExternalLabels returns false if given matchers are not matching external labels.
 // If true, matchesExternalLabels also returns Prometheus matchers without those matching external labels.
-func matchesExternalLabels(ms []storepb.LabelMatcher, externalLabels labels.Labels) (bool, []*labels.Matcher, error) {
-	tms, err := storepb.MatchersToPromMatchers(ms...)
+func matchesExternalLabels(ms []storepb.LabelMatcher, externalLabels labels.Labels, mc *storepb.MatcherConverter) (bool, []*labels.Matcher, error) {
cool, this is much cleaner
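A sketch of how call sites like this might branch on the converter (the helper name convertMatchers and the converter's MatchersToPromMatchers method are assumptions, not confirmed by the diff); since the cache is opt-in, a nil converter falls back to the original uncached conversion:

package store

import (
	"github.com/prometheus/prometheus/model/labels"
	"github.com/thanos-io/thanos/pkg/store/storepb"
)

// convertMatchers is a hypothetical helper: use the cached converter when
// one is configured, otherwise keep the pre-PR behavior.
func convertMatchers(ms []storepb.LabelMatcher, mc *storepb.MatcherConverter) ([]*labels.Matcher, error) {
	if mc != nil {
		// Cached path: compiled regex matchers are reused across requests.
		return mc.MatchersToPromMatchers(ms...)
	}
	// Uncached fallback, identical to the original code path.
	return storepb.MatchersToPromMatchers(ms...)
}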
nice change!
@@ -1097,6 +1136,8 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
		Default("10000").Uint64Var(&rc.topMetricsMinimumCardinality)
	cmd.Flag("receive.top-metrics-update-interval", "The interval at which the top metrics are updated.").
		Default("5m").DurationVar(&rc.topMetricsUpdateInterval)
	cmd.Flag("receive.store-matcher-converter-cache-capacity", "The number of label matchers to cache in the matcher converter for the Store API. Set to 0 to disable the cache. Default is 0.").
Curious how we decided on this cache size. Roughly how many matchers does each region see?
We use a constant 30k cache size across all regions. I tried 2k, 30k, and 100k; 30k gives a great cache hit rate (99.8%) with a negligible memory footprint in oregon-dev.
The benchmark shows a ~100x speedup when the hit rate is 100%. More tests in https://github.com/databricks-eng/universe/pull/837859
goos: darwin
goarch: arm64
pkg: github.com/thanos-io/thanos/pkg/store/storepb
cpu: Apple M1 Max
BenchmarkMatcherConverter_REWithAndWithoutCache
BenchmarkMatcherConverter_REWithAndWithoutCache/Without_Cache
BenchmarkMatcherConverter_REWithAndWithoutCache/Without_Cache-10 29792 42061 ns/op
BenchmarkMatcherConverter_REWithAndWithoutCache/With_Cache
BenchmarkMatcherConverter_REWithAndWithoutCache/With_Cache-10 3147709 371.9 ns/op
PASS
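For reference, the per-op ratio in the output above is 42061 ns / 371.9 ns ≈ 113x, consistent with the roughly 100x speedup quoted for a 100% hit rate.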