ratelimit: Overhaul metrics for our existing rate limits #7054

beautifulentropy · 2023-08-25T22:12:14Z

Use constants for each rate limit name to ensure consistency when labeling metrics
Consistently check .Enabled() outside of each limit check RA method
Replace the existing checks counter with a latency histogram

Note for reviewers: Prior to this PR, some of the errors emitted by rate limit check methods were being counted as denials in the metrics and logs. Please ensure that these changes are desirable.

Part of #5545

pgporada · 2023-08-29T18:57:09Z

ra/ra.go

-	}, []string{"limit", "result"})
-	stats.MustRegister(rateLimitCounter)
+	rlCheckLatency := prometheus.NewHistogramVec(prometheus.HistogramOpts{
+		Name: "ratelimitsv1_check_latency_seconds",


I think it's intended that rate limit names will be identical between the v1 ratelimit and v2 ratelimits packages. You could add a label, version="v1" or version="v2", rather than have two separate metrics.

I had precisely this idea myself, however:

My understanding is that once a time series is collected by Prometheus with a specific label set, you cannot change its labels retroactively. So if we use a label like 'version', which has a pretty short lifetime, we're stuck with it unless we change the name of the time-series.

The key-value version of this histogram is very likely to have much smaller buckets.

If I'm wrong on either of these points, please let me know. Generally though, this choice was very much intentional.

Got it. According to the Prometheus docs, changing the old metrics would require relabeling, and at that point we might as well just make a new metric.

Something about this conclusion strikes me as unlikely or surprising. IIRC it's totally possible to set metric datapoints without specifying values for every label they have. And old data points which have the "version" label set will age out of the grafana database at the same speed as old datapoints from this whole metric would.

Most of what grafana/prometheus are talking about when they talk about "relabeling" is transforming the labels on one metric to match the labels on another metric so that you can query both of them at the same time. That's definitely a pain, but I don't think we'd need to do that here. We'd simply remove the "version" label from this code, stop exporting it, and the timeseries database will catch up when the old datapoints eventually fall out.

I think? Maybe I'm totally wrong.

I'd prefer to leave this alone rather than juggle histogram buckets and labels for v1 and v2 in the same time series. I could also omit v2 when I add the corresponding time-series to key-value rate limits.
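The question in this thread turns on how a time series is identified: by its metric name plus its full label set. The toy model below renders that identity as a string to show that dropping a `version` label starts a new series rather than rewriting the old one. It is an illustration only, not Prometheus code; `seriesKey`, the metric name `ratelimits_check_latency_seconds`, and the label values are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey renders a metric name and label set in the usual
// name{label="value",...} notation, with labels sorted by name.
func seriesKey(name string, labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for k := range labels {
		names = append(names, k)
	}
	sort.Strings(names)
	pairs := make([]string, 0, len(names))
	for _, k := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return name + "{" + strings.Join(pairs, ",") + "}"
}

func main() {
	// A shared metric name with a version label (the reviewer's suggestion).
	labeled := seriesKey("ratelimits_check_latency_seconds",
		map[string]string{"limit": "certificates_per_name", "version": "v1"})

	// The same metric after the version label is removed from the code.
	unlabeled := seriesKey("ratelimits_check_latency_seconds",
		map[string]string{"limit": "certificates_per_name"})

	// A separate metric name per implementation (the choice made in this PR).
	separate := seriesKey("ratelimitsv1_check_latency_seconds",
		map[string]string{"limit": "certificates_per_name"})

	fmt.Println(labeled)
	fmt.Println(unlabeled)
	fmt.Println(separate)
	fmt.Println("same series after dropping the label:", labeled == unlabeled)
}
```

In this model, the series carrying the `version` label simply stops receiving samples once the label is removed, and it ages out under the normal retention policy. The bucket concern is separate: in client_golang the buckets are set once in `HistogramOpts` for the whole vector, so v1 and v2 observations under one metric name would share a single bucket layout.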

ratelimit/rate-limits.go

ra/ra.go

aarongable · 2023-09-07T02:40:23Z

ra/ra.go

 			return err
 		}
+		ra.rlCheckLatency.WithLabelValues(ratelimit.CertificatesPerFQDNSetFast, ratelimits.Allowed).Observe(elapsed.Seconds())
 	}

 	fqdnLimits := ra.rlPolicies.CertificatesPerFQDNSet()
 	if fqdnLimits.Enabled() {


This isn't the change to do it in, but can't we get rid of the non-fast version of the CertsPerFQDNSet limit by now?

We're still setting a limit and an override for this in our integration and unit tests, so I'm not sure that statement is true. I could certainly dig into it more though.

ratelimit: Remove deprecated PendingOrdersPerAccount

8deb67f

beautifulentropy force-pushed the limits-v1-metric-overhaul branch 5 times, most recently from 705d1a5 to d2ef470, August 28, 2023 16:50

ratelimit: Overhaul the way we observe existing rate limits

e080c1a

beautifulentropy force-pushed the limits-v1-metric-overhaul branch from d2ef470 to e080c1a, August 28, 2023 16:53

beautifulentropy mentioned this pull request Aug 28, 2023

Key-value based rate limiting #5545

Closed

18 tasks

beautifulentropy marked this pull request as ready for review August 28, 2023 17:02

beautifulentropy requested a review from a team as a code owner August 28, 2023 17:02

beautifulentropy requested a review from pgporada August 28, 2023 17:02

pgporada reviewed Aug 29, 2023

beautifulentropy requested review from pgporada and a team August 30, 2023 16:37

pgporada reviewed Aug 30, 2023

pgporada requested a review from a team August 30, 2023 20:00

pgporada previously approved these changes Aug 30, 2023

aarongable reviewed Aug 31, 2023

Address comments and beef up the metrics test.

1f1813a

beautifulentropy dismissed pgporada’s stale review via 1f1813a September 6, 2023 18:54

beautifulentropy requested review from pgporada and aarongable September 6, 2023 18:54

aarongable approved these changes Sep 7, 2023

beautifulentropy mentioned this pull request Sep 8, 2023

ratelimit: Add an override usage gauge #7076

Merged

pgporada approved these changes Sep 11, 2023

beautifulentropy merged commit 636d30f into main Sep 11, 2023

beautifulentropy deleted the limits-v1-metric-overhaul branch September 11, 2023 19:06

beautifulentropy mentioned this pull request Sep 18, 2023

WFE: Add new key-value ratelimits implementation #7089

Merged

beautifulentropy mentioned this pull request May 27, 2024

Cleanup Rate Limit Metrics #6466

Closed

2 tasks
