Add prometheus metrics #431
I thought the plan was to use promauto? This feels like a lot of machinery to me
var activeRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "active_requests",
	Help: "The number of requests being handled by the service.",
})
In statsd this is a "runtime gauge" instead of a plain gauge. A runtime gauge comes with two tags, the hostname and the PID, and automatically gets a "runtime." prefix.
We then have some special handling in Telegraf for everything under the "runtime." prefix: we strip the two special tags and calculate the average/max/min/etc. across all the pods reporting this gauge. This is kind of like treating it as a histogram instead of a gauge.
We should probably have an internal spec discussion first on how we want to handle runtime gauges in Prometheus.
This doesn't really exist in the same way in Prometheus, because Prometheus always attaches the instance label of the monitored target to every metric. For Go, there is only ever one process, so effectively every metric already includes those tags.
Sounds like we might want to add the PID,
based on this: https://github.snooguts.net/reddit/baseplate.spec/pull/39/files#diff-ebe820b9ffcd96083b5e0695835dbada2bb1d2b3b158092ccbb947c95779129bR16
That part of the spec needs to be corrected for Prometheus.
Yeah, the PID part is only for einhorn-esque things
@@ -63,7 +63,8 @@ type Config struct {
 // with the global tracing hook registry.
 func InitFromConfig(ctx context.Context, cfg Config) io.Closer {
 	M = NewStatsd(ctx, cfg)
-	tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{Metrics: M})
+	pm := NewPrometheusMetrics(ctx, cfg)
+	tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{Metrics: M, PrometheusMetrics: pm})
As a style nit, I feel that the only appropriate case for writing a whole struct on one line is when you fill zero or one of its fields. Anything with more than one field should be written one field per line instead:
-tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{Metrics: M, PrometheusMetrics: pm})
+tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{
+	Metrics:           M,
+	PrometheusMetrics: pm,
+})
)

var activeRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "active_requests",
Is this specific to a protocol? Typically for this metric we would prefix it with the protocol.
This can be done with the Namespace field. Something like this:
prometheus.GaugeOpts{
Namespace: "http",
Name: "active_requests",
...
}
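For illustration, here is a minimal stdlib-only sketch of how a Namespace ends up in the exported metric name. client_golang joins the non-empty Namespace, Subsystem, and Name parts with underscores; the fqName helper below is a hypothetical re-implementation of that rule written for this example, not the library function itself, and "thrift" is only an assumed second transport.

```go
package main

import "fmt"

// fqName is a hypothetical helper that mirrors how client_golang combines
// Namespace, Subsystem, and Name: non-empty parts are joined with "_".
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	full := name
	if subsystem != "" {
		full = subsystem + "_" + full
	}
	if namespace != "" {
		full = namespace + "_" + full
	}
	return full
}

func main() {
	// "thrift" is an assumed transport name, used only for illustration.
	for _, transport := range []string{"http", "thrift"} {
		fmt.Println(fqName(transport, "", "active_requests"))
	}
}
```

With the GaugeOpts above, the gauge would be exported as http_active_requests rather than as a bare active_requests.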
I was thinking of adding a label with the transport type, as stated here, just until the baseplate.go remodel next year, when tracing and metrics are decoupled. For simplicity, I was going to leave this as is, but I can look into what's involved in moving it into the protocol-specific metrics that I'm adding shortly, if that's better.
if we are doing per-transport active_requests then the server span hook is no longer the correct place to do it, because server span hook runs on all servers and has no knowledge which transport it is in. this needs to be done on the server middleware instead.
sounds like the best thing to do is move this to the per transport server middleware. will make that change.
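A rough sketch of what tracking active requests in a per-transport server middleware could look like for HTTP, using only the standard library. Everything here is an assumption made for illustration: trackActiveRequests is a hypothetical middleware name, and atomicGauge is a stand-in so the example runs without client_golang. The gauge interface only asks for Inc and Dec, both of which prometheus.Gauge provides, so a real gauge should be usable in its place.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// gauge is the subset of prometheus.Gauge that this sketch needs.
type gauge interface {
	Inc()
	Dec()
}

// atomicGauge is a stand-in gauge that keeps the example stdlib-only.
type atomicGauge struct{ v int64 }

func (g *atomicGauge) Inc()         { atomic.AddInt64(&g.v, 1) }
func (g *atomicGauge) Dec()         { atomic.AddInt64(&g.v, -1) }
func (g *atomicGauge) Value() int64 { return atomic.LoadInt64(&g.v) }

// trackActiveRequests is a hypothetical HTTP server middleware. It raises the
// gauge when a request starts and lowers it when the handler returns.
func trackActiveRequests(active gauge, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		active.Inc()
		defer active.Dec() // runs even if the wrapped handler panics
		next.ServeHTTP(w, r)
	})
}

// observeDuringRequest sends one request through the middleware and reports
// the gauge value seen inside the handler and after the request finished.
func observeDuringRequest() (during, after int64) {
	active := &atomicGauge{}
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		during = active.Value()
	})
	handler := trackActiveRequests(active, inner)
	handler.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return during, active.Value()
}

func main() {
	during, after := observeDuringRequest()
	fmt.Println(during, after)
}
```

Because the middleware lives in the HTTP server code, it knows its transport, which is the piece of information the server span hook does not have.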
@@ -35,21 +38,23 @@ func (h CreateServerSpanHook) OnCreateServerSpan(span *tracing.Span) error {
 // ends, with success=True/False tags for the counter based on whether an error
 // was passed to `span.End` or not.
 type spanHook struct {
-	metrics *Statsd
+	metrics           *Statsd
+	prometheusMetrics *PrometheusMetrics
From what I've seen, the "best practice" for injecting Prometheus metrics (especially when there's just one, like "active_requests") is to pass in the prometheus.Counter -- that could also allow you to do the per-transport "active_requests" here, potentially using prometheus.CounterVec.With to pre-populate the label.
Leaving this out and just doing it in the per-transport section also sounds fine, if you prefer.
thanks for the details. sounds like i will close this PR and add active requests to the per transport PRs.
Closing this to add active_requests to the per-transport server middleware instead.
This PR is for issue #378 and adds the beginning scaffolding to export metrics for Prometheus. Ultimately we will fully migrate over to Prometheus instead of using the current Statsd metrics, but in the meantime we will need to support both.
The baseplate spec specifies that an active_requests metric must be emitted (ref: spec), so this PR adds that as the first custom metric.
The /metrics endpoint has been added to the golang baseplate code via the baseplate-cookiecutter repo in this PR.
The next piece of work, to be done in a different PR, is to add the scaffolding that allows applications to generate custom metrics for their own purposes.