Provide a way to prevent unbounded metric growth #197
Comments
In your example above of rolling out new binaries, can't you add a hook that, on removing a binary, calls |
Related discussion: prometheus/client_golang#920 |
We could know a few of the labels, but not all of them. Consider a case of a network proxy acting on behalf of a few clients. We may have something like
etc. In the real world there are more than 3 labels as well. If we had something like `family.retain_if(|(labelset)| { labelset.source == "app-v1" })` or similar |
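To illustrate the `retain_if` idea, here is a minimal sketch built on a plain `HashMap` rather than on the real `Family` type. Everything in it is hypothetical: `SketchFamily`, the `Labels` struct and its `destination` field are made-up names for illustration only, and `retain_if` is the proposed method, not an existing API.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stand-in for a metric family: maps a label set to a counter.
/// This is NOT the prometheus-client `Family` type, only an illustration.
pub struct SketchFamily<S> {
    metrics: HashMap<S, u64>,
}

impl<S: Hash + Eq + Clone> SketchFamily<S> {
    pub fn new() -> Self {
        Self { metrics: HashMap::new() }
    }

    /// Increment the counter for `labels`, creating it on first use.
    pub fn inc(&mut self, labels: &S) {
        *self.metrics.entry(labels.clone()).or_insert(0) += 1;
    }

    /// Read the current counter value for `labels`, if that series exists.
    pub fn get(&self, labels: &S) -> Option<u64> {
        self.metrics.get(labels).copied()
    }

    /// Proposed method: keep only the label sets for which `keep` returns
    /// true and drop every other series.
    pub fn retain_if<F: FnMut(&S) -> bool>(&mut self, mut keep: F) {
        self.metrics.retain(|labels, _| keep(labels));
    }

    /// Number of label sets (time series) currently stored.
    pub fn len(&self) -> usize {
        self.metrics.len()
    }
}

/// Hypothetical label set for a proxy; `destination` is an assumed field.
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct Labels {
    pub source: String,
    pub destination: String,
}
```

With something like this, the caller only needs to know the labels it wants to filter on (here `source`), not the full label set of every series.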
Thank you for prototyping this idea in #196. I am still not sure I fully understand either the use-case itself, or why that is not supported by
Given that you must have at some point called |
I could but then I need to effectively maintain my own mirror of Family to keep track; we might as well just not use Family at all at that point and fork
To make it more concrete, I am building a Kubernetes node proxy. For all intents and purposes you could call it like kube-proxy. Our metrics have |
We are interested in a mechanism to control unbounded growth of metrics. While we generally follow best practices around limiting cardinality, for extremely long-lived processes this is still problematic. For instance, it's common to record the binary version of something in a metric, but with 100s of rollouts over days or months, the number of time series can explode if the process collecting the metrics is never restarted.
We would like some way to control this in our application.
Currently, there is a `.clear()` and a `.remove()`. These are good building blocks, but I am not sure they are sufficient on their own. `remove()` is challenging on its own because we don't have any way to understand the entire set of labels stored in the metric at any point. In theory you could use `EncodeMetric::encode` and parse the results, but that is quite hacky. `clear()` is also challenging, because it is all or nothing.
Ideally, I think we would have some interface like:
(remove any metrics not modified for an hour)
This would require a method on the family, but also maybe some changes on the metric type as well to make this easier to encode.
In #196 I have put up a small draft of what this could look like, but I am very open to alternatives.
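As a rough illustration of the "remove any metrics not modified for an hour" idea, here is a minimal, self-contained sketch. It is not the draft from #196 and does not use the real `Family` type: `ExpiringFamily`, `inc_at` and `remove_idle` are hypothetical names, and the assumption is that each series records the time of its last modification. The current time is passed in explicitly so the behaviour is easy to test.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// One series: its counter value plus the time it was last modified.
struct Entry {
    value: u64,
    last_modified: Instant,
}

/// Hypothetical family that tracks per-series modification times.
/// This is NOT the prometheus-client `Family` type, only an illustration.
pub struct ExpiringFamily<S> {
    entries: HashMap<S, Entry>,
}

impl<S: Hash + Eq + Clone> ExpiringFamily<S> {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Increment the counter for `labels` and stamp it with `now`.
    pub fn inc_at(&mut self, labels: &S, now: Instant) {
        let entry = self
            .entries
            .entry(labels.clone())
            .or_insert(Entry { value: 0, last_modified: now });
        entry.value += 1;
        entry.last_modified = now;
    }

    /// Read the current counter value for `labels`, if that series exists.
    pub fn get(&self, labels: &S) -> Option<u64> {
        self.entries.get(labels).map(|e| e.value)
    }

    /// Drop every series idle for longer than `max_idle` as of `now`.
    /// Returns how many series were removed.
    pub fn remove_idle(&mut self, now: Instant, max_idle: Duration) -> usize {
        let before = self.entries.len();
        self.entries
            .retain(|_, e| now.saturating_duration_since(e.last_modified) <= max_idle);
        before - self.entries.len()
    }

    /// Number of series currently stored.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

An application could call something like `remove_idle(Instant::now(), Duration::from_secs(3600))` periodically, for example before each scrape. Storing a timestamp per series is the kind of change on the metric type mentioned above.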