High cardinality of Process Exporter metrics #289
Comments
Hi,
I don't have much to suggest, I'm afraid. process-exporter fills a void in the Prometheus exporter ecosystem, but it's not going to be suitable for all use cases, due to its high cardinality requirements.
You could define fewer named groups.
You could filter out some metrics you can live without.
You could split the load between multiple Prometheus instances.
I can't think of any obvious other solutions.
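Before deciding which metrics to filter out, it can help to see which metric names contribute the most series. The following is a minimal sketch, not part of process-exporter: it counts sample lines per metric name in Prometheus text exposition output. The function name `series_per_metric` and the `SAMPLE` text are made up for illustration (the group names and values are invented); in practice you would feed it the text fetched from the exporter's metrics page.

```python
from collections import Counter


def series_per_metric(exposition_text: str) -> Counter:
    """Count sample lines (one per series) for each metric name."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels present)
        # or at the first space (no labels), whichever comes first.
        brace = line.find("{")
        space = line.find(" ")
        if brace != -1 and (space == -1 or brace < space):
            end = brace
        else:
            end = space
        name = line[:end] if end != -1 else line
        counts[name] += 1
    return counts


# Invented sample in text exposition format, for illustration only.
SAMPLE = """\
# TYPE namedprocess_namegroup_num_procs gauge
namedprocess_namegroup_num_procs{groupname="alice"} 3
namedprocess_namegroup_num_procs{groupname="bob"} 1
# TYPE namedprocess_namegroup_memory_bytes gauge
namedprocess_namegroup_memory_bytes{groupname="alice",memtype="resident"} 1024
namedprocess_namegroup_memory_bytes{groupname="alice",memtype="virtual"} 4096
namedprocess_namegroup_memory_bytes{groupname="bob",memtype="resident"} 512
namedprocess_namegroup_memory_bytes{groupname="bob",memtype="virtual"} 2048
"""

if __name__ == "__main__":
    # Print metric names ordered by how many series each one exposes.
    for name, n in series_per_metric(SAMPLE).most_common():
        print(f"{n:6d}  {name}")
```

Metric names at the top of that list are the first candidates for dropping or for coarser grouping.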
We have already deployed 1000 instances of Process Exporter to monitor 1000 instances of Greenplum DB, so we will first try to solve the problem at the Process Exporter level. Prometheus will be the next stage. Thanks for the answer!
@lamjob1993 have you found a solution to this issue?
We found a solution within our own process. We told our customer, who uses Process Exporter, that he was using the exporter incorrectly: he used it to monitor user activity by CPU time on the nodes, rather than using Process Exporter for its intended purpose. This led to high cardinality, and our Prometheus was drowning in data. The solution turned out to be simpler and closer at hand than we thought. Customers do not always make sound decisions.
Hi!
Our stack: Grafana + Mimir + Prometheus + Process Exporter in K8S.
Help me answer the question about the high cardinality of metrics:
namedprocess_namegroup_memory_bytes
namedprocess_namegroup_num_procs
namedprocess_namegroup_memory_bytes
Our Prometheus can't handle the load because of these metrics. What advice can you give on reducing the cardinality of process-exporter metrics? I also noticed that these metrics keep their values in process-exporter/metrics: until process-exporter is restarted, the process-exporter/metrics page keeps showing outdated metric values.
I am interested in solving the issue at the process-exporter level!
It is clear that the problem of high cardinality can be solved at the Prometheus level, but this will be the next stage of optimization.
I can't send metric logs because I work in a bank.
Addition
Process Exporter also collects the IDs of Greenplum DB users. About 20 metrics appear for each user account. After a user has finished working, the 20 metrics for that user are not deleted from the Process Exporter page; as a result, Process Exporter remembers all users. At the moment, the only remedy is restarting Process Exporter.
Thanks!
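To give a rough sense of scale, here is a back-of-the-envelope sketch of how many series this pattern can produce. The 1000 exporter instances and the roughly 20 metrics per user account come from this thread; the user counts in the loop are hypothetical, as is the helper name `estimated_series`. Real numbers will differ, since series for finished users accumulate until a restart.

```python
def estimated_series(instances: int, groups_per_instance: int,
                     series_per_group: int) -> int:
    """Rough upper bound: every instance exposes every group's series."""
    return instances * groups_per_instance * series_per_group


INSTANCES = 1000        # from the thread: 1000 exporter instances
SERIES_PER_GROUP = 20   # from the thread: about 20 metrics per user account

if __name__ == "__main__":
    # Hypothetical numbers of distinct users seen per instance since restart.
    for users in (10, 100, 1000):
        total = estimated_series(INSTANCES, users, SERIES_PER_GROUP)
        print(f"{users:5d} users per instance -> about {total:,} series")
```

Under these assumptions, even 100 remembered users per instance already means about two million series, which is why grouping by user is so expensive here.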