High cardinality of Process Exporter metrics #289

Open
lamjob1993 opened this issue Feb 21, 2024 · 4 comments

Comments

@lamjob1993

lamjob1993 commented Feb 21, 2024

Hi!

Our stack: Grafana + Mimir + Prometheus + Process Exporter in K8S.

Please help me with the high cardinality of these metrics:

namedprocess_namegroup_memory_bytes
namedprocess_namegroup_num_procs

Our Prometheus can't handle the load because of these metrics. What advice can you give on reducing the cardinality of process-exporter metrics? I also noticed that these metrics keep their values on the process-exporter /metrics page: until you restart process-exporter, the page keeps being served with outdated metric values.
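Before changing anything, it can help to see which metric names and which `groupname` values account for most of the series. The following is a minimal sketch, not part of process-exporter: it assumes you have saved the text of the exporter's /metrics page and counts series per metric name. The `summarize` function and the sample group names (`postgres;user_a`, `postgres;user_b`) are hypothetical and only illustrate the idea.

```python
import re
from collections import Counter, defaultdict

# Matches a sample line such as: metric_name{label="v",...} 123
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+\S+')
# Extracts the value of the groupname label, allowing escaped quotes.
GROUP_RE = re.compile(r'groupname="((?:[^"\\]|\\.)*)"')


def summarize(metrics_text):
    """Return (series count per metric name,
    number of distinct groupname values per metric name)."""
    series = Counter()
    groups = defaultdict(set)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if not m:
            continue
        name = m.group(1)
        series[name] += 1
        g = GROUP_RE.search(m.group(2) or "")
        if g:
            groups[name].add(g.group(1))
    return series, {k: len(v) for k, v in groups.items()}


# Hypothetical sample; real output would come from the /metrics page.
SAMPLE = """\
# TYPE namedprocess_namegroup_num_procs gauge
namedprocess_namegroup_num_procs{groupname="postgres;user_a"} 3
namedprocess_namegroup_num_procs{groupname="postgres;user_b"} 0
# TYPE namedprocess_namegroup_memory_bytes gauge
namedprocess_namegroup_memory_bytes{groupname="postgres;user_a",memtype="resident"} 1024
namedprocess_namegroup_memory_bytes{groupname="postgres;user_a",memtype="virtual"} 4096
namedprocess_namegroup_memory_bytes{groupname="postgres;user_b",memtype="resident"} 0
namedprocess_namegroup_memory_bytes{groupname="postgres;user_b",memtype="virtual"} 0
"""

if __name__ == "__main__":
    series, groups = summarize(SAMPLE)
    for name, count in series.most_common():
        print(f"{name}: {count} series, {groups.get(name, 0)} groups")
```

If the number of distinct `groupname` values keeps growing between restarts, the group naming is the likely driver of the cardinality rather than the number of metrics.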

I am interested in solving the issue at the process-exporter level!

It is clear that the problem of high cardinality can be solved at the Prometheus level, but this will be the next stage of optimization.

I can't send metric logs because I work in a bank.

Addition
Process exporter also collects the IDs of Greenplum DB users. About 20 metrics appear for each user account. After a user has finished working, the 20 metrics for that user are not removed from the Process exporter page, so Process exporter ends up remembering every user it has ever seen. At the moment, the only remedy is restarting Process exporter.

Thanks!

@ncabatoff
Owner

ncabatoff commented Feb 21, 2024 via email

@lamjob1993
Author

We have already deployed 1000 instances of Process exporter to monitor 1000 instances of Greenplum DB, so we will first try to solve the problem at the Process exporter level. Prometheus will be the next stage. Thanks for the answer!
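To put rough numbers on the scale described in this thread, here is a back-of-the-envelope sketch. It takes the figure of about 20 series per user account and the 1000 exporter instances from the comments above; the number of users seen per instance (50) is purely hypothetical, and the estimate assumes every instance has seen the same number of users since its last restart.

```python
def estimated_series(users_seen, series_per_user=20, instances=1000):
    """Rough count of time series produced by per-user groups.

    users_seen: distinct users each exporter has seen since its last
    restart (hypothetical input; assumed equal on every instance).
    """
    return users_seen * series_per_user * instances


if __name__ == "__main__":
    # 50 users per instance is an illustrative value, not a measurement.
    print(estimated_series(50))  # 50 * 20 * 1000 = 1000000
```

Because the per-user series are only cleared by a restart, `users_seen` only grows over time, which is why the total keeps climbing even when few users are active.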

@StefanSander3

@lamjob1993 have you found a solution to this issue?

@lamjob1993
Author

> @lamjob1993 have you found a solution to this issue?

We found a solution within our own process. We told our customer, who uses Process Exporter, that he was using the exporter incorrectly.

He was using the exporter to track user activity by CPU time on the nodes, instead of using Process Exporter for its intended purpose. This led to high cardinality, and our Prometheus was drowning in data. So the solution turned out to be simpler and closer at hand than we thought.

Customers do not always make sound decisions.
