Kubernetes Scaphandre Deployment reporting 0 W #353
Comments
All nodes return 0.5 for this metric. I installed the Helm chart from the dev branch, using the dev tag as well. I also noticed that when I run the quick Docker version (as in https://hubblo-org.github.io/scaphandre-documentation/tutorials/installation-linux) on one of the malfunctioning nodes, it reports 0 W there too. I suspect the container is not allowed to access the proper files, even though I have disabled all firewalls and run chmod 777 on both /sys/class/powercap and /proc (for testing purposes). I'm wondering why only one node is able to get the measurements correctly.
Now I've tried to run the dev image locally and there is a warning
It is indeed a permission issue. When I came back to the office and typed "kubectl logs 'scaphandre pod'", this time I got a warning message stating:
I think it had something to do with the containerd container runtime. In the project I'm taking part in, we decided to switch from containerd to CRI-O, and the problem was solved afterwards. All nodes report sensible values now.
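A quick way to test the file-access suspicion above is to try reading the RAPL energy counters directly, once on the host and once inside the container, and compare the results. The following is a minimal sketch, assuming the standard Linux powercap sysfs layout (/sys/class/powercap/<zone>/energy_uj); the function name check_powercap is hypothetical and this is not how Scaphandre itself reads the counters.

```python
from pathlib import Path


def check_powercap(root="/sys/class/powercap"):
    """Try to read every energy_uj counter under root.

    Returns a list of (path, status) tuples. The directory layout is an
    assumption based on the usual Linux powercap sysfs tree.
    """
    base = Path(root)
    if not base.is_dir():
        return [(str(base), "missing")]

    results = []
    for zone in sorted(base.iterdir()):
        counter = zone / "energy_uj"
        try:
            if not counter.is_file():
                continue  # e.g. the control-type directory has no counter
            value = int(counter.read_text().strip())
            results.append((str(counter), f"ok ({value} uJ)"))
        except PermissionError:
            results.append((str(counter), "permission denied"))
        except (OSError, ValueError) as exc:
            results.append((str(counter), f"error: {exc}"))
    return results


if __name__ == "__main__":
    for path, status in check_powercap():
        print(f"{path}: {status}")
```

If the host prints values while the same script inside the container prints "missing" or "permission denied", the problem is likely in how the path is mounted or in the permissions on the counter files rather than in the exporter.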
Hi, it seems related to #391, which was merged into dev a few days ago. If anyone wants to give it a try with a containerd runtime, that would be interesting.
This would be related to an intel-rapl module issue, not Scaphandre itself.
This is probably related to #391.
Bug description
First of all, I would like to thank the whole Scaphandre team for a tool like this. It has been extremely helpful so far! The bug is that some nodes in my local k8s cluster report 0 W of consumption. To illustrate the issue, there is a screenshot of the official Scaphandre Grafana dashboard in the Screenshots section.
Each color represents a node and, as you can see, 3 of them report 0 W. The most intriguing thing is that if I run Scaphandre locally, I'm able to get actual values. There is also a screenshot of the logs from a local execution of Scaphandre on one of the nodes that report 0 W in the k8s version.
As you can see, Scaphandre is able to obtain those metrics locally, but the pods in the k8s cluster cannot. Running "kubectl logs 'scaphandre pod'" has been of no help, since it just returns:
"Scaphandre prometheus exporter
Sending ⚡ metrics
Press CTRL-C to stop scaphandre"
And describing the pods does not return anything worth mentioning either.
It is relevant to note that the firewall is disabled on all cluster machines.
Could you give any insights on solving this, please?
To Reproduce
Expected behavior
The Grafana dashboard should report the same values obtained from the local execution rather than 0 W.
Screenshots
Environment
Additional context
One interesting aspect is that all of the malfunctioning machines were formatted quite recently, so I'm guessing there might be a misconfiguration somewhere.