fix(bpf): exclude swapper process bpf_cpu_time #1830
🤖 SeineSailor: Here's a concise summary of the pull request changes:
Summary: This pull request, "fix(bpf): exclude swapper process bpf_cpu_time", introduces a configuration option to exclude the swapper process's
Impact: These changes affect the internal implementation of the code, but do not alter the external interface or behavior.
Instead of disabling idle time accounting, I am thinking of using the idle time to calculate idle power as the next step.
@@ -89,6 +89,11 @@ func UpdateProcessBPFMetrics(bpfExporter bpf.Exporter, processStats map[uint64]*
	for _, ct := range processesData {
		comm := C.GoString((*C.char)(unsafe.Pointer(&ct.Comm)))

		if ct.Pid == 0 && config.ExcludeSwapperProcess() {
How about adding a new metric called idle CPU time?
Yes, this is a good idea and aligns with user / system / idle time in https://man7.org/linux/man-pages/man5/proc_stat.5.html
@vimalk78 with this patch, are you able to get sum(rate(kepler_process_bpf_cpu_time_ms_total[20s])) == sum(rate(node_cpu_seconds_total{mode=~"user|system"}[20s]))?
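The check above compares a counter named in milliseconds with one named in seconds, so a literal equality would need a unit conversion and, in practice, a tolerance. A minimal sketch of that comparison follows; `ratePerSecond`, `withinTolerance`, the 5% tolerance, and all sample values are hypothetical and only illustrate the arithmetic, not how Prometheus evaluates `rate()`.

```go
package main

import (
	"fmt"
	"math"
)

// ratePerSecond is a simplified stand-in for a counter rate:
// the increase of the counter divided by the window length in seconds.
func ratePerSecond(first, last, windowSeconds float64) float64 {
	return (last - first) / windowSeconds
}

// withinTolerance reports whether a and b differ by at most relTol,
// measured relative to the larger of the two magnitudes.
func withinTolerance(a, b, relTol float64) bool {
	largest := math.Max(math.Abs(a), math.Abs(b))
	if largest == 0 {
		return true
	}
	return math.Abs(a-b) <= relTol*largest
}

func main() {
	// Made-up counter samples taken 20 seconds apart.
	bpfMsRate := ratePerSecond(10000, 50000, 20) // milliseconds of CPU time per second
	nodeSecRate := ratePerSecond(100, 139, 20)   // seconds of CPU time per second

	// Convert milliseconds to seconds before comparing the two rates.
	bpfSecRate := bpfMsRate / 1000

	fmt.Println(withinTolerance(bpfSecRate, nodeSecRate, 0.05))
}
```

With the swapper samples included, the BPF-side sum would also contain idle time, which the `mode=~"user|system"` selector leaves out on the node side.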
841f681 to 0ff4209
Signed-off-by: Vimal Kumar <[email protected]>
0ff4209 to d505dc8
I am merging this now and will follow up with a new PR for idle time metrics.
@vimalk78 @sthaha @KaiyiLiu1234 @sunya-ch I tested this idle time exclusion in metal CI. The SGD trainer improves, but the exponential and logarithmic trainers do not improve much. Please find the new config parameter in kepler metal and VM systemd. In the trained model validation report, the entries from Nov 01 are without this change, and those from Nov 04 are with the new config param. The SGD trainer's MAPE improves from 51% to 21%.
In theory, we should see improvement in all models. @sunya-ch is there a way to plot or export all the metrics used for training the model?
When the CPU is idle, the swapper process is scheduled, and Kepler adds its bpf_cpu_time to kernel processes.
This PR adds a configuration option to exclude the swapper process.
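The effect of the option can be sketched in isolation. The example below assumes that excluded samples are simply skipped during aggregation; `procSample`, `sumCPUTime`, and the sample values are hypothetical stand-ins rather than Kepler's actual types or data.

```go
package main

import "fmt"

// procSample is a hypothetical stand-in for one per-process BPF sample.
// It is not Kepler's actual data structure.
type procSample struct {
	PID       uint32
	Comm      string
	CPUTimeMs uint64
}

// sumCPUTime adds up CPU time over all samples. When excludeSwapper is
// true, samples with PID 0 (the idle task, reported as "swapper") are skipped.
func sumCPUTime(samples []procSample, excludeSwapper bool) uint64 {
	var total uint64
	for _, s := range samples {
		if s.PID == 0 && excludeSwapper {
			continue // idle time is not attributed to any process
		}
		total += s.CPUTimeMs
	}
	return total
}

func main() {
	// Made-up samples for a mostly idle machine.
	samples := []procSample{
		{PID: 0, Comm: "swapper/0", CPUTimeMs: 900},
		{PID: 1, Comm: "systemd", CPUTimeMs: 40},
		{PID: 4242, Comm: "stress-ng", CPUTimeMs: 60},
	}

	fmt.Println("with swapper:   ", sumCPUTime(samples, false))
	fmt.Println("without swapper:", sumCPUTime(samples, true))
}
```

On a mostly idle machine the PID 0 samples can dominate the total, which is why attributing them to kernel processes distorts the picture.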
with-swapper

without-swapper
