-
Notifications
You must be signed in to change notification settings - Fork 305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kpatch/kpatch: Optimize patch transition timing with a while loop #1408
Conversation
issues: #1409 |
@joe-lawrence @jpoimboe PTAL. Thanks. |
Hi @oceansue , a few questions and comments about the PR:
|
Optimize the timing of patch transitions by using a `while` loop to check the transition status, replacing the `for` loop with a 1-second sleep. This approach reduces the likelihood of timeouts due to system load, potentially speeding up patch transitions. Signed-off-by: oceansue <[email protected]> Signed-off-by: Kernel Group <[email protected]>
9453dba
to
9dde564
Compare
Hi @joe-lawrence ,Thank you for your response:
I‘ve been considering whether there might be a better solution for this issue (#1409) on my system. But so far, This patch seems to be working well. However, It does result in a short-term increase in CPU load. |
Hi @oceansue :
Given the long transition time reported in #1409, I might have expected a change like: diff --git a/kpatch/kpatch b/kpatch/kpatch
index edfccfead917..0ca6aaf78fd1 100755
--- a/kpatch/kpatch
+++ b/kpatch/kpatch
@@ -26,9 +26,9 @@
INSTALLDIR=/var/lib/kpatch
SCRIPTDIR="$(readlink -f "$(dirname "$(type -p "$0")")")"
VERSION="0.9.9"
-POST_ENABLE_WAIT=15 # seconds
-POST_SIGNAL_WAIT=60 # seconds
-MODULE_REF_WAIT=15 # seconds
+POST_ENABLE_WAIT="${POST_ENABLE_WAIT:-15}" # seconds
+POST_SIGNAL_WAIT="${POST_SIGNAL_WAIT:-60}" # seconds
+MODULE_REF_WAIT="${MODULE_REF_WAIT:-15}" # seconds
# How many times to try loading the patch if activeness safety check fails.
MAX_LOAD_ATTEMPTS=5 which would let the user set |
I've been trying to fix this recently, but unfortunately, I haven't made any progress. It seems that this commit is the only one that can solve the problem I'm encountering. This is the patch I'm considering. diff --git a/kpatch/kpatch b/kpatch/kpatch
index edfccfe..815dacb 100755
--- a/kpatch/kpatch
+++ b/kpatch/kpatch
@@ -245,29 +245,32 @@ signal_stalled_processes() {
wait_for_patch_transition() {
local module="$1"
- local i
+ local i=0
in_transition "$module" || return 0
echo "waiting (up to $POST_ENABLE_WAIT seconds) for patch transition to complete..."
- for (( i=0; i<POST_ENABLE_WAIT; i++ )); do
+ start_time=$SECONDS
+ while [ $i -lt $POST_ENABLE_WAIT ]; do
if ! in_transition "$module" ; then
echo "transition complete ($i seconds)"
+ sleep 1s
return 0
fi
- sleep 1s
+ i=$(( SECONDS - start_time ))
done
echo "patch transition has stalled!"
signal_stalled_processes
echo "waiting (up to $POST_SIGNAL_WAIT seconds) for patch transition to complete..."
- for (( i=0; i<POST_SIGNAL_WAIT; i++ )); do
+ start_time=$SECONDS
+ while [ $i -lt $POST_SIGNAL_WAIT ]; do
if ! in_transition "$module" ; then
echo "transition complete ($i seconds)"
return 0
fi
- sleep 1s
+ i=$(( SECONDS - start_time ))
done
return 1 |
@oceansue : with the minimal patch in place, what is reported by "transition complete ($i seconds)" ? Also, can you attach/inline an example kpatch that requires the modification? Thanks. |
Test environment: use #1408 (comment) patch: sudo ./kpatch load ../kpatch-build/livepatch-cmdline.ko
loading patch module: ../kpatch-build/livepatch-cmdline.ko
waiting (up to 15 seconds) for patch transition to complete...
patch transition has stalled!
signaling stalled process(es):
waiting (up to 60 seconds) for patch transition to complete...
transition complete (11 seconds)
ocean@ocean-PC:~/kpatch/kpatch$ sudo ./kpatch unload ../kpatch-build/livepatch-cmdline.ko
disabling patch module: livepatch_cmdline
waiting (up to 15 seconds) for patch transition to complete...
patch transition has stalled!
signaling stalled process(es):
waiting (up to 60 seconds) for patch transition to complete...
transition complete (8 seconds)
unloading patch module: livepatch_cmdline revert #1408 (comment) patch: ocean@ocean-PC:~/kpatch/kpatch$ sudo ./kpatch load ../kpatch-build/livepatch-cmdline.ko
module already loaded, re-enabling
waiting (up to 15 seconds) for patch transition to complete...
patch transition has stalled!
signaling stalled process(es):
waiting (up to 60 seconds) for patch transition to complete...
Stalled processes:
module livepatch_cmdline did not complete its transition, disabling...
waiting (up to 15 seconds) for patch transition to complete...
transition complete (2 seconds)
kpatch: error: failed to re-enable module livepatch_cmdline (transition stalled), patch disabled
For other CPUs, if the transition can be completed in 15 seconds, the patch will have no effect. If it cannot be completed in more than 15 seconds, the program will enter a 60 second transition period during which the program will frequently access the sys interface to speed up the transition. Also, I found that the bug seems to be caused by multiple cores, and when I limit the number of cores to less than 8, the transformation can be done successfully. When the number of cores is more than 8, the transformation doesn't go so well and requires this patch to fix it.I was unable to find a multi-core (>32) processor with x86 architecture to verify. |
@jpoimboe - can you have a look at this report/patch, I think I'm just missing something on how/why it works. |
Hm? Look closer, the patch actually removes the 1s sleep even for the first 15 seconds. Hammering the CPU with sysfs reads is not going to be an acceptable solution. Is this with the standard cmdline-string.patch? If so, I suspect something is going wrong on the kernel side, possibly livepatch itself. Maybe the unwinder is encountering errors. Kernel 4.19 is ancient, who knows what it could be... |
@joe-lawrence @jpoimboe Thanks for the replies, the problem has been solved, according to the new commit from kernel.org, after applying it to the 4.19.0 kernel. Close the issue. Thanks again. |
Optimize the timing of patch transitions by using a
while
loop to check the transition status, replacing thefor
loop with a 1-second sleep. This approach reduces the likelihood of timeouts due to system load, potentially speeding up patch transitions.