debugging: slides/labs: re-split kernel debugging
Similarly to ebpf lab, the kernel lab is pretty big and covers a lot of
content. Re-split it as well into two parts: one part with all the
runtime checkers, and one part with kgdb/vmcore debugging. Once again,
it should not change the overall schedule of the training since the new
parts follow each other, so the agenda remains untouched.

Signed-off-by: Alexis Lothoré <[email protected]>
Reviewed-by: Luca Ceresoli <[email protected]>
Tropicao committed Jan 2, 2025
1 parent 6fca0d6 commit 44f706c
Showing 4 changed files with 113 additions and 93 deletions.
@@ -0,0 +1,96 @@
\subchapter
{Kernel debugging}
{Objectives:
\begin{itemize}
\item Debug locking and sleeping mistakes using the {\em PROVE\_LOCKING} and {\em
DEBUG\_ATOMIC\_SLEEP} options.
\item Find a module memory leak using {\em kmemleak}.
\end{itemize}
}

\section{Locking and sleeps problems}

\kconfig{CONFIG_PROVE_LOCKING} and \kconfig{CONFIG_DEBUG_ATOMIC_SLEEP} have been
enabled in the provided kernel image.
First, compile the module on your development host using the following commands:

\begin{bashinput}
$ cd /home/$USER/debugging-labs/nfsroot/root/locking
$ export CROSS_COMPILE=/home/$USER/debugging-labs/buildroot/output/host/bin/arm-linux-
$ export ARCH=arm
$ export KDIR=/home/$USER/debugging-labs/buildroot/output/build/linux-%\workingkernel%/
$ make
\end{bashinput}

On the target, load the \code{locking_test.ko} module and look at the output in dmesg:

\begin{bashinput}
# cd /root/locking
# insmod locking_test.ko
# dmesg
\end{bashinput}

Once you have analyzed the output, unload the module. Try to understand and fix
all the problems reported by the \code{lockdep} system.

\section{Kmemleak}

The provided kernel image includes kmemleak, but it is disabled by default to
avoid its significant overhead. To enable it, reboot the target, interrupt
U-Boot, and add \code{kmemleak=on} to the kernel command line by modifying the
\code{bootargs} variable:

\begin{bashinput}
STM32MP> env edit bootargs
STM32MP> <existing bootargs> kmemleak=on
STM32MP> boot
\end{bashinput}

Then compile the dummy test module on your development host:

\begin{bashinput}
$ cd /home/$USER/debugging-labs/nfsroot/root/kmemleak
$ export CROSS_COMPILE=/home/$USER/debugging-labs/buildroot/output/host/bin/arm-linux-
$ export ARCH=arm
$ export KDIR=/home/$USER/debugging-labs/buildroot/output/build/linux-%\workingkernel%/
$ make
\end{bashinput}

On the target, load the \code{kmemleak_test.ko} module, unload it, and trigger
an immediate kmemleak scan using:

\begin{bashinput}
# cd /root/kmemleak
# insmod kmemleak_test.ko
# rmmod kmemleak_test
# echo scan > /sys/kernel/debug/kmemleak
\end{bashinput}

Note that you might need to run the \code{scan} command several times
before the leak is detected, because memory may still contain references to
the leaked pointer. Soon after that, the kernel will report that some leaks
have been identified. Display and analyze them using:

\begin{bashinput}
# cat /sys/kernel/debug/kmemleak
\end{bashinput}

You will see that the symbol addresses do not make sense. This is due to the
\code{kptr_restrict} configuration, which must be changed to allow displaying
pointer addresses. To do so, use the following command on the target:

\begin{bashinput}
# sysctl kernel.kptr_restrict=1
\end{bashinput}

You can use \code{addr2line} to locate the source lines that caused the
reports. You may need to subtract the module load address from the reported
addresses: you can find it in \code{/proc/modules} while the module is loaded.
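
As a rough sketch of the subtraction described above, the offset can be
computed with shell arithmetic. Both addresses below are made-up placeholders,
not values taken from the lab: use the address shown in your own kmemleak
backtrace and the load address listed for the module in \code{/proc/modules}.

```shell
# Hypothetical values for illustration only; replace them with the ones
# observed on your target.
leak_addr=0xbf0000a4    # address taken from a kmemleak backtrace line
module_base=0xbf000000  # module load address read from /proc/modules

# Offset of the reported address relative to the module load address
offset=$(printf '0x%x' $((leak_addr - module_base)))
echo "$offset"

# On the development host, with CROSS_COMPILE set as in the build step,
# the offset could then be resolved with something like:
#   ${CROSS_COMPILE}addr2line -f -e kmemleak_test.ko "$offset"
```

This assumes the module was built with debug information. The load address
must be read while the module is still loaded, since it is no longer listed
after \code{rmmod}.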

You may also notice other reports: these are real memory leaks that exist in
the kernel version used for this training!

Once the lab is done, don't forget to remove \code{kmemleak=on} from your
kernel command line.

@@ -2,98 +2,12 @@
{Kernel debugging}
{Objectives:
\begin{itemize}
\item Debug locking and sleeping mistakes using the {\em PROVE\_LOCKING} and {\em
DEBUG\_ATOMIC\_SLEEP} options.
\item Find a module memory leak using {\em kmemleak}.
\item Analyzing an {\em oops}.
\item Debugging with {\em KGDB}.
\item Setting up {\em Kexec \& kdump}.
\item Analyzing an {\em oops}.
\item Debugging with {\em KGDB}.
\item Setting up {\em Kexec \& kdump}.
\end{itemize}
}

\section{Locking and sleeps problems}

\kconfig{CONFIG_PROVE_LOCKING} and \kconfig{CONFIG_DEBUG_ATOMIC_SLEEP} have been
enabled in the provided kernel image.
First, compile the module on your development host using the following commands:

\begin{bashinput}
$ cd /home/$USER/debugging-labs/nfsroot/root/locking
$ export CROSS_COMPILE=/home/$USER/debugging-labs/buildroot/output/host/bin/arm-linux-
$ export ARCH=arm
$ export KDIR=/home/$USER/debugging-labs/buildroot/output/build/linux-%\workingkernel%/
$ make
\end{bashinput}

On the target, load the \code{locking_test.ko} module and look at the output in dmesg:

\begin{bashinput}
# cd /root/locking
# insmod locking_test.ko
# dmesg
\end{bashinput}

Once you have analyzed the output, unload the module. Try to understand and fix
all the problems reported by the \code{lockdep} system.

\section{Kmemleak}

The provided kernel image includes kmemleak, but it is disabled by default to
avoid its significant overhead. To enable it, reboot the target, interrupt
U-Boot, and add \code{kmemleak=on} to the kernel command line by modifying the
\code{bootargs} variable:

\begin{bashinput}
STM32MP> env edit bootargs
STM32MP> <existing bootargs> kmemleak=on
STM32MP> boot
\end{bashinput}

Then compile the dummy test module on your development host:

\begin{bashinput}
$ cd /home/$USER/debugging-labs/nfsroot/root/kmemleak
$ export CROSS_COMPILE=/home/$USER/debugging-labs/buildroot/output/host/bin/arm-linux-
$ export ARCH=arm
$ export KDIR=/home/$USER/debugging-labs/buildroot/output/build/linux-%\workingkernel%/
$ make
\end{bashinput}

On the target, load the \code{kmemleak_test.ko} module, unload it, and trigger
an immediate kmemleak scan using:

\begin{bashinput}
# cd /root/kmemleak
# insmod kmemleak_test.ko
# rmmod kmemleak_test
# echo scan > /sys/kernel/debug/kmemleak
\end{bashinput}

Note that you might need to run the \code{scan} command several times
before the leak is detected, because memory may still contain references to
the leaked pointer. Soon after that, the kernel will report that some leaks
have been identified. Display and analyze them using:

\begin{bashinput}
# cat /sys/kernel/debug/kmemleak
\end{bashinput}

You will see that the symbol addresses do not make sense. This is due to the
\code{kptr_restrict} configuration, which must be changed to allow displaying
pointer addresses. To do so, use the following command on the target:

\begin{bashinput}
# sysctl kernel.kptr_restrict=1
\end{bashinput}

You can use \code{addr2line} to locate the source lines that caused the
reports. You may need to subtract the module load address from the reported
addresses: you can find it in \code{/proc/modules} while the module is loaded.

You will also notice other reports: these are real memory leaks that exist in
the kernel version used for this training!

\section{OOPS analysis}
We noticed that the watchdog command generated a crash on the kernel. In order
to reproduce the crash, run the following command on the target:
3 changes: 2 additions & 1 deletion mk/debugging.mk
@@ -29,4 +29,5 @@ DEBUGGING_LABS = \
debugging-system-wide-profiling \
debugging-ebpf-bcc \
debugging-ebpf-libbpf \
debugging-kernel-debugging
debugging-kernel-debugging-frameworks \
debugging-kernel-debugging-kgdb
15 changes: 12 additions & 3 deletions slides/debugging-kernel-debugging/debugging-kernel-debugging.tex
@@ -428,6 +428,17 @@ \subsection{Built-in kernel self tests}

\input{../common/prove-locking.tex}

\setuplabframe
{Kernel debugging}
{
Debugging kernel programming mistakes with integrated frameworks
\begin{itemize}
\item Debug locking issues using lockdep
\item Spot function calls in invalid context
\item Use kmemleak to detect memory leaks on the system
\end{itemize}
}

\subsection{KGDB}

\input{../common/kgdb.tex}
@@ -696,10 +707,8 @@ \subsection{Post-mortem analysis}
\setuplabframe
{Kernel debugging}
{
Debugging kernel crashes and driver problems
Debugging kernel crashes either at runtime or post-mortem
\begin{itemize}
\item Debug locking issues using lockdep
\item Use kmemleak to detect memory leaks on the system
\item Analyze an OOPS message
\item Debug a crash with KGDB
\item Set up kexec, kdump and extract a kernel coredump