Why does linux profiling add NOP instruction before a function? #113

Open
jetlime opened this issue Jan 19, 2025 · 1 comment

Comments

jetlime commented Jan 19, 2025

I find the explanation regarding the eBPF trampolines (https://docs.ebpf.io/linux/concepts/trampolines/) a bit confusing because it is not stated why the NOP instructions are there in the first place. Why does profiling require additional NOP instructions before the beginning of all kernel functions?
To be able to use fentry/fexit programs, is it required to compile the kernel using the CONFIG_FUNCTION_TRACER flag?

Since kprobe does not modify the kernel source code at runtime, but solely when the eBPF program is loaded into the kernel, then why is it considered to result in more overhead than fentry/fexit?

Collaborator

dylandreimerink commented Jan 30, 2025

Thank you for the feedback, I will try to make the page clearer.

it is not stated why the NOP instructions are there in the first place. Why does profiling require additional NOP instructions before the beginning of all kernel functions?

The reason to have the NOP instructions is so you can replace them. When they are added, your assembly might look something like:

# Symbol where instructions like: CALL some_kernel_func will jump to.
some_kernel_func:
  NOP
  NOP
# actual start of the function
  PUSH RBP
  MOV RSP, RBP
  ....
  RET

The NOPs are ignored by the CPU, so if we never touch them, it is as if they were never there. But when you attach a fentry/fexit program, the following happens:

# Symbol where instructions like: CALL some_kernel_func will jump to.
some_kernel_func:
  CALL some_kernel_func_trampoline
  RET
# actual start of the function
  PUSH RBP
  MOV RSP, RBP
  [...]
  RET

some_kernel_func_trampoline:
  [...]
  CALL jited_fentry_program
  [...]
  CALL some_kernel_func + 5
  RET

We replaced the NOPs, so the function now calls into some_kernel_func_trampoline first. The trampoline code is generated at runtime to do everything described in https://docs.ebpf.io/linux/concepts/trampolines/#bpf-trampolines. It first calls any fentry programs, then the original function, then any fexit programs. After that we return to the caller.

To be able to use fentry/fexit programs, is it required to compile the kernel using the CONFIG_FUNCTION_TRACER flag?

Yes, correct. This flag controls whether the NOPs are added. Without the NOPs, program types that depend on trampolines will not work.

Since kprobe does not modify the kernel source code at runtime, but solely when the eBPF program is loaded into the kernel, then why is it considered to result in more overhead than fentry/fexit?

A kprobe actually does modify the CPU instructions at runtime as well, just differently. Let's take our example:

some_kernel_func:
  PUSH RBP
  MOV RSP, RBP
  ....
  RET

A kprobe replaces an actual instruction that is part of the function, not a NOP, like so:

some_kernel_func:
  INT3
  MOV RSP, RBP
  ....
  RET

When this INT3 instruction (on x86) is executed, it triggers a CPU interrupt (a breakpoint trap), so the CPU jumps to the corresponding handler routine, which copies all of the CPU state into memory (pt_regs). It then calls your kprobe program and afterwards copies most of the state back into the CPU. The kprobe system remembers which instruction it replaced and executes that instruction now. It then jumps back to the next instruction in the function we originally came from.

The copying of all of the CPU state back and forth is why kprobes are more expensive. The trampolines only add the overhead of a function call and its glue logic, which is far less.
