linux/concepts: Add trampolines concept page #110

Merged
merged 1 commit into from
Jan 2, 2025
3 changes: 3 additions & 0 deletions .aspell.en.pws
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,8 @@ Fexit
fexit
Fmodify
fmodify
Freplace
freplace
datagram
datagrams
epoll
Expand Down Expand Up @@ -182,6 +184,7 @@ spinlock
spinlocks
tracepoint
tracepoints
observability
bytecode
crypto
cryptographic
Expand Down
1 change: 1 addition & 0 deletions docs/linux/concepts/SUMMARY.md
Expand Up @@ -12,3 +12,4 @@
* [KFuncs](kfuncs.md)
* [Dynptrs](dynptrs.md)
* [Token](token.md)
* [Trampolines](trampolines.md)
8 changes: 8 additions & 0 deletions docs/linux/concepts/index.md
Expand Up @@ -113,4 +113,12 @@ This is an index of Linux specific eBPF concepts and features. For more generic

[:octicons-arrow-right-24: Token](./token.md)

- __Trampolines__

---

Trampolines are used to attach eBPF programs to kernel functions

[:octicons-arrow-right-24: Trampolines](./trampolines.md)

</div>
71 changes: 71 additions & 0 deletions docs/linux/concepts/trampolines.md
@@ -0,0 +1,71 @@
# Trampolines

The term trampoline has a number of meanings in computing. In the context of Linux, it refers to locations in memory that contain the addresses of logic to jump to. Trampolines are also referred to as "indirect jump vectors". The mechanism has a number of use cases, such as interrupt service routines or I/O routines. In these classic use cases, the hardware hard-codes the memory locations to which execution jumps when certain events, such as an interrupt, occur. A trampoline typically jumps immediately to some other function where the actual handler lives, hence the term trampoline.
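
The classic "jump vector" idea can be illustrated in plain user-space C with a table of function pointers. This is only a sketch: the event numbers, handler names, and the `dispatch` function are hypothetical and stand in for what hardware or low-level firmware would do.

```c
#include <assert.h>

/* Hypothetical event numbers, chosen only for this illustration. */
enum event { EVENT_TIMER = 0, EVENT_KEYBOARD = 1, EVENT_COUNT };

static int timer_ticks;
static int keys_pressed;

/* The actual handlers; they can live anywhere in memory. */
static void handle_timer(void)    { timer_ticks++; }
static void handle_keyboard(void) { keys_pressed++; }

/* The vector table: fixed slots that only hold the address to jump to. */
static void (*vector_table[EVENT_COUNT])(void) = {
    [EVENT_TIMER]    = handle_timer,
    [EVENT_KEYBOARD] = handle_keyboard,
};

/* Stand-in for the hardware: look up the slot and bounce to the handler. */
void dispatch(enum event ev)
{
    assert((unsigned)ev < EVENT_COUNT);
    vector_table[ev]();
}

/* Accessors so callers can observe which handlers ran. */
int get_timer_ticks(void)  { return timer_ticks; }
int get_keys_pressed(void) { return keys_pressed; }
```

Calling `dispatch(EVENT_TIMER)` never runs any logic in the table itself; execution only "bounces" through the slot to `handle_timer`.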

## ftrace

[:octicons-tag-24: v2.6.27](https://github.com/torvalds/linux/commit/16444a8a40d4c7b4f6de34af0cae1f76a4f6c901)

Ftrace (function trace) is a mechanism in the kernel for observing function execution, traditionally for debugging purposes. Ftrace also uses trampolines, but in a slightly different way. It is available when the kernel is compiled with the `CONFIG_FUNCTION_TRACER` configuration option. When enabled, most files in the kernel are compiled with the `-pg` and `-mnop-mcount=...` flags. The `-pg` flag enables profiling, which adds a few additional CPU instructions to the start of most global functions. The `-mnop-mcount=...` flag turns these CPU instructions into [`NOP` (No Operation)](https://en.wikipedia.org/wiki/NOP_(code)) instructions, as opposed to the normal profiling logic. So by default, when a function is called, the CPU encounters a few `NOP` instructions, which it ignores, and then continues with the actual function. However, at runtime the kernel can replace these `NOP` instructions with a trampoline that, in the case of ftrace, jumps to some tracing logic.

Note that certain directories, files, and functions in the kernel are excluded. For example, the trace subsystem (`kernel/trace`) is excluded, as are any functions marked with the `notrace` attribute. This protects the user from infinite recursion and from breaking assumptions in critical code sections of the kernel.
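
The effect of a patchable function entry can be approximated in user-space C. The sketch below is an analogy only: the real kernel rewrites machine instructions in place and has no branch at the call site, whereas here a function pointer that starts out as `NULL` plays the role of the `NOP`s. All names (`patch_site_add`, `enable_tracing`, and so on) are hypothetical.

```c
#include <stddef.h>

typedef void (*tracer_fn)(const char *func_name);

/* "Patch site" of the traced function; NULL plays the role of the NOPs. */
static tracer_fn patch_site_add = NULL;

static int trace_hits;

/* Stand-in for the tracing logic a trampoline would lead to. */
static void count_call(const char *func_name)
{
    (void)func_name;
    trace_hits++;
}

int add(int a, int b)
{
    /* Function entry: does nothing until the site is "patched". */
    if (patch_site_add)
        patch_site_add("add");
    return a + b;
}

/* Runtime patching: install or remove the tracer without recompiling add(). */
void enable_tracing(void)  { patch_site_add = count_call; }
void disable_tracing(void) { patch_site_add = NULL; }
int  get_trace_hits(void)  { return trace_hits; }
```

The point of the design is that `add` pays almost nothing while tracing is off, yet tracing can be switched on and off at runtime.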

## BPF trampolines

[:octicons-tag-24: v5.5](https://github.com/torvalds/linux/commit/fec56f5890d93fc2ed74166c397dc186b1c25951)

BPF programs can be attached to any function in the kernel that has enough `NOP` instructions. When this is done, a "BPF trampoline" is created. This is an architecture-specific function that boils down to:

* Allocate room on the stack for all function arguments + return value
* Copy all arguments to the stack, from the registers specified by the calling convention
* For each fentry program attached
    * Call the fentry program with a pointer to the stack as context (if any is attached)
    * Disable CPU migration (preemption on older kernels) before calling the program, re-enable after
    * If stats tracking is enabled, start timer before execution and add run time to stats after execution
* For each fmodify_return program attached
    * Call a fmodify_return program with a pointer to the stack as context (if any is attached)
    * Disable CPU migration (preemption on older kernels) before calling the program, re-enable after
    * If stats tracking is enabled, start timer before execution and add run time to stats after execution
    * If the return value is non zero, return it instead of continuing
* Call the original function (unless a fmodify_return program returned a non `0` value)
* Copy return value from register onto the stack
* For each fexit program attached
    * Call a fexit program with a pointer to the stack as context (if any is attached)
    * Disable CPU migration (preemption on older kernels) before calling the program, re-enable after
    * If stats tracking is enabled, start timer before execution and add run time to stats after execution
* Return the return value from the stack

The `NOP`s are replaced with a call to the generated trampoline and a return instruction.
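
The control flow of the steps listed above can be sketched as ordinary C. This is a user-space illustration, not the kernel's generated code: every type and function name below is hypothetical, the target is assumed to take two `long` arguments, and the migration and stats steps are left out. The sketch also assumes that fexit programs still run when a fmodify_return program short-circuits the original function.

```c
#include <stddef.h>

#define MAX_PROGS 4

/* Context handed to every program: saved arguments plus the return value. */
struct tramp_ctx { long args[2]; long ret; };

typedef long (*prog_fn)(struct tramp_ctx *ctx);
typedef long (*target_fn)(long a, long b);

struct trampoline {
    target_fn orig;
    prog_fn fentry[MAX_PROGS];   size_t nr_fentry;
    prog_fn fmod_ret[MAX_PROGS]; size_t nr_fmod_ret;
    prog_fn fexit[MAX_PROGS];    size_t nr_fexit;
};

long trampoline_call(const struct trampoline *t, long a, long b)
{
    /* "Copy all arguments to the stack". */
    struct tramp_ctx ctx = { .args = { a, b }, .ret = 0 };
    int skip_orig = 0;

    for (size_t i = 0; i < t->nr_fentry; i++)
        t->fentry[i](&ctx);

    for (size_t i = 0; i < t->nr_fmod_ret; i++) {
        long r = t->fmod_ret[i](&ctx);
        if (r != 0) {            /* non-zero: use it as the return value */
            ctx.ret = r;
            skip_orig = 1;
            break;
        }
    }

    if (!skip_orig)
        ctx.ret = t->orig(ctx.args[0], ctx.args[1]);

    for (size_t i = 0; i < t->nr_fexit; i++)
        t->fexit[i](&ctx);       /* fexit programs can see ctx.ret */

    return ctx.ret;
}

/* Hypothetical target and programs, used only to exercise the sketch. */
int orig_calls, entry_calls;
long last_exit_ret;

long orig_add(long a, long b) { orig_calls++; return a + b; }
long count_entry(struct tramp_ctx *ctx) { (void)ctx; entry_calls++; return 0; }
long reject_negative(struct tramp_ctx *ctx) { return ctx->args[0] < 0 ? -22 : 0; }
long record_exit(struct tramp_ctx *ctx) { last_exit_ret = ctx->ret; return 0; }

struct trampoline demo_trampoline(void)
{
    struct trampoline t = {
        .orig = orig_add,
        .fentry = { count_entry },       .nr_fentry = 1,
        .fmod_ret = { reject_negative }, .nr_fmod_ret = 1,
        .fexit = { record_exit },        .nr_fexit = 1,
    };
    return t;
}
```

With the demo trampoline, a call with a negative first argument never reaches `orig_add`; the value chosen by `reject_negative` is returned to the caller instead.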

### Architecture support

Since BPF trampolines are architecture specific, support for a given architecture may be added later than the initial support for BPF trampolines. Here is a table of when support was added for each architecture:

| Architecture | Support added |
|--------------|---------------|
| X86-64 | [:octicons-tag-24: v5.5](https://github.com/torvalds/linux/commit/fec56f5890d93fc2ed74166c397dc186b1c25951) |
| ARM64 | [:octicons-tag-24: v6.0](https://github.com/torvalds/linux/commit/efc9909fdce00a827a37609628223cd45bf95d0b) |
| RISC-V | [:octicons-tag-24: v6.3](https://github.com/torvalds/linux/commit/49b5e77ae3e214acff4728595b4ac7bf776693ca) |
| S390 | [:octicons-tag-24: v6.3](https://github.com/torvalds/linux/commit/528eb2cb87bc1353235a6384696b4849bde8b0ba) |

Other architectures currently lack support for BPF trampolines.

### fentry/fexit/fmodify_return

[:octicons-tag-24: v5.5](https://github.com/torvalds/linux/commit/f1b9509c2fb0ef4db8d22dac9aef8e856a5d81f6)

The [`BPF_PROG_TYPE_TRACING`](../program-type/BPF_PROG_TYPE_TRACING.md) fentry/fexit/fmodify_return programs make use of these trampolines to attach and execute just before a function is entered or right after it exits. Since the trampoline is essentially just a function call, the overhead is minimal. This makes them a much faster alternative to a [kprobe](../program-type/BPF_PROG_TYPE_KPROBE.md), which uses an interrupt and thus a context switch, which is far more expensive.

Native kernel functions are not the only ones that can have these blank spots for trampolines. When BPF programs are JIT-ed, the kernel also gives them these spots, which can be instrumented in the same manner. This allows fentry/fexit/fmodify_return programs to attach to other BPF programs for the purposes of observability.

### Program replacement

[:octicons-tag-24: v5.6](https://github.com/torvalds/linux/commit/be8704ff07d2374bcc5c675526f95e70c6459683)

Trampolines are also used to implement [freplace programs](../program-type/BPF_PROG_TYPE_EXT.md), where one program replaces another. When a freplace program is attached, it installs a trampoline before the original program, jumps to the extension program and executes it instead, then returns without calling the original.
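
As a rough user-space analogy, replacement behaves like swapping the target of a call slot for another function with the same signature. The names below (`policy_slot`, `dispatcher`, `drop_large`) are hypothetical, and the function pointer type stands in for the type matching that the verifier performs on real freplace programs.

```c
/* Signature shared by the placeholder and any replacement. */
typedef int (*policy_fn)(int packet_len);

/* Placeholder global function in the "original program": always pass (1). */
int default_policy(int packet_len)
{
    (void)packet_len;
    return 1;
}

/* The slot the dispatcher calls through; initially the placeholder. */
static policy_fn policy_slot = default_policy;

/* The "original program": it only knows about the slot. */
int dispatcher(int packet_len)
{
    return policy_slot(packet_len);
}

/* An "extension": same signature, different behavior (0 means drop). */
int drop_large(int packet_len)
{
    return packet_len > 1500 ? 0 : 1;
}

/* Attaching replaces the placeholder; the original body is no longer run. */
void attach_replacement(policy_fn ext) { policy_slot = ext; }
void detach_replacement(void)          { policy_slot = default_policy; }
```

After `attach_replacement(drop_large)`, every call to `dispatcher` runs `drop_large` and `default_policy` is not executed until the replacement is detached.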

### LSM

[:octicons-tag-24: v5.7](https://github.com/torvalds/linux/commit/9e4e01dfd3254c7f04f24b7c6b29596bc12332f3)

[LSM](../program-type/BPF_PROG_TYPE_LSM.md) programs also attach via trampolines, in a way very similar to fexit/fmodify_return programs. The kernel defines placeholder functions for every hook, always starting with the prefix `bpf_lsm_`. These placeholders simply return the default return value. When attached, the LSM program acts as an fexit or fmodify_return program for the purposes of the BPF trampoline on these placeholder hooks.
12 changes: 5 additions & 7 deletions docs/linux/program-type/BPF_PROG_TYPE_EXT.md
Expand Up @@ -12,18 +12,16 @@ Extension programs can be used to dynamically extend another BPF program.

## Usage

These programs can be used to replace global functions in already loaded BPF programs. Global functions are verified individually by the verifier based on their types only.
Hence the global function in the new program which types match older function can
safely replace that corresponding function.
These programs can be used to replace global functions in already loaded BPF programs. Global functions are verified individually by the verifier based on their types only. Hence, a global function in the new program whose types match those of the older function can safely replace that function.

Programs of this type are typically placed in an ELF section prefixed with `freplace/`

The main use case for extensions is to provide generic mechanism to plug external programs into policy program or function call chaining. The [libxdp](https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp) project uses this functionality to implement XDP program chaining from a dispatcher program.

This new function/program is called 'an extension' of old program. At load time
the verifier uses (`attach_prog_fd`, `attach_btf_id`) pair to identify the function
to be replaced. The BPF program type is derived from the target program into
extension program.
This new function/program is called 'an extension' of the old program. At load time, the verifier uses the (`attach_prog_fd`, `attach_btf_id`) pair to identify the function to be replaced. The BPF program type of the extension program is derived from the target program.

!!! note
Replacing the original program uses a [trampoline](../concepts/trampolines.md), the same mechanism used by fentry/fexit programs. So attaching such probes to a program and extending it are mutually exclusive.

!!! note
The verifier allows only one level of replacement. Meaning that the extension program cannot recursively extend an extension.
Expand Down
4 changes: 2 additions & 2 deletions docs/linux/program-type/BPF_PROG_TYPE_TRACING.md
Expand Up @@ -8,7 +8,7 @@ description: "This page documents the 'BPF_PROG_TYPE_TRACING' eBPF program type,
[:octicons-tag-24: v5.5](https://github.com/torvalds/linux/commit/f1b9509c2fb0ef4db8d22dac9aef8e856a5d81f6)
<!-- [/FEATURE_TAG] -->

Tracing programs are a newer alternative to kprobes and tracepoints. Tracing programs utilize BPF trampolines, a new mechanism which provides practically zero overhead. In addition, tracing programs can be attached to BPF programs to provide troubleshooting and debugging capabilities, something that is not possible with kprobes.
Tracing programs are a newer alternative to kprobes and tracepoints. Tracing programs utilize [BPF trampolines](../concepts/trampolines.md), a new mechanism which provides practically zero overhead. In addition, tracing programs can be attached to BPF programs to provide troubleshooting and debugging capabilities, something that is not possible with kprobes.

## Usage

Expand All @@ -34,7 +34,7 @@ Fentry programs are similar in function to a kprobe attached to a functions firs

Kprobes do not have to be attached at the entry point of a function, kprobes can be installed at any point in the function, whereas fentry programs are always attached at the entry point of a function.

Fentry programs are attached to a BPF trampoline which causes less overhead than kprobes. Fentry programs can also be attached to BPF programs such as XDP, TC or cGroup programs which makes debugging eBPF programs easier. Kprobes lack this capability.
Fentry programs are attached using a [BPF trampoline](../concepts/trampolines.md) which causes less overhead than kprobes. Fentry programs can also be attached to BPF programs such as XDP, TC or cGroup programs which makes debugging eBPF programs easier. Kprobes lack this capability.

Fentry programs are typically located in an ELF section prefixed with `fentry/`.

Expand Down