Build with frame pointers for improved profiling #10224

Closed
erikgrinaker opened this issue Dec 22, 2024 · 3 comments
Labels
a/observability Area: related to observability c/storage Component: storage

Comments

@erikgrinaker
Contributor

erikgrinaker commented Dec 22, 2024

Release binaries are currently built without frame pointers. This frees up a register for the compiler and avoids a couple of instructions per function call, which can improve performance (typically <1%). However, stack unwinding and profiling then have to use DWARF information to generate backtraces, which is far more expensive and can cause difficulties, e.g. for perf and eBPF profilers.
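To illustrate why unwinding with frame pointers is cheap, here is a toy sketch in Rust. It models the stack as a slice of words in which each frame record is a pair `[saved caller frame pointer, return address]`, and walks that chain. The function name `walk_frame_pointers`, the index-as-address model, and the use of `0` as the end-of-chain marker are assumptions made for this example; this is not how perf, eBPF profilers, or jemalloc are implemented, only the general idea of following a linked list of frames instead of interpreting DWARF tables.

```rust
/// Toy model of frame-pointer unwinding over a simulated stack.
///
/// `stack` holds machine words and indices stand in for addresses.
/// A frame record at index `fp` is two words:
///   stack[fp]     = saved frame pointer of the caller (0 ends the chain)
///   stack[fp + 1] = return address into the caller
///
/// Returns the return addresses from the innermost frame outwards.
fn walk_frame_pointers(stack: &[u64], mut fp: usize, max_depth: usize) -> Vec<u64> {
    let mut return_addrs = Vec::new();
    while fp != 0 && return_addrs.len() < max_depth {
        // Stop instead of reading outside the simulated stack.
        let (Some(&saved_fp), Some(&ret)) = (stack.get(fp), stack.get(fp + 1)) else {
            break;
        };
        return_addrs.push(ret);
        // Caller frames live at higher addresses. Anything else means the
        // chain has ended (saved_fp == 0) or is corrupt, so stop walking.
        if saved_fp as usize <= fp {
            break;
        }
        fp = saved_fp as usize;
    }
    return_addrs
}
```

For example, with `let stack = [0, 3, 0x400a10, 5, 0x400b20, 0, 0x400c30];`, calling `walk_frame_pointers(&stack, 1, 16)` visits the records at indices 1, 3, and 5 and collects their three return addresses. Each step is two loads and a comparison, which is why this kind of walk is so much cheaper than DWARF-based unwinding.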

We're considering continuous profiling, and jemalloc heap profiling already probabilistically takes stack traces during allocations. These stack traces will be much cheaper with frame pointers enabled. This might save more CPU than we lose by dedicating a register to the frame pointer, and would let us profile at a higher frequency.

The Rust stdlib recently enabled frame pointers by default for this reason. It's also possible that frame pointers are already enabled by default on aarch64 CPUs (used for Pageservers), since this architecture uses a dedicated frame pointer register.

Related reading:

@erikgrinaker erikgrinaker added a/performance Area: relates to performance of the system c/storage Component: storage labels Dec 22, 2024
@erikgrinaker erikgrinaker self-assigned this Dec 22, 2024
@erikgrinaker erikgrinaker added a/observability Area: related to observability and removed a/performance Area: relates to performance of the system labels Dec 22, 2024
@erikgrinaker
Contributor Author

> It's also possible that frame pointers are already enabled by default on aarch64 CPUs (used for Pageservers)

This isn't the case. I looked at the assembly function prologue, which doesn't maintain frame pointers. This is confirmed by the rustc target specs, which default to FramePointer::MayOmit on Linux and don't override this for the aarch64_unknown_linux_gnu target:

https://github.com/rust-lang/rust/blob/fd19773d2f8a070dc03f0072f9bc41a65fd04fed/compiler/rustc_target/src/spec/base/linux.rs

https://github.com/rust-lang/rust/blob/fd19773d2f8a070dc03f0072f9bc41a65fd04fed/compiler/rustc_target/src/spec/targets/aarch64_unknown_linux_gnu.rs

Apple aarch64 does enable frame pointers, since this is required by Apple debug tooling:

https://github.com/rust-lang/rust/blob/fd19773d2f8a070dc03f0072f9bc41a65fd04fed/compiler/rustc_target/src/spec/base/apple/mod.rs#L122
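The prologue check described above can be roughly automated. The following is a heuristic sketch only: it assumes textual aarch64 disassembly (for example from objdump) and looks for an instruction near the start of a function that sets `x29` from `sp`, which is how a frame record is typically established on this architecture. The function name `sets_frame_pointer` and the eight-instruction window are assumptions made for this example, and compilers may emit prologues this check does not recognize.

```rust
/// Heuristic check on aarch64 disassembly text: does the start of a function
/// establish a frame pointer by setting x29 from sp?
///
/// `prologue` holds the first disassembled lines of one function. Only the
/// first 8 lines are inspected (an arbitrary window chosen for this sketch).
fn sets_frame_pointer(prologue: &[&str]) -> bool {
    prologue.iter().take(8).any(|line| {
        // Collapse tabs and repeated spaces so that objdump-style lines such
        // as "4005d8:\t910003fd \tmov\tx29, sp" can be matched as text.
        let normalized = line
            .to_ascii_lowercase()
            .split_whitespace()
            .collect::<Vec<_>>()
            .join(" ");
        // Typical forms: `mov x29, sp` or `add x29, sp, #imm`.
        normalized.contains("mov x29, sp") || normalized.contains("add x29, sp,")
    })
}
```

A function that only saves the link register, e.g. `str x30, [sp, #16]`, without ever setting `x29` from `sp` would be reported as not maintaining a frame pointer, which matches the observation above for the Linux aarch64 build.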

@erikgrinaker
Contributor Author

tikv-jemallocator builds jemalloc with frame pointers:

[tikv-jemalloc-sys 0.6.0+5.3.0-1-ge13ca993e8ccb9ba9847cc330696e02839f328f7] CFLAGS="-O0 -ffunction-sections -fdata-sections -fPIC -gdwarf-4 -fno-omit-frame-pointer -m64 -Wall"

@erikgrinaker
Contributor Author

The frame-pointer feature in pprof-rs is 10x slower than libunwind without frame pointers. See #10226 (comment).
