Export cold_path() in std::hint #510

Open
x17jiri opened this issue Dec 21, 2024 · 8 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@x17jiri

x17jiri commented Dec 21, 2024

Proposal

Problem statement

It is sometimes helpful to let the compiler know which code path is the fast path, so it can be optimized at the expense of the slow path. This proposal suggests that the cold_path() intrinsic is a simple and reliable way to provide this information and that it could be reexported in std::hint.

Grepping the LLVM source code for BlockFrequencyInfo and BranchProbabilityInfo shows that this information is used in many places in the optimizer, such as:

  • block placement - improve locality by making the fast path compact and move everything else out of the way
  • inlining, loop unrolling - these optimizations can be less aggressive on the cold path, thereby reducing code size
  • register allocation - preferably keep in registers the data needed on the fast path

Motivating examples or use cases

A cold_path() call can simply be placed on a code path to mark it as cold.

    if condition {
        // this is the fast path
    } else {
        cold_path();
        // this path is unlikely
    }

    match a {
        1 => a,
        2 => b,
        3 => { cold_path(); c }, // this branch is unlikely
        _ => { cold_path(); d }, // this is also unlikely
    }

    let buf = Global.allocate(layout).map_err(|_| {
        // the error is unlikely
        cold_path();
        Error::new_alloc_failed("Cannot allocate memory.")
    })?;
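
For readers who want to try these three patterns, here is a minimal sketch that compiles on stable Rust. It makes some assumptions that are not part of the proposal: cold_path() is a local stand-in (an empty #[cold] function) rather than the intrinsic, and checked_div, describe and parse_port are hypothetical functions invented for illustration. Whether the stand-in actually influences branch weights depends on the compiler version.

```rust
// Local stand-in so the sketch builds on stable Rust.
// Assumption: an empty #[cold] function; this is NOT the intrinsic itself.
#[cold]
fn cold_path() {}

/// Hypothetical example: if/else where the else branch is the cold one.
fn checked_div(a: u32, b: u32) -> Option<u32> {
    if b != 0 {
        // Fast path: the divisor is expected to be non-zero.
        Some(a / b)
    } else {
        cold_path();
        None
    }
}

/// Hypothetical example: a match with two arms marked as unlikely.
fn describe(code: u8) -> &'static str {
    match code {
        1 => "ok",
        2 => "retry",
        3 => { cold_path(); "fatal" }
        _ => { cold_path(); "unknown" }
    }
}

/// Hypothetical example: marking an error-mapping closure as cold.
fn parse_port(s: &str) -> Result<u16, String> {
    let port = s.parse::<u16>().map_err(|e| {
        // The error is unlikely.
        cold_path();
        format!("invalid port {s:?}: {e}")
    })?;
    Ok(port)
}
```

The hint changes nothing about the results: checked_div(10, 2) still returns Some(5) and checked_div(1, 0) still returns None. Only the layout and optimization of the generated code may differ.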

Solution sketch

This is already implemented in intrinsics. All we have to do is create a wrapper in std::hint:

    #[inline(always)]
    pub fn cold_path() {
        std::intrinsics::cold_path()
    }

Alternatives

likely/unlikely

These are harder to use in idiomatic Rust. For example, they don't work well with match arms. On the other hand, they are sometimes more readable and may also be worth reexporting in std::hint.

For example, this would be harder to express using cold_path():

    if likely(x) && unlikely(y) {
        true_branch
    }

And the first form below looks better than the second, which needs an extra branch just to hold the cold_path() call:

    if likely(cond) {
        true_branch
    }

    if cond {
        true_branch
    } else {
        cold_path()
    }
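
The two styles are closely related. As a rough sketch, likely and unlikely could be written as thin helpers on top of cold_path(). The code below is an illustration under assumptions, not the standard library's code: cold_path() is again a local stable stand-in (an empty #[cold] function), and should_flush is a hypothetical caller.

```rust
// Local stand-in so the sketch builds on stable Rust.
// Assumption: an empty #[cold] function; this is NOT the intrinsic itself.
#[cold]
fn cold_path() {}

/// Returns `b` unchanged, marking the `false` case as the cold one.
#[inline(always)]
fn likely(b: bool) -> bool {
    if b {
        true
    } else {
        cold_path();
        false
    }
}

/// Returns `b` unchanged, marking the `true` case as the cold one.
#[inline(always)]
fn unlikely(b: bool) -> bool {
    if b {
        cold_path();
        true
    } else {
        false
    }
}

/// Hypothetical caller combining both hints in one condition, which is
/// the case that is awkward to express with cold_path() alone.
fn should_flush(len: usize, cap: usize, forced: bool) -> bool {
    likely(cap > 0) && (unlikely(forced) || len >= cap)
}
```

Both helpers are identity functions on their argument, so they can wrap any boolean sub-expression without changing the result of the condition.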

extending the functionality of #[cold] attribute

This attribute could be allowed on match arms or on closures. I'm not sure it's worth adding extra syntax if the functionality can be implemented by a library call.

Links and related work

rust-lang/rust#120370 - added cold_path() as part of fixing likely and unlikely

rust-lang/rust#133852 - improvements of the cold_path() implementation

rust-lang/rust#120193 - proposal for #[cold] on match arms

@saethlin
Member

I have a minor compiler concern for this proposal: The proposed library implementation of

    #[inline(always)]
    pub fn cold_path() {
        std::intrinsics::cold_path()
    }

given the current compiler implementation, relies on hint::cold_path being inlined in MIR in order to work.

While technically this is a hint so it's fine to do nothing, and branch hints are unlikely to be useful in compilations that disable the MIR inliner, it would be a bummer if this got stabilized, then later we figured out a better way to do branch hints and had to deprecate this one.

Perhaps the reliance on the MIR inliner will be alleviated by rust-lang/rust#134082 but at the time of writing that PR isn't even merged. It might be by the time this ACP gets discussed, who knows.

@programmerjake
Member

If we run into compiler issues, the cold_path wrapper could always end up becoming:

    #[cold]
    pub fn cold_path() {}

This should work since it is entirely stable code now, even if less optimal.

@hanna-kruppe

Hashbrown, when built without the nightly feature, had such a formulation for a while but removed it because it didn’t work. I guess it has the opposite problem: rather than relying on MIR inlining, it’s too easily optimized out (by the inliner or other passes) before it can influence branch weights.

@x17jiri
Author

x17jiri commented Dec 23, 2024

@hanna-kruppe Before rust-lang/rust#120370 was merged, such a formulation would not work.

rustc doesn't inline a cold function regardless of how small it is. LLVM, however, does inline it, and doesn't set the branch weights.

Now, there is code that detects calls to cold functions and sets the weights.

@hanna-kruppe

I have not reviewed that PR but I wonder: if the formulation with a #[cold] function works now, why did that PR add a new intrinsic?

@x17jiri
Author

x17jiri commented Dec 23, 2024

@hanna-kruppe An intrinsic gives us more control. Without it, the functionality depends on two assumptions:

  1. The rustc inline pass doesn't inline a cold function, so the call can still be detected later during codegen.
  2. The LLVM inline pass does inline it, so the binary doesn't contain an actual call to cold_path().

At the moment both assumptions hold.

With an intrinsic, however, we can make sure that future updates will not break either of them:

  1. The rustc inline pass cannot inline an intrinsic because it doesn't know how it will be implemented in the backend.
  2. The LLVM inline pass doesn't need to inline it because we can remove it during codegen.

@hanna-kruppe

Thanks for explaining. Sounds like a slightly generalized version of @saethlin's concern, that hint::cold_path only works reliably by relying on implementation details of the optimization/codegen, also applies to the intrinsic-less approach?

@x17jiri
Author

x17jiri commented Dec 23, 2024

The intrinsic-less version depends on whether a tiny function marked as cold gets inlined. I would call this an implementation detail.

The export of the intrinsic in std::hint depends on whether a function marked #[inline(always)] gets inlined. I would NOT call this an implementation detail.

But yes, it's a valid concern and hopefully rust-lang/rust#134082 will fix it.
