Use the trifecta div algorithm for 128-bit div on wasm #685

Merged: 1 commit, Sep 5, 2024
20 changes: 15 additions & 5 deletions src/int/specialized_div_rem/mod.rs
@@ -136,9 +136,15 @@ fn u64_by_u64_div_rem(duo: u64, div: u64) -> (u64, u64) {

 // Whether `trifecta` or `delegate` is faster for 128 bit division depends on the speed at which a
 // microarchitecture can multiply and divide. We decide to be optimistic and assume `trifecta` is
-// faster if the target pointer width is at least 64.
+// faster if the target pointer width is at least 64. Note that this
+// implementation is additionally included on WebAssembly despite the typical
+// pointer width there being 32 because it's typically run on a 64-bit machine
+// that has access to faster 64-bit operations.
 #[cfg(all(
-    not(any(target_pointer_width = "16", target_pointer_width = "32")),
+    any(
+        target_family = "wasm",
+        not(any(target_pointer_width = "16", target_pointer_width = "32")),
+    ),
     not(all(not(feature = "no-asm"), target_arch = "x86_64")),
     not(any(target_arch = "sparc", target_arch = "sparc64"))
 ))]
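
To make the selection rule in the `#[cfg]` predicate above concrete, here is a minimal sketch that models it as an ordinary function. The `Target` struct, its fields, and `selects_trifecta` are hypothetical names used only for illustration and are not part of the crate; the real decision is made at compile time by `cfg`, not at run time.

```rust
/// Hypothetical stand-in for the `cfg` inputs that the predicate reads.
#[derive(Clone, Copy, Debug)]
pub struct Target {
    /// Models `target_family = "wasm"`.
    pub is_wasm: bool,
    /// Models `target_pointer_width` (16, 32, or 64).
    pub pointer_width: u32,
    /// Models `all(not(feature = "no-asm"), target_arch = "x86_64")`.
    pub x86_64_asm: bool,
    /// Models `any(target_arch = "sparc", target_arch = "sparc64")`.
    pub is_sparc: bool,
}

/// Mirrors the three clauses of the `all(...)` predicate after this change.
pub fn selects_trifecta(t: Target) -> bool {
    // Clause 1: wasm, or a pointer width that is neither 16 nor 32.
    let fast_wide_ops_assumed =
        t.is_wasm || !(t.pointer_width == 16 || t.pointer_width == 32);
    // Clauses 2 and 3: x86_64 with asm enabled and SPARC are excluded.
    fast_wide_ops_assumed && !t.x86_64_asm && !t.is_sparc
}
```

Under this model a wasm32 target (`is_wasm: true, pointer_width: 32`) now selects `trifecta`, while a non-wasm 32-bit target with the same pointer width still does not.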
@@ -152,10 +158,14 @@ impl_trifecta!(
     u128
 );

-// If the pointer width less than 64, then the target architecture almost certainly does not have
-// the fast 64 to 128 bit widening multiplication needed for `trifecta` to be faster.
+// If the pointer width less than 64 and this isn't wasm, then the target
+// architecture almost certainly does not have the fast 64 to 128 bit widening
+// multiplication needed for `trifecta` to be faster.
 #[cfg(all(
-    any(target_pointer_width = "16", target_pointer_width = "32"),
+    not(any(
+        target_family = "wasm",
+        not(any(target_pointer_width = "16", target_pointer_width = "32")),
+    )),
     not(all(not(feature = "no-asm"), target_arch = "x86_64")),
     not(any(target_arch = "sparc", target_arch = "sparc64"))
 ))]
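
The comments above turn on whether a 64 to 128 bit widening multiplication is cheap on the target. As a rough illustration (this is not the crate's code, and both function names are hypothetical), the sketch below computes the same product once with native wide arithmetic and once from 32-bit halves, which approximates the extra work a target without fast 64-bit operations has to do.

```rust
/// Widening multiply using native wide arithmetic.
/// Returns the 128-bit product as (low 64 bits, high 64 bits).
pub fn widening_mul(a: u64, b: u64) -> (u64, u64) {
    let wide = (a as u128) * (b as u128);
    (wide as u64, (wide >> 64) as u64)
}

/// The same product assembled from four 32x32 -> 64 bit partial products,
/// roughly what a machine limited to 32-bit multiplies must compute.
pub fn widening_mul_from_halves(a: u64, b: u64) -> (u64, u64) {
    const MASK: u64 = 0xFFFF_FFFF;
    let (a_lo, a_hi) = (a & MASK, a >> 32);
    let (b_lo, b_hi) = (b & MASK, b >> 32);

    // Each factor is below 2^32, so none of these products overflows u64.
    let ll = a_lo * b_lo;
    let lh = a_lo * b_hi;
    let hl = a_hi * b_lo;
    let hh = a_hi * b_hi;

    // Sum the contributions to bits 32..64; the carry lands in `mid >> 32`.
    let mid = (ll >> 32) + (lh & MASK) + (hl & MASK);
    // The left shift intentionally drops the carry bits of `mid`.
    let lo = (ll & MASK) | (mid << 32);
    let hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    (lo, hi)
}
```

The second version needs four multiplies plus carry handling for every one multiply in the first, which is why the choice between `trifecta` and `delegate` depends on how fast the underlying machine performs wide multiplication.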