Description
Reproduction steps
git clone https://github.com/mcountryman/simd-adler32
cd simd-adler32
cargo +1.86.0 bench
cargo +1.87.0 bench
On AVX2-capable CPUs, the AVX2 code path performs well on Rust 1.86, at or above the SSE3 level.
On Rust 1.87, performance collapses: the AVX2 implementation becomes the slowest, behind even the scalar implementation.
SSE2, SSE3 and AVX-512 are not affected.
The simd-adler32 crate uses runtime feature detection and explicit AVX2 intrinsics from std::arch.
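For readers unfamiliar with that pattern, here is a minimal sketch of runtime feature detection dispatching to a function built from explicit AVX2 intrinsics. It is not the simd-adler32 code: the function names (sum_bytes, sum_bytes_avx2, sum_bytes_scalar) are hypothetical, and a plain byte sum stands in for the real Adler-32 kernel. Only the is_x86_feature_detected! macro, the target_feature attribute, and the std::arch::x86_64 intrinsics are real.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar fallback: sum of all bytes (hypothetical stand-in for a checksum kernel).
fn sum_bytes_scalar(data: &[u8]) -> u64 {
    data.iter().map(|&b| b as u64).sum()
}

/// AVX2 path. Only safe to call once AVX2 support has been confirmed at runtime.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_bytes_avx2(data: &[u8]) -> u64 {
    let zero = _mm256_setzero_si256();
    let mut acc = _mm256_setzero_si256();
    let mut chunks = data.chunks_exact(32);
    for chunk in &mut chunks {
        // Unaligned 32-byte load.
        let v = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
        // Sum of absolute differences against zero gives four 64-bit partial sums.
        acc = _mm256_add_epi64(acc, _mm256_sad_epu8(v, zero));
    }
    // Spill the four 64-bit lanes and add them, then handle the tail in scalar code.
    let mut lanes = [0u64; 4];
    _mm256_storeu_si256(lanes.as_mut_ptr() as *mut __m256i, acc);
    lanes.iter().sum::<u64>() + sum_bytes_scalar(chunks.remainder())
}

/// Public entry point: picks the AVX2 path at runtime when the CPU supports it.
pub fn sum_bytes(data: &[u8]) -> u64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Sound because the feature check above succeeded.
            return unsafe { sum_bytes_avx2(data) };
        }
    }
    sum_bytes_scalar(data)
}
```

Because the AVX2 function is compiled with a different target feature set than its caller, codegen for this kind of function depends on how the compiler handles inlining and vector types across that boundary, which is where a regression like the one reported here could plausibly surface.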
Version it worked on
1.86
Version with regression
1.87
rustc --version --verbose:
rustc 1.87.0 (17067e9ac 2025-05-09)
binary: rustc
commit-hash: 17067e9ac6d7ecb70e50f92c1944e545188d2359
commit-date: 2025-05-09
host: x86_64-unknown-linux-gnu
release: 1.87.0
LLVM version: 20.1.1