32x performance regression for AVX2 intrinsics in Rust v1.87 #142603

Closed
@Shnatsel

Description

Reproduction steps

git clone https://github.com/mcountryman/simd-adler32
cd simd-adler32
cargo +1.86.0 bench
cargo +1.87.0 bench

On an AVX2-capable CPU with Rust 1.86, the AVX2 code path performs well, at or above the level of the SSE3 implementation.

With Rust 1.87, performance collapses completely: the AVX2 implementation becomes the slowest of all, behind even the scalar implementation.

SSE2, SSE3 and AVX-512 are not affected.

The simd-adler32 crate uses runtime feature detection and explicit AVX2 intrinsics from std::arch.
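
For readers unfamiliar with that pattern, here is a minimal sketch of runtime feature detection combined with explicit AVX2 intrinsics. It is not the simd-adler32 code: the kernel is a plain byte sum rather than Adler-32, and the function names `sum_bytes`, `sum_bytes_avx2` and `sum_bytes_scalar` are hypothetical, chosen only for illustration.

```rust
// Hypothetical byte-sum kernel illustrating the dispatch pattern.
// This is an assumption-laden sketch, not the simd-adler32 implementation.

/// Portable fallback: adds every byte into a u64 total.
pub fn sum_bytes_scalar(data: &[u8]) -> u64 {
    data.iter().map(|&b| u64::from(b)).sum()
}

/// AVX2 path. The caller must ensure the CPU supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_bytes_avx2(data: &[u8]) -> u64 {
    use std::arch::x86_64::*;

    let zero = _mm256_setzero_si256();
    let mut acc = _mm256_setzero_si256(); // four u64 lanes
    let mut chunks = data.chunks_exact(32);
    for chunk in &mut chunks {
        // Unaligned 32-byte load from the slice.
        let v = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
        // SAD against zero sums each group of 8 bytes into one u64 lane.
        acc = _mm256_add_epi64(acc, _mm256_sad_epu8(v, zero));
    }

    // Spill the four lanes and add them up, then handle the tail bytes.
    let mut lanes = [0u64; 4];
    _mm256_storeu_si256(lanes.as_mut_ptr() as *mut __m256i, acc);
    let tail: u64 = chunks.remainder().iter().map(|&b| u64::from(b)).sum();
    lanes.iter().sum::<u64>() + tail
}

/// Picks the AVX2 path at runtime when available, else the scalar one.
pub fn sum_bytes(data: &[u8]) -> u64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at runtime.
            return unsafe { sum_bytes_avx2(data) };
        }
    }
    sum_bytes_scalar(data)
}
```

Calling `sum_bytes(&buffer)` returns the same value on every CPU; only the code path differs. Comparing the two paths on the same input is a quick way to check that a slowdown is confined to the intrinsics path.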

Version it worked on

1.86

Version with regression

1.87

rustc --version --verbose:

rustc 1.87.0 (17067e9ac 2025-05-09)
binary: rustc
commit-hash: 17067e9ac6d7ecb70e50f92c1944e545188d2359
commit-date: 2025-05-09
host: x86_64-unknown-linux-gnu
release: 1.87.0
LLVM version: 20.1.1

Metadata

Assignees

No one assigned

    Labels

    A-SIMD: Area: SIMD (Single Instruction Multiple Data)
    C-bug: Category: This is a bug.
    T-libs: Relevant to the library team, which will review and decide on the PR/issue.
    regression-from-stable-to-stable: Performance or correctness regression from one stable version to another.
