Run-time feature detection #2

gnzlbg · 2017-11-22T15:51:08Z

Currently the vector size and the SIMD instructions to be used are fixed at compile-time. This is a good first step.

It allows me to write an algorithm once and, by changing a compiler flag, generate different versions of it for different target architectures (with different vector sizes, SIMD instructions, etc.).

I would really like to be able to do this at compile-time within the same binary as well, so that I can write:

// a generic algorithm
fn my_generic_algorithm<T: SimdVec>(x: T, y: T) {
   // generic simd operations
}

and monomorphize it for different instruction sets:

let x: VecAvx;
my_generic_algorithm(x, x); // AVX version
let y: VecSSE42;
my_generic_algorithm(y, y); // SSE42 version

That way, input.simd_iter().map(my_generic_algorithm) could monomorphize my_generic_algorithm for SSE, SSE42, AVX, and AVX2, detect at run-time the best instruction set the CPU supports, and dispatch to that version.
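A minimal sketch of that idea, under loud assumptions: `SimdVec`, `Lanes`, `add_slices`, and `add_dispatch` are hypothetical names, and the "vectors" here are plain arrays standing in for real SIMD registers, so this only illustrates the shape (one generic kernel, several monomorphized copies, one run-time check), not real vector code. The only real API used is the standard library's `is_x86_feature_detected!` macro.

```rust
/// Hypothetical stand-in for the `SimdVec` trait named above.
trait SimdVec: Copy {
    const LANES: usize;
    fn load(chunk: &[f32]) -> Self;
    fn vadd(self, other: Self) -> Self;
    fn store(self, out: &mut [f32]);
}

/// Plain-array stand-in for a SIMD register with `N` lanes (assumption:
/// a real library would wrap an architecture-specific vector type here).
#[derive(Clone, Copy)]
struct Lanes<const N: usize>([f32; N]);

impl<const N: usize> SimdVec for Lanes<N> {
    const LANES: usize = N;
    fn load(chunk: &[f32]) -> Self {
        let mut a = [0.0; N];
        a.copy_from_slice(&chunk[..N]);
        Lanes(a)
    }
    fn vadd(self, other: Self) -> Self {
        let mut a = self.0;
        for (x, y) in a.iter_mut().zip(other.0.iter()) {
            *x += *y;
        }
        Lanes(a)
    }
    fn store(self, out: &mut [f32]) {
        out[..N].copy_from_slice(&self.0);
    }
}

type VecSse42 = Lanes<4>; // 128-bit register: 4 x f32
type VecAvx = Lanes<8>; // 256-bit register: 8 x f32

/// The generic algorithm, written once and monomorphized per vector type.
fn add_slices<V: SimdVec>(x: &[f32], y: &[f32], out: &mut [f32]) {
    assert!(x.len() == y.len() && x.len() == out.len());
    let main = x.len() - x.len() % V::LANES;
    for i in (0..main).step_by(V::LANES) {
        V::load(&x[i..]).vadd(V::load(&y[i..])).store(&mut out[i..]);
    }
    // Scalar tail for the elements that do not fill a whole vector.
    for i in main..x.len() {
        out[i] = x[i] + y[i];
    }
}

/// Picks a monomorphized version at run-time; returns the lane count used.
fn add_dispatch(x: &[f32], y: &[f32], out: &mut [f32]) -> usize {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            add_slices::<VecAvx>(x, y, out);
            return VecAvx::LANES;
        }
    }
    add_slices::<VecSse42>(x, y, out);
    VecSse42::LANES
}
```

Calling `add_dispatch(&x, &y, &mut out)` fills `out` with the element-wise sum whichever branch is taken; only the chunk width differs.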


nixpulvis · 2017-11-30T02:05:11Z

Once Rust has compile-time constants in types, this could get even more awesome, I think.

AdamNiederer · 2017-12-19T21:03:13Z

I'm definitely interested in implementing this before 1.0.0, but I'd like to wait for stdsimd to mature a bit before jumping on it. I'll probably have methods for both static and dynamic dispatch to appease those who absolutely must save a branch.

gnzlbg · 2017-12-20T08:58:13Z

It would be great if one could figure out a way of doing the dispatch only once, when an iterator is consumed. That is, faster would need to monomorphize an iterator chain for multiple targets, and at run-time, when the iterator is consumed, the best branch is chosen before the outermost loop.
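One way this "check once, then run the whole loop" shape could look in plain Rust is sketched below. It is not how faster does or would do it. The function names (`scale_loop`, `scale_loop_avx2`, `scale`) are hypothetical, and the sketch assumes a stable toolchain with `#[target_feature]` and `is_x86_feature_detected!`. The loop is written once in an `#[inline(always)]` function so that each wrapper can get its own copy compiled under that wrapper's target features.

```rust
/// The whole loop, written once. Because it is `#[inline(always)]`, the copy
/// inlined into `scale_loop_avx2` is compiled with AVX2 enabled, which gives
/// the optimizer the chance (not a guarantee) to emit AVX2 instructions.
#[inline(always)]
fn scale_loop(data: &mut [f32], factor: f32) {
    for x in data.iter_mut() {
        *x *= factor;
    }
}

/// AVX2 copy of the loop. Calling it is unsafe because the caller must
/// guarantee that the CPU really supports AVX2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn scale_loop_avx2(data: &mut [f32], factor: f32) {
    scale_loop(data, factor)
}

/// Entry point: one feature check per call, outside the loop.
pub fn scale(data: &mut [f32], factor: f32) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed at run-time just above.
            unsafe { scale_loop_avx2(data, factor) };
            return;
        }
    }
    // Baseline copy for every other CPU or architecture.
    scale_loop(data, factor);
}
```

The branch cost is paid once per call to `scale`, not once per element, which is the property asked for above.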

lilianmoraru · 2017-12-20T11:39:59Z

Shouldn't the compiler also have support for this (run-time feature detection)?
Like the Intel compiler's -ax flag?
Or is it enough for the library to create the different code paths?

gnzlbg · 2017-12-20T11:41:25Z

@lilianmoraru stdsimd has support for it: if cfg_feature_enabled!("avx") { ... } returns whether the CPU the binary is running on supports avx.
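For readers on a current stable toolchain: the run-time check in the standard library is spelled `is_x86_feature_detected!`, which is what the sketch below uses instead of the `cfg_feature_enabled!` name quoted above. The helper `best_simd_level` is a hypothetical name, and the list of features it probes is only an example.

```rust
/// Returns the name of the widest x86 SIMD level (from a short, illustrative
/// list) that the CPU running this binary reports, or "baseline" otherwise.
fn best_simd_level() -> &'static str {
    // The macro only exists on x86/x86_64, so guard it for other targets.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if is_x86_feature_detected!("avx") {
            return "avx";
        }
        if is_x86_feature_detected!("sse4.2") {
            return "sse4.2";
        }
    }
    "baseline"
}
```

The result depends on the machine the binary runs on, not on the flags it was compiled with, which is the difference from compile-time `cfg` checks.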

AndreKR · 2020-04-05T17:49:02Z

I'm genuinely curious what the use case for compile-time feature detection (as it is currently implemented?) would even be. Never in my life have I seen a project that provides different binaries depending on the CPU you want to run it on. I probably wouldn't even know which one to pick.

jgarvin · 2021-07-13T02:59:25Z

@AndreKR it's the main way to avoid the runtime cost of branching to pick the appropriate instruction set. A lot of (most?) software gets written internally by businesses for the same business, and they know what hardware they target, so they don't need multiple binaries, or if they do it's simple, like a "last-gen" binary and a "current-gen" binary. You pick the one closest to the actual CPU you know you'll be running on.

ralfbiedert mentioned this issue Dec 3, 2017

Expose width on vec types and provide sum() ability. #5

Closed
