SIMD instructions operate on registers of at least 128 bits for floating point, so for 2-dimensional vectors the minimum width that fills a register is 128 bits; this makes SIMD a natural fit only for vectors with two 64-bit (double) components. So I'm planning to test, and maybe move, the vkvg API to double precision instead of single precision to be able to use SIMD.
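For illustration, a minimal sketch of what that could look like: a two-double vector that fits exactly in one 128-bit SSE2 register, so a single `_mm_add_pd` handles both components at once. The `vec2d` / `vec2d_add` names are placeholders, not the actual vkvg types.

```c
#include <emmintrin.h>  /* SSE2: __m128d, _mm_add_pd, _mm_loadu_pd, _mm_storeu_pd */

/* Hypothetical 2-component double vector; two contiguous doubles = 128 bits. */
typedef struct vec2d {
    double x, y;
} vec2d;

static inline vec2d vec2d_add (vec2d a, vec2d b) {
    __m128d va = _mm_loadu_pd (&a.x);              /* load x and y as one 128-bit value */
    __m128d vb = _mm_loadu_pd (&b.x);
    vec2d r;
    _mm_storeu_pd (&r.x, _mm_add_pd (va, vb));     /* one instruction adds both lanes */
    return r;
}
```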
First tests with a double-precision vec2d backed by __m128d show no improvement with _mm_add_pd, _mm_mul_pd and _mm_div_pd. Performance is even worse (about 20% slower than vec2 with 2 floats and plain arithmetic): the load/store overhead of single operations on 2D vectors of doubles can't bring a speedup.
SIMD usage must be considered over a full algorithm with few load/store operations, as in the sketch below.
The idea here was to try to hand-optimize individual arithmetic operations, but I was overlooking the load/store overhead. I guess a lot of vectorization is already done by the compilers. This subject is a hard one; these tests were my first attempt with SIMD operations.
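As a hedged sketch of that point (the function name and point layout are hypothetical, not from vkvg): translating a whole array of points keeps the offset in a register for the duration of the loop, so the load/store cost is paid once per point rather than wrapped around every single add.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Translate n points stored as x0,y0,x1,y1,... by (dx, dy).
 * The offset is packed once and reused across the whole loop;
 * each point costs one load, one add and one store. */
static void translate_points (double *pts, size_t n, double dx, double dy) {
    __m128d off = _mm_set_pd (dy, dx);                   /* high = dy, low = dx */
    for (size_t i = 0; i < n; i++) {
        __m128d p = _mm_loadu_pd (pts + 2 * i);          /* load x,y of point i */
        _mm_storeu_pd (pts + 2 * i, _mm_add_pd (p, off));/* add both lanes, write back */
    }
}
```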