Ranges: scan shuffle naming problem
Denis Yaroshevskiy edited this page Jun 11, 2021 · 3 revisions
I have a ludicrous naming problem for a shuffle.
Fundamentally, the scan should be done with the following shuffle (0 is the zero, i.e. the identity element, of the operation):
  [a    b        c            d              ]
+ [0    a        b            c              ]
  [a    a + b    b + c        c + d          ]
+ [0    0        a            a + b          ]
  [a    a + b    a + b + c    a + b + c + d  ]
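To make the steps above concrete, here is a minimal scalar sketch of the same scheme. It is only a model, not eve's implementation: `vec4`, `add`, `scan_by_sliding` and the single-vector `slide_right(x, n, zero)` are hypothetical names made up for illustration, with the second input of the shuffle fixed to an all-`zero` vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using vec4 = std::array<int, 4>;  // stand-in for a 4-lane register

// Hypothetical scalar model of the slide: every lane moves n positions
// towards higher indices, the vacated low lanes are filled with `zero`.
vec4 slide_right(const vec4& x, std::size_t n, int zero) {
  vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = i < n ? zero : x[i - n];
  return r;
}

// Lane-wise addition, standing in for the scanned operation.
vec4 add(const vec4& x, const vec4& y) {
  vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = x[i] + y[i];
  return r;
}

// Inclusive scan in log2(N) steps: slide by 1, then by 2, adding each time.
vec4 scan_by_sliding(vec4 x) {
  for (std::size_t shift = 1; shift < x.size(); shift *= 2)
    x = add(x, slide_right(x, shift, 0));
  return x;
}
```

For the input `[1 2 3 4]` the intermediate vector after the first step is `[1 3 5 7]`, and the second step adds `[0 0 1 3]` to it.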
However, this second shuffle (and the further ones needed for smaller types) is inefficient on AVX2.
A better option is to compute the scan in both halves, then broadcast the sum of the left half and mix it in (or zero-extend it, for eve::zero):
  [a    b        c            d              ]
+ [0    a        0            c              ]  // _mm256_alignr_epi8
  [a    a + b    c            c + d          ]
+ [0    0        a + b        a + b          ]  // extract + bcast + extend
  [a    a + b    a + b + c    a + b + c + d  ]
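The per-half variant can be modelled the same way. Again this is a scalar sketch under assumptions rather than real AVX2 code: `slide_right_in_halves` and `bcast_low_sum_to_high` are placeholder names invented here (finding proper names for them is exactly the open question), and each one only mimics what the corresponding intrinsic sequence would produce for 4 lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using vec4 = std::array<int, 4>;  // stand-in for a 4-lane register

// Lane-wise addition, standing in for the scanned operation.
vec4 add(const vec4& x, const vec4& y) {
  vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = x[i] + y[i];
  return r;
}

// Placeholder name. Slides by one lane inside each half independently:
// [a b c d] -> [0 a 0 c].
vec4 slide_right_in_halves(const vec4& x, int zero) {
  return {zero, x[0], zero, x[2]};
}

// Placeholder name. Takes the last lane of the low half (its total),
// broadcasts it over the high half and fills the low half with `zero`:
// [a s c t] -> [0 0 s s].
vec4 bcast_low_sum_to_high(const vec4& x, int zero) {
  return {zero, zero, x[1], x[1]};
}

// Scan both halves on their own, then add the low-half total to the high half.
vec4 scan_by_halves(vec4 x) {
  x = add(x, slide_right_in_halves(x, 0));
  x = add(x, bcast_low_sum_to_high(x, 0));
  return x;
}
```

For the input `[1 2 3 4]` the first step yields `[1 3 3 7]`, and the second step adds `[0 0 3 3]`, so both schemes arrive at the same inclusive scan.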
The problem is naming.
The shuffle in the first case should be called slide_right(a, b, n) or shift_right(a, b, n).
However, I'm completely blank on what to call the ones in the second case.
An alternative to having properly named shuffles for the AVX case is to use the intrinsics in place.
But that seems a touch icky.