Fix simd::gatherBits for Mac M1 or when AVX2 is disabled #8415
@Yuhta Jimmy, would you help review this PR?
velox/common/base/SimdUtil.cpp
*(resultPtr++) =
    simd::gather8Bits(bits, xsimd::load_unaligned(indices + i), 8);
for (; i + kStep < size; i += kStep) {
  if constexpr (kStep == 8) {
Is this copied from DecoderUtil::nonNullRowsFromSparse? Would it be possible to refactor to avoid the duplication?
Yes, I copied the code structure from DecoderUtil::nonNullRowsFromSparse.
Maybe we could refactor simd::gather8Bits so that it works as its name suggests, but I'm not sure whether that is acceptable. For now, I'd like to keep the change small to minimize the impact.
I don't think gather8Bits can be changed, as the indices come from a single register. To remove the duplication I created #8416. @icejoywoo, you can rebase onto my commit and use the new function. @mbasmanova, can you review #8416?
@Yuhta OK, I will rebase and use the new function bits::storeBitsToByte.
Summary: We will need this to fix `simd::gatherBits` for non-AVX cases. See facebookincubator#8415 Differential Revision: D52837098
@Yuhta I have rebased and now use the new function.
@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Conbench analyzed the 1 benchmark run on commit. There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
Fix #8377
When AVX2 is disabled, or when running on Mac M1 (arm64), simd::gatherBits produces incorrect results.
The fix follows the approach used in DecoderUtil::nonNullRowsFromSparse.
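For reference, the result expected from a bit gather can be written as a plain scalar loop, which is handy as an oracle when testing the non-AVX2 path. This is only a sketch, not the Velox implementation: `bitAt` and `gatherBitsScalar` are hypothetical names, and it assumes the bitmap is stored as 64-bit words with the least significant bit first and that the output is packed into bytes in the same order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Velox function): reads one bit from a bitmap
// stored as 64-bit words, least significant bit first.
inline bool bitAt(const uint64_t* bits, int32_t index) {
  return (bits[index / 64] >> (index % 64)) & 1ULL;
}

// Hypothetical scalar reference (not a Velox function): bit i of the result
// is the bit found at position indices[i] in `bits`. Trailing bits of the
// last output byte are left as zero.
std::vector<uint8_t> gatherBitsScalar(
    const uint64_t* bits,
    const std::vector<int32_t>& indices) {
  std::vector<uint8_t> result((indices.size() + 7) / 8, 0);
  for (size_t i = 0; i < indices.size(); ++i) {
    assert(indices[i] >= 0); // Negative indices are out of scope here.
    if (bitAt(bits, indices[i])) {
      result[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return result;
}
```

Comparing the SIMD path against a loop like this, for input sizes that are not a multiple of 8 and for indices that cross a word boundary, should exercise the tail handling on platforms without AVX2.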