Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix simd::gatherBits for Mac M1 or when AVX2 is disabled #8415

Closed
wants to merge 3 commits into from

Conversation

icejoywoo
Copy link
Contributor

@icejoywoo icejoywoo commented Jan 17, 2024

Fix #8377

When avx2 is disabled or run on Mac M1(arm64), simd::gatherBits works incorrectly.

This fix comes from DecoderUtil::nonNullRowsFromSparse.

Copy link

netlify bot commented Jan 17, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit cee7fc2
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/65a88ce9aa67ce0008c57a65

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 17, 2024
@mbasmanova mbasmanova changed the title Fix simd::gatherBits works incorrectly when AVX2 was disabled or run on Mac M1 Fix simd::gatherBits for Mac M1 or when AVX2 is disabled Jan 17, 2024
@mbasmanova mbasmanova requested a review from Yuhta January 17, 2024 12:22
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yuhta Jimmy, would you help review this PR?

*(resultPtr++) =
simd::gather8Bits(bits, xsimd::load_unaligned(indices + i), 8);
for (; i + kStep < size; i += kStep) {
if constexpr (kStep == 8) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a copy-paste from DecoderUtil::nonNullRowsFromSparse ? Would it be possible to refactor to avoid that?

Copy link
Contributor Author

@icejoywoo icejoywoo Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I copy the code structure from DecoderUtil::nonNullRowsFromSparse.

Maybe we can refactor simd::gather8Bits to make it work as the function name says. But I'm not sure this is acceptable or not. Currently, I want to change little to fix this, to minimize the impact.

Copy link
Contributor

@Yuhta Yuhta Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think gather8Bits can be changed as the indices is from one single register. For removing the duplicates I created #8416, @icejoywoo you can rebase upon my commit and use the new function, and @mbasmanova can you review #8416?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yuhta Ok, I will rebase code to use new function bits::storeBitsToByte.

Yuhta added a commit to Yuhta/velox that referenced this pull request Jan 17, 2024
Summary:
We will need this to fix `simd::gatherBits` for non-AVX cases.

See facebookincubator#8415

Differential Revision: D52837098
Yuhta added a commit to Yuhta/velox that referenced this pull request Jan 17, 2024
Summary:

We will need this to fix `simd::gatherBits` for non-AVX cases.

See facebookincubator#8415

Differential Revision: D52837098
facebook-github-bot pushed a commit that referenced this pull request Jan 17, 2024
Summary:
Pull Request resolved: #8416

We will need this to fix `simd::gatherBits` for non-AVX cases.

See #8415

Reviewed By: mbasmanova

Differential Revision: D52837098

fbshipit-source-id: 328f49a19c5b01857f5a38ac8eef0678bc62a4fd
@icejoywoo
Copy link
Contributor Author

@Yuhta I already rebase and use the new function bits::storeBitsToByte.

@facebook-github-bot
Copy link
Contributor

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@Yuhta merged this pull request in 44cdc9e.

Copy link

Conbench analyzed the 1 benchmark run on commit 44cdc9e5.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SimdUtilTest.gatherBits does not work for arm64(M1)
4 participants