Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Change C style casts to C++ style (Part 3) #11686

Closed
wants to merge 1 commit into from
Closed

Conversation

aditi-pandit
Copy link
Collaborator

As per the security guideline in
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es49-if-you-must-use-a-cast-use-a-named-cast

Covers the findings in velox/vector

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 28, 2024
Copy link

netlify bot commented Nov 28, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit c7521fe
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6752bb082d2e3f00083c2037

@aditi-pandit aditi-pandit changed the title refactor : Change C-style casts to C++-style (Part 3) refactor: Change C-style casts to C++-style (Part 3) Nov 28, 2024
@aditi-pandit aditi-pandit changed the title refactor: Change C-style casts to C++-style (Part 3) refactor: Change C style casts to C++ style (Part 3) Nov 28, 2024
@aditi-pandit aditi-pandit marked this pull request as draft November 28, 2024 08:38
@aditi-pandit aditi-pandit marked this pull request as ready for review November 28, 2024 09:05
@@ -582,13 +583,15 @@ VectorPtr createStringFlatVectorFromUtf8View(
// buffer-index, 4-byte buffer-offset] to 16-byte Velox StringView [4-byte
// length, 4-byte prefix, 8-byte buffer-ptr]
for (int32_t idx_64 = 0; idx_64 < arrowArray.length; ++idx_64) {
auto* view = (uint32_t*)(&((uint64_t*)arrowArray.buffers[1])[2 * idx_64]);
rawStringViews[2 * idx_64] = *(uint64_t*)view;
auto* view = reinterpret_cast<const uint32_t*>(&(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add const here and elsewhere in the block instead of keeping it non-const?
Compared to line 783 change as well for example.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arrwoArray.buffers is a const void** so the reinterpret_cast has error if we lose the constness. Same in other places where const is added.

velox/vector/arrow/Bridge.cpp Outdated Show resolved Hide resolved
if (!vec.isNullAt(i) && view[0] > 12) {
uint64_t currAddr = *(uint64_t*)&view[2];
uint64_t currAddr = *reinterpret_cast<const uint64_t*>(&view[2]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using const here isn't needed in this line? view being uint32* and then the value is immediately de-referenced.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Done.

@aditi-pandit aditi-pandit force-pushed the cve3 branch 3 times, most recently from 761ee2d to 5c5817e Compare December 5, 2024 15:25
@aditi-pandit
Copy link
Collaborator Author

@majetideepak @czentgr : There is one pending cast in Bridge.cpp:792 that I've not been able to resolve. Its a bit ugly because its overwriting the Velox StringView representation to Arrow StringView format and using some 64 bit to 32 bit pointer magic.

It might need wider changes and I would prefer to handle that individually as it would need a very close inspection of the code and writing more tests.

The changes in this PR are more straightforward changes in casts.

Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aditi-pandit one comment.
The casts do look tricky here since we are deserializing nested arrow buffer contents.
Pointers are being generated from values.
But since we are only changing C casts to C++ casts, this should be safe.

});

auto utf8Views = (uint64_t*)out.buffers[1];
auto utf8Views = reinterpret_cast<const uint64_t*>(out.buffers[1]);
int32_t bufferIdxCache = 0;
uint64_t bufferAddrCache = 0;

rows.apply([&](vector_size_t i) {
auto view = (uint32_t*)&utf8Views[2 * i];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be okay to make this reinterpret_cast<const uint32_t*>(&utf8Views[2 * i])

Copy link
Collaborator Author

@aditi-pandit aditi-pandit Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@majetideepak : Ah well... that isn't sufficient as the view pointer is used to modify contents around line 811. It can't be const.

view[2] = bufferIdxCache;
view[3] = currAddr - bufferAddrCache;

Copy link
Collaborator

@majetideepak majetideepak Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do const_cast<uint32_t*>(reinterpret_cast<const uint32_t*>(&utf8Views[2 * i]));?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@majetideepak : There is also guidance to not cast away const https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es50-dont-cast-away-const

I was leaning to add a temporary buffer for StringView value modification from Velox format -> Arrow format in VeloxToArrowBridgeHolder and then initialize this just once into the ArrowStruct https://github.com/facebookincubator/velox/blob/main/velox/vector/arrow/Bridge.cpp#L43. That fits better with the class design.

But temporarily we can add const_cast if you are okay with that.

Copy link
Collaborator

@majetideepak majetideepak Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous C-style cast is removing the const away anyway. I don't think this change will make anything worse.
The root cause seems to be that we use the same VeloxToArrowBridgeHolder class for import and export.
The following field must then be.
std::vector<const void*> buffers_{numBuffers_, nullptr}; -> std::vector<void*> buffers_{numBuffers_, nullptr};

@aditi-pandit aditi-pandit force-pushed the cve3 branch 2 times, most recently from 227b64b to 20bda29 Compare December 6, 2024 06:31
@majetideepak
Copy link
Collaborator

CI failure seems unrelated.

@majetideepak majetideepak added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Dec 6, 2024
@facebook-github-bot
Copy link
Contributor

@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@bikramSingh91 merged this pull request in 12ae85b.

athmaja-n pushed a commit to athmaja-n/velox that referenced this pull request Jan 10, 2025
…or#11686)

Summary:
As per the security guideline in
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es49-if-you-must-use-a-cast-use-a-named-cast

Covers the findings in velox/vector

Pull Request resolved: facebookincubator#11686

Reviewed By: kagamiori

Differential Revision: D66886326

Pulled By: bikramSingh91

fbshipit-source-id: dcece7a01bd880bfd12307d5829edc28057a049f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants