-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix checksum Presto aggregate function #6910
Conversation
✅ Deploy Preview for meta-velox canceled.
|
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I have a corner case, but I am not sure if its a valid case.
AlignedBuffer::allocate<int64_t>(elementRows.end(), baseRow->pool()); | ||
auto* rawCombinedChildHashes = combinedChildHashes->asMutable<int64_t>(); | ||
std::fill_n(rawCombinedChildHashes, rows.end(), 1); | ||
|
||
std::fill_n(rawHashes, rows.end(), 1); | ||
|
||
for (int i = 0; i < baseRow->childrenSize(); i++) { | ||
children_[i]->hash(baseRow->childAt(i), elementRows, childHashes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we reset childHashes , before computing for every child.
Can we have a child vector that has the null bit set at a point but the parent doesnt have the null bit set, in that case we will not compute hash at that point for child and reuse and older hash value which might be wrong.
Let me know if this is not possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expectation is that hash() function will compute hashes for all rows requested regardless whether they are null or not.
row->setNull(1, true); | ||
assertChecksum(row, "DpXXC2Pzbjw="); | ||
|
||
row = makeRowVector({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we add a nested row test case ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
4ca5100
to
b4f66e6
Compare
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@mbasmanova merged this pull request in f747c38. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Summary: PrestoHasher used to ignore null flags for structs producing incorrect hashes. Fixes facebookincubator#6715 and facebookincubator#6449 Pull Request resolved: facebookincubator#6910 Reviewed By: kgpai Differential Revision: D49953433 Pulled By: mbasmanova fbshipit-source-id: 828476371bc649af7b1bd29d80a072cd4bab739a
PrestoHasher used to ignore null flags for structs producing incorrect hashes.
Fixes #6715 and #6449