-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(hashjoin): Add fast row size estimation for hash probe #11558
Conversation
tanjialiang
commented
Nov 16, 2024
•
edited
Loading
edited
- Add column stats for row container to collect aggregated column stats. The aggregated column stats will be used in hash probe to decide if row size estimation is applicable. If it is applicable, column stats will be used to compose a fast row size estimation to avoid memory exploding when probing and listing results. This added feature makes hash join more performant, and in some extreme skew cases that we've seen in Meta internal queries, it helped to decrease the query latency by >20x.
- The work of this feature also helped to discovered a bug in HashTable when using simd for fast path result listing -> when max number of rows is smaller than kWidth, the unsigned integer overflow bug will make the max number of rows be ignored. Fixed the bug and the new test covers that case.
✅ Deploy Preview for meta-velox canceled.
|
852cbe2
to
9ed88ab
Compare
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
9ed88ab
to
3f1b34c
Compare
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
a9dc3bc
to
b4758b6
Compare
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tanjialiang thanks for the improvement % minors.
velox/exec/RowContainer.h
Outdated
|
||
private: | ||
// Aggregated stats for non-null rows of the column. | ||
int32_t minBytes_{0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use uint32_t? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int32_t is widely used as cell size in row container. I'm trying to make it compatible so that no cast is needed when performing ::max ::min. If we want we can have another PR for refactoring the types in general.
velox/exec/RowContainer.cpp
Outdated
if (columnStatsValid) { | ||
for (uint32_t columnIndex = 0; columnIndex < rowColumnsStats_.size(); | ||
columnIndex++) { | ||
if (types_[columnIndex]->isFixedWidth()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd keep this simple by having column stats for fixed width as well like null count sth.
b4758b6
to
261ab55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tanjialiang LGTM % minors. Thanks!
if (totalMaxBytes == 0) { | ||
return 0; | ||
} | ||
return std::nullopt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why return null here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After offline discussion, I re-thought about it. The reason we want to return nullopt is as follows:
Imagine we have a case where 99999999 rows are of size 0, and 1 row is of size 1MB. And all left side join with this row. This will explode the memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tanjialiang thanks!
totalAvgBytes += stats.avgBytes(); | ||
totalMaxBytes += stats.maxBytes(); | ||
} | ||
if (totalAvgBytes == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (totalAvgBytes == 0) {
return 0;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After offline discussion, I re-thought about it. The reason we want to return nullopt is as follows:
Imagine we have a case where 99999999 rows are of size 0, and 1 row is of size 1MB. And all left side join with this row. This will explode the memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it explodes? So most of rows are zero size so it is ok to execute fast path? There is number of output row limit.
b69ee43
to
9a3270b
Compare
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
03bbefd
to
613aebe
Compare
c041da5
to
5ab4a6e
Compare
f99c1be
to
ccfd4e7
Compare
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tanjialiang can you add to track the row column size for non-join case as well. The prefix sort might require that as well @zhli1142015 . Thanks!
@@ -27,7 +27,7 @@ namespace facebook::velox::functions::prestosql { | |||
namespace { | |||
|
|||
class SimpleComparisonMatcherTest : public testing::Test, | |||
public test::VectorTestBase { | |||
public velox::test::VectorTestBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need these namespace changes?
@@ -26,6 +26,7 @@ | |||
#include "velox/parse/TypeResolver.h" | |||
#include "velox/vector/tests/utils/VectorTestBase.h" | |||
|
|||
using namespace facebook; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need additional facebook namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For making complier not confused with the newly added test namespace in exec.
Yes, we should also need max length for string columns for prefix sort. |
I'll do a followup PR for that, to make sure there's no regression if enabled by default. |
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
1f56443
to
397baeb
Compare
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
397baeb
to
f3724d3
Compare
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
…incubator#11558) Summary: * Add column stats for row container to collect aggregated column stats. The aggregated column stats will be used in hash probe to decide if row size estimation is applicable. If it is applicable, column stats will be used to compose a fast row size estimation to avoid memory exploding when probing and listing results. This added feature makes hash join more performant, and in some extreme skew cases that we've seen in Meta internal queries, it helped to decrease the query latency by >20x. * The work of this feature also helped to discovered a bug in HashTable when using simd for fast path result listing -> when max number of rows is smaller than kWidth, the unsigned integer overflow bug will make the max number of rows be ignored. Fixed the bug and the new test covers that case. Pull Request resolved: facebookincubator#11558 Reviewed By: xiaoxmeng Differential Revision: D66064300 Pulled By: tanjialiang
f3724d3
to
07ede69
Compare
This pull request was exported from Phabricator. Differential Revision: D66064300 |
@tanjialiang merged this pull request in 059337f. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Thank you for the fix. @zhouyuan @zhztheplayer Is our Gluten Jenkins verified? |
…incubator#11558) Summary: * Add column stats for row container to collect aggregated column stats. The aggregated column stats will be used in hash probe to decide if row size estimation is applicable. If it is applicable, column stats will be used to compose a fast row size estimation to avoid memory exploding when probing and listing results. This added feature makes hash join more performant, and in some extreme skew cases that we've seen in Meta internal queries, it helped to decrease the query latency by >20x. * The work of this feature also helped to discovered a bug in HashTable when using simd for fast path result listing -> when max number of rows is smaller than kWidth, the unsigned integer overflow bug will make the max number of rows be ignored. Fixed the bug and the new test covers that case. Pull Request resolved: facebookincubator#11558 Reviewed By: xiaoxmeng Differential Revision: D66064300 Pulled By: tanjialiang fbshipit-source-id: 886cd943036350b1c1bf0b6741ebe7165883a30f