Unnest array of rows based on unnestArrayOfRows #7477
@mbasmanova: Updated the code. PTAL.
@aditi-pandit Aditi, thank you for working on this. Overall looks good, modulo a comment about the parameter name, another comment about the output of UnnestNode::toString(), and some comments about tests.
@mbasmanova: I have addressed all the comments. PTAL. #8077 has been split out to refactor the long method in the Unnest logic that was pointed out in review.
Velox unnests an array of rows into a single column of the element row type. Presto also supports unnesting such arrays into multiple columns, one for each child of the element row type. This PR adds an unnestArrayOfRows flag to the Unnest PlanNode to support that behavior.
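The difference between the two output shapes can be sketched in plain C++. This is only an illustration and does not use Velox APIs: `Element`, `ExpandedColumns`, `unnestToSingleColumn`, and `unnestToChildColumns` are hypothetical names, and the element type is assumed to be a two-field row.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the element type of an array(row(a, b)) column.
struct Element {
  int32_t a;
  std::string b;
};

// Output shape with multiple columns: one output column per child of the row.
struct ExpandedColumns {
  std::vector<int32_t> a;
  std::vector<std::string> b;
};

// Single-column shape: every array element becomes one output value that is
// still a whole row.
std::vector<Element> unnestToSingleColumn(
    const std::vector<std::vector<Element>>& arrays) {
  std::vector<Element> out;
  for (const auto& array : arrays) {
    out.insert(out.end(), array.begin(), array.end());
  }
  return out;
}

// Multi-column shape: every array element is split into its children, and
// each child lands in its own output column.
ExpandedColumns unnestToChildColumns(
    const std::vector<std::vector<Element>>& arrays) {
  ExpandedColumns out;
  for (const auto& array : arrays) {
    for (const auto& element : array) {
      out.a.push_back(element.a);
      out.b.push_back(element.b);
    }
  }
  // Both columns must describe the same number of output rows.
  assert(out.a.size() == out.b.size());
  return out;
}
```

Both functions emit the same number of output rows. Only the number of output columns differs, which is why unnestNames needs one name per child of the row in the expanded mode.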
@aditi-pandit CI is red. Would you try rebasing?
@aditi-pandit Aditi, this PR is shaping up nicely. A few remaining comments.
/// @param ordinalityName Optional name for the ordinality columns. If not
/// present, ordinality column is not produced.
/// @param unnestArrayOfRows Expands array of rows to multiple columns, one
nit: If true, expands .... Otherwise, ...
unnestVariables,
unnestNames,
ordinalityName,
true,
Curious if default should be false. Doesn't current logic match unnestArrayOfRows = false?
@@ -665,9 +667,11 @@ number starting with 1.
* - unnestVariables
- Input columns of type array or map to expand.
* - unnestNames
- Names to use for expanded columns. One name per array column. Two names per map column.
- Names to use for expanded columns. One name per array column. Two names per map column. If array of rows (and not legacyUnnest), then one column per child type of the row.
legacyUnnest
Should we update this?
ASSERT_EQ("-- Unnest\n", plan->toString());
ASSERT_EQ(
"-- Unnest[a0, unnest arrays of rows] -> c1:INTEGER, a0_e:SMALLINT\n",
Should this example use array(row) and not array(smallint)?
@@ -493,4 +494,23 @@ TEST_F(QueryAssertionsTest, intervalDayTime) {
assertQuery(plan, "SELECT * FROM tmp");
}

TEST_F(QueryAssertionsTest, equalPlans) {
What is this test for?
// Tests with an array with dictionary encoded elements.
VectorFuzzer::Options opts;
opts.containerVariableLength = false;
Why false? Usually, these kinds of tests have a loop with 100 or so iterations and allow Fuzzer to generate whatever encoding it chooses using fuzzInputRow.
// Test with array of rows (one of which is a constant). Each array is
// encoded.
vector_size_t size = array1.size();
Let's use fuzzer.
.planNode();
};

auto noUnnestPlan = [&](const VectorPtr& child1,
We should use the same plans but first use data generated by Fuzzer, then use same data but flattened. The results should be the same.
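The shape of the test suggested here (run the same logic on fuzzed, encoded input and on the flattened copy of that input, then compare) can be sketched without Velox. Everything below is a hypothetical stand-in: `EncodedArrays`, `fuzzArrays`, `unnestEncoded`, and `unnestFlat` are illustrative names, and `std::mt19937` plays the role that VectorFuzzer would play in a real test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical array column whose elements are dictionary-encoded.
// Element k is dictionary[indices[k]]; row r covers elements
// [offsets[r], offsets[r] + sizes[r]).
struct EncodedArrays {
  std::vector<int32_t> dictionary;
  std::vector<size_t> indices;
  std::vector<size_t> offsets;
  std::vector<size_t> sizes;
};

// Decodes the input into plain nested vectors (the "flattened" data).
std::vector<std::vector<int32_t>> flatten(const EncodedArrays& in) {
  std::vector<std::vector<int32_t>> rows(in.offsets.size());
  for (size_t r = 0; r < in.offsets.size(); ++r) {
    for (size_t k = in.offsets[r]; k < in.offsets[r] + in.sizes[r]; ++k) {
      rows[r].push_back(in.dictionary[in.indices[k]]);
    }
  }
  return rows;
}

// Unnest over the encoded representation.
std::vector<int32_t> unnestEncoded(const EncodedArrays& in) {
  std::vector<int32_t> out;
  for (size_t r = 0; r < in.offsets.size(); ++r) {
    for (size_t k = in.offsets[r]; k < in.offsets[r] + in.sizes[r]; ++k) {
      out.push_back(in.dictionary[in.indices[k]]);
    }
  }
  return out;
}

// Unnest over the flattened representation.
std::vector<int32_t> unnestFlat(const std::vector<std::vector<int32_t>>& rows) {
  std::vector<int32_t> out;
  for (const auto& row : rows) {
    out.insert(out.end(), row.begin(), row.end());
  }
  return out;
}

// Generates random input: dictionary, indices, and array boundaries.
EncodedArrays fuzzArrays(std::mt19937& rng) {
  auto pick = [&rng](size_t lo, size_t hi) {
    return std::uniform_int_distribution<size_t>(lo, hi)(rng);
  };
  EncodedArrays in;
  in.dictionary.resize(pick(1, 8));
  for (auto& value : in.dictionary) {
    value = static_cast<int32_t>(pick(0, 1000));
  }
  in.indices.resize(pick(0, 20));
  for (auto& index : in.indices) {
    index = pick(0, in.dictionary.size() - 1);
  }
  const size_t numRows = pick(0, 10);
  for (size_t r = 0; r < numRows; ++r) {
    const size_t offset = pick(0, in.indices.size());
    in.offsets.push_back(offset);
    in.sizes.push_back(pick(0, in.indices.size() - offset));
  }
  return in;
}

// Runs the comparison for a number of fuzzed inputs; returns false on the
// first mismatch between the encoded and the flattened run.
bool fuzzedMatchesFlattened(uint32_t seed, int iterations) {
  std::mt19937 rng(seed);
  for (int i = 0; i < iterations; ++i) {
    const EncodedArrays in = fuzzArrays(rng);
    if (unnestEncoded(in) != unnestFlat(flatten(in))) {
      return false;
    }
  }
  return true;
}
```

A call such as `fuzzedMatchesFlattened(42, 100)` corresponds to the loop of 100 or so iterations mentioned in an earlier comment. Keeping the seed as a parameter makes a failing input reproducible.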
@@ -1426,8 +1426,10 @@ PlanBuilder& PlanBuilder::nestedLoopJoin(
 PlanBuilder& PlanBuilder::unnest(
 const std::vector<std::string>& replicateColumns,
 const std::vector<std::string>& unnestColumns,
-const std::optional<std::string>& ordinalColumn) {
+const std::optional<std::string>& ordinalColumn,
+const bool unnestArrayOfRows) {
remove const
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions!
Needed for prestodb/presto#20643