[Multi-stage] Support is_leaf_return_final_result agg option #14645

Merged (1 commit) on Dec 13, 2024


Jackie-Jiang
Contributor

Introduce is_leaf_return_final_result as an agg option. It leverages the query option serverReturnFinalResultKeyUnpartitioned, introduced in #13208, to reduce the data transferred for group-by aggregations.
When this option is set to true, the LEAF aggregate sends its final result directly to the FINAL aggregate. This is particularly useful for the DISTINCT_COUNT family of aggregates when the aggregated values are partitioned.

E.g.

SELECT /*+ aggOptions(is_leaf_return_final_result='true') */ {tbl1}.name, COUNT(*), SUM({tbl1}.num), COUNT(DISTINCT {tbl1}.num) FROM {tbl1} /*+ tableOptions(partition_function='hashcode', partition_key='num', partition_size='4', partition_parallelism='2') */ GROUP BY {tbl1}.name
SELECT /*+ aggOptions(is_leaf_return_final_result='true') */ {tbl1}.name, SUM({tbl2}.num), COUNT(DISTINCT {tbl2}.num) FROM {tbl1} /*+ tableOptions(partition_function='hashcode', partition_key='num', partition_size='4') */ JOIN {tbl2} /*+ tableOptions(partition_function='hashcode', partition_key='num', partition_size='4') */ ON {tbl1}.num = {tbl2}.num GROUP BY {tbl1}.name
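To illustrate why partitioning on the aggregated value matters here, below is a minimal conceptual sketch in Python. It is not Pinot code: the function names, the modulo-based partitioning, and the sample rows are all hypothetical, and the merge logic is an assumption about how final results for a distinct count could be combined. It compares shipping per-group value sets (intermediate results) with shipping per-group counts (final results).

```python
from collections import defaultdict

def partition_rows(rows, num_partitions):
    """Route each (name, num) row to a partition by num (hypothetical scheme)."""
    partitions = [[] for _ in range(num_partitions)]
    for name, num in rows:
        partitions[num % num_partitions].append((name, num))
    return partitions

def leaf_intermediate(rows):
    """Leaf stage returning intermediate results: the set of distinct nums per name."""
    sets = defaultdict(set)
    for name, num in rows:
        sets[name].add(num)
    return dict(sets)

def leaf_final(rows):
    """Leaf stage returning final results: one integer count per name."""
    return {name: len(nums) for name, nums in leaf_intermediate(rows).items()}

def merge_intermediate(leaf_results):
    """Final stage for intermediate results: union the sets, then count."""
    merged = defaultdict(set)
    for result in leaf_results:
        for name, nums in result.items():
            merged[name] |= nums
    return {name: len(nums) for name, nums in merged.items()}

def merge_final(leaf_results):
    """Final stage for final results: add up the per-leaf counts."""
    totals = defaultdict(int)
    for result in leaf_results:
        for name, count in result.items():
            totals[name] += count
    return dict(totals)

rows = [("a", 1), ("a", 2), ("a", 5), ("a", 1),
        ("b", 2), ("b", 6), ("b", 6), ("a", 9)]

# Partitioned on num: a given num only ever lives on one leaf.
partitioned = partition_rows(rows, 4)
exact = merge_intermediate([leaf_intermediate(p) for p in partitioned])
from_final = merge_final([leaf_final(p) for p in partitioned])

# Not partitioned on num (round-robin split): the same num can appear on both leaves.
unpartitioned = [rows[0::2], rows[1::2]]
overcounted = merge_final([leaf_final(p) for p in unpartitioned])
```

In this sketch, `from_final` matches `exact` because no value can be counted by two leaves, while `overcounted` exceeds the true distinct counts because values repeated across leaves are added twice. Each leaf also sends a single integer per group instead of a set, which is the kind of transfer reduction the description refers to.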

@Jackie-Jiang added the following labels on Dec 12, 2024: enhancement; documentation; Configuration (Config changes: addition/deletion/change in behavior); multi-stage (Related to the multi-stage query engine)
codecov-commenter commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 87.27273% with 28 lines in your changes missing coverage. Please review.

Project coverage is 64.00%. Comparing base (59551e4) to head (014ce43).
Report is 1460 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...pinot/calcite/rel/hint/PinotHintStrategyTable.java | 50.00% | 14 Missing ⚠️ |
| ...ry/runtime/operator/MultistageGroupByExecutor.java | 90.00% | 2 Missing and 2 partials ⚠️ |
| ...che/pinot/segment/spi/AggregationFunctionType.java | 92.68% | 3 Missing ⚠️ |
| ...he/pinot/query/planner/explain/PlanNodeMerger.java | 0.00% | 2 Missing ⚠️ |
| .../query/planner/logical/EquivalentStagesFinder.java | 0.00% | 2 Missing ⚠️ |
| ...el/rules/PinotAggregateExchangeNodeInsertRule.java | 96.29% | 0 Missing and 1 partial ⚠️ |
| ...he/pinot/query/planner/plannode/AggregateNode.java | 75.00% | 1 Missing ⚠️ |
| .../pinot/query/planner/serde/PlanNodeSerializer.java | 98.18% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14645      +/-   ##
============================================
+ Coverage     61.75%   64.00%   +2.25%     
- Complexity      207     1602    +1395     
============================================
  Files          2436     2696     +260     
  Lines        133233   148473   +15240     
  Branches      20636    22756    +2120     
============================================
+ Hits          82274    95027   +12753     
- Misses        44911    46475    +1564     
- Partials       6048     6971     +923     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (+99.99%) ⬆️ |
| integration | 100.00% <ø> (+99.99%) ⬆️ |
| integration1 | 100.00% <ø> (+99.99%) ⬆️ |
| integration2 | 0.00% <ø> (ø) |
| java-11 | 63.97% <87.27%> (+2.26%) ⬆️ |
| java-21 | 63.88% <87.27%> (+2.25%) ⬆️ |
| skip-bytebuffers-false | 63.99% <87.27%> (+2.24%) ⬆️ |
| skip-bytebuffers-true | 63.85% <87.27%> (+36.13%) ⬆️ |
| temurin | 64.00% <87.27%> (+2.25%) ⬆️ |
| unittests | 63.99% <87.27%> (+2.25%) ⬆️ |
| unittests1 | 56.21% <87.27%> (+9.32%) ⬆️ |
| unittests2 | 34.51% <3.18%> (+6.78%) ⬆️ |


@Jackie-Jiang Jackie-Jiang merged commit 3677671 into apache:master Dec 13, 2024
21 checks passed
@Jackie-Jiang Jackie-Jiang deleted the leaf_final_result branch December 13, 2024 22:32
3 participants