ADBDEV-5521: Add support of CTE with modifying DML operations on replicated tables (#1168)
Conversation
Several issues could occur while running queries with writable CTEs that modify replicated tables. This patch addresses the main aspects of handling a modifying CTE over replicated tables. These issues stemmed from the inability of major interfaces to handle the locus type CdbLocusType_Replicated and the flow type FLOW_REPLICATED, which is set for the result of a modifying operation over a replicated table inside the adjust_modifytable_flow function.

1. Choosing the join locus type

The planner failed to form a plan when choosing the correct final locus type for a join between a modifying CTE over a replicated table and any other relation. It was failing either with errors, because the cdbpath_motion_for_join function could not choose a proper join locus, or produced an invalid or suboptimal plan when handling the Replicated locus. Therefore, this patch reconsiders the join logic for the Replicated locus. The main principle for deciding the join locus of Replicated with others is: the slice that performs the modifying operation on a replicated table must be executed on all segments with data. We also have to keep in mind that redistributing or broadcasting a Replicated locus leads to data duplication. The joins are performed in the following way:

SegmentGeneral + Replicated
When a join between CdbLocusType_Replicated and CdbLocusType_SegmentGeneral occurred, the planner failed with an assert checking that a common UPDATE/DELETE ... FROM/USING (a modifying operation with a join) of a replicated table takes place (the root->upd_del_replicated_table == 0 condition in the cdbpath_motion_for_join function). This assert was added by its author without proper test coverage, and currently there is no case in which the check can be true. Without the assert the planner produced a valid plan, though it could cut the number of segments in the final locus.
In this context the Replicated locus was not handled correctly, so proper logic is needed that takes the writable CTE case into account. This patch allows joins between CdbLocusType_Replicated and CdbLocusType_SegmentGeneral as follows: when SegmentGeneral's number of segments is greater than or equal to Replicated's, the final join locus becomes Replicated as well. Otherwise, both the Replicated and SegmentGeneral parts are gathered to SingleQE in order to perform the join on a single segment.

Replicated + General
The logic related to the General locus remains unchanged: the join locus becomes Replicated.

Replicated + SingleQE
For a join between CdbLocusType_Replicated and CdbLocusType_Entry or CdbLocusType_SingleQE, the Replicated locus is brought to Entry or to SingleQE respectively.

Replicated + Partitioned
For a join between CdbLocusType_Replicated and CdbLocusType_Hashed or CdbLocusType_Strewn, if the number of segments of the Replicated locus equals that of the other locus and the join is not an outer join, the join is performed with the Hashed or Strewn locus respectively. Otherwise, both parts are brought to SingleQE.

Replicated + Replicated
A join between two parts with the Replicated locus yields a final join locus that is also Replicated.

2. UNION ALL that includes a Replicated path

Two issues occurred here. When a UNION ALL contained a path with the Replicated locus and another path propagated to fewer segments than Replicated, the UNION target locus was aligned to the smallest number of segments among the operands inside the set_append_path_locus function, reducing the number of segments of the Replicated locus. This behaviour is invalid, because a Replicated locus must be executed on all segments with data. Therefore this patch forbids aligning the final UNION locus when the target locus is Replicated.
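The join rules of point 1 can be condensed into a small decision function. The following is a standalone sketch, not actual planner code: the enum, struct, and function names are hypothetical stand-ins for GPDB's CdbLocusType machinery, and it only models the cases described above.

```c
#include <stdbool.h>

/* Hypothetical mirrors of the locus types discussed in this patch. */
typedef enum {
    LOCUS_ENTRY, LOCUS_SINGLEQE, LOCUS_GENERAL, LOCUS_SEGMENTGENERAL,
    LOCUS_REPLICATED, LOCUS_HASHED, LOCUS_STREWN
} LocusType;

typedef struct { LocusType type; int numsegments; } Locus;

/* Decide the join locus when one side is Replicated, following the
 * rules described above: never shrink the Replicated slice, and never
 * redistribute or broadcast it (that would duplicate data). */
static Locus
join_with_replicated(Locus replicated, Locus other, bool is_outer_join)
{
    Locus singleqe = { LOCUS_SINGLEQE, 1 };

    switch (other.type)
    {
        case LOCUS_SEGMENTGENERAL:
            /* Replicated survives only if SegmentGeneral covers it. */
            if (other.numsegments >= replicated.numsegments)
                return replicated;
            return singleqe;            /* gather both sides to one QE */
        case LOCUS_GENERAL:
            return replicated;          /* unchanged logic: Replicated */
        case LOCUS_ENTRY:
        case LOCUS_SINGLEQE:
            return other;               /* bring Replicated to Entry/SingleQE */
        case LOCUS_HASHED:
        case LOCUS_STREWN:
            /* Join in place only on matching segments, inner joins only. */
            if (other.numsegments == replicated.numsegments && !is_outer_join)
                return other;
            return singleqe;
        case LOCUS_REPLICATED:
            return replicated;          /* Replicated + Replicated */
    }
    return singleqe;                    /* unreachable; defensive default */
}
```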
When a UNION ALL was executed with operands one of which had the Replicated locus and the other the SegmentGeneral locus, the planner failed with an assertion error from cdbpath_create_motion_path. The final locus of the UNION was decided to be Replicated, so all other operands had to become Replicated as well. The issue was that the cdbpath_create_motion_path function forbade moving a SegmentGeneral path to the Replicated locus. This patch solves the issue by leaving the SegmentGeneral locus unchanged when SegmentGeneral's number of segments is greater than or equal to Replicated's; in this case no motion is needed. Otherwise the SegmentGeneral path is broadcast from a single segment.

3. Volatile function handling

When a query had quals applied to the modifying CTE that contained volatile functions, the final plan became invalid, because its execution led to a volatile tuple set on different segments (whereas we expect the set of a Replicated locus to be the same everywhere). That could produce inconsistent results with joins and SubPlans. The issue is not limited to quals: it can also occur for volatile target lists or join quals. This patch solves it by completely prohibiting volatile functions applied to a plan or path with the Replicated locus. The functions turn_volatile_seggen_to_singleqe, create_agg_subplan, create_groupingsets_plan, create_modifytable_plan, create_motion_path, and make_subplan were extended with a condition that checks whether the given locus type is Replicated and there are volatile functions over it; if the condition is satisfied, a proper error is thrown. The changes cover volatile target lists, returning lists, plan quals, join quals, and HAVING clauses.

4. Replicated locus with a different number of segments inside SubPlans

Another issue solved by this patch occurred when the modifying CTE was referenced inside any SubPlan.
In this case the cdbllize_decorate_subplans_with_motions and fix_outer_query_motions_mutator functions tried to broadcast an already replicated plan, which could lead to data duplication. These functions therefore need to be prevented from broadcasting a result with the Replicated locus. This patch modifies both functions by adding a condition that determines whether planning proceeds or an error is thrown: if the plan's locus type is CdbLocusType_Replicated and its numsegments equals the number of segments of the target distribution, the broadcast does not occur; if the number of segments differs, an error is thrown.

5. Executor with sorted Explicit Gather Motion

The execMotionSortedReceiver function had an assert preventing Explicit Gather from working on sorted data. Since nothing actually prevents it from working correctly, this patch removes the assert, and that case now works.

Since the planner in GPDB 7 was considerably reworked, there are many changes from the original commit:
1. Changes for ParallelizeCorrelatedSubPlanMutator and ParallelizeSubplan are moved to cdbllize_decorate_subplans_with_motions.
2. An additional related fix to fix_outer_query_motions_mutator disables broadcasts for Replicated tables. Explicit Gather is still allowed, and motions to the same number of segments are omitted.
3. The fix for apply_motion is no longer needed (original case 1).
4. The original fix for cdbpath_create_motion_path is not needed, as it is already fixed (corresponds to original case 3).
5. In cdbpath_motion_for_join the whole case was missing; the original GPDB 6 version from the patch was added, with the asserts removed.
6. An additional fix for create_motion_plan corrects outer-query behavior, so MOTIONTYPE_OUTER_QUERY is no longer lost.
7. The fix for set_append_path_locus is slightly reworked because the condition is more complicated in GPDB 7.
8. Checks for volatile functions are moved to different places.
Quals are now checked in cdbpath_create_motion_path, returning lists in create_modifytable_plan, target lists in make_subplan, and HAVING clauses in create_agg_plan and create_groupingsets_plan.
9. Test output is updated for GPDB 7 (different plans). GPDB 7 also managed to correctly plan, using an Explicit Gather Motion, a query that was failing before:

```
explain (costs off)
with cte as (
    insert into with_dml_dr_seg2
    select i, i * 100 from generate_series(1,6) i
    returning i, j
)
select * from t1 where t1.i in (select i from cte) order by 1;
```

Co-Authored-By: Alexey Gordeev <[email protected]>
(cherry picked from commit 76964c0)
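Two of the guard conditions this patch adds (points 3 and 4) have a simple shape. The sketch below is a standalone illustration with hypothetical names, not the planner code itself; in reality these checks live inside the planner functions listed above and consult the path's locus and numsegments fields, using helpers such as contain_volatile_functions().

```c
#include <stdbool.h>
#include <stddef.h>

/* Point 3: volatile functions are completely prohibited over a plan or
 * path with the Replicated locus. Returns an error message, or NULL to
 * let planning continue. */
static const char *
check_volatile_over_replicated(bool locus_is_replicated, bool has_volatile)
{
    if (locus_is_replicated && has_volatile)
        return "volatile functions over a Replicated locus are not supported";
    return NULL;
}

/* Point 4: a plan that is already Replicated must not be broadcast again,
 * or data would be duplicated. If its numsegments matches the target
 * distribution, the broadcast is simply omitted; otherwise planning fails. */
typedef enum { BROADCAST_OK, BROADCAST_SKIP, BROADCAST_FAIL } BroadcastDecision;

static BroadcastDecision
decide_broadcast(bool plan_is_replicated, int plan_numsegments,
                 int target_numsegments)
{
    if (!plan_is_replicated)
        return BROADCAST_OK;    /* not Replicated: broadcast as usual */
    if (plan_numsegments == target_numsegments)
        return BROADCAST_SKIP;  /* already on every target segment */
    return BROADCAST_FAIL;      /* segment counts differ: throw an error */
}
```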
I think it is. The plan that is produced is valid:
Is this situation covered in the tests?
Actually, turns out that particular fix is now unnecessary because of the way I fixed
@silent-observer, please check the CI failure and update the branch
Add support for CTEs with modifying DML operations on replicated tables (#1168)
Several issues could occur while running queries with writable CTEs that
modify replicated tables. This patch proposes a solution for the main aspects of
handling a modifying CTE over replicated tables. Those issues appeared due to
the inability of major interfaces to handle the locus type CdbLocusType_Replicated
and the flow type FLOW_REPLICATED, which is set for the result of a modifying
operation over a replicated table inside the adjust_modifytable_flow function.
1. Choosing Join locus type
The planner failed to form a plan when trying to choose the
correct final locus type for a join between a modifying CTE over a
replicated table and any other relation. The planner was failing either with
errors, because the cdbpath_motion_for_join function could not choose a proper
join locus, or produced an invalid, suboptimal plan when handling the Replicated
locus. Therefore, this patch reconsiders the join logic for the Replicated locus.
The main principle when deciding the join locus of Replicated with others is: the
slice which performs the modifying operation on a replicated table must be executed
on all segments with data. We also have to keep in mind that redistributing or
broadcasting a Replicated locus will lead to data duplication. The joins are
performed in the following way:
SegmentGeneral + Replicated
When a join between CdbLocusType_Replicated and CdbLocusType_SegmentGeneral
occurred, the planner failed with an assert checking that a common
UPDATE/DELETE ... FROM/USING (a modifying operation with a join) of a replicated
table takes place (the root->upd_del_replicated_table == 0 condition in the
cdbpath_motion_for_join function). This assert was added by the author without
proper test coverage, and, currently, there is no case when this check can be
true.
Without the assert the planner produced a valid plan; however, it could cut the
number of segments in the final locus. In this context the Replicated locus was not
correctly handled, so one needs to add proper logic that takes the writable CTE
case into account.
This patch allows joins between CdbLocusType_Replicated and
CdbLocusType_SegmentGeneral in the following way: when SegmentGeneral's number
of segments is greater than or equal to Replicated's, the final join locus
becomes Replicated as well. Otherwise we gather both the Replicated and the
SegmentGeneral parts to SingleQE in order to perform the join at a
single segment.
Replicated + General
The logic related to General locus remained unchanged. The join locus type
becomes Replicated.
Replicated + SingleQE
If the join between CdbLocusType_Replicated and CdbLocusType_Entry or
CdbLocusType_SingleQE takes place, the Replicated locus will be brought to Entry
or to SingleQE respectively.
Replicated + Partitioned
If a join between CdbLocusType_Replicated and CdbLocusType_Hashed or
CdbLocusType_Strewn takes place, the number of segments of the Replicated locus
equals the number of segments of the other locus, and it's not an outer join, the
join is performed with the Hashed or Strewn locus respectively. Otherwise,
both parts are brought to SingleQE.
Replicated + Replicated
A join between two parts with Replicated locus leads to the final join locus
being Replicated as well.
2. UNION ALL which includes Replicated path
Here two issues occurred. When UNION ALL contained a path with Replicated locus
and another path propagated to fewer segments than Replicated, the UNION target
locus was aligned to the smallest number of segments among the operands inside
the set_append_path_locus function, which led to a reduction of segments of the
Replicated locus. This behaviour is invalid, because a Replicated locus must be
executed on all segments with data. Therefore this patch does not align the
final UNION locus when the target locus is Replicated.
When UNION ALL was executed with operands, one of which had Replicated locus
and the other SegmentGeneral, the planner failed with an assertion error from
cdbpath_create_motion_path. The final locus of the UNION was decided to be
Replicated, and therefore all other arguments had to become Replicated as well.
The issue was that the cdbpath_create_motion_path function forbade moving a
SegmentGeneral path to a Replicated locus.
This patch solves the issue by leaving the SegmentGeneral locus unchanged
when SegmentGeneral's number of segments is greater than or equal to
Replicated's; in this case no motion is needed. Otherwise the SegmentGeneral
path is broadcast from a single segment.
3. Volatile functions handling
When a query had quals which were applied to the modifying CTE and
contained volatile functions, the final plan became invalid, because its
execution led to a volatile tuple set on different segments (however, we expect
the tuple set of a Replicated locus to be the same everywhere). That could
produce inconsistent results in case of joins or SubPlans. The issue is not
limited to quals; it can also occur for volatile target lists or join quals.
This patch solves the issue by totally prohibiting volatile functions applied
to a plan or path with Replicated locus. The functions
turn_volatile_seggen_to_singleqe, create_agg_subplan, create_groupingsets_plan,
create_modifytable_plan, create_motion_path and make_subplan were extended with
a condition that checks whether the given locus type is Replicated and there are
volatile functions over it. If the condition is satisfied, a proper error is
thrown. The changes cover volatile target lists, returning lists, plan quals or
join quals, and HAVING clauses.
4. Replicated locus with different number of segments inside SubPlans
Another issue solved by this patch occurred when the modifying CTE was
referenced inside any SubPlan. In this case the
cdbllize_decorate_subplans_with_motions and fix_outer_query_motions_mutator
functions tried to broadcast an already replicated plan, which could lead to data
duplication. Therefore, one needs to prevent these functions from broadcasting
a result with Replicated locus.
This patch modifies both functions by adding a condition that decides whether
planning proceeds or an error is thrown. If the plan's locus type is
CdbLocusType_Replicated and its numsegments equals the number of segments of the
target distribution, the broadcast doesn't occur. If the number of segments
differs, an error is thrown.
5. Executor with sorted Explicit Gather Motion
The execMotionSortedReceiver function had an assert preventing Explicit Gather
Motion from working for sorted data. Since nothing actually prevents it from
working, this patch removes the assert, and that case now works correctly.
Since the planner in GPDB 7 was considerably reworked, there are a lot of
changes from the original commit:
1. Changes for ParallelizeCorrelatedSubPlanMutator and ParallelizeSubplan are
moved to cdbllize_decorate_subplans_with_motions.
2. Additional related fix to fix_outer_query_motions_mutator, disabling
broadcasts for Replicated tables. Explicit Gather is still allowed, and motions
to the same number of segments are omitted.
3. Fix for apply_motion is not needed anymore (original case 1).
4. Original fix for cdbpath_create_motion_path is not needed, already fixed
(corresponds to original case 3).
5. In cdbpath_motion_for_join, the whole case was missing; added the original
GPDB 6 version from the patch. Also removed the asserts.
6. Additional fix for create_motion_plan fixing the outer query behavior; now
MOTIONTYPE_OUTER_QUERY is not lost.
7. Fix for set_append_path_locus slightly reworked because the condition is
more complicated on GPDB 7.
8. Checks for volatile functions are moved to different places. Quals are now
checked in cdbpath_create_motion_path, returning lists in create_modifytable_plan,
target lists in make_subplan, HAVING clauses in create_agg_plan and
create_groupingsets_plan.
9. Test output is updated for GPDB 7 (different plans). Also GPDB 7 managed to
correctly plan a query that was failing before, using Explicit Gather Motion:

```
explain (costs off) with cte as (
    insert into with_dml_dr_seg2
    select i, i * 100 from generate_series(1,6) i
    returning i, j
) select * from t1
where t1.i in (select i from cte)
order by 1;
```
Co-Authored-By: Alexey Gordeev <[email protected]>
(cherry picked from commit 76964c0)
Note: do not squash to preserve authorship.
It's easier to review this patch with "Hide whitespace"