Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADBDEV-4152: Optimize the Explicit Redistribute application logic #666

Merged
merged 85 commits into from
Mar 21, 2024
Merged
Show file tree
Hide file tree
Changes from 70 commits
Commits
Show all changes
85 commits
Select commit Hold shift + click to select a range
fa1e719
Check for Motions before scan if requesting Explicit Redistribute Motion
bandetto Jan 30, 2024
51960cd
Update tests
bandetto Jan 30, 2024
4d5edb5
Update tests for optimizer
bandetto Jan 30, 2024
cbb6f6b
Add original queries to tests
bandetto Jan 30, 2024
684ac6a
Add cases with inherited tables to tests
bandetto Jan 30, 2024
da58c43
Clean up
bandetto Jan 30, 2024
208903b
Apply remarks
bandetto Jan 30, 2024
848cc16
Run pgindent
bandetto Jan 30, 2024
b24fea9
Look into hashExprs for Redistribute Motion
bandetto Jan 30, 2024
2144872
Move the logic to cdbmutate.c
bandetto Jan 30, 2024
671b439
Change comments, since varnoold is used
bandetto Jan 30, 2024
7c19171
Run pgindent
bandetto Jan 30, 2024
e61fbf3
Bring back resetting slice depth for InitPlan nodes
bandetto Jan 30, 2024
e0e839c
Save state if there is several ModifyTable nodes
bandetto Jan 30, 2024
aad110b
Fix case when there are multiple subtrees with scan on the same table
bandetto Jan 30, 2024
b775728
Clean up
bandetto Jan 30, 2024
8e69fae
Remove extra checks
bandetto Jan 30, 2024
9ebcff5
Fix formatting
bandetto Jan 30, 2024
951ece2
Fix isChecking being enabled for INSERT ModifyTable after first Expli…
bandetto Jan 30, 2024
049d111
Revert "Fix isChecking being enabled for INSERT ModifyTable after fir…
bandetto Jan 30, 2024
db1b2f4
Restore state only for UPDATE/DELETE
bandetto Jan 30, 2024
ec27689
Simplify the logic and don't look into motions' targetlist
bandetto Jan 30, 2024
894ffd3
Fix a test
bandetto Jan 30, 2024
7d9eea9
Don't save previous context (we can't have two ModifyTable nodes in t…
bandetto Jan 30, 2024
3531176
Revert "Change comments, since varnoold is used"
bandetto Jan 30, 2024
0225657
Describe ModifyTableState members
bandetto Jan 30, 2024
5281022
Change null assignments to asserts
bandetto Jan 30, 2024
7efdd12
Change List to Bitmapset
bandetto Jan 30, 2024
4f609c4
Use getrelid macro
bandetto Jan 30, 2024
fe9905b
Continue checking only when we reach the Explicit Redistribute Motion
bandetto Jan 30, 2024
98132eb
Revert "Fix a test"
bandetto Jan 30, 2024
d5f68e1
Ignore motions in InitPlans
bandetto Jan 30, 2024
31f4f3e
Fix comment in test
bandetto Jan 30, 2024
5a53ecd
Add a test that requies Explicit Redistribute Motion
bandetto Jan 30, 2024
bf60efa
Add a test for motions in InitPlans
bandetto Jan 30, 2024
a60385f
Squash if condition
bandetto Jan 30, 2024
8d91f93
Add a comment
bandetto Jan 30, 2024
c0d7269
Squash if condition
bandetto Jan 30, 2024
a99c4b1
Fix nMotionsAbove not decrementing when we already found motions
bandetto Jan 30, 2024
43b562b
Revert "Fix nMotionsAbove not decrementing when we already found moti…
bandetto Jan 30, 2024
2f170f8
Reset nMotionsAbove and needExplicitMotion when we are exiting recursion
bandetto Jan 30, 2024
44e7492
Check if scan retrieves ctid
bandetto Jan 30, 2024
7eb73d5
Add missing assert
bandetto Jan 30, 2024
b6f5049
Revert "Check if scan retrieves ctid"
bandetto Jan 30, 2024
b0bebc8
Check for matching RTIs instead of relation IDs
bandetto Jan 30, 2024
38beaa2
Get rid of rti variable
bandetto Feb 9, 2024
b8c919c
Add tests for partitions and swapped trees
bandetto Feb 9, 2024
4b98b7b
Fix typos in tests, use the same name for tables
bandetto Feb 9, 2024
d2c801a
Simplify motion conditions
bandetto Feb 9, 2024
11be3ca
Fix comments
bandetto Feb 9, 2024
73aac4c
Ignore DROP TABLE
bandetto Feb 9, 2024
053ea46
Remove scan types that aren't used for UPDATE
bandetto Feb 21, 2024
289068b
Count gather motions as well
bandetto Feb 21, 2024
5f2299f
Add test for gather motion
bandetto Feb 21, 2024
a270ca1
Fix comment
bandetto Feb 27, 2024
8d7fd19
Remove extra break
bandetto Mar 6, 2024
2970372
Clean up tests
bandetto Mar 6, 2024
4bef525
Remove flaky ORCA test
bandetto Mar 7, 2024
16fc477
Remove scans that can't cause an Explicit Motion
bandetto Mar 7, 2024
de565c2
Add comment about the list of scan nodes
bandetto Mar 7, 2024
bf188a4
Adjust comments
bandetto Mar 7, 2024
267077f
Remove the braces
bandetto Mar 7, 2024
4ffca2a
Remove unneeded verbose, only check test cases with changed plans
bandetto Mar 7, 2024
9d35370
Update comments
bandetto Mar 11, 2024
54dec02
Stop checking if we're out of ModifyTable
bandetto Mar 11, 2024
a628050
Squash if condition
bandetto Mar 11, 2024
4b43f33
Get rid of ModifyTableMotionState
bandetto Mar 14, 2024
d8ec1f0
Reorder members and update comments
bandetto Mar 14, 2024
63f63cd
Remove blank line
bandetto Mar 14, 2024
4301bf6
Use unary increment and decrement
bandetto Mar 14, 2024
71dbb77
Force a motion in InitPlan test case
bandetto Mar 15, 2024
5e69424
Replace bitmapset with a pointer to existing List and get rid of need…
bandetto Mar 15, 2024
e32d271
Replace loop with search
bandetto Mar 15, 2024
d927e62
Squash if condition
bandetto Mar 15, 2024
2d4550c
Squash if condition 2
bandetto Mar 15, 2024
8504082
if -> else if
bandetto Mar 15, 2024
3e6acf1
Fix alignment
bandetto Mar 15, 2024
bc89960
Try to restore ABI
bandetto Mar 15, 2024
cd9b81f
Merge branch 'adb-6.x-dev' into ADBDEV-4152
bandetto Mar 15, 2024
3f250be
Revert "Try to restore ABI"
bandetto Mar 15, 2024
24a0b49
mtResultRtis -> mtResultRelations
bandetto Mar 15, 2024
f2b9cbd
Get rid of extra logic and tests related to InitPlans
bandetto Mar 18, 2024
15d2190
Merge branch 'adb-6.x-dev' into ADBDEV-4152
andr-sokolov Mar 18, 2024
b19810b
Update a comment
bandetto Mar 18, 2024
d00b0e3
Revert "Get rid of extra logic and tests related to InitPlans"
bandetto Mar 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
165 changes: 138 additions & 27 deletions src/backend/cdb/cdbmutate.c
Original file line number Diff line number Diff line change
Expand Up @@ -75,8 +75,18 @@ typedef struct ApplyMotionState
* plan_tree_walker/mutator */
int nextMotionID;
int sliceDepth;
bool containMotionNodes;
HTAB *planid_subplans; /* hash table for InitPlanItem */

/* Context for ModifyTable to elide Explicit Redistribute Motion */
bool mtIsChecking; /* True if we encountered ModifyTable
* node with UPDATE/DELETE and we plan
* to insert Explicit Motions. */
Bitmapset *mtResultRtis; /* Indexes into rtable for relations to
* be modified. Only valid if mtIsChecking
* is true. */
bool needExplicitMotion;
andr-sokolov marked this conversation as resolved.
Show resolved Hide resolved
int nMotionsAbove; /* Number of motions above the current
* node */
} ApplyMotionState;

typedef struct InitPlanItem
Expand Down Expand Up @@ -404,7 +414,11 @@ apply_motion(PlannerInfo *root, Plan *plan, Query *query)
* plan */
state.nextMotionID = 1; /* Start at 1 so zero will mean "unassigned". */
state.sliceDepth = 0;
state.containMotionNodes = false;
state.mtResultRtis = NULL;
state.needExplicitMotion = false;
state.nMotionsAbove = 0;
state.mtIsChecking = false;

memset(&ctl, 0, sizeof(ctl));
ctl.keysize = sizeof(int);
ctl.entrysize = sizeof(InitPlanItem);
Expand Down Expand Up @@ -755,14 +769,14 @@ apply_motion_mutator(Node *node, ApplyMotionState *context)
if (!is_plan_node(node))
{
/*
* The containMotionNodes flag keeps track of whether there are any
* Motion nodes, ignoring any in InitPlans. So if we recurse into an
* InitPlan, save and restore the flag.
* mtIsChecking is true whether we are checking for motions underneath
* to add Explicit Reditribute Motion, ignoring any in InitPlans. So if
* we recurse into an InitPlan, save it and temporarily set it to false.
*/
if (IsA(node, SubPlan) &&((SubPlan *) node)->is_initplan)
{
bool found;
bool saveContainMotionNodes = context->containMotionNodes;
bool saveMtIsChecking = context->mtIsChecking;
int saveSliceDepth = context->sliceDepth;
SubPlan *subplan = (SubPlan *) node;
/*
Expand All @@ -777,9 +791,10 @@ apply_motion_mutator(Node *node, ApplyMotionState *context)

/* reset sliceDepth for each init plan */
context->sliceDepth = 0;
context->mtIsChecking = false;
node = plan_tree_mutator(node, apply_motion_mutator, context);

context->containMotionNodes = saveContainMotionNodes;
context->mtIsChecking = saveMtIsChecking;
context->sliceDepth = saveSliceDepth;

return node;
Expand All @@ -788,6 +803,102 @@ apply_motion_mutator(Node *node, ApplyMotionState *context)
return plan_tree_mutator(node, apply_motion_mutator, context);
}

/*
* For UPDATE/DELETE, we check if there's any motions before scan in the
* same subtree for the table we're going to modify. If we encounter the
* scan before any motions, then we can elide unnecessary Explicit
* Redistribute Motion.
*/
if (IsA(node, ModifyTable))
{
ModifyTable *mt = (ModifyTable *) node;

if (mt->operation == CMD_UPDATE || mt->operation == CMD_DELETE)
{
ListCell *lcr;

/*
* Sanity check, since we don't allow multiple ModifyTable nodes.
*/
Assert(context->mtResultRtis == NULL);
Assert(context->nMotionsAbove == 0);
Assert(!context->needExplicitMotion);
Assert(!context->mtIsChecking);

/*
* When UPDATE/DELETE occurs on a partitioned table, or a table that
* is a part of inheritance tree, ModifyTable node will have more
* than one relation in resultRelations.
*
* We make a set of resulting relations' indexes to compare them
* later.
*/
foreach(lcr, mt->resultRelations)
{
context->mtResultRtis = bms_add_member(context->mtResultRtis,
andr-sokolov marked this conversation as resolved.
Show resolved Hide resolved
lfirst_int(lcr));
}

context->mtIsChecking = true;
}
}

if (context->mtIsChecking)
{

/* Remember if we are descending into a motion node. */
if (IsA(node, Motion))
context->nMotionsAbove++;

/*
* If this is a scan and it's scanrelid matches ModifyTable's relid,
* we need to check if there were any motions above.
*/
else
andr-sokolov marked this conversation as resolved.
Show resolved Hide resolved
{
/*
* These are scan nodes that can be used to perform distributed
* UPDATE/DELETE on the relation they scan, possibly with motions
* above them. This list needs to be updated for other nodes if they
* are changed to support DML execution on segments.
*/
switch (nodeTag(node))
{
case T_SeqScan:
case T_DynamicSeqScan:
case T_IndexScan:
case T_DynamicIndexScan:
case T_IndexOnlyScan:
case T_BitmapIndexScan:
case T_DynamicBitmapIndexScan:
case T_BitmapHeapScan:
case T_DynamicBitmapHeapScan:
case T_TidScan:
{
RekGRpth marked this conversation as resolved.
Show resolved Hide resolved
Scan *scan = (Scan *) node;

if (bms_is_member(scan->scanrelid, context->mtResultRtis))
{
/*
* We need Explicit Redistribute Motion only if
* there were any motions above.
*/
context->needExplicitMotion = context->nMotionsAbove > 0;

/*
* We don't need to check other nodes in this
* subtree anymore.
*/
context->mtIsChecking = false;
}
}
break;
andr-sokolov marked this conversation as resolved.
Show resolved Hide resolved
default:
break;
}
}
}

plan = (Plan *) node;
flow = plan->flow;

Expand Down Expand Up @@ -936,17 +1047,11 @@ apply_motion_mutator(Node *node, ApplyMotionState *context)
break;

case MOVEMENT_EXPLICIT:

/*
* add an ExplicitRedistribute motion node only if child plan
* nodes have a motion node
* Were there any motions above the scan?
*/
if (context->containMotionNodes)
if (context->needExplicitMotion)
{
/*
* motion node in child nodes: add a ExplicitRedistribute
* motion
*/
newnode = (Node *) make_explicit_motion(plan,
flow->segidColIdx,
true /* useExecutorVarFormat */
Expand All @@ -955,13 +1060,21 @@ apply_motion_mutator(Node *node, ApplyMotionState *context)
else
{
/*
* no motion nodes in child plan nodes - no need for
* ExplicitRedistribute: restore flow
* Restore flow if Explicit Redistribute Motion is not needed
*/
flow->req_move = MOVEMENT_NONE;
flow->flow_before_req_move = NULL;
}

/*
* If we're here, it means we are directly under the ModifyTable
* node. We are about to go to out of the recursion and go into
* other subtree. So reset the state and continue checking in case
* of another Explicit Redistribute Motion is needed.
*/
context->mtIsChecking = true;
context->needExplicitMotion = false;
andr-sokolov marked this conversation as resolved.
Show resolved Hide resolved
context->nMotionsAbove = 0;
break;

case MOVEMENT_NONE:
Expand Down Expand Up @@ -1009,16 +1122,14 @@ apply_motion_mutator(Node *node, ApplyMotionState *context)
plan->nMotionNodes = context->nextMotionID - saveNextMotionID;
plan->nInitPlans = hash_get_num_entries(context->planid_subplans) - saveNumInitPlans;

/*
* Remember if this was a Motion node. This is used at the top of the
* tree, with MOVEMENT_EXPLICIT, to avoid adding an explicit motion, if
* there were no Motion in the subtree. Note that this does not take
* InitPlans containing Motion nodes into account. InitPlans are executed
* as a separate step before the main plan, and hence any Motion nodes in
* them don't need to affect the way the main plan is executed.
*/
if (IsA(newnode, Motion))
context->containMotionNodes = true;
if (context->mtIsChecking)
{
/* We're going out of this motion node. */
if (IsA(node, Motion))
context->nMotionsAbove--;
else if (IsA(node, ModifyTable))
context->mtIsChecking = false;
}

return newnode;
} /* apply_motion_mutator */
Expand Down
17 changes: 8 additions & 9 deletions src/test/regress/expected/bfv_dml.out
Original file line number Diff line number Diff line change
Expand Up @@ -521,17 +521,16 @@ alter table bar set with(REORGANIZE=false) distributed randomly;
analyze foo;
analyze bar;
explain delete from foo using bar;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
QUERY PLAN
--------------------------------------------------------------------------------------------------
Delete on foo (cost=10000000000.00..10000000009.98 rows=34 width=16)
-> Explicit Redistribute Motion 3:3 (slice2; segments: 3) (cost=10000000000.00..10000000009.98 rows=34 width=16)
-> Nested Loop (cost=10000000000.00..10000000009.98 rows=34 width=16)
-> Seq Scan on foo (cost=0.00..3.10 rows=4 width=10)
-> Materialize (cost=0.00..3.65 rows=10 width=6)
-> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..3.50 rows=10 width=6)
-> Seq Scan on bar (cost=0.00..3.10 rows=4 width=6)
-> Nested Loop (cost=10000000000.00..10000000009.98 rows=34 width=16)
-> Seq Scan on foo (cost=0.00..3.10 rows=4 width=10)
-> Materialize (cost=0.00..3.65 rows=10 width=6)
-> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..3.50 rows=10 width=6)
-> Seq Scan on bar (cost=0.00..3.10 rows=4 width=6)
Optimizer: Postgres query optimizer
(8 rows)
(7 rows)

delete from foo using bar;
drop table foo;
Expand Down
2 changes: 0 additions & 2 deletions src/test/regress/expected/gangsize.out
Original file line number Diff line number Diff line change
Expand Up @@ -173,13 +173,11 @@ end;
INFO: Distributed transaction command 'Distributed Commit (one-phase)' to PARTIAL contents: 0 1
update random_2_0 set a = 1 from hash_3_3_2 where hash_3_3_2.b = random_2_0.c;
INFO: (slice 1) Dispatch command to ALL contents: 0 1 2
INFO: (slice 2) Dispatch command to PARTIAL contents: 0 1
INFO: (slice 0) Dispatch command to PARTIAL contents: 0 1
INFO: Distributed transaction command 'Distributed Commit (one-phase)' to ALL contents: 0 1 2
begin;
update random_2_0 set a = 1 from hash_3_3_2 where hash_3_3_2.b = random_2_0.c;
INFO: (slice 1) Dispatch command to ALL contents: 0 1 2
INFO: (slice 2) Dispatch command to PARTIAL contents: 0 1
INFO: (slice 0) Dispatch command to PARTIAL contents: 0 1
end;
INFO: Distributed transaction command 'Distributed Commit (one-phase)' to ALL contents: 0 1 2
Expand Down
2 changes: 0 additions & 2 deletions src/test/regress/expected/gangsize_optimizer.out
Original file line number Diff line number Diff line change
Expand Up @@ -171,13 +171,11 @@ end;
INFO: Distributed transaction command 'Distributed Commit (one-phase)' to PARTIAL contents: 0 1
update random_2_0 set a = 1 from hash_3_3_2 where hash_3_3_2.b = random_2_0.c;
INFO: (slice 1) Dispatch command to ALL contents: 0 1 2
INFO: (slice 2) Dispatch command to PARTIAL contents: 0 1
INFO: (slice 0) Dispatch command to PARTIAL contents: 0 1
INFO: Distributed transaction command 'Distributed Commit (one-phase)' to ALL contents: 0 1 2
begin;
update random_2_0 set a = 1 from hash_3_3_2 where hash_3_3_2.b = random_2_0.c;
INFO: (slice 1) Dispatch command to ALL contents: 0 1 2
INFO: (slice 2) Dispatch command to PARTIAL contents: 0 1
INFO: (slice 0) Dispatch command to PARTIAL contents: 0 1
end;
INFO: Distributed transaction command 'Distributed Commit (one-phase)' to ALL contents: 0 1 2
Expand Down
56 changes: 27 additions & 29 deletions src/test/regress/expected/gp_unique_rowid.out
Original file line number Diff line number Diff line change
Expand Up @@ -192,22 +192,21 @@ where e.x in
select b from t2_12512
)
;
QUERY PLAN
---------------------------------------------------------------------------
QUERY PLAN
---------------------------------------------------------------------
Update on t_12512
-> Explicit Redistribute Motion 3:3 (slice2; segments: 3)
-> HashAggregate
Group Key: t_12512.ctid, x.ctid
-> Hash Join
Hash Cond: (x.x = t2_12512.b)
-> Nested Loop
-> Seq Scan on t_12512
-> Function Scan on x
-> Hash
-> Broadcast Motion 3:3 (slice1; segments: 3)
-> Seq Scan on t2_12512
-> HashAggregate
Group Key: t_12512.ctid, x.ctid
-> Hash Join
Hash Cond: (x.x = t2_12512.b)
-> Nested Loop
-> Seq Scan on t_12512
-> Function Scan on x
-> Hash
-> Broadcast Motion 3:3 (slice1; segments: 3)
-> Seq Scan on t2_12512
Optimizer: Postgres query optimizer
(13 rows)
(12 rows)

update t_12512 set b = 1
from
Expand All @@ -231,23 +230,22 @@ where e.x in
select b from t2_12512
)
;
QUERY PLAN
---------------------------------------------------------------------------
QUERY PLAN
---------------------------------------------------------------------
Update on t_12512
-> Explicit Redistribute Motion 3:3 (slice2; segments: 3)
-> HashAggregate
Group Key: t_12512.ctid, "*VALUES*".ctid
-> Hash Join
Hash Cond: ("*VALUES*".column1 = t2_12512.b)
-> Nested Loop
-> Seq Scan on t_12512
-> Materialize
-> Values Scan on "*VALUES*"
-> Hash
-> Broadcast Motion 3:3 (slice1; segments: 3)
-> Seq Scan on t2_12512
-> HashAggregate
Group Key: t_12512.ctid, "*VALUES*".ctid
-> Hash Join
Hash Cond: ("*VALUES*".column1 = t2_12512.b)
-> Nested Loop
-> Seq Scan on t_12512
-> Materialize
-> Values Scan on "*VALUES*"
-> Hash
-> Broadcast Motion 3:3 (slice1; segments: 3)
-> Seq Scan on t2_12512
Optimizer: Postgres query optimizer
(14 rows)
(13 rows)

update t_12512 set b = 1
from
Expand Down
21 changes: 10 additions & 11 deletions src/test/regress/expected/gporca.out
Original file line number Diff line number Diff line change
Expand Up @@ -11861,19 +11861,18 @@ select b, count(*) from gpexp_hash group by b order by b;
(11 rows)

explain update gpexp_rand set b=(select b from gpexp_hash where gpexp_rand.a = gpexp_hash.a);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Update on gpexp_rand (cost=0.00..133.75 rows=25 width=14)
-> Explicit Redistribute Motion 2:2 (slice2; segments: 2) (cost=0.00..133.75 rows=25 width=14)
-> Seq Scan on gpexp_rand (cost=0.00..133.75 rows=25 width=14)
SubPlan 1 (slice2; segments: 2)
-> Result (cost=0.00..2.63 rows=1 width=4)
Filter: (gpexp_rand.a = gpexp_hash.a)
-> Materialize (cost=0.00..2.63 rows=1 width=4)
-> Broadcast Motion 2:2 (slice1; segments: 2) (cost=0.00..2.62 rows=1 width=4)
-> Seq Scan on gpexp_hash (cost=0.00..2.62 rows=1 width=4)
-> Seq Scan on gpexp_rand (cost=0.00..133.75 rows=25 width=14)
SubPlan 1 (slice0; segments: 2)
-> Result (cost=0.00..2.63 rows=1 width=4)
Filter: (gpexp_rand.a = gpexp_hash.a)
-> Materialize (cost=0.00..2.63 rows=1 width=4)
-> Broadcast Motion 2:2 (slice1; segments: 2) (cost=0.00..2.62 rows=1 width=4)
-> Seq Scan on gpexp_hash (cost=0.00..2.62 rows=1 width=4)
Optimizer: Postgres query optimizer
(10 rows)
(9 rows)

update gpexp_rand set b=(select b from gpexp_hash where gpexp_rand.a = gpexp_hash.a);
select b, count(*) from gpexp_rand group by b order by b;
Expand Down
Loading
Loading