Execute coordinator piece of DDLJobs immediately #7325
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##             main    #7325      +/-   ##
==========================================
- Coverage   89.58%   89.01%   -0.57%
==========================================
  Files         275      275
  Lines       59613    59664      +51
  Branches     7428     7438      +10
==========================================
- Hits        53403    53109     -294
- Misses       4079     4423     +344
- Partials     2131     2132       +1
```
```
@@ -632,6 +637,32 @@ citus_ProcessUtilityInternal(PlannedStmt *pstmt,
	{
		ddlJobs = ops->preprocess(parsetree, queryString, context);
	}

	if (ddlJobs != NIL)
```
I guess this isn't a completely generic solution at this point, because it does not work when postprocess returns the ddlJobs. So for DDL where postprocess is the one that's needed, we'd need some other solution (e.g. only taking the locks on the coordinator and not executing the SQL yet). I think this PR seems like a good first step though, and the postprocess case could be solved in a follow-up PR.
> I guess this isn't a completely generic solution at this point, because it does not work when postprocess returns the ddlJobs.
Correct.
> So for DDL where postprocess is the one that's needed, we'd need some other solution (e.g. only taking the locks on the coordinator and not executing the SQL yet).
Right, we either need this^ or need to move the Postprocess pieces to the Preprocess stage for the commands we want to support from any node.
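For context, here is a simplified sketch of the utility-hook flow being discussed. Only the `ops->preprocess(...)` call and `ExecuteDistributedDDLJob()` appear in the diff; the split helper name and the surrounding structure are assumptions made purely for illustration, not the actual implementation:

```c
/* simplified sketch of citus_ProcessUtilityInternal, not the actual code */
List *ddlJobs = NIL;

if (ops && ops->preprocess)
{
	/* preprocess runs before the command is executed locally */
	ddlJobs = ops->preprocess(parsetree, queryString, context);
}

if (ddlJobs != NIL)
{
	/*
	 * Split each DDLJob into a coordinator piece and an "other nodes"
	 * piece, then run the coordinator piece immediately so that locks on
	 * the distributed (shell) table are acquired on the coordinator first.
	 * SplitDDLJobForCoordinator() is a hypothetical name for that helper.
	 */
	DDLJob *ddlJob = NULL;
	foreach_ptr(ddlJob, ddlJobs)
	{
		DDLJob *coordinatorJob = NULL;
		DDLJob *nonCoordinatorNodesJob = NULL;
		SplitDDLJobForCoordinator(ddlJob, &coordinatorJob, &nonCoordinatorNodesJob);

		if (coordinatorJob)
		{
			ExecuteDistributedDDLJob(coordinatorJob);
		}
	}
}

/* the command then runs locally via standardProcessUtility ... */

/*
 * ... and postprocess runs afterwards. DDLJobs produced only in postprocess
 * are not covered by the scheme above, which is the gap discussed in this
 * thread.
 */
```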
```
	if (*coordinatorDDLJob && *nonCoordinatorNodesDDLJob)
	{
		/* defer the metadata sync to "other nodes" phase */
```
The PR doesn't have any description, so I cannot follow the goal.
But isn't this complicated to reason about? E.g., now some of the DDLJobs are different.
If the goal is to support DDL from any node, wouldn't it be simpler to have something like #7016?
Afaict #7016 would not be enough to solve the problem. My thinking is that if we execute standardProcessUtility first on the current node, it will actually take locks on the distributed shell table on the current node first (which could be a worker). Instead we want those locks to be taken on the coordinator first to avoid deadlocks. Sorting the workers where to send DDL would prevent deadlocks when actually changing the shards, but it will not prevent deadlocks that occur when changing the distributed shell table.
This PR, #7016 and #6021 are all prone to distributed deadlocks. I think we should aim for simplicity at this point unless we want to implement DDL from any node properly -- which we didn't know how to do at the time.
I think it could even be more acceptable to generalize #6021; it is at least easier to reason about.
To avoid the deadlocks, we could be more aggressive and acquire a global advisory lock on the first worker node per OID of the object changed. Such an approach differs from PG in terms of concurrency, but it is at least simple and easy to explain to users (e.g., we allow one DDL from workers at a time).
Well, this is a tough problem, and I hope it doesn't create too much complexity in the code & architecture.
> To avoid the deadlocks, we could be more aggressive and acquire a global advisory lock on the first worker node per OID of the object changed.
This would help avoid the deadlocks for the cases where we acquire this adv lock. However, say distributed_ddl (which is allowed from any node) can get into a distributed deadlock with citus_operation() (which is only allowed from the coordinator); then we need to not forget to acquire this adv lock in those operations too.
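For illustration, a minimal sketch of the advisory-lock idea above. `SendCommandToFirstWorkerNode()` is a hypothetical helper standing in for however the command would actually be shipped; the lock function itself is plain PostgreSQL (`pg_advisory_xact_lock`), and the key layout is an arbitrary choice made for this sketch:

```c
/*
 * Serialize DDL on a given object across the cluster by taking a
 * transaction-scoped advisory lock on one well-known node (e.g. the first
 * worker), keyed by the OID of the object being changed.
 */
static void
AcquireGlobalDDLAdvisoryLock(Oid objectId)
{
	/* arbitrary tag so the key space doesn't collide with other lock users */
	const int64 ddlLockTag = 0xC17;
	int64 lockKey = (ddlLockTag << 32) | (int64) objectId;

	StringInfo lockCommand = makeStringInfo();
	appendStringInfo(lockCommand,
					 "SELECT pg_advisory_xact_lock(" INT64_FORMAT ")",
					 lockKey);

	/*
	 * Always target the same node so that two backends changing the same
	 * object serialize on one lock instead of deadlocking across nodes.
	 */
	SendCommandToFirstWorkerNode(lockCommand->data);	/* hypothetical helper */
}
```

As noted above, this trades concurrency for simplicity: only one DDL per object can be in flight at a time, and the lock would also have to be taken by coordinator-only operations that can conflict with it.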
```
	if (coordinatorJob)
	{
		ExecuteDistributedDDLJob(coordinatorJob);
```
Just fyi:
There are some good reasons for running DDL commands via standardProcessUtility locally first. For example, if a command fails, you get the error on the table instead of on the shards -- shard-level errors have been very confusing.
Similarly, if there are permission problems or any other problems, Citus does not use any resources (e.g., connections) before doing things locally.
> if a command fails, you get the error on the table instead of on the shards -- shard-level errors have been very confusing.
That's indeed a good goal, but I don't think that's a big issue as long as we send the actual metadata updating DDL command first to the coordinator (instead of a shard command). You'd still get the "while running on coordinator.some.host.com:5432" DETAIL, but that seems rather harmless.
> Similarly, if there are permission problems or any other problems, Citus does not use any resources (e.g., connections) before doing things locally.
I don't think this is an issue we really need to worry about. If an attacker is trying to exhaust resources, then this does not seem like a very smart way of doing so (e.g. run_command_on_workers + generate_series + cross join seems a much easier way to bring stuff down). And if it's by accident I don't think opening a connection is a huge resource hog, and probably a connection is already cached anyway.
> That's indeed a good goal, but I don't think that's a big issue as long as we send the actual metadata updating DDL command first to the coordinator (instead of a shard command). You'd still get the "while running on coordinator.some.host.com:5432", but that seems rather harmless.
That caused quite a bit of confusion for the users in the past, so I would not underestimate the impact of that UX change.
```
	(*nonCoordinatorNodesDDLJob)->taskList = NIL;

	Task *task = NULL;
	foreach_ptr(task, ddlJob->taskList)
```
I have not thought enough about this, but I sense some possible deadlock scenarios if we run DDL from workers concurrently, so let's be careful about that (e.g., run DDL commands from workers via pgbench and make sure there are no deadlocks).
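For reference, a minimal sketch of what the task-list split in the hunk above presumably does; `TaskTargetsCoordinator()` is a hypothetical predicate standing in for whatever placement/group check the PR actually uses:

```c
/* split one DDLJob's tasks into a coordinator job and an "other nodes" job */
(*coordinatorDDLJob)->taskList = NIL;
(*nonCoordinatorNodesDDLJob)->taskList = NIL;

Task *task = NULL;
foreach_ptr(task, ddlJob->taskList)
{
	if (TaskTargetsCoordinator(task))	/* hypothetical placement check */
	{
		(*coordinatorDDLJob)->taskList =
			lappend((*coordinatorDDLJob)->taskList, task);
	}
	else
	{
		(*nonCoordinatorNodesDDLJob)->taskList =
			lappend((*nonCoordinatorNodesDDLJob)->taskList, task);
	}
}
```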
Force-pushed from 2f1d068 to 78be00b
DESCRIPTION: Adds support for issuing role management commands from worker nodes. It's unlikely to get into a distributed deadlock with role commands, so we don't care much about them at the moment. There were several attempts to reduce the chances of a deadlock, but we didn't get any of them merged into the main branch yet; see: #7325 #7016 #7009
Force-pushed from 6bab65a to 214ea8b