-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement clusterStagedUpdateRun execution #1000
Conversation
continue | ||
} | ||
// The cluster status is not deleting yet | ||
if err := r.Client.Delete(ctx, binding); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared with the original implementation, I did not check if binding is up-to-date before deleting. I don't think it's necessary. Do we want that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do support multiple update run in parallel, I am not sure if there is any race condition that we may delete some other run's binidng.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of both resource changes and scheduling policy changes, the binding to be deleted may not have up-to-date resources. If there are concurrent update runs, the same binding could be deleted by either updateRun. If there's a new scheduling policy change that marks this to-be-deleted binding as scheduled again, this updateRun can fail during validation. Or if there's chance that the change happens right after the validation passes and before deleting the bindings, I think I can add a check to make sure the binding's state is unScheduled or report unexpected error if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do have a concern: in rollout controller, we make sure to complete the deletion of the bindings on a cluster before scheduling a newly created binding on the same cluster: https://github.com/Azure/fleet/blob/main/pkg/controllers/rollout/controller.go#L226. I wonder if we should have similar logic in the updateRun controller too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the concurrent updateRun is not valid if the schedule results are different. This is only for the case one needs to release different versions of the resources so the bindings should all be scheduled or bound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the check no duplicate logic, should the check at
if binding.Spec.State != placementv1beta1.BindingStateScheduled && binding.Spec.State != placementv1beta1.BindingStateBound { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the concurrent updateRun is not valid if the schedule results are different. This is only for the case one needs to release different versions of the resources so the bindings should all be scheduled or bound."
In this case, the bindings won't be marked as to-be-deleted because they both have the latest scheduling snapshot index.
continue | ||
} | ||
// The cluster status is not deleting yet | ||
if err := r.Client.Delete(ctx, binding); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do support multiple update run in parallel, I am not sure if there is any race condition that we may delete some other run's binidng.
Description of your changes
Implement clusterStagedUpdateRun execution. Also made some refactoring.
Fixes #
I have:
make reviewable
to ensure this PR is ready for review.How has this code been tested
Special notes for your reviewer