Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement clusterStagedUpdateRun execution #1000

Merged
merged 2 commits into from
Dec 30, 2024
Merged

Conversation

jwtty
Copy link
Contributor

@jwtty jwtty commented Dec 23, 2024

Description of your changes

Implement clusterStagedUpdateRun execution. Also made some refactoring.

Fixes #

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

continue
}
// The cluster status is not deleting yet
if err := r.Client.Delete(ctx, binding); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compared with the original implementation, I did not check if binding is up-to-date before deleting. I don't think it's necessary. Do we want that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do support multiple update run in parallel, I am not sure if there is any race condition that we may delete some other run's binidng.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of both resource changes and scheduling policy changes, the binding to be deleted may not have up-to-date resources. If there are concurrent update runs, the same binding could be deleted by either updateRun. If there's a new scheduling policy change that marks this to-be-deleted binding as scheduled again, this updateRun can fail during validation. Or if there's chance that the change happens right after the validation passes and before deleting the bindings, I think I can add a check to make sure the binding's state is unScheduled or report unexpected error if not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have a concern: in rollout controller, we make sure to complete the deletion of the bindings on a cluster before scheduling a newly created binding on the same cluster: https://github.com/Azure/fleet/blob/main/pkg/controllers/rollout/controller.go#L226. I wonder if we should have similar logic in the updateRun controller too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the concurrent updateRun is not valid if the schedule results are different. This is only for the case one needs to release different versions of the resources so the bindings should all be scheduled or bound.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the check no duplicate logic, should the check at

if binding.Spec.State != placementv1beta1.BindingStateScheduled && binding.Spec.State != placementv1beta1.BindingStateBound {
be suffice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the concurrent updateRun is not valid if the schedule results are different. This is only for the case one needs to release different versions of the resources so the bindings should all be scheduled or bound."

In this case, the bindings won't be marked as to-be-deleted because they both have the latest scheduling snapshot index.

pkg/controllers/updaterun/controller.go Show resolved Hide resolved
pkg/controllers/updaterun/execution.go Show resolved Hide resolved
continue
}
// The cluster status is not deleting yet
if err := r.Client.Delete(ctx, binding); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do support multiple update run in parallel, I am not sure if there is any race condition that we may delete some other run's binidng.

pkg/controllers/updaterun/execution.go Show resolved Hide resolved
pkg/controllers/updaterun/validation.go Show resolved Hide resolved
pkg/controllers/updaterun/execution_integration_test.go Outdated Show resolved Hide resolved
pkg/controllers/updaterun/controller.go Show resolved Hide resolved
pkg/controllers/updaterun/execution.go Show resolved Hide resolved
pkg/controllers/updaterun/execution.go Show resolved Hide resolved
@jwtty jwtty merged commit 61f7292 into Azure:main Dec 30, 2024
12 checks passed
@jwtty jwtty deleted the stagerun-exec branch December 30, 2024 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants