Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cluster-wide in-place upgrade proposal #55

Merged

Conversation

HomayoonAlimohammadi
Copy link
Contributor

@HomayoonAlimohammadi HomayoonAlimohammadi commented Sep 11, 2024

This PR adds the cluster-wide in-place upgrade proposal, a folllow-up on the original in-place upgrade proposal and should be merged after that.

Copy link
Member

@berkayoz berkayoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work and thank you for picking this up, on some previous discussions we've discussed the need for phases/steps for upgrades where control-plane nodes and worker-nodes can be upgraded separately.

For this I believe we can utilize the CK8sControlPlane and MachineDeployment objects and check the annotations on these objects. This would mean creating 2 reconcilers. The phasing/step control would be left to the user as in user can choose to upgrade annotate the CK8sControlPlane first and the MachineDeployments later. What do you think?

@HomayoonAlimohammadi
Copy link
Contributor Author

Thanks a lot for the suggestion @berkayoz!
So IIUC, you're suggesting that instead of applying the UpgradeTo annotation to the cluster object, we apply it to the Ck8sControlPlane and MachineDeployment objects (exactly like when we're doing a rolling upgrade and changing the version field on those manifests) so that the controlplane and bootstrap reconcilers can take care of this?
If I got that correctly, I'm totally up for it. If we have another confirmation, maybe from @bschimke95, I'll go on and make the changes.

@berkayoz
Copy link
Member

Yes exactly @HomayoonAlimohammadi, we would annotate those objects. We would need to update the existing controlplane reconciler and create a new MachineDeployment reconciler to act on these annotations.

@bschimke95
Copy link
Contributor

+1 This sounds reasonable. IIRC, there also was the idea to still support annotating the Cluster object but this would basically just be a convenience and would be passed through to the Ck8sControlPlane and MachineDeployments respectively.

docs/proposals/002-cluster-controller.md Outdated Show resolved Hide resolved
docs/proposals/002-cluster-controller.md Show resolved Hide resolved
docs/proposals/002-cluster-controller.md Outdated Show resolved Hide resolved
Copy link
Contributor

@bschimke95 bschimke95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great work!
Congrats to your first proposal.

Copy link
Member

@berkayoz berkayoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work! Thank you for picking this up.
Some points of consideration:

  • We will be getting/creating a list of machines on the reconcilers, we should make sure these are ordered and we should also skip machines that have a release annotation set to our upgrade option which means the upgrade has been performed there(unless the option is local path).
  • We should add events to the resources to indicate which machine is being worked on, which machine had a failed upgrade etc.
  • If we are not gonna retry the operation we should also make sure to remove the upgrade-to annotation from the machine that failed since upgrade retries are done on the machine reconciler.

@HomayoonAlimohammadi HomayoonAlimohammadi merged commit fe015b7 into main Sep 16, 2024
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants