Future of Upgrades with CAPI and Gitops #1031
This is becoming a topic in customer and internal communication, so I think we should raise priority on this one. Current workflow and discussion with customer:
Internal discussion / questions arising
Here are my 2 cents from the honeybadger PoV:
-> If a customer wants to override the default k8s version from the provider app, then they themselves have to take care of managing the upgrades for those clusters, or eventually remove that overlay to get the default k8s version again (see the sketch below).
Side note: The automatic app upgrades can also be pushed directly to …
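For illustration, here is what such a per-cluster override overlay could look like in a kustomize-based GitOps repo. This is a sketch only; the repo layout, the ConfigMap name, and the `kubernetesVersion` values key are assumptions, not an agreed structure.

```yaml
# clusters/demo-cluster/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../bases/cluster              # base carrying the provider app defaults
patches:
  - target:
      kind: ConfigMap
      name: demo-cluster-userconfig
    patch: |-
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: demo-cluster-userconfig
      data:
        values: |
          kubernetesVersion: "1.24.10"   # customer pin; removing this patch falls back to the default
```

As noted above, as soon as such a patch exists the customer owns the upgrades of that cluster until the overlay is removed again.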
This is what we came up with during today's refinement (you can download the .svg file and edit it in https://draw.io). The general idea is to have a customer-facing meta-app, with patch versions (maybe minors as well) automatically pushed by some automation to the GitOps repos. The exact versioning rules should be refined as well if the concept proves satisfactory. I hope the picture illustrates that (a rough sketch also follows below):
Just in case, the zipped svg:
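To make the meta-app idea more concrete, this is roughly what a single customer-facing App CR could look like in the GitOps repo; the chart name `cluster-openstack`, the catalog, and the namespaces are placeholders, not a settled design. The point is that automation only has to bump the `version` field to roll out a patch upgrade.

```yaml
apiVersion: application.giantswarm.io/v1alpha1
kind: App
metadata:
  name: demo-cluster
  namespace: org-demo
spec:
  catalog: cluster                 # catalog name assumed
  name: cluster-openstack          # provider-specific cluster chart (placeholder)
  namespace: org-demo
  version: 0.5.2                   # the one field automation bumps for patch releases
  kubeConfig:
    inCluster: true
  userConfig:
    configMap:
      name: demo-cluster-userconfig
      namespace: org-demo
```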
Just had another chat with the customer:
@kopiczko I don't understand why this added app is needed. I am still missing the motivation / explanation for why efforts here are needed. @cornelius-keller Yes, bases can be structured at any granularity. We need to have an example, but you can have one base per environment. Therefore we can update them separately and automatically.
During the refinement we worked with the assumptions:
Talking with @MarcelMue on Slack, I have a feeling this is about GitOps repo structure again, and it has largely derailed from what @teemow described in the first place. Currently, GitOps repo bases are split by DCs instead of stages because we have different image IDs and external network IDs between them. This can be partially solved by referencing images by name in CAPO 0.6, but I guess there will always be differences between DCs. E.g. we can't force customers to have the same external network name in OpenStack, and I'm sure there will be more to it. When it comes to GitOps repo structure, I think we need to find a way to structure both DCs and stages. This of course doesn't solve the original problems:
100% agreed, we need to find this structure. +1 on finding something for the original problems.
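Purely as an illustration of structuring by both DC and stage, one possible kustomize layout could keep stage defaults and DC specifics in separate bases; all directory and base names here are hypothetical, not the agreed structure:

```yaml
# Hypothetical layout:
#   bases/stages/dev, bases/stages/prod   -> pin cluster app + k8s versions per stage
#   bases/dcs/dc-a, bases/dcs/dc-b        -> image IDs, external network names per DC
#
# clusters/dc-a/dev/cluster-1/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../../bases/stages/dev   # stage base: which versions roll here, and when
components:
  - ../../../../bases/dcs/dc-a     # kustomize Component carrying DC-specific patches
```

With something along these lines, automation could bump versions per stage base while DC differences stay isolated in their own bases.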
@piontec this discussion seems to be tangential to the one we had in SIG Product sync today.
@cornelius-keller AFAIK there was a meeting with rocket on Monday. Can you post a summary here if there were any new decisions?
I've added the link to issues in progress there, but this discussion of whether or not to package cluster apps into another layer of an app is irrelevant to this. As for the discussion of upgrades itself, there's an RFC: giantswarm/rfc#36
Hey, the RFC is closed now; it explains how an app (any, including a cluster app bundle) can be auto-upgraded with Flux. Questions about displaying progress need to be handled separately, as Flux only takes care of discovering a new version, doing the upgrade, and storing the change in the repo.
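A minimal sketch of those Flux mechanics, for readers who don't want to open the RFC: Flux's image automation can rewrite a marked field in Git whenever a newer version matches a policy. The policy name, the semver range, and how chart versions are exposed to an ImageRepository are assumptions here; the authoritative description is in giantswarm/rfc#36.

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImagePolicy
metadata:
  name: cluster-app
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: cluster-app              # an ImageRepository tracking published versions (assumed)
  policy:
    semver:
      range: ">=0.5.0 <0.6.0"      # follow patch releases automatically, minors stay manual
---
# In the App CR stored in Git, the version field carries a setter marker, e.g.:
#   version: 0.5.2 # {"$imagepolicy": "flux-system:cluster-app:tag"}
# An ImageUpdateAutomation then commits the bump back to the GitOps repository.
```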
Created a dedicated roadmap ticket for the feedback part; I guess we can close this ticket?
I think we still need some input from the KaaS teams about presentation and feedback for upgrades, right?
Wrong issue, sorry.
For the record, also for @MarcelMue: the decisions from the meeting with the customer are tracked here: https://github.com/giantswarm/thg/issues/113
How has the question regarding the design of the app bundles been settled? Pawel drafted something where a … Is this how it's going to work throughout CAPI providers?
I don't think this is something we agreed on.
Feels dead simple unless I don't understand the question. BTW, yesterday I already had doubts about …
So if this hasn't been decided, I have questions.
Currently in Rainbow this is blocking part of #1192 as it means …
I believe the alternative is to have independent apps.
Adding this to the KaaS sync board for this Thursday.
@pipo02mix FYI
As part of this ticket we have enabled customers/us to roll changes across regions and stages independently with GitOps. At the same time, we have included a default version in the cluster-PROVIDER app here, as Marcel explained. But we found out that the Kubernetes and OS versions are tied together in CAPI, and we could serve a conversion table to enhance the UX. For that we have created this RFC to align across teams. In that PR ⬆️ there have been some discussions that I think should be moved here (@puja108 @AverageMarcus), or we should create an RFC to discuss them. How can we progress towards the proposed goals?
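For clarity, the conversion table mentioned above could be as simple as a mapping from a Kubernetes minor to the OS image it is tied to in CAPI. The key name and the image versions below are purely illustrative:

```yaml
kubernetesToOSImage:
  "1.23": flatcar-stable-3227.2.2   # illustrative versions only
  "1.24": flatcar-stable-3374.2.0
  "1.25": flatcar-stable-3510.2.1
```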
My point is rather that we did not want to offer such freedom of choice (at least in the beginning), so we would set those versions via the app/chart (and have them tied to the app version). To me that is already the enhanced UX, or what else does the customer need there? Whether we use a conversion table for ourselves in the background is an implementation detail to me. The UX towards the customer is rather that they know that a certain app version has a certain K8s and OS version, just like they currently know that a certain WC release has a certain K8s and OS version.
If customers have a good reason to override our defaults, I'd like us to talk to them and see why that is and whether the use case is valid for our offering; only then would I consider a decision to open up things we did not open up before. I would however like to avoid that, especially in line with this story here: as long as the app version is the only thing that encodes the defaults we ship, we have a way to test and roll out upgrades automatically, also with GitOps. If we suddenly open up tons of configuration in there, the upgrade process will always require manual intervention and changes to the configuration.
But customers need to control the k8s version for different reasons, for example API deprecations (PSP now, but I am pretty sure there will be more in the future) or vendor dependencies (we have a customer whose software only runs on certain k8s versions). We cannot have only a single cluster-PROVIDER release if we support multiple k8s minors, since every release would overwrite the previous version. In GitOps we will control the k8s version using bases (one for every stage/env, as HoneyBadger has proposed), so according to customer criteria we can propose (automatically) the new versions. I can imagine the flow like this:
(The same will happen with the prod env when the date window criteria are fulfilled.) The logic about which k8s minor a customer supports (based on customer tests, for example they don't yet support 1.25 because they still have PSP) is in the Upgrade Automation metadata. The same goes for the other criteria (when the maintenance window is, which stages we upgraded before, ...). We must not use cluster-PROVIDER app releases to coordinate all the cluster upgrade logic (we will need a higher level). More info in the RFC.
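To illustrate what such Upgrade Automation metadata could carry per customer, here is a rough sketch; none of these field names are an agreed schema, they just restate the criteria listed above:

```yaml
customer: acme                                   # placeholder
supportedKubernetesMinors: ["1.23", "1.24"]      # 1.25 excluded while PSP is still in use
maintenanceWindow:
  days: ["Tue", "Wed"]
  start: "08:00Z"
  end: "12:00Z"
stageOrder: [dev, staging, prod]                 # prod only after the earlier stages succeeded
```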
For now that sounds like an OK solution. Going forward we need to think more deeply about how we want to version and roll out, and yes, we do need some kind of releases, even within a continuously rolled out product. As Timo suggested in the other thread, we might want to consider moving away from offering 2 minor versions as stable and more towards having channels like alpha, beta, RC, stable. I do currently see the customer value in being able to pin the k8s minor to latest-1 and being able to cherry-pick patches for customers who are not ready to upgrade yet. This would also reduce dependencies for other teams to roll out their patches and updates, as long as they also work on the older k8s version, so we can reduce operative load. We need to see how the models would fit together going forward. I'd just not like us to rush it too much, as we are still a bit in flux on where and how we integrate and create boundaries between current and also future teams.
This needs refinement, or we might just close this issue. There are other issues as well: #1791, #1661. @puja108 @alex-dabija please figure out whether this overall story is needed, or at least refine it with what needs to be done.
And there is also: https://github.com/giantswarm/giantswarm/issues/25443 |
Trying to clean up here, the main issues that need working on are:
IMO the first 3 are on the critical path and needed for us to be comfortable with KaaS going forward; the 4th can be iterated on going into the future with some experience from the first upgrades, but needs to be considered in 1 and 2 so we do not block our way here. Also related to this will be our interaction processes around GitOps repos: https://github.com/giantswarm/giantswarm/issues/24072 cc @alex-dabija does that sound right to you? Did I forget anything, or is there an issue I did not find and link here?
Closing this issue as we are tracking the main work now within the release revamp stream; some short-term work around GitOps is handled in the ticket around interaction processes mentioned above. The rest (e.g. the UX-related topics that got discussed above) will find its place again in the iterations we will do around releases and upgrades, but doesn't need this issue to stay open.
Yes, sounds good.
We have two major things that change the way users upgrade clusters: CAPI and GitOps.