
CAPI Cluster Autoscaling with Machine Deployments #1376

Open
4 of 6 tasks
Tracked by #2738
alex-dabija opened this issue Sep 7, 2022 · 12 comments
Labels
area/kaas Mission: Cloud Native Platform - Self-driving Kubernetes as a Service kind/cross-team Epics that span across teams kind/story provider/capvcd provider/cluster-api-azure Cluster API based running on Azure provider/vsphere Related to a VMware vSphere based on-premises solution team/turtles Team Turtles topic/capi

Comments

@alex-dabija

alex-dabija commented Sep 7, 2022

Story

- As a cluster admin, I want a CAP(O|G|V|VCD) cluster to autoscale depending on the required resources in order to ensure the stability of applications running on the cluster.

Background

CAP(O|G|V|VCD) clusters don't support machine pools, which means we can't leverage the cloud provider's support for autoscaling. Machine deployments are supported, and cluster-autoscaler does support them.

Giant Swarm has been running the cluster-autoscaler on the workload cluster in order to enable the cluster to still scale in case the communication between the management cluster and the workload cluster is severed.

Unfortunately, the cluster-autoscaler needs to run on the management cluster in order to be able to update the workload cluster's machine deployment resources.
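
For illustration only: the cluster-autoscaler's Cluster API provider treats a machine deployment as a scalable node group when it carries minimum and maximum size annotations. The sketch below builds a merge-patch body with those two annotations. The helper name `autoscaling_patch` and the example sizes are assumptions made for this sketch, not something taken from this issue or from any Giant Swarm tooling.

```python
import json

# Annotation keys read by the cluster-autoscaler's Cluster API provider.
MIN_SIZE_ANNOTATION = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
MAX_SIZE_ANNOTATION = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"


def autoscaling_patch(min_size: int, max_size: int) -> dict:
    """Build a merge-patch body that marks a MachineDeployment as autoscalable.

    Hypothetical helper: it only constructs the patch and does not talk to
    the management cluster.
    """
    if min_size < 0 or max_size < min_size:
        raise ValueError("expected 0 <= min_size <= max_size")
    return {
        "metadata": {
            "annotations": {
                # Annotation values must be strings in Kubernetes metadata.
                MIN_SIZE_ANNOTATION: str(min_size),
                MAX_SIZE_ANNOTATION: str(max_size),
            }
        }
    }


if __name__ == "__main__":
    # Example sizes are arbitrary placeholders.
    print(json.dumps(autoscaling_patch(3, 10), indent=2))
```

A patch like this would be applied to the machine deployment on the management cluster, which is also where the autoscaler has to run in order to change the replica count.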

Resources

Stories

  1. area/kaas team/turtles
    primeroz
  2. area/kaas team/turtles
    njuettner
  3. Ready area/kaas team/turtles
  4. Ready area/kaas team/turtles
  5. area/kaas team/turtles
    njuettner
@alex-dabija alex-dabija added team/hydra area/kaas Mission: Cloud Native Platform - Self-driving Kubernetes as a Service kind/story topic/capi provider/cluster-api-gcp Cluster API based running on GCP labels Sep 7, 2022
@cornelius-keller cornelius-keller changed the title CAPG cluster autoscaling CAPI cluster autoscaling Sep 23, 2022
@cornelius-keller cornelius-keller changed the title CAPI cluster autoscaling CAPI cluster autoscaling MachinePools Sep 23, 2022
@cornelius-keller cornelius-keller changed the title CAPI cluster autoscaling MachinePools CAPI cluster autoscaling MachineDeployments Sep 23, 2022
@alex-dabija alex-dabija added the provider/openstack Related to provider OpenStack label Sep 23, 2022
@cornelius-keller
Contributor

@alex-dabija do we also need a story about making the scheduler configurable so that it makes better use of existing nodes before new ones are created? How is this currently handled in the cloud installations?

@alex-dabija
Author

We don't have any special configuration for the Kubernetes scheduler to optimize for better usage. We do have the vertical pod autoscaler running by default on vintage clusters, which might have an impact on how many pods are running on a node.

This story is only meant to get the cluster autoscaler running on the management cluster to handle autoscaling of machine deployments.

I wouldn't worry for now about packing pods more efficiently on the nodes. I would just keep things simple for now.

@gawertm

gawertm commented Nov 16, 2022

@alex-dabija I changed ownership from Hydra to Rocket, as Rocket will most likely implement this first. We are already looking at the documentation and trying it out.

@alex-dabija alex-dabija added team/rocket Team Rocket and removed team/hydra labels Nov 16, 2022
@gawertm gawertm added the kind/cross-team Epics that span across teams label Nov 23, 2022
@gawertm

gawertm commented Nov 30, 2022

@Rotfuks
Contributor

Rotfuks commented Dec 13, 2022

Done in Clippy via: #1793

@brinker211
Contributor

@alex-dabija I see the new tickets for CAPA and CAPZ. Is there still a ticket for CAPG? This was the provider this ticket originally started with.

@brinker211
Contributor

@gawertm is the referenced internal ticket https://github.com/giantswarm/giantswarm/issues/23820 for CAPG? I see Hydra and Clippy referenced for CAPA and CAPZ, but I'm looking for the status of CAPG as well. Thanks!

@gawertm

gawertm commented Jan 19, 2023

Hi @brinker211, yes, this was for CAPG but also for the Rocket providers. We initially planned to help Hydra with that, even though it's not a Rocket priority anymore.

@gawertm

gawertm commented May 15, 2023

@Rotfuks would the autoscaler topic go to Turtles? Let's move it from the Rocket backlog then :)
cc @alex-dabija

@gawertm gawertm added team/turtles Team Turtles and removed team/rocket Team Rocket labels May 17, 2023
@alex-dabija
Author

@Rotfuks would the autoscaler topic go to Turtles? Let's move it from the Rocket backlog then :) cc @alex-dabija

Yes, Turtles makes sense. It would be great to have a common autoscaling solution for all providers (at least for CAPZ, CAPV, CAPVCD). CAPA is still using machine pools, which require a different implementation.

@Rotfuks
Contributor

Rotfuks commented Jun 26, 2023

Here's the information transferred from the CAPZ-specific Autoscaling Epic:

We already had a first discussion on it: https://gigantic.slack.com/archives/C04887ZSU20/p1670926088947149
Conclusion:
All information about cluster-autoscaler issues with machine pools can be found here:
https://github.com/giantswarm/giantswarm/issues/19313

@Rotfuks Rotfuks changed the title CAPI cluster autoscaling MachineDeployments CAPI Cluster Autoscaling MachineDeployments Jun 27, 2023
@alex-dabija alex-dabija changed the title CAPI Cluster Autoscaling MachineDeployments CAPI MachineDeployments Autoscaling Jul 3, 2023
@Rotfuks Rotfuks changed the title CAPI MachineDeployments Autoscaling CAPI Cluster Autoscaling Jul 31, 2023
@alex-dabija
Author

@Rotfuks I would keep this issue scoped to autoscaling for machine deployments because of our current focus on CAPA. I moved #2692 to #1798. It should make it easier to track the status of CAPA for our first release.

@alex-dabija alex-dabija changed the title CAPI Cluster Autoscaling CAPI Cluster Autoscaling with Machine Deployments Aug 22, 2023
@alex-dabija alex-dabija added provider/vsphere Related to a VMware vSphere based on-premises solution provider/cluster-api-azure Cluster API based running on Azure provider/capvcd and removed provider/cluster-api-gcp Cluster API based running on GCP provider/openstack Related to provider OpenStack team/phoenix Team Phoenix labels Aug 22, 2023
Projects
Status: Backlog 📦
Development

No branches or pull requests

5 participants