GKE Subsetting recalculation only appears to occur when nodes are added, not removed #2650
Comments
We have also made release notes with possible mitigations for the time being: https://cloud.google.com/kubernetes-engine/docs/release-notes#August_14_2024 The easiest way to avoid downtime is to switch to using externalTrafficPolicy: Local.
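(For reference, a minimal sketch of what the suggested mitigation might look like on an internal passthrough load balancer Service, assuming the standard GKE internal load balancer annotation; the Service name, selector, and ports below are illustrative, not taken from this thread.)

```yaml
# Hypothetical internal passthrough LB Service showing the suggested mitigation:
# with externalTrafficPolicy: Local, traffic is delivered only to nodes that
# actually run backing pods, avoiding the Cluster-policy subsetting path.
apiVersion: v1
kind: Service
metadata:
  name: my-internal-lb                  # hypothetical name
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local          # switched from the default "Cluster"
  selector:
    app: my-app                         # hypothetical selector
  ports:
    - port: 80
      targetPort: 8080
```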
Thanks, this sounds exactly like our issue. Not quite up to date on reading the release notes, I guess! Switching our production cluster over to Local today seems a bit less scary than scaling down bit by bit and triggering some sort of scale-up on another node pool after each step. (Though I don't know a better way to trigger scale-up than to schedule something that doesn't fit... I don't want to just scale the pool normally because there's currently a fair amount of imbalance across zones.)
(Ah OK, we can just scale up a random other node pool.) Will this be called out in the release notes when it is fixed?
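(A minimal sketch of the "schedule something that doesn't fit" approach mentioned above, assuming the cluster autoscaler is enabled on the target node pool; the Pod name, image tag, and resource figures are illustrative assumptions.)

```yaml
# Hypothetical placeholder Pod used only to force a node-pool scale-up
# (and with it the NEG recalculation). The CPU request is sized so the Pod
# cannot fit on any existing node, so the cluster autoscaler adds a node.
apiVersion: v1
kind: Pod
metadata:
  name: neg-recalc-trigger              # hypothetical name
spec:
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
      resources:
        requests:
          cpu: "7"                      # assumption: larger than the free CPU on any current node
          memory: 256Mi
```

Once the new node has registered and the NEG endpoints reappear, the Pod can simply be deleted.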
Hi @glasser. Not sure if you already noticed, but we recently sent out the release updates for this fix: https://cloud.google.com/kubernetes-engine/docs/release-notes#September_10_2024
@gauravkghildiyal Thanks! And the release notes are clear about what we need to do (control plane upgrade). Trying it in our dev cluster now! |
I use GCE ingresses via hosted GKE (not running my own ingress-gce). We have a large-ish cluster with GKE subsetting enabled and a lot of internal network pass-through load balancers.

We wanted to move all our pods to a new node pool, so we created the node pool, drained the old nodes gradually until they had no pods left (other than DaemonSets), and then scaled the original node pool down to 0 nodes. This broke many of our load balancers because it removed all the `GCE_VM_IP` endpoints from their zonal NEGs.

After some experimentation, it appears that when you remove nodes from GKE, `GCE_VM_IP` endpoints can be removed from the zonal NEGs associated with internal network pass-through load balancers (with `externalTrafficPolicy: Cluster`), and the controller won't actively add endpoints for newer nodes. Adding just one more node to the cluster after this seems sufficient to trigger recalculation and get those NEGs back up to 25 endpoints. But if you don't do that, you can simply lose endpoints from your NEGs and eventually break them!
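(For context, roughly the shape of Service being described: an internal passthrough load balancer with the default `externalTrafficPolicy: Cluster`, which GKE subsetting backs with zonal `GCE_VM_IP` NEGs containing a subset of nodes as endpoints. All names and ports below are illustrative, not from the cluster in question.)

```yaml
# Hypothetical Service of the affected kind: with GKE subsetting enabled,
# GKE backs this internal passthrough LB with zonal GCE_VM_IP NEGs whose
# endpoints are nodes, and the Cluster policy is the case described above.
apiVersion: v1
kind: Service
metadata:
  name: affected-internal-lb            # hypothetical name
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster        # the default, and the policy affected here
  selector:
    app: my-app                         # hypothetical selector
  ports:
    - port: 443
      targetPort: 8443
```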