TCP health checks should be enabled on CAPVCD LBs #2711
Comments
It is set automatically if https://github.com/vmware/cloud-provider-for-cloud-director/blob/main/pkg/vcdsdk/gateway.go#L779 I created a new test cluster with NGINX and it has the TCP check enabled... |
Turns out the TCP check is removed when adding/removing nodes (the only actions tested), but it probably happens whenever the LB is reconciled. I created an issue upstream: vmware/cloud-provider-for-cloud-director#281. Arun assigned it to someone. |
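For illustration only, here is a minimal Go sketch of one way a setting can get lost during a reconcile. It is not the upstream CPI code, and the thread does not confirm that this is the actual cause. The `Pool` type and the `rebuildPool` and `updateMembers` functions are hypothetical names made up for this example.

```go
package main

import "fmt"

// Pool is a hypothetical, simplified stand-in for a load balancer pool.
// It is NOT the VCD or CPI data model.
type Pool struct {
	Name          string
	Members       []string
	HealthMonitor string // "" means no health monitor is configured
}

// rebuildPool shows the failure mode: the update payload is built from
// scratch, so every field that is not set explicitly is silently reset.
func rebuildPool(existing Pool, members []string) Pool {
	return Pool{Name: existing.Name, Members: members}
}

// updateMembers copies the existing pool and changes only the members,
// so the health monitor survives the update.
func updateMembers(existing Pool, members []string) Pool {
	updated := existing
	updated.Members = append([]string(nil), members...)
	return updated
}

func main() {
	pool := Pool{
		Name:          "ingress-http",
		Members:       []string{"10.0.0.1:30080"},
		HealthMonitor: "TCP",
	}
	// Scaling up adds a second node to the pool.
	scaled := []string{"10.0.0.1:30080", "10.0.0.2:30080"}

	fmt.Printf("%q\n", rebuildPool(pool, scaled).HealthMonitor)   // prints ""
	fmt.Printf("%q\n", updateMembers(pool, scaled).HealthMonitor) // prints "TCP"
}
```

If something like this is at play, any member change would drop the monitor, which would match the observation that adding or removing nodes removes the TCP check.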
We need to test it with CPI 1.4.0 |
I have a hypothesis for the bug where health checks are removed from LBs in VCD. |
Nice find! That looks promising 🙂. Let me know if you need a test manifest to create a cluster and I can show you what it looks like in VCD. |
Alright so rather than the |
I was a bit confused that the whole thing is removed and not replaced by Now the interesting twist is that trying to change the |
I deployed this fix (giantswarm/cloud-provider-for-cloud-director#4) and it appears that TCP health checks are no longer removed when a machine deployment is scaled. |
Very nice 💪 |
Yeah sure :) do we need to do something to update it or will it be rolled out with the new release? |
I think the easiest option is to change the image on the deployment. Otherwise we would need to create a branch of the app collection and use that branch, which is a bit complex for this. |
I have edited the deployment in glasgow. |
Don't you need to update the cluster app to push the CPI change? https://github.com/giantswarm/giantswarm-management-clusters/blob/d81b3db2878084916419cb397732919982e13794/management-clusters/glasgow/cluster-app-manifests.yaml#L113 |
Good news: I scaled glasgow up and that was enough for the health monitors to come back :) |
Closing since it's fixed on our side. |
By default, CAPVCD creates the load balancer for the kube API with a TCP health check enabled, but the CPI doesn't do so for Services of type LoadBalancer.
This is an issue because we only run 2 NGINX controllers and the Service is set to
externalTrafficPolicy: Local
With that policy, only nodes that run a controller pod serve traffic. Without a health check the LB keeps sending requests to the other nodes as well, so the customer was seeing failed requests. This was fixed by enabling the TCP health check on the LB. https://gigantic.slack.com/archives/CE92C4BST/p1691484403096619