Robot server is tainted and does not initialise because of FailedToCreateRoute error #796
Comments
I spent a bit more time and realised that my HCCM was installed without the
Is there a way to have both robot servers with no networking and cloud servers with networking?
That is correct. It is not that easy to handle the Layer 2 routes of Robot servers and vSwitches in HCCM.
There are two parts to the networking:
Many people have the routing functionality active (as it is the default when a network is specified) but do not rely on it. This functionality is usually provided by your CNI, and unless you have explicitly disabled this in your CNI (i.e. through
I would recommend checking whether you are actually using the cloud routes, and disabling the routes in HCCM if you are not. You can disable the routes controller by setting
@apricote Thank you for your reply! I did research the
I currently have 10 cloud servers as nodes and I would like to migrate to 3 robot servers (mostly because of storage space, Longhorn, and raw performance). Eventually I can disable network routes altogether because there won't be any cloud servers anymore, but until then I would like to have both cloud and robot servers working.
Yes, that will stop any routes from being updated (though existing routes will not be cleaned up). Are you sure that your setup requires the routes?
TL;DR
Adding a robot server to my existing K3S cluster with HCCM results in my node not initialising because of HCCM trying to create a route which is not supported.
Expected behavior
I would expect the new robot server to become part of the cluster, either by having the route registered in the private network or at least without the failed route breaking the initialization of the node. I would also expect the IP address of the robot server to be added to the Load Balancer.
Observed behavior
The route is not being added and fails with an error:
Because of this taints are added and the server is never initialised:
The node is not added to my Load Balancer.
Minimal working example
Install command for my robot server.
Log output
Additional information
I have added a vSwitch to my robot server and configured it:
I have added the vSwitch to my private network:
I can confirm that pinging any node in my existing cluster from my robot server, and the other way around, works as expected. So the connection is definitely there.
Then I installed K3S using the command above. I added both the old and the new labels of
provided-by=robot
to be sure that the CSI driver ignores this robot server. The documentation of the CSI driver says: "If you are using the hcloud-cloud-controller-manager version 1.21.0 or later, these labels are added automatically. Otherwise, you will need to label the nodes manually." (https://github.com/hetznercloud/csi-driver/blob/main/docs/kubernetes/README.md#integration-with-root-servers) This did not happen, but as far as I can see there is no update available, which makes me think 1.21.0 is not released yet. Possibly unrelated, maybe not. Sharing it just in case!
Now, I have HCCM installed according to the robot.md documentation, including a hcloud secret:
After installing the node the following taints are added:
As far as I can see, these taints are added because HCCM is failing, as can be seen in the logs:
I did read in the docs that routes & private networks are not possible. (https://github.com/hetznercloud/hcloud-cloud-controller-manager/blob/main/docs/robot.md#routes--private-networks) That is okay because I can do this manually.
However, the problem is that, because the route cannot be created, the unavailable taint is added and (presumably) because of that my node is never added to the Load Balancer.
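For anyone debugging a similar setup, here is a minimal sketch of how one might check which blocking taints are still on a node. The taint output from this issue is not reproduced here, so the two keys below are an assumption: they are the well-known Kubernetes taints for an uninitialized cloud-provider node and for an unavailable network, and may not match exactly what was set on this node. The `blocking_taints` helper is hypothetical and expects a node object shaped like the output of `kubectl get node <name> -o json`.

```python
# Assumed taint keys (well-known Kubernetes taints, not copied from this issue).
UNINITIALIZED = "node.cloudprovider.kubernetes.io/uninitialized"
NETWORK_UNAVAILABLE = "node.kubernetes.io/network-unavailable"
BLOCKING_KEYS = {UNINITIALIZED, NETWORK_UNAVAILABLE}


def blocking_taints(node):
    """Return the keys of taints on `node` that are in BLOCKING_KEYS.

    `node` is a dict parsed from `kubectl get node <name> -o json`.
    A node without any taints has no `spec.taints` field at all,
    so missing or null values are treated as an empty list.
    """
    taints = node.get("spec", {}).get("taints") or []
    return [t["key"] for t in taints if t.get("key") in BLOCKING_KEYS]


if __name__ == "__main__":
    # Hypothetical node object for illustration only.
    sample_node = {
        "metadata": {"name": "robot-node-1"},
        "spec": {
            "taints": [
                {"key": NETWORK_UNAVAILABLE, "effect": "NoSchedule"},
                {"key": "example.com/unrelated", "effect": "NoSchedule"},
            ]
        },
    }
    for key in blocking_taints(sample_node):
        print("still blocked by taint:", key)
```

An empty result would suggest that the node is no longer held back by either of these two taints, so a missing Load Balancer target would then have a different cause.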
I've got a few questions: