-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow disabling internalIP routing via arg #370
base: main
Are you sure you want to change the base?
Conversation
Hi @sepich :) thanks for the PR and for the additional cleanup! Setting the CLI flag on the DaemonSet to disable routing packets over internal IPs entirely effectively turns the mesh into a full-mesh, where all nodes are connected only via WireGuard tunnels using their public IPs. If that's the case, why not change the Kilo topology to use a full mesh? |
Sorry if I missing something. We have nodes which have both internal and external IPs at same time. We do not need nodes in different locations to know each others internalIPs, we want full mesh over external IP, and start kilo like this:
Without |
ah I see! You really just want to have no knowledge of internal IPs at all. Hmm. Does the discovery of internal IPs break things for your network? Maybe internal IPs are reused and that causes clashes in the routes or WG config? Why is having unused internal IPs in your routes bad? Basically I want to figure out what the benefit of disabling the discovery / routing of internal IPs is and if it is worth the complexity of adding another flag for configuring Kilo. The fewer configuration options the better :) |
We have several locations, and in each location there is internal subnet where nodes can communicate with each other. So, there are 3 IPs on nodes - private, public and kilo. On top of And again, maybe I misunderstand some of your concepts. But unfortunately I have to +1 for #342 as current docs are just marketing materials. There are no any technical details, and I have to read code to be able to configure current setup. Would be nice to understand design decisions in each case, like "announce internal IPs". |
Thanks for explaining. Yes, that's indeed a really good thing to avoid! Normally this would be avoided by putting all of the nodes in the same "location" so that Kilo is knows about packets to and from the address but doesn't tunnel them over the VPN. Is there a reason why you can't rely on Kilo topologies to automatically place your cloud nodes into the same Kilo location and in this was avoid encapsulating traffic to internal IPs within a location? You mentioned that your cloud nodes come up and down and that you can't annotate them beforehand, however, what about simply relying on the Kubernetes-standard topology labels that are set by the cloud-controller-manager [0]? If you are not running a cloud-controller-manager on your cloud nodes, then that explains why you don't have access to topology labels and why Kilo can't use them to determine the topology of the mesh. Sorry for insisting further; I'm hoping to really understand the details of your deployment to see how to best improve Kilo for you and everyone else. [0] https://kubernetes.io/docs/reference/labels-annotations-taints/#topologykubernetesioregion |
Sorry, but what difference it should make? I've set |
The difference is that when you rely on the standard topology labels you don't need to manually annotate nodes to specify a location, which sounded like an annoying source of toil for your cluster. Ideally, when you let the cloud controller manager set the topology labels, then your Kilo topology can be configured automatically and you can let internal IPs be known to Kilo, because packets within a location won't be tunneled over the VPN. However, it sounds like you have an additional constraint, which is that you don't want to rely on having a single leader to act as the gateway for a group of nodes. However, if all nodes within a location could act as gateways for their own traffic to other locations, then this would be satisfactory for you. Is that right? In that case, it sounds like the ideal scenario would be to merge #328, which allows for all nodes in a location to be gateways and traffic internal to a location is not encapsulated. This eliminates issues of network bandwidth limitations and alleviates concerns of downtime, between when a leader becomes unresponsive and another node is elected the leader. Until that is merged, I agree that having an option to never tunnel traffic to internal IPs would be helpful. |
Thanks for Kilo!
This is somehow related to #127
We use cloud instances with kilo, where nodes come and go, and there is no way to set annotation on them before kilo starts.
Default way of work leaves the way it is now, but adding cli arg do disable internalIP routing to wg at all.