Add IPv6 support for Typhoon Azure clusters
* Define a dual-stack virtual network with both IPv4 and IPv6 private
address space. Change `host_cidr` variable (string) to a `network_cidr`
variable (object) with "ipv4" and "ipv6" fields that list CIDR strings.
* Define dual-stack controller and worker subnets. Disable Azure
default outbound access (a deprecated fallback mechanism)
* Enable dual-stack load balancing to Kubernetes Ingress by adding
a public IPv6 frontend IP and LB rule to the load balancer.
* Enable worker outbound IPv6 connectivity through load balancer
SNAT by adding an IPv6 frontend IP and outbound rule
* Configure controller nodes with a public IPv6 address to provide
direct outbound IPv6 connectivity
* Add an IPv6 worker backend pool. Azure requires separate IPv4 and
IPv6 backend pools, though the health probe can be shared
* Extend network security group rules for IPv6 source/destinations
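
The first two bullets can be sketched in Terraform roughly as follows. This is a minimal illustration, not the code from this commit: the `example` resource names, the region, the ULA prefix `fd00:1234:5678::/48`, and the subnet numbering are assumptions chosen for the example. It also assumes Terraform 1.3 or later (for `optional` with a default) and an `azurerm` provider version that supports `default_outbound_access_enabled` on `azurerm_subnet`.

```hcl
provider "azurerm" {
  features {}
}

# Object variable with per-family CIDR lists (replaces a single string).
# The IPv6 default below is an arbitrary ULA prefix used for illustration.
variable "network_cidr" {
  type = object({
    ipv4 = list(string)
    ipv6 = optional(list(string), ["fd00:1234:5678::/48"])
  })
  default = {
    ipv4 = ["10.0.0.0/16"]
  }
}

# Hypothetical resource group for the sketch
resource "azurerm_resource_group" "example" {
  name     = "example"
  location = "centralus"
}

# Dual-stack virtual network: both families share one address space list
resource "azurerm_virtual_network" "example" {
  name                = "example"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  address_space       = concat(var.network_cidr.ipv4, var.network_cidr.ipv6)
}

# Dual-stack worker subnet. Azure requires IPv6 subnets to be exactly /64,
# so a /48 is extended by 16 bits. The netnum of 1 is an assumption.
resource "azurerm_subnet" "worker" {
  name                 = "worker"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes = concat(
    [for cidr in var.network_cidr.ipv4 : cidrsubnet(cidr, 1, 1)],
    [for cidr in var.network_cidr.ipv6 : cidrsubnet(cidr, 16, 1)],
  )

  # Opt out of Azure's implicit default outbound access
  default_outbound_access_enabled = false
}
```

With default outbound access disabled, nodes in the subnet need an explicit egress path, which is why the later bullets add load balancer outbound rules for workers and public addresses for controllers.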

Checklist:

Access to controller and worker nodes via IPv6 addresses:

  * SSH access to controller nodes via public IPv6 address
  * SSH access to worker nodes via (private) IPv6 address (via
    controller)

Outbound IPv6 connectivity from controller and worker nodes:

```
nc -6 -zv ipv6.google.com 80
Ncat: Version 7.94 ( https://nmap.org/ncat )
Ncat: Connected to [2607:f8b0:4001:c16::66]:80.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
```

Serving Ingress traffic over IPv4 or IPv6 just requires setting
up A and AAAA records and running the ingress controller with
`hostNetwork: true`, since hostPort only forwards IPv4 traffic.
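
As an illustration of the DNS side, the A and AAAA records could be managed in Terraform alongside the cluster. This is a sketch under assumptions: the variables `dns_zone`, `dns_zone_group`, `ingress_ipv4`, and `ingress_ipv6`, and the record name `app`, are hypothetical. In practice the addresses would be those of the `ingress-ipv4` and `ingress-ipv6` public IPs defined in `lb.tf`.

```hcl
# Hypothetical inputs for the sketch
variable "dns_zone" {
  type        = string
  description = "Azure DNS zone name (e.g. example.com)"
}

variable "dns_zone_group" {
  type        = string
  description = "Resource group containing the DNS zone"
}

variable "ingress_ipv4" {
  type        = string
  description = "Public IPv4 address of the ingress load balancer frontend"
}

variable "ingress_ipv6" {
  type        = string
  description = "Public IPv6 address of the ingress load balancer frontend"
}

# A record so IPv4 clients resolve the application hostname
resource "azurerm_dns_a_record" "app" {
  resource_group_name = var.dns_zone_group
  zone_name           = var.dns_zone
  name                = "app"
  ttl                 = 300
  records             = [var.ingress_ipv4]
}

# AAAA record so IPv6 clients resolve the same hostname
resource "azurerm_dns_aaaa_record" "app" {
  resource_group_name = var.dns_zone_group
  zone_name           = var.dns_zone
  name                = "app"
  ttl                 = 300
  records             = [var.ingress_ipv6]
}
```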
dghubble committed Jul 9, 2024
1 parent 3483ed8 commit 48d4973
Showing 28 changed files with 894 additions and 555 deletions.
32 changes: 32 additions & 0 deletions CHANGES.md
@@ -4,6 +4,38 @@ Notable changes between versions.

## Latest

### Azure

* Configure the virtual network and subnets with IPv6 private address space
* Change `host_cidr` variable (string) to a `network_cidr` object with `ipv4` and `ipv6` fields that list CIDR strings. Leave the variable unset to use the defaults. (**breaking**)
* Add support for dual-stack Kubernetes Ingress Load Balancing
* Add a public IPv6 frontend, 80/443 rules, and a worker-ipv6 backend pool
* Change the `controller_address_prefixes` output from a list of strings to an object with `ipv4` and `ipv6` fields. Most Azure resources can't accept a mix, so these are split out (**breaking**)
* Change the `worker_address_prefixes` output from a list of strings to an object with `ipv4` and `ipv6` fields. Most Azure resources can't accept a mix, so these are split out (**breaking**)
* Change the `backend_address_pool_id` output (and worker module input) from a string to an object with `ipv4` and `ipv6` fields that list ids (**breaking**)
* Configure nodes to have outbound IPv6 internet connectivity (analogous to IPv4 SNAT)
* Configure controller nodes to have a public IPv6 address
* Configure worker nodes to use outbound rules and the load balancer for SNAT
* Extend network security rules to allow IPv6 traffic, analogous to IPv4

```diff
module "cluster" {
...
# optional
- host_cidr = "10.0.0.0/16"
+ network_cidr = {
+ ipv4 = ["10.0.0.0/16"]
+ }
}
```
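
Because the address prefix outputs are now objects, downstream rules that consumed the old lists need one resource per address family. The following sketch assumes the `module "cluster"` block above. The variables `resource_group_name` and `worker_security_group_name`, the rule names, the priorities, and port 30300 are hypothetical placeholders.

```hcl
# Hypothetical inputs identifying the cluster's worker security group
variable "resource_group_name" {
  type = string
}

variable "worker_security_group_name" {
  type = string
}

# Allow an application port to workers over IPv4
resource "azurerm_network_security_rule" "app-ipv4" {
  name                         = "app-ipv4"
  resource_group_name          = var.resource_group_name
  network_security_group_name  = var.worker_security_group_name
  priority                     = 510
  access                       = "Allow"
  direction                    = "Inbound"
  protocol                     = "Tcp"
  source_port_range            = "*"
  destination_port_range       = "30300"
  source_address_prefix        = "*"
  destination_address_prefixes = module.cluster.worker_address_prefixes.ipv4
}

# Same rule for IPv6, kept separate since a mix of families is not accepted
resource "azurerm_network_security_rule" "app-ipv6" {
  name                         = "app-ipv6"
  resource_group_name          = var.resource_group_name
  network_security_group_name  = var.worker_security_group_name
  priority                     = 511
  access                       = "Allow"
  direction                    = "Inbound"
  protocol                     = "Tcp"
  source_port_range            = "*"
  destination_port_range       = "30300"
  source_address_prefix        = "*"
  destination_address_prefixes = module.cluster.worker_address_prefixes.ipv6
}
```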

## v1.30.2

* Kubernetes [v1.30.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.30.md#v1302)
* Update CoreDNS from v1.9.4 to v1.11.1
* Update Cilium from v1.15.5 to [v1.15.6](https://github.com/cilium/cilium/releases/tag/v1.15.6)
* Update flannel from v0.25.1 to [v0.25.4](https://github.com/flannel-io/flannel/releases/tag/v0.25.4)

## v1.30.1

* Kubernetes [v1.30.1](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.30.md#v1301)
76 changes: 50 additions & 26 deletions azure/fedora-coreos/kubernetes/controllers.tf
@@ -19,14 +19,13 @@ resource "azurerm_dns_a_record" "etcds" {
ttl = 300

# private IPv4 address for etcd
records = [azurerm_network_interface.controllers.*.private_ip_address[count.index]]
records = [azurerm_network_interface.controllers[count.index].private_ip_address]
}

# Controller availability set to spread controllers
resource "azurerm_availability_set" "controllers" {
resource_group_name = azurerm_resource_group.cluster.name

name = "${var.cluster_name}-controllers"
resource_group_name = azurerm_resource_group.cluster.name
location = var.region
platform_fault_domain_count = 2
platform_update_domain_count = 4
@@ -35,15 +34,13 @@ resource "azurerm_availability_set" "controllers" {

# Controller instances
resource "azurerm_linux_virtual_machine" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
count = var.controller_count

name = "${var.cluster_name}-controller-${count.index}"
resource_group_name = azurerm_resource_group.cluster.name
location = var.region
availability_set_id = azurerm_availability_set.controllers.id

size = var.controller_type
custom_data = base64encode(data.ct_config.controllers.*.rendered[count.index])
size = var.controller_type

# storage
source_image_id = var.os_image
@@ -56,10 +53,16 @@ resource "azurerm_linux_virtual_machine" "controllers" {

# network
network_interface_ids = [
azurerm_network_interface.controllers.*.id[count.index]
azurerm_network_interface.controllers[count.index].id
]

# Azure requires setting admin_ssh_key, though Ignition custom_data handles it too
# boot
custom_data = base64encode(data.ct_config.controllers[count.index].rendered)
boot_diagnostics {
# defaults to a managed storage account
}

# Azure requires an RSA admin_ssh_key
admin_username = "core"
admin_ssh_key {
username = "core"
@@ -74,31 +77,52 @@ resource "azurerm_linux_virtual_machine" "controllers" {
}
}

# Controller public IPv4 addresses
resource "azurerm_public_ip" "controllers" {
count = var.controller_count
# Controller node public IPv4 addresses
resource "azurerm_public_ip" "controllers-ipv4" {
count = var.controller_count

name = "${var.cluster_name}-controller-${count.index}-ipv4"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_version = "IPv4"
sku = "Standard"
allocation_method = "Static"
}

name = "${var.cluster_name}-controller-${count.index}"
location = azurerm_resource_group.cluster.location
sku = "Standard"
allocation_method = "Static"
# Controller node public IPv6 addresses
resource "azurerm_public_ip" "controllers-ipv6" {
count = var.controller_count

name = "${var.cluster_name}-controller-${count.index}-ipv6"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location
ip_version = "IPv6"
sku = "Standard"
allocation_method = "Static"
}

# Controller NICs with public and private IPv4
# Controllers' network interfaces
resource "azurerm_network_interface" "controllers" {
count = var.controller_count
resource_group_name = azurerm_resource_group.cluster.name
count = var.controller_count

name = "${var.cluster_name}-controller-${count.index}"
location = azurerm_resource_group.cluster.location
name = "${var.cluster_name}-controller-${count.index}"
resource_group_name = azurerm_resource_group.cluster.name
location = azurerm_resource_group.cluster.location

ip_configuration {
name = "ip0"
name = "ipv4"
primary = true
subnet_id = azurerm_subnet.controller.id
private_ip_address_allocation = "Dynamic"
private_ip_address_version = "IPv4"
public_ip_address_id = azurerm_public_ip.controllers-ipv4[count.index].id
}
ip_configuration {
name = "ipv6"
subnet_id = azurerm_subnet.controller.id
private_ip_address_allocation = "Dynamic"
# instance public IPv4
public_ip_address_id = azurerm_public_ip.controllers.*.id[count.index]
private_ip_address_version = "IPv6"
public_ip_address_id = azurerm_public_ip.controllers-ipv6[count.index].id
}
}

@@ -115,7 +139,7 @@ resource "azurerm_network_interface_backend_address_pool_association" "controlle
count = var.controller_count

network_interface_id = azurerm_network_interface.controllers[count.index].id
ip_configuration_name = "ip0"
ip_configuration_name = "ipv4"
backend_address_pool_id = azurerm_lb_backend_address_pool.controller.id
}

130 changes: 91 additions & 39 deletions azure/fedora-coreos/kubernetes/lb.tf
@@ -15,47 +15,61 @@ resource "azurerm_dns_a_record" "apiserver" {

# Static IPv4 address for the apiserver frontend
resource "azurerm_public_ip" "apiserver-ipv4" {
name = "${var.cluster_name}-apiserver-ipv4"
resource_group_name = azurerm_resource_group.cluster.name

name = "${var.cluster_name}-apiserver-ipv4"
location = var.region
sku = "Standard"
allocation_method = "Static"
location = var.region
sku = "Standard"
allocation_method = "Static"
}

# Static IPv4 address for the ingress frontend
resource "azurerm_public_ip" "ingress-ipv4" {
name = "${var.cluster_name}-ingress-ipv4"
resource_group_name = azurerm_resource_group.cluster.name
location = var.region
ip_version = "IPv4"
sku = "Standard"
allocation_method = "Static"
}

name = "${var.cluster_name}-ingress-ipv4"
location = var.region
sku = "Standard"
allocation_method = "Static"
# Static IPv6 address for the ingress frontend
resource "azurerm_public_ip" "ingress-ipv6" {
name = "${var.cluster_name}-ingress-ipv6"
resource_group_name = azurerm_resource_group.cluster.name
location = var.region
ip_version = "IPv6"
sku = "Standard"
allocation_method = "Static"
}

# Network Load Balancer for apiservers and ingress
resource "azurerm_lb" "cluster" {
name = var.cluster_name
resource_group_name = azurerm_resource_group.cluster.name

name = var.cluster_name
location = var.region
sku = "Standard"
location = var.region
sku = "Standard"

frontend_ip_configuration {
name = "apiserver"
public_ip_address_id = azurerm_public_ip.apiserver-ipv4.id
}

frontend_ip_configuration {
name = "ingress"
name = "ingress-ipv4"
public_ip_address_id = azurerm_public_ip.ingress-ipv4.id
}

frontend_ip_configuration {
name = "ingress-ipv6"
public_ip_address_id = azurerm_public_ip.ingress-ipv6.id
}
}

resource "azurerm_lb_rule" "apiserver" {
name = "apiserver"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "apiserver"
disable_outbound_snat = true

protocol = "Tcp"
frontend_port = 6443
@@ -64,53 +78,74 @@ resource "azurerm_lb_rule" "apiserver" {
probe_id = azurerm_lb_probe.apiserver.id
}

resource "azurerm_lb_rule" "ingress-http" {
name = "ingress-http"
resource "azurerm_lb_rule" "ingress-http-ipv4" {
name = "ingress-http-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
frontend_ip_configuration_name = "ingress-ipv4"
disable_outbound_snat = true

protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv4.id]
probe_id = azurerm_lb_probe.ingress.id
}

resource "azurerm_lb_rule" "ingress-https" {
name = "ingress-https"
resource "azurerm_lb_rule" "ingress-https-ipv4" {
name = "ingress-https-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress"
frontend_ip_configuration_name = "ingress-ipv4"
disable_outbound_snat = true

protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker.id]
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv4.id]
probe_id = azurerm_lb_probe.ingress.id
}

# Worker outbound TCP/UDP SNAT
resource "azurerm_lb_outbound_rule" "worker-outbound" {
name = "worker"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration {
name = "ingress"
}
resource "azurerm_lb_rule" "ingress-http-ipv6" {
name = "ingress-http-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress-ipv6"
disable_outbound_snat = true

protocol = "All"
backend_address_pool_id = azurerm_lb_backend_address_pool.worker.id
protocol = "Tcp"
frontend_port = 80
backend_port = 80
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv6.id]
probe_id = azurerm_lb_probe.ingress.id
}

resource "azurerm_lb_rule" "ingress-https-ipv6" {
name = "ingress-https-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
frontend_ip_configuration_name = "ingress-ipv6"
disable_outbound_snat = true

protocol = "Tcp"
frontend_port = 443
backend_port = 443
backend_address_pool_ids = [azurerm_lb_backend_address_pool.worker-ipv6.id]
probe_id = azurerm_lb_probe.ingress.id
}

# Backend Address Pools

# Address pool of controllers
resource "azurerm_lb_backend_address_pool" "controller" {
name = "controller"
loadbalancer_id = azurerm_lb.cluster.id
}

# Address pool of workers
resource "azurerm_lb_backend_address_pool" "worker" {
name = "worker"
resource "azurerm_lb_backend_address_pool" "worker-ipv4" {
name = "worker-ipv4"
loadbalancer_id = azurerm_lb.cluster.id
}

resource "azurerm_lb_backend_address_pool" "worker-ipv6" {
name = "worker-ipv6"
loadbalancer_id = azurerm_lb.cluster.id
}

@@ -122,10 +157,8 @@ resource "azurerm_lb_probe" "apiserver" {
loadbalancer_id = azurerm_lb.cluster.id
protocol = "Tcp"
port = 6443

# unhealthy threshold
number_of_probes = 3

number_of_probes = 3
interval_in_seconds = 5
}

@@ -136,10 +169,29 @@ resource "azurerm_lb_probe" "ingress" {
protocol = "Http"
port = 10254
request_path = "/healthz"

# unhealthy threshold
number_of_probes = 3

number_of_probes = 3
interval_in_seconds = 5
}

# Outbound SNAT

resource "azurerm_lb_outbound_rule" "outbound-ipv4" {
name = "outbound-ipv4"
protocol = "All"
loadbalancer_id = azurerm_lb.cluster.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker-ipv4.id
frontend_ip_configuration {
name = "ingress-ipv4"
}
}

resource "azurerm_lb_outbound_rule" "outbound-ipv6" {
name = "outbound-ipv6"
protocol = "All"
loadbalancer_id = azurerm_lb.cluster.id
backend_address_pool_id = azurerm_lb_backend_address_pool.worker-ipv6.id
frontend_ip_configuration {
name = "ingress-ipv6"
}
}
6 changes: 6 additions & 0 deletions azure/fedora-coreos/kubernetes/locals.tf
@@ -0,0 +1,6 @@
locals {
backend_address_pool_ids = {
ipv4 = [azurerm_lb_backend_address_pool.worker-ipv4.id]
ipv6 = [azurerm_lb_backend_address_pool.worker-ipv6.id]
}
}
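
To show why the pool ids are grouped by address family, here is a hedged sketch of how a worker scale set might consume such an object. It is not the worker module from this commit: the variable names, the VM size, the instance count, and the disk settings are assumptions. The point is that each `ip_configuration` joins only the backend pool of its own IP version.

```hcl
# Hypothetical inputs for the sketch
variable "resource_group_name" {
  type = string
}

variable "region" {
  type = string
}

variable "subnet_id" {
  type = string
}

variable "os_image" {
  type = string
}

variable "ssh_public_key" {
  type = string
}

# Pool ids split by family, mirroring the shape of the local above
variable "backend_address_pool_ids" {
  type = object({
    ipv4 = list(string)
    ipv6 = list(string)
  })
}

resource "azurerm_linux_virtual_machine_scale_set" "workers" {
  name                = "example-workers"
  resource_group_name = var.resource_group_name
  location            = var.region
  sku                 = "Standard_D2as_v5"
  instances           = 2
  source_image_id     = var.os_image

  admin_username = "core"
  admin_ssh_key {
    username   = "core"
    public_key = var.ssh_public_key
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  network_interface {
    name    = "nic0"
    primary = true

    # IPv4 configuration joins only the IPv4 backend pool
    ip_configuration {
      name                                   = "ipv4"
      version                                = "IPv4"
      primary                                = true
      subnet_id                              = var.subnet_id
      load_balancer_backend_address_pool_ids = var.backend_address_pool_ids.ipv4
    }

    # IPv6 configuration joins only the IPv6 backend pool
    ip_configuration {
      name                                   = "ipv6"
      version                                = "IPv6"
      subnet_id                              = var.subnet_id
      load_balancer_backend_address_pool_ids = var.backend_address_pool_ids.ipv6
    }
  }
}
```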