Talos 1.8.3 advertising virtual MAC addresses #9837

Open
dbackeus opened this issue Nov 29, 2024 · 3 comments

@dbackeus
Contributor

dbackeus commented Nov 29, 2024

Bug Report

After upgrading some of our nodes from Talos 1.8.2 to 1.8.3 we've received MAC address abuse reports from Hetzner (we're deploying on Hetzner dedicated servers).

The reports state:

We have detected that your server is using different MAC addresses from those allowed by your Robot account.

They then list the unallowed MAC addresses seen, along with a link to trigger a re-check after resolving the issue and a link for submitting a statement explaining why the MAC abuse occurred.

We've been running this Talos cluster on Hetzner for about two years and have only gotten reports like this in the last week after upgrading to Talos 1.8.3. So this upgrade is the only discrepancy we have to go on right now.

Description

Timeline:

  • Sunday 24th November 20:07 - Upgraded control-plane-1
  • Monday 25th November 02:05 - Received abuse report for control-plane-1 from Hetzner
   Allowed MACs:
       a8:a1:59:xx:xx:xx
   Unallowed MACs:
       00:12:40:11:63:e1
       20:00:40:11:43:db
       20:06:40:11:43:d5
       20:0c:40:11:43:cf
  • Tuesday 26th November 22:44 - Upgraded control-plane-2
  • Wednesday 27th November 02:07 - Received abuse report for control-plane-2 from Hetzner
   Allowed MACs:
       a8:a1:59:xx:xx:xx
   Unallowed MACs:
       00:12:40:11:99:0d
       20:00:40:11:79:07
       20:06:40:11:79:01
       20:0c:40:11:78:fb
  • Thursday 28th November 14:39 - Upgraded mixed-2
  • Thursday 28th November 15:12 - Received abuse report for mixed-2 from Hetzner
   Allowed MACs:
       6c:fe:54:xx:xx:xx
       d0:46:0c:xx:xx:xx
   Unallowed MACs:
       00:18:40:11:14:f0
       20:00:40:11:f4:ef
       20:06:40:11:f4:e9
       20:0c:40:11:f4:e3
       20:12:40:11:f4:dd
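
One quick check on the reported addresses is to decode the two flag bits in the first octet: the individual/group bit (0x01) and the universal/local bit (0x02). The following is a minimal sketch only; `describe_mac` is a hypothetical helper name, and the result says nothing about where the frames actually came from.

```python
def describe_mac(mac: str) -> dict:
    """Decode the flag bits carried in the first octet of a MAC address."""
    first_octet = int(mac.split(":")[0], 16)
    return {
        "mac": mac,
        # Bit 0x01: set for multicast/group addresses, clear for unicast.
        "multicast": bool(first_octet & 0x01),
        # Bit 0x02: set for locally administered addresses.
        "locally_administered": bool(first_octet & 0x02),
    }


# Addresses copied from the first abuse report in the timeline.
reported = [
    "00:12:40:11:63:e1",
    "20:00:40:11:43:db",
    "20:06:40:11:43:d5",
    "20:0c:40:11:43:cf",
]

for mac in reported:
    print(describe_mac(mac))
```

For the addresses in the reports above, both bits are clear (the first octets are 0x00 and 0x20). Randomly generated addresses on Linux virtual devices such as veth normally have the locally administered bit set, so this may be one more hint that the reported addresses are not veth addresses.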

Our network config is extremely simple. We run the default flannel CNI, get a public IP from Hetzner via DHCP, and enable KubeSpan, e.g.:

  network:
    hostname: mixed-2
    interfaces:
      - interface: enp1s0f0
        dhcp: true
    kubespan:
      enabled: true

We've inspected the Talos network via talosctl get links, which shows a lot of veth devices, but none of them match the MAC addresses reported by Hetzner. We assume the veth devices are intended to stay internal to the cluster network and not be advertised on the physical network.

While our "mixed" worker is running all kinds of workloads, the 2 control plane nodes are tainted as control planes and are not running anything out of the ordinary.

When clicking the "re-check" link, the report we get back says that the issue has been resolved, so these issues appear to have been transient. It's unclear whether they can occur again, e.g. when rebooting the nodes. We don't know how to reproduce this, or even how to monitor whether unallowed MAC addresses are being advertised.
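
One possible way to monitor this is to take a packet capture on the physical interface and count the Ethernet source MACs in it. The sketch below is only an illustration under a few assumptions: the capture is a classic pcap file (not pcapng) with Ethernet link type, and `source_macs` and `unexpected_macs` are hypothetical helper names. How the capture is obtained is left open. Source MACs of received frames (such as the gateway's) will also show up, so either capture outbound traffic only or add those addresses to the allow list.

```python
import struct
from collections import Counter

# First four bytes of a classic pcap file, mapped to the struct byte-order prefix.
_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}
LINKTYPE_ETHERNET = 1


def source_macs(pcap: bytes) -> Counter:
    """Count Ethernet source MACs in a classic pcap capture (not pcapng)."""
    order = _MAGICS.get(pcap[:4])
    if order is None or len(pcap) < 24:
        raise ValueError("not a classic pcap file")
    # The link type is the last 4-byte field of the 24-byte global header.
    (linktype,) = struct.unpack_from(order + "I", pcap, 20)
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError(f"unsupported link type {linktype}")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(pcap):
        # Per-packet header: ts_sec, ts_frac, captured length, original length.
        _sec, _frac, incl_len, _orig_len = struct.unpack_from(
            order + "IIII", pcap, offset
        )
        offset += 16
        frame = pcap[offset:offset + incl_len]
        offset += incl_len
        if len(frame) >= 12:
            # Ethernet header: bytes 0-5 destination, bytes 6-11 source.
            counts[frame[6:12].hex(":")] += 1
    return counts


def unexpected_macs(counts: Counter, allowed: set) -> dict:
    """Return the source MACs (with frame counts) that are not in the allow list."""
    allowed_lower = {mac.lower() for mac in allowed}
    return {mac: n for mac, n in counts.items() if mac not in allowed_lower}
```

To use it, read the capture with `open(path, "rb").read()`, pass the bytes to `source_macs`, and pass the server's full allowed MAC addresses from Robot to `unexpected_macs`. An empty result would mean no unallowed source MACs were seen during the capture window; since the reports appear to be transient, a short capture may well miss the offending frames.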

We don't mind spending time further troubleshooting this if someone can guide us in what to do.

For now we'll send a statement to Hetzner about the little we know, including a link to this issue, and hold off upgrading any other nodes for the time being.

Environment

  • Talos version: 1.8.3
  • Kubernetes version: v1.30.6
  • Platform: Hetzner Dedicated Servers
@smira
Member

smira commented Nov 29, 2024

First, please check MetalLB or anything else you're running on your host network.

Talos only does forced advertisement for VIPs, but they are advertised with the MAC address of the link.

@dbackeus
Contributor Author

dbackeus commented Dec 3, 2024

The pods running with hostNetwork: true are...

Managed by Talos:

kube-system/kube-apiserver-control-plane
kube-system/kube-controller-manager-control-plane
kube-system/kube-scheduler-control-plane
kube-system/kube-flannel
kube-system/kube-proxy

Added by us:

openebs/openebs-ndm

Via: openebs-dynamic-localpv-provisioner

monitoring/monitoring-prometheus-node-exporter
monitoring/metrics-proxy

Via: kube-prometheus-stack

logging/vector

For ingesting system logs into Loki, as suggested by Talos documentation here: https://www.talos.dev/v1.8/talos-guides/configuration/logging/#vector-example

We are not using MetalLB as we are relying on Cloudflare tunnels for HTTP ingress, and NodePort for a pair of Postgres databases.

Note that all of this has been running in our cluster since day one, and appears to run fine on Talos 1.8.2 without triggering any MAC abuse reports.
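
A list like the one above can be rebuilt from the JSON form of the cluster's pod list by filtering on `spec.hostNetwork`. This is a minimal sketch, not necessarily how the list above was produced; `host_network_pods` is a hypothetical helper name and the pod names in the sample are made up.

```python
import json


def host_network_pods(pod_list: dict) -> list:
    """Return sorted "namespace/name" strings for pods with spec.hostNetwork set."""
    return sorted(
        f'{pod["metadata"]["namespace"]}/{pod["metadata"]["name"]}'
        for pod in pod_list.get("items", [])
        if pod.get("spec", {}).get("hostNetwork", False)
    )


# Tiny hand-written sample in the shape of a Kubernetes PodList (illustrative only).
sample = json.loads("""
{"items": [
  {"metadata": {"namespace": "kube-system", "name": "kube-proxy-abcde"},
   "spec": {"hostNetwork": true}},
  {"metadata": {"namespace": "default", "name": "web-0"},
   "spec": {}}
]}
""")

print(host_network_pods(sample))
```

Against a real cluster, the input would be the parsed output of `kubectl get pods --all-namespaces -o json`, for example `host_network_pods(json.load(sys.stdin))` with the command piped in.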

@dbackeus
Contributor Author

dbackeus commented Dec 3, 2024

As we got another round of abuse reports for these servers, as well as one additional worker node which had also been upgraded to Talos 1.8.3, we have now downgraded all nodes to 1.8.2 to see if this prevents more reports from triggering.
