
Routed Pod gateway-init container fails to resolve gw host #29

Open
NoeSamaille opened this issue May 23, 2023 · 2 comments
@NoeSamaille

What steps did you take and what happened:

Hi there! I have deployed the pod gateway and admission controller using your Helm chart with the following values:

image:
  repository: ghcr.io/angelnu/pod-gateway
  # I am using dev version for testing - others should be using latest
  tag: v1.8.1
DNSPolicy: ClusterFirst
webhook:
  image:
    repository: ghcr.io/angelnu/gateway-admision-controller
    # Use dev version
    pullPolicy: Always
    tag: v3.9.0
  namespaceSelector:
    type: label
    label: routed-gateway
  gatewayDefault: true
  gatewayLabel: setGateway
addons:
  vpn:
    enabled: true
    type: gluetun
    gluetun:
      image:
        repository: docker.io/qmcgaw/gluetun
        tag: latest
    env:
    - name:  VPN_SERVICE_PROVIDER
      value: custom
    - name:  VPN_TYPE
      value: wireguard
    - name:  VPN_INTERFACE
      value: wg0
    - name:  FIREWALL
      value: "off"
    - name:  DOT
      value: "off"
    - name: DNS_KEEP_NAMESERVER
      value: "on"
    
    envFrom:
      - secretRef:
          name: wireguard-config

    livenessProbe:
      exec:
        command:
          - sh
          - -c
          - if [ $(wget -q -O- https://ipinfo.io/country) == 'NL' ]; then exit 0; else exit $?; fi
      initialDelaySeconds: 30
      periodSeconds: 60
      failureThreshold: 3

    networkPolicy:
      enabled: true

      egress:
        - to:
          - ipBlock:
              cidr: 0.0.0.0/0
          ports:
          # VPN traffic
          - port: 51820
            protocol: UDP
        - to:
          - ipBlock:
              cidr: 10.0.0.0/8

settings:
  # -- If using a VPN, interface name created by it
  VPN_INTERFACE: wg0
  # -- Prevent non VPN traffic to leave the gateway
  VPN_BLOCK_OTHER_TRAFFIC: true
  # -- If VPN_BLOCK_OTHER_TRAFFIC is true, allow VPN traffic over this port
  VPN_TRAFFIC_PORT: 51820
  # -- Traffic to these IPs will be sent through the K8S gateway
  VPN_LOCAL_CIDRS: "10.0.0.0/8 192.168.0.0/16"

# -- settings to expose ports, usually through a VPN provider.
# NOTE: if you change it you will need to manually restart the gateway POD
publicPorts:
- hostname: transmission-client.media-center
  IP: 10
  ports:
  - type: udp
    port: 51413
  - type: tcp
    port: 51413

So far so good: the pod gateway and admission controller are up and running in my vpn-gateway namespace, with the WireGuard VPN client working on the pod gateway. Now I am trying to actually route a pod in my routed media-center namespace.

These are the logs of the gateway-init container of my transmission-client pod:

❯ kubectl logs transmission-client-7bbc685b44-xkk6h -n media-center -c gateway-init

...

++ dig +short vpn-gateway-pod-gateway.vpn-gateway.svc.cluster.local @10.43.0.10
+ GATEWAY_IP=';; connection timed out; no servers could be reached'

It looks like the init container is unable to resolve vpn-gateway-pod-gateway.vpn-gateway.svc.cluster.local, but cluster-local DNS itself works fine: I ran the same pod in a non-routed namespace, exec'd into it, and nslookup resolved the name without issue:

root@transmission-client-7bbc685b44-m8l4r:/# nslookup vpn-gateway-pod-gateway.vpn-gateway.svc.cluster.local 10.43.0.10
Server:         10.43.0.10
Address:        10.43.0.10:53


Name:   vpn-gateway-pod-gateway.vpn-gateway.svc.cluster.local
Address: 10.42.2.61

Any idea what could be causing this behavior?

What did you expect to happen:

I was expecting the routed pod's gateway to be updated successfully and the pod to start up.

Anything else you would like to add:

Any help is appreciated. There is probably something I'm missing here; happy to provide more information to debug this, thanks :)

@NoeSamaille (Author)

Found the issue: the service IP range of my cluster is 10.43.0.0/16 and my pod IP range is 10.42.0.0/16, but only the latter is specifically mentioned in my pod's routing table:

$ ip route
default via 10.42.1.1 dev eth0 
10.42.0.0/16 via 10.42.1.1 dev eth0 
10.42.1.0/24 dev eth0 proto kernel scope link src 10.42.1.24 

This means that when the client_init.sh script deletes the existing default gateway, the pod can no longer reach the DNS server.
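
To illustrate, here is a hypothetical session (not captured from the pod, just based on the routes shown above):

$ ip route del default
$ ip route get 10.43.0.10
RTNETLINK answers: Network is unreachable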

As a workaround, I manually configured my routed deployment as follows, with a gateway-preinit initContainer that adds a route for 10.43.0.0/16:

...

      initContainers:
      - command: ["/bin/sh","-c"]
        args: ["ip route add 10.43.0.0/16 via 10.42.1.1 dev eth0"]
        image: ghcr.io/angelnu/pod-gateway:v1.8.1
        imagePullPolicy: IfNotPresent
        name: gateway-preinit
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
          runAsNonRoot: false
          runAsUser: 0

That way, after the default gateway is deleted, the pod can still reach the service IP range and therefore the K8s internal DNS server.

@angelnu I'm sure there is a way to get a clean fix by slightly updating the client_init.sh script, e.g. by adding that route (ip route add ${K8S_DNS_IP}/16 via ${K8S_DEFAULT_GW} dev eth0) before removing the default GW. Happy to discuss/contribute.
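
To make the idea concrete, here is a rough, untested sketch (this is not the actual client_init.sh; how K8S_DNS_IP and K8S_DEFAULT_GW are derived here is only illustrative, and the /16 is hard-coded to match this cluster's service CIDR):

# Untested sketch of the proposed change, not the real client_init.sh
K8S_DNS_IP="$(awk '/^nameserver/ {print $2; exit}' /etc/resolv.conf)"   # e.g. 10.43.0.10
K8S_DEFAULT_GW="$(ip route | awk '/^default/ {print $3; exit}')"        # e.g. 10.42.1.1

# Assumes a /16 service CIDR: 10.43.0.10 -> 10.43.0.0/16
SERVICE_CIDR="$(echo "${K8S_DNS_IP}" | cut -d. -f1-2).0.0/16"

# Keep the service IP range reachable so cluster DNS still answers
# once the default route is removed
ip route add "${SERVICE_CIDR}" via "${K8S_DEFAULT_GW}" dev eth0

# ... then delete the default gateway as client_init.sh already does
ip route del default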

I'm not sure whether this is due to my K8s topology, which is pretty standard: K3s running K8s v1.26 with the default Flannel CNI.

@TheAceMan

I don't have your exact setup, but have you looked at using NOT_ROUTED_TO_GATEWAY_CIDRS? Those CIDRs are added to the routing table in the way you outline, so it may help.
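
For example, a sketch only (assuming NOT_ROUTED_TO_GATEWAY_CIDRS sits alongside the other chart settings, with the 10.43.0.0/16 service CIDR taken from your cluster):

settings:
  VPN_INTERFACE: wg0
  VPN_BLOCK_OTHER_TRAFFIC: true
  VPN_TRAFFIC_PORT: 51820
  VPN_LOCAL_CIDRS: "10.0.0.0/8 192.168.0.0/16"
  # -- Traffic to these CIDRs keeps using the pod's original route, so the
  #    service range (and therefore cluster DNS) stays reachable
  NOT_ROUTED_TO_GATEWAY_CIDRS: "10.43.0.0/16"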
