NodeLocal DNS container hung on SIGTERM #394

Closed
rtheis opened this issue Aug 10, 2020 · 26 comments

rtheis commented Aug 10, 2020

We've seen the NodeLocal DNS container hang when receiving a SIGTERM. This causes the container to eventually be terminated via SIGKILL, thus leaving iptables rules behind. Here is an example of the container logs captured during this hang.

2020/08/10 13:36:53 [INFO] Using Corefile /etc/Corefile
2020/08/10 13:36:53 [INFO] Updated Corefile with 0 custom stubdomains and upstream servers /etc/resolv.conf
2020/08/10 13:36:53 [INFO] Using config file:
cluster.local:53 {
    errors
    cache {
            success 9984 30
            denial 9984 5
    }
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    health 169.254.20.10:8080
    }
in-addr.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    }
ip6.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    }
.:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . /etc/resolv.conf {
            force_tcp
    }
    prometheus :9253
    }
2020/08/10 13:36:53 [INFO] Updated Corefile with 0 custom stubdomains and upstream servers /etc/resolv.conf
2020/08/10 13:36:53 [INFO] Using config file:
cluster.local:53 {
    errors
    cache {
            success 9984 30
            denial 9984 5
    }
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    health 169.254.20.10:8080
    }
in-addr.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    }
ip6.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    }
.:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . /etc/resolv.conf {
            force_tcp
    }
    prometheus :9253
    }
2020/08/10 13:36:53 [INFO] Added interface - nodelocaldns
in-addr.arpa.:53 on 169.254.20.10
ip6.arpa.:53 on 169.254.20.10
.:53 on 169.254.20.10
cluster.local.:53 on 169.254.20.10
ip6.arpa.:53 on 172.21.0.10
.:53 on 172.21.0.10
cluster.local.:53 on 172.21.0.10
in-addr.arpa.:53 on 172.21.0.10
[INFO] plugin/reload: Running configuration MD5 = e86dda246de588881b4e3ac178781999
CoreDNS-1.6.7
linux/amd64, go1.11.13, 
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p tcp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p udp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p tcp -d 169.254.20.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p udp -d 169.254.20.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.20.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -s 169.254.20.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p tcp -s 169.254.20.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p udp -s 169.254.20.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.20.10 --dport 8080 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.20.10 --sport 8080 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p tcp -d 172.21.0.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p udp -d 172.21.0.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p tcp -d 172.21.0.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p udp -d 172.21.0.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 172.21.0.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -s 172.21.0.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p tcp -s 172.21.0.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p udp -s 172.21.0.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 172.21.0.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -d 172.21.0.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 172.21.0.10 --dport 8080 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 172.21.0.10 --sport 8080 -j NOTRACK]}
[INFO] Updated Corefile with 0 custom stubdomains and upstream servers 8.8.8.8 8.8.4.4
[INFO] Using config file:
cluster.local:53 {
    errors
    cache {
            success 9984 30
            denial 9984 5
    }
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    health 169.254.20.10:8080
    }
in-addr.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    }
ip6.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 172.21.122.96 {
            force_tcp
    }
    prometheus :9253
    }
.:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.21.0.10
    forward . 8.8.8.8 8.8.4.4
    prometheus :9253
    }
[INFO] Reloading
[INFO] SIGTERM: Shutting down servers then terminating

I believe the hang is occurring when SIGTERM is received during a reload of the config.

prameshj commented Aug 18, 2020

What image version is this on?

rtheis commented Aug 19, 2020

Image version 1.15.13

@prameshj

How are you updating the nodelocaldns config file? Is it by editing the configmap/Corefile directly?

I tried updating the kube-dns configmap (which triggers an update to the Corefile to pick up the new upstream servers), and that reloads fine. I also sent SIGTERM via Docker to the node-local-dns pod, and that was handled fine too. I tried with the 1.15.13 image as well.

docker kill --signal=SIGTERM <podID>

Logs:

[INFO] SIGTERM: Shutting down servers then terminating
[INFO] Tearing down
[WARNING] Exiting iptables/interface check goroutine

Do the rules get cleaned up when deleting the pod via kubectl? What env is this being run on?

rtheis commented Aug 19, 2020

@prameshj I updated the configmap via kubectl apply. Both the config update and SIGTERM are handled fine by themselves. This appears to be a timing bug where a SIGTERM received during the config update causes the hang.

@prameshj

Is there a way to reproduce it?

rtheis commented Aug 19, 2020

Yes, but you'll likely have to run the test many times. Basically, update the configmap and terminate the pod at about the same time.
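
A rough sketch of that sequence (the namespace, label, and manifest file name here are assumptions; adjust for your install):

# Apply a ConfigMap change and delete the pod at roughly the same time, then
# watch whether the old pod hangs in Terminating until the kubelet sends SIGKILL.
kubectl -n kube-system apply -f node-local-dns-configmap.yaml &
kubectl -n kube-system delete pod -l k8s-app=node-local-dns --wait=false &
wait
kubectl -n kube-system get pods -l k8s-app=node-local-dns -o wide -w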

@prameshj

OK, a few more questions:

How is it manifesting in your environment? Is the config being changed frequently or via an automated mechanism?
Are any other pods running in hostNetwork mode that could be modifying iptables rules? It could be contention for the iptables lock.
Why does the pod get killed with SIGKILL? Is that manual?
Does it happen on all node-local-dns pods in the cluster, or just on one node at random?

rtheis commented Aug 20, 2020

@prameshj We see the failure in our automated testing. We run Calico, which uses hostNetwork and modifies iptables rules. The SIGKILL happens because the Kubernetes termination grace period expires (the default is 30 seconds; we raised it to 15 minutes and still see the hang). The hang normally happens on only one node during the test.
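
For reference, a minimal sketch of raising the grace period on the DaemonSet (the DaemonSet name and namespace are assumptions):

# Hypothetical patch setting the termination grace period to 15 minutes.
kubectl -n kube-system patch daemonset node-local-dns --type=merge \
  -p '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":900}}}}'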

rtheis commented Aug 27, 2020

Update: This may not be related to the config reload. I've seen a couple of instances of this problem occur when there was no reload in progress. It seems to me that this hang may be related to iptables lock contention rather than the config reload.

@prameshj

It seems flannel-io/flannel#988 might be related. Is there any "resource unavailable" error in any of the hostNetwork pods or in journalctl/syslog on the node?
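
A generic way to look for such errors on the node (illustrative commands, not specific to this report; the pod name is a placeholder):

# Search the node journal and a hostNetwork pod's logs for the error string.
journalctl --since "1 hour ago" | grep -i "resource temporarily unavailable"
kubectl -n kube-system logs <some-hostNetwork-pod> | grep -i "resource temporarily unavailable"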

rtheis commented Aug 28, 2020

@prameshj I'll check the next time I see the problem.

rtheis commented Sep 10, 2020

@prameshj I didn't see any "resource unavailable" errors during the latest recreate of this problem.

raul-te commented Oct 23, 2020

I'm experiencing a similar issue. I can confirm that this does not happen on config reloads. From a Kubernetes point of view, the pod is marked as Completed and it does not get recreated:

Status:               Running
Controlled By:  DaemonSet/node-local-dns
Containers:
  node-cache:
    Container ID:  docker://99dcfc63ff748bb5e9aacfa415d70a3754523a6ddd6d8ac67f923fc0c5f169c0
    Image:         k8s.gcr.io/dns/k8s-dns-node-cache:1.15.14
    Image ID:      docker-pullable://k8s.gcr.io/dns/k8s-dns-node-cache@sha256:140f63dddfcdeb898cd3ae4a505988bdca29f79cd43d08f506936db1c94cadab
    Ports:         53/UDP, 53/TCP, 9253/TCP
    Host Ports:    53/UDP, 53/TCP, 9253/TCP
    Args:
      -localip
      169.254.20.10
      -conf
      /etc/Corefile
      -upstreamsvc
      coredns
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 07 Oct 2020 17:24:10 -0700
      Finished:     Sun, 18 Oct 2020 17:00:08 -0700
    Ready:          False

You can see that it finished on Oct 18th, and my last config change was on the 13th.

kube-system   node-local-dns-t8xs8                                                 0/1     Completed           0          14d
 docker ps | grep node-cache
99dcfc63ff74        d57f36ff0fbd                                                                         "/node-cache -locali…"   2 weeks ago         Up 2 weeks                              k8s_node-cache_node-local-dns-t8xs8_kube-system_1f62dea8-2ee2-41a9-ab4f-221e9fdc21bb_0

Extract from docker inspect:

       "State": {
            "Status": "exited",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 0,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2020-10-08T00:24:10.190189725Z",
            "FinishedAt": "2020-10-19T00:00:08.063288024Z"
        },

It's interesting that Docker says the container's been up for 2 weeks, but it also says the status is exited.

docker logs 99dcfc63ff74 --tail 15
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.20.10 --sport 8080 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p tcp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p udp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p tcp -d 169.254.20.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p udp -d 169.254.20.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.20.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -s 169.254.20.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p tcp -s 169.254.20.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p udp -s 169.254.20.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -d 169.254.20.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.20.10 --dport 8080 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.20.10 --sport 8080 -j NOTRACK]}
[INFO] Error checking/adding iptables rule {raw PREROUTING [-p tcp -d 169.254.20.10 --dport 53 -j NOTRACK]}, due to xtables lock in use, retrying in 60ns
[INFO] SIGTERM: Shutting down servers then terminating

The logs don't contain a timestamp, but I checked on Kibana and managed to extract them:

October 16th 2020, 19:05:10.295  [INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.20.10 --sport 8080 -j NOTRACK]}
October 18th 2020, 14:11:12.259  [INFO] Error checking/adding iptables rule {raw PREROUTING [-p tcp -d 169.254.20.10 --dport 53 -j NOTRACK]}, due to xtables lock in use, retrying in 60ns
October 18th 2020, 17:00:07.627  [INFO] SIGTERM: Shutting down servers then terminating

Based on the time difference, I think those messages regarding the xtables lock being in use are unrelated.
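
If xtables lock contention were the cause, a couple of generic checks on the node around that time could show who holds the lock (assuming the default lock path /run/xtables.lock):

# Point-in-time check: list processes holding the xtables lock file and any
# iptables invocations currently in flight.
ls -l /run/xtables.lock
fuser -v /run/xtables.lock
ps aux | grep -E '[i]p6?tables'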

At this point it's unclear to me what's sending the SIGTERM.

This server is running on AWS, but not EKS. I'm running k8s 1.16.13. I don't see any "resource unavailable" logs.

I have this node cordoned, with the pod in the Completed state. @prameshj please let me know if you want me to get any extra information before I restart the pod.

raul-te commented Oct 23, 2020

I've found out that the node became NotReady shortly before the node-local-dns-cache container received the SIGTERM. This happened on both nodes where we experienced this issue.

It's interesting to note that the pod that received the SIGTERM has the node.kubernetes.io/not-ready toleration, given that it belongs to a DaemonSet.
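
For reference, the tolerations on the DaemonSet pod template can be confirmed with something like this (the DaemonSet name and namespace are assumptions):

# Print the tolerations on the node-local-dns pod template.
kubectl -n kube-system get daemonset node-local-dns \
  -o jsonpath='{.spec.template.spec.tolerations}'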

@prameshj

That's interesting. Did the node become Ready again? Is the node-local-dns pod still stuck while terminating?
If node-local-dns got SIGKILL'ed, it should have started up again on the same node and taken care of cleaning up the extra iptables rules?
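
A generic way to check for leftover rules on the node, using the addresses that appear in the logs earlier in this thread, could be:

# List any nodelocaldns-related rules still present (substitute the node-local
# and cluster DNS IPs used in your deployment).
iptables-save | grep -E '169\.254\.20\.10|172\.21\.0\.10'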

raul-te commented Oct 26, 2020

Yes, the node became Ready again and all the pods that needed DNS were in CrashLoopBackOff. As shown in the logs I shared, the container was not running (there was no process for it), but according to docker ps the container had been up for 2 weeks.

There's no trace of any SIGKILL in my case; only the SIGTERM I shared.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jan 24, 2021

rtheis commented Jan 25, 2021

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Jan 25, 2021
@prameshj

> Yes, the node became Ready again and all the pods that needed DNS were in CrashLoopBackOff. As shown in the logs I shared, the container was not running (there was no process for it), but according to docker ps the container had been up for 2 weeks.

I realize this is from several weeks back, but any idea why they were in CrashLoopBackOff? Was it because port 53 was in use? Is there a reliable way to repro this?
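
A generic check for whether something else is bound to port 53 on the node might be:

# Show TCP/UDP listeners on port 53 with owning processes (may need root for -p).
ss -lntup | grep ':53 '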

raul-te commented Feb 12, 2021

Hey @prameshj! I have not seen this problem again. Port 53 was not in use.

Unfortunately, I'm not able to reproduce this condition.

prameshj commented Apr 1, 2021

Thanks, closing for now.

@prameshj closed this as completed Apr 1, 2021

rtheis commented Apr 1, 2021

We have not been able to recreate this problem on versions 1.16 and later. Thank you.

rtheis commented Apr 20, 2021

@prameshj Can you please reopen this issue? I have recreated the problem on version 1.17.3. Thank you.

@prameshj

Should we reopen this, or do you want to create a new issue? I see there are two different scenarios described in this issue. It might be cleaner to open a new one with the symptoms you are seeing.

rtheis commented Apr 30, 2021

@prameshj Thanks, I'll create a new issue the next time I recreate this on version 1.17.3.

rtheis commented Jun 4, 2021

#453
