Zombie process overflow when pods don't reap exec probe processes #5675

Closed
1 task
thejan2009 opened this issue Jun 14, 2022 · 2 comments

@thejan2009

Environmental Info:
K3s Version:
v1.23.6+k3s1

Node(s) CPU architecture, OS, and Version:
ubuntu 20.04

Cluster Configuration:
single-node installation - 1 server, 1 agent

Describe the bug:
Some pod processes don't reap the child processes spawned by exec {liveness, readiness, startup}Probes, resulting in zombie processes that accumulate and eventually overload the server.

Steps To Reproduce:
See Enapter/charts#50. The chart in question has two exec probes. To reproduce, deploy the chart and observe the pod's child processes.
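
For reference, the failure mode can be sketched with a generic manifest along these lines (an illustrative stand-in, not the Enapter chart itself; the image and probe commands are placeholders):

```yaml
# Hypothetical repro sketch: a container whose PID 1 never calls wait(),
# plus exec probes whose commands leave behind a child they don't wait for.
# The orphaned probe children are reparented to the container's PID 1 and,
# once they exit, remain as zombies because nothing reaps them.
apiVersion: v1
kind: Pod
metadata:
  name: exec-probe-zombie-repro
spec:
  containers:
    - name: app
      image: busybox:1.36                 # placeholder image
      command: ["sleep", "1000000"]       # PID 1 that never reaps children
      livenessProbe:
        exec:
          # placeholder probe command that abandons a background child
          command: ["sh", "-c", "sleep 1 & exit 0"]
        periodSeconds: 5
      readinessProbe:
        exec:
          command: ["sh", "-c", "sleep 1 & exit 0"]
        periodSeconds: 5
```

With something like this deployed, `ps -eo stat,ppid,comm` on the node should show a growing number of `Z` (defunct) entries parented to the container's PID 1.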

Expected behavior:
Pod process or init reaps exec probe child processes.

Actual behavior:
The exec probe child processes are not reaped; they accumulate as zombie processes.

Additional context / logs:
I was also able to reproduce the same issue on k3d: I deployed a chart with an exec probe and ran `while true; do docker exec -it k3d-server-0 ps -eo ppid,comm | wc -l; sleep 5; done`. The number of processes increased continuously.

Backporting

  • Needs backporting to older releases
@brandond
Member

I believe that what you're describing is covered by upstream issues - which is to say this is not a problem with k3s.
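
A commonly used pod-level workaround for this class of problem (not specific to k3s, and only a sketch assuming the chart can be modified) is to enable `shareProcessNamespace`, so the pause container runs as PID 1 of the pod and reaps orphaned children:

```yaml
# Hypothetical workaround sketch, not taken from this thread: with a shared
# PID namespace, the pod's pause container becomes PID 1 and reaps zombies
# on behalf of all containers in the pod.
apiVersion: v1
kind: Pod
metadata:
  name: exec-probe-zombie-workaround
spec:
  shareProcessNamespace: true             # pause container reaps orphaned children
  containers:
    - name: app
      image: busybox:1.36                 # placeholder image
      command: ["sleep", "1000000"]
      livenessProbe:
        exec:
          command: ["sh", "-c", "sleep 1 & exit 0"]   # placeholder probe
        periodSeconds: 5
```

An alternative with a similar effect inside a single container is to run the entrypoint under a reaping init such as tini.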

@stale

stale bot commented Dec 12, 2022

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Dec 12, 2022
@stale stale bot closed this as completed Dec 26, 2022