Fix k8s_drain runs into timeout with pods from stateful sets. #793

Merged

Conversation

OttaviaB
Contributor

SUMMARY

Fixes #792.

The function wait_for_pod_deletion in k8s_drain never checks which node a pod is actually running on:

            try:
                response = self._api_instance.read_namespaced_pod(
                    namespace=pod[0], name=pod[1]
                )
                if not response:
                    pod = None
                time.sleep(wait_sleep)

As a result, if a pod is successfully evicted and then recreated with the same name on a different node, k8s_drain does not notice and assumes the original pod is still running, so the drain eventually times out. This happens with pods that belong to a StatefulSet, which keep a stable name when they are rescheduled.
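To illustrate the idea, here is a minimal sketch of a node-aware check. It is not the merged diff: the helper name `pod_still_on_node` and the `FakeApi` stand-in are hypothetical, and the sketch assumes the pod object exposes its node as `spec.node_name`, as the Kubernetes Python client's pod model does. The real client typically raises an exception when a pod is missing instead of returning an empty response, so a production version would also need to handle that case.

```python
from types import SimpleNamespace


def pod_still_on_node(api, namespace, name, node_name):
    """Return True only if the named pod exists and is scheduled on node_name."""
    response = api.read_namespaced_pod(namespace=namespace, name=name)
    if not response:
        return False
    # A StatefulSet pod recreated elsewhere keeps its name but reports
    # a different node, so a name match alone is not enough.
    return response.spec.node_name == node_name


class FakeApi:
    """Hypothetical stand-in for the API client, used only for this example."""

    def __init__(self, node_by_pod):
        # Maps (namespace, pod name) to the node the pod is scheduled on.
        self._node_by_pod = node_by_pod

    def read_namespaced_pod(self, namespace, name):
        node = self._node_by_pod.get((namespace, name))
        if node is None:
            return None
        return SimpleNamespace(spec=SimpleNamespace(node_name=node))


# "web-0" was evicted from node-a and recreated on node-b.
api = FakeApi({("default", "web-0"): "node-b"})
print(pod_still_on_node(api, "default", "web-0", "node-a"))  # False
print(pod_still_on_node(api, "default", "web-0", "node-b"))  # True
```

With a check like this, the wait loop can treat a pod that now runs on another node as gone from the drained node, instead of waiting until the timeout.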

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

k8s_drain

Build succeeded (gate pipeline).
https://ansible.softwarefactory-project.io/zuul/buildset/8d925faefaa1449c9ec645f9dc0b3358

✔️ ansible-galaxy-importer SUCCESS in 4m 14s
✔️ build-ansible-collection SUCCESS in 5m 42s

@softwarefactory-project-zuul softwarefactory-project-zuul bot merged commit fca0dc0 into ansible-collections:main Dec 10, 2024
59 checks passed

patchback bot commented Dec 10, 2024

Backport to stable-3: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-3/fca0dc0485bf3748b61ac547957617e1e66573be/pr-793

Backported as #807

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot pushed a commit that referenced this pull request Dec 10, 2024
Reviewed-by: Mike Graves <[email protected]>
(cherry picked from commit fca0dc0)

patchback bot commented Dec 10, 2024

Backport to stable-5: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-5/fca0dc0485bf3748b61ac547957617e1e66573be/pr-793

Backported as #808


patchback bot pushed a commit that referenced this pull request Dec 10, 2024
Reviewed-by: Mike Graves <[email protected]>
(cherry picked from commit fca0dc0)
gravesm pushed a commit that referenced this pull request Dec 11, 2024
Reviewed-by: Mike Graves <[email protected]>
(cherry picked from commit fca0dc0)
gravesm pushed a commit that referenced this pull request Dec 11, 2024
Reviewed-by: Mike Graves <[email protected]>
(cherry picked from commit fca0dc0)
softwarefactory-project-zuul bot pushed a commit that referenced this pull request Dec 11, 2024
…807)

This is a backport of PR #793 as merged into main (fca0dc0).
softwarefactory-project-zuul bot pushed a commit that referenced this pull request Dec 11, 2024
…808)

This is a backport of PR #793 as merged into main (fca0dc0).
Development

Successfully merging this pull request may close these issues.

k8s_drain runs into a timeout when evicting a pod which is part of a stateful set