Fix k8s_drain runs into timeout with pods from stateful sets. #793

OttaviaB · 2024-11-15T10:17:45Z

SUMMARY

Fixes #792.

The function wait_for_pod_deletion in k8s_drain never checks on which node a pod is actually running:

            try:
                response = self._api_instance.read_namespaced_pod(
                    namespace=pod[0], name=pod[1]
                )
                if not response:
                    pod = None
                time.sleep(wait_sleep)

As a result, when a pod is evicted successfully and then recreated under the same name on a different node, k8s_drain does not notice and assumes the original pod is still running, so it keeps waiting until it times out. This happens with pods that are part of a stateful set.
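For illustration, here is a minimal sketch of the kind of check that avoids this, assuming the drained node's name is available to the wait loop. It is not necessarily the exact change in this PR. The names `pod_left_node`, `wait_for_pods_to_leave` and `FakeApi` are hypothetical; `FakeApi` only stands in for the client object so the example runs without a cluster. A pod counts as done when it is gone or when its `spec.node_name` no longer matches the drained node.

```python
import time
from types import SimpleNamespace


def pod_left_node(api, namespace, name, node_name):
    """Return True once the pod is gone or is scheduled on another node."""
    try:
        response = api.read_namespaced_pod(namespace=namespace, name=name)
    except Exception as exc:
        # The kubernetes client reports a missing pod as an exception
        # carrying status 404; anything else is re-raised unchanged.
        if getattr(exc, "status", None) == 404:
            return True
        raise
    if not response:
        return True
    # A pod with the same name on a different node is a replacement,
    # not the pod that was evicted from the drained node.
    return response.spec.node_name != node_name


def wait_for_pods_to_leave(api, pods, node_name, wait_timeout=30, wait_sleep=1):
    """Poll until every (namespace, name) pair has left node_name.

    Returns the pods still on the node when the timeout expires.
    """
    deadline = time.monotonic() + wait_timeout
    pending = list(pods)
    while pending and time.monotonic() < deadline:
        pending = [
            p for p in pending if not pod_left_node(api, p[0], p[1], node_name)
        ]
        if pending:
            time.sleep(wait_sleep)
    return pending


class FakeApi:
    """Hypothetical stand-in for the API client, for demonstration only."""

    def __init__(self, pods):
        self._pods = pods  # maps (namespace, name) -> node name

    def read_namespaced_pod(self, namespace, name):
        node = self._pods.get((namespace, name))
        if node is None:
            return None
        return SimpleNamespace(spec=SimpleNamespace(node_name=node))
```

With a real cluster, `api` would be the module's existing API instance; only the comparison against the node name is new compared with the loop quoted above.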

ISSUE TYPE

Bugfix Pull Request

COMPONENT NAME

k8s_drain

softwarefactory-project-zuul · 2024-11-15T10:28:33Z

Build succeeded.
https://ansible.softwarefactory-project.io/zuul/buildset/48dadfbdfec14e67bb0acabf353906f8

✔️ ansible-galaxy-importer SUCCESS in 4m 29s
✔️ build-ansible-collection SUCCESS in 5m 25s

softwarefactory-project-zuul · 2024-11-15T10:41:51Z

Build succeeded.
https://ansible.softwarefactory-project.io/zuul/buildset/43701d13dbd2428294620bd0fe42e1fc

✔️ ansible-galaxy-importer SUCCESS in 5m 00s
✔️ build-ansible-collection SUCCESS in 5m 15s

softwarefactory-project-zuul · 2024-12-10T15:35:06Z

Build succeeded (gate pipeline).
https://ansible.softwarefactory-project.io/zuul/buildset/8d925faefaa1449c9ec645f9dc0b3358

✔️ ansible-galaxy-importer SUCCESS in 4m 14s
✔️ build-ansible-collection SUCCESS in 5m 42s

patchback · 2024-12-10T15:35:11Z

Backport to stable-3: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-3/fca0dc0485bf3748b61ac547957617e1e66573be/pr-793

Backported as #807

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Reviewed-by: Mike Graves <[email protected]>
(cherry picked from commit fca0dc0)

patchback · 2024-12-10T15:35:17Z

Backport to stable-5: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-5/fca0dc0485bf3748b61ac547957617e1e66573be/pr-793

Backported as #808

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Fix k8s_drain runs into timeout with pods from stateful sets.

a6ee6b6

OttaviaB force-pushed the main branch from 1c44b38 to a6ee6b6 Compare November 15, 2024 10:30

gravesm approved these changes Dec 10, 2024

gravesm added mergeit backport-3 backport-5 labels Dec 10, 2024

softwarefactory-project-zuul bot merged commit fca0dc0 into ansible-collections:main Dec 10, 2024
59 checks passed

patchback bot mentioned this pull request Dec 10, 2024

[PR #793/fca0dc04 backport][stable-3] Fix k8s_drain runs into timeout with pods from stateful sets. #807

Merged

patchback bot mentioned this pull request Dec 10, 2024

[PR #793/fca0dc04 backport][stable-5] Fix k8s_drain runs into timeout with pods from stateful sets. #808

Merged
