[YUNIKORN-2637] finalizePods should ignore pods like registerPods #847
If a pod was in a terminal state during registration it is skipped. The same principle should apply to finalizePods:
* only check pods that were added in registerPods.
* remove finished pods in the finalizePods call if they were registered.
* new pods are not added in finalizePods
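The three rules above can be sketched roughly as follows. This is a simplified illustration, not the shim's actual code: `Pod`, `Context`, `NewContext`, `HasPod` and the function signatures are hypothetical stand-ins, and only the function names `registerPods`, `finalizePods` and `DeletePod` come from the PR.

```go
package main

// Pod is a hypothetical, minimal stand-in for a Kubernetes pod object.
type Pod struct {
	UID      string
	Terminal bool // true when the pod has finished (succeeded or failed)
}

// Context is a hypothetical stand-in for the shim context; it only tracks pods by UID.
type Context struct {
	pods map[string]Pod
}

func NewContext() *Context { return &Context{pods: make(map[string]Pod)} }

func (c *Context) AddPod(p Pod)    { c.pods[p.UID] = p }
func (c *Context) DeletePod(p Pod) { delete(c.pods, p.UID) } // no-op if the pod is unknown

func (c *Context) HasPod(uid string) bool {
	_, ok := c.pods[uid]
	return ok
}

// registerPods adds every non-terminal pod from the initial listing and
// returns the pods it actually registered.
func registerPods(ctx *Context, listed []Pod) []Pod {
	registered := make([]Pod, 0, len(listed))
	for _, p := range listed {
		if p.Terminal {
			continue // terminal pods are skipped during registration
		}
		ctx.AddPod(p)
		registered = append(registered, p)
	}
	return registered
}

// finalizePods re-checks only the previously registered pods against a
// fresh listing. It never adds pods.
func finalizePods(ctx *Context, registered, current []Pod) {
	podMap := make(map[string]Pod, len(current))
	for _, p := range current {
		podMap[p.UID] = p
	}
	for _, p := range registered {
		cur, ok := podMap[p.UID]
		if !ok || cur.Terminal {
			// pod no longer exists, or has finished since registration: delete it
			ctx.DeletePod(p)
		}
	}
}
```

In this sketch a pod that appears only in the second listing is left alone, on the assumption that the event handlers installed between the two calls are responsible for adding it.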
Unit test failure is due to YUNIKORN-2629.
+1 pending e2e
A related question: Context doesn't have the best unit test coverage. I do believe we need to enhance it. E.g. bindPodVolumes() is completely uncovered, but there are a lot of missing paths, too. Thoughts?
I would completely agree. That is why I wrote tests that cover the two functions that I changed.
I think we need to get a fix for YUNIKORN-2629 created, reviewed and committed to get the unit tests to pass on the k8shim and get things stable again. Until that point I think we need to wait with commits in the master.
if _, ok := podMap[pod.UID]; !ok {
// node no longer exists, delete it
// pod no longer exists, delete it
Just curious. In this phase we have added event handler, so those pods may be removed already from context. It seems to me that is a race condition, but it is fine as ctx.DeletePod(pod) is no-op for repeated delete. If this description is valid, maybe we should add comments for it. Also, we should add tests for DeletePod to make sure it is no-op in repeated deletes.
I'm planning to enhance test coverage for Context; we can have a separate test case for DeletePod() to make sure that's idempotent. But we can do this in this PR as well.
+1 to address that in separate PR :)
"so those pods may be removed already"
The "may be" is what we're trying to catch. It could be that the pod was removed between the list in registerPods and the event handler being added later on. It will not show up here, but the event handler has not seen it being removed. Those are the ones we want to catch, not the ones that have already been deleted by the handlers. We should not have to describe that in this function.
"Also, we should add tests for DeletePod to make sure it is no-op in repeated deletes."
See above for the comment made by Peter around the unit test coverage of the context code. It needs to be extended. Any code path or function that declares itself to be idempotent should have tests for it.
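As a rough illustration of what such an idempotency check could look like, here is a minimal sketch. It does not use the real Context: `podCache`, `newPodCache`, `Len` and `verifyRepeatedDelete` are hypothetical names, and the boolean return value of `DeletePod` is an assumption made so the check has something to observe.

```go
package main

import (
	"fmt"
	"sync"
)

// podCache is a hypothetical stand-in for the part of the shim context
// that tracks pods. It is not the real Context type.
type podCache struct {
	mu   sync.Mutex
	pods map[string]struct{}
}

func newPodCache() *podCache {
	return &podCache{pods: make(map[string]struct{})}
}

func (c *podCache) AddPod(uid string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pods[uid] = struct{}{}
}

// DeletePod removes uid if it is present and reports whether anything was
// removed. Deleting an unknown uid is a no-op.
func (c *podCache) DeletePod(uid string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.pods[uid]; !ok {
		return false
	}
	delete(c.pods, uid)
	return true
}

func (c *podCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.pods)
}

// verifyRepeatedDelete deletes uid, then deletes it again, and returns an
// error if the second delete changed the cache in any way.
func verifyRepeatedDelete(c *podCache, uid string) error {
	c.DeletePod(uid) // first delete; may or may not remove an entry
	before := c.Len()
	if removed := c.DeletePod(uid); removed {
		return fmt.Errorf("second delete of %q removed an entry", uid)
	}
	if after := c.Len(); after != before {
		return fmt.Errorf("cache size changed from %d to %d on repeated delete", before, after)
	}
	return nil
}
```

A unit test would call `verifyRepeatedDelete` for a pod that was added and for one that never was; both should return nil. The mutex mirrors the situation discussed above, where an event handler and finalizePods may both try to delete the same pod.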
+1
All green, I'm merging this.
If a pod was in a terminal state during registration it is skipped. The same principle should apply to finalizePods:
* only check pods that were added in registerPods.
* remove finished pods in the finalizePods call if they were registered.
* new pods are not added in finalizePods

Closes: #847
Signed-off-by: Peter Bacsko <[email protected]>
(cherry picked from commit 8e680af)
What is this PR for?
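If a pod was in a terminal state during registration it is skipped. The same principle should apply to finalizePods:
* only check pods that were added in registerPods.
* remove finished pods in the finalizePods call if they were registered.
* new pods are not added in finalizePods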
What type of PR is it?
What is the Jira issue?
How should this be tested?
Unit tests now cover the code path.
e2e test coverage is not possible as we cannot time the pod removal well enough.