🐛 Let machinePools to honour NodeDeletionTimeout #10553
base: main
Conversation
Welcome @serngawy!
Hi @serngawy. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test` on its own line. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
/area machinepools
@killianmuldoon: The label(s) In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
I think this one goes in a different direction than described in: and /hold
/cc @mboersma
@mboersma would you review the PR and let me know your thoughts?
Force-pushed from 022d53e to 6da8906 (Compare)
/retitle Let machinePools to honour NodeDeletionTimeout

I dropped some more comments, looking good to me otherwise.
I would also take a look once I get around to it
Dropped a few more nits, otherwise /assign @sbueringer
LGTM label has been added. Git tree hash: 7099007f9d7a3c07faed6267a6b79ab73302d62e
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment
/lgtm
LGTM label has been added. Git tree hash: cafa1c9ed7f1ed97e2b9adc4cd0929645ea21e8f
Signed-off-by: melserngawy <[email protected]>
New changes are detected. LGTM label has been removed.
return err
	}
} else {
	ctrl.LoggerFrom(ctx).Info("MachinePool %s NodeDeleteTimeout passed, force delete the machinePool", machinePool.Namespace, machinePool.Name)
ctrl.LoggerFrom(ctx).Info("MachinePool %s NodeDeleteTimeout passed, force delete the machinePool", machinePool.Namespace, machinePool.Name)
ctrl.LoggerFrom(ctx).Info("NodeDeleteTimeout passed, skipping Node deletion")
No need to add the MachinePool. Controller-runtime already adds the reconciled object
(Please note: Info doesn't take a format string + args, it takes k/v pairs)
err: false,
mpExist: false,
result: reconcile.Result{},
err: true,
err: true,
// Note: We expect an error here because patchHelper won't be able to patch the MachinePool after the finalizer was removed (and the MachinePool object is then gone)
// The important part is that we verify that the MachinePool doesn't exist anymore
err: true,
expected: expected{
	mpExist: true,
	result: reconcile.Result{},
	err: true,
},
	},
},
withTracker: true,
Otherwise we are always hitting the code path where the tracker is nil. I think that is not what we want to test here.
Actually let's please drop the withTracker field and let's just always set the tracker. I don't see a reason why we should test the case where Tracker is not set, as this never happens at runtime
	expected: expected{
		mpExist: true,
		result: reconcile.Result{},
		err: true,
	},
},
}

for i := range testCases {
If I understand correctly this is a bit racy:
- In some cases we will only get errors if the patchHelper fails to update the MachinePool, if the MachinePool goes away quickly enough after reconcileDelete removes the finalizer
- => I would suggest to drop the error check
- In some cases mpExist is set to true, even though we expect the MachinePool to go away after the finalizer is removed
- => I would recommend to change the check for mpExists to use Eventually & Consistently
if tc.expected.mpExist {
	g.Consistently(func(g Gomega) {
		g.Expect(r.Client.Get(ctx, key, &expv1.MachinePool{})).To(Succeed())
	}).WithTimeout(2 * time.Second).Should(Succeed())
} else {
	// We expect retrieving the machinePool to fail, as it has been deleted.
	g.Eventually(func(g Gomega) {
		g.Expect(apierrors.IsNotFound(r.Client.Get(ctx, key, &expv1.MachinePool{}))).To(BeTrue())
	}).WithTimeout(5 * time.Second).Should(Succeed())
}
Once these changes are made, the assertions are correct, but I think the last test case doesn't succeed anymore. Looks like the MP deletion is not actually blocked with our current test setup
I think we're almost there :)
What this PR does / why we need it:
Ignore unreachable cluster while deleting machinePools
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #10544