Refactor Controller Unit Tests #118

razo7 · 2024-02-07T11:49:16Z

Using 'normal' testEnv test and setup NodeMaintenanceReconciler. Refactor taint functionality
Refactor initMaintenanceStatus and exclude remediation label functionality
Refactor Reconciliation functionality
ECOPROJECT-1271

Using 'normal' testEnv test and setup NodeMaintenanceReconciler. In the past it wasn't needed, as the old controller tests called the relevant funcs of the controller themselves. It is needed for testing events. v1->corev1. Testing only taints functionality ATM

…label Modify setOwnerRefToNode, addExcludeRemediationLabel, removeExcludeRemediationLabel, and initMaintenanceStatus functions to be non-reconcile functions so their behavior can be easily tested in unit tests
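
A minimal sketch of the idea behind this commit message: once the label logic is a plain function that works on an in-memory object, a unit test can call it directly without driving a full reconcile. The function names are borrowed from the commit message, but the `node` type, the label key, and the boolean return values are assumptions made for this illustration and are not the controller's actual signatures.

```go
package main

import "fmt"

// excludeRemediationLabel is a hypothetical label key used only in this sketch.
const excludeRemediationLabel = "example.io/exclude-from-remediation"

// node is a stand-in for corev1.Node, reduced to its labels.
type node struct {
	Labels map[string]string
}

// addExcludeRemediationLabel sets the label in memory and reports whether
// anything changed, so the caller can decide if an API update is needed.
func addExcludeRemediationLabel(n *node) bool {
	if _, ok := n.Labels[excludeRemediationLabel]; ok {
		return false
	}
	if n.Labels == nil {
		n.Labels = map[string]string{}
	}
	n.Labels[excludeRemediationLabel] = "true"
	return true
}

// removeExcludeRemediationLabel is the inverse of addExcludeRemediationLabel.
func removeExcludeRemediationLabel(n *node) bool {
	if _, ok := n.Labels[excludeRemediationLabel]; !ok {
		return false
	}
	delete(n.Labels, excludeRemediationLabel)
	return true
}

func main() {
	n := &node{}
	fmt.Println("added:", addExcludeRemediationLabel(n))
	fmt.Println("added again:", addExcludeRemediationLabel(n))
	fmt.Println("removed:", removeExcludeRemediationLabel(n))
	fmt.Println("labels left:", len(n.Labels))
}
```

Because the helpers in this sketch never touch a client, a test needs no API server to cover the add, repeat-add, and remove cases.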

Combine old logic into fewer tests but with steps

openshift-ci · 2024-02-07T11:49:20Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

mshitrit · 2024-02-07T12:57:20Z

/test ?

openshift-ci · 2024-02-07T12:57:24Z

@mshitrit: The following commands are available to trigger required jobs:

/test 4.12-ci-bundle-my-bundle
/test 4.12-images
/test 4.12-openshift-e2e
/test 4.13-ci-bundle-my-bundle
/test 4.13-images
/test 4.13-openshift-e2e
/test 4.14-ci-bundle-my-bundle
/test 4.14-images
/test 4.14-openshift-e2e
/test 4.15-ci-bundle-my-bundle
/test 4.15-images
/test 4.15-openshift-e2e

Use /test all to run all jobs.

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mshitrit · 2024-02-07T12:57:37Z

/test 4.15-openshift-e2e

mshitrit · 2024-02-07T13:04:42Z

controllers/controller_suite_test.go

-		Metrics: metricsServer.Options{BindAddress: "0"},
-	})
-	Expect(err).ToNot(HaveOccurred())
+	mockManager, _ := lease.NewManager(k8sClient, "")


Nit: there are a lot of managers involved, maybe mockLeaseManager is a better name?

mockLeaseManager is already used in nodemaintenance_controller_test.go.
Maybe tempMockManager?

slintes

Since the diff is difficult to review, I reviewed the complete new version. It may be that some comments are not related to changes in this PR...

slintes · 2024-02-07T14:59:58Z

controllers/controller_suite_test.go

@@ -21,17 +21,18 @@ import (
 	"path/filepath"
 	"testing"

+	"github.com/medik8s/common/pkg/lease"


about the file name: either just keep suite_test.go (good enough IMHO), or use <package_name>_suite_test.go please

I wanted to go with <package_name>_suite_test.go, and I forgot the suffix s 👍🏻

BTW should nodemaintenance_controller.go and nodemaintenance_controller_test follow the same pattern? Since they both use "controller" and not "controllers"

no, each file is about a single controller, isn't it?

but the package can contain multiple

> but the package can contain multiple

Correct, as SNR has multiple controllers

filename still is wrong

slintes · 2024-02-07T15:36:15Z

controllers/nodemaintenance_controller_test.go

+					Expect(nm.Status.EvictionPods).To(Equal(2))
+					Expect(nm.Status.TotalPods).To(Equal(2))
+					Expect(nm.Status.DrainProgress).To(Equal(0))
+					Expect(nm.Status.LastUpdate.IsZero()).To(BeFalse())


Expect(nm.Status.LastUpdate).ToNot(BeZero())

slintes · 2024-02-07T15:37:29Z

controllers/nodemaintenance_controller_test.go

+			})
+			When("Status was initalized", func() {
+				It("should be set for running with 2 pods to drain", func() {
+					initMaintenanceStatus(nm, r.drainer, r.Client)


unhandled error

slintes · 2024-02-07T15:40:17Z

controllers/nodemaintenance_controller_test.go

+					node := &corev1.Node{}
+					Expect(k8sClient.Get(context.TODO(), client.ObjectKey{Name: taintedNodeName}, node)).To(Succeed())
+					setOwnerRefToNode(nm, node, r.logger)
+					Expect(len(nm.ObjectMeta.GetOwnerReferences())).To(Equal(1))


Expect(nm.ObjectMeta.GetOwnerReferences()).To(HaveLen(1))

slintes · 2024-02-07T15:41:53Z

controllers/nodemaintenance_controller_test.go

+					nmCopy = nm.DeepCopy()
+					nmCopy.Status.Phase = nodemaintenanceapi.MaintenanceFailed
+					initMaintenanceStatus(nmCopy, r.drainer, r.Client)
+					Expect(nmCopy.Status.Phase).NotTo(Equal(nodemaintenanceapi.MaintenanceRunning))


IMHO the test should be that the CR keeps the same phase as before. Same for other fields.
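
A minimal sketch of the stricter check suggested here, assuming an init function that leaves an already-initialized status alone. The `status` type, the `initStatus` helper, and the field values are hypothetical stand-ins, not the controller's real types. The point is to compare the whole status against a copy taken before the call, instead of only asserting that the phase is not Running.

```go
package main

import "fmt"

type phase string

const (
	phaseRunning phase = "Running"
	phaseFailed  phase = "Failed"
)

// status is a reduced stand-in for the NodeMaintenance status (hypothetical).
type status struct {
	Phase     phase
	TotalPods int
}

// initStatus fills in the status only when no phase has been set yet.
func initStatus(s *status, totalPods int) {
	if s.Phase != "" {
		return // already initialized: keep every field as it was
	}
	s.Phase = phaseRunning
	s.TotalPods = totalPods
}

func main() {
	before := status{Phase: phaseFailed, TotalPods: 5}
	after := before // struct copy, plays the role of DeepCopy here
	initStatus(&after, 2)

	// Comparing the full struct catches changes to any field, not just Phase.
	fmt.Println("unchanged:", after == before)
}
```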

slintes · 2024-02-07T15:54:21Z

controllers/nodemaintenance_controller_test.go

+					Expect(k8sClient.Get(context.Background(), client.ObjectKey{Name: taintedNodeName}, taintedNode)).To(Succeed())
+					Expect(isTaintExist(taintedNode, medik8sDrainTaint.Key, medik8sDrainTaint.Effect)).To(BeTrue())
+					Expect(isTaintExist(taintedNode, NodeUnschedulableTaint.Key, NodeUnschedulableTaint.Effect)).To(BeTrue())
+					Expect(isTaintExist(taintedNode, dummyTaintKey, corev1.TaintEffectPreferNoSchedule)).To(BeTrue())


maybe add a comment that this is an existing taint which should not have been removed?

slintes · 2024-02-07T15:55:43Z

controllers/nodemaintenance_controller_test.go

+					Expect(isTaintExist(taintedNode, NodeUnschedulableTaint.Key, NodeUnschedulableTaint.Effect)).To(BeTrue())
+					Expect(isTaintExist(taintedNode, dummyTaintKey, corev1.TaintEffectPreferNoSchedule)).To(BeTrue())
+					// there is also a not-ready taint
+					Expect(len(taintedNode.Spec.Taints)).To(Equal(4))


Where is this 4th taint coming from? Do we have to care for it?

The "node.kubernetes.io/not-ready" taint is added automatically when a node is created with a taint that has the PreferNoSchedule effect.

> Do we have to care for it?

We don't, but checking the number of taints is another verification (possibly overkill, and indeed a redundant check) of the addition/removal of taints.

we should only verify our own changes, so yes, it's "overkill" indeed
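
A minimal sketch of verifying only our own changes, assuming a lookup helper similar in spirit to the `isTaintExist` calls in the diff. The `taint` type, the `hasTaint` helper, and the `example.io/drain` and `dummy-key` keys are hypothetical. The test asserts on the taints it cares about by key and effect, and does not depend on the total count, so a taint added by another component does not affect it.

```go
package main

import "fmt"

// taint is a reduced stand-in for corev1.Taint (hypothetical).
type taint struct {
	Key    string
	Effect string
}

// hasTaint reports whether taints contains an entry with the given key and effect.
func hasTaint(taints []taint, key, effect string) bool {
	for _, t := range taints {
		if t.Key == key && t.Effect == effect {
			return true
		}
	}
	return false
}

func main() {
	taints := []taint{
		{Key: "example.io/drain", Effect: "NoSchedule"},              // added by the code under test
		{Key: "dummy-key", Effect: "PreferNoSchedule"},               // pre-existing, must survive
		{Key: "node.kubernetes.io/not-ready", Effect: "NoSchedule"}, // added by something else
	}

	// Assert on our own taint and on the one that must be preserved;
	// len(taints) is deliberately not checked.
	fmt.Println(hasTaint(taints, "example.io/drain", "NoSchedule"))
	fmt.Println(hasTaint(taints, "dummy-key", "PreferNoSchedule"))
}
```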

slintes · 2024-02-07T15:56:48Z

controllers/nodemaintenance_controller_test.go

+				})
+			})
+
+			When("Adding and then removing a taint", func() {


similar to label test, why test adding only first, and then adding + removing?

Originally, I tried to keep the logic of the old tests rather than remove any, since removing could leave us with missing coverage, and there is some small value in keeping the current behavior.
In the beginning, I was more keen on the adding + removing direction.
The value I see here is that we separate the dependency between adding and removing a taint/label (not necessarily here, but the addition might not be part of the test). Another argument for the adding + removing direction is that by using "steps" (one for addition and one for removal) we gain a nice separation for debugging the cause of a possible error.

Do you see this dependency separation as valuable for these functionality tests?

> The value I see here is that we separate dependency between adding and removing

Either I don't understand what you mean, or I just don't see any value 🤔

> we can gain this nice separation for debugging the cause of a possible error.

you always get the line number of a failed test 🤷🏼‍♂️

slintes · 2024-02-07T16:02:52Z

controllers/nodemaintenance_controller_test.go

+			It("should fail on non existing node", func() {
+				By("check nm CR status and whether LastError was updated")
+				maintenance := getNMAfter1Sec(nm)
+				Expect(maintenance.Status.Phase).To(Equal(nodemaintenanceapi.MaintenanceRunning))


don't we have a "failed" phase?

Yes, we do, but only when we can't extend the owned lease... I think we can also put a failed phase when a node doesn't exist (or is missing)

> I think we can also put a failed phase when a node doesn't exist

I think we should! (in a new PR)

slintes · 2024-02-07T16:05:11Z

controllers/nodemaintenance_controller_test.go

+	}, timeout, pollTime).Should(BeNil())
+	return maintenance
+}
+func getNMAfter1Sec(nm *nodemaintenanceapi.NodeMaintenance) *nodemaintenanceapi.NodeMaintenance {


hm.... what are we doing here? Why not just sleep for a second? What's the value of getting the CR 5 times, every 200ms?

Not so nice. Would simply sleeping for one second, or polling for the full timeout, be better? I think the first option.
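
A minimal sketch of the trade-off discussed in this thread, using only the standard library. The `pollUntil` helper and the `demoTimeout` and `demoInterval` values are hypothetical and are not the test suite's real helper or settings. Polling pays off when the test waits for a state to appear, because it returns as soon as the condition holds. When the test only needs to give the reconciler some time and then read the CR once, a single `time.Sleep` followed by one read does the same job with less code.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical values chosen to keep the sketch fast.
const (
	demoTimeout  = 200 * time.Millisecond
	demoInterval = 10 * time.Millisecond
)

// pollUntil calls cond every interval until it returns true or timeout expires.
// It reports whether cond succeeded before the deadline.
func pollUntil(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	// Waiting for a state to appear: polling stops on the first success.
	calls := 0
	ok := pollUntil(demoTimeout, demoInterval, func() bool {
		calls++
		return calls >= 3
	})
	fmt.Println("condition met:", ok, "after calls:", calls)

	// Giving the system time and then checking once: a plain sleep is enough.
	time.Sleep(demoInterval)
	fmt.Println("single check after sleeping")
}
```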

One check for add+removal label/taint, sleep between testing reconcile tests and better expect statements

razo7 · 2024-02-08T12:27:51Z

/test 4.15-openshift-e2e

Refactor the usage of context from background, and simply pass it between tests

razo7 · 2024-02-18T08:34:18Z

/test 4.15-openshift-e2e

slintes · 2024-02-19T09:31:44Z

controllers/nodemaintenance_controller_test.go

+			})
+			When("Status was initalized", func() {
+				It("should be set for running with 2 pods to drain", func() {
+					Expect(initMaintenanceStatus(nm, r.drainer, r.Client)).To(HaveOccurred())


Expecting an error here is a nice indication IMHO that the implementation has an issue. Here: it would be better to separate setting status fields, and doing the actual update. Out of scope for this PR though.
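
A minimal sketch of the separation suggested in this comment, under the assumption that status initialization can be split into a step that only computes values and a step that persists them. The `status` type, `buildInitialStatus`, `initAndPersist`, and the `statusWriter` callback are all hypothetical names for this illustration and do not describe how the project implements it. With this split, a unit test can check the computed fields without expecting an update error.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a reduced stand-in for the NodeMaintenance status (hypothetical).
type status struct {
	Phase     string
	TotalPods int
}

// statusWriter stands in for the client call that persists the status.
type statusWriter func(status) error

// errWrite and failingWriter simulate an update that cannot succeed,
// for example because the object does not exist on the API server.
var errWrite = errors.New("write failed")

func failingWriter(status) error { return errWrite }

// buildInitialStatus only computes values, so it has no error to return.
func buildInitialStatus(totalPods int) status {
	return status{Phase: "Running", TotalPods: totalPods}
}

// initAndPersist composes the pure step with the update step.
func initAndPersist(totalPods int, write statusWriter) (status, error) {
	s := buildInitialStatus(totalPods)
	if err := write(s); err != nil {
		return s, fmt.Errorf("updating status: %w", err)
	}
	return s, nil
}

func main() {
	// The pure step can be asserted on directly.
	fmt.Println(buildInitialStatus(2))

	// The failure of the update step is tested separately.
	_, err := initAndPersist(2, failingWriter)
	fmt.Println("update error:", err)
}
```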

openshift-ci · 2024-02-19T09:31:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: razo7, slintes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [razo7,slintes]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

razo7 · 2024-02-19T13:27:23Z

/retest

razo7 added 3 commits February 7, 2024 13:37

Refactor unit test for initMaintenanceStatus and exclude remediation …

e4527ee

…label Modify setOwnerRefToNode, addExcludeRemediationLabel, removeExcludeRemediationLabel, and initMaintenanceStatus functions to be non-reconcile functions so their behavior can be easily tested in unit tests

Refactor unit tests for reconciliation behavior

c068b59

Combine old logic into fewer tests but with steps

openshift-ci bot added the do-not-merge/work-in-progress label Feb 7, 2024

openshift-ci bot added the approved label Feb 7, 2024

razo7 mentioned this pull request Feb 7, 2024

Add Events for the Maintenance Process #113

Merged

mshitrit reviewed Feb 7, 2024

slintes reviewed Feb 7, 2024

Fixing remarks for clearer test readability

199b494

One check for add+removal label/taint, sleep between testing reconcile tests and better expect statements

Pass context in tests

b3ced39

Refactor the usage of context from background, and simply pass it between tests

razo7 force-pushed the refactor-unit-test-controllers branch from e06006b to b3ced39 Compare February 18, 2024 08:34

slintes approved these changes Feb 19, 2024

openshift-ci bot assigned slintes Feb 19, 2024

openshift-ci bot added the lgtm label Feb 19, 2024

razo7 marked this pull request as ready for review February 19, 2024 12:05

openshift-ci bot removed the do-not-merge/work-in-progress label Feb 19, 2024

openshift-ci bot requested review from clobrano and slintes February 19, 2024 12:05

openshift-merge-bot bot merged commit 2fb090b into medik8s:main Feb 19, 2024
14 checks passed

razo7 mentioned this pull request Feb 28, 2024

Separate nm status initalization from update #123

Merged
