Deployment e2e tests for MultiKueue #4103
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: Bobbins228. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Hi @Bobbins228. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
@Bobbins228 being able to execute the whole Deployment on a dedicated worker cluster would be great. One con I see is that scaling will probably not work. One question that comes to my mind first: how do we discriminate which mode of operation a user wants to use (via Pods or via the full workload)? Should we discriminate based on a Deployment annotation, or do you have some alternative ideas?
Can you elaborate a bit more on this point? Thanks
Let me try - I might also be missing something here :) . So, with #4034 I believe Deployments are sort of supported with workloads at the level of Pods. On worker clusters we only have the Pods created. This is the "first" mode of operation. With this mode of operation I assume scaling the Deployment works OOTB. The "second" mode of operation is to create a workload at the Deployment level on the management cluster. Then, the entire Deployment could be copied onto the worker cluster, which would manage the Pods there. IIUC, in this mode of operation scaling would not work OOTB (but we may check, and add an e2e test). I suppose we would need extra code to update the Deployment on a worker. Now, IIUC, with this PR you enable the "second" mode of operation. However, it may have a downside compared to (1.). So, I'm wondering if a user could still have an option to choose (1.) if they need scaling? However, we could probably make scaling work with (2.) as well, so maybe we don't need two modes...
@mimowo Just off the top of my head, for the mode 2 PR we could check the owner references of Pods and ensure all Pods created through Deployments are scheduled on a specific cluster. We could potentially allow a user to choose which mode they want to use through a "MultiKueue" configuration parameter in the Kueue manager config. WDYT?
I see, I was confused by this line https://github.com/kubernetes-sigs/kueue/pull/4103/files#diff-5c2c085d3754c50f6e899f937ef8113a2fc81685d7130b5bf88d1a37d2f26d67R58. Can you elaborate why we need support for prebuilt workloads for Deployments to make it work in "mode 1"?
I think mode 2 would make sense too, but we need a bit more detail, so I would suggest a KEP to discuss the specifics, such as configuration for mode 1 vs mode 2. It would be ideal if you could support it with some use cases you have. One thing is making sure that all Pods land on the same worker, I suppose. Also, scalability could be better, as we don't need to create the dummy pods in the management cluster.
Yeah, I think we could start with the global config option. Making it granular per workload seems possible, but could get messy.
My bad, I have removed the added validation.
A KEP would be great; I can make an issue for it.
+1, we would also need a better naming convention than "mode 1", "mode 2" :) Updating this issue to cover just e2e tests for mode 1. I am going to make it more robust by also including a test for scaling up the deployment and ensuring more Workloads are created.
cc @mszadkow Could you make the first pass?
/ok-to-test
test/e2e/multikueue/e2e_test.go (outdated)
for _, pod := range pods.Items {
	// Ensure all 4 local pods are in the "Running" phase
	gomega.Eventually(func(g gomega.Gomega) {
I think the pods shouldn't be scheduled on the management cluster, so you need to observe their status on the worker.
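For illustration, a minimal sketch of such a worker-side check, assuming the suite-level ctx, deployment, and k8sWorker1Client variables from this test file; the replica count of 4 mirrors the quoted comment, all replicas are assumed to land on worker1 (which the later discussion shows is not guaranteed), and the usual timeout/interval arguments are omitted:

gomega.Eventually(func(g gomega.Gomega) {
	// List the replica Pods on the worker cluster instead of the manager.
	workerPods := &corev1.PodList{}
	g.Expect(k8sWorker1Client.List(ctx, workerPods, client.InNamespace(deployment.Namespace),
		client.MatchingLabels(deployment.Spec.Selector.MatchLabels))).To(gomega.Succeed())
	g.Expect(workerPods.Items).To(gomega.HaveLen(4))
	for _, p := range workerPods.Items {
		g.Expect(p.Status.Phase).To(gomega.Equal(corev1.PodRunning))
	}
}).Should(gomega.Succeed())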
In order to know which worker they will land on, it's good to request resources of a type and amount that is available on only one of the workers.
Unfortunately this does not work with Pods created through Deployments.
I tested by setting the Deployment resources to 1.5 CPUs, expecting that all Pods would be admitted to worker1, but this was not the case.
Kueue will schedule the Pods on any cluster with the necessary quota available for mode 1 discussed above.
Luckily there is another way to know which Pods are scheduled where, as the Pods are assigned a nodeName of either kind-worker1-control-plane or kind-worker2-control-plane on the remote clusters. This method would be a bit convoluted, so I am looking for a better option.
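A rough sketch of that nodeName-based check, for illustration only; the kind node names and the worker clients are taken from this test environment, and the loop reuses the pods list from the quoted snippet:

for _, pod := range pods.Items {
	remotePod := &corev1.Pod{}
	key := client.ObjectKey{Namespace: pod.Namespace, Name: pod.Name}
	// Look the Pod up on worker1 first and fall back to worker2.
	if err := k8sWorker1Client.Get(ctx, key, remotePod); err != nil {
		gomega.Expect(k8sWorker2Client.Get(ctx, key, remotePod)).To(gomega.Succeed())
	}
	gomega.Expect(remotePod.Spec.NodeName).To(gomega.BeElementOf(
		"kind-worker1-control-plane", "kind-worker2-control-plane"))
}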
test/e2e/multikueue/e2e_test.go (outdated)
})

pods := &corev1.PodList{}
gomega.Expect(k8sManagerClient.List(ctx, pods, client.InNamespace(managerNs.Namespace),
this should happen on the worker cluster
test/e2e/multikueue/e2e_test.go (outdated)
// Local pods should be in "Running" phase
gomega.Eventually(func(g gomega.Gomega) {
	createdPod := &corev1.Pod{}
	g.Expect(k8sManagerClient.Get(ctx, client.ObjectKey{Namespace: pod.Namespace, Name: pod.Name}, createdPod)).To(gomega.Succeed())
this should happen on the worker cluster
ginkgo.By("Waiting for the deployment to get status updates", func() { | ||
gomega.Eventually(func(g gomega.Gomega) { | ||
createdDeployment := appsv1.Deployment{} | ||
g.Expect(k8sManagerClient.Get(ctx, client.ObjectKeyFromObject(deployment), &createdDeployment)).To(gomega.Succeed()) |
That's correct, we should see the status update on the Deployment in the manager cluster.
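For example, the assertion inside that Eventually could compare the ready replicas against the spec, using only variables already present in the quoted code (a sketch, not the exact PR code):

gomega.Eventually(func(g gomega.Gomega) {
	createdDeployment := appsv1.Deployment{}
	g.Expect(k8sManagerClient.Get(ctx, client.ObjectKeyFromObject(deployment), &createdDeployment)).To(gomega.Succeed())
	// The Deployment status on the manager should converge to the desired replica count.
	g.Expect(createdDeployment.Status.ReadyReplicas).To(gomega.Equal(*deployment.Spec.Replicas))
}).Should(gomega.Succeed())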
@mszadkow Thanks for the review, I'll get right on it
I understand that there is no "finish line" for a Deployment; the Pods are just running.
LGTM label has been added. Git tree hash: a4fd4cdbbdea1f8ffe5c30a480274251ebbc63eb
pods = &corev1.PodList{}
gomega.Expect(k8sManagerClient.List(ctx, pods, client.InNamespace(managerNs.Namespace),
	client.MatchingLabels(deployment.Spec.Selector.MatchLabels))).To(gomega.Succeed())

// Check all worker pods are in a running phase.
ensurePodWorkloadsRunning(pods, *managerNs, multiKueueAc)
nit: actually, instead of the comment, could you wrap this in a ginkgo.By("Check all worker pods are in a running phase")?
Within the function is a ginkgo.By("Ensure Pod Workloads are created and Pods are Running on the worker cluster")
The comment is just for dev readability.
Ah, I see, but still something feels off with the code being split outside and inside the helper function.
I would like to consider the following: move this code inside the function:
pods = &corev1.PodList{}
gomega.Expect(k8sManagerClient.List(ctx, pods, client.InNamespace(managerNs.Namespace),
	client.MatchingLabels(deployment.Spec.Selector.MatchLabels))).To(gomega.Succeed())
but keep the ginkgo.By outside. WDYT?
(or make the ginkgo.By as the first line of the helper function)
I like the first option better; it will make things more uniform. Thanks mimowo.
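A sketch of what the agreed refactor might look like; the signature below is hypothetical (it is not the PR's actual helper), the helper body is elided except for the listing that moves inside, and the imports for context and controller-runtime's client are assumed:

// Hypothetical refactored helper: the Pod listing moves inside, while the
// ginkgo.By stays at the call site.
func ensurePodWorkloadsRunning(ctx context.Context, c client.Client, ns string, selector map[string]string) {
	pods := &corev1.PodList{}
	gomega.Expect(c.List(ctx, pods, client.InNamespace(ns),
		client.MatchingLabels(selector))).To(gomega.Succeed())
	// ... existing "Pods are Running on the worker cluster" assertions over pods.Items ...
}

// Call site:
ginkgo.By("Check all worker pods are in a running phase", func() {
	ensurePodWorkloadsRunning(ctx, k8sManagerClient, managerNs.Name, deployment.Spec.Selector.MatchLabels)
})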
	client.MatchingLabels(deployment.Spec.Selector.MatchLabels))).To(gomega.Succeed())

// Check all worker pods are in a running phase.
ensurePodWorkloadsRunning(pods, *managerNs, multiKueueAc)
Maybe you could also add one more ginkgo.By - delete the Deployment and verify the Pods are gone?
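A possible shape for that extra step, reusing only variables that already appear in the quoted test (a sketch, not the PR's code, with timeout/interval arguments omitted):

ginkgo.By("Deleting the Deployment and waiting for its Pods to be gone", func() {
	gomega.Expect(k8sManagerClient.Delete(ctx, deployment)).To(gomega.Succeed())
	gomega.Eventually(func(g gomega.Gomega) {
		remaining := &corev1.PodList{}
		g.Expect(k8sManagerClient.List(ctx, remaining, client.InNamespace(managerNs.Namespace),
			client.MatchingLabels(deployment.Spec.Selector.MatchLabels))).To(gomega.Succeed())
		g.Expect(remaining.Items).To(gomega.BeEmpty())
	}).Should(gomega.Succeed())
})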
// By checking the assigned cluster we can discern which client to use
admissionCheckMessage := workload.FindAdmissionCheck(createdLeaderWorkload.Status.AdmissionChecks, multiKueueAc.Name).Message
workerCluster := k8sWorker1Client
if strings.Contains(admissionCheckMessage, "worker2") {
	workerCluster = k8sWorker2Client
}
nit: Right, this will work, but it relies on the assumption that we have only 2 workers. I would suggest introducing a helper function in the utils which gets the worker name, for re-usability.
EDIT: for now maybe we can use a regex? For later I would suggest we have a dedicated API field, but this would be a separate enhancement.
Regex sounds good, but in terms of pairing up the retrieved worker name with the appropriate worker client I am at a bit of a loss.
Unless we have a map of cluster names to clients that we can compare the given cluster name against?
Ah, I see, yeah we can add the map which we initialize along with the clients.
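Something along these lines, replacing the strings.Contains branch in the snippet above; the workerClients map is a new, hypothetical variable that would be initialized alongside the existing worker clients, and the regexp import is assumed:

workerClients := map[string]client.Client{
	"worker1": k8sWorker1Client,
	"worker2": k8sWorker2Client,
}
// Extract the worker name from the admission check message and pick its client.
workerName := regexp.MustCompile(`worker\d+`).FindString(admissionCheckMessage)
workerCluster, ok := workerClients[workerName]
gomega.Expect(ok).To(gomega.BeTrue(), "unexpected worker name %q", workerName)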
pods := &corev1.PodList{}
gomega.Expect(k8sManagerClient.List(ctx, pods, client.InNamespace(managerNs.Namespace),
	client.MatchingLabels(deployment.Spec.Selector.MatchLabels))).To(gomega.Succeed())

// Check all worker pods are in a running phase.
ensurePodWorkloadsRunning(pods, *managerNs, multiKueueAc)
Also wrap this in a ginkgo.By step.
for _, pod := range pods.Items { // We want to test that all deployment pods have workloads.
	createdLeaderWorkload := &kueue.Workload{}
	wlLookupKey := types.NamespacedName{Name: workloadpod.GetWorkloadNameForPod(pod.Name, pod.UID), Namespace: managerNs.Name}
	gomega.Eventually(func(g gomega.Gomega) {
This Eventually does not seem needed, right?
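If the Workload is guaranteed to exist by this point in the test, the loop body could drop the Eventually and use a plain Expect, for example (a sketch using the same variables as the quoted code):

createdLeaderWorkload := &kueue.Workload{}
wlLookupKey := types.NamespacedName{Name: workloadpod.GetWorkloadNameForPod(pod.Name, pod.UID), Namespace: managerNs.Name}
gomega.Expect(k8sManagerClient.Get(ctx, wlLookupKey, createdLeaderWorkload)).To(gomega.Succeed())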
LGTM, just nits
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR is the second part of enabling Deployments for MultiKueue, with #4034 being part 1.
Which issue(s) this PR fixes:
Part of #3802
Special notes for your reviewer:
@mimowo Seeing as Deployments do not have workload support, we are unable to create an adapter and utilize SyncJob for creating a remote deployment and syncing status that way.
On the other hand, as the Pods progress on the manager cluster, the Deployment's status is updated as if they were running on a single cluster.
Another note is a concern about which cluster the Pods are created in. In this implementation the Pods can run on any cluster and are not run as a group on a single cluster. Is this something we should be wary of?
If we are happy with this implementation I can remove the WIP.
Does this PR introduce a user-facing change?