@kamarabbas99
Contributor

What type of PR is this?

e2e tests

What this PR does / why we need it:

This PR adds e2e tests for CPU startup boost.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: (https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/7862-cpu-startup-boost#aep-7862-cpu-startup-boost)
NONE

@k8s-ci-robot added the release-note-none (denotes a PR that doesn't merit a release note), cncf-cla: yes (indicates the PR's author has signed the CNCF CLA), and do-not-merge/needs-area labels on Oct 21, 2025
@k8s-ci-robot added the area/vertical-pod-autoscaler and size/L (denotes a PR that changes 100-499 lines, ignoring generated files) labels and removed the do-not-merge/needs-area label on Oct 21, 2025
@kamarabbas99
Contributor Author

/cc omerap12 adrianmoisey laoj2

@kamarabbas99
Contributor Author

/area vertical-pod-autoscaler

@omerap12 left a comment
Member

Overall looks good to me, but I wasn't able to run the tests locally.
I exported the following:

export FEATURE_GATES=CPUStartupBoost=true

And changed this:

diff --git a/vertical-pod-autoscaler/hack/run-e2e-tests.sh b/vertical-pod-autoscaler/hack/run-e2e-tests.sh
index f7f724925..9447abc82 100755
@@ -50,7 +50,7 @@ case ${SUITE} in
   recommender|updater|admission-controller|actuation|full-vpa)
     export KUBECONFIG=$HOME/.kube/config
     pushd ${SCRIPT_ROOT}/e2e
-    go test ./v1/*go -v --test.timeout=150m --args --ginkgo.v=true --ginkgo.focus="\[VPA\] \[${SUITE}\]" --report-dir=${WORKSPACE} --disable-log-dump --ginkgo.timeout=150m
+    go test ./v1/*go -v --test.timeout=150m --args --ginkgo.v=true --ginkgo.focus="\[VPA\] \[${SUITE}\]" --report-dir=${WORKSPACE} --ginkgo.label-filter="FG:CPUStartupBoost" --disable-log-dump --ginkgo.timeout=150m
     V1_RESULT=$?
     popd
     echo v1 test result: ${V1_RESULT}

But when I run this:

./vertical-pod-autoscaler/hack/run-e2e-locally.sh updater

I get this:

[sig-autoscaling] [VPA] [updater] [v1] Updater Unboost pods when they become Ready [FG:CPUStartupBoost]
/home/ubuntu/autoscaler/vertical-pod-autoscaler/e2e/v1/updater.go:224
  STEP: Creating a kubernetes client @ 10/27/25 12:23:28.748
  I1027 12:23:28.748322 70130 util.go:453] >>> kubeConfig: /home/ubuntu/.kube/config
  STEP: Building a namespace api object, basename vertical-pod-autoscaling @ 10/27/25 12:23:28.749
  STEP: Waiting for a default service account to be provisioned in namespace @ 10/27/25 12:23:28.757
  STEP: Waiting for kube-root-ca.crt to be provisioned in namespace @ 10/27/25 12:23:28.76
  STEP: Checking CPUStartupBoost cluster feature gate is on @ 10/27/25 12:23:28.762
  STEP: Checking CPUStartupBoost VPA feature gate is enabled for updater @ 10/27/25 12:23:28.762
  STEP: Setting up the Admission Controller status @ 10/27/25 12:23:28.764
  STEP: Setting up a hamster Deployment @ 10/27/25 12:23:28.764
  I1027 12:23:28.769598   70130 warnings.go:110] "Warning: would violate PodSecurity \"restricted:latest\": allowPrivilegeEscalation != false (container \"hamster\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container \"hamster\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"hamster\" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container \"hamster\" must set securityContext.seccompProfile.type to \"RuntimeDefault\" or \"Localhost\")"
  I1027 12:23:28.771252 70130 deployment.go:104] deployment status: v1.DeploymentStatus{ObservedGeneration:0, Replicas:0, UpdatedReplicas:0, ReadyReplicas:0, AvailableReplicas:0, UnavailableReplicas:0, TerminatingReplicas:(*int32)(nil), Conditions:[]v1.DeploymentCondition(nil), CollisionCount:(*int32)(nil)}
  STEP: Setting up a VPA CRD @ 10/27/25 12:23:30.778
  I1027 12:23:30.778948 70130 util.go:453] >>> kubeConfig: /home/ubuntu/.kube/config
  I1027 12:23:30.803540 70130 util.go:453] >>> kubeConfig: /home/ubuntu/.kube/config
  STEP: Annotating pods with boost annotation @ 10/27/25 12:23:30.818
  STEP: Waiting for pods to be in-place updated @ 10/27/25 12:23:31.096
  I1027 12:23:31.096062 70130 common.go:585] waiting for at least one pod to be updated without eviction
  I1027 12:25:31.099681 70130 common.go:629] finished waiting for at least one pod to be updated without eviction
  STEP: Deleting the Admission Controller status @ 10/27/25 12:25:31.099
  [FAILED] in [It] - /home/ubuntu/autoscaler/vertical-pod-autoscaler/e2e/v1/updater.go:252 @ 10/27/25 12:25:31.103
  STEP: Destroying namespace "vertical-pod-autoscaling-9135" for this suite. @ 10/27/25 12:25:31.104
• [FAILED] [122.361 seconds]
[sig-autoscaling] [VPA] [updater] [v1] Updater [It] Unboost pods when they become Ready [FG:CPUStartupBoost]
/home/ubuntu/autoscaler/vertical-pod-autoscaler/e2e/v1/updater.go:224

  [FAILED] Unexpected error:
      <context.deadlineExceededError>: 
      context deadline exceeded
      
          {}
  occurred
  In [It] at: /home/ubuntu/autoscaler/vertical-pod-autoscaler/e2e/v1/updater.go:252 @ 10/27/25 12:25:31.103
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------

Is there anything I forgot to do?
(Once @adrianmoisey's PRs are merged, this process will be much better.)
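
The failure in the log above is a wait that ran out of time: the test polls from 12:23:31 to 12:25:31 and then reports "context deadline exceeded". A minimal shell sketch of that poll-until-deadline pattern follows. The helper name wait_until is hypothetical; the suite's real wait helper is Go code that is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a command until it succeeds or a deadline passes.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  # SECONDS is a bash builtin that counts seconds since the shell started.
  local deadline=$(( SECONDS + timeout ))
  while (( SECONDS < deadline )); do
    if "$@"; then
      return 0   # condition met before the deadline
    fi
    sleep "${interval}"
  done
  return 1       # deadline exceeded: the condition never became true
}

# A condition that never holds can only end by timing out.
if wait_until 2 1 false; then
  echo "condition met"
else
  echo "deadline exceeded"
fi
```

When the condition depends on behavior that is not enabled in the cluster, a loop like this has no way to finish except by hitting the deadline.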

@kamarabbas99
Contributor Author

kamarabbas99 commented Nov 5, 2025

> Is there anything I forgot to do? (Once @adrianmoisey's PRs are merged, this process will be much better.)

@omerap12
We also need the IPPU flag for unboosting, since this branch is not up to date with master.

@kamarabbas99
Contributor Author

/test pull-autoscaling-e2e-vpa-actuation

@adrianmoisey
Member

I've done a lot of work on the e2e tests in master, including running tests in parallel (to speed them up) and excluding tests for feature gates that are disabled by default (with a flag to turn them on). The idea is that we can have two sets of jobs: one batch that tests without gates enabled and one batch that tests with gates enabled.

Should we bring this feature branch up to date with master and rebase this PR? That way it will be easy to run these e2e tests locally.
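
The gating described above (tests for off-by-default feature gates are skipped unless a flag turns them on) can be sketched roughly as follows. This is only an illustration in shell: the function name should_run is hypothetical, the FG:<gate> label form is taken from the test names earlier in this thread, and the gate list uses the same name=true form as the FEATURE_GATES export. The actual implementation in master is not shown here and may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide whether a test runs, given its labels and the enabled gates.
# $1: comma-separated test labels, e.g. "FG:CPUStartupBoost"
# $2: comma-separated gate settings, e.g. "CPUStartupBoost=true"
should_run() {
  local labels="$1" gates="$2" label
  local IFS=','
  for label in ${labels}; do
    # Only labels of the form FG:<gate> restrict when the test may run.
    if [[ "${label}" == FG:* ]]; then
      # Skip the test unless <gate>=true appears in the gate list.
      if [[ ",${gates}," != *",${label#FG:}=true,"* ]]; then
        return 1
      fi
    fi
  done
  return 0
}

# Two batches: one without gates enabled, one with the gate turned on.
for batch in "" "CPUStartupBoost=true"; do
  if should_run "FG:CPUStartupBoost" "${batch}"; then
    echo "gates='${batch}': run"
  else
    echo "gates='${batch}': skip"
  fi
done
```

Tests without an FG: label run in both batches, so the gated batch only adds coverage on top of the default one.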

@omerap12
Member

Yeah, that would be much easier to work with.

@k8s-ci-robot added the needs-rebase (indicates a PR cannot be merged because it has merge conflicts with HEAD), area/balancer, area/cluster-autoscaler, area/provider/alicloud, and area/provider/aws labels on Nov 11, 2025
@linux-foundation-easycla

linux-foundation-easycla bot commented Nov 11, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: kamarabbas99 / name: Kam Saiyed (202579b)

@k8s-ci-robot added the area/provider/azure, area/provider/cluster-api, and area/provider/digitalocean labels and removed the cncf-cla: yes label on Nov 11, 2025
@adrianmoisey removed the area/provider/digitalocean, area/provider/cluster-api, area/provider/aws, area/provider/rancher, area/provider/magnum, area/provider/kwok, area/provider/huaweicloud, area/provider/ionoscloud, area/provider/hetzner, area/provider/azure, area/provider/linode, area/provider/gce, and area/provider/oci labels on Nov 12, 2025
@adrianmoisey
Member

I tested these locally, and they all worked, thanks!
/lgtm

@k8s-ci-robot added the lgtm ("Looks good to me", indicates that a PR is ready to be merged) label on Nov 12, 2025
@omerap12 left a comment
Member

Overall LGTM (with a small nit), thanks!
/lgtm

@k8s-ci-robot removed the lgtm label on Nov 13, 2025
@omerap12 left a comment
Member

Thanks! Merging this since @adrianmoisey already gave his /lgtm.
/lgtm
/approve

Btw, for testing this locally I had to make this tiny change (https://github.com/kubernetes/autoscaler/pull/8692/files#diff-6e0f59e05f47faa269c26e5d0c598d2a94e2def4847e6a9f28d0aee97b02ef9dR232), and then run those tests.

@k8s-ci-robot added the lgtm label on Nov 13, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kamarabbas99, omerap12

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) label on Nov 13, 2025
@k8s-ci-robot merged commit cd061df into kubernetes:experimental-cpu-boost-v2 on Nov 13, 2025
13 checks passed
@kamarabbas99
Contributor Author

> Btw, for testing this locally I had to make this tiny change (https://github.com/kubernetes/autoscaler/pull/8692/files#diff-6e0f59e05f47faa269c26e5d0c598d2a94e2def4847e6a9f28d0aee97b02ef9dR232), and then run those tests.

Interesting! It works for me without that.

@omerap12
Member

> > Btw, for testing this locally I had to make this tiny change (https://github.com/kubernetes/autoscaler/pull/8692/files#diff-6e0f59e05f47faa269c26e5d0c598d2a94e2def4847e6a9f28d0aee97b02ef9dR232), and then run those tests.
>
> Interesting! It works for me without that.

Yeah, it's probably because my dev machine is ARM-based.
