✨ WIP: Initial dedicated hosts implementation #5504

Open
wants to merge 2 commits into
base: main
from

Conversation

faermanj
Contributor

/kind feature

What this PR does / why we need it:

Let users allocate machines to dedicated hosts.

Special notes for your reviewer:

Dedicated hosts must be pre-allocated.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/dedicated-hosts-overview.html

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:

NONE

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 26, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign vincepri for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from Ankitasw and nrb May 26, 2025 17:49
@k8s-ci-robot k8s-ci-robot added needs-priority size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 26, 2025
Contributor

@mtulio mtulio left a comment

Hey @faermanj - thanks for pinging. I have some questions about the API and test steps.

Comment on lines 689 to 693
placementStr := input.Placement.GoString()
s.scope.Warn("Placement already set for instance, overwriting with dedicated host placement",
"hostId", i.HostID,
"affinity", i.HostAffinity,
"placement", placementStr)
Contributor

What about this approach, if you are using it only for logging?

Suggested change
placementStr := input.Placement.GoString()
s.scope.Warn("Placement already set for instance, overwriting with dedicated host placement",
"hostId", i.HostID,
"affinity", i.HostAffinity,
"placement", placementStr)
s.scope.Warn("Placement already set for instance, overwriting with dedicated host placement",
"hostId", i.HostID,
"affinity", i.HostAffinity,
"placement", input.Placement.GoString())

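For context, the overwrite-with-warning behaviour under discussion could look roughly like the sketch below. Everything here is illustrative: `placement` is a hypothetical local stand-in (not the EC2 SDK type), `withDedicatedHost` and the `warn` callback are made-up names, and the host ID is an invented example value.

```go
package main

import "fmt"

// placement is a hypothetical stand-in for the instance placement settings;
// it is NOT the AWS SDK type.
type placement struct {
	HostID   *string
	Affinity *string
}

// withDedicatedHost returns the placement to use for an instance.
// If hostID is nil the existing placement is returned untouched.
// If a placement already exists it is replaced and warn is called once,
// with the values passed inline rather than through a temporary variable.
func withDedicatedHost(existing *placement, hostID, affinity *string,
	warn func(msg string, keysAndValues ...any)) *placement {
	if hostID == nil {
		return existing
	}
	if existing != nil {
		warn("Placement already set for instance, overwriting with dedicated host placement",
			"hostId", *hostID)
	}
	return &placement{HostID: hostID, Affinity: affinity}
}

func main() {
	hostID := "h-0123456789abcdef0" // invented example value
	warn := func(msg string, kv ...any) { fmt.Println("WARN:", msg, kv) }

	// An existing placement is overwritten, and a warning is logged.
	p := withDedicatedHost(&placement{}, &hostID, nil, warn)
	fmt.Println(*p.HostID)
}
```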
Comment on lines 231 to 237
// Affinity specifies the dedicated host affinity setting for the instance.
// When affinity is set to Host, an instance started onto a specific host always restarts on the same host if stopped.
// +optional
// +kubebuilder:validation:Enum:=Default;Host
HostAffinity *string `json:"hostAffinity,omitempty"`
Contributor

Suggested change
// Affinity specifies the dedicated host affinity setting for the instance.
// When affinity is set to Host, an instance started onto a specific host always restarts on the same host if stopped.
// +optional
// +kubebuilder:validation:Enum:=Default;Host
HostAffinity *string `json:"hostAffinity,omitempty"`
// HostAffinity specifies the dedicated host affinity setting for the instance.
// When hostAffinity is set to Host, an instance started onto a specific host always restarts on the same host if stopped.
// +optional
// +kubebuilder:validation:Enum:=Default;Host
HostAffinity *string `json:"hostAffinity,omitempty"`


ginkgo.By("Creating cluster")
clusterName := fmt.Sprintf("%s-%s", specName, util.RandomString(6))
vars := map[string]string{
Contributor

What about clusterVars, or maybe adding the map directly to ClusterctlVariables, since it is used only there?

Comment on lines 277 to 283
// HostID specifies the dedicated host on which the instance should be started
// +optional
HostID *string `json:"hostID,omitempty"`

// Affinity specifies the dedicated host affinity setting for the instance.
// +optional
HostAffinity *string `json:"hostAffinity,omitempty"`
Contributor

Do you need to set the defaults here too? IMHO it would also be nice to mention that HostAffinity depends on HostID.
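One way the HostAffinity-depends-on-HostID rule could be enforced is sketched below. This is only an assumption about how such a check might look: `hostPlacement`, `validateHostPlacement`, and `strPtr` are hypothetical names, not code from this PR, and the allowed values simply mirror the `Default;Host` enum marker quoted above.

```go
package main

import (
	"errors"
	"fmt"
)

// hostPlacement is a hypothetical stand-in holding the two fields under
// review; it is not the provider's actual spec type.
type hostPlacement struct {
	HostID       *string
	HostAffinity *string
}

// validateHostPlacement rejects a HostAffinity that is set without a HostID,
// and any HostAffinity value outside the Default/Host enum.
func validateHostPlacement(p hostPlacement) error {
	if p.HostAffinity == nil {
		return nil // nothing to check; HostID alone is fine
	}
	if p.HostID == nil || *p.HostID == "" {
		return errors.New("hostAffinity requires hostID to be set")
	}
	switch *p.HostAffinity {
	case "Default", "Host":
		return nil
	default:
		return fmt.Errorf("hostAffinity must be Default or Host, got %q", *p.HostAffinity)
	}
}

// strPtr is a small helper for building the optional fields.
func strPtr(s string) *string { return &s }

func main() {
	// Affinity without a host ID is rejected.
	fmt.Println(validateHostPlacement(hostPlacement{HostAffinity: strPtr("Host")}))
	// Affinity with a host ID (invented example value) is accepted.
	fmt.Println(validateHostPlacement(hostPlacement{
		HostID:       strPtr("h-0123456789abcdef0"),
		HostAffinity: strPtr("Host"),
	}))
}
```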

shared.ReleaseHost(e2eCtx, hostID)
}()

ginkgo.By("Creating cluster")
Contributor

Considering the feature adds a field to the machineSpec, is it recommended to create a new cluster every time, or can you reuse an existing generic cluster and create a new machine to validate that it was provisioned on a Dedicated Host? (This is a general question; I'm not sure which approach is recommended.)
cc @richardcase @nrb

Member

In the EKS tests we reuse an existing cluster to help speed things up. On the non-EKS side (i.e. the unmanaged suite), new clusters have historically been created every time. However, if we can speed up the e2e tests by just reusing an existing cluster, that would be great.
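As a rough illustration of the reuse idea (not how either suite is actually wired), a shared cluster could be created lazily on first use and handed to every later spec. `sharedCluster` and its `get` method are hypothetical names, and the `create` callback stands in for whatever provisioning the suite really does.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedCluster lazily creates one cluster and returns the same name to
// every caller. Hypothetical helper, not part of the e2e framework.
type sharedCluster struct {
	once sync.Once
	name string
	err  error
}

// get runs create at most once, even when called from several specs, and
// returns the cached result (or the cached error) afterwards.
func (s *sharedCluster) get(create func() (string, error)) (string, error) {
	s.once.Do(func() {
		s.name, s.err = create()
	})
	return s.name, s.err
}

func main() {
	var shared sharedCluster
	created := 0
	create := func() (string, error) {
		created++ // stands in for the expensive cluster creation
		return "generic-cluster", nil
	}

	// Two specs asking for a cluster trigger only one creation.
	first, _ := shared.get(create)
	second, _ := shared.get(create)
	fmt.Println(first, second, created)
}
```

The trade-off is isolation: a spec that leaves the shared cluster in a bad state can affect later specs, which is one reason creating a fresh cluster per spec is the more conservative choice.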

@rvanderp3 rvanderp3 force-pushed the dedicated-host branch 3 times, most recently from bc6c343 to 83f84c3 Compare June 11, 2025 17:26
@rvanderp3
Contributor

superseded by #5548

@faermanj faermanj marked this pull request as ready for review June 16, 2025 14:32
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 16, 2025
@k8s-ci-robot k8s-ci-robot requested a review from faiq June 16, 2025 14:32
@k8s-ci-robot
Contributor

@faermanj: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-provider-aws-test | 72ecf00 | link | true | /test pull-cluster-api-provider-aws-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-priority release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
5 participants