Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPCLOUD-2775: add cluster api autoscaler integration enhancement #1736

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

elmiko
Copy link
Contributor

@elmiko elmiko commented Jan 15, 2025

this enhancement describes how we will integrate the cluster autoscaler, and related controllers, with the Cluster API machine management layer.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 15, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 15, 2025

@elmiko: This pull request references OCPCLOUD-2775 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

this enhancement describes how we will integrate the cluster autoscaler, and related controllers, with the Cluster API machine management layer.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

openshift-ci bot commented Jan 15, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ashcrow for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@elmiko elmiko force-pushed the add-cas-cao-capi-integration branch 2 times, most recently from 1c80b56 to 4554fb1 Compare January 16, 2025 16:39
@elmiko
Copy link
Contributor Author

elmiko commented Jan 16, 2025

i'm not sure why it's barfing on the metadata

@elmiko
Copy link
Contributor Author

elmiko commented Jan 16, 2025

figured it out, needed quoting on the github handles

@elmiko elmiko force-pushed the add-cas-cao-capi-integration branch from 4554fb1 to 73bfea9 Compare January 16, 2025 18:51
Copy link
Contributor

openshift-ci bot commented Jan 16, 2025

@elmiko: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment on lines +85 to +89
version and would allow us to drop some patches we are carrying. The Cluster
API MachineSet sync controller will be updated to recognize when the
Cluster Autoscaler has made a change to a Cluster API resource and then sync
the change to the corresponding Machine API resource, regardless of which resource
is authoritative.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to clarify exactly what kind of writes the CAS would be making, am I right in thinking that it's just the scale subresource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, only the scale subresource.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we could possibly gate scale subresource updates in a different way to other writes 🤔

Comment on lines +94 to +97
locate the resource. The Cluster API MachineSet sync controller will be updated
to ensure that when the Cluster Autoscaler Operator adds the autoscaling
annotations that they are copied to any related resources, regardless of which
is authoritative.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case, since we own the CAO, we don't necessarily need an exception within the CAPI sync controller, and could handle this in CAO. I would expect CAO to look at a MAPI MachineSet, and check if it's authoritative, and then apply the annotations correctly

Will it still be annotations on the CAPI side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the annotations are available on the CAPI side, we will need to migrate a few of them. eventually we will want the infrastructure templates to carry the capacity info in their status field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an estimated timeline on having the scale information directly in the status?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no timeline has ever been proposed. this feature was added as "opt-in", providers are not required to make these changes.

Comment on lines +294 to +296
The Cluster Autoscaler Operator will update the Machine API MachineSet resource, and
then the MachineSet sync controller will sync the change to the Cluster API MachineSet
resource. The sync controller will use the managed fields (i.e. `.metadata.managedFields`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it an error for there to be a MachineAutoscaler that points to both MAPI and CAPI versions of the same MachineSet?

And then CAO can just update the correct, authoritative resource with whatever information it needs to, without having to worry about conflicting resources

I think if we do this, we need less special logic within the sync controllers

In theory we could validate this using a validating admission policy, but I don't think it's fool-proof

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it an error for there to be a MachineAutoscaler that points to both MAPI and CAPI versions of the same MachineSet?

yes we can. the code currently checks to see if a MachineSet has an annotation indicating that it has a MachineAutoscaler, we could extend this search to cover the case where 2 MachineAutoscalers are effectively pointing at the same target.

And then CAO can just update the correct, authoritative resource with whatever information it needs to, without having to worry about conflicting resources

i want to make sure i understand this. as a user, i have a MAPI authoritative MachineSet but i create a MachineAutoscaler referencing the non-authoritative CAPI MachineSet. when the CAO sees this, it should locate both MAPI and CAPI MachineSets and then only update the authoritative one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i want to make sure i understand this. as a user, i have a MAPI authoritative MachineSet but i create a MachineAutoscaler referencing the non-authoritative CAPI MachineSet. when the CAO sees this, it should locate both MAPI and CAPI MachineSets and then only update the authoritative one?

That is what I'm suggesting yes, do you think that is overly complex?

Copy link
Contributor Author

@elmiko elmiko Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, i don't think it's overly complex, and i like how it gives the user strong feedback. but, we still need to reconcile this behavior with your comment here.

essentially, what should happen if the authoritative resource changes, should the MachineAutoscaler reconcile again and change its status?

Comment on lines +358 to +362
* The Cluster Autoscaler Operator has added the minimum and maximum size annotations, and ownership
annotation to a record. If the sync controller sees an update to these annotations on a
non-authoritative resource originating from the Cluster Autoscaler Operator, it will copy
that change to the authoritative resource if no MachineAutoscaler is referencing the
authoritative resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we own the CAO, I think implementing this within CAO is likely the easier option.

The sync logic already syncs annotations across between machinesets, we just need to add special logic to handle well known annotations if they somehow end up converting from annotations to structured status

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, i can update this

Comment on lines +363 to +368
* A provider MachineSet controller has added the scale from zero annotations to a
non-authoritative record. This occurs when the Cluster API resource is marked as
authoritative but the Machine API resource is updated by the provider MachineSet controller.
In these cases the scale from zero annotations will be copied to the non-authoritative
Cluster API resource. The data from the MachineSet controller is only applied to
Machine API resources currently.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an equivalent of this controller in CAPI? Or, if not, is it on the roadmap? If it is on the roadmap, we will want to ensure these controllers following the same pausing as the rest of the controllers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the last time i looked, most if not all of the providers also package a MachineSet actuators, but we actually want to promote a different behavior in the upstream. we want upstream providers to implement infrastructure template controllers to add the capacity information to the status on the infrastructure template, not as annotations on the MachineSet or MachineDeployment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a timeline for seeing something like this directly in the upstream?

Copy link
Contributor Author

@elmiko elmiko Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as previous reply, a timelime was never proposed for this as it is "opt-in".

Comment on lines +369 to +375
* `.spec.replicas` changes will be synced from the Cluster API MachineSet to the Machine
API MachineSet regardless of which is authoritative when the change originates from the
Cluster Autoscaler. As the Cluster Autoscaler will be configured to operate against Cluster
API resources only, there will be a need to identify when the Cluster Autoscaler has updated
a non-authoritative Cluster API resource so that the authoritative resource can be updated.
This will only occur when the sync controller observes and update to the replicas field from
the Cluster Autoscaler.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We intend to use a VAP to block updates to a non-authoritative resource, have you investigated how we let the CAS through the VAP (what exception do we need in place) to make just this replicas field change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have not, do you happen to have a link where i could learn more?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our intention is to use VAP (https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/) to block writes to the object unless they come specifically from sync controller, or, the resource itself is authoritative.

I think what would be interesting to explore would be if we could add an exception allowing another service account to write, but specifically only to the status subresource

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that is interesting.

could create a race condition where updating the minimum and maximum size
values will lead to an inaccurate update to both MachineSets.

To address the risk of possible race conditions on MachineAutoscalers we have a few
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Third option, CAS will take no action when multiple MachineAutoscalers refer to the same MachineSet (MAPI and CAPI mirrors) and instead will update the status to show that one of the two should be removed

Fourth option, use some sort of admission time validation to try and prevent this

Three and four should be able to be done in conjunction with one another

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Third option, CAS will take no action when multiple MachineAutoscalers refer to the same MachineSet (MAPI and CAPI mirrors) and instead will update the status to show that one of the two should be removed

this is similar to option 1 in some ways, also the CAS does not take action from MachineAutoscalers, only when it sees the min/max annotations on a MachineSet (CAPI in this case).

and yes, i want the status on a MachineAutoscaler to tell the user if they have the wrong record or whatever.

Fourth option, use some sort of admission time validation to try and prevent this

will add this as an option.

i think the best solution is probably for us to only allow selecting the authoritative MachineSet, and then putting a good error message in the status. it might be confusing at first, but will make the logic easy to implement and give users a clear indication of how to fix the problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the best solution is probably for us to only allow selecting the authoritative MachineSet

The authoritative machineset may change over time, so we could accept something, and later it become invalid based on this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like something we will need to address either way, it could happen to any MachineAutoscaler resource.

Comment on lines +465 to +466
2. Will we want to remove MachineAutoscalers that reference Cluster API MachineSets
during a downgrade?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't support downgrades, so we probably don't need to be concerned about this.

We won't automatically create these during an upgrade, so downgrading a failed upgrade shouldn't produce an issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, i'll add an answer to that question.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants