Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RBAC fix to enable slack cluster queue lending limit adjustment #613

Merged
merged 1 commit into from
Sep 4, 2024

Conversation

dgrove-oss
Copy link
Collaborator

The codeflare operator needs permission to read and write clusterqueues to enable the AppWrapper controller to adjust the lending limit of a designated slack cluster queue to reflect cordoned nodes.

The codeflare operator needs permission to read and write clusterqueues
to enable the AppWrapper controller to adjust the lending limit of
a designated slack cluster queue to reflect cordoned nodes.
@dgrove-oss
Copy link
Collaborator Author

Although we can work around this in MLBatch, it would be nice if this fix could be merged in time to make the next release so MLBatch only needs to have the patch for the codeflare operator's role in our configuration for RHOAI 2.12.

Copy link

@varshaprasad96 varshaprasad96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgrove-oss The changes seem good to provide right permissions to the slack CQ.

I assume this is only for the purposes of managing the Slack queue (as mentioned in the description). However, do we have admission policy (or a validating webhook) in place to not allow modifications to the slack CQ by anyone other than the app wrapper controller (would need a SA for this). Just concerned that we are not letting the same CQ be modified by Kueue as well as appwrapper controllers.

@dgrove-oss
Copy link
Collaborator Author

dgrove-oss commented Sep 4, 2024

Summarizing an offline discussion, the lendingLimit field of the ClusterQueue won't be updated by the kueue controller. Cluster Admins in MLBatch are expected to modify the quota information in the slack ClusterQueue, but are expecting the AppWrapper controller to be modifying the lendingLimit. The AppWrapper controller re-computes the value of the lendingLimit each time from the quota and the node status, so even if the value was mistakenly modified by a cluster admin the AppWrapper controller would correct the mistake on the next reconcile by writing an updated value (after dealing with the reconcile conflict in the usual manner).

Copy link

@varshaprasad96 varshaprasad96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Copy link

openshift-ci bot commented Sep 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: varshaprasad96

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Sep 4, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 8fa9455 into project-codeflare:main Sep 4, 2024
8 checks passed
@dgrove-oss dgrove-oss deleted the rbac branch September 4, 2024 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants