KEP-5018: move to beta in 1.34 #5327
base: master
Signed-off-by: Rita Zhang <[email protected]>
@ritazh: GitHub didn't allow me to assign the following users: for, sig, auth, PRR. Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
update lgtm, just had a couple questions
@@ -466,7 +466,7 @@ ResourceClaimTemplate and ResourceClaim for admin access
 - Gather feedback
 - Additional tests are in Testgrid and linked in KEP
-- Implementations in the kubernetes-sigs/dra-example-driver
+- Implementations in the kubernetes-sigs/dra-example-driver: https://github.com/kubernetes-sigs/dra-example-driver/issues/97 and the NVIDIA dra driver: https://github.com/NVIDIA/k8s-dra-driver-gpu/issues/337
do those issues mean we will show those repos labeling namespaces as admin access and using devices as admin access before promoting the gate to beta?
Yes, `Implementations in the kubernetes-sigs/dra-example-driver` was part of the original beta criteria. I think we should be able to add an example there. I'm less certain about the exact timeline of the NVIDIA one. I could remove that one for now and add it back after it's done. WDYT?
let's not take dependencies on consumers we're not sure will be ready as a beta graduation criteria ... one example use seems sufficient
@@ -541,7 +541,12 @@ rollout. Similarly, consider large clusters and how enablement/disablement
 will rollout across nodes.
 -->

-Will be considered for beta.
+- kube-controller-manager: If the kube-controller-manager fails to create `ResourceClaim` objects from `ResourceClaimTemplate` due to misconfigurations or permission issues relating to `adminAccess`, then the associated Pods will remain in a pending state and won't be scheduled.
+- kube-scheduler: Bugs in the scheduler might lead to Pods not being scheduled even when resources are available or, scheduling Pods that shouldn't be scheduled due to unmet `adminAccess` requirements. If the `DRAAdminAccess` feature gate isn't enabled or is misconfigured, the scheduler might not recognize ResourceClaim requirements, leading to scheduling failures.
is this thinking of something more than generic scheduler backoff behavior when it encounters failed API requests?
No, this should be part of the generic scheduler backoff behavior.
ok, maybe clarify that... otherwise this line sounds scarier or more specific to this feature than it actually is
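For readers unfamiliar with the "generic scheduler backoff" referred to in this thread, here is a minimal Python sketch of capped exponential backoff. It is an illustration only, not code from kube-scheduler: the function name `backoff_seconds` is hypothetical, and the 1s initial and 10s maximum values are assumed to match the scheduler's default `podInitialBackoffSeconds` and `podMaxBackoffSeconds` settings.

```python
def backoff_seconds(attempts: int, initial: float = 1.0, maximum: float = 10.0) -> float:
    """Return the wait before retrying a Pod that has failed `attempts` times.

    Hypothetical helper: the delay doubles with each failed attempt,
    starting at `initial` and never exceeding `maximum`.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return min(initial * 2 ** (attempts - 1), maximum)


if __name__ == "__main__":
    # Print the delay for the first six failed scheduling attempts.
    for n in range(1, 7):
        print(n, backoff_seconds(n))
```

The point of the review comment is that a failed API request during scheduling is retried under this kind of generic policy, so nothing about it is specific to `adminAccess`.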
@@ -596,7 +603,9 @@ checking if there are objects with field X set) may be a last resort. Avoid
 logs or events for this purpose.
 -->

-Will be considered for beta.
+".status.allocation.devices.results[*].adminaccess" will be set to true for a claim using adminAccess when needed by a pod.
-".status.allocation.devices.results[*].adminaccess" will be set to true for a claim using adminAccess when needed by a pod.
+".status.allocation.devices.results[*].adminAccess" will be set to true for a claim using adminAccess when needed by a pod.
@@ -705,7 +717,8 @@ and creating new ones, as well as about cluster-level services (e.g. DNS):
 - Impact of its degraded performance or high-error rates on the feature:
 -->

-Will be considered for beta.
+- The DynamicResourceAllocation feature gate must be enabled to create ResourceClaim, ResourceClaimTemplate. More details at [KEP-4381 - DRA Structured Parameters](https://github.com/kubernetes/enhancements/issues/4381)
+- A third-party DRA driver is required for how the driver should interpret the AdminAcess field to get acess to device specific resources without allocating them.
-- A third-party DRA driver is required for how the driver should interpret the AdminAcess field to get acess to device specific resources without allocating them.
+- A third-party DRA driver is required for how the driver should interpret the AdminAcess field to get access to device specific resources without allocating them.
Signed-off-by: Rita Zhang <[email protected]>
/lgtm
Mostly the integration links are missing, otherwise it's good to go.
@@ -466,7 +466,7 @@ ResourceClaimTemplate and ResourceClaim for admin access
 - Gather feedback
Missing bits higher in the doc:
- Make sure to check the appropriate boxes in the `Release Signoff Checklist`.
- In the `Integration tests` section, please make sure to link tests according to the template, especially the newly added ones that are called out there, since the PRs submitted during alpha did add new tests.
These comments still hold.
and ResourceClaim.

- Mitigations: When ResourceClaims or ResourceClaimTemplates the `AdminAccess`
field don't get created, debugging should focus on the namespace labels. The kube-controller-manager logs should have more information.
-field don't get created, debugging should focus on the namespace labels. The kube-controller-manager logs should have more information.
+field doesn't get created, debugging should focus on the namespace labels. The kube-controller-manager logs should have more information.
Signed-off-by: Rita Zhang <[email protected]>
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: liggitt, ritazh
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold cancel
Note: This label has been updated from `resource.k8s.io/admin-access` while the feature was in alpha in v1.33.
This is not quite accurate, because 1.33 is still using `resource.k8s.io/admin-access: "true"`, so maybe
-Note: This label has been updated from `resource.k8s.io/admin-access` while the feature was in alpha in v1.33.
+Note: This label has been updated from `resource.k8s.io/admin-access` before the beta promotion.
or
-Note: This label has been updated from `resource.k8s.io/admin-access` while the feature was in alpha in v1.33.
+Note: This label has been updated from `resource.k8s.io/admin-access` while the feature was in alpha.
Ideally I'd say open a PR doing so asap.
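Since the mitigation text in this PR points debugging at the namespace labels, a minimal Python sketch of that check may help. It is not taken from the KEP or from Kubernetes code: the helper name `namespace_allows_admin_access` and the sample namespaces are hypothetical, the default key is the v1.33 alpha key quoted in this thread (the updated key is not shown in this excerpt, so pass it explicitly), and treating only the exact string `"true"` as enabling is an assumption.

```python
# Alpha-era label key quoted in the review thread for v1.33.
ALPHA_LABEL_KEY = "resource.k8s.io/admin-access"


def namespace_allows_admin_access(namespace: dict, label_key: str = ALPHA_LABEL_KEY) -> bool:
    """Return True if the namespace object carries the admin-access label set to "true".

    `namespace` is a Namespace manifest parsed into a dict, for example from
    `kubectl get namespace NAME -o json`. Assumption: only the exact value
    "true" counts as enabling admin access.
    """
    labels = (namespace.get("metadata") or {}).get("labels") or {}
    return labels.get(label_key) == "true"


if __name__ == "__main__":
    # Hypothetical namespaces used purely for illustration.
    labeled = {"metadata": {"name": "admin-tools", "labels": {ALPHA_LABEL_KEY: "true"}}}
    unlabeled = {"metadata": {"name": "default"}}
    print(namespace_allows_admin_access(labeled))
    print(namespace_allows_admin_access(unlabeled))
```

If a ResourceClaim or ResourceClaimTemplate with admin access is rejected, running a check like this against the target namespace is a quick first step before reading the kube-controller-manager logs.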
/wg device-management
/assign @liggitt for sig auth
/assign @pohly
/assign @soltysh for PRR