diff --git a/keps/prod-readiness/sig-node/5304.yaml b/keps/prod-readiness/sig-node/5304.yaml new file mode 100644 index 00000000000..ea07dfcb194 --- /dev/null +++ b/keps/prod-readiness/sig-node/5304.yaml @@ -0,0 +1,3 @@ +kep-number: 5304 +alpha: + approver: "johnbelamaric" diff --git a/keps/sig-node/5304-dra-attributes-downward-api/README.md b/keps/sig-node/5304-dra-attributes-downward-api/README.md new file mode 100644 index 00000000000..5e9f11e45b1 --- /dev/null +++ b/keps/sig-node/5304-dra-attributes-downward-api/README.md @@ -0,0 +1,969 @@ + +# KEP-5304: DRA Device Attributes Downward API + + + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories (Optional)](#user-stories-optional) + - [Story 1](#story-1) + - [Story 2](#story-2) + - [Story 3](#story-3) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Framework Implementation](#framework-implementation) + - [Attributes JSON Generation (NodePrepareResources)](#attributes-json-generation-nodeprepareresources) + - [Cleanup (NodeUnprepareResources)](#cleanup-nodeunprepareresources) + - [Helper Functions](#helper-functions) + - [Driver Integration](#driver-integration) + - [Workload Consumption](#workload-consumption) + - [Usage Examples](#usage-examples) + - [Example 1: Physical GPU Passthrough (KubeVirt)](#example-1-physical-gpu-passthrough-kubevirt) + - [Example 2: vGPU with Mediated Device](#example-2-vgpu-with-mediated-device) + - [Feature Gate](#feature-gate) + - [Feature Maturity and Rollout](#feature-maturity-and-rollout) + - [Alpha (v1.35)](#alpha-v135) + - [Beta](#beta) + - [GA](#ga) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Alpha (v1.35)](#alpha-v135-1) + - [Alpha (v1.35)](#alpha-v135-2) + - [Beta](#beta-1) + - [GA](#ga-1) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) + - [Alternative 1: Downward API with ResourceSliceAttributeSelector (Original Design)](#alternative-1-downward-api-with-resourcesliceattributeselector-original-design) + - [Alternative 2: DRA Driver Extends CDI with Attributes (Driver-Specific)](#alternative-2-dra-driver-extends-cdi-with-attributes-driver-specific) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+
+- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [ ] (R) KEP approvers have approved the KEP status as `implementable`
+- [ ] (R) Design details are appropriately documented
+- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+  - [ ] e2e Tests for all Beta API Operations (endpoints)
+  - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+  - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
+- [ ] (R) Graduation criteria is in place
+  - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) within one minor version of promotion to GA
+- [ ] (R) Production readiness review completed
+- [ ] (R) Production readiness review approved
+- [ ] "Implementation History" section is up-to-date for milestone
+- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
+- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+
+[kubernetes.io]: https://kubernetes.io/
+[kubernetes/enhancements]: https://git.k8s.io/enhancements
+[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
+[kubernetes/website]: https://git.k8s.io/website
+
+## Summary
+
+This KEP proposes exposing Dynamic Resource Allocation (DRA) device attributes to workloads via CDI (Container Device
+Interface) mounts. The DRA framework will provide helper functions enabling drivers to automatically generate per-claim
+attribute JSON files and mount them into containers via CDI. Workloads such as KubeVirt can then read device metadata,
+such as the PCIe bus address or mediated device UUID, from standardized file paths without requiring custom controllers
+or Downward API changes.
+
+## Motivation
+
+Workloads that need to interact with DRA-allocated devices (such as KubeVirt virtual machines) require access to
+device-specific metadata such as PCIe bus addresses or mediated device UUIDs. Currently, to fetch attributes from
+allocated devices, users must:
+1. Go to `ResourceClaimStatus` to find the request and device name
+2. Look up the `ResourceSlice` with the device name to get attribute values
+
+This complexity forces ecosystem projects like KubeVirt to build custom controllers that watch these objects and inject
+attributes via annotations/labels, leading to fragile, error-prone, and racy designs.
+
+### Goals
+
+- Provide a mechanism for workloads to discover DRA device metadata from inside their Pods.
+- Minimize complexity and avoid modifications to core components such as the scheduler and kubelet, to maintain system
+  reliability and scalability.
+- Provide an easy way for DRA driver authors to make device attributes discoverable inside the Pod.
+- Maintain full backward compatibility with existing DRA drivers and workloads.
+
+### Non-Goals
+
+- Require changes to existing DRA driver implementations
+- Expose the entirety of `ResourceClaim`/`ResourceSlice` objects
+- Support dynamic updates to attributes after container start
+- Standardize attribute names or JSON schema in Alpha
+
+## Proposal
+
+This proposal introduces **framework-managed attribute JSON generation and CDI mounting** in the DRA kubelet plugin
+framework (`k8s.io/dynamic-resource-allocation/kubeletplugin`). Drivers opt in by setting the `AttributesJSON(true)`
+option when starting their plugin.
+
+When enabled, the framework automatically:
+1. Generates a JSON file per claim+request containing device attributes
+2. Creates a corresponding CDI spec that mounts the attributes file into containers
+3. Appends the CDI device ID to the NodePrepareResources response
+4. Cleans up files during NodeUnprepareResources
+
+The workload reads attributes from the standardized path: `/var/run/dra-device-attributes/{driverName}-{claimUID}-{requestName}.json`
+
+### User Stories (Optional)
+
+#### Story 1
+
+As a KubeVirt developer, I want the virt-launcher Pod to automatically discover the PCIe address of an allocated
+physical GPU by reading a JSON file at a known path, so that it can construct the libvirt domain XML to pass through the
+device to the virtual machine guest without requiring a custom controller.
+
+#### Story 2
+
+As a DRA driver author, I want to enable attribute exposure with a single configuration option (`AttributesJSON(true)`)
+and let the framework handle all file generation, CDI mounting, and cleanup, so I don't need to write custom logic for
+every driver.
+
+#### Story 3
+
+As a workload developer, I want to automatically discover device attributes inside the Pod without parsing
+ResourceClaim/ResourceSlice objects or calling the Kubernetes API, so my application can remain simple and portable.
+
+### Notes/Constraints/Caveats (Optional)
+
+- **File-based, not env vars**: Attributes are exposed as JSON files mounted via CDI, not environment variables. This
+  allows for complex structured data and dynamic attribute sets.
+- **Opt-in in Alpha**: Drivers must explicitly enable `AttributesJSON(true)`; the framework does not enable it by default.
+- **No API changes**: Zero modifications to Kubernetes API types. This is purely a framework/driver-side implementation.
+- **File lifecycle**: Files are created during NodePrepareResources and deleted during NodeUnprepareResources.
+
+### Risks and Mitigations
+
+**Risk**: Exposing device attributes might leak sensitive information.
+**Mitigation**: Attributes originate from `ResourceSlice`, which is cluster-scoped. Drivers control which attributes are
+  published. The NodeAuthorizer ensures the kubelet only accesses resources for scheduled Pods. Files are created with
+  0644 permissions (readable but not writable by the container).
+
+**Risk**: File system clutter from orphaned attribute files.
+**Mitigation**: The framework implements cleanup in NodeUnprepareResources. On driver restart, the framework can perform
+  best-effort cleanup by globbing and removing stale files.
+
+**Risk**: CRI runtime compatibility (not all runtimes support CDI).
+**Mitigation**: Document CDI runtime requirements clearly. For Alpha, target containerd 1.7+ and CRI-O 1.23+, which have
+  stable CDI support. Fail gracefully if CDI is not supported.
+
+**Risk**: JSON schema changes could break workloads.
+**Mitigation**: In Alpha, document that the schema is subject to change. In Beta, the JSON schema could potentially be
+  standardized and versioned.
+
+## Design Details
+
+### Framework Implementation
+
+#### Attributes JSON Generation (NodePrepareResources)
+
+When `AttributesJSON` is enabled, the framework intercepts NodePrepareResources and, for each claim+request:
+
+1. **Look up attributes**: The plugin already runs a resourceslice controller with an informer/lister and cache. The
+   framework uses this cache to look up the attributes of each allocated device.
+2. **Generate attributes JSON**:
+   ```json
+   {
+     "claims": [
+       {
+         "claimName": "my-claim",
+         "requests": [
+           {
+             "requestName": "my-request",
+             "attributes": {
+               "foo": "bar",
+               "resource.kubernetes.io/pciBusID": "0000:00:1e.0"
+             }
+           }
+         ]
+       }
+     ]
+   }
+   ```
+3. **Write attributes file**: `{attributesDir}/{driverName}-{claimUID}-{requestName}.json`
+4. **Generate CDI spec**:
+   ```json
+   {
+     "cdiVersion": "0.3.0",
+     "kind": "{driverName}/test",
+     "devices": [
+       {
+         "name": "claim-{claimUID}-{requestName}-attrs",
+         "containerEdits": {
+           "env": [],
+           "mounts": [
+             {
+               "hostPath": "/var/run/dra-device-attributes/{driverName}-{claimUID}-{requestName}.json",
+               "containerPath": "/var/run/dra-device-attributes/{driverName}-{claimUID}-{requestName}.json",
+               "options": ["ro", "bind"]
+             }
+           ]
+         }
+       }
+     ]
+   }
+   ```
+
+5. **Write CDI spec**: `{cdiDir}/{driverName}-{claimUID}-{requestName}-attrs.json`
+6. **Append CDI device ID**: Adds `{driverName}/test=claim-{claimUID}-{requestName}-attrs` to the device's
+   `CdiDeviceIds` in the response.
+
+#### Cleanup (NodeUnprepareResources)
+
+When `AttributesJSON` is enabled, the framework removes the generated attributes files and CDI specs for the unprepared
+claims.
+
+#### Helper Functions
+
+The `resourceslice.Controller` gains a new method:
+
+```go
+// LookupDeviceAttributes returns device attributes (stringified) from the controller's
+// cached ResourceSlices, filtered by pool and device name.
+func (c *Controller) LookupDeviceAttributes(poolName, deviceName string) map[string]string
+```
+
+### Driver Integration
+
+Drivers enable the feature by passing options to `kubeletplugin.Start()`:
+
+```go
+plugin, err := kubeletplugin.Start(ctx, driverPlugin,
+    kubeletplugin.AttributesJSON(true),
+    kubeletplugin.CDIDirectoryPath("/var/run/cdi"),
+    kubeletplugin.AttributesDirectoryPath("/var/run/dra-device-attributes"),
+)
+```
+
+### Workload Consumption
+
+Workloads read attributes from the mounted file:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: virt-launcher-gpu
+spec:
+  resourceClaims:
+  - name: my-gpu-claim
+    resourceClaimName: physical-gpu-claim
+  containers:
+  - name: virt-launcher
+    image: kubevirt/virt-launcher:latest
+    command:
+    - /bin/sh
+    - -c
+    - |
+      # Read attributes from mounted JSON
+      ATTRS_FILE=$(ls /var/run/dra-device-attributes/*.json | head -1)
+      PCI_ROOT=$(jq -r '.claims[0].requests[0].attributes["resource.kubernetes.io/pcieRoot"]' "$ATTRS_FILE")
+      echo "PCI Root: $PCI_ROOT"
+      # Use PCI_ROOT to configure libvirt domain XML...
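+      # Illustrative sketch only: if more than one attributes file is mounted, select the
+      # one whose claimName matches this Pod's claim instead of taking the first file. This
+      # assumes the "claimName" field carries the ResourceClaim name ("physical-gpu-claim"
+      # above); see the file path convention and discovery options described below.
+      for f in /var/run/dra-device-attributes/*.json; do
+        if [ "$(jq -r '.claims[0].claimName' "$f")" = "physical-gpu-claim" ]; then
+          ATTRS_FILE="$f"
+        fi
+      done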
+```
+
+**File Path Convention**: `/var/run/dra-device-attributes/{driverName}-{claimUID}-{requestName}.json`
+
+Since workloads typically know their claim name but not the UID, they can:
+- Use shell globbing: `ls /var/run/dra-device-attributes/{driverName}-*.json`
+- Parse the filename to extract UID and request name
+- Or read all JSON files and match by the `claimName` field
+
+### Usage Examples
+
+#### Example 1: Physical GPU Passthrough (KubeVirt)
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: vm-with-gpu
+spec:
+  resourceClaims:
+  - name: pgpu
+    resourceClaimName: physical-gpu-claim
+  containers:
+  - name: compute
+    image: kubevirt/virt-launcher:latest
+    command:
+    - /bin/sh
+    - -c
+    - |
+      ATTRS=$(cat /var/run/dra-device-attributes/gpu.example.com-*-pgpu.json)
+      PCI_ROOT=$(echo "$ATTRS" | jq -r '.claims[0].requests[0].attributes["resource.kubernetes.io/pcieRoot"]')
+      # Generate libvirt XML with PCI passthrough using $PCI_ROOT
+      echo "..."
+```
+
+#### Example 2: vGPU with Mediated Device
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: vm-with-vgpu
+spec:
+  resourceClaims:
+  - name: vgpu
+    resourceClaimName: virtual-gpu-claim
+  containers:
+  - name: compute
+    image: kubevirt/virt-launcher:latest
+    command:
+    - /bin/sh
+    - -c
+    - |
+      ATTRS=$(cat /var/run/dra-device-attributes/vgpu.example.com-*-vgpu.json)
+      MDEV_UUID=$(echo "$ATTRS" | jq -r '.claims[0].requests[0].attributes["dra.kubevirt.io/mdevUUID"]')
+      # Use MDEV_UUID to configure mediated device passthrough
+      echo "..."
+```
+
+### Feature Gate
+
+No Kubernetes feature gate is introduced for this feature. It is a framework-level opt-in that drivers enable via
+`AttributesJSON(true)` (see [Feature Enablement and Rollback](#feature-enablement-and-rollback)).
+
+### Feature Maturity and Rollout
+
+#### Alpha (v1.35)
+
+- Opt-in only (drivers must explicitly enable `AttributesJSON(true)`)
+- Framework implementation in `k8s.io/dynamic-resource-allocation/kubeletplugin`
+- Helper functions for attribute lookup
+- Unit tests for JSON generation, CDI spec creation, file lifecycle
+- Integration tests with test driver
+- E2E test validating file mounting and content
+- Documentation for driver authors
+- **No feature gate**: This is a framework-level opt-in, not a Kubernetes API change
+
+#### Beta
+
+- Opt-out (enabled by default; drivers must explicitly disable it with `AttributesJSON(false)`)
+- Standardize JSON schema with versioning (`"schemaVersion": "v1beta1"`)
+- Production-ready error handling and edge cases
+- Performance benchmarks for prepare latency
+- Documentation for workload developers
+- Real-world validation from KubeVirt and other consumers
+
+#### GA
+
+- Always enabled
+- At least one stable consumer (e.g., KubeVirt) using attributes in production
+- Schema versioning and backward compatibility guarantees
+- Comprehensive e2e coverage including failure scenarios
+
+### Test Plan
+
+[x] I/we understand the owners of the involved components may require updates to
+existing tests to make this code solid enough prior to committing the changes necessary
+to implement this enhancement.
+
+#### Prerequisite testing updates
+
+No additional prerequisite testing updates are required. Existing DRA test infrastructure will be leveraged.
+
+##### Unit tests
+
+- `k8s.io/dynamic-resource-allocation/kubeletplugin`: new unit tests will cover attributes JSON generation, CDI spec
+  creation, and the attribute file lifecycle (created on prepare, removed on unprepare).
+- `k8s.io/dynamic-resource-allocation/resourceslice`: new unit tests will cover the `LookupDeviceAttributes` helper.
+
+#### Integration tests
+
+Integration tests will cover:
+
+- **End-to-end attribute exposure**: Create Pod with resourceClaims, verify attributes JSON is generated and mounted
+- **Multiple claims**: Pod with multiple resource claims, verify separate files for each claim+request
+- **Missing attributes**: ResourceSlice with no attributes, verify empty map is written
+- **Attribute types**: Test string, bool, int, version attributes are correctly stringified
+- **Cleanup**: Verify files are removed after unprepare
+- **Opt-in behavior**: Verify files are NOT created when `AttributesJSON(false)`
+
+Tests will be added to `test/integration/dra/`.
+
+#### e2e tests
+
+E2E tests will validate real-world scenarios:
+
+- **Attributes file mounted**: Pod can read `/var/run/dra-device-attributes/{driver}-{uid}-{request}.json`
+- **Correct content**: Verify JSON contains expected claim name, request name, and attributes
+- **Multi-device request**: Verify attributes from all allocated devices are included
+- **CDI integration**: Verify CRI runtime correctly processes CDI device ID and mounts file
+- **Cleanup on delete**: Delete Pod, verify attribute files are removed from host
+
+Tests will be added to `test/e2e/dra/dra.go`.
+
+### Graduation Criteria
+
+#### Alpha (v1.35)
+
+- [ ] Framework implementation complete with opt-in via `AttributesJSON(true)`
+- [ ] Helper functions for attribute lookup implemented
+- [ ] Unit tests for core logic (JSON generation, CDI spec creation, file lifecycle)
+- [ ] Integration tests with test driver
+- [ ] E2E test validating file mounting and content
+- [ ] Documentation for driver authors published
+- [ ] Known limitations documented (no schema standardization yet)
+
+#### Beta
+
+TBD
+
+#### GA
+
+TBD
+
+### Upgrade / Downgrade Strategy
+
+**Upgrade:**
+- No Kubernetes API changes, so upgrade is transparent to the control plane
+- Framework changes are backward compatible: existing drivers without `AttributesJSON(true)` continue to work unchanged
+- Drivers can opt in at their own pace by adding the `AttributesJSON(true)` option
+- Workloads without DRA claims are unaffected
+- Workloads with DRA claims that do not read attribute files are unaffected
+
+**Downgrade:**
+- Kubelet downgrade is NOT problematic: this feature is implemented entirely in the driver plugin framework, not in
+  the kubelet
+- If the *driver* is downgraded, existing Pods with mounted attribute files continue to run, but new Pods will not have
+  attribute files mounted
+
+**Rolling upgrade:**
+- Drivers can be upgraded one at a time without cluster-wide coordination
+- Pods using upgraded drivers (with `AttributesJSON(true)`) get attribute files; Pods using old drivers don't
+- Node/kubelet upgrades do not affect this feature (it's driver-side only)
+- Workloads should handle missing files gracefully
+
+### Version Skew Strategy
+
+**Control Plane and Node Coordination:**
+- This feature primarily involves driver-side changes, so no coordination is needed between the control plane and the node
+
+**Version Skew Scenarios:**
+
+1. **Newer Driver**: Pods prepared after the driver upgrade will have the attributes file mounted
+2. **Older Driver**: Pods prepared by an older driver will not have the attributes file
+
+**Recommendation:**
+- Test in a non-production environment first
+
+## Production Readiness Review Questionnaire
+
+### Feature Enablement and Rollback
+
+###### How can this feature be enabled / disabled in a live cluster?
+
+- [x] Roll out a driver with `AttributesJSON(true)`
+
+###### Does enabling the feature change any default behavior?
+
+No. Enabling the feature adds new CDI mount points containing the attributes of allocated DRA devices in JSON format.
+
+###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
+
+Yes. The feature can be disabled by rolling back to the previous driver version, or by rolling out a new version with
+`AttributesJSON(false)`.
+
+**Consequences:**
+- New Pods will not have the attributes available inside the Pod
+- Existing running Pods will continue to run
+
+**Recommendation:** Before disabling, make sure attribute consumers have an alternative mechanism to look up the
+attributes.
+
+###### What happens if we reenable the feature if it was previously rolled back?
+
+Re-enabling the feature restores full functionality.
+
+- New Pods will work correctly
+- Existing Pods (created while the feature was disabled) are unaffected
+
+No data migration or special handling is required.
+
+###### Are there any tests for feature enablement/disablement?
+ +Yes: +- Unit tests verify files are NOT created when `AttributesJSON(false)` +- Integration tests verify opt-in behavior with framework flag toggle +- E2E tests validate files are present with feature on, absent with feature off + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + +NO + +###### Will enabling / using this feature result in introducing new API types? + +No. + +###### Will enabling / using this feature result in any new calls to the cloud provider? + +No. + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + +No + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + +Yes, but the impact should be minimal: + +- Pod startup latency: Drivers must lookup attribute values before starting containers, but the impact of this is + minimized by local informer based lookup + +- The feature does not affect existing SLIs/SLOs for clusters not using DRA or for drivers not opting-in on this feature + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + +No + +###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)? + +No significant risk of resource exhaustion. + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? + +## Implementation History + +- 2025-10-02: KEP created and initial proposal drafted +- 2025-10-03: KEP updated with complete PRR questionnaire responses + +## Drawbacks + +1. **Filesystem dependency**: Unlike Downward API environment variables (which are managed by kubelet), this approach + requires reliable filesystem access to `/var/run/`. Failures in file writes block Pod startup. +2. **CDI runtime requirement**: Not all CRI runtimes support CDI (or support different CDI versions). 
This limits + compatibility to newer runtimes and requires clear documentation. +3. **Opaque file paths**: Workloads must discover filenames via globbing or parse JSON to match claim names. The + Downward API approach with env vars would have been more ergonomic. +4. **No schema standardization in Alpha**: The JSON structure is subject to change. Early adopters may need to update + their parsers between versions. +5. **Driver opt-in complexity**: Drivers must understand and configure multiple framework options (`AttributesJSON`, + `CDIDirectoryPath`, `AttributesDirectoryPath`, `ResourceSliceLister`). The Downward API approach would have been + transparent to drivers. +6. **Limited discoverability**: Workloads can't easily enumerate all claims or requests; they must know the claim name + or glob for files. Env vars would provide named variables. + +## Alternatives + +### Alternative 1: Downward API with ResourceSliceAttributeSelector (Original Design) + +**Description**: Add `resourceSliceAttributeRef` selector to `core/v1.EnvVarSource` allowing environment variables to reference DRA device attributes. Kubelet would run a local controller watching ResourceClaims and ResourceSlices to resolve attributes at container start. + +**Example**: +```yaml +env: +- name: PGPU_PCI_ROOT + valueFrom: + resourceSliceAttributeRef: + claimName: pgpu-claim + requestName: pgpu-request + attribute: resource.kubernetes.io/pcieRoot +``` + +**Pros**: +- Native Kubernetes API integration +- Familiar pattern for users (consistent with Downward API) +- Transparent to drivers (no driver changes required) +- Type-safe API validation +- Named environment variables (no globbing required) + +**Cons**: +- Requires core API changes (longer review/approval cycle) +- Adds complexity to kubelet (new controller, watches, caching) +- Performance impact on API server (kubelet watches ResourceClaims/ResourceSlices cluster-wide or per-node) +- Limited to environment variables (harder to expose complex structured data) +- Single attribute per reference (multiple env vars needed for multiple attributes) + +**Why not chosen**: +- Too invasive for Alpha; requires API review and PRR approval +- Kubelet performance concerns with additional watches +- Ecosystem requested CDI-based approach for flexibility and faster iteration + +### Alternative 2: DRA Driver Extends CDI with Attributes (Driver-Specific) + +**Description**: Each driver generates CDI specs with custom environment variables containing attributes. No framework involvement. + +**Example** (driver-generated CDI): +```json +{ + "devices": [{ + "name": "gpu-0", + "containerEdits": { + "env": [ + "PGPU_PCI_ROOT=0000:00:1e.0", + "PGPU_DEVICE_ID=device-00" + ] + } + }] +} +``` + +**Pros**: +- No framework changes +- Maximum driver flexibility +- Works today with existing DRA + +**Cons**: +- Every driver must implement attribute exposure independently (duplication) +- No standardization across drivers (KubeVirt must support N different drivers) +- Error-prone (drivers may forget to expose attributes or use inconsistent formats) +- Hard to discover (workloads must know each driver's conventions) + +**Why not chosen**: +- Poor user experience (no standard path or format) +- High maintenance burden for ecosystem (KubeVirt, etc.) +- Missed opportunity for framework to provide common functionality + +## Infrastructure Needed (Optional) + +None. 
This feature will be developed within existing Kubernetes repositories: +- Framework implementation in `kubernetes/kubernetes` (staging/src/k8s.io/dynamic-resource-allocation/kubeletplugin) +- Helper functions in `kubernetes/kubernetes` (staging/src/k8s.io/dynamic-resource-allocation/resourceslice) +- Tests in `kubernetes/kubernetes` (test/integration/dra, test/e2e/dra, test/e2e_node) +- Documentation in `kubernetes/website` (concepts/scheduling-eviction/dynamic-resource-allocation) + +Ecosystem integration (future): +- KubeVirt will consume attributes from JSON files (separate KEP in kubevirt/kubevirt) +- DRA driver examples will be updated to demonstrate `AttributesJSON(true)` usage \ No newline at end of file diff --git a/keps/sig-node/5304-dra-attributes-downward-api/kep.yaml b/keps/sig-node/5304-dra-attributes-downward-api/kep.yaml new file mode 100644 index 00000000000..c9a204c6881 --- /dev/null +++ b/keps/sig-node/5304-dra-attributes-downward-api/kep.yaml @@ -0,0 +1,47 @@ +title: DRA Device Attributes Downward API +kep-number: 5304 +authors: + - "@alaypatel07" +owning-sig: sig-node +participating-sigs: [] +status: implementable +creation-date: 2025-10-02 +reviewers: + - "@johnbelamaric" + - "@mortent" + - "@SergeyKanzhelev" +approvers: + - "@klueska" + +see-also: +- "/keps/sig-node/4381-dra-structured-parameters" +- https://github.com/kubernetes/kubernetes/pull/132296 +replaces: [] + +# The target maturity stage in the current dev cycle for this KEP. +# If the purpose of this KEP is to deprecate a user-visible feature +# and a Deprecated feature gates are added, they should be deprecated|disabled|removed. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.35" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.35" + beta: "v1.36" + stable: "v1.37" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: NA + components: + - +disable-supported: true + +# The following PRR answers are required at beta release +metrics: + - TBD