DRA: ReservedFor Workloads #5194

Open
@johnbelamaric

Description

Enhancement Description

Currently, when the scheduler allocates a ResourceClaim for a given Pod, it adds that Pod to the ResourceClaimStatus.ReservedFor list. A claim shared among multiple pods will have multiple entries in this list. This allows the ResourceClaimController to know when to de-allocate the claim; it does so once this list is empty.
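To make the current lifecycle concrete, here is a minimal sketch of the bookkeeping described above. The `ConsumerRef` and `ClaimStatus` types and the `Reserve`/`Release` helpers are hypothetical simplifications for illustration, not the real API types or controller code; the sketch assumes consumers are identified by UID.

```go
package main

import "fmt"

// ConsumerRef is a simplified, hypothetical stand-in for an entry in
// ResourceClaimStatus.ReservedFor.
type ConsumerRef struct {
	Resource string // e.g. "pods"
	Name     string
	UID      string
}

// ClaimStatus is a hypothetical, trimmed-down claim status.
type ClaimStatus struct {
	Allocated   bool
	ReservedFor []ConsumerRef
}

// Reserve models the scheduler side: mark the claim allocated and add the
// consumer unless it is already listed.
func (s *ClaimStatus) Reserve(c ConsumerRef) {
	for _, r := range s.ReservedFor {
		if r.UID == c.UID {
			return
		}
	}
	s.Allocated = true
	s.ReservedFor = append(s.ReservedFor, c)
}

// Release models the controller side: drop the consumer and deallocate
// the claim once the list is empty.
func (s *ClaimStatus) Release(uid string) {
	kept := s.ReservedFor[:0]
	for _, r := range s.ReservedFor {
		if r.UID != uid {
			kept = append(kept, r)
		}
	}
	s.ReservedFor = kept
	if len(s.ReservedFor) == 0 {
		s.Allocated = false
	}
}

func main() {
	var s ClaimStatus
	s.Reserve(ConsumerRef{"pods", "worker-0", "uid-0"})
	s.Reserve(ConsumerRef{"pods", "worker-1", "uid-1"})

	s.Release("uid-0")
	fmt.Println(s.Allocated, len(s.ReservedFor)) // true 1

	s.Release("uid-1")
	fmt.Println(s.Allocated, len(s.ReservedFor)) // false 0
}
```

The claim stays allocated as long as at least one consumer remains in the list, which is why every sharing pod currently needs its own entry.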

This list is limited to 256 pods. However, some workloads are much larger and may share a ResourceClaim across many more pods, potentially thousands. Simply raising the list limit into the thousands is not a good long-term solution.

Instead, this proposal would allow a ResourceClaim to be reserved for a workload. For example, rather than listing individual pods, ReservedFor could list the Job, ReplicaSet, or StatefulSet whose pods share the ResourceClaim. This avoids race conditions as pods come and go, without requiring every pod to be listed.
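As a rough illustration of the difference, the sketch below contrasts per-pod entries with a single workload entry under the 256-entry limit. The `Consumer` type, the `reserve` helper, and the "jobs"/"training" reference are hypothetical names chosen for this example; the actual API shape for workload references is not defined in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// maxReservedFor mirrors the 256-entry limit mentioned above.
const maxReservedFor = 256

// Consumer is a hypothetical reference; Resource may name pods or,
// under this proposal, a workload kind such as jobs.
type Consumer struct {
	Resource string
	Name     string
}

var errFull = errors.New("reservedFor list is full")

// reserve appends a consumer, rejecting entries past the limit.
func reserve(list []Consumer, c Consumer) ([]Consumer, error) {
	if len(list) >= maxReservedFor {
		return list, errFull
	}
	return append(list, c), nil
}

func main() {
	// Per-pod reservation: a 1000-pod workload exceeds the limit.
	var perPod []Consumer
	failed := 0
	for i := 0; i < 1000; i++ {
		var err error
		perPod, err = reserve(perPod, Consumer{"pods", fmt.Sprintf("worker-%d", i)})
		if err != nil {
			failed++
		}
	}
	fmt.Println(len(perPod), failed) // 256 744

	// Per-workload reservation: one entry covers every pod of the Job.
	var perWorkload []Consumer
	perWorkload, _ = reserve(perWorkload, Consumer{"jobs", "training"})
	fmt.Println(len(perWorkload)) // 1
}
```

With a workload reference the list size no longer depends on the number of pods, so pods can come and go without any update to the claim status.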

cc @pohly @klueska @thebinaryone1

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

Metadata

Labels

lead-opted-in: Denotes that an issue has been opted in to a release
sig/node: Categorizes an issue or PR as relevant to SIG Node.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
stage/alpha: Denotes an issue tracking an enhancement targeted for Alpha status
wg/device-management: Categorizes an issue or PR as relevant to WG Device Management.

Type

No type

Projects

Status: 🏗 In progress
Status: Needs Triage
Status: Deferred
Status: Not for release
Status: Pre-Alpha

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
