[BUG] unset CPU limit is overridden with CPU request #3574
Comments
We also ran into this today; in our case the request is being set at the task level. Here is the schema from flytectl's point of view:
Pod spec:
    resources:
      limits:
        cpu: "64"
        memory: 950Gi
      requests:
        cpu: "64"
        memory: 950Gi
For more context: in many cases, setting CPU limits (not requests) results in unused CPU due to throttling, see also
Need to explore this further, but Flyte does this because of k8s pod QoS. If the requests differ from the limits, the pod no longer falls into the Guaranteed QoS class, and k8s is more likely to evict it under node resource pressure to make room for others. By setting them the same, k8s will not prematurely delete the pod to schedule another.
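To make the QoS argument concrete, here is a rough sketch of how Kubernetes assigns a QoS class, reduced to a single-container pod. The function name `qos_class` is made up for illustration, and quantities are compared as plain strings (so "1" and "1000m" would count as different), which the real implementation does not do.

```python
from typing import Dict


def qos_class(requests: Dict[str, str], limits: Dict[str, str]) -> str:
    """Simplified QoS classification for a pod with one container."""
    # No requests and no limits at all: lowest class.
    if not requests and not limits:
        return "BestEffort"
    # Kubernetes defaults a missing request to the limit of the same resource.
    effective_requests = {**limits, **requests}
    # Guaranteed needs CPU and memory limits, each equal to its request.
    guaranteed = all(
        res in limits and effective_requests[res] == limits[res]
        for res in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


# Requests copied into limits, as in the pod spec posted above.
print(qos_class({"cpu": "64", "memory": "950Gi"}, {"cpu": "64", "memory": "950Gi"}))
# CPU limit left unset, memory limit kept.
print(qos_class({"cpu": "2", "memory": "1Gi"}, {"memory": "1Gi"}))
```

Under these assumptions, a pod that leaves the CPU limit unset ends up Burstable even if its memory request and limit match, which is the trade-off being discussed here.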
We were wondering whether it is the right decision to set this by default instead of letting the user decide. In our case, most tasks are scheduled on exclusive nodes. Without manually overriding the limits, users currently need to get very close to the node's maximum CPU and memory request. We work around this by automatically overriding the limits on every task.
My 2 cents: letting the user decide, rather than enforcing an unintuitive behavior, is never a bad choice. Sure, there can be a default behavior, but the user should be given the option to leave the limits unset (unlimited) depending on the use case. And, as @flixr pointed out, there are sometimes good reasons to do so.
Describe the bug
A container task with only CPU requests and no limits set still gets limits applied:
Running this task results in a pod with the CPU limit also set to 2 (same as the request), but there should be no CPU limit.
Expected behavior
If no CPU limit is specified, none should be set implicitly, and no CPU limit should be applied in k8s.
So I propose copying the requests to the limits as a whole only if the limits are completely unset, instead of filling each missing limit with the respective request value.
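The difference between the two strategies can be sketched as follows. This is not Flyte's actual code; `fill_missing_limits` and `copy_only_if_unset` are hypothetical names, and resources are modelled as plain dicts purely for illustration.

```python
from typing import Dict

Resources = Dict[str, str]


def fill_missing_limits(requests: Resources, limits: Resources) -> Resources:
    """Behavior described in this issue: each missing limit is filled from its request."""
    return {**requests, **limits}


def copy_only_if_unset(requests: Resources, limits: Resources) -> Resources:
    """Proposed behavior: copy requests to limits as a whole only when no limits are given."""
    return dict(limits) if limits else dict(requests)


requests = {"cpu": "2", "memory": "1Gi"}
limits = {"memory": "1Gi"}  # assumed example: memory limit set, CPU limit left out

print(fill_missing_limits(requests, limits))  # CPU limit is filled in from the request
print(copy_only_if_unset(requests, limits))   # CPU limit stays unset
```

With the proposed variant, a user who sets any limit explicitly gets exactly the limits they wrote, while a task with no limits at all keeps the default of mirroring its requests.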