Events:
Type     Reason            Age                  From            Message
----     ------            ----                 ----            -------
Warning  FailedScheduling  5m59s                hami-scheduler  0/1 nodes are available: 1 node unregistered. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Warning  FailedScheduling  39s                  hami-scheduler  0/1 nodes are available: . preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Warning  FilteringFailed   40s (x2 over 5m59s)  hami-scheduler  no available node, all node scores do not meet
What happened:
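Scheduling fails when a single pod with multiple containers requests vGPUs, even though the number requested is less than the number of physical GPUs.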
node: 1
gpu: 8 (A100)
deviceSplitCount: 1 (or 4)
vgpu: 8 (or 32)
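As a quick sanity check on the numbers above, the following sketch assumes the advertised vGPU count is simply the physical GPU count multiplied by deviceSplitCount, which matches the reported values (8 x 1 = 8, 8 x 4 = 32). The per-container request list is hypothetical, since the issue does not include the pod spec, and the check only restates the claim that the total request is below the physical GPU count. It does not model how hami-scheduler actually filters nodes.

```python
# Sketch only: function names and the request list are illustrative, not part of HAMi.

def total_vgpus(physical_gpus: int, device_split_count: int) -> int:
    """Assumed relation: each physical GPU is advertised device_split_count times."""
    return physical_gpus * device_split_count


def request_below_physical(container_requests: list[int], physical_gpus: int) -> bool:
    """True if the pod-wide sum of vGPU requests is less than the physical GPU count."""
    return sum(container_requests) < physical_gpus


if __name__ == "__main__":
    physical_gpus = 8  # one node with 8 A100 GPUs, as reported
    for split in (1, 4):
        print(f"deviceSplitCount={split}: vgpu={total_vgpus(physical_gpus, split)}")

    # Hypothetical pod: two containers requesting 2 vGPUs each.
    hypothetical_requests = [2, 2]
    print("request below physical GPU count:",
          request_below_physical(hypothetical_requests, physical_gpus))
```

With either deviceSplitCount setting, a pod like the hypothetical one here asks for fewer devices than the node advertises, so under this assumption a capacity shortfall alone would not explain the FailedScheduling events.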
What you expected to happen:
Multiple containers within a single pod should be able to request vGPUs.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
nvidia-smi -a on your host
/etc/docker/daemon.json
sudo journalctl -r -u kubelet
dmesg
Environment:
NVIDIA-SMI 550.127.05 Driver Version: 550.127.05 CUDA Version: 12.4
docker version
uname -a
Linux ai-product-server01 5.4.230-1.el7.elrepo.x86_64