We're getting the following error when using more than 1 GPU to run any application. There's no issue when using just 1 GPU.
Deploying 4 pods with each pod utilizing one GPU works fine.
matthew-zhu changed the title from "Error when utilizing more than 1 GPU to run any application" to "Error when utilizing more than 1 NVIDIA GPU to run any application" on Oct 26, 2023
This is a known issue with a known fix, but the fix must have slipped through the cracks before making it into the latest release. I will talk with the team about getting a fix out quickly.
Hi, we have just published the Device Plugin v0.14.3 release, which includes a fix for this issue. Please give that a try and let us know if there are further problems.
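For anyone who wants to check whether a deployed device plugin tag is at or above the release named above, here is a minimal Python sketch. It only compares version numbers. The helper names `parse_version` and `includes_fix` are hypothetical, and the registry path in the usage line is an illustrative placeholder, not something taken from this thread.

```python
import re

# Release that the comment above says contains the fix.
FIXED_VERSION = (0, 14, 3)


def parse_version(tag: str) -> tuple:
    """Extract (major, minor, patch) from a tag such as 'v0.14.1'.

    Also accepts an image reference ending in ':v0.14.1'; only the part
    after the last ':' is inspected. (Hypothetical helper for illustration.)
    """
    candidate = tag.rsplit(":", 1)[-1]
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", candidate)
    if match is None:
        raise ValueError(f"no semantic version found in {tag!r}")
    return tuple(int(part) for part in match.groups())


def includes_fix(tag: str) -> bool:
    """Return True if the tag is at or above v0.14.3."""
    return parse_version(tag) >= FIXED_VERSION


if __name__ == "__main__":
    # v0.14.1 is the version reported in this issue.
    print(includes_fix("v0.14.1"))
    # Placeholder image reference; substitute the one your cluster uses.
    print(includes_fix("example.registry/k8s-device-plugin:v0.14.3"))
```

This only tells you whether the tag is new enough. It does not confirm that the running pods were actually restarted with the new image.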
OS: Ubuntu 22.04.3 LTS
Container-runtime: containerd
k8s-device-plugin: v0.14.1
nvidia-container-toolkit-daemonset error log:
nvidia-smi output:
Ex. yaml used for cuda-vectoradd: