Device plugin not starting and pod showing 0/1 nodes are available: 1 node(s) had untolerated taint {gpu: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.. #421

Open
Abhishekghosh1998 opened this issue Jul 5, 2023 · 4 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@Abhishekghosh1998

Abhishekghosh1998 commented Jul 5, 2023

Hello there, I am facing an issue while trying to use an NVIDIA GPU from a pod.
I have successfully set up the nvidia-container-toolkit, and I am running the 5.15.0-76-generic Linux kernel.

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-15T00:36:28Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.24) and server (1.27) exceeds the supported minor version skew of +/-1
$ docker --version
Docker version 24.0.2, build cb74dfc
$ docker run --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
Wed Jul  5 06:46:43 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080        On  | 00000000:26:00.0 Off |                  N/A |
| 23%   35C    P8              18W / 215W |     92MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
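As an additional sanity check (not part of the original report, and assuming a standard Docker setup), it can be worth confirming that the nvidia runtime is actually registered with Docker, for example:

$ docker info | grep -i -A 2 runtime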

I have set up a kind cluster, which gives me a single kind-control-plane node visible via kubectl:

$ kubectl get nodes
NAME                 STATUS   ROLES           AGE   VERSION
kind-control-plane   Ready    control-plane   35m   v1.27.3
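For context, and assuming the default kind workflow, a single-node cluster like this is typically created with:

$ kind create cluster

which by default produces exactly one node named kind-control-plane.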

I have labeled the node with gpu=installed:

$ kubectl label nodes kind-control-plane gpu=installed
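A quick way to confirm the label is present is kubectl's label-column flag:

$ kubectl get nodes -L gpu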

And I have also tainted the node with:

kubectl taint nodes kind-control-plane gpu:NoSchedule

I did this because, without the taint, the plugin was complaining that the node is not a GPU node:

$ DEVICE_PLUGIN_POD="nvidia-device-plugin-daemonset-tm4pv"
$ kubectl logs -n kube-system $DEVICE_PLUGIN_POD
I0705 05:08:24.579349       1 main.go:154] Starting FS watcher.
I0705 05:08:24.580666       1 main.go:161] Starting OS watcher.
I0705 05:08:24.581008       1 main.go:176] Starting Plugins.
I0705 05:08:24.581018       1 main.go:234] Loading configuration.
I0705 05:08:24.581125       1 main.go:242] Updating config with default resource matching patterns.
I0705 05:08:24.581304       1 main.go:253] 
Running with config:
{
  "version": "v1",
  "flags": {
    "migStrategy": "none",
    "failOnInitError": false,
    "nvidiaDriverRoot": "/",
    "gdsEnabled": false,
    "mofedEnabled": false,
    "plugin": {
      "passDeviceSpecs": false,
      "deviceListStrategy": [
        "envvar"
      ],
      "deviceIDStrategy": "uuid",
      "cdiAnnotationPrefix": "cdi.k8s.io/",
      "nvidiaCTKPath": "/usr/bin/nvidia-ctk",
      "containerDriverRoot": "/driver-root"
    }
  },
  "resources": {
    "gpus": [
      {
        "pattern": "*",
        "name": "nvidia.com/gpu"
      }
    ]
  },
  "sharing": {
    "timeSlicing": {}
  }
}
I0705 05:08:24.581315       1 main.go:256] Retreiving plugins.
W0705 05:08:24.584333       1 factory.go:31] No valid resources detected, creating a null CDI handler
I0705 05:08:24.584382       1 factory.go:107] Detected non-NVML platform: could not load NVML library: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
I0705 05:08:24.584425       1 factory.go:107] Detected non-Tegra platform: /sys/devices/soc0/family file not found
E0705 05:08:24.584433       1 factory.go:115] Incompatible platform detected
E0705 05:08:24.584440       1 factory.go:116] If this is a GPU node, did you configure the NVIDIA Container Toolkit?
E0705 05:08:24.584447       1 factory.go:117] You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
E0705 05:08:24.584455       1 factory.go:118] You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start
E0705 05:08:24.584465       1 factory.go:119] If this is not a GPU node, you should set up a toleration or nodeSelector to only deploy this plugin on GPU nodes
I0705 05:08:24.584475       1 main.go:287] No devices found. Waiting indefinitely.
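As an aside, one way to act on the log's last suggestion without tainting the node at all would be to restrict the device plugin daemonset to labeled GPU nodes via a nodeSelector (a sketch only, reusing the gpu=installed label from above; the field path is the daemonset's pod template):

spec:
  template:
    spec:
      nodeSelector:
        gpu: installed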

So, after labeling and tainting the node, we have the following node description:

$ kubectl describe node  kind-control-plane
Name:               kind-control-plane
Roles:              control-plane
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    gpu=installed
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=kind-control-plane
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=
                    node.kubernetes.io/exclude-from-external-load-balancers=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 05 Jul 2023 11:42:02 +0530
Taints:             gpu:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  kind-control-plane
  AcquireTime:     <unset>
  RenewTime:       Wed, 05 Jul 2023 12:26:51 +0530
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Wed, 05 Jul 2023 12:21:56 +0530   Wed, 05 Jul 2023 11:42:01 +0530   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 05 Jul 2023 12:21:56 +0530   Wed, 05 Jul 2023 11:42:01 +0530   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 05 Jul 2023 12:21:56 +0530   Wed, 05 Jul 2023 11:42:01 +0530   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Wed, 05 Jul 2023 12:21:56 +0530   Wed, 05 Jul 2023 11:42:22 +0530   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  172.19.0.2
  Hostname:    kind-control-plane
Capacity:
  cpu:                16
  ephemeral-storage:  238948692Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             65782984Ki
  pods:               110
Allocatable:
  cpu:                16
  ephemeral-storage:  238948692Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             65782984Ki
  pods:               110
System Info:
  Machine ID:                 99d5a8ae622644c889e61e882ec29ec9
  System UUID:                b9544999-5e7c-40f5-a2e1-519c23074823
  Boot ID:                    4fdb330e-7a23-453d-99b2-5a4073672224
  Kernel Version:             5.15.0-76-generic
  OS Image:                   Debian GNU/Linux 11 (bullseye)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.1
  Kubelet Version:            v1.27.3
  Kube-Proxy Version:         v1.27.3
PodCIDR:                      10.244.0.0/24
PodCIDRs:                     10.244.0.0/24
ProviderID:                   kind://docker/kind/kind-control-plane
Non-terminated Pods:          (9 in total)
  Namespace                   Name                                          CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                          ------------  ----------  ---------------  -------------  ---
  kube-system                 coredns-5d78c9869d-fdxj7                      100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     44m
  kube-system                 coredns-5d78c9869d-sgnqj                      100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     44m
  kube-system                 etcd-kind-control-plane                       100m (0%)     0 (0%)      100Mi (0%)       0 (0%)         44m
  kube-system                 kindnet-56rdl                                 100m (0%)     100m (0%)   50Mi (0%)        50Mi (0%)      44m
  kube-system                 kube-apiserver-kind-control-plane             250m (1%)     0 (0%)      0 (0%)           0 (0%)         44m
  kube-system                 kube-controller-manager-kind-control-plane    200m (1%)     0 (0%)      0 (0%)           0 (0%)         44m
  kube-system                 kube-proxy-vswdf                              0 (0%)        0 (0%)      0 (0%)           0 (0%)         44m
  kube-system                 kube-scheduler-kind-control-plane             100m (0%)     0 (0%)      0 (0%)           0 (0%)         44m
  local-path-storage          local-path-provisioner-6bc4bddd6b-9ntj6       0 (0%)        0 (0%)      0 (0%)           0 (0%)         44m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                950m (5%)   100m (0%)
  memory             290Mi (0%)  390Mi (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:
  Type    Reason                   Age                From             Message
  ----    ------                   ----               ----             -------
  Normal  Starting                 44m                kube-proxy       
  Normal  Starting                 40m                kube-proxy       
  Normal  NodeAllocatableEnforced  45m                kubelet          Updated Node Allocatable limit across pods
  Normal  Starting                 45m                kubelet          Starting kubelet.
  Normal  NodeHasSufficientMemory  45m (x8 over 45m)  kubelet          Node kind-control-plane status is now: NodeHasSufficientMemory
  Normal  NodeHasSufficientPID     45m (x7 over 45m)  kubelet          Node kind-control-plane status is now: NodeHasSufficientPID
  Normal  NodeHasNoDiskPressure    45m (x8 over 45m)  kubelet          Node kind-control-plane status is now: NodeHasNoDiskPressure
  Normal  NodeAllocatableEnforced  44m                kubelet          Updated Node Allocatable limit across pods
  Normal  Starting                 44m                kubelet          Starting kubelet.
  Normal  NodeHasSufficientMemory  44m                kubelet          Node kind-control-plane status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    44m                kubelet          Node kind-control-plane status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     44m                kubelet          Node kind-control-plane status is now: NodeHasSufficientPID
  Normal  RegisteredNode           44m                node-controller  Node kind-control-plane event: Registered Node kind-control-plane in Controller
  Normal  NodeReady                44m                kubelet          Node kind-control-plane status is now: NodeReady
  Normal  NodeAllocatableEnforced  40m                kubelet          Updated Node Allocatable limit across pods
  Normal  Starting                 40m                kubelet          Starting kubelet.
  Normal  NodeHasSufficientMemory  40m (x8 over 40m)  kubelet          Node kind-control-plane status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    40m (x8 over 40m)  kubelet          Node kind-control-plane status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     40m (x7 over 40m)  kubelet          Node kind-control-plane status is now: NodeHasSufficientPID
  Normal  RegisteredNode           40m                node-controller  Node kind-control-plane event: Registered Node kind-control-plane in Controller

Note that in the above description, the node does not have nvidia.com/gpu listed anywhere.
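A quicker way to check just that, rather than scanning the full description, is a jsonpath query over the allocatable resources; if the plugin had registered the GPU, nvidia.com/gpu would appear here:

$ kubectl get node kind-control-plane -o jsonpath='{.status.allocatable}'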

The description of the gpu-pod is as follows:

$ kubectl describe pods
Name:         gpu-pod
Namespace:    default
Priority:     0
Node:         <none>
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:           
IPs:          <none>
Containers:
  cuda-container:
    Image:      nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
    Port:       <none>
    Host Port:  <none>
    Limits:
      nvidia.com/gpu:  1
    Requests:
      nvidia.com/gpu:  1
    Environment:       <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-86lnd (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  kube-api-access-86lnd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                             nvidia.com/gpu:NoSchedule op=Exists
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  35s   default-scheduler  0/1 nodes are available: 1 node(s) had untolerated taint {gpu: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
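Note that the pod's toleration is for the key nvidia.com/gpu, while the taint applied to the node uses the key gpu, so the two do not match. Purely to illustrate the mismatch (not as a recommendation), a toleration that would match the taint as applied looks like:

tolerations:
- key: gpu
  operator: Exists
  effect: NoSchedule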

But now, after labeling and tainting the node, the device plugin pod does not even start, or rather no longer shows up in the kubectl get pods -n kube-system output:

$ kubectl get pods -n kube-system
NAME                                         READY   STATUS    RESTARTS      AGE
coredns-5d78c9869d-fdxj7                     1/1     Running   1 (45m ago)   49m
coredns-5d78c9869d-sgnqj                     1/1     Running   1 (45m ago)   49m
etcd-kind-control-plane                      1/1     Running   1 (45m ago)   50m
kindnet-56rdl                                1/1     Running   1 (45m ago)   49m
kube-apiserver-kind-control-plane            1/1     Running   1 (45m ago)   50m
kube-controller-manager-kind-control-plane   1/1     Running   1 (45m ago)   50m
kube-proxy-vswdf                             1/1     Running   1 (45m ago)   49m
kube-scheduler-kind-control-plane            1/1     Running   1 (45m ago)   50m

Trying to create the device plugin daemonset again throws the following error:

$ kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml": daemonsets.apps "nvidia-device-plugin-daemonset" already exists
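The AlreadyExists error only means the daemonset object itself is still present; its desired/ready pod counts can be inspected, and it can be deleted and re-created if needed, with something like:

$ kubectl get daemonset nvidia-device-plugin-daemonset -n kube-system
$ kubectl describe daemonset nvidia-device-plugin-daemonset -n kube-system
$ kubectl delete -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml
$ kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml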

Please, can anyone help me out?

@HigashikataZhangsuke

Hi Abhishekghosh,

Did you figure it out? It looks like I'm running into the same problem...

@klueska
Contributor

klueska commented Jul 25, 2023

Adding the taint for kubectl taint nodes kind-control-plane gpu:NoSchedule is what made the daemonset no longer schedulable on the control plane node. If you want it to be scheduled there you should not add this taint.
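If the goal is to undo that taint, the standard kubectl syntax is to repeat it with a trailing dash:

$ kubectl taint nodes kind-control-plane gpu:NoSchedule-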

Note: the following taint would have worked but is unnecessary (as it is meant to be applied to GPU nodes to repel CPU workloads, not attract GPU workloads):

kubectl taint nodes kind-control-plane nvidia.com/gpu:NoSchedule
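For context (from memory, so worth verifying against the actual v0.14.0 manifest), the device plugin daemonset already ships with a toleration along these lines, which is why that taint would not have repelled it:

tolerations:
- key: nvidia.com/gpu
  operator: Exists
  effect: NoSchedule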

As to your question about why GPUs are not discovered / advertised by the plugin when it does run...

How are you making your kind cluster aware of the GPUs so that they can be visible to the nodes it starts? It only occurred to me recently how to do this, and it's not very intuitive: kubernetes-sigs/kind#3257 (comment)

Are you following this procedure?
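For reference, the procedure in that linked comment roughly amounts to the following (a sketch only; the exact option names and paths should be verified against the comment and the NVIDIA Container Toolkit documentation): make nvidia the default Docker runtime, set accept-nvidia-visible-devices-as-volume-mounts = true in /etc/nvidia-container-runtime/config.toml, and create the kind cluster from a config that mounts the GPU request path into the node container, e.g.:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    # requests that all GPUs be injected into the node container
    - hostPath: /dev/null
      containerPath: /var/run/nvidia-container-devices/all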

@Abhishekghosh1998
Author

@HigashikataZhangsuke I looked into the code of kind and modified it accordingly so that the nodes created by KinD are aware of the GPU. If I am not wrong, KinD uses Docker containers to launch/create the nodes, so I made sure that the "node" container it launches uses the NVIDIA Docker runtime and is aware of the GPUs.

@klueska Thanks for the pointer to your approach; it is clean and nice. :) My approach is similar, but not exactly the same.

@github-actions

This issue has become stale and will be closed automatically within 30 days if no activity is recorded.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 27, 2024