Create Amazon EKS cluster with GPU-enabled workers

This document explains how to create an Amazon EKS cluster with GPU-enabled workers.

Create EKS cluster with GPU nodes:

eksctl create cluster eks-gpu --node-type=p3.8xlarge --timeout=40m

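By default, SSH access to the worker nodes is not enabled. To enable it, pass --ssh-access together with --ssh-public-key (here, the name of an existing EC2 key pair):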

eksctl create cluster eks-gpu \
   --node-type=p3.8xlarge \
   --timeout=40m \
   --ssh-access \
   --ssh-public-key arun-us-west2

Install the NVIDIA device plugin for Kubernetes, which exposes the GPUs on each node as schedulable resources:

kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.10/nvidia-device-plugin.yml

This will be simplified after eksctl-io/eksctl#205 is fixed.

Get memory, CPU and GPU for each node in the cluster:

kubectl get nodes "-o=custom-columns=NAME:.metadata.name,MEMORY:.status.allocatable.memory,CPU:.status.allocatable.cpu,GPU:.status.allocatable.nvidia\.com/gpu"

The output looks something like this:

NAME                                            MEMORY        CPU       GPU
ip-192-168-101-177.us-west-2.compute.internal   251643680Ki   32        4
ip-192-168-196-254.us-west-2.compute.internal   251643680Ki   32        4
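To get the total number of GPUs in the cluster, the GPU column of that table can be summed. The following is a minimal sketch, not part of the original walkthrough: the helper name sum_gpus is hypothetical, and it assumes the GPU count is the fourth column, as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: sum the GPU column (4th field) of the custom-columns
# table read from stdin, skipping the header row. Nodes without GPUs show
# "<none>" in that column and are counted as 0.
sum_gpus() {
  awk 'NR > 1 && $4 ~ /^[0-9]+$/ { total += $4 } END { print total + 0 }'
}

# Demo on the sample output shown above. On a live cluster, pipe the
# kubectl get nodes command into sum_gpus instead.
sum_gpus <<'EOF'
NAME                                            MEMORY        CPU       GPU
ip-192-168-101-177.us-west-2.compute.internal   251643680Ki   32        4
ip-192-168-196-254.us-west-2.compute.internal   251643680Ki   32        4
EOF
```

For the two-node sample this prints 8.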

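A pod can request at most as many GPUs as a single node provides (4 per node in the output above). By default, pods are not given any GPU and run on CPU only; to use a GPU, a container must request it explicitly through the nvidia.com/gpu resource limit.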
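As a rough sketch of how a pod asks for a GPU, the snippet below writes a manifest that sets a limit on the nvidia.com/gpu resource. The pod name, container name, file name, and image tag are placeholders chosen for illustration and are not from the original document; any CUDA-enabled image that ships nvidia-smi should work in their place.

```shell
#!/bin/sh
# Write a minimal pod manifest that requests one GPU.
# Names and the image tag are assumptions for illustration only.
cat > gpu-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test                   # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:9.0-base    # assumed image; substitute a CUDA image you trust
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1          # must not exceed the GPUs on a single node
EOF

# Then, against the cluster:
#   kubectl apply -f gpu-pod.yaml
#   kubectl logs gpu-test
```

If the device plugin is running, the pod should be placed on one of the GPU nodes, and its log should list the GPU assigned to it.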
