This guide will help you create a GPU-enabled Kubernetes cluster, or enable GPU support on an existing cluster, using containerd as the container runtime. You will need:
- Ubuntu 22.04 Server for worker nodes
- Ubuntu 22.04 Desktop for the master node
- Latest NVIDIA GPU drivers installed on each GPU node (see the verification command after this list)
- Static IP addresses for all machines
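Before going further, it is worth confirming the driver actually works on each GPU node. A quick check, assuming the driver installed correctly:

```bash
# Should print the driver/CUDA version and list every GPU on this node.
nvidia-smi
```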
If you don't have a cluster yet, follow the guide by Choudhry Shehryar to set up your Kubernetes cluster.
- Install the NVIDIA Container Toolkit:
  - Follow the official installing with apt guide (a sketch of the typical commands follows this list)
  - Configure containerd using the official guide
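For reference, the apt-based install currently looks roughly like this; treat it as a sketch and defer to the linked NVIDIA guide, since repository URLs and steps change over time:

```bash
# Add NVIDIA's package repository and signing key (per the official guide).
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit.
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Register the nvidia runtime in containerd's config; the manual edits in the
# next step confirm and complete this configuration.
sudo nvidia-ctk runtime configure --runtime=containerd
```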
Next, install the NVIDIA device plugin:

- Modify the containerd configuration:
  - Open `/etc/containerd/config.toml` (make a backup first)
  - Set `default_runtime_name = "nvidia"`
  - Set `runtime = "/usr/bin/nvidia-container-runtime"` or `runtime = "nvidia-container-runtime"` (see the illustrative excerpt after this list)
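The exact layout depends on your containerd version; the excerpt below is an illustrative version-2 config (the shape that `nvidia-ctk runtime configure` produces), not a drop-in replacement for your file:

```toml
# Illustrative excerpt of /etc/containerd/config.toml (config version 2).
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd]
  # Make the nvidia runtime the default for all pods on this node.
  default_runtime_name = "nvidia"

  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
    runtime_type = "io.containerd.runc.v2"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
      BinaryName = "/usr/bin/nvidia-container-runtime"
```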
- Restart containerd:

```bash
sudo systemctl restart containerd
```
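If containerd fails to come back up, the config edit is the usual culprit; check its state before moving on:

```bash
# Confirm containerd restarted cleanly after the config change.
sudo systemctl status containerd --no-pager
```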
- Test GPU access in containerd:

```bash
sudo ctr image pull docker.io/nvidia/cuda:11.2.2-base-ubuntu20.04
# "cuda-11.0-base" here is just an arbitrary container ID for the test run.
sudo ctr run --rm --gpus 0 -t docker.io/nvidia/cuda:11.2.2-base-ubuntu20.04 cuda-11.0-base nvidia-smi
```

If the runtime is wired up correctly, this prints the same `nvidia-smi` table you see on the host.
- Apply the NVIDIA device plugin on the master node:

```bash
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.3/nvidia-device-plugin.yml
```
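The manifest deploys a DaemonSet, so one plugin pod should come up per GPU node. You can confirm this with the command below; the label selector matches the v0.14.3 manifest, but double-check it if you use another version:

```bash
# Expect one Running nvidia-device-plugin pod per GPU node.
kubectl get pods -n kube-system -l name=nvidia-device-plugin-ds
```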
- Create a test pod using the following YAML:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gputest
  namespace: default
spec:
  restartPolicy: Never
  containers:
    - name: gpu
      image: nvidia/cuda:11.2.2-base-ubuntu20.04
      command: [ "/bin/bash", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      resources:
        limits:
          nvidia.com/gpu: 1
```
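Save the manifest and create the pod (the filename `gputest.yaml` below is just an example):

```bash
kubectl apply -f gputest.yaml
# Watch until the pod reports Running.
kubectl get pod gputest -n default -w
```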
- Access the pod and run `nvidia-smi`:

```bash
kubectl exec -n default -it gputest -- /bin/bash
# then, inside the pod's shell:
nvidia-smi
```
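Alternatively, skip the interactive shell and run the check in one line:

```bash
kubectl exec -n default gputest -- nvidia-smi
```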
- Check GPU availability on the worker nodes:

```bash
kubectl describe node <name-of-worker-node>
```

Look for `nvidia.com/gpu` under `Capacity` and `Allocatable` in the output; the count should match the number of GPUs in the node (e.g. `nvidia.com/gpu: 2` for a two-GPU machine).
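To see the allocatable GPU count across every node at once, a custom-columns query works too; note the escaped dots in the resource name:

```bash
# Prints each node name alongside its allocatable nvidia.com/gpu count.
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
```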
For a comprehensive guide on setting up a GPU-enabled Kubernetes cluster from start to finish, check out Choudhry Shehryar's full guide.