deploy: f8d0d2d
dpoulopoulos committed May 2, 2024
0 parents commit 93df361
Showing 80 changed files with 9,603 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 1495ba56dff114e4f80d4d69960cf979
tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file added .doctrees/debian-vm.doctree
Binary file not shown.
Binary file added .doctrees/environment.pickle
Binary file not shown.
Binary file added .doctrees/gpu-passthrough.doctree
Binary file not shown.
Binary file added .doctrees/index.doctree
Binary file not shown.
Binary file not shown.
Binary file added .doctrees/manual-deployment/index.doctree
Binary file not shown.
Binary file added .doctrees/manual-deployment/kubeflow.doctree
Binary file not shown.
Binary file added .doctrees/manual-deployment/kubernetes.doctree
Binary file not shown.
Binary file added .doctrees/manual-deployment/longhorn.doctree
Binary file not shown.
Binary file added .doctrees/manual-deployment/pxe-server.doctree
Binary file not shown.
Binary file added .doctrees/qemu-kvm.doctree
Binary file not shown.
Empty file added .nojekyll
Empty file.
Binary file added _images/longhorn-ui.png
3 changes: 3 additions & 0 deletions _images/virtml-logo-black.svg
139 changes: 139 additions & 0 deletions _sources/debian-vm.md.txt
@@ -0,0 +1,139 @@
# Create a Debian VM

This guide walks you through creating a new VM using `libvirt` and installing Debian on it. This is
the VM that you will later use as a PXE server to boot the Kubernetes nodes.

## What you'll need

To complete this guide, you will need the following:

* A Debian-based system.
* A working [QEMU/KVM installation](qemu-kvm).

## Procedure

Follow the steps below to create a new VM and install Debian on it.

1. Switch to the root user:

```console
user:~$ sudo su -
root:~#
```

1. Navigate to the project's root directory:

```console
root:~# cd /home/user/virtlml
```

```{note}
Replace `/home/user/virtlml` with the path to the project's root directory.
```

1. Create a new `QCOW2` virtual disk for the VM:

```console
root:/home/user/virtlml# qemu-img create -f qcow2 /var/lib/libvirt/images/pxe-server.qcow2 32G
Formatting '/var/lib/libvirt/images/pxe-server.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=34359738368 lazy_refcounts=off refcount_bits=16
```

1. Define a new VM, using the XML file inside the `infra` directory:

```console
root:/home/user/virtlml# virsh define --file infra/pxe-server.xml
Domain 'pxe-server' defined from pxe-server.xml
```

```{note}
The `pxe-server.xml` file is a template for the PXE server VM. You can modify it to fit your
needs. In particular, make sure that the paths to the `QCOW2` file and the Debian `12.4` ISO
match the locations of these files on your system.
```

1. Start the PXE server VM:

```console
root:/home/user/virtlml# virsh start pxe-server
Domain 'pxe-server' started
```

1. Connect to the PXE server VM through the "Virtual Machine Manager" UI and run the Debian
installer. The Debian installer will guide you through the installation process. You can use the
default settings for most of the options.

```console
root:/home/user/virtlml# virt-manager
```

```{note}
We recommend doing a minimal installation of Debian, without any graphical interface.
```
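
Before running `virsh define`, it can help to confirm that every file the domain XML points to
exists on the host. The following is a minimal sketch, not part of the project:
`list_source_files` and `check_source_files` are hypothetical helpers, and they assume that the
template references the disk image and the ISO through standard libvirt `<source file='...'/>`
elements, one per line. This guide does not show the template itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): print every path referenced
# by a <source file='...'/> element in a libvirt domain XML file.
# Assumption: at most one <source file=...> element per line.
list_source_files() {
    sed -n "s/.*<source file=['\"]\([^'\"]*\)['\"].*/\1/p" "$1"
}

# Hypothetical helper: report which referenced files are missing on this host.
# Returns non-zero if at least one file is missing.
# Limitation: paths that contain whitespace are not handled.
check_source_files() {
    status=0
    for path in $(list_source_files "$1"); do
        if [ -e "$path" ]; then
            echo "ok:      $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_source_files infra/pxe-server.xml` prints one line per referenced file and
exits with a non-zero status if any of them is missing.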

## Verify

1. Verify that the VM is running:

```console
root:/home/user/virtlml# virsh list --all
 Id   Name         State
-----------------------------
 1    pxe-server   running
```

1. Change back to your user:

```console
root:/home/user/virtlml# exit
user:~/virtlml$
```

1. Create an SSH key, if you don't already have one:

```console
user:~/virtlml$ ssh-keygen
```

1. Copy the SSH public key to the PXE server VM:

```console
user:~/virtlml$ ssh-copy-id user@pxe-server
```

```{note}
Replace `user` with your username. Also, replace `pxe-server` with the IP address of the VM.
Alternatively, you can add an entry to your `/etc/hosts` file with the IP address and hostname
of the VM.
```

1. Verify that you can SSH into the PXE server VM:

```console
user:~/virtlml$ ssh user@pxe-server
The authenticity of host '192.168.122.89 (192.168.122.89)' can't be established.
ED25519 key fingerprint is SHA256:dNnHdISPbUDbtJWqSLDpEdGEO3tGEIQ1TiSrfPxyRHg.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.122.89' (ED25519) to the list of known hosts.
user@192.168.122.89's password:
Linux pxe-server 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Feb 7 18:10:32 2024
user@pxe-server:~$
```

```{note}
Replace `user` with your username and `pxe-server` with the IP address of the PXE server VM.
```

```{important}
Also add your SSH public key to the `authorized_keys` file of the root user on the PXE
server VM. You will need this later, because you must be able to SSH into the PXE server VM
as the root user.
```
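
The notes above ask you to replace `pxe-server` with the IP address of the VM. One way to look
that address up is `virsh domifaddr pxe-server`, run as root on the host. The sketch below
extracts the IPv4 address from that command's table output and formats an `/etc/hosts` entry.
`domifaddr_ipv4` and `hosts_entry` are hypothetical helper names, and the sketch assumes that
the VM obtained a DHCP lease from a libvirt-managed network.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `virsh domifaddr <domain>` on stdin
# and print the first IPv4 address, without the prefix length.
domifaddr_ipv4() {
    awk '$3 == "ipv4" { split($4, parts, "/"); print parts[1]; exit }'
}

# Hypothetical helper: format an /etc/hosts line for the given IP and hostname.
hosts_entry() {
    printf '%s\t%s\n' "$1" "$2"
}

# Example usage on the host (requires libvirt; run as root):
#   ip=$(virsh domifaddr pxe-server | domifaddr_ipv4)
#   hosts_entry "$ip" pxe-server >> /etc/hosts
```

With the entry in place, `ssh user@pxe-server` resolves without typing the IP address.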
177 changes: 177 additions & 0 deletions _sources/gpu-passthrough.md.txt
@@ -0,0 +1,177 @@
# GPU Passthrough using VFIO

This guide walks you through the process of passing a GPU through to a Virtual Machine (VM), an
essential step for setting up a VM to function as a Kubernetes GPU worker.

## What you'll need

To complete this guide, you will need the following:

* A Debian-based system.
* A working [QEMU/KVM installation](qemu-kvm).

## Step 1: Set your Primary Display for the Host

This step switches the host's primary display to the integrated GPU, which leaves the dedicated
GPU free for use by the VM.

### What you'll need

To complete this step, you will need:

* Access to the BIOS settings of the host machine.

### Procedure

Follow the steps below to set the primary display for the host machine. The steps configure the X
server to use the integrated GPU and set the primary display in the BIOS settings.

<!-- 1. Change to root user:

```console
user:~$ sudo su -
root:~#
``` -->

1. Get the BusID of the integrated GPU by running the following command:

```console
user:~/virtml$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-S GT1 [UHD Graphics 770] (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)
```

```{note}
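In this example, `lspci` reports the integrated GPU at address `00:02.0`, so its BusID is
`PCI:0:2:0`. Note that `lspci` prints the address in hexadecimal, whereas the X server expects
decimal values.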
```

1. Export the BusID in an environment variable:

```console
user:~/virtml$ export PCI_BUS_ID="PCI:0:2:0"
```

1. Create the configuration file for the X server, using the provided template:

```console
user:~/virtml$ j2 infra/intel.conf.j2 > intel.conf
```

1. Copy the configuration file to the X server configuration directory:

```console
user:~/virtml$ sudo cp intel.conf /etc/X11/xorg.conf.d/20-intel.conf
```

1. Set the owner and the group of the configuration file to `root`:

```console
user:~/virtml$ sudo chown root:root /etc/X11/xorg.conf.d/20-intel.conf
```

1. Reboot into the UEFI/BIOS settings and set the primary display to the integrated GPU. Look
under the "Advanced" settings for an option such as "Primary Display". Set it to "Auto" and
connect the monitor directly to the motherboard. Alternatively, set it to "CPU" or "iGPU", if
available.

```console
user:~/virtml$ sudo systemctl reboot --firmware-setup
```
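
The BusID used in the X server configuration is derived from the address that `lspci` prints.
`lspci` shows the bus, device, and function numbers in hexadecimal, while the X server's `BusID`
option expects them in decimal, so any component above `09` needs converting. A minimal sketch
of the conversion follows; `lspci_to_busid` is a hypothetical helper name, not something the
project provides.

```shell
#!/bin/sh
# Hypothetical helper: convert an lspci address such as "00:02.0"
# (hexadecimal bus:device.function) into an Xorg BusID such as "PCI:0:2:0"
# (decimal). A leading PCI domain ("0000:00:02.0") is not handled.
lspci_to_busid() {
    addr=$1
    bus=${addr%%:*}     # text before the first ":"
    rest=${addr#*:}     # text after the first ":"
    dev=${rest%.*}      # text before the "."
    fn=${rest#*.}       # text after the "."
    printf 'PCI:%d:%d:%d\n' "$((0x$bus))" "$((0x$dev))" "$((0x$fn))"
}

lspci_to_busid 00:02.0   # prints PCI:0:2:0
```

The result can feed the export step directly, for example
`export PCI_BUS_ID="$(lspci_to_busid 00:02.0)"`.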

## Step 2: Enable GPU Passthrough

In this step, you will bind the GPU to the VFIO driver and prevent the Linux kernel from loading
the NVIDIA driver during boot.

### What you'll need

To complete this step, you will need:

* A dedicated GPU that is not being used by the host.

### Procedure

Follow the steps below to bind the GPU to the VFIO driver and prevent the Linux kernel from loading
the NVIDIA driver during boot.

1. Switch to the root user:

```console
user:~/virtml$ sudo su -
root:~#
```

1. Get the PCIe ID of the GPU:

```console
root:~# lspci -nn | grep -i nvidia
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] [10de:2504] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GA106 High Definition Audio Controller [10de:228e] (rev a1)
```

```{note}
The PCIe ID of the VGA controller is `10de:2504` and that of the audio device is `10de:228e`.
Make a note of these IDs. You will need them later.
```

1. Change the `GRUB_CMDLINE_LINUX_DEFAULT` variable in the `/etc/default/grub` file to include the
following options:

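* `intel_iommu=on`: Enable the Intel IOMMU (VT-d), which device passthrough requires.
* `iommu=pt`: Put the IOMMU in passthrough mode, so that devices the host keeps for itself
bypass DMA translation.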

```console
root:~# sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"/GRUB_CMDLINE_LINUX_DEFAULT="\1 intel_iommu=on iommu=pt"/' /etc/default/grub
```

1. Update the GRUB configuration:

```console
root:~# update-grub
```

1. Create a configuration file to bind the GPU to the VFIO driver:

a. Run the following command:

```console
root:~# cat > /etc/modprobe.d/vfio.conf
```

b. Copy and paste the following text:

```
options vfio-pci ids=10de:2504,10de:228e
softdep nvidia pre: vfio-pci
```

```{important}
Replace `10de:2504,10de:228e` with the PCIe IDs of your GPU.
```

c. Press `CTRL + D` to save the file and exit.

1. Update the initial ramdisk:

```console
root:~# update-initramfs -c -k $(uname -r)
```

1. Reboot the system:

```console
root:~# reboot
```
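
Two of the steps above can be scripted so that they are safe to repeat. The `sed` command in the
GRUB step appends the options every time it runs, and the `cat` step relies on pasting text by
hand. The sketch below shows one possible alternative: `add_kernel_param` adds an option only if
it is missing, and `vfio_options_line` builds the `options vfio-pci` line from `lspci -nn`
output. Both function names are hypothetical, `sed -i` assumes GNU sed (the default on Debian),
and the option values must not contain `|` or `&`.

```shell
#!/bin/sh
# Hypothetical helper: add one option to GRUB_CMDLINE_LINUX_DEFAULT in the
# given file, unless it is already there. Does nothing if the variable is
# not defined in the file.
#   $1: path to the GRUB defaults file, $2: option to add
add_kernel_param() {
    file=$1
    param=$2
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    # Join the old value and the new option, dropping a leading space
    # when the old value was empty.
    new=$(printf '%s %s' "$current" "$param" | sed 's/^ *//')
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\".*\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" "$file"
}

# Hypothetical helper: read `lspci -nn` lines on stdin and print the matching
# "options vfio-pci ids=..." line. Only [vendor:device] pairs are picked up;
# class codes such as [0300] are ignored because they contain no colon.
vfio_options_line() {
    ids=$(grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' | tr -d '[]' | paste -s -d, -)
    printf 'options vfio-pci ids=%s\n' "$ids"
}

# Example usage as root:
#   add_kernel_param /etc/default/grub intel_iommu=on
#   add_kernel_param /etc/default/grub iommu=pt
#   { lspci -nn | grep -i nvidia | vfio_options_line
#     echo 'softdep nvidia pre: vfio-pci'; } > /etc/modprobe.d/vfio.conf
```

You still need to run `update-grub` and update the initial ramdisk afterwards, as described in
the steps above.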

### Verify

1. After the reboot, check that both the GPU and its audio device use the `vfio-pci` driver:

```console
root:~# lspci -k | grep -E "vfio-pci|NVIDIA"
01:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)
Kernel driver in use: vfio-pci
01:00.1 Audio device: NVIDIA Corporation GA106 High Definition Audio Controller (rev a1)
Kernel driver in use: vfio-pci
```
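
The same check can be made without parsing `lspci` output, because the kernel exposes the driver
bound to each PCI device as a symlink under `/sys/bus/pci/devices`, and the active boot options
in `/proc/cmdline`. The sketch below reads both. `pci_driver` and `has_kernel_param` are
hypothetical helper names, the optional second argument exists only so that the functions can be
pointed at test data, and the `0000:` domain prefix in the usage example is an assumption about
the host.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the driver bound to a PCI device.
# Returns non-zero if no driver is bound.
#   $1: full PCI address, e.g. 0000:01:00.0
#   $2: (optional) sysfs directory to read, defaults to /sys/bus/pci/devices
pci_driver() {
    addr=$1
    sysfs=${2:-/sys/bus/pci/devices}
    link=$(readlink "$sysfs/$addr/driver") || return 1
    printf '%s\n' "${link##*/}"
}

# Hypothetical helper: succeed if the option is on the kernel command line.
#   $1: option, e.g. intel_iommu=on
#   $2: (optional) file to read, defaults to /proc/cmdline
has_kernel_param() {
    param=$1
    cmdline=${2:-/proc/cmdline}
    case " $(cat "$cmdline") " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the host from this guide, `pci_driver 0000:01:00.0` and `pci_driver 0000:01:00.1` should both
print `vfio-pci`, and `has_kernel_param intel_iommu=on` should succeed.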