disk passthrough and nix
ironicbadger committed Sep 12, 2024
1 parent 2e42a80 commit 6d802e7
Showing 7 changed files with 165 additions and 40 deletions.
92 changes: 92 additions & 0 deletions hardware/moose/HL15.md
@@ -0,0 +1,92 @@
# Server info

## Software

+ OS - Proxmox
+ ZFS layout
  + `rpool` - root dataset
    + 2x 256gb nvme in a mirror
    + 256gb used for the OS _only_
  + `nvmeu2`
    + 2x 2tb nvme in a mirror
    + used for vm storage
  + `data` - 4x 14tb drives
    + 2 mirrored vdevs
    + 28tb usable space
    + store data here yeah?

## Hardware
+ HL15 chassis
+ CPU - Xeon Silver 4214 12c/24t
+ Motherboard - Supermicro X11SPH-NCTF (with onboard dual SFP+)
  + Manual link - https://www.supermicro.com/manuals/motherboard/C620/MNL-1949.pdf
+ RAM - 64gb ECC
  + 2x 16gb SK Hynix PC4-3200AA-RE2-12
  + 2x 16gb PC4-3200AA-RE2-12 (original HL15 DIMMs)
+ Boot disks
  + 2x Samsung 256gb NVME drives
  + ZFS mirror with Proxmox installed on root
  + Disk 1 installed directly into motherboard m.2 slot
  + Disk 2 installed into PCIe slot 2
    + Slot 2 is an 8x physical but is a 4x wired slot
+ VM disks
  + 2x 2tb u.2 Intel DC P4510 NVME SSDs
  + ZFS mirror
  + Installed via a bifurcated dual u.2 to PCIe caddy card
    + NVME card is in PCIe slot 3 (an 8x slot)
    + BIOS (del to enter) -> advanced -> chipset configuration -> north bridge -> iio configuration -> cpu configuration -> IOU1 (x4x4x8)
+ Data disks
  + 4x 14tb HDDs
  + Configured as two mirrored pairs (2 mirror vdevs) in a single pool

## Proxmox commands

To pass the two 4k native disks through to a VM we have to get creative and hand raw qemu args directly to the VM. This is because passthrough isn't really real, it's emulated. By default the emulated disk presents 512-byte sectors, but two of our four 14tb data disks are 4Kn (4k native sectors). The answer used here was found [in this reddit thread](https://www.reddit.com/r/Proxmox/comments/s1oat1/passthrough_4knadvanced_format_disks_to_vm/?share_id=UMzEIhrOHmAiD_28AvevQ).
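If you aren't sure which of your disks are 4Kn, the kernel reports both sector sizes under sysfs. Here is a minimal sketch; the `sector_sizes` helper and the `sda` device name are made up for illustration and are not part of Proxmox.

```shell
# Print "<device> logical=<bytes> physical=<bytes>" for one block device.
#   $1: kernel device name, e.g. sda (hypothetical example name)
#   $2: sysfs root, defaults to /sys (only overridden for testing)
sector_sizes() {
    dev=$1
    sysfs=${2:-/sys}
    queue="$sysfs/block/$dev/queue"
    printf '%s logical=%s physical=%s\n' "$dev" \
        "$(cat "$queue/logical_block_size")" \
        "$(cat "$queue/physical_block_size")"
}

# Usage on the Proxmox host (not run here):
#   sector_sizes sda
```

A 4Kn disk reports `logical=4096 physical=4096`, while a 512e disk reports `logical=512 physical=4096`.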

To work around the 512-byte default we run `qm showcmd <VMID>` to see the kvm command Proxmox generates, then add a hardcoded `logical_block_size=4096,physical_block_size=4096` to the device entries for the 4Kn disks. But first we need to add the disks to our VM so that we get a helping hand with the syntax kvm is expecting.

```
qm set 1001 -scsi5 /dev/disk/by-id/ata-WDC_WUH721414ALE6L4_9MG6ARZJ
qm set 1001 -scsi6 /dev/disk/by-id/ata-WDC_WUH721414ALE6L4_XJG0HXDM
qm set 1001 -scsi7 /dev/disk/by-id/ata-WDC_WUH721414ALN604_9MH2BWLU
qm set 1001 -scsi8 /dev/disk/by-id/ata-WDC_WUH721414ALN604_X1G4EPXL
```
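With more disks, typing these by hand gets tedious. The sketch below only prints the `qm set` commands so they can be reviewed before running; `print_qm_set` is a hypothetical helper name, and it assumes the disks should land on consecutive SCSI slots.

```shell
# Print one `qm set` command per disk, numbering SCSI slots upwards.
#   $1: VMID
#   $2: first SCSI slot number to use
#   remaining args: disk names as they appear under /dev/disk/by-id
print_qm_set() {
    vmid=$1
    slot=$2
    shift 2
    for disk in "$@"; do
        echo "qm set $vmid -scsi$slot /dev/disk/by-id/$disk"
        slot=$((slot + 1))
    done
}

# Usage (prints the commands, does not execute them):
#   print_qm_set 1001 5 ata-WDC_WUH721414ALE6L4_9MG6ARZJ ata-WDC_WUH721414ALE6L4_XJG0HXDM
```

Pipe the output into `sh` once it looks right, or copy and paste the lines.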

The VM config, which lives at `/etc/pve/qemu-server/<VMID>.conf`, will now contain 4 disks mapped to the SCSI slots assigned above (these numbers must be unique). Now we can run:

```
$ qm showcmd 1001
```

This will spew out a really long and intimidating-looking raw kvm command that Proxmox uses to launch the VM. We need to extract the entries for our disks from this output. It can be a bit hard to find what you need at first, so copy and paste the output into a text editor and make your life easier.

```
a
```
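An alternative to the text editor is to filter the output. `qm showcmd` accepts `--pretty`, which puts each option on its own line, so a plain `grep` can pick out the lines for our slots. This is a rough sketch: `extract_scsi_args` is a hypothetical helper, and it assumes every option we care about mentions `scsiN` somewhere in its line.

```shell
# Read `qm showcmd <VMID> --pretty` output on stdin and print only the
# option lines that mention one of the given SCSI slot numbers.
# The trailing ([^0-9]|$) stops slot 5 from also matching scsi50.
extract_scsi_args() {
    pattern="scsi($(echo "$*" | tr ' ' '|'))([^0-9]|\$)"
    grep -E "$pattern"
}

# Usage on the Proxmox host (not run here):
#   qm showcmd 1001 --pretty | extract_scsi_args 5 6 7 8
```

Each disk should show up as three lines: a `virtio-scsi-pci` device, a `-drive`, and a `scsi-hd` device.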

In the end the command we assemble for all 4 disks looks like this. Note that this is only required for the 4k sector native disks, but I wanted to pass the serial number of each disk through to the VM, even for the 512-byte disks; that part is optional.

```
qm set 1001 -args \
" -device 'virtio-scsi-pci,id=virtioscsi5,bus=pci.3,addr=0x6' -drive 'file=/dev/disk/by-id/ata-WDC_WUH721414ALE6L4_9MG6ARZJ,if=none,id=drive-scsi5,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi5.0,channel=0,scsi-id=0,lun=5,drive=drive-scsi5,id=scsi5,serial=9MG6ARZJ' \
-device 'virtio-scsi-pci,id=virtioscsi6,bus=pci.3,addr=0x7' -drive 'file=/dev/disk/by-id/ata-WDC_WUH721414ALE6L4_XJG0HXDM,if=none,id=drive-scsi6,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi6.0,channel=0,scsi-id=0,lun=6,drive=drive-scsi6,id=scsi6,serial=XJG0HXDM' \
-device 'virtio-scsi-pci,id=virtioscsi7,bus=pci.3,addr=0x8' -drive 'file=/dev/disk/by-id/ata-WDC_WUH721414ALN604_9MH2BWLU,if=none,id=drive-scsi7,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi7.0,channel=0,scsi-id=0,lun=7,logical_block_size=4096,physical_block_size=4096,drive=drive-scsi7,id=scsi7,serial=9MH2BWLU' \
-device 'virtio-scsi-pci,id=virtioscsi8,bus=pci.3,addr=0x9' -drive 'file=/dev/disk/by-id/ata-WDC_WUH721414ALN604_X1G4EPXL,if=none,id=drive-scsi8,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi8.0,channel=0,scsi-id=0,lun=8,logical_block_size=4096,physical_block_size=4096,drive=drive-scsi8,id=scsi8,serial=X1G4EPXL' "
```
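The four entries above differ only in slot number, PCI address, disk name, and whether the block size override is present, so the string can be generated rather than hand-edited. This is a sketch under assumptions: `disk_args` is a hypothetical helper, `bus=pci.3` and the `addr` values are copied from the output above and may differ on your VM, and the serial is assumed to be the text after the last underscore in the by-id name.

```shell
# Build the three QEMU options for one passed-through disk.
#   $1: SCSI slot number      $2: PCI addr, e.g. 0x6
#   $3: name under /dev/disk/by-id
#   $4: "4kn" to add the 4096-byte block size override (optional)
disk_args() {
    slot=$1
    addr=$2
    disk=$3
    mode=${4:-}
    serial=${disk##*_}   # text after the last underscore
    blk=""
    if [ "$mode" = "4kn" ]; then
        blk="logical_block_size=4096,physical_block_size=4096,"
    fi
    printf '%s %s %s' \
        "-device 'virtio-scsi-pci,id=virtioscsi${slot},bus=pci.3,addr=${addr}'" \
        "-drive 'file=/dev/disk/by-id/${disk},if=none,id=drive-scsi${slot},format=raw,cache=none,aio=io_uring,detect-zeroes=on'" \
        "-device 'scsi-hd,bus=virtioscsi${slot}.0,channel=0,scsi-id=0,lun=${slot},${blk}drive=drive-scsi${slot},id=scsi${slot},serial=${serial}'"
}

# Usage on the Proxmox host (not run here):
#   qm set 1001 -args "$(disk_args 5 0x6 ata-WDC_WUH721414ALE6L4_9MG6ARZJ) $(disk_args 7 0x8 ata-WDC_WUH721414ALN604_9MH2BWLU 4kn)"
```

Compare the generated string against what `qm showcmd` printed for your VM before applying it.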

Once this is done, edit the VM config file to remove the originally mapped (and now superfluous) scsi device mappings, or run:

```
qm unlink 1001 --idlist scsi5
qm unlink 1001 --idlist scsi6
qm unlink 1001 --idlist scsi7
qm unlink 1001 --idlist scsi8
```
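Before starting the VM it is worth confirming that the old mappings really are gone, since a disk defined both in a `scsiN:` line and in `args` would be attached twice. A small sketch, assuming `qm config` prints one `key: value` line per setting; `leftover_slots` is a hypothetical helper name.

```shell
# Read `qm config <VMID>` output on stdin and print any scsiN: entries
# that still exist for the given slot numbers. The `args:` line is not
# matched because the pattern is anchored to the start of the line.
leftover_slots() {
    grep -E "^scsi($(echo "$*" | tr ' ' '|')):"
}

# Usage on the Proxmox host (not run here):
#   qm config 1001 | leftover_slots 5 6 7 8
```

No output means the slots are clear and only the `args` line refers to the disks.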

Then, if all went well, run:

```
qm start <VMID>
```

And your disks will show up with their native 4k sectors, making ZFS and anything else fussy happy.
73 changes: 73 additions & 0 deletions hardware/moose/nix/configuration.nix
@@ -0,0 +1,73 @@
{ config, pkgs, ... }:
{
  imports =
    [
      ./hardware-configuration.nix
      (fetchTarball "https://github.com/nix-community/nixos-vscode-server/tarball/master")
    ];

  boot.loader.systemd-boot.enable = true;
  boot.loader.efi.canTouchEfiVariables = true;
  boot.kernelModules = [ "drivetemp" ];

  boot.supportedFilesystems = [ "zfs" ];
  boot.zfs.extraPools = [ "jbdata" ];
  services.zfs.autoScrub.enable = true;

  networking.networkmanager.enable = true;
  networking.hostName = "moose-jbdata";
  networking.hostId = "a13b14c1";
  time.timeZone = "America/Toronto";

  i18n.defaultLocale = "en_US.UTF-8";
  services.xserver.xkb.layout = "us";

  users.users.moose = {
    isNormalUser = true;
    description = "alex";
    extraGroups = [ "networkmanager" "wheel" ];
    packages = with pkgs; [];
  };
  users.defaultUserShell = pkgs.bash;
  programs.bash.interactiveShellInit = "figurine -f \"3d.flf\" moose-jbdata";

  # Allow unfree packages
  nixpkgs.config.allowUnfree = true;
  environment.systemPackages = with pkgs; [
    ansible
    dig
    e2fsprogs # badblocks
    figurine
    gcc
    git
    htop
    inxi
    iotop
    lm_sensors
    mc
    molly-guard
    ncdu
    nmap
    python3
    smartmontools
    tmux
    tree
    vim
    wget

    # zfs send/rec with sanoid/syncoid
    lzop
    mbuffer
    pv
    sanoid
    zstd
  ];

  services.tailscale.enable = true;
  boot.kernel.sysctl."net.ipv4.ip_forward" = 1;

  services.openssh.enable = true;
  services.qemuGuest.enable = true;

  system.stateVersion = "24.05";
}
40 changes: 0 additions & 40 deletions hardware/ont/HL15.md

This file was deleted.
