On ovh1 and ovh2 we use Proxmox to manage VMs.
TODO this page is really incomplete!
Every VM / CT is backed up twice a week using the general Proxmox backup, in a specific ZFS dataset (see Datacenter -> Backup).
We configure the network through virtual bridges (`vmbr<n>`) that are themselves linked to a physical network device.
Also note that [NGINX proxy] has its own IP address on a separate bridge, so that it can get network traffic directly.
Local containers normally set their network to `10.1.0.<ct number>/24`, but they route through `10.0.0.1` (or `10.0.0.2`), which they define as their gateway. This is all managed by Proxmox.
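As an illustration, the generated setting shows up as a `net0` line in the container configuration; a hedged sketch with a hypothetical container 113 (bridge, gateway and hardware address below are examples and vary per host, see the tables below):

```bash
# show the generated network line of a (hypothetical) container 113
grep ^net0 /etc/pve/lxc/113.conf
# typical output -- bridge, gateway and hwaddr are examples:
# net0: name=eth0,bridge=vmbr0,firewall=1,gw=10.0.0.1,hwaddr=AA:BB:CC:DD:EE:FF,ip=10.1.0.113/24,type=veth
```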
On the OVH servers (ovh1/ovh2), vmbr0 is the bridge for the local network and vmbr1 is the public interface:
Name | Type | Active | Autostart | VLAN | Ports/Slaves | Bond Mode | CIDR | Gateway | Comment |
---|---|---|---|---|---|---|---|---|---|
enp5s0f0 | Network Device | Yes | No | No | | | | | |
enp5s0f1 | Network Device | Yes | No | No | | | | | |
vmbr0 | Linux Bridge | Yes | Yes | No | enp5s0f1 | | 10.0.0.1/8 2001:41d0:0203:948c:1::1/80 | | |
vmbr1 | Linux Bridge | Yes | Yes | No | enp5s0f0 | | 146.59.148.140/24 2001:41d0:0203:948c:0::1/80 | 146.59.148.254 2001:41d0:0203:94ff:ffff:ffff:ffff:ffff | |
On the Free servers (off1/off2), vmbr0 is the public interface and vmbr1 is the bridge for the local network:
Name | Type | Active | Autostart | VLAN | Ports/Slaves | Bond Mode | CIDR | Gateway | Comment |
---|---|---|---|---|---|---|---|---|---|
eno1 | Network Device | Yes | No | No | | | | | |
eno2 | Network Device | Yes | No | No | | | | | |
vmbr0 | Linux Bridge | Yes | Yes | No | eno1 | | 213.36.253.208/27 | 213.36.253.222 | |
vmbr1 | Linux Bridge | Yes | Yes | No | eno2 | | 10.0.0.2/8 | | Internal network with VM and other free servers |
This corresponds to the following settings in `/etc/network/interfaces` (generated by PVE):
```
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto vmbr0
iface vmbr0 inet static
    address 213.36.253.208/27
    gateway 213.36.253.222
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

auto vmbr1
iface vmbr1 inet static
    address 10.0.0.2/8
    bridge-ports eno2
    bridge-stp off
    bridge-fd 0
    post-up echo 1 > /proc/sys/net/ipv4/ip_forward
    post-up iptables -t nat -A POSTROUTING -s '10.1.0.0/16' -o vmbr0 -j MASQUERADE
    post-down iptables -t nat -D POSTROUTING -s '10.1.0.0/16' -o vmbr0 -j MASQUERADE
#Internal network with VM and other free servers
```
FIXME: I added the last lines (post-up/post-down) myself; I'm not sure whether PVE would normally add them by itself.
We use the Proxmox firewall on the host. FIXME: to be completed.
We have a masquerading rule for 10.1.0.1/24.
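To check that the masquerading rule is actually in place on the host, one hedged way is to list the NAT table (the exact output varies):

```bash
# list NAT POSTROUTING rules and look for the MASQUERADE entry on vmbr0
sudo iptables -t nat -L POSTROUTING -n -v | grep MASQUERADE
```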
We have a minimal set of users and groups.
The admin group is for Proxmox admins (Administrator role). ro_users gives read-only access to the interface (PVEAuditor role).
We put users (see Users in the Proxmox interface) in groups (see Groups in the Proxmox interface), and give roles to groups (see Permissions in the Proxmox interface).
Remove the enterprise repository and add the no-subscription one:

```
rm /etc/apt/sources.list.d/pve-enterprise.list
echo "deb http://download.proxmox.com/debian/pve "$(lsb_release --short --codename)" pve-no-subscription" > /etc/apt/sources.list.d/pve-install-repo.list
apt update
```
Verify that smartctl is activated.
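A minimal way to check this from the host (device names below are examples; use the disks actually present):

```bash
# list devices known to smartmontools
sudo smartctl --scan
# check the overall health of one disk (example device name)
sudo smartctl -H /dev/sda
```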
Interesting page: Techno Tim
IOMMU ?
Don't forget to schedule backups.
VM 101 is an HTTP / HTTPS proxy to all services.
It has its own bridge interface with a public-facing IP.
At OVH we have special DNS entries:

- `proxy1.openfoodfacts.org` pointing to the OVH reverse proxy
- `off-proxy.openfoodfacts.org` pointing to the Free reverse proxy
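To verify which address each entry currently resolves to, you can query them from any machine:

```bash
# check the public DNS entries for the two reverse proxies
dig +short proxy1.openfoodfacts.org
dig +short off-proxy.openfoodfacts.org
```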
VM and container storage is regularly synchronized to ovh3 (and possibly to ovh1/2) so that we have a continuous backup.
Replication can be seen in the web interface, by clicking on the "Replication" section of a particular container / VM.
This is managed with the command-line tool `pvesr` (PVE Storage Replication). See the official doc.
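For a quick look from the host shell, `pvesr` can list the configured jobs and their last run; a minimal sketch (run on the node holding the guests):

```bash
# list replication jobs configured on this node
sudo pvesr list
# show the status / last sync of the local replication jobs
sudo pvesr status
```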
You may want to move containers or VMs from one server to another.
Just go to the interface, right click on the VM / container and ask to migrate!
If you have a large disk, you may want to first set up replication of your disk to the target server (see Storage synchronization), schedule it immediately (schedule button), and then run the migration.
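The same can be done from the CLI if you prefer; a hedged sketch (113 and 104 are hypothetical guest ids, ovh2 an example target node):

```bash
# migrate a container (restart mode: the CT is stopped, moved and started on the target)
sudo pct migrate 113 ovh2 --restart
# migrate a running VM online
sudo qm migrate 104 ovh2 --online
```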
Sometimes you may get email alerts telling you that a backup failed on a VM because it is locked (`CT is locked`).
This might be a temporary issue, so you should first verify in the Proxmox console whether it's already resolved.
If not, you can unlock it using this command: `pct unlock <vm-id>`
We use two types of storage: NVMe and ZFS storage. There are also mounts of ZFS storage from ovh3.
TODO tell much more
You can bind mount folders of the host in the containers. This way you can also mount a ZFS dataset in a container.
You cannot add them through the web interface; instead edit `/etc/pve/lxc/<container_id>.conf` and add lines like: `mp0: /absolute/path/on/host,mp=/absolute/path/in/container`.
Note that this will be available only after a reboot of the container: `pct reboot <container_id>` (or `pct start <container_id>` if it was stopped).
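For example, a hedged sketch of adding such a bind mount from the host (the container id and paths are made up):

```bash
# expose a host directory (e.g. a ZFS dataset mountpoint) inside container 113
echo 'mp0: /zfs-hdd/some-dataset,mp=/mnt/some-dataset' >> /etc/pve/lxc/113.conf
# the mount only appears after a restart of the container
pct reboot 113
```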
See also Proxmox documentation on mount points
See How to add disk space on a qemu VM.
This user will have the right to use the Proxmox administration interface.
See `sudo pveum help user add` and https://pve.proxmox.com/wiki/User_Management#_command_line_tool
`userid`: use something like `jdoe@pve` (for John Doe).
:pencil: Note: the user login is the part before `@pve`. Here our user will have to use `jdoe` as username and "Proxmox VE authentication server" as the authentication service.
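A minimal sketch of creating such a user from the host shell (the name is an example):

```bash
# create the user in the PVE authentication realm ("Proxmox VE authentication server")
sudo pveum user add jdoe@pve --comment "John Doe"
```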
To reset a user password, use `sudo pveum passwd <userid>`.
On Proxmox we created a read-only group:

```
$ pveum group add ro_users -comment "Read only users"
$ pveum acl modify / -group ro_users -role PVEAuditor
```
On Proxmox we created an administrators group:

```
$ pveum group add admin -comment "System Administrators"
$ pveum acl modify / -group admin -role Administrator
```
To add a user to the admin group (here `jdoe@pve`):

```
$ pveum user modify jdoe@pve -group admin
```
You need to have an account on the Proxmox infra.
Using the web interface:
- Login to the Proxmox web interface
- Click on "Create CT"
- Note carefully the container number (ct number)
- Use a "Hostname" to let people know what it is about. Eg. "robotoff", "wiki", "proxy"...
- Set the Nesting option (recent systemd versions need it)
- Keep the "Unprivileged container" option checked… unless you know what you are doing
- Password: put something complex and forget it, as we will connect through SSH and not the web interface (`pwgen 20 20` is your friend)
- Create a root password and forget about it as well (you will use `pct enter` or `lxc-attach`)
- Choose a template (normally Debian)
- Disk: try to keep a tight disk space and avoid using NVMe if it's not useful
- Swap: you might choose 0B (do you really need swap?) or a sensible value
- Network:
  - Bridge: vmbr0 (may vary, currently vmbr1 on off2!) - the one which is an "Internal network"
  - IPv4: `10.1.0.<ct number>/24` (you need to use /24; the end of the IPv4 should be the same as the Proxmox container id; container 101 has the 10.1.0.101 IP)
  - Gateway: `10.0.0.1` (may vary, `10.0.0.2` on off2 or ovh2!)
- Start the server after install
Wait for the container to be created and started!
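For reference, the same kind of container can in principle be created from the host CLI with `pct create`; this is only a hedged sketch (the id, template name, storage name, sizes, bridge and gateway below are placeholders, adapt them to the actual host):

```bash
# create an unprivileged Debian CT with nesting, internal network and an IP matching its id
sudo pct create 113 local:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst \
  --hostname mynewservice \
  --unprivileged 1 --features nesting=1 \
  --rootfs local-zfs:8 --memory 1024 --swap 0 \
  --net0 name=eth0,bridge=vmbr0,ip=10.1.0.113/24,gw=10.0.0.1 \
  --password "$(pwgen 20 1)" \
  --start 1
```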
Then connect to the proxmox host:
- Install useful packages and do some other configuration: `sudo /root/cluster-scripts/ct_postinstall` (or `/opt/openfoodfacts-infrastructure/scripts/proxmox-management/ct_postinstall`); choose the container ID when asked.
- Optionally disable systemd services that are not needed (and would crash on unprivileged containers):
  - disable systemd-logind (see below)
  - `sudo systemctl disable --now sys-kernel-config.mount`
  - running `sudo systemctl list-units --state=failed` is a good idea
- Create a user; most of the time you prefer the off user to have id 1000 (a minimal sketch follows this list).
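A minimal sketch of that last step, run from the Proxmox host (the container id 113 is hypothetical, and the `adduser` options can be adapted if the user needs a password):

```bash
# create the off service user with uid 1000 inside the container
sudo pct exec 113 -- adduser --uid 1000 --disabled-password --gecos "" off
```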
Then you can login to the machine (see logging in to a container or VM).
Using the web interface:
- Check the "Options" of the container and set:
  - Start at boot: Yes
  - Protection: Yes (to avoid deleting it by mistake)
- Optionally add replication to ovh3 or off1/2 (if we are not using sanoid/syncoid instead):
  - In the Replication menu of the container, "Add" one
  - Target: ovh3
  - Schedule: */5 if you want every 5 minutes (takes less than 10 seconds, thanks to ZFS)
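The same replication job can also be created from the CLI; a hedged sketch (113 is a hypothetical container id, `113-0` its first job id):

```bash
# replicate CT 113 to ovh3 every 5 minutes
sudo pvesr create-local-job 113-0 ovh3 --schedule "*/5"
```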
Also think about configuring email in the container
Most of the time we use SSH to connect to containers and VMs. See how to create a user in a container or VM.
For the happy few sudoers on the host, they can attach to containers using `pct enter <num>` (or `lxc-attach -n <num>`), where `<num>` is the VM number.
This gives a root console in the container and has the advantage of not depending on the container's network state.
If you add a user which is not a person needing an SSH account (e.g. the off service account), just open the console and use the `adduser` command.
The `sudo /root/cluster-scripts/mkuser` script (see script mkuser) helps you create users using their GitHub keys.
Alternatively `sudo /root/cluster-scripts/mkuseralias` can be used if the username on the server is different from the GitHub username.
If it takes a long time when you SSH to a container, here is a possible fix.
See https://gist.github.com/charlyie/76ff7d288165c7d42e5ef7d304245916:

```
# If Debian 11 is ran on a LXC container (Proxmox), SSH login and sudo actions can be slow
# Check if in /var/log/auth.log the following messages
Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
-> Run `systemctl mask --now systemd-logind`
-> Run `pam-auth-update` (and deselect `Register user sessions in the systemd control group hierarchy`)
```
Proxmox is installed from a bootable USB disk based on the Proxmox VE ISO, the same way you would install a Debian.
Some systemd services might not work because they need nesting capabilities (and AppArmor is blocking them).
Using `systemctl list-units --failed`, one can see that `systemd-networkd` is down; it might be the same for `systemd-logind` and `systemd-resolved`.
Some logs that may appear:
systemd-networkd "Failed to set up mount namespacing" "/run/systemd/unit-root/proc" "Permission denied" lxc
```
nov. 28 18:40:57 proxy systemd[123]: systemd-networkd.service: Failed to set up mount namespacing: /run/systemd/unit-root/proc: Permission denied
nov. 28 18:40:57 proxy systemd[123]: systemd-networkd.service: Failed at step NAMESPACE spawning /lib/systemd/systemd-networkd: Permission denied
```
One symptom is a slow login, which is due to the systemd-logind service being down:
```
Mar 29 10:37:53 proxy dbus-daemon[128]: [system] Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
Mar 29 10:42:43 proxy dbus-daemon[128]: [system] Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
```
Just add nesting capability to the container and restart it.
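From the host CLI this amounts to something like the following (the container id is an example):

```bash
# enable nesting for the container, then restart it so the change applies
sudo pct set 101 --features nesting=1
sudo pct reboot 101
```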
Thread on same issue : https://discuss.linuxcontainers.org/t/apparmor-blocks-systemd-services-in-container/9812