VM is not created properly or completely, no Cloud-init drive after creation #1104

Closed
yyy916 opened this issue Sep 16, 2024 · 27 comments · Fixed by #1125

yyy916 commented Sep 16, 2024

Hi! I have no prior knowledge of any of this, so I may have made many mistakes. Please help me out.
I'm using an Ubuntu 24.04 VM template to create VMs with Terraform; I followed this tutorial to create the template: https://tcude.net/creating-a-vm-template-in-proxmox/
My main.tf is as follows:

terraform {
  required_providers {
    proxmox = {
      source  = "Telmate/proxmox"
      version = "3.0.1-rc3"
    }
  }
}

provider "proxmox" {
  pm_api_url      = "https://xxxxx:8006/api2/json"
  pm_user         = "xyz@pve"
  pm_password     = "xyz" # Or use an API token
  pm_tls_insecure = true
  pm_debug        = true
  pm_log_enable   = true
}

resource "proxmox_vm_qemu" "my_test_vms" {
count = 1
name = "Test-vm-${count.index + 1}"

onboot = true
vm_state = "running"

target_node = "xxx"
memory = 2048
cores = 2
sockets = 1
clone = "template-name"
network {
model = "virtio"
bridge = "vmbr0"
}

ipconfig0 = "ip=dhcp"

}

This gives me a VM with no cloud-init drive, stuck in a continuous boot loop like the pics attached. The Ubuntu OS doesn't start up.

[screenshots: VM stuck in a continuous boot loop]

Now, if I remove bridge = "vmbr0" and ipconfig0 = "ip=dhcp", the VM is created with the following error. It does start once started manually, though.

proxmox_vm_qemu.my_test_vms[0]: Creating...
proxmox_vm_qemu.my_test_vms[0]: Still creating... [10s elapsed]

│ Error: error updating VM: 500 no sdn vnet ID specified, error status: {"data":null} (params: map[agent:0 bios:seabios cores:2 cpu:host delete:ide2,scsi0 hotplug:network,disk,usb kvm:true memory:2048 name:Test-vm-1 net0:virtio=5E:74:ED:5A:07:CC numa:false onboot:true protection:false scsihw:lsi sockets:1 tablet:true vmid:102])

│ with proxmox_vm_qemu.my_test_vms[0],
│ on main.tf line 20, in resource "proxmox_vm_qemu" "my_test_vms":
│ 20: resource "proxmox_vm_qemu" "my_test_vms" {

If I remove just the line bridge = "vmbr0", it gives a similar error and also starts when started manually:
Error: error updating VM: 500 no sdn vnet ID specified, error status: {"data":null} (params: map[agent:0 balloon:0 bios:seabios cicustom: cipassword: ciupgrade:0 cores:2 cpu:host delete:ide0,ide2,scsi0,ciuser,searchdomain,nameserver,shares,serial0 hotplug:network,disk,usb ipconfig0:ip=dhcp kvm:true memory:4096 name:induz-memcache-vm-1 net0:virtio=CE:F6:F8:EA:5F:B9 numa:0 onboot:true protection:false scsihw:lsi sockets:1 sshkeys:%0A tablet:true vmid:100])
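(The failing params in both errors show net0:virtio=<MAC> with no bridge option, so PVE appears to treat a NIC without a bridge as a reference to an SDN vnet, hence the 500. A minimal sketch of the network block that sidesteps this, assuming vmbr0 is the default Linux bridge:

network {
  model  = "virtio"
  bridge = "vmbr0" # omitting this makes PVE look for an SDN vnet ID
}
)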

yyy916 (Author) commented Sep 16, 2024

Any feedback at all is much needed and appreciated!

yyy916 closed this as completed Sep 16, 2024
yyy916 reopened this Sep 16, 2024
yyy916 (Author) commented Sep 16, 2024

I have also experienced plugin crashes and "can't lock file ... - got timeout" errors similar to #1101.

yyy916 (Author) commented Sep 16, 2024

With the newest version, "3.0.1-rc4", I get a created VM with no errors, but it doesn't start: it keeps connecting and disconnecting in a loop.

regexhater commented
I have the same problem. What's more, I've noticed that when running two VMs that differ only in the template they are cloned from, one boots and the other has the same problem described in this thread.

Tinyblargon (Collaborator) commented
@yyy916

  • Which version of PVE are you using?
  • How much is configured in the template? Only a boot disk is needed; everything else is managed by Terraform.
  • Does the created VM have any detached disks?
  • Is the boot order configured?

yyy916 (Author) commented Sep 17, 2024

@Tinyblargon

  1. I used the Proxmox-VE_8.2-1 ISO image. pveversion shows proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve),
    pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4).
  2. I followed everything in the link https://tcude.net/creating-a-vm-template-in-proxmox/ and attached an extra hard drive for storage.
  3. I don't think there are any unattached or orphaned disks left over from deleting a VM; when there are, I usually hit a failed apply and then delete them. I'm assuming you're asking about cloud-init disks, which I delete using the rbd rm command (see the sketch after this list). Please help me understand if I misunderstood something!
  4. Again, I followed the link above.
    [screenshot: template boot-order settings]
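A sketch of the cleanup mentioned in point 3, assuming Ceph RBD storage; the pool name and VM ID are placeholders (Proxmox names these volumes vm-<vmid>-cloudinit):

POOL=rbd                           # your Ceph pool name
rbd -p $POOL ls | grep cloudinit   # list leftover cloud-init volumes
rbd rm $POOL/vm-102-cloudinit      # remove an orphaned volume by name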

yyy916 (Author) commented Sep 17, 2024

@Tinyblargon
Please let me know if I haven't given you enough details! Do you need me to attach any logs or other info?

Tinyblargon (Collaborator) commented
@yyy916 In the article you linked, scsi0 seems to be the boot disk. Could you change the boot order to include only scsi0 and see if that fixes it?
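On the PVE node that would be, for example (the template's VM ID, 9000 here, is a placeholder):

qm set 9000 --boot order=scsi0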

yyy916 (Author) commented Sep 17, 2024

I will try and let you know!

yyy916 (Author) commented Sep 17, 2024

Hey, sorry for the delay. There was a plugin init error.
So I edited out the boot order in the template and regenerated it. It gives me the same output: the VM is created without a cloud-init drive and keeps having connection issues.
[screenshot: VM connection state]

I enabled only scsi0. After regenerating the image, the order changed to scsi0,ide0.

yyy916 (Author) commented Sep 17, 2024

My created VM has this boot order:
[screenshot: created VM's boot order]

yyy916 (Author) commented Sep 17, 2024

@Tinyblargon Could I be missing any other details?
I even tried setting bootdisk = "scsi0" in my main.tf.

Tinyblargon (Collaborator) commented
@yyy916 One of the things that I think is going wrong is that the guide you followed works under the assumption that template settings will be preserved.

Terraform will create the new VM with no regard for the template. Therefore, the only thing needed in the template is a boot disk.
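A minimal sketch of such a template, assuming local-lvm storage and the Ubuntu 24.04 (noble) cloud image; VM ID 9000 and the template name are placeholders:

wget -q https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
qm create 9000 --name ubuntu-2404-template --memory 1024 --net0 virtio,bridge=vmbr0
qm importdisk 9000 noble-server-cloudimg-amd64.img local-lvm
# attach the imported disk as the boot disk, then convert to a template
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
qm set 9000 --boot order=scsi0
qm template 9000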

yyy916 (Author) commented Sep 17, 2024

@Tinyblargon So how do you recommend I create the template? I need an Ubuntu 24.04 template. Could you please give me a step-by-step guide? I tried whatever I found; for some reason it always ends up the same.

Uncurlhalo commented
So I've been battling this same issue today and ultimately managed to find a solution that seems to work. I'll provide some links and screenshots, as well as the resource block for creating the VMs that fixed my issue. There is clearly a documentation gap here, but I see you have an issue open to address it for a planned release.

To start with, I prepared a template per the instructions on the Proxmox wiki. I called the resulting template ubuntu-cloud-init-template. These images show the "Options" and "Hardware" tabs of the template:
[screenshots: template Hardware and Options tabs]

Once I had my template VM, I created a resource to make my VM. I used the existing docs on creating a cloud-init file in snippets to create user-data.yml with the contents I needed for my use case, but I think this should be easy enough to understand and implement on your own. The core of the issue, I think, is that cloud-init expects a serial console. My template had one, but the resource seems to just clone the disk image, so you need to make sure you create one as part of your resource. I could be totally off the mark with that, but it got me out of the looping startup hell.

resource "proxmox_vm_qemu" "k8s-control-plane" {
  # count of number of control nodes
  count = var.control_node_spec.count

  # Start of actual resources for the VM
  name        = "k8s-control-${count.index}"
  target_node = var.node_name

  vmid    = format("${var.control_node_spec.vm_id_prefix}%02d", count.index)
  desc    = format("Kubernetes Control Plane %02d", count.index)
  tags    = "k8s,control-plane"
  os_type = "cloud-init"

  # clone my existing template
  clone = "ubuntu-cloud-init-template"

  # start at boot
  onboot = true

  # expect qemu-agent to be enabled and tell it we are booting linux guests
  agent   = 1
  qemu_os = "l26"
  scsihw  = "virtio-scsi-pci"

  # define resources
  cpu     = "host"
  cores   = var.control_node_spec.cores
  sockets = 1

  # define memory
  memory = var.control_node_spec.memory

  # specify our custom userdata script
  cicustom = "user=local:snippets/k8s-user-data.yml"

  # create my disks
  disks {
    ide {
      ide2 {
        cloudinit {
          storage = "local"
        }
      }
    }
    scsi {
      scsi0 {
        disk {
          size    = "20G"
          storage = "local-lvm"
          format  = "raw"
          cache   = "none"
          backup  = false
        }
      }
    }
  }

  # This is mandatory for some reason
  serial {
    id   = 0
    type = "socket"
  }

  # define network interfaces
  network {
    model  = "virtio"
    bridge = "vmbr0"
  }

  # set cloud init networking info, look at providing with cicustom
  ipconfig0 = format("ip=192.168.1.2%02d/24,gw=192.168.1.1", count.index)
}

Tinyblargon (Collaborator) commented
@Uncurlhalo Thanks for the help with this issue. The information you provided will go a long way when I start working on #1105.

Uncurlhalo commented
I also tested this with ipconfig0 = "ip=dhcp", since I know someone mentioned something about that before. It seemed to work fine: the VMs started and I saw they got leases from my router.

Stankye (Contributor) commented Sep 19, 2024

I recently updated my automation; some quick notes on it that might help when you are looking things over (UEFI was a pain):

These are my template creation and disk import lines. I do not import the efidisk in TF, so that might need to be documented.
qm create 9107 --name "debian-12-cloud-init-dhcp-cis" --bios ovmf --machine q35 --efidisk0 local-lvm:0,efitype=4m,pre-enrolled-keys=1 --cpu cputype=host --cores 2 --memory 4096 --net0 virtio,bridge=vmbr0,mtu=1

qm importdisk 9107 debian-12-genericcloud-amd64-20240901-1857.qcow2 local-lvm

--efidisk0 local-lvm:0,efitype=4m,pre-enrolled-keys=1

pre-enrolled-keys does not seem to be configurable. https://github.com/Telmate/terraform-provider-proxmox/blob/master/proxmox/resource_vm_qemu.go

This will prevent images that do not have Secure Boot enabled (like Alpine) from booting.
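Since the provider doesn't expose pre-enrolled-keys, one workaround is to create the template's EFI disk with the keys disabled so clones inherit it. A sketch reusing the qm create line above with only that flag changed (the VM ID and name are placeholders):

qm create 9108 --name "alpine-cloud-init" --bios ovmf --machine q35 \
    --efidisk0 local-lvm:0,efitype=4m,pre-enrolled-keys=0 \
    --cpu cputype=host --cores 2 --memory 4096 --net0 virtio,bridge=vmbr0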

resource "proxmox_vm_qemu" "pihole" {
    name = "pihole"
    desc = "pihole_alloy_debian12_dhcp_cis"
    tags = ["pihole", "alloy", "debian12", "cis", "docker"]

    target_node = "pve1"
    
    # Start on boot
    # onboot = true

    # The destination resource pool for the new VM
    # pool = "pool0"

    # The template name to clone this vm from
    clone = "debian-12-cloud-init-dhcp-cis"

    # ovmf is required for UEFI
    bios = "ovmf"

    # Linked (fast) clone
    full_clone = false

    # Activate QEMU agent for this VM
    agent = 1

    # Set to cloud-init
    os_type = "cloud-init"
    cores = 2
    sockets = 1
    cpu = "host"
    memory = 4096
    scsihw = "virtio-scsi-single" # Benchmarks faster then iscsi https://kb.blockbridge.com/technote/proxmox-aio-vs-iouring/#recommended-settings

    # Setup the disks
    # scsi1 is the cloud-init disk. UEFI does not work with IDE.
    # Modify discard if you are running on SSDs.
    disks {
        scsi {
            scsi1 {
                cloudinit {
                    storage = "local-lvm"
                }
            }
            scsi0 {
                disk {
                    size            = 16
                    cache           = "writeback"
                    storage         = "local-lvm"
                    iothread        = true
                    asyncio         = "io_uring"
                    discard         = false
                }
            }
        }
    }

    network {
        model = "virtio"
        bridge = "vmbr0"
        tag = 0
        mtu = 1
    }

    # scsi0 is the boot/OS disk.
    boot = "order=scsi0"
    # This sets the IP address to DHCP. Note: if you are cloning from a template, make sure DHCP gets seeded properly; this may require "echo -n > /etc/machine-id" in the template.
    ipconfig0 = "ip=dhcp"

    sshkeys = trimspace(data.local_file.ssh_public_key.content)
    ci_wait = 30
    ciuser = "debian"
    cipassword = "password"
}

edit: https://github.com/Stankye/Proxmox-CloudInit-Template/blob/main/debian.md

ElForastero commented Sep 20, 2024

  • PVE 8.2.5
  • provider RC4

Disclaimer: I'm a newbie in Terraform.

I created and configured an Ubuntu cloud-init image, converting it into a VM template per the official PVE guide, and configured the boot order, serial console, etc.

[screenshot: template configuration]

UPD: I figured it out. I had to add a second disk, ide2, in order to mount the cloud-init volume properly. 👍
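In resource form, that second disk is the ide2 cloudinit entry (a sketch; the storage name is an assumption, and @Uncurlhalo's full block above shows it in context):

disks {
  ide {
    ide2 {
      cloudinit {
        storage = "local-lvm" # same storage pool as the OS disk, assumed
      }
    }
  }
  # ...OS disk declared as elsewhere in this thread
}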

yyy916 (Author) commented Sep 24, 2024

(quoting @Uncurlhalo's comment and config above in full)

@Uncurlhalo This config works for my existing template too!!

yyy916 (Author) commented Sep 24, 2024

I still get plugin crashes and "can't lock file" errors, and I'm still trying to get an SSH connection working, but I can finally get a working VM that starts after creation. Thank you so very much!! @Tinyblargon @Uncurlhalo

sewe75 commented Oct 1, 2024

Had the same issues yesterday, but thanks to this (and some other issues) as well as the latest docs by @Tinyblargon, I finally got it working and decided to create a short starter README, which you can find here: https://github.com/sewe75/proxmox-terraform.
It's mostly my own notes, but maybe it's a starting point for others as well.

Thanks for this provider and for continuing to work on it!

M-Barrows commented

@sewe75 I hadn't changed anything in my PVE setup, and all of a sudden my deployments started failing with Error: error updating VM: Put "https://<server_ip>/api2/json/nodes/<server_name>/qemu/<node_id>/config": EOF, error status: no body available for HTTP response

I tried to use your guide (I even updated from PVE 7 to 8 to match your example) and I'm still experiencing the same error. Have you seen this issue before?

Tinyblargon (Collaborator) commented
@M-Barrows which PVE version are you on?

M-Barrows commented Oct 7, 2024

@Tinyblargon
PVE 8.2.7
Terraform v1.9.7
on linux_amd64

  • provider registry.terraform.io/telmate/proxmox v3.0.1-rc4

lukebarton commented
I had this issue and solved it by explicitly declaring the disks:

    disks {
        ide {
            ide0 {
                cloudinit {
                    storage = "local-zfs"
                }
            }
        }

        virtio {
            virtio0 {
                disk {
                    size = "32G"
                    storage = "local-zfs"
                }
            }
        }
    }

M-Barrows commented Oct 8, 2024

Finally got a working setup. I think the issue was tied to EFI. The heavy lifting was done by the template creation script below, but I had to make sure my disks section matched the new template setup: before, I was using ide and scsi, but now I'm using scsi and virtio. (I'm still new to all this, so I can't tell you exactly why this works, but I'm happy it does!) Here's my working setup, for future documentation.

One bug with the current setup is that any update to these resources triggers a reboot in PVE that never completes; I have to stop the resources manually first. I'm okay with that for now, but it would be good to look into if possible.

Cloud-init template creation script

Taken from https://github.com/UntouchedWagons/Ubuntu-CloudInit-Docs/blob/main/samples/ubuntu/ubuntu-noble-cloudinit.sh

#! /bin/bash

VMID=8000
STORAGE=local-lvm

set -x
rm -f noble-server-cloudimg-amd64.img
wget -q https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
qemu-img resize noble-server-cloudimg-amd64.img 8G
qm destroy $VMID
qm create $VMID --name "ubuntu-noble-template" --ostype l26 \
    --memory 1024 --balloon 0 \
    --agent 1 \
    --bios ovmf --machine q35 --efidisk0 $STORAGE:0,pre-enrolled-keys=0 \
    --cpu host --cores 1 --numa 1 \
    --vga serial0 --serial0 socket  \
    --net0 virtio,bridge=vmbr0,mtu=1
qm importdisk $VMID noble-server-cloudimg-amd64.img $STORAGE
qm set $VMID --scsihw virtio-scsi-pci --virtio0 $STORAGE:vm-$VMID-disk-1,discard=on
qm set $VMID --boot order=virtio0
qm set $VMID --scsi1 $STORAGE:cloudinit

cat << EOF | tee /var/lib/vz/snippets/ubuntu.yaml
#cloud-config
runcmd:
    - apt-get update
    - apt-get install -y qemu-guest-agent
    - systemctl enable ssh
    - reboot
# Taken from https://forum.proxmox.com/threads/combining-custom-cloud-init-with-auto-generated.59008/page-3#post-428772
EOF

qm set $VMID --cicustom "vendor=local:snippets/ubuntu.yaml"
qm set $VMID --tags ubuntu-template,noble,cloudinit
# qm set $VMID --ciuser serveradmin
# qm set $VMID --sshkeys ~/.ssh/authorized_keys
qm set $VMID --ipconfig0 ip=dhcp
qm template $VMID

TF Provider

terraform {
  required_providers {
    proxmox = {
      source  = "Telmate/proxmox"
      version = "3.0.1-rc4"
    }
  }
}

provider "proxmox" {
  pm_api_url          = "${var.PM_BASE_URI}/api2/json"
  pm_api_token_id     = var.PM_API_TOKEN_ID
  pm_api_token_secret = var.PM_API_TOKEN_SECRET
}

vms.tf

resource "proxmox_vm_qemu" "resource-name" {
  count       = 1
  name        = "vm-name"
  target_node = "pve-node-name"
  agent       = 1
  memory      = 6144 // 6 GB
  tags        = "iac,terraform"
  ipconfig0   = "ip=192.168.1.100/32,gw=192.168.1.1"
  os_type     = "cloud-init"
  clone       = "ubuntu-noble-template"
  sockets     = 1
  cores       = 1
  scsihw      = "virtio-scsi-pci"
  onboot      = true
  ciuser      = "serveradmin"
  sshkeys     = var.ssh_keys
  serial {
    id = 0
    type = "socket"
  }
  network {
    bridge    = "vmbr0"
    firewall  = false
    link_down = false
    model     = "virtio"
  }
  disks {
    scsi {
      scsi1 {
        cloudinit {
          storage = "local-lvm"
        }
      }
    }
    virtio {
      virtio0 {
        disk {
          storage = "local-lvm"
          size    = "30G"
        }
      }
    }
  }
}
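For completeness, applying the above follows the usual workflow. The TF_VAR_ names below match the variables referenced in the provider block; the values are placeholders, and how you store the secret is up to you:

export TF_VAR_PM_BASE_URI='https://pve.example:8006'
export TF_VAR_PM_API_TOKEN_ID='user@pve!terraform'
export TF_VAR_PM_API_TOKEN_SECRET='xxxx'
terraform init
terraform plan -out tfplan
terraform apply tfplan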
