error creating LXC container - can't lock file - got timeout #1069

Open
kennyparsons opened this issue Aug 5, 2024 · 7 comments

@kennyparsons

Full error here:

"│ Error: error creating LXC container: 500 unable to create CT 1000 - 
can't lock file '/run/lock/lxc/pve-config-1000.lock' - 
got timeout, error status: {"data":null} (params: {"arch":"amd64",
"cmode":"tty",
"console":true,
"cores":1,
"cpulimit":0,
"cpuunits":1024,
"features":"nesting=1",
"hostname":"test-lxc-0",
"memory":512,
"net0":"bridge=vmbr0,name=eth0,ip=dhcp",
"onboot":true,
"ostemplate":"storage:vztmpl/name-of-ct-template.tar.zst",
"password":"redacted",
"pool":"",
"protection":false,
"rootfs":"local-lvm:8",
"ssh-public-keys":"redacted\n",
"start":true,
"storage":"local","swap":0,"tags":"",
"tty":2,
"unprivileged":false,
"vmid":1000})
│ 
│   with module.control_vms.proxmox_lxc.container[0],
│   on modules/proxmox-lxc/main.tf line 8, in resource "proxmox_lxc" "container":
│    8: resource "proxmox_lxc" "container" {
│"

According to the error, the lock file can't be acquired. However, the container spins up just fine and is available if I access it via the console. Note that the error above happens after the CT is created:
module.worker_vms.proxmox_lxc.container[0]: Creation complete after 24s [id=hv01/lxc/1000] (the module name says VM, but it's actually a CT)

@kennyparsons
Author

If I try setting parallelism to 1 with terraform apply, I get:

module.ssh_keys.tls_private_key.ssh_key: Creating...
module.ssh_keys.tls_private_key.ssh_key: Creation complete after 1s [id=649a139e76f1d2d8d861d0c87c97a58a80ad57d9]
module.ssh_keys.local_file.private_key: Creating...
module.ssh_keys.local_file.private_key: Creation complete after 0s [id=2dbcc6165f939f6bace2fc2615504e0e4de638cb]
module.ssh_keys.local_file.public_key: Creating...
module.ssh_keys.local_file.public_key: Creation complete after 0s [id=2f834e45a2c138975e6f499266fb7295f33c2e64]
module.worker_vms.proxmox_lxc.container[0]: Creating...
module.worker_vms.proxmox_lxc.container[0]: Creation complete after 6s [id=hv01/lxc/1000]
module.control_vms.proxmox_lxc.container[0]: Creating...
module.control_vms.proxmox_lxc.container[1]: Creating...
module.control_vms.proxmox_lxc.container[1]: Creation complete after 6s [id=hv01/lxc/1001]
╷
│ Error: error creating LXC container: 500 CT 1000 already exists on node

It says the CT already exists, but only because this same apply created it. It does not exist before terraform apply runs.

@rrottmann

Same here. It happens with both count and for_each loops. If you specify vmid = 0, excess containers get spawned. When you split the loop into multiple .tf files, one per VM, it works even with parallel execution.
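
For reference, the pattern being described (several containers created from a single resource block with vmid left unset) looks roughly like the sketch below, reconstructed from the parameters in the original error report; the count value is arbitrary:

# Sketch only: both instances are created in parallel and both ask Proxmox
# for the next free VMID, which is where the lock/duplicate-ID race starts.
resource "proxmox_lxc" "container" {
  count = 2

  target_node = "hv01"
  hostname    = "test-lxc-${count.index}"
  ostemplate  = "storage:vztmpl/name-of-ct-template.tar.zst"
  # vmid intentionally left at its default (0 = pick the next free ID)

  rootfs {
    storage = "local-lvm"
    size    = "8G"
  }

  network {
    name   = "eth0"
    bridge = "vmbr0"
    ip     = "dhcp"
  }
}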

@azertypwd

azertypwd commented Sep 3, 2024

Same error here: when I create 3 VMs through a Terraform module, the first VM is correctly created with VMID 118, but the others fail with this error:

[screenshots of the error from the original comment]

@Koleon

Koleon commented Sep 24, 2024

I'm getting the same error using:

OpenTofu v1.8.2
on darwin_arm64
+ provider registry.opentofu.org/telmate/proxmox v3.0.1-rc4

test.tf file:

resource "proxmox_lxc" "test1" {

  target_node  = "pve"
  hostname     = "test1.home.arpa"
  ostemplate   = "isos-ct-templates:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst"
  password     = var.password
  cores        = "4"
  memory       = "4096"
  unprivileged = true
  tags         = "servers"

  start        = true
  onboot       = true

  rootfs {
    storage = "vm-ct-drives"
    size    = "8G"
  }

  network {
    name   = "eth0"
    bridge = "vmbr0"
    ip     = "192.168.60.3/24"
    gw     = "192.168.60.1"
    tag    = "60"
    firewall = true
  }

  nameserver = "192.168.60.1"
}

resource "proxmox_lxc" "test2" {

  target_node  = "pve"
  hostname     = "test2.home.arpa"
  ostemplate   = "isos-ct-templates:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst"
  password     = var.password
  cores        = "4"
  memory       = "4096"
  unprivileged = true
  tags         = "servers"

  start        = true
  onboot       = true

  rootfs {
    storage = "vm-ct-drives"
    size    = "8G"
  }

  network {
    name   = "eth0"
    bridge = "vmbr0"
    ip     = "192.168.60.4/24"
    gw     = "192.168.60.1"
    tag    = "60"
    firewall = true
  }

  nameserver = "192.168.60.1"
}

@erdincka

This could be the result of requests being sent in too quick succession. Terraform sends an API request to get the next free VMID for create/clone operations. Since these requests are made in parallel (or one immediately after the other), more than one VM/CT creation can receive the same VMID, and the second one then fails because that VMID is already in use by the time it is created/started.
The resolution for me was to add a random delay (<1 s) between the API calls. Since I don't know how to do this with Terraform, and I am only creating a few resources, I just re-run apply until they all succeed. Setting -parallelism=1 on the command line might also work as a workaround.
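
For anyone who wants to stay inside Terraform, a minimal sketch of that idea, assuming the hashicorp/time provider is acceptable: make the second container wait behind the first via a time_sleep resource, so the two creations don't ask the API for the next free VMID at the same moment (the resource name and the 10s delay are placeholders):

# Sketch only: serialize two otherwise-parallel proxmox_lxc resources.
# The time provider and the 10s value are assumptions, not tested here.
resource "time_sleep" "stagger_ct_creation" {
  depends_on      = [proxmox_lxc.test1]   # test1 from the earlier test.tf
  create_duration = "10s"
}

resource "proxmox_lxc" "test2" {
  # identical to the test2 resource in the earlier test.tf,
  # plus this explicit dependency:
  depends_on = [time_sleep.stagger_ct_creation]

  target_node = "pve"
  hostname    = "test2.home.arpa"
  ostemplate  = "isos-ct-templates:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst"

  rootfs {
    storage = "vm-ct-drives"
    size    = "8G"
  }
}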

@jacobmarcil

I got it working by configuring the vmid parameter explicitly instead of leaving it at the default value of 0, which selects an incremental ID for you.
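
Applied to the test.tf above, that would look roughly like this sketch (the concrete ID 1100 is a placeholder; pick whatever is free on your node):

# Sketch: pin the VMID instead of relying on the default vmid = 0
# (auto-assignment), so no "next free VMID" lookup race can occur.
resource "proxmox_lxc" "test1" {
  vmid        = 1100   # placeholder, choose a free ID
  target_node = "pve"
  hostname    = "test1.home.arpa"
  ostemplate  = "isos-ct-templates:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst"

  rootfs {
    storage = "vm-ct-drives"
    size    = "8G"
  }
  # ... remaining arguments as in the earlier test.tf ...
}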

@rrottmann

I got it working by configuring the vmid parameter explicitly instead of leaving it at the default value of 0, which selects an incremental ID for you.

This makes things a bit tricky when using for_each. If you're trying to create a module that hides all the little details, like naming conventions and general VM options that stay the same for a bunch of instances, it can get pretty inflexible.

Example:

module "lxc_instances" {
  source = "./my_proxmox_lxc_module"

  instances = [
  {
    name     = "vm1"
    workload = "docker"
    vmid     = 100
  },
  {
    name     = "vm2"
    workload = "storage"
    vmid     = 101
  }
 ]
}

When different projects use the same module, you have to keep track of all currently assigned VMIDs to avoid conflicts, and over time, as you decommission some of the VMs, this can get messy.
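
One way to reduce that bookkeeping, sketched below under the assumption that each project is given its own non-overlapping VMID range: pass a per-project vmid_base into the module and derive the IDs from the list index, so only the base has to be coordinated (all variable and resource names here are hypothetical):

# Inside the module (sketch). Only vmid_base must be unique per project;
# individual IDs are derived from each instance's position in the list.
variable "vmid_base" {
  type = number
}

variable "instances" {
  type = list(object({
    name     = string
    workload = string
  }))
}

locals {
  instances_by_name = {
    for idx, inst in var.instances :
    inst.name => merge(inst, { vmid = var.vmid_base + idx })
  }
}

resource "proxmox_lxc" "this" {
  for_each = local.instances_by_name

  vmid     = each.value.vmid
  hostname = each.value.name
  # ... the rest of the container configuration ...
}

The obvious drawback is that removing an instance from the middle of the list shifts the derived VMIDs of everything after it, so this only stays clean if instances are appended rather than removed.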
