v3.0.1-rc3: IP not found and agent_timeout not being used #1106

Open
arty-hlr opened this issue Sep 17, 2024 · 13 comments

Comments

@arty-hlr

Contrary to the PR #1016 and to what the documentation https://registry.terraform.io/providers/telmate/proxmox/3.0.1-rc3/docs/resources/vm_qemu says, agent_timeout is not being used, which results in VMs being created and then the following error:
image

I do use agent_timeout in the module:
image

but it's not being shown in the plan output:
image

The only difference from the previous templates I tried to clone is that SSH is now installed. I am not sure why it cannot find the IP address, since the GUI shows it. I would expect agent_timeout to help, as the error suggests, but unfortunately it does not do anything. It also sometimes works on one VM or another, seemingly at random, so running terraform apply multiple times helps somewhat.

Here is the full output with TF_LOG=DEBUG:
image

To reproduce:

  • create a windows template with SSH server installed and clone it (linked clone) with terraform
  • create a VM from a template with terraform and use agent_timeout
@GMZwinge

GMZwinge commented Oct 2, 2024

Test without Cloud-init (os_type = "centos"):

  • 3.0.1-rc3 provides an IP address:
2024-10-02T10:56:03.512-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:03 [DEBUG] VM is running, checking the IP: timestamp=2024-10-02T10:56:03.512-0400
2024-10-02T10:56:03.512-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:03 [INFO][initConnInfo] trying to get vm ip address for provisioner: timestamp=2024-10-02T10:56:03.512-0400
2024-10-02T10:56:03.512-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:03 [DEBUG][initConnInfo] retrying for at most  20m0s minutes before giving up: timestamp=2024-10-02T10:56:03.512-0400
2024-10-02T10:56:03.512-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:03 [DEBUG][initConnInfo] retries will end at 2024-10-02 11:16:03.5122731 -0400 EDT m=+1224.259660501: timestamp=2024-10-02T10:56:03.512-0400
2024-10-02T10:56:06.525-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:06 [INFO][getPrimaryIP] check ip result error 500 QEMU guest agent is not running: timestamp=2024-10-02T10:56:06.524-0400
2024-10-02T10:56:14.566-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:14 [INFO][getPrimaryIP] check ip result error 500 QEMU guest agent is not running: timestamp=2024-10-02T10:56:14.565-0400
2024-10-02T10:56:19.712-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:19 [INFO][getPrimaryIP] QEMU Agent interfaces found: [{00:00:00:00:00:00 [127.0.0.1 ::1] lo <nil>} {<MacAddress> [<Ipv6Addresses>] ens18 <nil>}]: timestamp=2024-10-02T10:56:19.712-0400
2024-10-02T10:56:24.922-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:24 [INFO][getPrimaryIP] QEMU Agent interfaces found: [{00:00:00:00:00:00 [127.0.0.1 ::1] lo <nil>} {<MacAddress> [<Ipv4Address> <Ipv6Addresses>] ens18 <nil>}]: timestamp=2024-10-02T10:56:24.922-0400
2024-10-02T10:56:24.922-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc3.exe: 2024/10/02 10:56:24 [DEBUG][initConnInfo] this is the vm configuration: <Ipv4Address> 22: timestamp=2024-10-02T10:56:24.922-0400
  • 3.0.1-rc4 doesn't provide an IP address:
2024-10-02T10:53:14.311-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 10:53:14 [DEBUG] VM is running, checking the IP: timestamp=2024-10-02T10:53:14.311-0400
2024-10-02T10:53:14.311-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 10:53:14 [INFO][initConnInfo] trying to get vm ip address for provisioner: timestamp=2024-10-02T10:53:14.311-0400
2024-10-02T10:53:14.311-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 10:53:14 [DEBUG][initConnInfo] retrying for at most  20m0s minutes before giving up: timestamp=2024-10-02T10:53:14.311-0400
2024-10-02T10:53:14.311-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 10:53:14 [DEBUG][initConnInfo] retries will end at 2024-10-02 11:13:14.3111935 -0400 EDT m=+1221.822484201: timestamp=2024-10-02T10:53:14.311-0400
2024-10-02T10:53:14.311-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 10:53:14 [INFO][getPrimaryIP] vm has a cloud-init configuration: timestamp=2024-10-02T10:53:14.311-0400
2024-10-02T10:53:14.311-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 10:53:14 [DEBUG][initConnInfo] this is the vm configuration:  22: timestamp=2024-10-02T10:53:14.311-0400

Also, 3.0.1-rc4 displays these warnings: "Warning: Cloud-init is enabled but no IP config is set" and "Cloud-init is enabled in your configuration but no static IP address is set, nor is the DHCP option enabled." I was able to get rid of those warnings with define_connection_info = false, but it still doesn't provide an IP address:

2024-10-02T11:09:14.978-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 11:09:14 [DEBUG] VM is running, checking the IP: timestamp=2024-10-02T11:09:14.978-0400
2024-10-02T11:09:14.978-0400 [INFO]  provider.terraform-provider-proxmox_v3.0.1-rc4.exe: 2024/10/02 11:09:14 [INFO][initConnInfo] define_connection_info is false, no further action: timestamp=2024-10-02T11:09:14.978-0400

Not sure if that's the correct way to NOT use Cloud-init though.

@Tinyblargon
Collaborator

Found the commit that most likely broke it.
Ironically it separated the logic so it would be easier to test.

18095e5#diff-104365919f693375882979581d9f36f5266991eef361948e56c220931fe886ddL1916

@Tinyblargon
Collaborator

@GMZwinge could you check if #1120 fixes your issue?

@Tinyblargon
Collaborator

@arty-hlr

agent_timeout is being used, it could be that endTime on L1970 is shorter, but by default this is 20 minutes.

for time.Now().Before(endTime) {
    var interfaces []pxapi.AgentNetworkInterface
    interfaces, err = vmr.GetAgentInformation(client, false)
    if err != nil {
        if !strings.Contains(err.Error(), ErrorGuestAgentNotRunning) {
            return primaryIPs{}, diag.FromErr(err)
        }
        log.Printf("[INFO][getPrimaryIP] check ip result error %s", err.Error())
        logger.Debug().Int("vmid", vmr.VmId()).Msgf("check ip result error %s", err.Error())
    } else { // vm is running and reachable
        if len(interfaces) > 0 { // agent returned some information
            log.Printf("[INFO][getPrimaryIP] QEMU Agent interfaces found: %v", interfaces)
            logger.Debug().Int("vmid", vmr.VmId()).Msgf("QEMU Agent interfaces found: %v", interfaces)
            conn = conn.parsePrimaryIPs(interfaces, primaryMacAddress)
            if conn.hasRequiredIP() {
                return conn.IPs, diag.Diagnostics{}
            }
        }
        if waitedTime > agentTimeout {
            break
        }
        waitedTime += additionalWait
    }
    time.Sleep(time.Duration(additionalWait) * time.Second)
}

However, the log you showed does show the information in its raw form.
I think we are parsing the MAC address wrong and therefore we can't match it with an interface returned by the guest-agent.

if _, ok := vmConfig["net"+strconv.Itoa(i)]; ok {
    primaryMacAddress = macAddressRegex.FindString(vmConfig["net"+strconv.Itoa(i)].(string))
    break
}

Upstream I'm working on re-implementing the network interfaces to get rid of this regex parsing of the MAC address on L1966.
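
To illustrate the kind of mismatch suspected here, the following is a rough sketch, not the provider's implementation. The regex, the agentInterface type, and the sample net0 string are all assumptions made for the example. It extracts the MAC address from a network config string and matches it against guest-agent interfaces case-insensitively, since a case-sensitive comparison is one way such a match could silently fail.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// macAddressRegex matches a colon-separated MAC address.
// The exact pattern used by the provider is not shown in this thread;
// this one is an assumption for illustration.
var macAddressRegex = regexp.MustCompile(`(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}`)

// agentInterface is a hypothetical stand-in for pxapi.AgentNetworkInterface.
type agentInterface struct {
	Name string
	MAC  string
	IPs  []string
}

// extractMAC returns the first MAC address found in a net config string,
// or an empty string if there is none.
func extractMAC(netConfig string) string {
	return macAddressRegex.FindString(netConfig)
}

// findByMAC returns the interface whose MAC equals mac, ignoring case.
func findByMAC(ifaces []agentInterface, mac string) (agentInterface, bool) {
	if mac == "" {
		return agentInterface{}, false
	}
	for _, iface := range ifaces {
		if strings.EqualFold(iface.MAC, mac) {
			return iface, true
		}
	}
	return agentInterface{}, false
}

func main() {
	// Assumed shape of a net0 value; the real string may differ.
	mac := extractMAC("virtio=BC:24:11:AA:BB:CC,bridge=vmbr0")
	ifaces := []agentInterface{
		{Name: "lo", MAC: "00:00:00:00:00:00", IPs: []string{"127.0.0.1", "::1"}},
		{Name: "ens18", MAC: "bc:24:11:aa:bb:cc", IPs: []string{"192.0.2.10"}},
	}
	iface, ok := findByMAC(ifaces, mac)
	fmt.Println(mac, iface.Name, ok)
}
```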

@Tinyblargon
Collaborator

@arty-hlr can you check if #1120 fixes your issue?

@arty-hlr
Author

@Tinyblargon How do I tell terraform/opentofu to use your branch? I only see source and version in the terraform provider config. Should I compile it, and copy the executable to the plugin directory like here?

@arty-hlr
Author

About agent_timeout: I did try changing its value, but it didn't make any difference, so I assumed it wasn't used at all, as it didn't appear in the logs.

@Tinyblargon
Collaborator

@arty-hlr when you compile the branch, the compiled binary has to be renamed to linux_amd64; then you can put it at .terraform/providers/terraform.local/local/proxmox/1.0.0/linux_amd64 inside your terraform project. developer.md has more information about this.

@arty-hlr
Author

Hi @Tinyblargon, that didn't work. I had to change the .terraformrc to add a filesystem mirror; unfortunately, that's not mentioned anywhere in the docs.

I just ran a test with your branch and unfortunately it still doesn't work. Here's the relevant part of the output:

image

It seems that looping over terraform apply (which worked before) doesn't work anymore either.

image

@Tinyblargon
Collaborator

@arty-hlr

Checking if I have the correct situation:

  • os_type == "cloud-init"
  • a cloud-init disk is configured
  • skip_ipv4 == true
  • skip_ipv6 == false

The error is telling me that ipconfig0 is not configured.

@arty-hlr
Author

@Tinyblargon The same happens with os_type set to cloud-init or with cloud-init not set at all; the screenshots above are from a run with cloud-init not set at all. There is no cloud-init disk configured. The VMs are just cloned from the template and get a dynamic IP address from the DHCP server, so I don't use ipconfig0, as there's no actual cloud-init.

I'm actually not sure why it says Cloud-init is enabled, because it is not in the terraform config and the templates don't have a cloud-init drive.

image

@arty-hlr
Author

The code here https://github.com/Telmate/terraform-provider-proxmox/blob/master/proxmox/resource_vm_qemu.go#L1935 seems to indicate that this shouldn't happen when cloud-init is not set.

Tinyblargon added a commit to Tinyblargon/terraform-provider-proxmox that referenced this issue Oct 31, 2024
@Tinyblargon
Collaborator

@arty-hlr I made some changes and tested locally.
It now works as expected and reliably gets an IP address.
I did 2 tests: one VM with cloud-init, and one VM without cloud-init with pre-configured DHCP.
