Init-local Checks DataSourceNoCloud Before Curtin Was Able to Mount the Filesystem #6001
Comments
Thanks for the bug report. There was a recent upstream change that moved |
we experienced this issue in a Jammy LVM-based Virtual Machine and in Ubuntu Jammy there's no
The most important triggering factor here is LVM; it is the key thing that made this whole scenario happen. I didn't have LVM in my virtual machine, so I tried many times to reproduce the issue with the same seed data and config (mentioned above) and never managed to. But then I got the sosreport from an LVM-based virtual machine where the issue had been detected, and by studying the journal it became clear why this is happening. We then added this modification on the LVM-based virtual machine
Only after this change was the NoCloud seed discovery successful on the next reboot, because the "init-local" stage now waited until all the filesystems on the Logical Volumes were mounted at their final mount points.
|
@bryanfraschetti @zilardcherry I just created a PPA that has the proposed fix for jammy. Please let me know if you can reproduce the issue using it. |
@holmanb thank you so much for the quick response. I asked the customer to perform the test for us; please stay tuned, I will come back with the results shortly. |
@holmanb The customer tested your package and reported back that cloud-init now works fine. Please see the logs below, which I collected from the test environment
|
That's good to hear, thanks! I'll make sure that this is included in the upcoming release. |
@holmanb @TheRealFalcon is this fix planned to be released at milestone cloud-init.25.1 for Ubuntu 22.04 Jammy? |
This will be fixed in the downstream 24.4.1 releases, actually. We're queuing an SRU currently, so after validation and SRU review it will hit -updates. |
Hello @holmanb @TheRealFalcon |
Bug report
In a customer environment where curtin is used to place seed data in /var/lib/cloud/seed/nocloud/, we observed a situation where cloud-init-local executed before the mount point was fully configured. As a result, by the time the user-data and meta-data could be read, cloud-init-local had already determined that DataSourceNoCloud had no data, even though this was the intended datasource.
Because the datasource was not detected, the VM failed to initialize as intended and began to test remote datasources. In this case, this was a negative side effect: the VM is intended to run without external connectivity, but it had to wait for all NICs to attempt and fail at DHCP and then wait on the network (which failed because systemd-networkd-wait-online.service was masked). To summarize, cloud-init-local executed so early that it judged a datasource invalid before that datasource was ready, and this introduced a series of boot delays.
It is worth mentioning that the VMs are on logical volumes. The filesystem on dm-3 (the virtual block device created dynamically by the Device Mapper and used for the "var" Logical Volume) is mounted at the /var mount point much later in the boot process, after cloud-init's "init-local" stage has already tried to find DataSourceNoCloud by looking for /var/lib/cloud/seed/nocloud/user-data and /var/lib/cloud/seed/nocloud/meta-data. At the time of the check, these files could not be found because the filesystem on the "var" Logical Volume (dm-3) was not yet mounted at /var.
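To make the race above easier to spot on an affected machine, here is a minimal diagnostic sketch in Python. It is not cloud-init's own detection code; the function name `seed_status` and its return shape are hypothetical. It only reports whether a given mount point is currently mounted and which of the two seed files named in this report are visible under the seed directory.

```python
import os

# Seed file names taken from the report above.
SEED_FILES = ("user-data", "meta-data")


def seed_status(seed_dir, mount_point):
    """Hypothetical helper: report mount state and visible seed files.

    Returns a dict with:
      "mounted": True if mount_point is currently a mount point
      "present": list of seed file names that exist under seed_dir
    """
    present = [
        name
        for name in SEED_FILES
        if os.path.isfile(os.path.join(seed_dir, name))
    ]
    return {"mounted": os.path.ismount(mount_point), "present": present}


if __name__ == "__main__":
    # Paths from the report; run this on the affected VM.
    print(seed_status("/var/lib/cloud/seed/nocloud", "/var"))
```

If this were run early in boot on the affected VM, one would expect `mounted` to be false and `present` to be empty until the "var" Logical Volume is mounted, which matches the behaviour described above.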
We have found that adding the following to cloud-init-local.service allows init-local to wait for the local filesystem to be ready before it checks for the seed files. We are seeking guidance from upstream on whether this should be patched.
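The idea of waiting until the seed files become readable can also be illustrated outside of systemd. The sketch below uses plain polling with a timeout, which is a different technique from the unit-ordering change described in this report and is not something cloud-init is claimed to do. The function name `wait_for_seed` and its parameters are hypothetical.

```python
import os
import time


def wait_for_seed(seed_dir, timeout=30.0, interval=0.5):
    """Hypothetical helper: poll until both seed files exist.

    Returns True as soon as user-data and meta-data are both present
    under seed_dir, or False once `timeout` seconds have elapsed.
    """
    required = [
        os.path.join(seed_dir, name) for name in ("user-data", "meta-data")
    ]
    deadline = time.monotonic() + timeout
    while True:
        if all(os.path.isfile(path) for path in required):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_seed("/var/lib/cloud/seed/nocloud", timeout=10)` would return False on a system where /var is never mounted within ten seconds. Ordering the service after the mount, as done in this report, avoids the polling loop altogether.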
Steps to reproduce the problem
Relevant fragment from curtin-install.yaml with certain details obscured:
Environment Details
Cloud-init version: 24.4-0ubuntu1~22.04.1
Operating System: Ubuntu 22.04 (Jammy)
Data sources on local filesystem, with limited network features enabled
cloud-init logs
We see that cloud-init starts init-local, begins searching DataSourceNoCloud, and tries to read from /var/lib/cloud/seed/nocloud. Shortly afterwards, the mount of dm-3 at /var succeeds. By then it is too late, and init-local's check of DataSourceNoCloud returns without having found the seed data.
Block device map lvm2/lvmdump/dev_listing