
Sometimes a VM fails to start due to an out-of-memory error, even though qmemman (supposedly) freed enough #9431

Closed
marmarek opened this issue Aug 24, 2024 · 2 comments · Fixed by QubesOS/qubes-core-admin#629
Labels
affects-4.3 This issue affects Qubes OS 4.3. C: core diagnosed Technical diagnosis has been performed (see issue comments). P: major Priority: major. Between "default" and "critical" in severity. pr submitted A pull request has been submitted for this issue. r4.3-host-cur-test T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

Comments

@marmarek
Member


Qubes OS release

R4.3

Brief summary

Sometimes a VM fails to start with an "internal error: libxenlight failed to create new domain" message. The libxl log shows it's an out-of-memory condition.

Steps to reproduce

Not sure exactly. It happens from time to time during integration tests, I think more often when starting two VMs at once.

Expected behavior

The VM starts normally.

Actual behavior

The VM fails to start.
libxl-driver.log contains:

2024-08-23 20:39:24.619+0000: xc: panic: xg_dom_boot.c:119: xc_dom_boot_mem_init: can't allocate low memory for domain: Out of memory
2024-08-23 20:39:24.619+0000: libxl: libxl_dom.c:581:libxl__build_dom: xc_dom_boot_mem_init failed: Device or resource busy
2024-08-23 20:39:24.638+0000: libxl: libxl_create.c:1753:domcreate_rebuild_done: Domain 58:cannot (re-)build domain: -3

I suspect the calculation of how much free memory is needed to start a VM needs an update.

This started happening after the update to Xen 4.19 (from Xen 4.17).

@marmarek marmarek added T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. C: core P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. affects-4.3 This issue affects Qubes OS 4.3. P: major Priority: major. Between "default" and "critical" in severity. and removed P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. labels Aug 24, 2024
@andrewdavidwong andrewdavidwong added the needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. label Aug 25, 2024
@marmarek
Member Author

With some extra logging I've collected the following:

  1. qubesd requested 422 MB from qmemman
  2. Before starting the VM there is 526 MB free (according to xl info free_memory)
  3. After starting the VM (still paused) - 92 MB free

So it used 434 MB, 12 MB more than calculated. I need to collect more data for the new formula...
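
For anyone who wants to repeat this measurement, here is a minimal sketch. It assumes `xl info free_memory` prints just the free MiB value, and the domain config path is hypothetical:

```python
import subprocess

def xl_free_memory_mib() -> int:
    # `xl info free_memory` prints only that field's value (in MiB)
    out = subprocess.check_output(["xl", "info", "free_memory"], text=True)
    return int(out.strip())

before = xl_free_memory_mib()
# `xl create -p` leaves the new domain paused, matching the measurement above;
# the config path is a placeholder
subprocess.check_call(["xl", "create", "-p", "/etc/xen/test-vm.cfg"])
after = xl_free_memory_mib()
print(f"domain build consumed {before - after} MiB")
```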

marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Oct 20, 2024
Experiments show that using memory hotplug or populate-on-demand makes
no difference in required memory at startup. But PV vs PVH/HVM does make
a difference - PV doesn't need extra per-MB overhead at all.

On top of that, experimentally find the correct factor. Do it by
starting a VM (paused) with different parameters and comparing `xl info
free_memory` before and after.

memory / maxmem: difference
400  / 4000: 434
600  / 4000: 634
400  / 400: 405
400  / 600: 407
400  / 2000: 418
2000 / 2000: 2018
600  / 600: 607

All of the above are with 2 vcpus. Testing with other vcpu counts shows
that 1.5 MB per vcpu is quite accurate.
As seen above, the initial memory doesn't affect the overhead; it's the
maxmem that counts. Applying linear regression to it shows about 8 kB
per MB of maxmem, so round it up to 8192 bytes.
The base overhead of 4 MB doesn't match exactly, but since the calculated
number is smaller, leave it at 4 MB as a safety margin.

Fixes QubesOS/qubes-issues#9431
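
A minimal Python sketch of the overhead estimate this commit message describes; the constant and function names are hypothetical, not the actual qubes-core-admin identifiers:

```python
# Hypothetical names; values taken from the measurements described above.
MEM_OVERHEAD_BASE = 4 * 1024 * 1024        # 4 MB base, kept as a safety margin
MEM_OVERHEAD_PER_VCPU = 1536 * 1024        # 1.5 MB per vcpu
MEM_OVERHEAD_PER_MAXMEM_MB = 8192          # ~8 kB per MB of maxmem (PVH/HVM only)

def required_memory_bytes(memory_mb: int, maxmem_mb: int, vcpus: int, virt_mode: str) -> int:
    """Estimate host memory needed to build a domain, per the experiments above."""
    overhead = MEM_OVERHEAD_BASE + vcpus * MEM_OVERHEAD_PER_VCPU
    if virt_mode != "pv":                  # PV showed no per-MB-of-maxmem overhead
        overhead += maxmem_mb * MEM_OVERHEAD_PER_MAXMEM_MB
    return memory_mb * 1024 * 1024 + overhead

# 400 MB memory / 4000 MB maxmem / 2 vcpus (PVH) -> ~438 MB,
# slightly above the 434 MB observed above (the base acts as a safety margin)
print(required_memory_bytes(400, 4000, 2, "pvh") / (1024 * 1024))
```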
@marmarek
Member Author

While the formula may be inaccurate, it looks like the issue is somewhere else. The log from one of the failed runs contains:

DEBUG:vm.test-inst-vm2:mem required with overhead: 459538432.0
DEBUG:vm.test-inst-vm2:free mem before start: 119

So it requested 438 MB from qmemman, but then only 119 MB was free before the start - way too little for starting a 400 MB VM.
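
A quick arithmetic check of the figures in that log (values copied from the lines above):

```python
required_bytes = 459538432.0   # "mem required with overhead" from the log above
free_mb = 119                  # "free mem before start" from the log above

required_mb = required_bytes / (1024 * 1024)   # ~438 MB
print(f"need ~{required_mb:.0f} MB, have {free_mb} MB, "
      f"short by ~{required_mb - free_mb:.0f} MB")
```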

marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Oct 22, 2024
... for the next watcher loop iteration.

If two VMs are started in parallel, there may be no watcher loop
iteration between handling their requests. This means the memory request
for the second VM will operate on an outdated list of VMs and may not
account for some allocations (assuming memory is free while it is in fact
already allocated to another VM). If that happens, the second VM may
fail to start with an out-of-memory error.

This is a problem very similar to the one described in QubesOS/qubes-issues#1389,
but it affects the actual VM startup, not its auxiliary processes.

Fixes QubesOS/qubes-issues#9431
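
A self-contained sketch of the race and the fix this commit message describes; all names are hypothetical and do not match qmemman's real API:

```python
import threading

class MemoryBalancerSketch:
    def __init__(self, list_domains, total_mib=16384):
        # list_domains: callable returning {domid: allocated MiB}, e.g. backed by Xen
        self._list_domains = list_domains
        self._total_mib = total_mib
        self._lock = threading.Lock()
        self._domains = {}            # cache normally refreshed by the watcher loop

    def watcher_iteration(self):
        with self._lock:
            self._domains = dict(self._list_domains())

    def handle_mem_request(self, amount_mib):
        with self._lock:
            # Refresh here instead of trusting the last watcher iteration: a VM
            # started a moment ago may be missing from the cache, so its memory
            # would otherwise still be counted as free.
            self._domains = dict(self._list_domains())
            free = self._total_mib - sum(self._domains.values())
            return free >= amount_mib
```

With the refresh inside handle_mem_request(), a second request issued right after another VM's allocation no longer sees stale accounting.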
marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Oct 24, 2024
Any memory adjustment must be done while holding a lock, so as not to
interfere with client request handling. This is critical to prevent
memory just freed for a new VM from being re-allocated elsewhere.
The domain_list_changed() function failed to do that - the do_balance
call was made after releasing the lock.

It wasn't a problem for a long time because of Python's global interpreter
lock. But Python 3.13 is finally starting to support proper parallel
thread execution, and it revealed this bug.

Fixes QubesOS/qubes-issues#9431
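
A condensed sketch of the fix this commit message describes (hypothetical names, not qmemman's actual code): the balancing step stays inside the same critical section as the domain-list update.

```python
import threading

class QmemmanSketch:
    def __init__(self):
        self._lock = threading.Lock()
        self._domains = {}

    def _refresh_domain_list(self):
        pass  # placeholder for re-reading the domain list from Xen

    def _do_balance(self):
        pass  # placeholder for redistributing memory between domains

    def domain_list_changed(self):
        # Keep do_balance inside the lock: releasing the lock before balancing
        # allowed memory just freed for a new VM (by a concurrent request
        # handler) to be handed back out to other domains.
        with self._lock:
            self._refresh_domain_list()
            self._do_balance()
```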
@andrewdavidwong andrewdavidwong added diagnosed Technical diagnosis has been performed (see issue comments). pr submitted A pull request has been submitted for this issue. and removed needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels Oct 30, 2024