
Sometimes a VM fails to start due to an out-of-memory error, even though qmemman (supposedly) freed enough #9431

Closed
marmarek opened this issue Aug 24, 2024 · 2 comments · Fixed by QubesOS/qubes-core-admin#629
Labels
affects-4.3 This issue affects Qubes OS 4.3. C: core diagnosed Technical diagnosis has been performed (see issue comments). P: major Priority: major. Between "default" and "critical" in severity. pr submitted A pull request has been submitted for this issue. r4.3-host-cur-test T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

Comments

@marmarek
Member


Qubes OS release

R4.3

Brief summary

Sometimes a VM fails to start with an "internal error: libxenlight failed to create new domain" message. The libxl log shows it's an out-of-memory condition.

Steps to reproduce

Not sure exactly. It happens from time to time during integration tests, I think more often when starting two VMs at once.

Expected behavior

The VM starts normally.

Actual behavior

The VM fails to start.
libxl-driver.log contains:

2024-08-23 20:39:24.619+0000: xc: panic: xg_dom_boot.c:119: xc_dom_boot_mem_init: can't allocate low memory for domain: Out of memory
2024-08-23 20:39:24.619+0000: libxl: libxl_dom.c:581:libxl__build_dom: xc_dom_boot_mem_init failed: Device or resource busy
2024-08-23 20:39:24.638+0000: libxl: libxl_create.c:1753:domcreate_rebuild_done: Domain 58:cannot (re-)build domain: -3

I suspect the calculation of how much free memory is needed to start a VM needs an update.

This started happening after the update to Xen 4.19 (from Xen 4.17).

@marmarek marmarek added T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. C: core P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. affects-4.3 This issue affects Qubes OS 4.3. P: major Priority: major. Between "default" and "critical" in severity. and removed P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. labels Aug 24, 2024
@andrewdavidwong andrewdavidwong added the needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. label Aug 25, 2024
@marmarek
Member Author

With some extra logging I've collected the following:

  1. qubesd requested 422 MB from qmemman
  2. Before starting the VM there is 526 MB free (according to xl info free_memory)
  3. After starting the VM (still paused) - 92 MB free

So it used 434 MB, 12 MB more than calculated. I need to collect more data for the new formula...
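
For anyone who wants to repeat this measurement, here is a minimal sketch. It assumes `xl info free_memory` prints just the free MiB value, and the domain config path is hypothetical:

```python
import subprocess

def xl_free_memory_mib() -> int:
    # `xl info free_memory` prints only that field's value (in MiB)
    out = subprocess.check_output(["xl", "info", "free_memory"], text=True)
    return int(out.strip())

before = xl_free_memory_mib()
# `xl create -p` leaves the new domain paused, matching the measurement above;
# the config path is a placeholder
subprocess.check_call(["xl", "create", "-p", "/etc/xen/test-vm.cfg"])
after = xl_free_memory_mib()
print(f"domain build consumed {before - after} MiB")
```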

marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Oct 20, 2024
Experiments show that using memory hotplug or populate-on-demand makes
no difference in required memory at startup. But PV vs PVH/HVM does make
a difference - PV doesn't need extra per-MB overhead at all.

On top of that, experimentally find the correct factor. Do it by
starting a VM (paused) with different parameters and comparing `xl info
free_memory` before and after.

memory / maxmem: difference
400  / 4000: 434
600  / 4000: 634
400  / 400: 405
400  / 600: 407
400  / 2000: 418
2000 / 2000: 2018
600  / 600: 607

All of the above are with 2 vcpus. Testing with other vcpu counts shows
that 1.5 MB per vcpu is quite accurate.
As seen above, the initial memory doesn't affect the overhead; it's the
maxmem that counts. Applying linear regression to it shows about 8 kB
per MB of maxmem, so round it up to 8192 bytes.
The base overhead of 4 MB doesn't match exactly, but since the calculated
number is smaller, leave it at 4 MB as a safety margin.

Fixes QubesOS/qubes-issues#9431
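
A minimal Python sketch of the overhead estimate this commit message describes; the constant and function names are hypothetical, not the actual qubes-core-admin identifiers:

```python
# Hypothetical names; values taken from the measurements described above.
MEM_OVERHEAD_BASE = 4 * 1024 * 1024        # 4 MB base, kept as a safety margin
MEM_OVERHEAD_PER_VCPU = 1536 * 1024        # 1.5 MB per vcpu
MEM_OVERHEAD_PER_MAXMEM_MB = 8192          # ~8 kB per MB of maxmem (PVH/HVM only)

def required_memory_bytes(memory_mb: int, maxmem_mb: int, vcpus: int, virt_mode: str) -> int:
    """Estimate host memory needed to build a domain, per the experiments above."""
    overhead = MEM_OVERHEAD_BASE + vcpus * MEM_OVERHEAD_PER_VCPU
    if virt_mode != "pv":                  # PV showed no per-MB-of-maxmem overhead
        overhead += maxmem_mb * MEM_OVERHEAD_PER_MAXMEM_MB
    return memory_mb * 1024 * 1024 + overhead

# 400 MB memory / 4000 MB maxmem / 2 vcpus (PVH) -> ~438 MB,
# slightly above the 434 MB observed above (the base acts as a safety margin)
print(required_memory_bytes(400, 4000, 2, "pvh") / (1024 * 1024))
```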
@marmarek
Member Author

While the formula may be inaccurate, it looks like the issue is somewhere else. The log from one of the failed runs contains:

DEBUG:vm.test-inst-vm2:mem required with overhead: 459538432.0
DEBUG:vm.test-inst-vm2:free mem before start: 119

So it requested 438 MB from qmemman, but then only 119 MB was free before the start - way too little for starting a 400 MB VM.
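
A quick arithmetic check of the figures in that log (values copied from the lines above):

```python
required_bytes = 459538432.0   # "mem required with overhead" from the log above
free_mb = 119                  # "free mem before start" from the log above

required_mb = required_bytes / (1024 * 1024)   # ~438 MB
print(f"need ~{required_mb:.0f} MB, have {free_mb} MB, "
      f"short by ~{required_mb - free_mb:.0f} MB")
```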

marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Oct 22, 2024
... for the next watcher loop iteration.

If two VMs are started in parallel, there may be no watcher loop
iteration between handling their requests. This means the memory request
for the second VM will operate on an outdated list of VMs and may not
account for some allocations (assuming memory is free while it is in fact
already allocated to another VM). If that happens, the second VM may
fail to start with an out-of-memory error.

This is a problem very similar to the one described in QubesOS/qubes-issues#1389,
but it affects the actual VM startup, not its auxiliary processes.

Fixes QubesOS/qubes-issues#9431
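
A self-contained sketch of the race and the fix this commit message describes; all names are hypothetical and do not match qmemman's real API:

```python
import threading

class MemoryBalancerSketch:
    def __init__(self, list_domains, total_mib=16384):
        # list_domains: callable returning {domid: allocated MiB}, e.g. backed by Xen
        self._list_domains = list_domains
        self._total_mib = total_mib
        self._lock = threading.Lock()
        self._domains = {}            # cache normally refreshed by the watcher loop

    def watcher_iteration(self):
        with self._lock:
            self._domains = dict(self._list_domains())

    def handle_mem_request(self, amount_mib):
        with self._lock:
            # Refresh here instead of trusting the last watcher iteration: a VM
            # started a moment ago may be missing from the cache, so its memory
            # would otherwise still be counted as free.
            self._domains = dict(self._list_domains())
            free = self._total_mib - sum(self._domains.values())
            return free >= amount_mib
```

With the refresh inside handle_mem_request(), a second request issued right after another VM's allocation no longer sees stale accounting.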
marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Oct 24, 2024
Any memory adjustment must be done while holding a lock, so as not to
interfere with client request handling. This is critical to prevent
memory just freed for a new VM from being re-allocated elsewhere.
The domain_list_changed() function failed to do that - the do_balance
call was made after releasing the lock.

It wasn't a problem for a long time because of Python's global interpreter
lock. But Python 3.13 is finally starting to support proper parallel
thread execution, and it revealed this bug.

Fixes QubesOS/qubes-issues#9431
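
A condensed sketch of the fix this commit message describes (hypothetical names, not qmemman's actual code): the balancing step stays inside the same critical section as the domain-list update.

```python
import threading

class QmemmanSketch:
    def __init__(self):
        self._lock = threading.Lock()
        self._domains = {}

    def _refresh_domain_list(self):
        pass  # placeholder for re-reading the domain list from Xen

    def _do_balance(self):
        pass  # placeholder for redistributing memory between domains

    def domain_list_changed(self):
        # Keep do_balance inside the lock: releasing the lock before balancing
        # allowed memory just freed for a new VM (by a concurrent request
        # handler) to be handed back out to other domains.
        with self._lock:
            self._refresh_domain_list()
            self._do_balance()
```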
@andrewdavidwong andrewdavidwong added diagnosed Technical diagnosis has been performed (see issue comments). pr submitted A pull request has been submitted for this issue. and removed needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels Oct 30, 2024