DEBUG: log list of VMs when startup failed #624
Conversation
OpenQA test summary
Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024102523-4.3&flavor=pull-requests
Test run included the following:
New failures, excluding unstable
Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024091704-4.3&flavor=update
Failed tests: 16 failures
Fixed failures
Compared to: https://openqa.qubes-os.org/tests/112766#dependencies
201 fixed
Unstable tests
Force-pushed from 021e19f to 529d85f
Should this be marked as draft? I assume it is not meant for merging.
Ease diagnosing test failures.
Force-pushed from 529d85f to 515c064
Force-pushed from f2a5909 to 6be2854
And also basic Xen info (including free memory)
Experiments show that using memory hotplug or populate-on-demand makes no difference in required memory at startup. But PV vs PVH/HVM does make a difference - PV doesn't need any extra per-MB overhead at all.

On top of that, experimentally find the correct factor. Do it by starting a VM (paused) with different parameters and comparing `xl info free_memory` before and after:

memory / maxmem: difference
400 / 4000: 434
600 / 4000: 634
400 / 400: 405
400 / 600: 407
400 / 2000: 418
2000 / 2000: 2018
600 / 600: 607

All of the above are with 2 vcpus. Testing with other vcpu counts shows the 1.5MB per vcpu is quite accurate. As seen above, the initial memory doesn't affect the overhead; only maxmem counts. Applying linear regression shows about 0.008MB of overhead per MB of maxmem, so round it up to 8192 bytes. The base overhead of 4MB doesn't match exactly, but since the calculated number is smaller, leave it at 4MB as a safety margin.

Fixes QubesOS/qubes-issues#9431
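For illustration, a minimal sketch of the overhead estimate described above. The constants mirror the numbers derived from the experiments; the function name and the PV handling are assumptions for this sketch, not the actual qmemman code:

```python
MIB = 1024 * 1024

def estimated_startup_overhead(maxmem_mib, vcpus, is_pv=False):
    """Rough per-VM memory overhead at startup, in bytes (sketch).

    Constants come from the measurements above:
      - ~4 MiB base overhead (kept as a safety margin),
      - 8192 bytes per MiB of maxmem (linear-regression factor, rounded up),
      - ~1.5 MiB per vcpu.
    PV guests showed no per-MB-of-maxmem overhead at all.
    """
    overhead = 4 * MIB + vcpus * int(1.5 * MIB)
    if not is_pv:
        overhead += maxmem_mib * 8192
    return overhead

# Example: a PVH VM with maxmem=4000 MiB and 2 vcpus
# -> 4 + 3 + 31.25 = 38.25 MiB of estimated overhead
print(estimated_startup_overhead(4000, 2) / MIB)
```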
Force-pushed from 6be2854 to 20d32f7
... for the next watcher loop iteration. If two VMs are started in parallel, there may be no watcher loop iteration between handling their requests. This means the memory request for the second VM will operate on an outdated list of VMs and may not account for some allocations (assuming memory is free while in fact it's already allocated to another VM). If that happens, the second VM may fail to start due to an out-of-memory error. This is a very similar problem to the one described in QubesOS/qubes-issues#1389, but it affects the actual VM startup, not its auxiliary processes. Fixes QubesOS/qubes-issues#9431
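A minimal sketch of the idea, with hypothetical names (this is not the actual qmemman API): refresh the domain list as part of handling the memory request itself, instead of relying on the watcher loop having run in between two requests.

```python
import threading

class MemoryBalancer:
    """Hypothetical sketch, not the actual qmemman implementation."""

    def __init__(self, list_domains, host_total):
        # list_domains: callable returning current per-domain allocations
        self.list_domains = list_domains
        self.host_total = host_total
        self.lock = threading.Lock()
        self.domains = []

    def handle_mem_request(self, requested_bytes):
        with self.lock:
            # Refresh the domain list here rather than waiting for the next
            # watcher loop iteration; otherwise a second VM started in
            # parallel is balanced against an outdated view of memory and
            # may fail with an out-of-memory error.
            self.domains = self.list_domains()
            free = self.host_total - sum(d["allocated"] for d in self.domains)
            return free >= requested_bytes
```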
Force-pushed from 38b04cc to 4e3818e
Any memory adjustments must be done while holding a lock, so as not to interfere with client request handling. This is critical to prevent memory just freed for a new VM from being re-allocated elsewhere. The domain_list_changed() function failed to do that - the do_balance call was made after releasing the lock. It wasn't a problem for a long time because of Python's global interpreter lock, but Python 3.13 is finally starting to support proper parallel thread execution, and that revealed this bug. Fixes QubesOS/qubes-issues#9431
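A minimal sketch of the corrected ordering. The names domain_list_changed and do_balance follow the commit message; the surrounding structure and the refresh_meminfo helper are simplifications for illustration, not the real qmemmand code:

```python
import threading

global_lock = threading.Lock()

def domain_list_changed(system_state):
    # Sketch of the fixed ordering; simplified, not the real implementation.
    with global_lock:
        system_state.refresh_meminfo()   # hypothetical helper
        # do_balance() must run while the lock is still held: a client
        # request handled on another thread must not re-allocate memory
        # that was just freed for a starting VM. The GIL largely hid this
        # race; free-threaded Python 3.13 exposes it.
        system_state.do_balance()
    # Previous (buggy) behaviour: system_state.do_balance() was called here,
    # after the lock had already been released.
```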
The actual fix is in #629