xenopsd: Don't balloon down memory on same-host migration #6437
base: master
ocaml/xenopsd/lib/xenops_server.ml
Outdated
reduces unnecessary memory copying. *)
( try B.VM.wait_ballooning t vm
  with Xenopsd_error Ballooning_timeout_before_migration -> ()
(* CA-78365: set the memory dynamic range to a single value to stop ballooning.
I think we still need this part: even if you want to avoid ballooning down, you still need to stop ballooning. A VM attempting to balloon during migration won't be very healthy.
Instead, we probably need to look at how much memory the VM is currently using and set the balloon target to that value, so that the guest stops changing its allocation.
Also we should probably skip all this when static_min == dynamic_min == dynamic_max == static_max, which is the usual setting (I'd hope that the code here is mostly a noop in that case, but I'm not sure).
Both of these should be done now
When the VM (and its memory) isn't actually going to be moved anywhere (as in VDI migration to another SR), there's no point in ballooning down; doing so is likely to make VDI migration take longer if swap is engaged. Instead, change the ballooning target to memory_actual and wait for any ballooning to stop. If no ballooning could have been happening in the first place (dynamic_min = dynamic_max = static_max), then don't do any ballooning manipulations at all.
Signed-off-by: Andrii Sultanov <[email protected]>
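The three cases in the commit message can be sketched as a small decision function. This is a simplified model, not the xenopsd code: the record `vm`, the `balloon_plan` type, the `plan` function and the `gib` helper are all hypothetical names introduced for illustration, and it assumes that cross-host migration keeps ballooning down to `memory_dynamic_min` as before.

```ocaml
(* Hypothetical, simplified model of the VM's memory settings (bytes).
   These are not the real xenopsd types. *)
type vm = {
  memory_static_max : int64;
  memory_dynamic_min : int64;
  memory_dynamic_max : int64;
}

(* What to do with the balloon target before migration starts. *)
type balloon_plan =
  | Leave_alone (* no ballooning is possible, so touch nothing *)
  | Pin_to of int64 (* freeze the target at the memory currently in use *)
  | Balloon_down_to of int64 (* shrink before copying memory to another host *)

(* Assumption: [localhost] is true when the VM's memory stays on this host,
   for example during a VDI migration to another SR. *)
let plan ~localhost ~memory_actual vm =
  if
    vm.memory_dynamic_min = vm.memory_dynamic_max
    && vm.memory_dynamic_max = vm.memory_static_max
  then Leave_alone
  else if localhost then Pin_to memory_actual
  else Balloon_down_to vm.memory_dynamic_min

(* Helper for readable sizes: n GiB expressed in bytes. *)
let gib n = Int64.mul (Int64.of_int n) 1073741824L
```

For a VM with a 16 GiB to 32 GiB dynamic range and 20 GiB in use, `plan ~localhost:true ~memory_actual:(gib 20)` yields `Pin_to (gib 20)`, while the same VM migrating to another host yields `Balloon_down_to (gib 16)`.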
25e4a88 to 64a5a0a
A few comments, but generally I think this is fine.
ocaml/xenopsd/lib/xenops_server.ml
Outdated
( if
    not
      (vm.memory_dynamic_min = vm.memory_dynamic_max
      && vm.memory_dynamic_max = vm.memory_static_max
This second condition is not needed: if the dynamic range is fixed, then there will not be any ballooning (the atomic VM_set_memory_dynamic_range (id, vm.Vm.memory_dynamic_min, vm.Vm.memory_dynamic_min) will not do anything).
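A minimal sketch of why the check against static_max is redundant, under the assumption that the atomic simply collapses the dynamic range onto its lower bound. The `range` record and the functions `clamp_to_min` and `is_fixed` are hypothetical stand-ins, not xenopsd definitions.

```ocaml
(* Hypothetical stand-in for a VM's dynamic memory range, in bytes. *)
type range = { dynamic_min : int64; dynamic_max : int64 }

(* Models the assumed effect of
   VM_set_memory_dynamic_range (id, dynamic_min, dynamic_min):
   collapse the range onto its lower bound. *)
let clamp_to_min r = { r with dynamic_max = r.dynamic_min }

(* A range is fixed when both bounds coincide; static_max is never consulted. *)
let is_fixed r = r.dynamic_min = r.dynamic_max

let () =
  let r = { dynamic_min = 4096L; dynamic_max = 4096L } in
  (* For a fixed range the clamp returns the same range, so it is a no-op. *)
  assert (is_fixed r && clamp_to_min r = r)
```

Because the result depends only on the two dynamic bounds, comparing `dynamic_min` with `dynamic_max` is enough to decide whether the ballooning steps can be skipped.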
then
  (* There's no need to balloon down when doing localhost migration -
     we're not copying any memory in the first place. This would
     likely increase VDI migration time as swap would be engaged.
Is this just a guess or do you have evidence from tests?
This part comes from a report on the xcp-ng Discord
I made the observation on Windows VMs whilst performing VDI migrations in a production environment.
For example, on a VM with 32GB dynamic MAX and 16GB dynamic MIN with 20GB in use, the ballooning would mean waiting for 4GB to be pushed into the page file (my assumption is that the resulting changed blocks would then also have to be sent to the new SR if we are migrating the disc backing the page file). The free 12GB may also have been used by the guest OS read cache and would be ejected, meaning that subsequent reads which would have been cache hits may instead have to go to disc.
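The figures in this report follow from two subtractions, spelled out here as a small worked example. The variable names are illustrative only, and the calculation assumes the guest held its full dynamic MAX before ballooning started.

```ocaml
(* Figures from the report, in GB. *)
let dynamic_max = 32
let dynamic_min = 16
let in_use = 20

(* Ballooning down to dynamic_min leaves 16GB for 20GB of used memory,
   so the difference has to go to the page file. *)
let paged_out = in_use - dynamic_min

(* Memory above what is in use, which the guest may hold as read cache
   and which ballooning would evict. *)
let free = dynamic_max - in_use

let () =
  Printf.printf "paged out: %dGB, free or cache: %dGB\n" paged_out free
```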