Bug 53851 - nVMX: Support live migration of whole L1 guest
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 low
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks: 94971 198621 53601
Reported: 2013-02-14 14:36 UTC by Nadav Har'El
Modified: 2021-06-09 09:15 UTC (History)

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Description Nadav Har'El 2013-02-14 14:36:53 UTC
We need to support live migration of a guest, even if it is running a hypervisor with sub-guests.

I.e., this issue is about supporting live migration of a whole L1 guest hypervisor, together with its L2 guests.

One option is to add ioctls for saving and restoring a currently-running L2 guest's state (vmcs01, vcpu_vmx.nested, and perhaps more things?).

A different option is to force an exit from L2 to L1 during live migration. We can either use a spurious external-interrupt exit (hoping L1 will ignore it and immediately return to L2), or hide this exit from L1: add an ioctl for the L2->L1 exit (if in L2) and another ioctl for reentry, used on the receiving hypervisor. The receiver needs to know that it must perform the reentry, i.e., is_guest_mode(vcpu) also needs to be migrated. Note that the exit from L2 to L1 includes writing all of L2's state from host memory (vmcs02) to guest memory (vmcs12), where it will be migrated together with the rest of the guest.
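
To make the hidden-exit variant concrete, here is a minimal user-space sketch of the idea, not kernel code. Everything in it is an assumption made for illustration: the `toy_` names, the three-field VMCS, and the `reenter_l2` flag standing in for the migrated is_guest_mode(vcpu) bit. It only shows the ordering: the source copies vmcs02 into vmcs12 (guest memory), and the destination rebuilds vmcs02 from the migrated vmcs12 before resuming L2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced VMCS; a real VMCS has many more fields. */
struct toy_vmcs {
    uint64_t guest_rip;
    uint64_t guest_rsp;
    uint64_t guest_cr3;
};

struct toy_vcpu {
    bool guest_mode;          /* true while L2 runs (cf. is_guest_mode()) */
    struct toy_vmcs vmcs02;   /* host-memory state used to run L2 */
    struct toy_vmcs *vmcs12;  /* L1's VMCS for L2, located in guest memory */
    bool reenter_l2;          /* must travel in the migration stream */
};

/* Source side: force L2 -> L1 so that L2's state lands in guest memory. */
static void toy_force_l2_exit(struct toy_vcpu *vcpu)
{
    if (!vcpu->guest_mode)
        return;                       /* already in L1, nothing to sync */
    assert(vcpu->vmcs12 != NULL);
    *vcpu->vmcs12 = vcpu->vmcs02;     /* vmcs02 -> vmcs12 */
    vcpu->guest_mode = false;
    vcpu->reenter_l2 = true;          /* remember to resume L2 later */
}

/* Destination side: rebuild vmcs02 from the migrated vmcs12, resume L2. */
static void toy_reenter_l2(struct toy_vcpu *vcpu)
{
    if (!vcpu->reenter_l2)
        return;                       /* vCPU was in L1 when migrated */
    assert(vcpu->vmcs12 != NULL);
    vcpu->vmcs02 = *vcpu->vmcs12;     /* vmcs12 -> vmcs02 */
    vcpu->guest_mode = true;
    vcpu->reenter_l2 = false;
}
```

In this model the only extra per-vCPU state the migration stream carries is the single `reenter_l2` flag; the rest of L2's state travels as ordinary guest memory.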

See discussion in http://www.mail-archive.com/kvm@vger.kernel.org/msg54257.html on comparing these two options, and how one approach should be chosen for both nVMX and nSVM.

Additionally, we need to save/restore vmx->nested.vmxon (whether the migrated L1 guest ever did VMXON) and vmx->nested.current_vmptr (the last VMPTRLDed VMCS).
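
As a rough illustration of the small amount of state listed above, the following user-space sketch packs the VMXON flag, the guest-mode flag, and the current VMCS pointer into a fixed-width record and validates it on restore. It is not the kernel's actual interface. The `toy_` names, the flag values, the record layout, and the use of an all-ones pointer to mean "no VMCS loaded" are all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NESTED_VMXON      (1u << 0)   /* L1 executed VMXON */
#define TOY_NESTED_GUEST_MODE (1u << 1)   /* vCPU was running L2 */
#define TOY_NO_VMPTR          UINT64_MAX  /* assumed "no VMCS loaded" marker */

/* In-memory state, loosely modeled on vmx->nested (hypothetical). */
struct toy_nested {
    bool vmxon;
    bool guest_mode;
    uint64_t current_vmptr;
};

/* Fixed-width record as it would appear in the migration stream. */
struct toy_nested_wire {
    uint32_t flags;
    uint32_t reserved;
    uint64_t current_vmptr;
};

static void toy_nested_save(const struct toy_nested *n,
                            struct toy_nested_wire *w)
{
    assert(n != NULL && w != NULL);
    memset(w, 0, sizeof(*w));
    if (n->vmxon)
        w->flags |= TOY_NESTED_VMXON;
    if (n->guest_mode)
        w->flags |= TOY_NESTED_GUEST_MODE;
    w->current_vmptr = n->vmxon ? n->current_vmptr : TOY_NO_VMPTR;
}

/* Returns 0 on success, -1 if the record is inconsistent; n is untouched on error. */
static int toy_nested_restore(struct toy_nested *n,
                              const struct toy_nested_wire *w)
{
    bool vmxon = (w->flags & TOY_NESTED_VMXON) != 0;
    bool guest_mode = (w->flags & TOY_NESTED_GUEST_MODE) != 0;

    if (w->flags & ~(TOY_NESTED_VMXON | TOY_NESTED_GUEST_MODE))
        return -1;  /* unknown flag bits */
    if (!vmxon && (guest_mode || w->current_vmptr != TOY_NO_VMPTR))
        return -1;  /* L2 or a loaded VMCS requires VMXON */
    if (guest_mode && w->current_vmptr == TOY_NO_VMPTR)
        return -1;  /* running L2 requires a current VMCS */

    n->vmxon = vmxon;
    n->guest_mode = guest_mode;
    n->current_vmptr = w->current_vmptr;
    return 0;
}
```

Validating on restore matters because the record comes from user space on the destination host and cannot be trusted to be self-consistent.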
Comment 1 Nadav Har'El 2013-02-26 15:27:02 UTC
Another issue to consider for live migration: right now we only kunmap() and release_page_dirty() the vmcs12 page after a VMCLEAR (or VMXOFF, etc.). However, we may also need to do this on every exit to user space (i.e., whenever the KVM_RUN ioctl returns), so that live migration knows this page is dirty and needs to be copied.
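
The dirty-tracking concern can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: the `toy_` names, the 64-page guest, and the single-word dirty bitmap are invented for illustration and do not correspond to KVM's real data structures. The point is that the vmcs12 page is reported dirty both when it is released and whenever control returns to user space while it is still mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   64   /* toy guest: 64 pages, one bitmap word */

struct toy_vm {
    uint64_t dirty_bitmap;  /* bit n set => guest page frame n is dirty */
};

struct toy_vcpu_map {
    bool vmcs12_mapped;     /* host currently holds a mapping of vmcs12 */
    uint64_t vmcs12_gpa;    /* guest-physical address of the vmcs12 page */
};

static void toy_mark_page_dirty(struct toy_vm *vm, uint64_t gpa)
{
    uint64_t gfn = gpa >> TOY_PAGE_SHIFT;

    assert(gfn < TOY_NR_PAGES);
    vm->dirty_bitmap |= UINT64_C(1) << gfn;
}

/* Existing behaviour: on VMCLEAR/VMXOFF, unmap the page and mark it dirty. */
static void toy_release_vmcs12(struct toy_vm *vm, struct toy_vcpu_map *v)
{
    if (!v->vmcs12_mapped)
        return;
    toy_mark_page_dirty(vm, v->vmcs12_gpa);
    v->vmcs12_mapped = false;
}

/*
 * Proposed addition: on every return to user space, report the still-mapped
 * vmcs12 page as dirty so that a migration pass copies it again.
 */
static void toy_exit_to_userspace(struct toy_vm *vm,
                                  const struct toy_vcpu_map *v)
{
    if (v->vmcs12_mapped)
        toy_mark_page_dirty(vm, v->vmcs12_gpa);
}
```

Without the second hook, a vmcs12 page that stays mapped across many run-ioctl returns would never show up in the dirty bitmap, and the destination could end up with a stale copy.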
Comment 2 christian.rohmann 2021-06-09 08:55:41 UTC
Sorry for replying to this rather old bug - I was pointed to this via https://www.linux-kvm.org/page/Nested_Guests#Limitations 


If I may ask, is this really the last state of discussion and work on this issue?
Looking at, e.g.,

* https://github.com/qemu/qemu/commit/ebbfef2f34cfc749c045a4569dedb4f748ec024a
* https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=039aeb9deb9291f3b19c375a8bc6fa7f768996cc


there have been commits for the kernel as well as QEMU to support migration of nested VMs.
Comment 3 mlevitsk 2021-06-09 09:15:04 UTC
On Wed, 2021-06-09 at 08:55 +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=53851
> 
> --- Comment #2 from christian.rohmann@frittentheke.de ---
> Sorry for replying to this rather old bug - I was pointed to this via
> https://www.linux-kvm.org/page/Nested_Guests#Limitations 
> 
> 
> If I may ask, is this really the last state of discussion and work on this
> issue?
> Looking at i.e. 
> 
> * https://github.com/qemu/qemu/commit/ebbfef2f34cfc749c045a4569dedb4f748ec024a
> * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=039aeb9deb9291f3b19c375a8bc6fa7f768996cc
> 
> 
> there have been commits for the kernel as well as QEMU to support migration
> of nested VMs.
> 

AFAIK, running nested guests, and migrating while a nested guest is running,
should work on both Intel and AMD, but there have been lots of fixes in this
area recently, so a very new kernel should be used.

Also, in some cases, if the nested guest is 32-bit, the migration can still fail,
on Intel at least, last time I checked. On AMD I just recently fixed
such an issue for a 32-bit guest, and it seems to work for me.

I also know that if the nested guest is Hyper-V enabled (which is a bit of overkill,
as this brings us to double nesting), it crashes once in a while after lots of
migration cycles.

So there are still bugs, but overall it works.

Best regards,
	Maxim Levitsky
