We need to support live migration of a guest, even if it is running a hypervisor with sub-guests.
I.e., this issue is about supporting live migration of a whole L1 guest hypervisor, together with its L2 guests.
One option is to add ioctls for saving and restoring a currently-running L2 guest's state (vmcs01, vcpu_vmx.nested, and perhaps more things?).
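For what it's worth, mainline KVM eventually went roughly this way with the KVM_GET_NESTED_STATE / KVM_SET_NESTED_STATE ioctls (guarded by KVM_CAP_NESTED_STATE). A minimal userspace sketch of the save side, assuming an open vCPU fd and omitting most error handling:

```c
#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Save a vCPU's nested state for migration. Per the KVM API docs,
 * KVM_GET_NESTED_STATE fails with E2BIG if the buffer is too small but
 * fills in the size it needs, so we probe first, then allocate.
 * Returns NULL on error. */
struct kvm_nested_state *save_nested_state(int vcpu_fd)
{
	struct kvm_nested_state probe = { .size = sizeof(probe) };
	struct kvm_nested_state *state;

	ioctl(vcpu_fd, KVM_GET_NESTED_STATE, &probe); /* may fail with E2BIG */

	state = calloc(1, probe.size);
	if (!state)
		return NULL;
	state->size = probe.size;
	if (ioctl(vcpu_fd, KVM_GET_NESTED_STATE, state) < 0) {
		free(state);
		return NULL;
	}
	return state;
}
```

On the receiving side, the same blob is handed back with KVM_SET_NESTED_STATE before the first KVM_RUN.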
A different option is to force an exit from L2 to L1 during live migration. We can use a spurious external-interrupt exit (hoping L1 will ignore it and immediately return to L2), or hide this exit from L1: add an ioctl that forces the L2->L1 exit (if we are in L2), and another ioctl for re-entry, used on the receiving hypervisor (which will need to know to run it - i.e., is_guest_mode(vcpu) also needs to be migrated). Note how the exit from L2 to L1 includes writing all of L2's state from host memory (vmcs02) back to guest memory (vmcs12), where it will be migrated together with the rest of the guest.
See discussion in http://firstname.lastname@example.org/msg54257.html on comparing these two options, and how one approach should be chosen for both nVMX and nSVM.
Note that we additionally need to save/restore vmx->nested.vmxon (whether the migrated L1 guest ever did VMXON) and vmx->nested.current_vmptr (the last VMPTRLDed VMCS).
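These two items (plus the is_guest_mode(vcpu) bit mentioned above) map directly onto fields of the nested-state blob that mainline later defined in linux/kvm.h. A sketch of inspecting a saved VMX-format blob; per the KVM API documentation, vmxon_pa is -1ull if the guest never entered VMX operation:

```c
#include <linux/kvm.h>
#include <stdbool.h>

/* The pieces discussed above, as they appear in the uapi struct:
 * - "ever did VMXON"      -> hdr.vmx.vmxon_pa (-1ull if never)
 * - last VMPTRLDed VMCS   -> hdr.vmx.vmcs12_pa
 * - is_guest_mode(vcpu)   -> the KVM_STATE_NESTED_GUEST_MODE flag */
static bool nested_vmxon_done(const struct kvm_nested_state *s)
{
	return s->hdr.vmx.vmxon_pa != -1ull;
}

static bool nested_in_guest_mode(const struct kvm_nested_state *s)
{
	return s->flags & KVM_STATE_NESTED_GUEST_MODE;
}
```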
Another issue to consider for live migration: right now we only kunmap() and kvm_release_page_dirty() the vmcs12 page after a VMCLEAR (or VMXOFF, etc.). However, we may need to do this also on every exit to user space (i.e., whenever the KVM_RUN ioctl completes), so that live migration sees this page as dirty and knows it needs to be copied.
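Kernel-side, that would look something like the following pseudocode sketch (field names as in the vmx.c of that era - current_vmcs12, current_vmcs12_page - so treat it as illustrative, not a patch), called from the VMCLEAR/VMXOFF paths today and, per the above, possibly also before returning to user space:

```
/* Illustrative only: drop our kmap of the cached vmcs12 page and mark it
 * dirty, so the migration dirty bitmap catches any L2 state written back. */
static void nested_release_vmcs12(struct vcpu_vmx *vmx)
{
	if (vmx->nested.current_vmcs12 == NULL)
		return;
	kunmap(vmx->nested.current_vmcs12_page);
	kvm_release_page_dirty(vmx->nested.current_vmcs12_page);
	vmx->nested.current_vmcs12 = NULL;
}
```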
Sorry for replying to this rather old bug - I was pointed to this via https://www.linux-kvm.org/page/Nested_Guests#Limitations
If I may ask, is this really the last state of discussion and work on this issue?
Looking at, e.g.,
there have been commits for the kernel as well as QEMU to support migration of nested VMs.
On Wed, 2021-06-09 at 08:55 +0000, email@example.com wrote:
> --- Comment #2 from firstname.lastname@example.org ---
> Sorry for replying to this rather old bug - I was pointed to this via
> https://www.linux-kvm.org/page/Nested_Guests#Limitations
> If I may ask, is this really the last state of discussion and work on this
> issue?
> Looking at, e.g.,
> there have been commits for the kernel as well as QEMU to support migration
> of nested VMs.
AFAIK, running nested guests, and migration while a nested guest is running, should
work on both Intel and AMD, but there were lots of fixes in this area recently,
so a very new kernel should be used.
Plus, in some cases, if the nested guest is 32-bit, migration can still fail,
at least on Intel, last time I checked. On AMD I recently fixed
such an issue for a 32-bit guest and it seems to work for me.
I also know that if the nested guest is Hyper-V enabled (which is a bit of an overkill, as
this brings us to double nesting), then it crashes once in a while after lots of migrations.
So there are still bugs, but overall it works.