Bug 216046 - KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm) when booting nested guest Windows 7 on another disk
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: Intel Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-05-29 06:43 UTC by Eric Li
Modified: 2022-05-29 06:47 UTC

See Also:
Kernel Version: 5.17.8-200.fc35.x86_64
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Guest hypervisor to reproduce this bug (xz compressed) (830.07 KB, application/x-xz)
2022-05-29 06:43 UTC, Eric Li

Description Eric Li 2022-05-29 06:43:32 UTC
Created attachment 301072
Guest hypervisor to reproduce this bug (xz compressed)

Reproducible host configuration 1:
    CPU model: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
    Host kernel version: 5.17.8-200.fc35.x86_64

Reproducible host configuration 2:
    CPU model: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
    Host kernel version: 5.17.9

Host kernel arch: x86_64
Guest: a micro-hypervisor (called XMHF, 64-bit), which runs the Windows 7 or Windows 10 BIOS-mode boot loader (32-bit).
QEMU command line: qemu-system-x86_64 -m 512M -cpu Haswell,vmx=yes -enable-kvm -serial stdio -drive media=disk,file=c.img,index=1 -drive media=disk,file=w.img,index=2
The bug still occurs with -machine kernel_irqchip=off.
It cannot be tested with -accel tcg, because the guest requires nested virtualization.
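
Because the guest requires nested virtualization, it may help to confirm that nested VMX is enabled on the host before trying to reproduce. The following is a minimal sketch, assuming an Intel host with the kvm_intel module loaded. The function name nested_vmx_enabled and its optional path argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): succeeds if the
# kvm_intel "nested" module parameter is switched on.
# The optional first argument overrides the sysfs path.
nested_vmx_enabled() {
    param_file="${1:-/sys/module/kvm_intel/parameters/nested}"
    # Return 2 if the parameter cannot be read (e.g. kvm_intel not loaded).
    [ -r "$param_file" ] || return 2
    case "$(cat "$param_file")" in
        Y|1) return 0 ;;   # newer kernels report Y, older ones report 1
        *)   return 1 ;;
    esac
}
```

Typical use would be something like: nested_vmx_enabled && echo "nested VMX enabled"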

How to reproduce:

1. Install Windows 7 or Windows 10 in QEMU using MBR partitioning and BIOS boot (i.e. do not use GPT and UEFI). For example, I installed Windows on a 32G disk, which resulted in three partitions of roughly 50M, 31.5G (this is C:), and 450M. Only the MBR header (around 1M) and the 50M partition are needed. For example, https://drive.google.com/uc?id=1mLvKsPSuLbeckwcdnavnQMu8QxOwvX29 can be used to reproduce this bug. Suppose Windows is installed in w.img.

2. Obtain c.img. c.img (8M) is uploaded at https://drive.google.com/uc?id=1g3c9KMAoh_Yvb9bzSuOBMG5L-2VX6twU . It is also compressed as c.img.xz and uploaded with this bug. It is built from https://github.com/lxylxy123456/uberxmhf/tree/ab7968ed8017f43978081862526636f75c80a3b7 .

3. Start QEMU using the command line above.

4. The BIOS boots the micro-hypervisor (XMHF), and XMHF then boots Windows as a guest. After a short while, QEMU fails with the following error:

error: kvm run failed Input/output error
EAX=00000020 EBX=0000ffff ECX=00000000 EDX=0000ffff
ESI=00000000 EDI=00002300 EBP=00000000 ESP=00006d8c
EIP=00000018 EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =f000 000f0000 ffffffff 00809300
CS =cb00 000cb000 ffffffff 00809b00
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 00000000
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=0e 07 31 c0 b9 00 10 8d 3e 00 03 fc f3 ab 07 b8 20 00 e7 7e <cb> 0f 1f 80 00 00 00 00 6b 76 6d 20 61 50 69 43 20 00 00 00 2d 02 00 00 d9 02 00 00 00 03
KVM_GET_CLOCK failed: Input/output error
Aborted (core dumped)
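
For convenience, steps 2 and 3 can be scripted roughly as below. This is only a sketch: it assumes the attachment c.img.xz (or an already unpacked c.img) and w.img are reachable at the given paths, and the function name repro and the QEMU environment variable (an override for the emulator binary) are hypothetical names introduced here.

```shell
#!/bin/sh
# Hypothetical wrapper around steps 2 and 3 of the report.
#   $1: XMHF image (default c.img), $2: Windows image (default w.img)
repro() {
    hv_img="${1:-c.img}"
    win_img="${2:-w.img}"

    # Step 2: unpack the attachment if only the compressed copy is present.
    # -d decompresses, -k keeps the original .xz file.
    if [ ! -f "$hv_img" ] && [ -f "$hv_img.xz" ]; then
        xz -dk "$hv_img.xz"
    fi

    # Step 3: same QEMU command line as in the report.
    ${QEMU:-qemu-system-x86_64} -m 512M -cpu Haswell,vmx=yes -enable-kvm \
        -serial stdio \
        -drive "media=disk,file=$hv_img,index=1" \
        -drive "media=disk,file=$win_img,index=2"
}
```

Calling repro with no arguments uses c.img and w.img from the current directory.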

After some print debugging on "Reproducible host configuration 2" (Linux kernel 5.17.9), I obtained the following call stack for this bug:

QEMU: ioctl(..., KVM_RUN, ...)
  kvm_vcpu_ioctl()
    kvm_arch_vcpu_ioctl_run()
      vcpu_run()
        vcpu_enter_guest()
          vmx_handle_exit() (static_call(kvm_x86_handle_exit))
            __vmx_handle_exit()
              KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm)

That is, line 6038 in __vmx_handle_exit() is reached with vmx->nested.nested_run_pending = 1:

  6032		/*
  6033		 * KVM should never reach this point with a pending nested VM-Enter.
  6034		 * More specifically, short-circuiting VM-Entry to emulate L2 due to
  6035		 * invalid guest state should never happen as that means KVM knowingly
  6036		 * allowed a nested VM-Enter with an invalid vmcs12.  More below.
  6037		 */
  6038		if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
  6039			return -EIO;
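
Since KVM_BUG_ON() wraps WARN_ON_ONCE(), the host kernel log should also contain a one-time warning with a backtrace through this function, which may be an alternative to print debugging. A minimal sketch for pulling it out of the log follows. The helper name find_vmx_exit_warning and its context-lines argument are hypothetical, and the exact layout of the warning depends on the kernel build, so the pattern only matches the function name.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel log on stdin and print the lines that
# mention vmx_handle_exit (this also matches __vmx_handle_exit), with some
# surrounding context.
#   $1: number of context lines before and after each match (default 20)
find_vmx_exit_warning() {
    ctx="${1:-20}"
    grep -B "$ctx" -A "$ctx" 'vmx_handle_exit'
}
```

On the host this could be used as: dmesg | find_vmx_exit_warning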
