Bug 82211

Summary: Cannot boot Xen under KVM with X2APIC enabled
Product: Virtualization
Reporter: Zhou, Chao (chao.zhou)
Component: kvm
Assignee: virtualization_kvm
Status: RESOLVED CODE_FIX
Severity: normal
CC: alan, bonzini, rkrcmar
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.16.0
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 94971
Attachments: L1 serial log, xen.gz file

Description Zhou, Chao 2014-08-12 07:35:45 UTC
Environment:
------------
Host OS (ia32/ia32e/IA64): ia32e
Guest OS (ia32/ia32e/IA64): ia32e
Guest OS Type (Linux/Windows): Linux
kvm.git Commit: c77dcacb397519b6ade8f08201a4a90a7f4f751e
qemu.git Commit: 2d591ce2aeebf9620ff527c7946844a3122afeec
Host Kernel Version: 3.16.0
Hardware: Romley_EP, Haswell_EP, Ivytown_EP


Bug detailed description:
--------------------------
The L1 guest (Xen on KVM) panics and then reboots continuously when it is booted.

Note:
This is a kernel bug:
kvm.git   + qemu.git  =  result
c77dcacb  + 2d591ce2  =  bad
9f6226a7  + 2d591ce2  =  good

Reproduce steps:
----------------
1. Create the guest:
qemu-system-x86_64 -enable-kvm -m 4G -smp 2 -net nic,macaddr=00:13:13:51:51:15 -net tap,script=/etc/kvm/qemu-ifup nested-xen.qcow -cpu host

Current result:
----------------
The L1 guest panics and then reboots continuously.

Expected result:
----------------
The L1 guest boots up correctly.

Basic root-causing log:
Comment 1 Zhou, Chao 2014-08-12 07:36:04 UTC
the first bad commit is:
commit 6addfc42992be4b073c39137ecfdf4b2aa2d487f
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Mar 27 11:29:28 2014 +0100

    KVM: x86: avoid useless set of KVM_REQ_EVENT after emulation
    
    Despite the provisions to emulate up to 130 consecutive instructions, in
    practice KVM will emulate just one before exiting handle_invalid_guest_state
    because x86_emulate_instruction always sets KVM_REQ_EVENT.
    
    However, we only need to do this if an interrupt could be injected,
    which happens a) if an interrupt shadow bit (STI or MOV SS) has gone
    away; b) if the interrupt flag has just been set (other instructions
    than STI can set it without enabling an interrupt shadow).
    
    This cuts another 700-900 cycles from the cost of emulating an
    instruction (measured on a Sandy Bridge Xeon: 1650-2600 cycles
    before the patch on kvm-unit-tests, 925-1700 afterwards).
    
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
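
The condition described in the commit message can be sketched as follows. This is a minimal illustration in Python, not the kernel code (which is C and may differ in detail). The function name and its parameters are hypothetical; only the position of the interrupt flag in RFLAGS (bit 9) is an architectural fact. The sketch asks for an event check when an interrupt shadow has gone away or when IF changes from clear to set.

```python
# Illustrative model only; names are hypothetical, not kernel identifiers.
X86_EFLAGS_IF = 1 << 9  # interrupt flag, bit 9 of RFLAGS


def needs_event_request(old_rflags, new_rflags, old_shadow, new_shadow):
    """Return True if an interrupt could become injectable after emulation.

    old_*/new_* are the values before and after the emulated instruction.
    A non-zero shadow value means an STI or MOV SS shadow is active.
    """
    # Case a) an interrupt shadow that was active has gone away.
    shadow_expired = old_shadow != 0 and new_shadow == 0
    # Case b) IF was clear before the instruction and is set afterwards.
    # This also fires for STI itself; a request made while the shadow is
    # still active is assumed to be harmless in this model.
    if_just_set = bool((new_rflags & ~old_rflags) & X86_EFLAGS_IF)
    return shadow_expired or if_just_set
```

For example, an instruction such as POPF that sets IF without creating a shadow makes the function return True, while an instruction that leaves both IF and the shadow untouched returns False, which is the case where the extra request was wasted work.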
Comment 2 Zhou, Chao 2014-08-12 07:40:29 UTC
Created attachment 146291 [details]
L1 serial log
Comment 3 Paolo Bonzini 2014-09-01 12:46:27 UTC
Reproduced.  This is caused by "-cpu host" and, in particular, by x2apic.  This command line fails:

/usr/libexec/qemu-kvm \
    -kernel xen-4.4.0 \
    -append 'noreboot loglvl=all com1=115200,8n1 console=com1' \
    -serial mon:stdio \
    -initrd /boot/vmlinuz-2.6.18-348.el5xen -cpu kvm64,+x2apic

It works with "-cpu kvm64".
Comment 4 Paolo Bonzini 2014-09-01 13:09:20 UTC
... but couldn't reproduce the bisection results.  It fails for me in all three of 3.16, 3.17 and RHEL6.

Maybe the bisection result is specific to a particular KVM module parameter, for example enable_apicv=1?
Comment 5 Zhou, Chao 2014-09-02 04:50:38 UTC
kvm.git + qemu.git: fd275235_8b303011
Kernel version: 3.17.0-rc1
Tested on Ivytown_EP:
qemu-system-x86_64 -enable-kvm -m 4G -smp 2 -net nic,macaddr=00:13:13:51:51:15 -net tap,script=/etc/kvm/qemu-ifup nested-xen.qcow -cpu kvm64
The L1 guest panics and reboots; the bug reproduces.

I also tried both enable_apicv=1 and enable_apicv=0, creating the guest with
qemu-system-x86_64 -enable-kvm -m 4G -smp 2 -net nic,macaddr=00:13:13:51:51:15 -net tap,script=/etc/kvm/qemu-ifup nested-xen.qcow -cpu kvm64
or
qemu-system-x86_64 -enable-kvm -m 4G -smp 2 -net nic,macaddr=00:13:13:51:51:15 -net tap,script=/etc/kvm/qemu-ifup nested-xen.qcow -cpu host
The bug reproduces in each case.
Comment 6 Paolo Bonzini 2014-09-02 06:31:16 UTC
What version of Xen?  Can you attach the xen.gz file?
Comment 7 Zhou, Chao 2014-09-03 01:50:36 UTC
Xen version: 4.4-unstable. The xen.gz file is attached.
Comment 8 Zhou, Chao 2014-09-03 01:58:40 UTC
Created attachment 149111 [details]
xen.gz file
Comment 9 Paolo Bonzini 2014-09-03 11:25:15 UTC
Nope, your binary works with kvm/queue for me:

/sys/module/kvm_intel/parameters/emulate_invalid_guest_state:Y
/sys/module/kvm_intel/parameters/enable_apicv:N
/sys/module/kvm_intel/parameters/enable_shadow_vmcs:N
/sys/module/kvm_intel/parameters/ept:Y
/sys/module/kvm_intel/parameters/eptad:N
/sys/module/kvm_intel/parameters/fasteoi:Y
/sys/module/kvm_intel/parameters/flexpriority:Y
/sys/module/kvm_intel/parameters/nested:Y
/sys/module/kvm_intel/parameters/ple_gap:128
/sys/module/kvm_intel/parameters/ple_window:4096
/sys/module/kvm_intel/parameters/ple_window_grow:2
/sys/module/kvm_intel/parameters/ple_window_max:1073741823
/sys/module/kvm_intel/parameters/ple_window_shrink:0
/sys/module/kvm_intel/parameters/unrestricted_guest:Y
/sys/module/kvm_intel/parameters/vmm_exclusive:Y
/sys/module/kvm_intel/parameters/vpid:Y

I unzipped it, and invoked QEMU with

qemu-kvm -kernel ./xen -initrd /boot/vmlinuz-2.6.18-348.el5xen -cpu kvm64

(Any initrd will do).
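
To compare module parameters between two hosts, a listing like the one above can be collected with a short script. This is a sketch under assumptions: the helper name read_module_parameters and its sysfs_root argument are made up for illustration, and it only relies on the standard sysfs layout /sys/module/<module>/parameters/<name>.

```python
from pathlib import Path


def read_module_parameters(module="kvm_intel", sysfs_root="/sys/module"):
    """Return {parameter_name: value} for a loaded kernel module.

    Returns an empty dict if the module is not loaded or exposes no
    parameters directory. Unreadable entries are reported as None.
    """
    param_dir = Path(sysfs_root) / module / "parameters"
    if not param_dir.is_dir():
        return {}
    params = {}
    for entry in sorted(param_dir.iterdir()):
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters are write-only or otherwise unreadable.
            params[entry.name] = None
    return params


if __name__ == "__main__":
    # Print in the same path:value form as the listing in this comment.
    for name, value in read_module_parameters().items():
        print(f"/sys/module/kvm_intel/parameters/{name}:{value}")
```

Running it on both the reporter's host and the developer's host and diffing the output would show whether a parameter such as enable_apicv differs between the two setups.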
Comment 10 Radim Krčmář 2015-03-17 17:01:58 UTC
I can reproduce with kernel-4.0.0-0.rc3.git2.1.fc23.x86_64 and attached Xen.

  qemu-kvm -kernel ./xen -initrd /boot/vmlinuz-[...] -cpu kvm64,+x2apic

Xen doesn't use x2APIC (which is good), but x2APIC enables directed EOI for xAPIC, which doesn't work.  All is good if we manually avoid it:

  qemu-kvm [...] -append ioapic_ack=new

KVM IOAPIC never gets EOI.

I'll take a look where the problem is.
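
The symptom described in this comment can be illustrated with a toy model. This is not KVM's implementation: the classes ToyIoapicPin and ToyLapic are hypothetical, and the only detail taken from real hardware definitions is that bit 12 of the local APIC spurious-interrupt vector register suppresses EOI broadcast (APIC_SPIV_DIRECTED_EOI in the Linux headers). The model assumes a level-triggered pin that refuses new deliveries while its remote IRR bit is set.

```python
APIC_SPIV_DIRECTED_EOI = 1 << 12  # SVR bit 12: suppress EOI broadcast


class ToyIoapicPin:
    """A single level-triggered IOAPIC pin (toy model)."""

    def __init__(self):
        self.remote_irr = False

    def deliver_level_interrupt(self):
        # While remote IRR is set, the pin cannot deliver again.
        if self.remote_irr:
            return False
        self.remote_irr = True
        return True

    def eoi(self):
        # An EOI reaching the IOAPIC clears remote IRR.
        self.remote_irr = False


class ToyLapic:
    """A local APIC that may or may not broadcast EOIs (toy model)."""

    def __init__(self, ioapic_pin, spiv=0):
        self.ioapic_pin = ioapic_pin
        self.spiv = spiv

    def eoi(self):
        if self.spiv & APIC_SPIV_DIRECTED_EOI:
            # Broadcast suppressed: the IOAPIC is not told about the EOI.
            return
        self.ioapic_pin.eoi()
```

With spiv left at 0, a deliver/EOI/deliver sequence succeeds both times. With APIC_SPIV_DIRECTED_EOI set, the second delivery fails until something calls ToyIoapicPin.eoi() directly, which mirrors the observation that the KVM IOAPIC never gets the EOI.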
Comment 11 Radim Krčmář 2015-03-18 18:56:24 UTC
Should be fixed with "KVM: x86: call irq notifiers with directed EOI",
(http://www.spinics.net/lists/kernel/msg1949367.html)

can you check if it is?

Thanks.
Comment 12 Zhou, Chao 2015-03-20 05:17:12 UTC
(In reply to Radim Krčmář from comment #11)
> Should be fixed with "KVM: x86: call irq notifiers with directed EOI",
> (http://www.spinics.net/lists/kernel/msg1949367.html)
> 
> can you check if it is?
> 
> Thanks.

Tested this patch with kvm.git: 4ff6f8e61eb7f96d3ca535c6d240f863ccd6fb7d
Kernel version: 4.0.0-rc1+
qemu.git: cd232acfa0d70002fed89e9293f04afda577a513
Tested on Haswell_EP and Ivytown_EP.

The L1 guest (Xen on KVM) boots up and works fine.