Bug 60679 - L2 can't boot up when creating L1 with '-cpu host' qemu option
Status: CLOSED CODE_FIX
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
Reported: 2013-08-02 06:33 UTC by Jay Ren
Modified: 2013-08-13 07:36 UTC
Kernel Version: 3.11.0-RC1
Regression: No

Attachments
serial log of L1 guest (34.47 KB, application/octet-stream)
2013-08-02 06:33 UTC, Jay Ren

Description Jay Ren 2013-08-02 06:33:03 UTC
Created attachment 107077
serial log of L1 guest

Environment:
------------
Host OS (ia32/ia32e/IA64):ia32e
Guest OS (ia32/ia32e/IA64):ia32e
Guest OS Type (Linux/Windows):Linux
kvm.git next Commit:bf640876e21fe603f7f52b0c27d66b7716da0384
qemu-kvm uq/master Commit:0779caeb1a17f4d3ed14e2925b36ba09b084fb7b
Host Kernel Version:3.11.0-rc1
Hardware: SNB-EP


Bug detailed description:
--------------------------
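Create an L1 guest with "-cpu host", then create an L2 guest inside it. The L1 kernel oopses (see the call trace in the log) and the L2 guest fails to boot.
Notes:
1. When the L1 guest is created with "-cpu qemu64,+vmx", the L2 guest works fine.
2. This appears to be a kvm bug: with the same qemu-kvm commit, only the kvm commit separates the good and bad results: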
kvm      + qemu-kvm   =  result
bf640876 + 0779caeb   =  bad
6d128e1e + 0779caeb   =  good

The first bad commit is:
commit 21feb4eb64e21f8dc91136b91ee886b978ce6421
Author: Arthur Chunqi Li <yzt356@gmail.com>
Date:   Mon Jul 15 16:04:08 2013 +0800

    KVM: nVMX: Set segment infomation of L1 when L2 exits


Reproduce steps:
----------------
1. Create the L1 guest on the host (add "-cpu host" to hit the bug; the command as filed omits it):
qemu-system-x86_64 -enable-kvm -m 4G -smp 4 -net nic,macaddr=00:12:46:09:13:56 \
  -net tap,script=/etc/kvm/qemu-ifup nested-kvm.qcow
2. Create the L2 guest inside L1:
qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none rhel6u4.img
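
Before running step 2, it can help to confirm that nested VMX is actually available. The following is a minimal sketch, not part of the original report. It assumes an Intel host where kvm_intel exposes its "nested" module parameter under /sys, and that L1 exposes CPU flags through /proc/cpuinfo. The helper names nested_enabled and has_vmx are hypothetical, and the optional file argument exists only so the checks can be pointed at a copy of the file.

```shell
# Hypothetical helper: succeeds if kvm_intel has nested virtualization enabled.
# The parameter reads as Y/N or 1/0 depending on the kernel version.
nested_enabled() {
    case "$(cat "${1:-/sys/module/kvm_intel/parameters/nested}" 2>/dev/null)" in
        Y|1) return 0 ;;
        *)   return 1 ;;
    esac
}

# Hypothetical helper: succeeds if a "flags" line in cpuinfo lists vmx.
has_vmx() {
    grep -E '^flags' "${1:-/proc/cpuinfo}" | grep -qw vmx
}

# Example use (commented out so that sourcing this file has no side effects):
#   on the host:  nested_enabled || echo "nested VMX is disabled in kvm_intel"
#   inside L1:    has_vmx || echo "L1 does not see vmx, so L2 cannot use KVM"
```

If has_vmx fails inside L1, the L2 guest cannot use KVM at all, which is a different failure from the oops reported here.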

Current result:
----------------
The L1 kernel prints a call trace (oops) and L2 fails to boot.


Expected result:
----------------
L1 and L2 work fine



Basic root-cause log (from the L1 guest):
----------------------
[   94.585378] BUG: unable to handle kernel NULL pointer dereference at
0000000000000034

[   94.586002] IP: [<ffffffffa010bb26>] write_segment_descriptor+0x66/0xa0
[kvm]

[   94.586002] PGD 0 

[   94.586002] Oops: 0000 [#1] SMP 

[   94.586002] Modules linked in: fuse nfsv3 nfs_acl nfsv4 auth_rpcgss nfs
fscache dns_resolver lockd sunrpc 8021q garp stp llc binfmt_misc uinput ppdev
parport_pc parport kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode
pcspkr e1000 i2c_piix4 floppy(F) cirrus(F) ttm(F) drm_kms_helper(F) drm(F)
i2c_core(F)

[   94.586002] CPU 0 

[   94.586002] Pid: 2132, comm: qemu-system-x86 Tainted: GF            3.8.5 #4
Bochs Bochs

[   94.586002] RIP: 0010:[<ffffffffa010bb26>]  [<ffffffffa010bb26>]
write_segment_descriptor+0x66/0xa0 [kvm]

[   94.586002] RSP: 0018:ffff880118d2bac8  EFLAGS: 00010246

[   94.586002] RAX: 0000000000000000 RBX: ffff880106a79540 RCX:
0000000000000000

[   94.586002] RDX: 0000000000001000 RSI: 0000000000000009 RDI:
00000000000000a0

[   94.586002] RBP: ffff880118d2baf8 R08: 0000000000000008 R09:
00000000000000a0

[   94.586002] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff880118d2bb38

[   94.586002] R13: 0000000000000001 R14: 0000000000000008 R15:
0000000000000008

[   94.586002] FS:  00007f83d3bb5700(0000) GS:ffff88011fc00000(0000)
knlGS:0000000000000000

[   94.586002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

[   94.586002] CR2: 0000000000000034 CR3: 00000001069bf000 CR4:
00000000001427f0

[   94.586002] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000

[   94.586002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400

[   94.586002] Process qemu-system-x86 (pid: 2132, threadinfo ffff880118d2a000,
task ffff8801086e2e80)

[   94.586002] Stack:

[   94.586002]  0000000091800027 ffff880106a70000 ffff880106a79540
0000000000000001

[   94.586002]  0000000000000008 ffff880118d2bb38 ffff880118d2bb78
ffffffffa010be5f

[   94.586002]  ffff880106a79700 0000000800009568 ffff880106a78000
0000000000009188

[   94.586002] Call Trace:

[   94.586002]  [<ffffffffa010be5f>] load_segment_descriptor+0x2ff/0x330 [kvm]

[   94.586002]  [<ffffffffa010d4c8>] em_jmp_far+0x38/0x70 [kvm]

[   94.586002]  [<ffffffffa0108bd0>] ? check_cr_read+0x40/0x40 [kvm]

[   94.586002]  [<ffffffffa010c211>] x86_emulate_insn+0x261/0x1430 [kvm]

[   94.586002]  [<ffffffffa010d490>] ? em_lldt+0x30/0x30 [kvm]

[   94.586002]  [<ffffffffa00f5ff8>] x86_emulate_instruction+0x98/0x420 [kvm]

[   94.586002]  [<ffffffffa016a72d>] vmx_handle_exit+0x20d/0x780 [kvm_intel]

[   94.586002]  [<ffffffffa0165dfc>] ? vmx_vcpu_run+0x38c/0x5b0 [kvm_intel]

[   94.586002]  [<ffffffffa0110a28>] ? kvm_apic_has_interrupt+0x28/0xd0 [kvm]

[   94.586002]  [<ffffffffa01621b0>] ? vmx_invpcid_supported+0x20/0x20
[kvm_intel]

[   94.586002]  [<ffffffffa00f3982>] kvm_arch_vcpu_ioctl_run+0x8c2/0x1140 [kvm]

[   94.586002]  [<ffffffffa00ef607>] ? kvm_arch_vcpu_load+0x57/0x1e0 [kvm]

[   94.586002]  [<ffffffffa00dfcde>] kvm_vcpu_ioctl+0x37e/0x540 [kvm]

[   94.586002]  [<ffffffff8109a870>] ? __dequeue_entity+0x30/0x50

[   94.586002]  [<ffffffff811b074a>] do_vfs_ioctl+0x9a/0x550

[   94.586002]  [<ffffffff8164a817>] ? __schedule+0x3d7/0x7b0

[   94.586002]  [<ffffffff811b0ca1>] sys_ioctl+0xa1/0xb0

[   94.586002]  [<ffffffff81654659>] system_call_fastpath+0x16/0x1b

[   94.586002] Code: 41 0f b7 f5 0f b7 55 d0 c1 e6 03 8d 46 07 39 c2 7d 33 41
81 e6 fc ff 00 00 c6 43 28 0d c6 43 29 01 66 44 89 73 2a b8 02 00 00 00 <48> 8b
5d e0 4c 8b 65 e8 4c 8b 6d f0 4c 8b 75 f8 c9 c3 0f 1f 84 

[   94.586002] RIP  [<ffffffffa010bb26>] write_segment_descriptor+0x66/0xa0
[kvm]

[   94.586002]  RSP <ffff880118d2bac8>

[   94.586002] CR2: 0000000000000034

[   94.637540] ---[ end trace d23673089dd8f566 ]---

[   94.639336] ------------[ cut here ]------------

[   94.640276] kernel BUG at arch/x86/kernel/traps.c:643!

[   94.640276] invalid opcode: 0000 [#2] SMP 

[   94.640276] Modules linked in: fuse nfsv3 nfs_acl nfsv4 auth_rpcgss nfs
fscache dns_resolver lockd sunrpc 8021q garp stp llc binfmt_misc uinput ppdev
parport_pc parport kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode
pcspkr e1000 i2c_piix4 floppy(F) cirrus(F) ttm(F) drm_kms_helper(F) drm(F)
i2c_core(F)

[   94.640276] CPU 0 

[   94.640276] Pid: 2132, comm: qemu-system-x86 Tainted: GF     D      3.8.5 #4
Bochs Bochs

[   94.640276] RIP: 0010:[<ffffffff8164ca16>]  [<ffffffff8164ca16>]
do_device_not_available+0x16/0x30

[   94.640276] RSP: 0018:ffffffff81c01d18  EFLAGS: 00010002

[   94.640276] RAX: 000000008164c301 RBX: 0000000000000001 RCX:
ffffffff8164c32c

[   94.640276] RDX: 00000000ffffffff RSI: 0000000000000000 RDI:
ffffffff81c01d28

[   94.640276] RBP: ffffffff81c01d18 R08: ffff8801086e2ef0 R09:
000000000000e910

[   94.640276] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff8801086e2e80

[   94.640276] R13: ffffffff81c148b0 R14: 0000000000000000 R15:
ffff88011fc11980

[   94.640276] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000)
knlGS:0000000000000000

[   94.640276] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b

[   94.640276] CR2: 0000000000000034 CR3: 00000001069bf000 CR4:
00000000001427f0

[   94.640276] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000

[   94.640276] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400

[   94.640276] Process qemu-system-x86 (pid: 2132, threadinfo ffff880118d2a000,
task ffff8801086e2e80)

[   94.640276] Stack:

[   94.640276]  ffffffff81c01e28 ffffffff8165578e ffff88011fc11980
0000000000000000

[   94.640276]  ffffffff81c148b0 ffff8801086e2e80 ffffffff81c01e28
ffffffff81c14420

[   94.640276]  0000000000000000 0000000000000000 000000000000e910
ffff8801086e2ef0

[   94.640276] Call Trace:

[   94.640276] Code: f0 c9 c3 66 90 48 2d a8 00 00 00 48 89 87 98 00 00 00 eb
c8 90 55 48 89 e5 0f 1f 44 00 00 b0 01 84 c0 75 07 e8 9c 83 9c ff c9 c3 <0f> 0b
0f 1f 84 00 00 00 00 00 eb f6 66 66 66 66 66 2e 0f 1f 84 

[   94.640276] RIP  [<ffffffff8164ca16>] do_device_not_available+0x16/0x30

[   94.640276]  RSP <ffffffff81c01d18>

[   94.640276] ---[ end trace d23673089dd8f567 ]---

[   94.640276] Fixing recursive fault but reboot is needed!

[  125.299855] ------------[ cut here ]------------

[  125.299855] WARNING: at kernel/watchdog.c:246
watchdog_overflow_callback+0x98/0xc0()

[  125.299855] Hardware name: Bochs

[  125.299855] Watchdog detected hard LOCKUP on cpu 1

[  125.299855] Modules linked in: fuse nfsv3 nfs_acl nfsv4 auth_rpcgss nfs
fscache dns_resolver lockd sunrpc 8021q garp stp llc binfmt_misc uinput ppdev
parport_pc parport kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode
pcspkr e1000 i2c_piix4 floppy(F) cirrus(F) ttm(F) drm_kms_helper(F) drm(F)
i2c_core(F)

[  125.299855] Pid: 0, comm: swapper/1 Tainted: GF     D      3.8.5 #4

[  125.299855] Call Trace:

[  125.299855]  <NMI>  [<ffffffff8106062f>] warn_slowpath_common+0x7f/0xc0

[  125.299855]  [<ffffffff81060726>] warn_slowpath_fmt+0x46/0x50

[  125.299855]  [<ffffffff810f00c8>] watchdog_overflow_callback+0x98/0xc0

[  125.299855]  [<ffffffff8112c0fc>] __perf_event_overflow+0x9c/0x220

[  125.299855]  [<ffffffff8102500a>] ? x86_perf_event_set_period+0xda/0x170

[  125.299855]  [<ffffffff8112c9a4>] perf_event_overflow+0x14/0x20

[  125.299855]  [<ffffffff8102b2a4>] intel_pmu_handle_irq+0x1c4/0x330

[  125.299855]  [<ffffffff8164db61>] perf_event_nmi_handler+0x21/0x30

[  125.299855]  [<ffffffff8164d2fa>] nmi_handle+0x5a/0x80

[  125.299855]  [<ffffffff8164d41d>] do_nmi+0xfd/0x360

[  125.299855]  [<ffffffff8164c901>] end_repeat_nmi+0x1e/0x2e

[  125.299855]  [<ffffffff8164bfa2>] ? _raw_spin_lock+0x22/0x30

[  125.299855]  [<ffffffff8164bfa2>] ? _raw_spin_lock+0x22/0x30

[  125.299855]  [<ffffffff8164bfa2>] ? _raw_spin_lock+0x22/0x30

[  125.299855]  <<EOE>>  <IRQ>  [<ffffffff810a43f5>]
sched_rt_period_timer+0x105/0x320

[  125.299855]  [<ffffffff81088ec0>] __run_hrtimer+0x70/0x1d0

[  125.299855]  [<ffffffff810a42f0>] ? enqueue_rt_entity+0x80/0x80

[  125.299855]  [<ffffffff81089296>] hrtimer_interrupt+0xf6/0x230

[  125.299855]  [<ffffffff816562d9>] smp_apic_timer_interrupt+0x69/0x99

[  125.299855]  [<ffffffff8165521d>] apic_timer_interrupt+0x6d/0x80

[  125.299855]  <EOI>  [<ffffffff81045606>] ? native_safe_halt+0x6/0x10

[  125.299855]  [<ffffffff8101d5cf>] default_idle+0x4f/0x1a0

[  125.299855]  [<ffffffff8101ce99>] cpu_idle+0xd9/0x120
[  125.299855]  [<ffffffff81644245>] start_secondary+0x24c/0x24e

[  125.299855] ---[ end trace d23673089dd8f568 ]---
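
The faulting locations in a log like the one above can be extracted mechanically. The following is a minimal sketch, not part of the original report. It assumes the log is saved to a file and that the "IP:" and "RIP" summary lines each keep the symbol on the same line as the address. The helper name oops_fault_sites is hypothetical.

```shell
# Hypothetical helper: print the unique "symbol+offset/size" fault sites found
# on "IP: [<addr>] sym+0x.../0x..." and "RIP  [<addr>] sym+0x.../0x..." lines.
# Reads the files given as arguments, or stdin if none are given.
oops_fault_sites() {
    sed -n -E 's#^.*R?IP:? +\[<[0-9a-f]+>\] +([A-Za-z0-9_.]+\+0x[0-9a-f]+/0x[0-9a-f]+).*$#\1#p' "$@" | sort -u
}

# Example use (commented out; the file name is a placeholder):
#   oops_fault_sites l1-serial.log
```

Call-trace entries are deliberately skipped, because they carry no "IP" or "RIP" prefix.
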
Comment 1 Jay Ren 2013-08-13 07:35:37 UTC
The following commit fixed the bug:

commit 205befd9a5c701b56f569434045821f413f08f6d
Author: Gleb Natapov <gleb@redhat.com>
Date:   Sun Aug 4 15:08:06 2013 +0300

    KVM: nVMX: correctly set tr base on nested vmexit emulation

    After commit 21feb4eb64e21f8dc91136b91ee886b978ce6421 tr base is zeroed
    during vmexit. Set it to L1's HOST_TR_BASE. This should fix
    https://bugzilla.kernel.org/show_bug.cgi?id=60679

    Reported-by: Yongjie Ren <yongjie.ren@intel.com>
    Reviewed-by: Arthur Chunqi Li <yzt356@gmail.com>
    Tested-by: Yongjie Ren <yongjie.ren@intel.com>
    Signed-off-by: Gleb Natapov <gleb@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
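
To check whether a given kernel tree already contains this fix, one option is git's ancestry test. The following is a minimal sketch, not part of the original report. It assumes it is run inside a git checkout of the kernel whose history includes the commits named in this bug. The helper name kernel_has_commit is hypothetical.

```shell
# Hypothetical helper: succeeds if commit $1 is an ancestor of ref $2
# (default: HEAD), i.e. the tree at that ref already contains the commit.
kernel_has_commit() {
    git merge-base --is-ancestor "$1" "${2:-HEAD}" 2>/dev/null
}

# Example use (commented out; run from inside a kernel checkout):
#   kernel_has_commit 205befd9a5c701b56f569434045821f413f08f6d \
#       && echo "tr base fix is present"
#   kernel_has_commit 21feb4eb64e21f8dc91136b91ee886b978ce6421 \
#       && echo "first bad commit is present"
```

A tree that contains 21feb4eb but not 205befd9 would be expected to show the failure described in this bug.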
