Bug 69361 - Host call trace and guest hang after create guest.
Summary: Host call trace and guest hang after create guest.
Status: RESOLVED CODE_FIX
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-24 03:04 UTC by Zhou, Chao
Modified: 2014-02-18 09:33 UTC
CC List: 4 users

See Also:
Kernel Version: 3.13.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
host-dmesg (195.66 KB, text/plain)
2014-01-24 03:06 UTC, Zhou, Chao
host-serial (128.15 KB, text/plain)
2014-01-24 03:09 UTC, Zhou, Chao
host-cmdline (124 bytes, text/plain)
2014-01-24 03:11 UTC, Zhou, Chao
host kernel config (106.68 KB, text/plain)
2014-01-26 01:31 UTC, Zhou, Chao

Description Zhou, Chao 2014-01-24 03:04:14 UTC
Environment:
------------
Host OS (ia32/ia32e/IA64):ia32e
Guest OS (ia32/ia32e/IA64):ia32e
Guest OS Type (Linux/Windows):Linux
kvm.git Commit:c760f5e29d92adf5184589f1e616a4be146fb57c
qemu.git Commit:732c66ce641c69702a7e7fdb73b68f0c1b583ab5
Host Kernel Version:3.13.0
Hardware:Ivytown_EP, Romley_EP



Bug detailed description:
--------------------------
When a guest is created, the host prints a call trace and the guest hangs.

note:
1. This should be a kernel bug:
   kvm      + qemu     = result
   7650b687 + 732c66ce = good
   c760f5e2 + 732c66ce = bad
2. The bug reproduces about once in every three guest creations.
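
Since the bug only shows up about one time in three, a single clean boot says little about whether a given kvm.git revision is good. The sketch below estimates how many clean boots are needed before calling a revision good, and shows how a bisect between the two revisions in the table could be started. It assumes each boot hits the bug independently with the same probability, which the report does not establish. `tries_needed` and `start_bisect` are hypothetical helper names, and the kvm.git checkout path is supplied by the caller.

```shell
#!/bin/sh
# Sketch only; helper names are illustrative and not part of the original report.

# Print the smallest n with (1 - p)^n <= eps, i.e. the number of clean boots
# after which the chance of having missed the bug is at most eps.
# Assumes independent boots with reproduce probability p ($1); eps is $2.
tries_needed() {
    awk -v p="$1" -v eps="$2" 'BEGIN {
        if (p <= 0 || p > 1 || eps <= 0) exit 1
        n = 0; miss = 1
        while (miss > eps) { miss *= (1 - p); n++ }
        print n
    }'
}

# Start a bisect in a kvm.git checkout ($1) between the revisions from the table.
# git bisect start takes the bad revision first, then the good one.
start_bisect() {
    git -C "$1" bisect start c760f5e2 7650b687
}
```

With a reproduce rate of roughly 1/3, `tries_needed 0.333 0.01` prints 12, so about a dozen clean boots per bisect step would be needed before marking a revision with `git bisect good`. A single hang is enough for `git bisect bad`.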

Reproduce steps:
----------------
1. Start the guest:
qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none rhel6u4.img
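
Because the note above says the hang appears only about one time in three, a single boot is not a reliable check. A minimal sketch of a retry loop follows. It assumes it runs as root on the host, that `rhel6u4.img` is in the current directory, and that an `rcu_sched` stall line in `dmesg` is an adequate sign of the failure. `ATTEMPTS`, `BOOT_WAIT`, and the function names are hypothetical and not part of the original report.

```shell
#!/bin/sh
# Sketch only: boot the guest repeatedly and stop at the first host RCU stall.
# ATTEMPTS and BOOT_WAIT are assumed values, not taken from the report.
ATTEMPTS=${ATTEMPTS:-10}
BOOT_WAIT=${BOOT_WAIT:-120}

# Succeeds if the log text on stdin contains an rcu_sched stall message.
has_rcu_stall() {
    grep -q 'rcu_sched.*stall'
}

reproduce() {
    i=1
    while [ "$i" -le "$ATTEMPTS" ]; do
        dmesg -C    # clear the kernel ring buffer so old stalls are not counted (needs root)
        qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none rhel6u4.img &
        pid=$!
        sleep "$BOOT_WAIT"
        if dmesg | has_rcu_stall; then
            # Leave the guest running so the hang can be inspected.
            echo "stall seen on attempt $i (qemu pid $pid)"
            return 0
        fi
        kill -KILL "$pid" 2>/dev/null
        wait "$pid" 2>/dev/null
        i=$((i + 1))
    done
    echo "no stall in $ATTEMPTS attempts"
    return 1
}
```

Calling `reproduce` after sourcing the script runs the loop. The stall check is done before the guest is killed so that a hung guest stays available for debugging.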

Current result:
----------------
The guest hangs and the host prints a call trace.

Expected result:
----------------
Both guest and host work fine.

Basic root-causing log:
----------------------
INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected stalls on CPUs/tasks: { 0} (detected by 27, t=21004 jiffies, g=5092, c=5091, q=637)
sending NMI to all CPUs:
NMI backtrace for cpu 0
CPU: 0 PID: 10728 Comm: qemu-system-x86 Not tainted 3.13.0 #2
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x056.071020121508 07/10/2012
task: ffff88043d192d20 ti: ffff880433ac2000 task.ti: ffff880433ac2000
RIP: 0010:[<ffffffff812427e5>]  [<ffffffff812427e5>] delay_tsc+0x28/0x4b
RSP: 0018:ffff8800bd003b88  EFLAGS: 00000097
RAX: 0000000028ee59d7 RBX: ffffffff81cae4d0 RCX: 0000000028ee58e7
RDX: 00000000000000f0 RSI: 0000000000000000 RDI: 0000000000000a86
RBP: ffff8800bd003b88 R08: 0000000000000000 R09: ffffffff81a91888
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000000026de
R13: 0000000000000060 R14: 0000000000000001 R15: 000000000000002a
FS:  00007fb28ca2f700(0000) GS:ffff8800bd000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000043d4e2000 CR4: 00000000000427e0
Stack:
 ffff8800bd003b98 ffffffff81242856 ffff8800bd003bc8 ffffffff812d252d
 ffffffff81cae4d0 0000000000000000 ffffffff81cae4d0 0000000000000006
 ffff8800bd003c18 ffffffff812d4511 ffffffff81bc3f80 00000005812405f4
Call Trace:
 <IRQ>  [<ffffffff81242856>] __const_udelay+0x28/0x2a
 [<ffffffff812d252d>] wait_for_xmitr+0x46/0x8e
 [<ffffffff812d4511>] serial8250_console_write+0xbc/0xfc
 [<ffffffff8107c772>] call_console_drivers.clone.1+0xb4/0xc6
 [<ffffffff8107c881>] console_cont_flush.clone.0+0xfd/0x114
 [<ffffffff8107c8cf>] console_unlock+0x37/0x217
 [<ffffffff8107cfbe>] vprintk_emit+0x3c3/0x3ec
 [<ffffffff8147d93d>] printk+0x48/0x4a
 [<ffffffff81084db9>] print_cpu_stall+0x25/0x129
 [<ffffffff8108514c>] __rcu_pending+0x9f/0x1cc
 [<ffffffff8108533c>] rcu_check_callbacks+0xc3/0x144
 [<ffffffff81047c86>] update_process_times+0x3c/0x65
 [<ffffffff8108d975>] tick_sched_handle+0x45/0x54
 [<ffffffff8108db36>] tick_sched_timer+0x58/0x77
 [<ffffffff8105ad82>] __run_hrtimer+0xd6/0x161
 [<ffffffff8108dade>] ? tick_nohz_handler+0xab/0xab
 [<ffffffff8105b16b>] hrtimer_interrupt+0xd0/0x1bc
 [<ffffffff810283c5>] local_apic_timer_interrupt+0x53/0x58
 [<ffffffff81029035>] smp_apic_timer_interrupt+0x3e/0x51
 [<ffffffff8148884a>] apic_timer_interrupt+0x6a/0x70
 <EOI>  [<ffffffff81082627>] ? __srcu_read_unlock+0xa/0x18
 [<ffffffffa022b498>] vcpu_enter_guest+0x46f/0x696 [kvm]
 [<ffffffff8108260d>] ? __srcu_read_lock+0x39/0x49
 [<ffffffffa022b726>] __vcpu_run+0x67/0x1bb [kvm]
 [<ffffffffa022f3e4>] kvm_arch_vcpu_ioctl_run+0xef/0x1ac [kvm]
 [<ffffffffa021d3e2>] kvm_vcpu_ioctl+0x121/0x4b5 [kvm]
 [<ffffffff8104a884>] ? do_sigtimedwait+0x8e/0x19f
 [<ffffffff81134125>] do_vfs_ioctl+0x2a2/0x2be
 [<ffffffff81091735>] ? SyS_futex+0x103/0x13d
 [<ffffffff8113419a>] SyS_ioctl+0x59/0x7d
 [<ffffffff81487ca2>] system_call_fastpath+0x16/0x1b
Code: 00 c9 c3 55 48 89 e5 65 8b 34 25 70 b0 00 00 66 66 90 0f ae e8 0f 31 89 c1 66 66 90 0f ae e8 0f 31 89 c2 29 ca 39 fa 73 23 f3 90 <65> 44 8b 04 25 70 b0 00 00 44 39 c6 74 e0 29 c1 01 cf 66 66 90
NMI backtrace for cpu 1
CPU: 1 PID: 10727 Comm: qemu-system-x86 Not tainted 3.13.0 #2
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x056.071020121508 07/10/2012
task: ffff8804310612b0 ti: ffff88043418c000 task.ti: ffff88043418c000
RIP: 0010:[<ffffffffa02a055a>]  [<ffffffffa02a055a>] vmx_vcpu_run+0x3f3/0x4c3 [kvm_intel]
RSP: 0018:ffff88043418dcb8  EFLAGS: 00000046
RAX: 0000000080000202 RBX: 0000000000000200 RCX: ffff8808377d00c0
RDX: 0000000000004404 RSI: 0000000000000002 RDI: ffff8808377d00c0
RBP: ffff88043418dd08 R08: ffffffff81c07720 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000004 R15: 0000000000000000
FS:  00007fb28d230700(0000) GS:ffff8800bd020000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000043d4e2000 CR4: 00000000000427e0
Stack:
 0000000000000000 ffff8808377d00c0 ffff8808377d00c0 00000002377d00c0
 ffff88043418dd08 ffff8808377d00c0 0000000000000000 ffff8808377d00f0
 0000000000000000 0000000000000000 ffff88043418dd98 ffffffffa022b594
Call Trace:
 [<ffffffffa022b594>] vcpu_enter_guest+0x56b/0x696 [kvm]
 [<ffffffffa0243252>] ? __apic_accept_irq+0x130/0x1ea [kvm]
 [<ffffffffa0243362>] ? kvm_apic_local_deliver+0x56/0x5c [kvm]
 [<ffffffffa022b726>] __vcpu_run+0x67/0x1bb [kvm]
 [<ffffffffa022f3e4>] kvm_arch_vcpu_ioctl_run+0xef/0x1ac [kvm]
 [<ffffffffa021d3e2>] kvm_vcpu_ioctl+0x121/0x4b5 [kvm]
 [<ffffffff81134125>] do_vfs_ioctl+0x2a2/0x2be
 [<ffffffffa022937a>] ? kvm_on_user_return+0x4f/0x51 [kvm]
 [<ffffffff8113419a>] SyS_ioctl+0x59/0x7d
 [<ffffffff81487ca2>] system_call_fastpath+0x16/0x1b
Code: 00 80 3d 12 03 00 80 75 05 e8 01 e2 ff ff 85 db 79 22 81 e3 00 07 00 00 81 fb 00 02 00 00 75 14 48 8b 7d b8 e8 d3 23 f8 ff cd 02 <48> 8b 7d b8 e8 d7 23 f8 ff f6 05 52 5d 01 00 20 48 8b 45 b8 8b
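
The first line of the log above carries the key facts of the stall: which CPU stalled, which CPU detected it, and how many jiffies had elapsed. A small sketch for pulling these out of a saved host log follows. `stall_summary` is a hypothetical helper name, and the pattern is assumed from the single line format shown in this report, so other kernel versions may print it differently.

```shell
#!/bin/sh
# Sketch only: summarize rcu_sched "detected stalls" lines read from stdin.
# The pattern is based solely on the log line quoted in this report.
stall_summary() {
    sed -n 's/.*detected stalls on CPUs\/tasks: { *\([^}]*\)} (detected by \([0-9][0-9]*\), t=\([0-9][0-9]*\) jiffies.*/stalled=\1 detected_by=\2 jiffies=\3/p'
}
```

Fed the stall line from this report, for example via `stall_summary < host-dmesg` on a local copy of the attachment, it prints `stalled=0 detected_by=27 jiffies=21004`.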
Comment 1 Zhou, Chao 2014-01-24 03:06:58 UTC
Created attachment 123241
host-dmesg
Comment 2 Zhou, Chao 2014-01-24 03:09:34 UTC
Created attachment 123251
host-serial
Comment 3 Zhou, Chao 2014-01-24 03:11:11 UTC
Created attachment 123261
host-cmdline
Comment 4 Marcelo Tosatti 2014-01-24 19:52:08 UTC
Can you make the host kernel config file available?
Comment 5 Zhou, Chao 2014-01-26 01:31:33 UTC
Created attachment 123351
host kernel config
Comment 6 Marcelo Tosatti 2014-01-26 19:16:54 UTC
Could be a duplicate of

http://marc.info/?l=linux-kernel&m=139038631607917&q=raw

Can you try that patch, please?
Comment 7 Zhou, Chao 2014-01-27 05:37:51 UTC
(In reply to Marcelo Tosatti from comment #6)
> Could be a duplicate of
> http://marc.info/?l=linux-kernel&m=139038631607917&q=raw
>
> Can you try that patch please

After applying this patch to kvm.git commit c760f5e29d92adf5184589f1e616a4be146fb57c, both guest and host work fine and the bug no longer reproduces.
Comment 8 Robert Ho 2014-02-17 09:12:14 UTC
Is this patch upstream now? Any update?
Comment 9 Paolo Bonzini 2014-02-18 09:14:00 UTC
Yes, commit 215393bc1fab3d61a5a296838bdffce22f27ffda.
Comment 10 Robert Ho 2014-02-18 09:19:30 UTC
(In reply to Paolo Bonzini from comment #9)
> Yes, commit 215393bc1fab3d61a5a296838bdffce22f27ffda.

May I know which branch it was committed to?
Comment 11 Paolo Bonzini 2014-02-18 09:33:09 UTC
It is in v3.14-rc1.
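
To check whether a particular kernel tree already carries the fix, the commit named in comment 9 can be tested for ancestry against a tag or branch. The sketch below assumes a local git checkout of the kernel. `contains_commit` is a hypothetical helper name and the checkout path is a placeholder supplied by the caller.

```shell
#!/bin/sh
# Sketch only: test whether a commit is reachable from a given ref.
FIX_COMMIT=215393bc1fab3d61a5a296838bdffce22f27ffda   # from comment 9

# $1 = path to a kernel git checkout, $2 = commit to look for,
# $3 = ref to check (defaults to HEAD). Succeeds if $2 is an ancestor of $3.
contains_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "${3:-HEAD}"
}
```

For example, `contains_commit /path/to/linux "$FIX_COMMIT" v3.14-rc1 && echo "fix present"`. A negative result on a stable or distribution branch is not conclusive, because a backported patch would have a different commit ID.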
