Created attachment 60022 [details] config-3.0.0-rc1 Host is 3.0.0-rc1, guest is 3.0.0-rc1. After shutting down the guest, KVM on the host uses 100% CPU. gdb shows for KVM: 6 Thread 0x7f00d1875700 (LWP 32105) 0x0000003e278ce2f7 in ioctl () at ../sysdeps/unix/syscall-template.S:82 5 Thread 0x7f00d1074700 (LWP 32106) 0x0000003e278ce2f7 in ioctl () at ../sysdeps/unix/syscall-template.S:82 4 Thread 0x7f00d0873700 (LWP 32107) 0x0000003e278ce2f7 in ioctl () at ../sysdeps/unix/syscall-template.S:82 3 Thread 0x7f00d0072700 (LWP 32108) 0x0000003e278ce2f7 in ioctl () at ../sysdeps/unix/syscall-template.S:82 2 Thread 0x7f00cf871700 (LWP 32109) 0x0000003e278ce2f7 in ioctl () at ../sysdeps/unix/syscall-template.S:82 * 1 Thread 0x7f00d1e95760 (LWP 32100) 0x0000003e278ceda3 in select () at ../sysdeps/unix/syscall-template.S:82 This seems to be a problem with the 3.0.0-rc1 guest, because if I boot 2.6.39 in the guest (still keeping same 3.0.0-rc1 running on the host), then shutdown completes normally, no 100% CPU usage loop. .config for KVM guest attached. KVM command-line (created by libvirt): /usr/bin/kvm -S -M pc-0.14 -cpu phenom,+wdt,+skinit,+osvw,+3dnowprefetch,+misalignsse,+sse4a,+abm,+cr8legacy,+extapic,+cmp_legacy,+lahf_lm,+rdtscp,+pdpe1gb,+popcnt,+cx16,+ht,+vme -enable-nesting -enable-kvm -m 3072 -smp 5,sockets=5,cores=1,threads=1 -name debian-enc -uuid 55782a0a-937c-fa7b-8db5-86876d151a08 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/debian-enc.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=readline -rtc base=utc -boot order=c,menu=off -drive file=/var/lib/libvirt/images/debian-enc.img,if=none,id=drive-virtio-disk0,boot=on,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=18,id=hostnet0,vhost=on,vhostfd=19 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:52:de:54,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 Bisecting...
Please also post - kvm_stat output - trace-cmd record -e kvm; trace-cmd report - qemu monitor (qemu) info registers (qemu) x/40i $eip - 32
Created attachment 60032 [details] kvm_stat output when the 100% cpu usage happens
Created attachment 60042 [details] trace-cmd record -e kvm; ^C; trace-cmd report (In reply to comment #1) > Please also post > > - kvm_stat output > - trace-cmd record -e kvm; trace-cmd report Doesn't seem to be part of mainline, are you refering to this trace-cmd? git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
(In reply to comment #1) > - qemu monitor > (qemu) info registers > (qemu) x/40i $eip - 32 Hmm libvirt doesn't provide access to qemu monitor on purpose. Run it manually like this: /usr/bin/kvm -S -cpu phenom,+wdt,+skinit,+osvw,+3dnowprefetch,+misalignsse,+sse4a,+abm,+cr8legacy,+extapic,+cmp_legacy,+lahf_lm,+rdtscp,+pdpe1gb,+popcnt,+cx16,+ht,+vme -enable-nesting -enable-kvm -m 3072 -smp 5,sockets=5,cores=1,threads=1 -name debian-enc -uuid 55782a0a-937c-fa7b-8db5-86876d151a08 -rtc base=utc -boot order=c,menu=off -drive file=/var/lib/libvirt/images/debian-enc.img,if=none,id=drive-virtio-disk0,boot=on,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -usb -device usb-tablet,id=input0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -monitor stdio (qemu) info registers RAX=ffffffff81ca0ac0 RBX=ffff8800ba5a3d48 RCX=ffffffff81c9b020 RDX=0000000000000000 RSI=00000000000000ff RDI=00000000000000f0 RBP=ffff8800bf003f88 RSP=ffff8800bf003f88 R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000001 R11=00000090ffffff90 R12=0000000000000004 R13=ffff8800ba5a3d6c R14=ffff8800ba5a3d00 R15=ffff8800bf00d1c0 RIP=ffffffff81023f75 RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1 ES =0000 0000000000000000 ffffffff 00000000 CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA] SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0000 0000000000000000 ffffffff 00000000 FS =0000 00007fc3a9d4b700 ffffffff 00000000 GS =0000 ffff8800bf000000 ffffffff 00000000 LDT=0000 0000000000000000 0000ffff 00000000 TR =0040 ffff8800bf00e640 00002087 00008b00 DPL=0 TSS64-busy GDT= ffff8800bf004000 0000007f IDT= ffffffff81dd5000 00000fff CR0=8005003b CR2=00000000019093dc CR3=00000000bab17000 CR4=000006f0 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000d01 FCW=037f FSW=3800 [ST=7] FTW=80 MXCSR=00001f80 FPR0=8200000000000000 c01d FPR1=0000000000000000 0000 FPR2=0000000000000000 0000 FPR3=0000000000000000 0000 FPR4=0000000000000000 0000 FPR5=0000000000000000 0000 FPR6=0000000000000000 0000 FPR7=0000000000000000 0000 XMM00=00ff0000000000000000000000000000 XMM01=00000000000000000000000000000000 XMM02=ffffffff00000000000000ff00000000 XMM03=ffffffff00000000000000ff00000000 XMM04=776f6e206c6c695700000000019039e0 XMM05=00000000000000000000000000000000 XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000 XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000 XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000 XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000 XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000 (qemu) x/40i $eip - 32 0xffffffff81023f55: leaveq 0xffffffff81023f56: retq 0xffffffff81023f57: nopw 0x0(%rax,%rax,1) 0xffffffff81023f60: push %rbp 0xffffffff81023f61: mov %rsp,%rbp 0xffffffff81023f64: sti 0xffffffff81023f65: hlt 0xffffffff81023f66: leaveq 0xffffffff81023f67: retq 0xffffffff81023f68: nopl 0x0(%rax,%rax,1) 0xffffffff81023f70: push %rbp 0xffffffff81023f71: mov %rsp,%rbp 0xffffffff81023f74: hlt 0xffffffff81023f75: leaveq 0xffffffff81023f76: retq 0xffffffff81023f77: nopw 0x0(%rax,%rax,1) 0xffffffff81023f80: push %rbp 0xffffffff81023f81: mov %rsp,%rbp 0xffffffff81023f84: clts 0xffffffff81023f86: leaveq 0xffffffff81023f87: retq 0xffffffff81023f88: nopl 0x0(%rax,%rax,1) 0xffffffff81023f90: push %rbp 0xffffffff81023f91: mov %rsp,%rbp 0xffffffff81023f94: mov %cr0,%rax 0xffffffff81023f97: leaveq 0xffffffff81023f98: retq 0xffffffff81023f99: nopl 0x0(%rax) 0xffffffff81023fa0: push %rbp 0xffffffff81023fa1: mov %rsp,%rbp 0xffffffff81023fa4: mov %rdi,%cr0 0xffffffff81023fa7: leaveq 0xffffffff81023fa8: retq 0xffffffff81023fa9: nopl 0x0(%rax) 0xffffffff81023fb0: push %rbp 0xffffffff81023fb1: mov %rsp,%rbp 0xffffffff81023fb4: mov %cr2,%rax 0xffffffff81023fb7: leaveq 0xffffffff81023fb8: retq 0xffffffff81023fb9: nopl 0x0(%rax)
On 05/30/2011 12:20 PM, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=36222 > > > > > > --- Comment #3 from Török Edwin<edwintorok@gmail.com> 2011-05-30 09:20:06 > --- > Created an attachment (id=60042) > --> (https://bugzilla.kernel.org/attachment.cgi?id=60042) > trace-cmd record -e kvm; ^C; trace-cmd report > > (In reply to comment #1) > > Please also post > > > > - kvm_stat output > > - trace-cmd record -e kvm; trace-cmd report > > Doesn't seem to be part of mainline, are you refering to this trace-cmd? > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git > Yes. Install udis86 and udis86-devel before building it, please.
On 05/30/2011 12:26 PM, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=36222 > > > > > > --- Comment #4 from Török Edwin<edwintorok@gmail.com> 2011-05-30 09:25:58 > --- > (In reply to comment #1) > > - qemu monitor > > (qemu) info registers > > (qemu) x/40i $eip - 32 > 0xffffffff81023f74: hlt > 0xffffffff81023f75: leaveq > 0xffffffff81023f76: retq Looks like the ordinary idle loop. What does the guest say while it's shutting down? Seems like some daemon can't stop itself, not a host issue.
It says: Will now halt. _
And that corresponds to this script: log_action_msg "Will now halt" halt -d -f $netdown $poweroff $hddown
Using -serial stdio, and console=ttyS0 I see an Oops when running 'halt -f -p' in the guest: /[ 36.308445] invalid opcode: 0000 [#1] SMP [ 36.308445] CPU 1 [ 36.308445] Modules linked in: [ 36.308445] [ 36.308445] Pid: 0, comm: kworker/0:0 Not tainted 3.0.0-rc1 #8 Bochs Bochs [ 36.308445] RIP: 0010:[<ffffffff8101cf19>] [<ffffffff8101cf19>] native_play_dead+0xd9/0xf0 [ 36.308445] RSP: 0018:ffff8800bbeb1ef8 EFLAGS: 00010046 [ 36.308445] RAX: ffff8800bbeb0010 RBX: ffff8800bbeb0010 RCX: 0000000000000000 [ 36.308445] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800bbeb1ee4 [ 36.308445] RBP: ffff8800bbeb1f08 R08: ffff8800bbeb1edc R09: ffff8800bbeb1ed8 [ 36.308445] R10: 0000000874121b10 R11: 0000000000000000 R12: 00000000ffffffff [ 36.308445] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 36.308445] FS: 00007f0c93b3f700(0000) GS:ffff8800bf100000(0000) knlGS:0000000000000000 [ 36.308445] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 36.308445] CR2: 00007f9c8e7c0f80 CR3: 0000000001c93000 CR4: 00000000000006e0 [ 36.308445] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 36.308445] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 36.308445] Process kworker/0:0 (pid: 0, threadinfo ffff8800bbeb0000, task ffff8800bbe76740) [ 36.308445] Stack: [ 36.308445] ffff8800bbeb0010 ffffffff81d10810 ffff8800bbeb1f28 ffffffff810018ae [ 36.308445] 0000000000000001 0000000000000000 ffff8800bbeb1f48 ffffffff8162c73b [ 36.308445] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 36.308445] Call Trace: [ 36.308445] [<ffffffff810018ae>] cpu_idle+0x9e/0xb0 [ 36.308445] [<ffffffff8162c73b>] start_secondary+0x177/0x17c [ 36.308445] Code: 09 f4 65 48 8b 1c 25 c8 b5 00 00 48 81 eb c8 1f 00 00 0f 09 66 66 90 66 90 31 c9 0f 1f 80 00 00 00 00 0f ae 3b 48 89 d8 48 89 ca <0f> 01 c8 0f ae f0 4c 89 e0 0f 01 c9 eb e9 66 0f 1f 84 00 00 00 [ 36.308445] RIP [<ffffffff8101cf19>] native_play_dead+0xd9/0xf0 [ 36.308445] RSP <ffff8800bbeb1ef8> [ 36.308445] ---[ end trace b72c6a73fe4299e2 ]--- [ 36.330393] invalid opcode: 0000 [#2] SMP [ 36.330393] CPU 2 [ 36.330393] Modules linked in: [ 36.330393] [ 36.330393] Pid: 0, comm: kworker/0:1 Tainted: G D 3.0.0-rc1 #8 Bochs Bochs [ 36.330393] RIP: 0010:[<ffffffff8101cf19>] [<ffffffff8101cf19>] native_play_dead+0xd9/0xf0 [ 36.330393] RSP: 0018:ffff8800bbecfef8 EFLAGS: 00010046 [ 36.330393] RAX: ffff8800bbece010 RBX: ffff8800bbece010 RCX: 0000000000000000 [ 36.330393] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800bbecfee4 [ 36.330393] RBP: ffff8800bbecff08 R08: ffff8800bbecfedc R09: ffff8800bbecfed8 [ 36.330393] R10: 000000166d13e921 R11: 0000000000000000 R12: 00000000ffffffff [ 36.330393] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 36.330393] FS: 00007ff8f51e97a0(0000) GS:ffff8800bf200000(0000) knlGS:0000000000000000 [ 36.330393] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 36.330393] CR2: 00007ff8f4aa7920 CR3: 0000000001c93000 CR4: 00000000000006e0 [ 36.330393] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 36.330393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 36.330393] Process kworker/0:1 (pid: 0, threadinfo ffff8800bbece000, task ffff8800bbec0ec0) [ 36.330393] Stack: [ 36.330393] ffff8800bbece010 ffffffff81d10810 ffff8800bbecff28 ffffffff810018ae [ 36.330393] 0000000000000002 0000000000000000 ffff8800bbecff48 ffffffff8162c73b [ 36.330393] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 36.330393] Call Trace: [ 36.330393] [<ffffffff810018ae>] cpu_idle+0x9e/0xb0 [ 36.330393] [<ffffffff8162c73b>] start_secondary+0x177/0x17c [ 36.330393] Code: 09 f4 65 48 8b 1c 25 c8 b5 00 00 48 81 eb c8 1f 00 00 0f 09 66 66 90 66 90 31 c9 0f 1f 80 00 00 00 00 0f ae 3b 48 89 d8 48 89 ca <0f> 01 c8 0f ae f0 4c 89 e0 0f 01 c9 eb e9 66 0f 1f 84 00 00 00 [ 36.330393] RIP [<ffffffff8101cf19>] native_play_dead+0xd9/0xf0 [ 36.330393] RSP <ffff8800bbecfef8> [ 36.330393] ---[ end trace b72c6a73fe4299e3 ]--- [ 36.330393] Kernel panic - not syncing: Attempted to kill the idle task! [ 36.330393] Pid: 0, comm: kworker/0:1 Tainted: G D 3.0.0-rc1 #8 [ 36.330393] Call Trace: [ 36.330393] [<ffffffff81630721>] panic+0x8c/0x190 [ 36.330393] [<ffffffff8104d033>] do_exit+0x6a3/0x8a0 [ 36.330393] [<ffffffff81049544>] ? kmsg_dump+0x44/0x100 [ 36.330393] [<ffffffff81005633>] oops_end+0x83/0xb0 [ 36.330393] [<ffffffff81005766>] die+0x56/0x90 [ 36.330393] [<ffffffff81002bcc>] do_trap+0xbc/0x160 [ 36.330393] [<ffffffff810030c0>] do_invalid_op+0x90/0xb0 [ 36.330393] [<ffffffff8101cf19>] ? native_play_dead+0xd9/0xf0 [ 36.330393] [<ffffffff8103a97b>] ? __dequeue_entity+0x2b/0x50 [ 36.330393] [<ffffffff81024c48>] ? pvclock_clocksource_read+0x58/0xd0 [ 36.330393] [<ffffffff8126f0c5>] ? timerqueue_del+0x35/0x90 [ 36.330393] [<ffffffff8106d396>] ? __remove_hrtimer+0x36/0xa0 [ 36.330393] [<ffffffff81634d5b>] invalid_op+0x1b/0x20 [ 36.330393] [<ffffffff8101cf19>] ? native_play_dead+0xd9/0xf0 [ 36.330393] [<ffffffff8101cec5>] ? native_play_dead+0x85/0xf0 [ 36.330393] [<ffffffff810018ae>] cpu_idle+0x9e/0xb0 [ 36.330393] [<ffffffff8162c73b>] start_secondary+0x177/0x17c [ 36.308445] Kernel panic - not syncing: Attempted to kill the idle task! [ 36.308445] Pid: 0, comm: kworker/0:0 Tainted: G D 3.0.0-rc1 #8 [ 36.308445] Call Trace: [ 36.308445] [<ffffffff81630721>] panic+0x8c/0x190 [ 36.308445] [<ffffffff8104d033>] do_exit+0x6a3/0x8a0 [ 36.308445] [<ffffffff81049544>] ? kmsg_dump+0x44/0x100 [ 36.308445] [<ffffffff81005633>] oops_end+0x83/0xb0 [ 36.308445] [<ffffffff81005766>] die+0x56/0x90 [ 36.308445] [<ffffffff81002bcc>] do_trap+0xbc/0x160 [ 36.308445] [<ffffffff810030c0>] do_invalid_op+0x90/0xb0 [ 36.308445] [<ffffffff8101cf19>] ? native_play_dead+0xd9/0xf0 [ 36.308445] [<ffffffff81024c48>] ? pvclock_clocksource_read+0x58/0xd0 [ 36.308445] [<ffffffff8126f0c5>] ? timerqueue_del+0x35/0x90 [ 36.308445] [<ffffffff8106d396>] ? __remove_hrtimer+0x36/0xa0 [ 36.308445] [<ffffffff81634d5b>] invalid_op+0x1b/0x20 [ 36.308445] [<ffffffff8101cf19>] ? native_play_dead+0xd9/0xf0 [ 36.308445] [<ffffffff8101cec5>] ? native_play_dead+0x85/0xf0 [ 36.308445] [<ffffffff810018ae>] cpu_idle+0x9e/0xb0
On 05/30/2011 02:09 PM, bugzilla-daemon@bugzilla.kernel.org wrote: > [ 36.308445] [<ffffffff810018ae>] cpu_idle+0x9e/0xb0 > [ 36.308445] [<ffffffff8162c73b>] start_secondary+0x177/0x17c > [ 36.308445] Code: 09 f4 65 48 8b 1c 25 c8 b5 00 00 48 81 eb c8 1f 00 00 0f > 09 66 66 90 66 90 31 c9 0f 1f 80 00 00 00 00 0f ae 3b 48 89 d8 48 89 ca<0f> > 01 > c8 0f ae f0 4c 89 e0 0f 01 c9 eb e9 66 0f 1f 84 00 00 00 0f 01 c8 monitor %rax,%rcx,%rdx The guest thinks it has monitor/mwait, which kvm doesn't emulate. What's the guest's /proc/cpuinfo?
With 3.0.0-rc1: processor : 4 vendor_id : AuthenticAMD cpu family : 16 model : 2 model name : AMD Phenom(tm) 9550 Quad-Core Processor stepping : 3 cpu MHz : 3214.320 cache size : 512 KB fpu : yes fpu_exception : yes cpuid level : 5 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb lm rep_good nopl extd_apicid pni cx16 popcnt hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch npt bogomips : 6428.64 TLB size : 1024 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: With 2.6.39 (which shuts down properly) it is the same.
Bisection tells me this is the offending commit (see below), but I see you found it too. commit 349c004e3d31fda23ad225b61861be38047fff16 Author: Christoph Lameter <cl@linux.com> Date: Sat Mar 12 12:50:10 2011 +0100 x86: A fast way to check capabilities of the current cpu Add this_cpu_has() which determines if the current cpu has a certain ability using a segment prefix and a bit test operation. For that we need to add bit operations to x86s percpu.h. Many uses of cpu_has use a pointer passed to a function to determine the current flags. That is no longer necessary after this patch. However, this patch only converts the straightforward cases where cpu_has is used with this_cpu_ptr. The rest is work for later. -tj: Rolled up patch to add x86_ prefix and use percpu_read() instead of percpu_read_stable(). Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
A patch referencing this bug report has been merged in v3.0-rc2: commit 4f3c125c7420c85eaff627145557e392a871922d Author: Avi Kivity <avi@redhat.com> Date: Mon May 30 08:23:57 2011 -0400 x86: Fix mwait_play_dead() faulting on mwait-incapable cpus