Bug 12465 - KVM guests stalling on 2.6.28 (bisected)
Status: CLOSED UNREPRODUCIBLE
Product: Process Management
Classification: Unclassified
Component: Scheduler
Hardware: All
OS: Linux
Importance: P1 normal
Assigned To: Ingo Molnar
Blocks: 11808

Reported: 2009-01-17 03:37 UTC by Kevin Shanahan
Modified: 2012-03-13 23:43 UTC
CC List: 3 users

Kernel Version: 2.6.29
Tree: Mainline
Regression: Yes


Attachments
kernel config (46.83 KB, text/plain) - 2009-01-17 03:43 UTC, Kevin Shanahan
git bisect log (2.51 KB, text/plain) - 2009-01-17 03:43 UTC, Kevin Shanahan
Notes from tests (5.73 KB, text/plain) - 2009-01-17 03:45 UTC, Kevin Shanahan
tracing setup script (956 bytes, text/plain) - 2009-02-23 14:41 UTC, Kevin Shanahan
trace output, sample 1 (58 bytes, text/plain) - 2009-02-23 14:46 UTC, Kevin Shanahan
trace output, sample 2 (58 bytes, text/plain) - 2009-02-23 14:47 UTC, Kevin Shanahan
trace output, sample 3 (58 bytes, text/plain) - 2009-02-23 14:47 UTC, Kevin Shanahan
trace output, sample 4 (58 bytes, text/plain) - 2009-02-23 14:47 UTC, Kevin Shanahan
exit kvm_vcpu_block if KVM_REQ_PENDING_TIMER is set (360 bytes, application/octet-stream) - 2009-03-18 22:41 UTC, Marcelo Tosatti

Description Kevin Shanahan 2009-01-17 03:37:17 UTC
Latest working kernel version: 2.6.27.10
Earliest failing kernel version: 2.6.28
Distribution: Debian (Lenny, amd64)
Hardware Environment: IBM x3550 (2 x Xeon 5130, 32GB RAM)
Software Environment: ...

Problem Description:

kvm guests are stalling periodically for up to several seconds when running under a 2.6.28 host kernel.

Steps to reproduce:

Launch one or more kvm guests (much easier to trigger if several are running). The guest I tested with is a 32-bit Debian GNU/Linux guest, launched with the following command line:

/usr/local/kvm/bin/qemu-system-x86_64 \
    -smp 2 \
    -localtime -m 2048 \
    -hda kvm-17-1.img \
    -hdb kvm-17-tmp.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:67,model=rtl8139 \
    -net tap,vlan=0,ifname=tap17,script=no \
    -vnc 127.0.0.1:17 -usbdevice tablet \
    -daemonize

If using the system interactively, the stalls are pretty obvious when they happen. As a non-interactive test, I just pinged the guest from the host (the guest tap interface is bridged to one of the host's NICs). There can be up to several seconds latency, e.g.:

--- hermes-old.wumi.org.au ping statistics ---
600 packets transmitted, 600 received, 0% packet loss, time 599681ms
rtt min/avg/max/mdev = 0.256/328.184/10335.007/1024.143 ms, pipe 11

I bisected it down to this commit between 2.6.27 and 2.6.28:

14800984706bf6936bbec5187f736e928be5c218 is first bad commit
commit 14800984706bf6936bbec5187f736e928be5c218
Author: Mike Galbraith <efault@gmx.de>
Date:   Fri Nov 7 15:26:50 2008 +0100

    sched: fine-tune SD_MC_INIT
    
    Tune SD_MC_INIT the same way as SD_CPU_INIT:
    unset SD_BALANCE_NEWIDLE, and set SD_WAKE_BALANCE.
    
    This improves vmark by 5%:
    
    vmark         132102 125968 125497 messages/sec    avg 127855.66    .984
    vmark         139404 131719 131272 messages/sec    avg 134131.66   1.033
    
    Signed-off-by: Mike Galbraith <efault@gmx.de>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    
     # *DOCUMENTATION*

:040000 040000 33b285285748415ddbb5b3dc947a70d9e7cafc94 f70fac391f313b567179b4de24fb3808d057ff87 M	include

I haven't yet tried just reverting that commit, but I'll do that next. I'll attach my config and my test results.
Comment 1 Kevin Shanahan 2009-01-17 03:39:09 UTC
This is the bug I initially filed against kvm:

https://sourceforge.net/tracker/index.php?func=detail&aid=2494730&group_id=180599&atid=893831
Comment 2 Kevin Shanahan 2009-01-17 03:43:09 UTC
Created attachment 19851 [details]
kernel config
Comment 3 Kevin Shanahan 2009-01-17 03:43:42 UTC
Created attachment 19852 [details]
git bisect log
Comment 4 Kevin Shanahan 2009-01-17 03:45:00 UTC
Created attachment 19853 [details]
Notes from tests
Comment 5 Kevin Shanahan 2009-01-17 05:02:39 UTC
The tests look good with just that commit reverted from 2.6.28.

commit 80412b323514b9009e65ab32624c39b6baa50138
Author: Kevin Shanahan <kmshanah@flexo.wumi.org.au>
Date:   Sat Jan 17 22:19:27 2009 +1030

    Revert "sched: fine-tune SD_MC_INIT"
    
    This reverts commit 14800984706bf6936bbec5187f736e928be5c218.

=====

Linux 2.6.28-00001-g80412b3

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 899071ms
rtt min/avg/max/mdev = 0.116/0.270/24.916/0.912 ms

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 899190ms
rtt min/avg/max/mdev = 0.122/0.243/2.298/0.182 ms
Comment 6 Kevin Shanahan 2009-01-28 17:29:11 UTC
Copy for bugzilla

-------- Forwarded Message --------
From: Kevin Shanahan <kmshanah@ucwb.org.au>
To: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>, Andrew
Morton <akpm@linux-foundation.org>, a.p.zijlstra@chello.nl,
torvalds@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [git pull] scheduler fixes
Date: Sun, 18 Jan 2009 20:10:04 +1030

On Sun, 2009-01-18 at 10:59 +0200, Avi Kivity wrote:
> A simple test running 10 idle guests over 4 cores doesn't reproduce 
> (2.6.29-rc).  However we likely need a more complicated setup.
> 
> Kevin, any details about your setup?  What are the guests doing?  Are 
> you overcommitting memory (is the host swapping)?

Hi Avi,

No, I don't think there is any swapping. The host has 32GB RAM and only
~9GB of guest RAM has been requested in total. The host doesn't do
anything but run KVM.

I fear my test setup is quite complicated as it is a production system
and I haven't been able to reproduce a simpler case as yet. Anyway, here
are the guests I have running during the test:

Windows XP, 32-bit (UP) - There are twelve of these launched with
identical scripts, apart from differing amounts of RAM. Six with "-m
512", one with "-m 768" and five with "-m 256":

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -no-acpi -localtime -m 512 \
    -hda kvm-00-xp.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:56,model=rtl8139 \
    -net tap,vlan=0,ifname=tap0,script=no \
    -vnc 127.0.0.1:0 -usbdevice tablet \
    -daemonize

Windows 2003, 32-bit (UP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -no-acpi \
    -localtime -m 1024 \
    -hda kvm-09-2003-C.img \
    -hdb kvm-09-2003-E.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:5F,model=rtl8139 \
    -net tap,vlan=0,ifname=tap9,script=no \
    -vnc 127.0.0.1:9 -usbdevice tablet \
    -daemonize

Debian Lenny, 64-bit (MP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -smp 2 \
    -m 512 \
    -hda kvm-10-lenny64.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:60,model=rtl8139 \
    -net tap,vlan=0,ifname=tap10,script=no \
    -vnc 127.0.0.1:10 -usbdevice tablet \
    -daemonize

Debian Lenny, 32-bit (MP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -smp 2 \
    -m 512 \
    -hda kvm-15-etch-i386.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:65,model=rtl8139 \
    -net tap,vlan=0,ifname=tap15,script=no \
    -vnc 127.0.0.1:15 -usbdevice tablet \
    -daemonize

Debian Etch (using a kernel.org 2.6.27.10) 32-bit (MP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -smp 2 \
    -m 2048 \
    -hda kvm-17-1.img \
    -hdb kvm-17-tmp.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:67,model=rtl8139 \
    -net tap,vlan=0,ifname=tap17,script=no \
    -vnc 127.0.0.1:17 -usbdevice tablet \
    -daemonize

This last VM is the one I was pinging from the host.

I begin by launching the guests at 90 second intervals and then let them
settle for about 5 minutes. The final guest runs Apache with RT and a
postgresql database. I set my browser to refresh the main RT page every
2 minutes, to roughly simulate the common usage. With all the guests
basically idling and just the last guest servicing the RT requests, the
load average is about 0.25 or less. Then I just ping the last guest for
about 15 minutes.
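As an aside, the stall magnitude can be pulled straight out of ping's summary line for unattended runs; a small sketch (the summary string is the one quoted in this report, and the awk field split assumes the standard iputils "rtt min/avg/max/mdev = ..." format):

```shell
# Extract the worst-case RTT from a ping summary line, handy for
# flagging stall runs in a long unattended test, e.g.:
#   ping -c 900 hermes-old.wumi.org.au | tail -n1
# (hostname as used in this report; default 1s interval, ~15 minutes).
summary='rtt min/avg/max/mdev = 0.256/328.184/10335.007/1024.143 ms, pipe 11'
max_rtt=$(printf '%s\n' "$summary" | awk -F'[/ ]+' '{ print $9 }')
echo "max rtt: ${max_rtt} ms"    # max rtt: 10335.007 ms
```

Anything much above the sub-millisecond baseline reported for 2.6.27 would count as a stall here.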

One thing that may be of interest - a couple of times while running the
test I also had a terminal logged into the guest being pinged and
running "top". There was some crazy stuff showing up in there, with some
processes having e.g. 1200 in the "%CPU" column. I remember postmaster
(postgresql) being one of them. A quick scan over the top few tasks
might show the total "%CPU" adding up to 2000 or more. This tended to
happen at the same time I saw the latency spikes in ping as well. That
might just be an artifact of timing getting screwed up in "top", I
guess.

Regards,
Kevin.


Comment 7 Rafael J. Wysocki 2009-02-15 06:42:54 UTC
On Sunday 15 February 2009, Kevin Shanahan wrote:
> On Sat, 2009-02-14 at 21:50 +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.27 and 2.6.28.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.27 and 2.6.28.  Please verify if it still should
> > be listed and let me know (either way).
> 
> Yes, this should still be listed.
> 
> I just tested against 2.6.29-rc5 and the problem is as bad as ever
> (perhaps worse?)
> 
> --- hermes-old.wumi.org.au ping statistics ---
> 900 packets transmitted, 448 received, +317 errors, 50% packet loss, time 899845ms
> rtt min/avg/max/mdev = 0.131/420.015/10890.699/1297.022 ms, pipe 11
> 
> The guest being pinged crashed during the test - the QEMU monitor was
> accessible, but the guest didn't respond to "sendkey alt-sysrq-s", etc.
> This was the last thing in the guest syslog after reboot:
> 
> Feb 15 19:48:58 hermes-old kernel: ------------[ cut here ]------------
> Feb 15 19:48:58 hermes-old kernel: WARNING: at net/sched/sch_generic.c:219 dev_watchdog+0x111/0x195()
> Feb 15 19:48:58 hermes-old kernel: NETDEV WATCHDOG: eth0 (8139too): transmit timed out
> Feb 15 19:48:58 hermes-old kernel: Pid: 0, comm: swapper Not tainted 2.6.27.10 #1
> Feb 15 19:48:58 hermes-old kernel:  [<c011d75c>] warn_slowpath+0x5c/0x81
> Feb 15 19:48:58 hermes-old kernel:  [<c02f5f7c>] nf_hook_slow+0x44/0xb1
> Feb 15 19:48:58 hermes-old kernel:  [<c02d93f1>] dev_queue_xmit+0x3da/0x411
> Feb 15 19:48:58 hermes-old kernel:  [<c030043d>] ip_finish_output+0x1f9/0x231
> Feb 15 19:48:58 hermes-old kernel:  [<c01daeee>] __next_cpu+0x12/0x21
> Feb 15 19:48:58 hermes-old kernel:  [<c0116b42>] find_busiest_group+0x232/0x69f
> Feb 15 19:48:58 hermes-old kernel:  [<c01160dc>] update_curr+0x41/0x65
> Feb 15 19:48:58 hermes-old kernel:  [<c02e33b5>] dev_watchdog+0x111/0x195
> Feb 15 19:48:58 hermes-old kernel:  [<c011822f>] enqueue_task_fair+0x16/0x24
> Feb 15 19:48:58 hermes-old kernel:  [<c0115645>] enqueue_task+0xa/0x14
> Feb 15 19:48:58 hermes-old kernel:  [<c01156d5>] activate_task+0x16/0x1b
> Feb 15 19:48:58 hermes-old kernel:  [<c0119c8c>] try_to_wake_up+0x131/0x13a
> Feb 15 19:48:58 hermes-old kernel:  [<c02e32a4>] dev_watchdog+0x0/0x195
> Feb 15 19:48:58 hermes-old kernel:  [<c012424c>] run_timer_softirq+0xf5/0x14a
> Feb 15 19:48:58 hermes-old kernel:  [<c0120f60>] __do_softirq+0x5d/0xc1
> Feb 15 19:48:58 hermes-old kernel:  [<c0120ff6>] do_softirq+0x32/0x36
> Feb 15 19:48:58 hermes-old kernel:  [<c012112c>] irq_exit+0x35/0x40
> Feb 15 19:48:58 hermes-old kernel:  [<c010e8db>] smp_apic_timer_interrupt+0x6e/0x7b
> Feb 15 19:48:58 hermes-old kernel:  [<c01035ac>] apic_timer_interrupt+0x28/0x30
> Feb 15 19:48:58 hermes-old kernel:  [<c0107386>] default_idle+0x2a/0x3d
> Feb 15 19:48:58 hermes-old kernel:  [<c0101900>] cpu_idle+0x5c/0x84
> Feb 15 19:48:58 hermes-old kernel:  =======================
> Feb 15 19:48:58 hermes-old kernel: ---[ end trace eff10a8043ac4e7b ]---
> Feb 15 19:49:01 hermes-old kernel: eth0: Transmit timeout, status 0d 0000 c07f media d0.
> Feb 15 19:49:01 hermes-old kernel: eth0: Tx queue start entry 839  dirty entry 839.
> Feb 15 19:49:01 hermes-old kernel: eth0:  Tx descriptor 0 is 0008a03c.
> Feb 15 19:49:01 hermes-old kernel: eth0:  Tx descriptor 1 is 0008a062.
> Feb 15 19:49:01 hermes-old kernel: eth0:  Tx descriptor 2 is 0008a062.
> Feb 15 19:49:01 hermes-old kernel: eth0:  Tx descriptor 3 is 0008a05b. (queue head)
> Feb 15 19:49:01 hermes-old kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
> 
> I think I saw some patches to fix the latency tracer for non-RT tasks on
> the mailing list a while ago. If that's still going to be a useful test,
> can someone give me some hints on which kernel tree and/or patches to
> download to get that working? The simpler you can make it, the better ;)

Comment 8 Kevin Shanahan 2009-02-22 03:52:41 UTC
-------- Forwarded Message --------
From: Ingo Molnar <mingo@elte.hu>
To: Kevin Shanahan <kmshanah@ucwb.org.au>
Cc: Rafael J. Wysocki <rjw@sisk.pl>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Kernel Testers List <kernel-testers@vger.kernel.org>, Mike Galbraith <efault@gmx.de>, bugme-daemon@bugzilla.kernel.org, Steven Rostedt <rostedt@goodmis.org>, Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected) [Bug 12465]
Date: Sun, 15 Feb 2009 11:04:54 +0100

* Kevin Shanahan <kmshanah@ucwb.org.au> wrote:

> On Sat, 2009-02-14 at 21:50 +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.27 and 2.6.28.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.27 and 2.6.28.  Please verify if it still should
> > be listed and let me know (either way).
> 
> Yes, this should still be listed.
> 
> I just tested against 2.6.29-rc5 and the problem is as bad as ever
> (perhaps worse?)
> 
> --- hermes-old.wumi.org.au ping statistics ---
> 900 packets transmitted, 448 received, +317 errors, 50% packet loss, time 899845ms
> rtt min/avg/max/mdev = 0.131/420.015/10890.699/1297.022 ms, pipe 11

i looked at the trace you did earlier and which you uploaded to:

  http://disenchant.net/tmp/bug-12465/trace-1/

Here is one 3 seconds (!) latency:

 0)  qemu-sy-4237  |               |      kvm_vcpu_block() {
 0)  qemu-sy-4237  |               |        kvm_cpu_has_interrupt() {
 0)  qemu-sy-4237  |               |          kvm_apic_has_interrupt() {
 0)  qemu-sy-4237  |   0.291 us    |          }
 0)  qemu-sy-4237  |               |          kvm_apic_accept_pic_intr() {
 0)  qemu-sy-4237  |   0.291 us    |          }
 0)  qemu-sy-4237  |   1.476 us    |        }
 0)  qemu-sy-4237  |               |        kvm_cpu_has_pending_timer() {
 0)  qemu-sy-4237  |   0.325 us    |        }
 0)  qemu-sy-4237  |               |        kvm_arch_vcpu_runnable() {
 0)  qemu-sy-4237  |   0.288 us    |        }
 0)  qemu-sy-4237  |               |        kvm_arch_vcpu_put() {
 0)  qemu-sy-4237  |   0.415 us    |        }
 0)  qemu-sy-4237  |               |        schedule() {
 0)  qemu-sy-4237  |               |          wakeup_preempt_entity() {
 0)  qemu-sy-4237  |   0.300 us    |          }
 ------------------------------------------
 0)  qemu-sy-4237  =>   ksoftir-4   
 ------------------------------------------

 0)   ksoftir-4    | ! 3010470 us |  }
 ------------------------------------------
 0)   ksoftir-4    =>  qemu-sy-4355 
 ------------------------------------------

 0)  qemu-sy-4355  |   1.575 us    |          }
 0)  qemu-sy-4355  |   6.520 us    |        }
 0)  qemu-sy-4355  |   7.121 us    |      }
 0)  qemu-sy-4355  |               |      __wake_up() {
 0)  qemu-sy-4355  |               |        __wake_up_common() {
 0)  qemu-sy-4355  |               |          autoremove_wake_function() {
 0)  qemu-sy-4355  |               |            default_wake_function() {

qemu-sy-4237 has been scheduled away, and the system appeared to have done
nothing in the meantime. That's not something that really looks like a
scheduler regression - there is nothing the scheduler can do if KVM
decides to block a task.

It would be nice to enhance this single-CPU trace some more - to more
surgically see what is going on. Firstly, absolute timestamps would be
nice:

  echo funcgraph-abstime  > trace_options
  echo funcgraph-proc     > trace_options

as it's a bit hard to see the global timescale of events.

Secondly, not all events are included - in particular i dont really see
the points when packets are passed. Would it be possible to add a tracing
hypercall so that the guest kernel can inject trace events that can be seen
on the native-side trace? Regarding ping latencies really just two things
matter: the loopback network device's rx and tx path. We should trace the
outgoing sequence number and the incoming sequence number of IP packets,
and inject that to the host side. This way we can correlate the delays
precisely.

	Ingo
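Concretely, the option changes Ingo suggests can be wrapped up as below (a sketch: the debugfs path is the one Kevin uses later in this thread, and funcgraph-abstime requires a tracing tree recent enough to expose that option):

```shell
# Apply the function_graph options suggested above.  The directory
# argument is normally /sys/kernel/debug/tracing; passing it as a
# parameter keeps the sketch testable against a scratch directory.
setup_trace() {
    t=$1
    echo 0                 > "$t/tracing_enabled"   # stop while reconfiguring
    echo function_graph    > "$t/current_tracer"
    echo funcgraph-proc    > "$t/trace_options"     # per-entry task name/pid
    echo funcgraph-abstime > "$t/trace_options"     # absolute timestamps
    echo 1                 > "$t/tracing_enabled"   # re-enable tracing
}
```

Run as root, e.g. `setup_trace /sys/kernel/debug/tracing`; note that on the real tracing filesystem each write toggles one option rather than replacing the file's contents.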
Comment 9 Kevin Shanahan 2009-02-22 03:53:42 UTC
-------- Forwarded Message --------
From: Kevin Shanahan <kmshanah@ucwb.org.au>
To: Ingo Molnar <mingo@elte.hu>
Cc: Rafael J. Wysocki <rjw@sisk.pl>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Kernel Testers List <kernel-testers@vger.kernel.org>, Mike Galbraith <efault@gmx.de>, bugme-daemon@bugzilla.kernel.org, Steven Rostedt <rostedt@goodmis.org>, Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected) [bug 12465]
Date: Sun, 22 Feb 2009 21:09:19 +1030

On Sun, 2009-02-15 at 11:04 +0100, Ingo Molnar wrote:
> qemu-sy-4237 has been scheduled away, and the system appeared to have done
> nothing in the meantime. That's not something that really looks like a
> scheduler regression - there is nothing the scheduler can do if KVM
> decides to block a task.
> 
> It would be nice to enhance this single-CPU trace some more - to more
> surgically see what is going on. Firstly, absolute timestamps would be
> nice:
> 
>   echo funcgraph-abstime  > trace_options
>   echo funcgraph-proc     > trace_options
> 
> as it's a bit hard to see the global timescale of events.

I was going to try and grab the trace with absolute timestamps tonight,
but that option doesn't seem to be available in Linus' current kernel.

flexo:/sys/kernel/debug/tracing# echo 0 > tracing_enabled
flexo:/sys/kernel/debug/tracing# echo function_graph > current_tracer
flexo:/sys/kernel/debug/tracing# echo funcgraph-proc > trace_options
flexo:/sys/kernel/debug/tracing# echo funcgraph-abstime  > trace_options
-su: echo: write error: Invalid argument
flexo:/sys/kernel/debug/tracing# cat trace_options 
print-parent nosym-offset nosym-addr noverbose noraw nohex nobin noblock
nostacktrace nosched-tree ftrace_printk noftrace_preempt nobranch
annotate nouserstacktrace nosym-userobj noprintk-msg-only
nofuncgraph-overrun funcgraph-cpu funcgraph-overhead funcgraph-proc 
flexo:/sys/kernel/debug/tracing# uname -a
Linux flexo 2.6.29-rc5-00299-gadfafef #6 SMP Sun Feb 22 20:09:37 CST
2009 x86_64 GNU/Linux

What am I missing?

Cheers,
Kevin.
Comment 10 Ingo Molnar 2009-02-22 08:13:01 UTC
* bugme-daemon@bugzilla.kernel.org <bugme-daemon@bugzilla.kernel.org> wrote:

> I was going to try and grab the trace with absolute timestamps 
> tonight, but that option doesn't seem to be available in 
> Linus' current kernel.

you need latest -tip:

  http://people.redhat.com/mingo/tip.git/README

which includes the latest tracing tree as well.

(-tip has no KVM changes so the issue should still be present in 
it.)

	Ingo

Comment 11 Kevin Shanahan 2009-02-23 14:39:52 UTC
The bugme-daemon email gateway still hates me.
Emailed response pasted below.

The traces I uploaded were:
  trace5.txt.bz2
  trace6.txt.bz2
  trace7.txt.bz2
  trace8.txt.bz2

I'll attach them here if bugzilla will accept the file sizes.

-------- Forwarded Message --------
From: Kevin Shanahan <kmshanah@ucwb.org.au>
To: Ingo Molnar <mingo@elte.hu>
Cc: Rafael J. Wysocki <rjw@sisk.pl>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Kernel Testers List <kernel-testers@vger.kernel.org>, Mike Galbraith <efault@gmx.de>, bugme-daemon@bugzilla.kernel.org, Steven Rostedt <rostedt@goodmis.org>, Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected) [Bug 12465]
Date: Mon, 23 Feb 2009 22:08:58 +1030

On Sun, 2009-02-15 at 11:04 +0100, Ingo Molnar wrote:
> It would be nice to enhance this single-CPU trace some more - to more
> surgically see what is going on. Firstly, absolute timestamps would be
> nice:
> 
>   echo funcgraph-abstime  > trace_options
>   echo funcgraph-proc     > trace_options
> 
> as it's a bit hard to see the global timescale of events.

Okay, here's some more trace data. I grabbed a few samples at different
times during the ping test. I think the data in files trace6.txt and
trace8.txt coincided with some of the biggest delays.

  http://disenchant.net/tmp/bug-12465/trace-2/

This is captured on 2.6.29-rc5-tip-02057-gaad11ad. The kvm guest being
pinged is process 11211:

  flexo:~# pstree -p 11211
  qemu-system-x86(11211)─┬─{qemu-system-x86}(11212)
                         ├─{qemu-system-x86}(11213)
                         └─{qemu-system-x86}(11609)

Cheers,
Kevin.

> Secondly, not all events are included - in particular i dont really see
> the points when packets are passed. Would it be possible to add a tracing
> hypercall so that the guest kernel can inject trace events that can be seen
> on the native-side trace? Regarding ping latencies really just two things
> matter: the loopback network device's rx and tx path. We should trace the
> outgoing sequence number and the incoming sequence number of IP packets,
> and inject that to the host side. This way we can correlate the delays
> precisely.
> 
> 	Ingo
Comment 12 Kevin Shanahan 2009-02-23 14:41:57 UTC
Created attachment 20336 [details]
tracing setup script
Comment 13 Kevin Shanahan 2009-02-23 14:46:31 UTC
Created attachment 20337 [details]
trace output, sample 1
Comment 14 Kevin Shanahan 2009-02-23 14:47:07 UTC
Created attachment 20338 [details]
trace output, sample 2
Comment 15 Kevin Shanahan 2009-02-23 14:47:21 UTC
Created attachment 20339 [details]
trace output, sample 3
Comment 16 Kevin Shanahan 2009-02-23 14:47:33 UTC
Created attachment 20340 [details]
trace output, sample 4
Comment 17 Rafael J. Wysocki 2009-02-23 17:38:06 UTC
On Tuesday 24 February 2009, Kevin Shanahan wrote:
> On Mon, 2009-02-23 at 23:03 +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.27 and 2.6.28.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.27 and 2.6.28.  Please verify if it still should
> > be listed and let me know (either way).
> 
> Yes, the problem should still be listed.
> The bug is still present as recently as 2.6.29-rc5-00299-gadfafef.

Comment 18 Rafael J. Wysocki 2009-02-25 14:52:27 UTC
Handled-By : Avi Kivity <avi@redhat.com>
Comment 19 Kevin Shanahan 2009-03-05 16:07:19 UTC
On Thu, 2009-03-05 at 08:29 +0100, Mike Galbraith wrote:
> On Wed, 2009-03-04 at 13:38 +1030, Kevin Shanahan wrote:
> > The traces are there waiting to be looked at. If there's anything else I
> > can do to help things along, please let me know.
> 
> I've been staring at them.  They're loaded with kipmi0 sequences as
> below.  Having zero success at parsing.. 77ms __schedule()?  Not.
> 
>  5812.647954 |   0)   kipmi0-501   |   0.357 us    |          perf_counter_task_sched_in();
>  5812.647954 |   0)   kipmi0-501   | ! 3902.489 us |        }
>  5812.647955 |   0)   kipmi0-501   | ! 3903.102 us |      }
>  5812.647955 |   0)   kipmi0-501   | ! 3903.928 us |    }
>  5812.647955 |   0)   kipmi0-501   | ! 3904.572 us |  }
>  5812.647957 |   0)   kipmi0-501   |               |  schedule_timeout_interruptible() {
>  5812.647958 |   0)   kipmi0-501   |               |    schedule_timeout() {
>  5812.647958 |   0)   kipmi0-501   |               |      schedule() {
>  5812.647958 |   0)   kipmi0-501   |               |        __schedule() {
>  5812.647959 |   0)   kipmi0-501   |               |          perf_counter_task_sched_out() {
>  5812.647960 |   0)   kipmi0-501   |   0.325 us    |            __perf_counter_sched_out();
>  5812.647960 |   0)   kipmi0-501   |   0.956 us    |          }
>  5812.647961 |   0) qemu-sy-11119  |   0.343 us    |                perf_counter_task_sched_in();
>  5812.647962 |   0) qemu-sy-11119  |               |                kvm_sched_in() {
>  5812.647962 |   0) qemu-sy-11119  |               |                  kvm_arch_vcpu_load() {
>  5812.647963 |   0) qemu-sy-11119  |   0.313 us    |                    kvm_write_guest_time();
>  5812.647963 |   0) qemu-sy-11119  |   1.203 us    |                  }
>  5812.647964 |   0) qemu-sy-11119  |   1.809 us    |                }
>  5812.647964 |   0) qemu-sy-11119  | ! 77180.30 us |              }
>  5812.647964 |   0) qemu-sy-11119  | ! 77180.90 us |            }
>  5812.647965 |   0) qemu-sy-11119  | ! 77181.51 us |          }
>  5812.647965 |   0) qemu-sy-11119  | ! 77182.12 us |        }
>  5812.647965 |   0) qemu-sy-11119  | ! 77182.71 us |      }
>  5812.647966 |   0) qemu-sy-11119  |               |      kvm_inject_pending_timer_irqs() {
>  5812.647966 |   0) qemu-sy-11119  |   0.427 us    |        kvm_inject_apic_timer_irqs();
>  5812.647967 |   0) qemu-sy-11119  |               |        kvm_inject_pit_timer_irqs() {
>  5812.647967 |   0) qemu-sy-11119  |               |          kvm_set_irq() {
>  5812.647968 |   0) qemu-sy-11119  |   0.460 us    |            kvm_ioapic_set_irq();
>  5812.647969 |   0) qemu-sy-11119  |               |            kvm_pic_set_irq() {
>  5812.647969 |   0) qemu-sy-11119  |   0.312 us    |              kvm_vcpu_kick();
>  5812.647970 |   0) qemu-sy-11119  |   1.145 us    |            }
>  5812.647970 |   0) qemu-sy-11119  |   2.655 us    |          }
>  5812.647970 |   0) qemu-sy-11119  |               |          kvm_set_irq() {
>  5812.647971 |   0) qemu-sy-11119  |   0.315 us    |            kvm_ioapic_set_irq();
>  5812.647971 |   0) qemu-sy-11119  |   0.478 us    |            kvm_pic_set_irq();
>  5812.647972 |   0) qemu-sy-11119  |   1.719 us    |          }
>  5812.647972 |   0) qemu-sy-11119  |   5.465 us    |        }
>  5812.647973 |   0) qemu-sy-11119  |   6.797 us    |      }


On Thu, 2009-03-05 at 10:45 +0100, Mike Galbraith wrote:
> On Thu, 2009-03-05 at 08:29 +0100, Mike Galbraith wrote:
> > I've been staring at them.  They're loaded with kipmi0 sequences as
> > below.  Having zero success at parsing.. 77ms __schedule()?  Not.
> 
> >  5812.647964 |   0) qemu-sy-11119  | ! 77180.30 us |              }
> 
> Hrmph, despite not being able to determine where such scary numbers come
> from, timestamp-wise it still looks to me like the scheduler is doing
> its thing properly, and whatever is going wonky is in KVM.
> 
> Not very useful I know, but thought I should let you know that someone
> is pondering those traces.
> 
> 	-Mike
Comment 20 Avi Kivity 2009-03-15 02:17:13 UTC
I'm afraid I can't really read the traces.  Can someone who can decode the format comment?
Comment 21 Rafael J. Wysocki 2009-03-15 03:30:06 UTC
References : http://lkml.org/lkml/2009/3/15/51
Comment 22 Marcelo Tosatti 2009-03-17 16:30:59 UTC
Unless I'm misinterpreting the trace, the following seems to indicate KVM 
is not at fault (correct me otherwise).

From trace6.txt from http://disenchant.net/tmp/bug-12465/trace-2/. I've extracted 
the relevant part into http://people.redhat.com/~mtosatti/chunk-of-trace6.txt.

You can see qemu being scheduled out, cat being scheduled in, and ksoftirq being scheduled in. Only after 3 seconds is qemu-kvm scheduled back in again.

Note that KVM _is not_ emulating halt for the guest (which waits for events in 
kvm_vcpu_block).

The state of each vcpu is as follows:

qemu-sy-11194: kvm_vcpu_block
qemu-sy-11133: kvm_vcpu_block
qemu-sy-11134: kvm_vcpu_block

qemu-sy-11213: not blocked on vcpu block, scheduled out by cat. _Could_ be waiting on IO, but that should be visible in the trace?

qemu-sy-11118: poll_schedule_timeout, the IO thread.


Halt emulation can be seen in http://people.redhat.com/~mtosatti/chunk-of-trace6-good.txt.

Kevin, is the guest using periodic ticks? /proc/timer_list output from the guest would help. Have you attempted to use -no-kvm-irqchip?

Comment 23 Kevin Shanahan 2009-03-17 16:59:12 UTC
Periodic ticks, yes I think so - you mean this info?

$ grep HZ hermesold.0 
# CONFIG_NO_HZ is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_300=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=300

kmshanah@hermes-old:~$ cat /proc/timer_list 
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 124344651194856 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1237209293010817006 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6631eec>, tick_sched_timer, S:01
 # expires at 124344650898867 nsecs [in -295989 nsecs]
 #1: <f6631eec>, it_real_fn, S:01
 # expires at 124369186053044 nsecs [in 24534858188 nsecs]
 #2: <f6631eec>, hrtimer_wakeup, S:01
 # expires at 124388537770723 nsecs [in 43886575867 nsecs]
 #3: <f6631eec>, it_real_fn, S:01
 # expires at 149486451698113 nsecs [in 25141800503257 nsecs]
  .expires_next   : 124344650898867 nsecs
  .hres_active    : 1
  .nr_events      : 24547649
  .nohz_mode      : 0
  .idle_tick      : 0 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 0
  .idle_calls     : 0
  .idle_sleeps    : 0
  .idle_entrytime : 0 nsecs
  .idle_waketime  : 0 nsecs
  .idle_exittime  : 0 nsecs
  .idle_sleeptime : 0 nsecs
  .last_jiffies   : 0
  .next_jiffies   : 0
  .idle_expires   : 0 nsecs
jiffies: 37213398

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1237209293010817006 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6631eec>, tick_sched_timer, S:01
 # expires at 124344651107200 nsecs [in -87656 nsecs]
 #1: <f6631eec>, sched_rt_period_timer, S:01
 # expires at 124345000000000 nsecs [in 348805144 nsecs]
 #2: <f6631eec>, it_real_fn, S:01
 # expires at 124372954500821 nsecs [in 28303305965 nsecs]
 #3: <f6631eec>, it_real_fn, S:01
 # expires at 124489098233043 nsecs [in 144447038187 nsecs]
 #4: <f6631eec>, it_real_fn, S:01
 # expires at 124635106565989 nsecs [in 290455371133 nsecs]
 #5: <f6631eec>, it_real_fn, S:01
 # expires at 124635106671613 nsecs [in 290455476757 nsecs]
 #6: <f6631eec>, hrtimer_wakeup, S:01
 # expires at 126159569385803 nsecs [in 1814918190947 nsecs]
  .expires_next   : 124344651107200 nsecs
  .hres_active    : 1
  .nr_events      : 24416849
  .nohz_mode      : 0
  .idle_tick      : 0 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 0
  .idle_calls     : 0
  .idle_sleeps    : 0
  .idle_entrytime : 0 nsecs
  .idle_waketime  : 0 nsecs
  .idle_exittime  : 0 nsecs
  .idle_sleeptime : 0 nsecs
  .last_jiffies   : 0
  .next_jiffies   : 0
  .idle_expires   : 0 nsecs
jiffies: 37213398


Tick Device: mode:     1
Clock Event Device: pit
 max_delta_ns:   27461866
 min_delta_ns:   12571
 mult:           5124677
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: pit_next_event
 set_mode:       init_pit_timer
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   134208338
 min_delta_ns:   1000
 mult:           268454204
 shift:          32
 mode:           3
 next_event:     124344650898867 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   134208338
 min_delta_ns:   1000
 mult:           268454204
 shift:          32
 mode:           3
 next_event:     124344651107200 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt
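
The dumps above are in the format of /proc/timer_list. For anyone trying to reproduce this, a minimal way to snapshot that file alongside a wall-clock timestamp while a stall occurs is sketched below (an illustration only; the file may be unreadable to non-root users on some kernels, hence the error suppression):

```shell
# Append a timestamp followed by the current timer state, three times,
# so per-CPU hrtimer expiries can be compared across samples.
LOG=/tmp/timer_list.log
: > "$LOG"
for i in 1 2 3; do
    date +%s.%N >> "$LOG"
    cat /proc/timer_list >> "$LOG" 2>/dev/null || true
    sleep 1
done
```

Diffing consecutive samples shows whether a timer's expiry keeps slipping relative to real time.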

As for -no-kvm-irqchip, yes, I tried it, but it didn't obviously solve the problem. I have an email around here somewhere:

-------- Forwarded Message --------
From: Kevin Shanahan <kmshanah@ucwb.org.au>
To: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, Rafael J. Wysocki <rjw@sisk.pl>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Kernel Testers List <kernel-testers@vger.kernel.org>, Mike Galbraith <efault@gmx.de>, bugme-daemon@bugzilla.kernel.org, Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)
Date: Wed, 21 Jan 2009 04:24:44 +1030

On Tue, 2009-01-20 at 15:04 +0200, Avi Kivity wrote:
> Kevin Shanahan wrote:
> > On Tue, 2009-01-20 at 12:35 +0100, Ingo Molnar wrote:
> >> This only seems to occur under KVM, right? I.e. you tested it with -no-kvm 
> >> and the problem went away, correct?
> >>     
> >
> > Well, I couldn't make the test conditions identical, but the
> > problem didn't occur with the test I was able to do:
> >
> >   http://marc.info/?l=linux-kernel&m=123228728416498&w=2
> >
> >   
> 
> Can you also try with -no-kvm-irqchip?
> 
> You will need to comment out the lines
> 
>     /* ISA IRQs map to GSI 1-1 except for IRQ0 which maps
>      * to GSI 2.  GSI maps to ioapic 1-1.  This is not
>      * the cleanest way of doing it but it should work. */
> 
>     if (vector == 0)
>         vector = 2;
> 
> in qemu/hw/apic.c (should also fix -no-kvm smp).  This will change kvm 
> wakeups to use signals rather than the in-kernel code, which may be buggy.

Okay, I commented out those lines and compiled a new kvm-82 userspace
and kernel modules, then ran those on a vanilla 2.6.28 kernel with all
guests started with -no-kvm-irqchip added.

As before, a number of the XP guests chugged away at 100% CPU
usage for a long time. Three of the guests clocked up ~40 minutes of CPU
time before I decided to just shut them down. Perhaps coincidentally,
these three are the only guests with Office 2003 installed on them.
That could be the difference between them and the other XP guests,
but it's probably not important for now.

The two Linux SMP guests booted okay this time, though they seem to only
use one CPU on the host (I guess kvm is not multi-threaded in this
mode?). "hermes-old", the guest I am pinging in all my tests, had a lot
of trouble running the apache2 setup - it was so slow it was difficult
to load a complete page from our RT system. The kvm process for this
guest was constantly taking up 100% CPU on the host, and all sorts of
weird stuff could be seen by watching top in the guest:

top - 03:44:17 up 43 min,  1 user,  load average: 3.95, 1.55, 0.80
Tasks: 101 total,   4 running,  97 sleeping,   0 stopped,   0 zombie
Cpu(s): 79.7%us, 10.4%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  9.9%si,
0.0%st
Mem:   2075428k total,   391128k used,  1684300k free,    13044k buffers
Swap:  3502160k total,        0k used,  3502160k free,   118488k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND             
 2956 postgres  20   0 19704  11m  10m S 1658  0.6   2:55.99 postmaster          
 2934 www-data  20   0 60392  40m 5132 R   31  2.0   0:17.28 apache2             
 2958 postgres  20   0 19700  11m 9.8m R   28  0.6   0:20.41 postmaster          
 2940 www-data  20   0 58652  38m 5016 S   27  1.9   0:04.87 apache2             
 2937 www-data  20   0 60124  40m 5112 S   18  2.0   0:11.00 apache2             
 2959 postgres  20   0 19132 5424 4132 S   10  0.3   0:01.50 postmaster          
 2072 postgres  20   0  8064 1416  548 S    7  0.1   0:23.71 postmaster          
 2960 postgres  20   0 19132 5368 4060 R    6  0.3   0:01.55 postmaster          
 2071 postgres  20   0  8560 1972  488 S    5  0.1   0:08.33 postmaster    

Running the ping test without apache2 running in the guest:

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 902740ms
rtt min/avg/max/mdev = 0.568/3.745/272.558/16.990 ms

And with apache2 running:

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 902758ms
rtt min/avg/max/mdev = 0.625/25.634/852.739/76.586 ms

In both cases it's quite variable, but the max latency is still not as
bad as when running with the irq chip enabled.
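
As an aside, comparisons like the one above can be automated by extracting the max field from ping's "rtt min/avg/max/mdev" summary and checking it against a threshold. A small sketch (the flag_stall helper and the 500 ms threshold are illustrative, not part of the original test):

```shell
# flag_stall SUMMARY THRESHOLD_MS
# Extract the "max" value from a ping rtt summary line and succeed
# (exit 0) when it exceeds the given threshold in milliseconds.
flag_stall() {
    max=$(printf '%s\n' "$1" | sed -n 's|.* = [^/]*/[^/]*/\([^/]*\)/.*|\1|p')
    awk -v m="$max" -v t="$2" 'BEGIN { exit !(m > t) }'
}

# Example against the "with apache2" summary above:
flag_stall "rtt min/avg/max/mdev = 0.625/25.634/852.739/76.586 ms" 500 \
    && echo "stall detected"    # prints "stall detected"
```

Wrapped in a loop, this is enough to keep only the ping runs that exhibit the stall.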

Anyway, the test is again not ideal, but I hope we're proving something.
That's all I can do for tonight - should be ready for more again
tomorrow night.

Regards,
Kevin.
Comment 24 Marcelo Tosatti 2009-03-18 22:41:32 UTC
Created attachment 20591 [details]
exit kvm_vcpu_block if KVM_REQ_PENDING_TIMER is set
Comment 25 Marcelo Tosatti 2009-03-18 22:41:54 UTC
Kevin,

Can you please try the attached patch (test-pending-timer-bit.patch) on top of
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git?

Thanks
Comment 26 Marcelo Tosatti 2009-03-18 22:46:09 UTC
Also try -no-kvm-pit, and see if there is a difference in maximum latency (that option will not affect performance as much as -no-kvm-irqchip).

Comment 27 Kevin Shanahan 2009-03-19 06:57:31 UTC
(In reply to comment #25)
> Can you please try the attached patch (test-pending-timer-bit.patch) on top of
> git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git.

Okay, I compiled and tested 2.6.29-rc8-02003-g4dcf262 + your patch. Unfortunately, it doesn't look like this is the fix.

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 899280ms
rtt min/avg/max/mdev = 0.122/497.769/15491.621/1624.426 ms, pipe 16

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 899503ms
rtt min/avg/max/mdev = 0.132/865.812/16075.744/2240.968 ms, pipe 17
Comment 28 Kevin Shanahan 2009-03-19 08:35:06 UTC
(In reply to comment #26)
> Also try -no-kvm-pit, and see if there is a difference in maximum latency (that
> option will not affect performance as much as -no-kvm-irqchip).

Same kernel as above (2.6.29-rc8-02003-g4dcf262 + patch), but with all guests using -no-kvm-pit:

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 899207ms
rtt min/avg/max/mdev = 0.118/51.113/4071.994/323.490 ms, pipe 5

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 900201ms
rtt min/avg/max/mdev = 0.126/535.958/14692.164/1669.703 ms, pipe 15
Comment 29 Kevin Shanahan 2009-03-20 22:02:39 UTC
-------- Forwarded Message --------
From: Kevin Shanahan <kmshanah@ucwb.org.au>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Avi Kivity <avi@redhat.com>, Rafael J. Wysocki <rjw@sisk.pl>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Kernel Testers List <kernel-testers@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>, Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)
Date: Sat, 21 Mar 2009 15:30:42 +1030

On Thu, 2009-03-19 at 07:54 +1030, Kevin Shanahan wrote:
> On Wed, 2009-03-18 at 11:46 +1030, Kevin Shanahan wrote:
> > On Wed, 2009-03-18 at 01:20 +0100, Frederic Weisbecker wrote:
> > > Ok, I've made a small script based on yours which could do this job.
> > > You will just have to set a latency threshold
> > > that you consider buggy. I don't remember the latency you observed.
> > > About 5 secs, right?
> > > 
> > > It's the "thres" variable in the script.
> > > 
> > > The resulting trace should be a mixup of the function graph traces
> > > and scheduler events which look like this:
> > > 
> > >  gnome-screensav-4691  [000]  6716.774277:   4691:120:S ==> [000]     0:140:R <idle>
> > >   xfce4-terminal-4723  [001]  6716.774303:   4723:120:R   + [001]  4289:120:S Xorg
> > >   xfce4-terminal-4723  [001]  6716.774417:   4723:120:S ==> [001]  4289:120:R Xorg
> > >             Xorg-4289  [001]  6716.774427:   4289:120:S ==> [001]     0:140:R <idle>
> > > 
> > > + is a wakeup and ==> is a context switch.
> > > 
> > > The script will loop trying some pings and will only keep the trace that matches
> > > the latency threshold you defined.
> > > 
> > > Tell me if the following script works for you.
> 
> ...
> 
> > Either way, I'll try to get some results in my maintenance window
> > tonight.
> 
> Testing did not go so well. I compiled and booted
> 2.6.29-rc8-tip-02630-g93c4989, but had some problems with the system
> load when I tried to start tracing - it shot up to around 16-20 or so. I
> started shutting down VMs to try and get it under control, but before I
> got back to tracing again the machine disappeared off the network -
> unresponsive to ping.
> 
> When I got in this morning, there was nothing on the console, nothing in
> the logs to show what went wrong. I will try again, but my next chance
> will probably be Saturday. Stay tuned.

Okay, new set of traces have been uploaded to:

  http://disenchant.net/tmp/bug-12465/trace-3/

These were done on the latest tip, which I pulled down this morning:
2.6.29-rc8-tip-02744-gd9937cb.

The system load was very high again when I first tried to trace with
several guests running, so I ended up having only the one guest running,
and thankfully the bug was still reproducible that way.

Fingers crossed this set of traces is able to tell us something.

Regards,
Kevin.
Comment 30 Kevin Shanahan 2009-03-26 20:26:19 UTC
-------- Forwarded Message --------
From: Kevin Shanahan <kmshanah@ucwb.org.au>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Avi Kivity <avi@redhat.com>, Rafael J. Wysocki <rjw@sisk.pl>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Kernel Testers List <kernel-testers@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>, Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)
Date: Fri, 27 Mar 2009 06:53:05 +1030

On Tue, 2009-03-24 at 12:44 +0100, Frederic Weisbecker wrote:
> As I explained in my previous mail, your trace is only
> a snapshot covering about 10 msec.
> 
> I experimented with different sizes for the ring buffer, but even
> a 1-second trace requires about 20 MB of memory, and such a huge
> trace would be impractical.
> 
> I think we should keep the trace filters we had previously.
> If you don't mind, could you please retest the following updated
> patch against the latest -tip? I added the filters, fixed the python
> subshell, and also flushed the buffer more cleanly using a recent
> feature in -tip:
> 
> echo > trace 
> 
> instead of switching to nop.
> You will need to pull latest -tip again.

Ok, new set of traces uploaded again here:

  http://disenchant.net/tmp/bug-12465/trace-4/

These were taken using 2.6.29-tip-02749-g398bf09.

Same as last time, it was only necessary to have the one guest running
to reproduce the problem.

Cheers,
Kevin.
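
The workflow Frederic describes above (flush the ftrace buffer with "echo > trace", run a ping burst, and keep the trace only when the latency threshold is crossed) can be roughly sketched as follows. The function name, argument handling, and paths are my own, assuming debugfs is mounted at /sys/kernel/debug and the tracer/filters are already configured:

```shell
# capture_if_stalled TRACEDIR TARGET THRES_MS OUTFILE
# Flush the ftrace ring buffer, ping the guest, and copy the trace out
# only when the run's max rtt exceeded the threshold (in ms).
capture_if_stalled() {
    tracedir="$1"; target="$2"; thres="$3"; out="$4"
    echo > "$tracedir/trace"   # flush the ring buffer ("echo > trace")
    max=$(ping -c 10 "$target" 2>/dev/null |
          sed -n 's|.* = [^/]*/[^/]*/\([^/]*\)/.*|\1|p')
    [ -n "$max" ] || return 1  # ping failed or produced no summary
    if awk -v m="$max" -v t="$thres" 'BEGIN { exit !(m > t) }'; then
        cp "$tracedir/trace" "$out"   # keep only the "buggy" trace
    else
        return 1
    fi
}
```

In real use, TRACEDIR would be /sys/kernel/debug/tracing and this would run in a loop until a stall is caught, matching the behaviour of the script Frederic posted.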
Comment 31 Florian Mickler 2012-03-13 22:55:32 UTC
Is this still a problem on newer kernels?
Comment 32 Kevin Shanahan 2012-03-13 23:00:19 UTC
No longer a problem. Currently using Linux 3.2.9 and qemu-kvm 1.0.
