Bug 41442

Summary: rcu_sched_state detected stall on CPU 0, when booting on Xen
Product: Virtualization Reporter: Witold Baryluk (witold.baryluk+kernel)
Component: XenAssignee: Konrad Rzeszutek Wilk (konrad.wilk)
Status: CLOSED UNREPRODUCIBLE    
Severity: normal CC: florian, konrad.wilk, maciej.rutecki, paulmck, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.0.0+ Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 40982    
Attachments: Prevent boot-time RCU grace periods

Description Witold Baryluk 2011-08-21 04:06:24 UTC
i386, 32 bits, Debian sid/unstable. Pentium-m (IBM ThinkPad T43). Using Debian's xen-4.1-i386.gz as hypervisor and booting using grub-pc 1.99-11.

This happens for me both on Debian shipped 3.0.0-1-686-pae, and my custom kernel 3.1.0-rc2+. Booting both kernels on bare metal make it work.

Stall happens early (1.9 s after kernel decompression) in boot/init process, and prevents from doing anything useful. It is like this

  1.900000 PCI: Using ACPI for IRQ routing
 61.900000 INFO: rcu_sched_state detected stall on CPU 0, when booting on Xen (t=6000 jiffies)
242.900000 INFO: rcu_sched_state detected stall on CPU 0, when booting on Xen (t=6000 jiffies)
422.900000 INFO: rcu_sched_state detected stall on CPU 0, when booting on Xen (t=6000 jiffies)

On 3.0.0-1-686-pae which have slightly more debuging I have a more verbose call trace, however I'm not able to have it all (it is probably displaying call trace for 2 kernel threads - I cannot scroll console, and do not have phisical serial console available).

It randomly displays one of three this stack traces:


(
xen_force_evtchn_callback+0xc/0x10
check_events+0x8/0xc
xen_restore_fl_direct_reloc+0x4/0x4
arch_locale_irq_restore+0x6/0x7
kmem_cache_alloc_trace+0x38/
__reserver_region_with_split+0x28/0xd1
|
xen_force_evtchn_callback+0xc/0x10
check_events+0x8/0xc
xen_restore_fl_direct_reloc+0x4/0x4
arch_locale_irq_restore+0x6/0x7
__reserver_region_with_split+0x6a/0xd1
)?
reserve_region_with_split+0x38/0x44
e820_reserver_resources_late+0xeb/0x10d
pcibios_allocate_bus_resources+0x83/0x92
pci_legacy_init+0x39/0x39
pcibios_resource_survey+0x20/0x25
pcibios_init+0x20/0x31
pci_subsys_init+0x3c/0x40
do_one_initcall+0x68/0x110
kernel_init+0xa4/0x11b
start_kernel+0x33f/0x33f
kernel_thread_helper+0x6/0x10
Comment 1 Witold Baryluk 2011-08-21 04:08:51 UTC
Should be t=6000, t=24030, t=42060. On Debian shiped kernel, I have call trace dumps every 60 seconds. I do not understand why on my custom optimized kernel, I have this INFO not every 60 (I still have RCU_CPU_STALL_TIMEOUT=60 in kernel config).
Comment 2 Witold Baryluk 2011-08-21 04:14:11 UTC
Looks that this topic have been seen on linux-kernel o xen-devel mailing list few weeks ago.

http://lists.xensource.com/archives/html/xen-devel/2011-07/msg00331.html
http://osdir.com/ml/linux-kernel/2011-07/msg00036.html
Comment 3 Witold Baryluk 2011-08-21 04:24:52 UTC
Few interesting things. System is uniprocess. But both kernels are built with SMP support (to have DOM0 support). Also I have disabled CONFIG_RCU_FAST_NO_HZ, however have CONFIG_NO_HZ=1. I also boot with "threadirqs" kernel boot option.
Comment 4 Paul E. McKenney 2011-08-21 04:47:05 UTC
Created attachment 69512 [details]
Prevent boot-time RCU grace periods

Is this the guest OS getting the stalls?  If so, is the guest running on a single emulated CPU or more than one?

Either way, does this patch help?
Comment 5 Witold Baryluk 2011-08-21 06:47:13 UTC
No, this is DOM0 running on single CPU. (evne if guest, it couldn't be more than 1, as I have only one core in system).

Help doesn't helped. :(
Comment 6 Witold Baryluk 2011-08-21 06:48:34 UTC
Oops. I wanted to say that patch doesn't helped with my problem. I still have RCU stall, and no progress in booting.
Comment 7 Witold Baryluk 2011-08-22 03:11:02 UTC
Any other idea? Is this is happening to other people, or is this just me? If so why?

This bug have somehow high priority for me, as it completely prevents me from booting system on new kernels.

Thanks.
Comment 8 Paul E. McKenney 2011-08-22 04:17:32 UTC
There are a couple of additional things to try:

1. Try disabling the rcu_cpu_stall_suppress module parameter at boot time via the kernel boot arguments.  If your kernel boots just fine, then we have a problem with RCU (see #2 below), otherwise RCU is correctly pointing us at a problem (see #3 below).

2. Pull the rcu/next commit series from the -rcu git tree at git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git -- I gave you the patch that seemed to me to be the most likely to address the problem if RCU is at fault, but perhaps one of the remaining patches will help.

3. Examine the code called from __reserver_region_with_split() and see if there is an infinite loop.  This sort of thing is what the RCU stall warnings were designed to detect.  See Documentation/RCU/stallwarn.txt for more information.  Sorry I cannot help more here, as I am not a Xen expert.
Comment 9 Witold Baryluk 2011-08-24 17:11:31 UTC
I was trying booting with rcu_cpu_stall_suppress=1 and rcu_cpu_stall_suppress=0, but non helps when booting on Xen. I did not wait, so do not know if message is printed, but kernel looked to be stalled anyway.

I do not see reason why RCU stall detection here was buggy. From the beginning, I was thinking, that it is infinite loop in reserve_region_with_split (basing on call trace), just wanted someone more aware of this code to look into it, especially that bug only appears under Xen.


I will disable memory balooning support in kernel, to see if this can be a problem.  I also have some problem with locating reserver_region_with_split function, I see where it is called from, but I cannot find its definition in kernel tree. :(
Comment 10 Konrad Rzeszutek Wilk 2011-08-29 14:13:18 UTC
Witold,

Can you:
 1). Attach a serial console to your laptop and on the Xen command line add:
"console=com1 com1=115200,8n1 loglvl=all" and on the Linux command line:
"console=hvc0 debug loglevel=8 initcall_debug" and attach the whole serial console to this bug?
 2). Bug says "regression" - what version of Xen and Linux kernel worked on your machine before this?
Comment 11 Witold Baryluk 2011-09-01 17:41:40 UTC
1) Unfortunately, this laptop doesn't have serial console port (at least without port replicator, which I do not have). I will try booting same xen and kernel on desktop machine.

2) Yes, this is regression, however I'm probably comparing here to the Debian kernel, which isn't based on pv_ops, but older xen sources.

I was trying to do test multiboot booting using qemu, however I do not know to pass additional modules (there is only kernel/initrd options).
Comment 12 Konrad Rzeszutek Wilk 2011-09-01 19:40:50 UTC
(In reply to comment #11)
> 1) Unfortunately, this laptop doesn't have serial console port (at least
> without port replicator, which I do not have). I will try booting same xen
> and
> kernel on desktop machine.

They are pretty cheap on eBay. But yes, please try the same environment on a desktop.

> 
> 2) Yes, this is regression, however I'm probably comparing here to the Debian
> kernel, which isn't based on pv_ops, but older xen sources.

That is not a regression. A regression on this bugzilla system is against stock (ftp.kernel.org) Linux kernels.

> 
> I was trying to do test multiboot booting using qemu, however I do not know
> to
> pass additional modules (there is only kernel/initrd options).

Can you make an disk image that has your normal partition?
Comment 13 Witold Baryluk 2011-09-01 21:03:14 UTC
> Can you make an disk image that has your normal partition?

Nice trick, it was easy!

However, in qemu bug doesn't appear.

I cannot reproduce this bug with 3.1.0-rc4-00131-g9e79e3e or with Debian's 3.0.0-1-686-pae. Both boots without problem.

I guess, will need serial port. Eventually delay each line of debug output somehow to write it down. :)

Well, buying will be simpler. There are two port replicators on allegro. 12.90 zł :) Will report back in few days.
Comment 14 Florian Mickler 2011-09-02 09:43:14 UTC
There is a kernel commandline to delay message printing. Also if you can upload it to youtube (readable) or take a couple of pictures, writing it down will not be necessary.
Comment 15 Witold Baryluk 2011-09-02 11:29:00 UTC
I'm aware of kernel message delaying (boot_delay=MSECS kernel command line option). I even have modified printk.c to have some other usefull delaying mechanisms. 

However I do not know way to delay xen hypervisor console messages.
Comment 16 Konrad Rzeszutek Wilk 2011-09-02 13:08:08 UTC
netconsol could work too - you would have to make sure to build your e1000 in the kernel.
Comment 17 Konrad Rzeszutek Wilk 2011-09-02 15:09:34 UTC
> However I do not know way to delay xen hypervisor console messages.

You can do this: "console=vga vga=text-80x50,keep console_to_ring sync_console loglvl=all" on the Xen hypervisor line.

And naturally on your Linux "console=hvc0 debug loglevel=8 earlyprintk=xen" and add that slow-down config option - that should make the whole output be much much slower.
Comment 18 Witold Baryluk 2011-09-03 17:16:09 UTC
Well, I have some photos http://smp.if.uj.edu.pl/~baryluk/xen-debuging-2011-09-02/ but movie was of very bad quality. Unfortunetly not everything is delayed, so I still do not have time to take photo. Photos show multiple boots actually, and one of them end actually in different error. I can also reproduce it (it is probably bug due to the console and message delaying enabled). Will have serial console in Tuesday. However can try taking more photos for better coverage of boot process :)

> netconsol could work too - you would have to make sure to build your e1000 in
the kernel.
netconsole will not work. my computer is using tg3 module, and it needs firmware from userspace (udev) to be functional.
Comment 19 Konrad Rzeszutek Wilk 2011-09-06 16:23:51 UTC
Looking at your pictures, I think the rcu_sched_state is printing its message b/c of:
http://smp.if.uj.edu.pl/~baryluk/xen-debuging-2011-09-02/DSC01919.JPG
which has:
BUG: soft lockup - CPU#0 stuck for 23s!

But before we got that route to figure out what went wrong there, lets see what the serial console with all the fancy debug options (http://wiki.xensource.com/xenwiki/XenParavirtOps) tells us.

In regards to the firmware - you can make the kernel have the firmware built in. Not sure exactly how - but there is a thread on LKML about radeon and firmware - and it is a similar concept.
Comment 20 Witold Baryluk 2012-02-07 13:11:08 UTC
I have now working serial console (it was disabled in BIOS, which prevented early boot from using it, and I forgot password to BIOS, so I couldn't change enable it back), but actually I cannot reproduce problem anymore. I tested at 3.3-rc2+ or 3.2.4. Both boots, and works on Xen.

Now with working serial console, I could try to bisect problem, but because it is working now, it is of small priority (I have other testing and error fixing to do first).
 
@19, Konrad, as of building in a firmware into kernel, I already discovered this. I will remember this in future, when trying to use netconsole!
Comment 21 Florian Mickler 2012-02-11 15:46:26 UTC
Thx for reporting back. I will close it as unreproducible.