Bug 16537

Summary: TREE_RCU hangs at boot
Product: Platform Specific/Hardware
Reporter: Rolf Eike Beer (eike-kernel)
Component: PA-RISC
Assignee: Paul E. McKenney (paulmck)
Status: CLOSED CODE_FIX
Severity: normal
CC: akpm, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.35
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 16055
Attachments: Config of 2.6.34.1

Description Rolf Eike Beer 2010-08-07 12:18:40 UTC
I upgraded my C3600 from 2.6.34.1 to 2.6.35 and now it hangs on boot. If I switch to TINY_RCU it works again.
Comment 1 Andrew Morton 2010-08-25 23:57:22 UTC
Reassigned to Paul.

Is this a regression?  Sounds like it.
Comment 2 Paul E. McKenney 2010-08-26 00:43:45 UTC
Hello, Rolf!  Could you please post the offending .config and console log?  If the hang is silent, could you please force stack traces via sysrq or whatever you have handy?
Comment 3 Rolf Eike Beer 2010-08-26 06:12:22 UTC
Created attachment 28001
Config of 2.6.34.1

Used that config for 2.6.35 with "make oldconfig".
Comment 4 Paul E. McKenney 2010-08-26 15:17:41 UTC
Thank you for the info, Rolf!

Could you please set CONFIG_RCU_CPU_STALL_DETECTOR=y and try again?  This diagnostic config option might print valuable information to the console.

(Of course, the console log from your previous hang would be very useful -- just knowing roughly where in the boot process the hang occurred would help me.)
Comment 5 Rolf Eike Beer 2010-08-28 07:38:53 UTC
Sorry, I had no time to try this yesterday. I'll be on vacation for the next 2 weeks.

Before I ran into this, the issue was already known in #gentoo-hppa on Freenode IRC. Maybe you could ask for some tests there; the guys there also have affected machines at hand.
Comment 6 Paul E. McKenney 2010-08-30 14:29:29 UTC
Thank you, Rolf -- I pointed that IRC channel to this bug.
Comment 7 Rafael J. Wysocki 2010-08-30 17:36:55 UTC
On Monday, August 30, 2010, Rolf Eike Beer wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.34 and 2.6.35.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.34 and 2.6.35.  Please verify if it still should
> > be listed and let the tracking team know (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=16537
> > Subject             : TREE_RCU hangs at boot
> > Submitter   : Rolf Eike Beer <eike-kernel@sf-tec.de>
> > Date                : 2010-08-07 12:18 (23 days old)
> 
> Yes, this is a regression that is still unfixed.
Comment 8 Rafael J. Wysocki 2010-09-12 18:49:26 UTC
Handled-By : Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Comment 9 Rafael J. Wysocki 2010-09-21 19:07:06 UTC
On Tuesday, September 21, 2010, Paul E. McKenney wrote:
> On Mon, Sep 20, 2010 at 11:26:27AM -0700, Paul E. McKenney wrote:
> > On Sun, Sep 12, 2010 at 09:08:42PM +0200, Rafael J. Wysocki wrote:
> > > This message has been generated automatically as a part of a report
> > > of regressions introduced between 2.6.34 and 2.6.35.
> > > 
> > > The following bug entry is on the current list of known regressions
> > > introduced between 2.6.34 and 2.6.35.  Please verify if it still should
> > > be listed and let the tracking team know (either way).
> > > 
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16537
> > > Subject           : TREE_RCU hangs at boot
> > > Submitter : Rolf Eike Beer <eike-kernel@sf-tec.de>
> > > Date              : 2010-08-07 12:18 (37 days old)
> > > Handled-By        : Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > I cannot reproduce this.  The function that is claimed to hang contains
> > only printk()s.  I have asked the guys who can actually reproduce it
> > to try a later kernel, and will let you know how this goes.
> > 
> > If you cannot trust printk(), who can you trust?  ;-)
> 
> It turns out that this problem is due to tab processing in the PA-RISC
> firmware, which certainly explains my inability to reproduce it.  It
> turns out that it is fixed by:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d9b68e5e88248bb24fd4e455588bea1d56108fd6
> 
> So we can close this regression.
> 
> Many thanks to Guy Martin <gmsoft@tuxicoman.be> for much help with
> this!!!