Subject    : Analyzed/Solved: Booting 2.6.30-rc2-git7 very slow
Submitter  : Martin Knoblauch <spamtrap@knobisoft.de>
Date       : 2009-04-24 12:45
References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4

This entry is being used for tracking a regression from 2.6.28. Please don't close it until the problem is fixed in the mainline.
The reason for the problem is that "/proc/mounts" contains two entries for "sysfs":

[root@lpsdm52 hotplug]# uname -a
Linux lpsdm52 2.6.30-rc3-git2-nfs_ra #3 SMP Mon Apr 27 10:21:31 CEST 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@lpsdm52 hotplug]# grep sysfs /proc/mounts
none /sys sysfs rw,relatime 0 0
/sys /sys sysfs rw,relatime 0 0

This breaks the script "/etc/hotplug/firmware.agent" provided by RHEL-4.3: the mount path is now determined to be "/sys\n/sys". In turn, every driver using the firmware loader now fails and times out after the value in "/sys/class/firmware/timeout".

There is a simple fix to the firmware agent, but this behaviour is still a regression.

Cheers
Martin
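To illustrate the failure mode described above: a lookup that prints the mount point of every sysfs entry returns two lines for this /proc/mounts, while a lookup that stops at the first match returns a single path. The awk commands are stand-ins chosen for illustration only; the actual parsing code in firmware.agent is not shown in this report.

```shell
#!/bin/sh
# Illustration only: the awk lookups below are hypothetical stand-ins,
# not the actual RHEL-4.3 firmware.agent code.

# The two sysfs lines from /proc/mounts quoted in the report above.
sample='none /sys sysfs rw,relatime 0 0
/sys /sys sysfs rw,relatime 0 0'

# Naive lookup: prints the mount point of every sysfs entry.
naive=$(printf '%s\n' "$sample" | awk '$3 == "sysfs" { print $2 }')

# Tolerant lookup: stops at the first sysfs entry.
first=$(printf '%s\n' "$sample" | awk '$3 == "sysfs" { print $2; exit }')

printf 'naive lookup: [%s]\n' "$naive"
printf 'first match:  [%s]\n' "$first"
```

Any path built from the naive result (for example by appending a device path to it) contains an embedded newline and cannot name an existing file, which is consistent with the firmware requests timing out.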
On Monday 27 April 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > From: Martin Knoblauch <spamtrap@knobisoft.de>
> > To: Rafael J. Wysocki <rjw@sisk.pl>; Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>
> > Sent: Monday, April 27, 2009 9:18:53 AM
> > Subject: Re: [Bug #13178] Booting very slow
> >
> > ----- Original Message ----
> >
> > > From: Rafael J. Wysocki
> > > To: Linux Kernel Mailing List
> > > Cc: Kernel Testers List; Martin Knoblauch
> > >
> > > Sent: Sunday, April 26, 2009 11:46:31 AM
> > > Subject: [Bug #13178] Booting very slow
> > >
> > > This message has been generated automatically as a part of a report
> > > of regressions introduced between 2.6.28 and 2.6.29.
> > >
> > > The following bug entry is on the current list of known regressions
> > > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > > be listed and let me know (either way).
> > >
> > >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > > Subject : Booting very slow
> > > Submitter : Martin Knoblauch
> > > Date : 2009-04-24 12:45 (3 days old)
> > > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
> >
> > Not really sure whether this is a real regression. Between 2.6.28 and 2.6.29 the
> > content of /proc/mounts for sysfs changed from
> >
> > /sys /sys sysfs rw 0 0
> >
> > to
> >
> > none /sys sysfs rw 0 0
> >
> >
> > This breaks RHEL-4.3 userland which parses /proc/mounts in the firmware hotplug
> > agent to find the mount-point for sysfs. As a result firmware loading started to
> > fail in 2.6.29. There is a simple fix in the /etc/hotplug/firmware.agent script
> > (just assume /sys as it is done elsewhere).
> >
> > Your call.
> >
> > Cheers
> > Martin
>
> Actually I have to correct myself. The reason for the failure to parse
> /proc/mounts for "sysfs" is that there are two lines:
>
> [hotplug]# uname -a
> Linux lpsdm52 2.6.30-rc3-git2-nfs_ra #3 SMP Mon Apr 27 10:21:31 CEST 2009 x86_64 x86_64 x86_64 GNU/Linux
> [hotplug]# grep sysfs /proc/mounts
> none /sys sysfs rw,relatime 0 0
> /sys /sys sysfs rw,relatime 0 0
>
> This breaks the "firmware.agent" /sys-parsing code. There still exists the
> simple fix to userspace, but I now think that this is a real regression that
> should be fixed.
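The userspace workaround mentioned above (just assume /sys) could look roughly like the following. The function name sysfs_dir and the awk/sort lookup are hypothetical and not taken from firmware.agent; the sketch falls back to /sys unless the mounts table yields exactly one distinct sysfs mount point.

```shell
#!/bin/sh
# Hypothetical helper, not the actual firmware.agent code.
# Prints the sysfs mount point found in a mounts table (default
# /proc/mounts). Falls back to /sys when there is no sysfs entry or
# when several distinct mount points are listed.
sysfs_dir() {
    mounts_file=${1:-/proc/mounts}
    # Column 3 is the filesystem type, column 2 the mount point.
    # sort -u collapses duplicate entries such as the double /sys line.
    found=$(awk '$3 == "sysfs" { print $2 }' "$mounts_file" 2>/dev/null | sort -u)
    # Count the non-empty lines that remain.
    count=$(printf '%s\n' "$found" | grep -c .)
    if [ "$count" -eq 1 ]; then
        printf '%s\n' "$found"
    else
        printf '/sys\n'
    fi
}

SYSFS=$(sysfs_dir)
printf 'using sysfs at %s\n' "$SYSFS"
```

With the double entry from this report, both lines name /sys, so the helper returns a single /sys and the agent would no longer build a path containing a newline.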
On Monday 18 May 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>; Martin Knoblauch <spamtrap@knobisoft.de>
> > Sent: Saturday, May 16, 2009 10:06:02 PM
> > Subject: [Bug #13178] Booting very slow
> >
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.28 and 2.6.29.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > Subject : Booting very slow
> > Submitter : Martin Knoblauch
> > Date : 2009-04-24 12:45 (23 days old)
> > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
>
> The issue is still open. It turns out that starting with 2.6.29-rc1
> /proc/mounts already has a "sysfs" line when entering the startup scripts
> from initrd. This breaks the RHEL4 firmware hotplug script.
>
> Simple fix to user space is available. I do not know how important this issue
> is.
The problem has been bisected down to commit:

|commit 1120f8b8169fb2cb51219d326892d963e762edb6
|Author: Stephen Hemminger <shemminger@vyatta.com>
|Date: Thu Dec 18 09:17:16 2008 -0800
|
| PCI: handle long delays in VPD access
|
| Accessing the VPD area can take a long time. The existing
| VPD access code fails consistently on my hardware. There are comments
|
| Change the access routines to:
| * use a mutex rather than spinning with IRQ's disabled and lock held
| * have a much longer timeout
| * call cond_resched while spinning
|
| Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
| Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

The issue seems to be timing dependent. It seems I can only reproduce it on a certain hardware configuration (HP/Proliant DL380G4). The symptom does not show up on an IBM x3650 with the same RHEL4.3 userspace. It also does not show up on my notebook with CentOS-5.3 userspace.

No idea what to do about it. No complaints from my side if this gets closed for fuzziness :-)

Martin
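Since the bisected commit changes how long a VPD read may take, one way to check whether a particular device is slow is to time reads of the per-device "vpd" attribute in sysfs. This is only a diagnostic sketch: the function name time_vpd_reads is made up here, reading vpd normally needs root, and the one-second resolution only reveals delays in the range of seconds.

```shell
#!/bin/sh
# Diagnostic sketch (hypothetical, not from this bug entry): time a read
# of each PCI device's "vpd" sysfs attribute to spot slow VPD access.
# Devices without VPD support have no such file and are skipped.
time_vpd_reads() {
    pci_root=${1:-/sys/bus/pci/devices}
    for vpd in "$pci_root"/*/vpd; do
        [ -e "$vpd" ] || continue        # glob matched nothing
        dev=$(basename "$(dirname "$vpd")")
        start=$(date +%s)
        cat "$vpd" > /dev/null 2>&1      # read errors are ignored on purpose
        end=$(date +%s)
        # Print the device address and the elapsed whole seconds.
        printf '%s %ss\n' "$dev" "$((end - start))"
    done
}
```

Calling time_vpd_reads without arguments walks /sys/bus/pci/devices; a directory argument can point at a test tree instead. A device that takes many seconds here would be a candidate for the delay seen on the DL380G4, but that is an assumption to verify, not a finding from this report.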
On Wednesday 27 May 2009, Andrew Morton wrote:
> On Tue, 26 May 2009 01:04:04 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > On Monday 25 May 2009, Martin Knoblauch wrote:
> > >
> > > ----- Original Message ----
> > >
> > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> > > > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>; Martin Knoblauch <spamtrap@knobisoft.de>
> > > > Sent: Sunday, May 24, 2009 9:31:18 PM
> > > > Subject: [Bug #13178] Booting very slow
> > > >
> > > > This message has been generated automatically as a part of a report
> > > > of regressions introduced between 2.6.28 and 2.6.29.
> > > >
> > > > The following bug entry is on the current list of known regressions
> > > > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > > > be listed and let me know (either way).
> > > >
> > > >
> > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > > > Subject : Booting very slow
> > > > Submitter : Martin Knoblauch
> > > > Date : 2009-04-24 12:45 (31 days old)
> > > > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
> > >
> > > Still happens with 2.6.30-rc7. But see my comment on bz. I would be
> > > willing to leave this as "fuzzy timing related problem".
> >
> > OK
> >
> > I've closed it as "unreproducible".
> >
>
> afaict this should remain open. It's a reproducible regression on one
> of Martin's machines and it has been bisected down to a particular
> commit which quite clearly has the potential to increase device
> initialisation times by a lot. Especially if that commit was buggy.
On Monday 01 June 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.28 and 2.6.29.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > Subject : Booting very slow
> > Submitter : Martin Knoblauch
> > Date : 2009-04-24 12:45 (37 days old)
> > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
>
> We (HP and myself) are trying to track it down.
On Monday 08 June 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>; Jesse Barnes <jbarnes@virtuousgeek.org>; Martin Knoblauch <spamtrap@knobisoft.de>; Stephen Hemminger <shemminger@vyatta.com>
> > Sent: Sunday, June 7, 2009 12:06:22 PM
> > Subject: [Bug #13178] Booting very slow
> >
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.28 and 2.6.29.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > Subject : Booting very slow
> > Submitter : Martin Knoblauch
> > Date : 2009-04-24 12:45 (45 days old)
> > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
>
> No change since last ping. We ruled out a non-HP NIC in the DL380. HP will
> try to reproduce in-house.
Almost forgot about this one, as the hardware in question has been retired by the customer.

Investigation by HP back in July/August 2009 showed that the problem was caused by a VPD read problem on the platform. That in turn prevented the umount of "/sys" from the initrd image, which resulted in the double entry, which broke the hotplug script, which ...

I will try to find out whether HP ever found a solution.