Bug 13178
Summary: | Booting very slow | ||
---|---|---|---|
Product: | Other | Reporter: | Rafael J. Wysocki (rjw) |
Component: | Other | Assignee: | other_other |
Status: | CLOSED UNREPRODUCIBLE | ||
Severity: | normal | CC: | alan, spamtrap |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.29 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 12398 |
Description
Rafael J. Wysocki
2009-04-25 19:50:15 UTC
The reason for the problem is that "/proc/mounts" contains two entries for "sysfs":

[root@lpsdm52 hotplug]# uname -a
Linux lpsdm52 2.6.30-rc3-git2-nfs_ra #3 SMP Mon Apr 27 10:21:31 CEST 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@lpsdm52 hotplug]# grep sysfs /proc/mounts
none /sys sysfs rw,relatime 0 0
/sys /sys sysfs rw,relatime 0 0

This breaks the script "/etc/hotplug/firmware.agent" provided by RHEL-4.3. The mount path is now determined to be "/sys\n/sys". In turn, every driver using the firmware loader now fails and times out after the value in "/sys/class/firmware/timeout".

There is a simple fix to the firmware agent, but this behaviour is still a regression.

Cheers
Martin

On Monday 27 April 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > From: Martin Knoblauch <spamtrap@knobisoft.de>
> > To: Rafael J. Wysocki <rjw@sisk.pl>; Linux Kernel Mailing List
> <linux-kernel@vger.kernel.org>
> > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>
> > Sent: Monday, April 27, 2009 9:18:53 AM
> > Subject: Re: [Bug #13178] Booting very slow
> >
> > ----- Original Message ----
> >
> > > From: Rafael J. Wysocki
> > > To: Linux Kernel Mailing List
> > > Cc: Kernel Testers List ; Martin Knoblauch
> >
> > > Sent: Sunday, April 26, 2009 11:46:31 AM
> > > Subject: [Bug #13178] Booting very slow
> > >
> > > This message has been generated automatically as a part of a report
> > > of regressions introduced between 2.6.28 and 2.6.29.
> > >
> > > The following bug entry is on the current list of known regressions
> > > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > > be listed and let me know (either way).
> > >
> > >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > > Subject : Booting very slow
> > > Submitter : Martin Knoblauch
> > > Date : 2009-04-24 12:45 (3 days old)
> > > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
> >
> > Not really sure whether this is a real regression. Between 2.6.28 and
> 2.6.29 the
> > content of /proc/mounts for sysfs changed from
> >
> > /sys /sys sysfs rw 0 0
> >
> > to
> >
> > none /sys sysfs rw 0 0
> >
> >
> > This breaks RHEL-4.3 userland which parses /proc/mounts in the firmware
> hotplug
> > agent to find the mount-point for sysfs. As a result firmware loading
> started to
> > fail in 2.6.29. There is a simple fix in the /etc/hotplug/firmware.agent
> script
> > (just assume /sys as it is done elsewhere).
> >
> > Your call.
> >
> > Cheers
> > Martin
>
> Actually I have to correct myself. The reason for the failure to parse
> /proc/mounts for "sysfs" is that there are two lines:
>
> [hotplug]# uname -a
> Linux lpsdm52 2.6.30-rc3-git2-nfs_ra #3 SMP Mon Apr 27 10:21:31 CEST 2009
> x86_64 x86_64 x86_64 GNU/Linux
> [hotplug]# grep sysfs /proc/mounts
> none /sys sysfs rw,relatime 0 0
> /sys /sys sysfs rw,relatime 0 0
>
> This breaks the "firmware.agent" /sys-parsing code. There still exists the
> simple fix to
> userspace, but I now think that this is a real regression that should be
> fixed.
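The parsing failure described above can be sketched in a few lines of shell. This is only an illustration: the actual lookup code in "/etc/hotplug/firmware.agent" is not quoted in this report, so the function names `naive_sysfs_dir` and `sysfs_dir` and the awk-based lookup are assumptions. The first helper returns one line per sysfs entry, which yields "/sys\n/sys" when two entries exist; the second stops at the first match and falls back to /sys, in the spirit of the "just assume /sys" fix mentioned earlier.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the real firmware.agent code.

# Naive lookup: prints the mount point of EVERY sysfs line, so two
# entries in the mounts table produce two lines of output.
naive_sysfs_dir() {
    awk '$3 == "sysfs" { print $2 }' "${1:-/proc/mounts}"
}

# Tolerant lookup: stop at the first sysfs entry, and fall back to
# /sys if no entry is found at all.
sysfs_dir() {
    dir=$(awk '$3 == "sysfs" { print $2; exit }' "${1:-/proc/mounts}")
    echo "${dir:-/sys}"
}
```

Both helpers accept an optional file argument so they can be tried against a saved copy of the mounts table instead of the live /proc/mounts.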
On Monday 18 May 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>; Martin Knoblauch
> <spamtrap@knobisoft.de>
> > Sent: Saturday, May 16, 2009 10:06:02 PM
> > Subject: [Bug #13178] Booting very slow
> >
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.28 and 2.6.29.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > Subject : Booting very slow
> > Submitter : Martin Knoblauch
> > Date : 2009-04-24 12:45 (23 days old)
> > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
>
> The issue is still open. It turns out that starting with 2.6.29-rc1
> /proc/mounts already has a "sysfs" line when entering the startup scripts
> from initrd. This breaks the RHEL4 firmware hotplug script.
>
> Simple fix to user space is available. I do not know how important this issue
> is.
The problem has been bisected down to commit:

|commit 1120f8b8169fb2cb51219d326892d963e762edb6
|Author: Stephen Hemminger <shemminger@vyatta.com>
|Date: Thu Dec 18 09:17:16 2008 -0800
|
| PCI: handle long delays in VPD access
|
| Accessing the VPD area can take a long time. The existing
| VPD access code fails consistently on my hardware. There are comments
|
| Change the access routines to:
| * use a mutex rather than spinning with IRQ's disabled and lock held
| * have a much longer timeout
| * call cond_resched while spinning
|
| Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
| Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

The issue seems to be timing dependent. I can only reproduce it on one hardware configuration (HP/Proliant DL380G4). The symptom does not show up on an IBM x3650 with the same RHEL4.3 userspace. It also does not show up on my notebook with CentOS-5.3 userspace.

No idea what to do about it. No complaints from my side if this gets closed for fuzziness :-)

Martin

On Wednesday 27 May 2009, Andrew Morton wrote:
> On Tue, 26 May 2009 01:04:04 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > On Monday 25 May 2009, Martin Knoblauch wrote:
> > >
> > > ----- Original Message ----
> > >
> > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> > > > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>; Martin
> Knoblauch <spamtrap@knobisoft.de>
> > > > Sent: Sunday, May 24, 2009 9:31:18 PM
> > > > Subject: [Bug #13178] Booting very slow
> > > >
> > > > This message has been generated automatically as a part of a report
> > > > of regressions introduced between 2.6.28 and 2.6.29.
> > > >
> > > > The following bug entry is on the current list of known regressions
> > > > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > > > be listed and let me know (either way).
> > > >
> > > >
> > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > > > Subject : Booting very slow
> > > > Submitter : Martin Knoblauch
> > > > Date : 2009-04-24 12:45 (31 days old)
> > > > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
> > >
> > > Still happens with 2.6.30-rc7. But see my comment on bz. I would be
> willing to leave this as a "fuzzy timing related problem".
> >
> > OK
> >
> > I've closed it as "unreproducible".
> >
>
> afaict this should remain open. It's a reproducible regression on one
> of Martin's machines and it has been bisected down to a particular
> commit which quite clearly has the potential to increase device
> initialisation times by a lot. Especially if that commit was buggy.
On Monday 01 June 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.28 and 2.6.29.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > Subject : Booting very slow
> > Submitter : Martin Knoblauch
> > Date : 2009-04-24 12:45 (37 days old)
> > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
>
> We (HP and myself) are trying to track it down.
On Monday 08 June 2009, Martin Knoblauch wrote:
>
> ----- Original Message ----
>
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
> > Cc: Kernel Testers List <kernel-testers@vger.kernel.org>; Jesse Barnes
> <jbarnes@virtuousgeek.org>; Martin Knoblauch <spamtrap@knobisoft.de>; Stephen
> Hemminger <shemminger@vyatta.com>
> > Sent: Sunday, June 7, 2009 12:06:22 PM
> > Subject: [Bug #13178] Booting very slow
> >
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.28 and 2.6.29.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.28 and 2.6.29. Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13178
> > Subject : Booting very slow
> > Submitter : Martin Knoblauch
> > Date : 2009-04-24 12:45 (45 days old)
> > References : http://marc.info/?l=linux-kernel&m=124057716231773&w=4
>
> No change since last ping. We ruled out a non-HP NIC in the DL380. HP will
> try to reproduce in-house.
Almost forgot about this one, as the hardware in question has been retired by the customer. Investigation by HP back in July/August 2009 showed that the problem was caused by a VPD read problem on the platform. That in turn prevented the umount of "/sys" from the initrd image, which resulted in the double entry, which broke the hotplug script, which ....

I will try to find out whether HP ever found a solution.
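For anyone who wants to check another machine for the same symptom, the double entry can be detected with a small shell diagnostic. This is a sketch under assumptions, not something taken from HP's investigation: the function names `count_sysfs_mounts` and `check_sysfs_mounts` are made up for illustration, and the check only looks at the filesystem-type column of the mounts table.

```shell
#!/bin/sh
# Hypothetical diagnostic; the names are illustrative only.

# Print how many entries of type sysfs the mounts table contains.
count_sysfs_mounts() {
    awk '$3 == "sysfs" { n++ } END { print n + 0 }' "${1:-/proc/mounts}"
}

# Return non-zero and warn on stderr if sysfs is listed more than once,
# e.g. because /sys from the initrd image was never unmounted.
check_sysfs_mounts() {
    n=$(count_sysfs_mounts "$1")
    if [ "$n" -gt 1 ]; then
        echo "warning: $n sysfs entries in ${1:-/proc/mounts}" >&2
        return 1
    fi
    return 0
}
```

Calling `check_sysfs_mounts` with no argument inspects the live /proc/mounts; a file name can be passed to inspect a saved copy.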