Bug 27482
| Summary: | WARNING: at kernel/lockdep.c:2323 trace_hardirqs_on_caller | | |
|---|---|---|---|
| Product: | Power Management | Reporter: | tim blechmann (tim) |
| Component: | Other | Assignee: | power-management_other |
| Status: | CLOSED CODE_FIX | | |
| Severity: | low | CC: | florian, kirjanov, maciej.rutecki, rjw, stern |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.38-rc2 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 7216, 27352 | | |
| Attachments: | dmesg | | |
Description
tim blechmann
2011-01-24 10:49:20 UTC
On Monday, January 24, 2011, Alan Stern wrote:
> On Mon, 24 Jan 2011, Rafael J. Wysocki wrote:
>
> > On Monday, January 24, 2011, Andrew Morton wrote:
> > > On Mon, 24 Jan 2011 10:49:23 GMT
> > > bugzilla-daemon@bugzilla.kernel.org wrote:
> > >
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=27482
> > >
> > > post-2.6.37 PM regression. Rafael's fault :)
> >
> > Nah, seems like USB rather or a false-positive. Adding Alan to CC.
>
> It's a real bug, but it's not in USB -- it's in the no_callbacks
> addition to the runtime PM framework. I failed to consider a
> particular type of call path.
>
> Basically the problem is that the rpm_suspend() routine includes a
> sequence like this:
>
> spin_unlock_irq(&a);
> spin_lock_irqsave(&b);
> spin_unlock_irqrestore(&b);
> spin_lock_irq(&a);
>
> Before it was okay, although wasteful in the way interrupts were
> enabled and disabled. Now it's just wrong, since the code can run in
> an interrupt handler.
>
> Replacing everything with simple spin_unlock() and spin_lock() calls
> should fix the problem. Does this patch work?
>
> Alan Stern
>
>
> Index: usb-2.6/drivers/base/power/runtime.c
> ===================================================================
> --- usb-2.6.orig/drivers/base/power/runtime.c
> +++ usb-2.6/drivers/base/power/runtime.c
> @@ -407,12 +407,15 @@ static int rpm_suspend(struct device *de
>  		goto out;
>  	}
>
> +	/* Maybe the parent is now able to suspend. */
>  	if (parent && !parent->power.ignore_children && !dev->power.irq_safe) {
> -		spin_unlock_irq(&dev->power.lock);
> +		spin_unlock(&dev->power.lock);
>
> -		pm_request_idle(parent);
> +		spin_lock(&parent->power.lock);
> +		rpm_idle(parent, RPM_ASYNC);
> +		spin_unlock(&parent->power.lock);
>
> -		spin_lock_irq(&dev->power.lock);
> +		spin_lock(&dev->power.lock);
>  	}
>
>  out:
It seems that Bugzilla didn't take your message. Can you attach the patch
directly to the bug entry, please?
tested and the warning disappears!

On Mon, 24 Jan 2011, Rafael J. Wysocki wrote:
> On Monday, January 24, 2011, Andrew Morton wrote:
> > On Mon, 24 Jan 2011 10:49:23 GMT
> > bugzilla-daemon@bugzilla.kernel.org wrote:
> >
> > > https://bugzilla.kernel.org/show_bug.cgi?id=27482
> >
> > post-2.6.37 PM regression. Rafael's fault :)
>
> Nah, seems like USB rather or a false-positive. Adding Alan to CC.

It's a real bug, but it's not in USB -- it's in the no_callbacks
addition to the runtime PM framework. I failed to consider a
particular type of call path.

Basically the problem is that the rpm_suspend() routine includes a
sequence like this:

	spin_unlock_irq(&a);
	spin_lock_irqsave(&b);
	spin_unlock_irqrestore(&b);
	spin_lock_irq(&a);

Before it was okay, although wasteful in the way interrupts were
enabled and disabled. Now it's just wrong, since the code can run in
an interrupt handler.

Replacing everything with simple spin_unlock() and spin_lock() calls
should fix the problem. Does this patch work?

Alan Stern

Index: usb-2.6/drivers/base/power/runtime.c
===================================================================
--- usb-2.6.orig/drivers/base/power/runtime.c
+++ usb-2.6/drivers/base/power/runtime.c
@@ -407,12 +407,15 @@ static int rpm_suspend(struct device *de
 		goto out;
 	}

+	/* Maybe the parent is now able to suspend. */
 	if (parent && !parent->power.ignore_children && !dev->power.irq_safe) {
-		spin_unlock_irq(&dev->power.lock);
+		spin_unlock(&dev->power.lock);

-		pm_request_idle(parent);
+		spin_lock(&parent->power.lock);
+		rpm_idle(parent, RPM_ASYNC);
+		spin_unlock(&parent->power.lock);

-		spin_lock_irq(&dev->power.lock);
+		spin_lock(&dev->power.lock);
 	}

 out:
Okay, I'll send an official patch to Rafael.

I have the similar bug on my ppc64 box. It happens when running perf top

[ 96.942093] ------------[ cut here ]------------
[ 96.942117] WARNING: at kernel/lockdep.c:2321
[ 96.942129] Modules linked in: snd_aoa_codec_onyx snd_aoa_fabric_layout snd_aoa snd_aoa_i2sbus snd_aoa_soundbus nouveau snd_pcm snd_page_alloc snd_timer snd drm_kms_helper firewire_ohci ttm soundcore firewire_core cfbcopyarea cfbimgblt cfbfillrect crc_itu_t uninorth_agp
[ 96.942266] NIP: c0000000000a4368 LR: c0000000000a434c CTR: 00000fffa02f8610
[ 96.942284] REGS: c000000177d9fb20 TRAP: 0700 Not tainted (2.6.38-rc2-00274-g1f0324c)
[ 96.942300] MSR: 9000000000021032 <ME,CE,IR,DR> CR: 24004422 XER: 00000000
[ 96.942346] TASK = c00000017ab342e0[2858] 'syslog-ng' THREAD: c000000177d9c000 CPU: 3
[ 96.942367] GPR00: 0000000000000000 c000000177d9fda0 c000000000854448 0000000000000001
[ 96.942402] GPR04: 0000000000000005 0000000000122c58 0000000000000000 0000000000000028
[ 96.942438] GPR08: 0000000000000000 c000000001124c0c 00000fffa02f8610 0000000000000001
[ 96.942473] GPR12: 900000000200d032 c00000000ffff780 0000000010118e24 0000000010118e2c
[ 96.942508] GPR16: 0000000042dfa400 0000000000000000 0000000010117648 0000000000000000
[ 96.942543] GPR20: 0000000000000000 0000000000000001 0000000016804f28 0000000000122c58
[ 96.942579] GPR24: 00000000100668e8 00000fffe16e0e04 00000fffa07ff020 0000000016804f28
[ 96.942614] GPR28: 0000000000000005 c000000000007540 c0000000007d57e0 c00000017ab342e0
[ 96.942658] NIP [c0000000000a4368] .trace_hardirqs_on_caller+0x198/0x1a0
[ 96.942674] LR [c0000000000a434c] .trace_hardirqs_on_caller+0x17c/0x1a0
[ 96.942689] Call Trace:
[ 96.942700] [c000000177d9fda0] [0000000100000000] 0x100000000 (unreliable)
[ 96.942725] [c000000177d9fe30] [c000000000007540] system_call_common+0xc0/0x110
[ 96.942744] Instruction dump:
[ 96.942756] 4bfffbc9 2fa30000 409eff24 4bffff38 481d0fc9 60000000 2fa30000 419eff28
[ 96.942805] e93e8120 80090000 2f800000 409eff18 <0fe00000> 4bffff10 7c6802a6 4bfffe5c
[ 96.942855] ---[ end trace 9413391f15683486 ]---

Created attachment 45372
dmesg
Patch merged.

commit c3810c88788d505d4ffd786addd111b745e42161
Author: Alan Stern <stern@rowland.harvard.edu>
Date: Tue Jan 25 20:50:07 2011 +0100

    PM / Runtime: Don't enable interrupts while running in_interrupt

This is already applied

Different backtrace. If you can reproduce the issue consistently, please open a new bug and post a link here.

Thanks, Flo