Bug 28892

Summary: 2.6.36+ "irq 16: nobody cared" on resume - Acer Aspire One
Product: Power Management Reporter: Andy (dr.diesel)
Component: Hibernation/Suspend Assignee: power-management_other
Status: CLOSED UNREPRODUCIBLE    
Severity: normal CC: lenb, rjw, rui.zhang
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.38rc4 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: dmesg from .38rc4 kernel

Description Andy 2011-02-11 21:26:38 UTC
Description of problem:

Kernel 2.6.36-1.fc15.x86_64 makes the mouse choppy and erratic after waking from
suspend to RAM.  The mouse works, but its movement is slow and choppy.  Booting back into kernel 2.6.35.6-45 (or any other 2.6.35 kernel) makes the problem go away and
all is normal.

One notable output while performing suspend: Disabling IRQ #19

This was first reported downstream, where a lot of information has been posted at the request of Kyle McMartin:

https://bugzilla.redhat.com/show_bug.cgi?id=645968

I have now tested vanilla 2.6.38rc4 and this problem is still occurring.  No Fedora 2.6.35 kernel had this problem; I did not test a vanilla .35 at that time.  It seems that the longer the computer is "asleep" the more likely this is to happen, but this is a totally unscientific observation.  Please note, this does not happen on 100% of suspend/resume cycles.

Happy to provide any further details, log files, test patches etc.
Comment 1 Andy 2011-02-11 21:34:48 UTC
Created attachment 47402
dmesg from .38rc4 kernel
Comment 2 Rafael J. Wysocki 2011-02-11 22:22:21 UTC
So, the latest 2.6.35.y is the last known good kernel, right?
Comment 3 Andy 2011-02-11 22:24:55 UTC
(In reply to comment #2)
> So, the latest 2.6.35.y is the last known good kernel, right?

Correct.

Thanks
Comment 4 Len Brown 2011-03-22 02:40:09 UTC
The dmesg in the Red Hat bug report shows that the failure is actually on IRQ 16:

[    1.820737] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    1.825712] uhci_hcd 0000:00:1d.2: PCI INT D -> GSI 16 (level, low) -> IRQ 16
[    3.510125] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   12.817902] atl1c 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 9204.665017] Disabling IRQ #16
[ 9206.315132] uhci_hcd 0000:00:1d.2: PCI INT D -> GSI 16 (level, low) -> IRQ 16
[ 9207.114269] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 9207.132812] uhci_hcd 0000:00:1d.2: PCI INT D -> GSI 16 (level, low) -> IRQ 16
[ 9312.181497] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16

and atl1c complains BEFORE the suspend:

[ 9203.914074] atl1c 0000:01:00.0: MAC state machine can't be idle since disabled for 10ms second

[ 9204.137295] wlan0: deauthenticating from 00:25:9c:ed:e5:57 by local choice (reason=3)
[ 9204.142602] cfg80211: Calling CRDA to update world regulatory domain
[ 9204.142699] cfg80211: Calling CRDA for country: US
[ 9204.164000] cfg80211: Regulatory domain changed to country: US
[ 9204.164153]     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[ 9204.164160]     (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2700 mBm)
[ 9204.164165]     (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 1700 mBm)
[ 9204.164170]     (5250000 KHz - 5330000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 9204.164175]     (5490000 KHz - 5600000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 9204.164181]     (5650000 KHz - 5710000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 9204.164186]     (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 3000 mBm)
[ 9204.664831] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 9204.664842] Pid: 0, comm: swapper Tainted: G          I 2.6.36-1.fc15.x86_64 #1
[ 9204.664847] Call Trace:
[ 9204.664850]  <IRQ>  [<ffffffff810b738c>] __report_bad_irq.clone.1+0x3d/0x8b
[ 9204.664869]  [<ffffffff810b74f4>] note_interrupt+0x11a/0x17e
[ 9204.664877]  [<ffffffff810b7fd8>] handle_fasteoi_irq+0xad/0xd7
[ 9204.664885]  [<ffffffff8100c3c6>] handle_irq+0x88/0x90
[ 9204.664893]  [<ffffffff814a3bac>] do_IRQ+0x5c/0xb4
[ 9204.664901]  [<ffffffff8149d9d3>] ret_from_intr+0x0/0x16
[ 9204.664905]  <EOI>  [<ffffffff812bc64b>] ? acpi_idle_enter_c1+0xa6/0xc9
[ 9204.664920]  [<ffffffff812bba20>] ? raw_local_irq_enable+0x10/0x12
[ 9204.664929]  [<ffffffff8108042e>] ? trace_hardirqs_on+0xd/0xf
[ 9204.664936]  [<ffffffff812bc650>] acpi_idle_enter_c1+0xab/0xc9
[ 9204.664945]  [<ffffffff813bf76f>] cpuidle_idle_call+0xa4/0x113
[ 9204.664953]  [<ffffffff8100830b>] cpu_idle+0xb3/0x10f
[ 9204.664961]  [<ffffffff81484573>] rest_init+0xb7/0xbe
[ 9204.664967]  [<ffffffff814844bc>] ? rest_init+0x0/0xbe
[ 9204.664976]  [<ffffffff81d76c53>] start_kernel+0x412/0x41d
[ 9204.664983]  [<ffffffff81d762c6>] x86_64_start_reservations+0xb1/0xb5
[ 9204.664990]  [<ffffffff81d763c5>] x86_64_start_kernel+0xfb/0x10a
[ 9204.664995] handlers:
[ 9204.664998] [<ffffffff8135ec98>] (usb_hcd_irq+0x0/0x9e)
[ 9204.665009] [<ffffffff8135ec98>] (usb_hcd_irq+0x0/0x9e)
[ 9204.665017] Disabling IRQ #16
[ 9206.207270] PM: Syncing filesystems ... done.
[ 9206.216799] PM: Preparing system for mem sleep
[ 9206.280299] Freezing user space processes ... (elapsed 0.01 seconds) done.
[ 9206.292122] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 9206.303118] PM: Entering mem sleep
...

> MAC state machine can't be idle since disabled for 10ms second

Perhaps this device is still going to interrupt us after we've
suspended or removed it, and that kills the shared interrupt?
Can you reproduce the problem with this device absent from the system?
Perhaps using an older version of that driver works better?

Are you using a script to suspend the system that tries to
unload/load this driver on suspend/resume?
If so, what happens when you don't use that script and simply run
# echo mem > /sys/power/state
(to exercise the driver's .suspend/.resume methods)?

Without using suspend, please run
# rmmod atl1c
and
# modprobe atl1c
a few times and see if IRQ16 survives...
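
A minimal sketch of how one might check whether IRQ 16 "survived" after such a test, assuming the kernel log uses the same line formats as the excerpt quoted above. The function name irq_report and the SAMPLE constant are hypothetical and exist only for illustration. The sample lines are copied from the log above. The sketch reports which devices were routed to each IRQ and which IRQs the kernel disabled:

```python
import re

# Matches lines such as:
# [    1.820737] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ROUTE_RE = re.compile(
    r"\]\s+(\S+) (\S+): PCI INT \w -> GSI \d+ \(.*?\) -> IRQ (\d+)"
)
# Matches lines such as:
# [ 9204.665017] Disabling IRQ #16
DISABLED_RE = re.compile(r"\]\s+Disabling IRQ #(\d+)")

# A few lines copied from the log quoted in this report.
SAMPLE = """\
[    1.820737] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    1.825712] uhci_hcd 0000:00:1d.2: PCI INT D -> GSI 16 (level, low) -> IRQ 16
[    3.510125] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   12.817902] atl1c 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 9204.665017] Disabling IRQ #16
"""


def irq_report(dmesg_text):
    """Return (sharers, disabled).

    sharers maps an IRQ number to the set of (driver, pci_address) pairs
    routed to it; disabled is the set of IRQ numbers the kernel turned off.
    """
    sharers = {}
    disabled = set()
    for line in dmesg_text.splitlines():
        match = ROUTE_RE.search(line)
        if match:
            driver, address, irq = match.group(1), match.group(2), int(match.group(3))
            sharers.setdefault(irq, set()).add((driver, address))
            continue
        match = DISABLED_RE.search(line)
        if match:
            disabled.add(int(match.group(1)))
    return sharers, disabled


sharers, disabled = irq_report(SAMPLE)
for irq in sorted(disabled):
    print("IRQ", irq, "disabled; shared by", sorted(sharers.get(irq, ())))
```

To run the same check against a live system, the saved output of dmesg can be passed to irq_report in place of SAMPLE. An empty disabled set after the rmmod/modprobe cycles would suggest the shared interrupt is still working.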
Comment 5 Andy 2011-03-23 22:34:28 UTC
This machine is a laptop, so I can't remove it!

I tried:
# rmmod atl1c && modprobe atl1c

About 200 times without issue, but the problem didn't always happen before either, and I'm now running a kernel a couple of RCs newer.  This is no longer my main machine (whoohoo Sandy Bridge is awesome!!) but it is still used quite a bit.


I will wait for it to happen again on kernel 2.6.38-5.fc15.x86_64, then blacklist atl1c (since only the wireless is used on this machine) and wait to see if it still happens.

I will report back, thanks!
Comment 6 Zhang Rui 2012-01-18 03:13:14 UTC
It's great that the kernel bugzilla is back.

Can you please verify if the problem still exists in the latest upstream kernel?
Comment 7 Andy 2012-01-18 10:41:53 UTC
I no longer have this machine.  Closing this bug.

Thanks