Bug 35332

Summary: irq 17: nobody cared
Product: ACPI
Reporter: Andrew Schultz (ajschult)
Component: Config-Interrupts
Assignee: Aaron Lu (aaron.lu)
Status: CLOSED DUPLICATE
Severity: normal
CC: aklhfex, akpm, alan, andyrtr, chris.palmer, ghost_3k, lenb, rui.zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.39-rc7
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg, kernel config, /proc/interrupts, dmesg

Description Andrew Schultz 2011-05-18 18:03:24 UTC
Created attachment 58312 [details]
dmesg

My system works fine for some time, but eventually runs into an IRQ problem:

 irq 17: nobody cared (try booting with the "irqpoll" option)
 Pid: 0, comm: kworker/0:1 Not tainted 2.6.38.6 #2
 Call Trace:
  <IRQ>  [<ffffffff81088c40>] ? __report_bad_irq+0x38/0x87
  [<ffffffff81088da2>] ? note_interrupt+0x113/0x179
  [<ffffffff810896c7>] ? handle_fasteoi_irq+0xa1/0xc9
  [<ffffffff81004d3b>] ? handle_irq+0x83/0x8c
  [<ffffffff8100437a>] ? do_IRQ+0x48/0xaf
  [<ffffffff81518013>] ? ret_from_intr+0x0/0xe
  <EOI>  [<ffffffff81240066>] ? acpi_idle_enter_bm+0x22e/0x262
  [<ffffffff81240061>] ? acpi_idle_enter_bm+0x229/0x262
  [<ffffffff8136e3f6>] ? cpuidle_idle_call+0x112/0x1b4
  [<ffffffff81001ca0>] ? cpu_idle+0x5a/0x91
  [<ffffffff815126c0>] ? start_secondary+0x164/0x168
 handlers:
 [<ffffffffa006f55f>] (rtl8169_interrupt+0x0/0x2cc [r8169])
 Disabling IRQ #17

[booting with irqpoll shows the same behavior]

This happens with kernel versions 2.6.38.6 and 2.6.39-rc7. After it happens, the system still runs, but networking is painfully slow (100 ms pings).
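For anyone comparing logs across machines, the report above can be pulled out of a saved dmesg mechanically. The following is a minimal sketch, assuming the log uses the same line shapes as the excerpt above ("irq N: nobody cared", a "handlers:" line, then one bracketed line per registered handler). The function name `find_unhandled_irqs` and the regular expressions are illustrative and not part of any kernel tooling.

```python
import re

# Line shapes taken from the excerpt in this report; other kernel
# versions may format these lines differently (assumption).
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(
    r"\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?\)"
)


def find_unhandled_irqs(log_text):
    """Return a list of (irq, [(handler, module_or_None), ...]) tuples."""
    reports = []
    current = None        # report currently being filled in
    in_handlers = False   # True while reading lines after "handlers:"
    for line in log_text.splitlines():
        match = NOBODY_CARED.search(line)
        if match:
            current = (int(match.group(1)), [])
            reports.append(current)
            in_handlers = False
            continue
        if current is None:
            continue
        if line.strip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            handler = HANDLER.search(line)
            if handler:
                current[1].append((handler.group(1), handler.group(2)))
            else:
                # First non-handler line ends the report.
                in_handlers = False
                current = None
    return reports


SAMPLE = """ irq 17: nobody cared (try booting with the "irqpoll" option)
 Call Trace:
  <IRQ>  [<ffffffff81088c40>] ? __report_bad_irq+0x38/0x87
 handlers:
 [<ffffffffa006f55f>] (rtl8169_interrupt+0x0/0x2cc [r8169])
 Disabling IRQ #17
"""

if __name__ == "__main__":
    for irq, handlers in find_unhandled_irqs(SAMPLE):
        print(irq, handlers)
```

The handler list shows which drivers were registered on the disabled line. Here that is only the r8169 driver, which is consistent with the network slowdown after the IRQ is disabled.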
Comment 1 Andrew Schultz 2011-05-18 18:04:31 UTC
Created attachment 58322 [details]
kernel config
Comment 2 Andrew Schultz 2011-05-18 18:05:20 UTC
Created attachment 58332 [details]
/proc/interrupts
Comment 3 Alexandru Coman 2011-05-23 09:36:17 UTC
I have the same problem with kernels 2.6.32 and 2.6.38; the call trace is the same (just different addresses).
On 2.6.32 it took around 1 day for the issue to come up.
On 2.6.38 it takes less than 30 min.
After this the network is unusable, with ping times over 30 ms, compared to the usual 1 ms.

The IRQ in question is the one used by an Intel network card installed in a PCI slot. I have also tried a Realtek one, with the same result.
But I'm not sure it has anything to do with network cards: I once got the same call trace for an IRQ used by a USB keyboard.
It *might* be related to the mainboard, an Asus P8H67-M EVO (with Intel Sandy Bridge).


Booting with "irqpoll" or "noapic" does not help.
But booting with "acpi=off" fixes this; the server has been running smoothly for 3 days.
Comment 4 Alexandru Coman 2011-07-11 10:40:21 UTC
After 2 months of running with "acpi=off" I can say that it does not completely fix the problem; the problem now takes around 5-6 days to appear.

So, I have to reboot it every 5-6 days.
Comment 5 Andreas Radke 2011-07-21 09:22:30 UTC
Same here with an Asus P8P67 board. I had to plug in a 100 Mbit e100 NIC (the onboard NIC makes the system freeze without any message) and now I'm running into the same issue. It doesn't matter which PCI slot I use.

Attaching my dmesg.

I've found a similar issue with an Intel mainboard on the net, so this is probably more than an Asus BIOS issue. It looks similar to https://bugzilla.kernel.org/show_bug.cgi?id=38632
Comment 6 Andreas Radke 2011-07-21 09:23:09 UTC
Created attachment 66212 [details]
dmesg
Comment 7 Aaron Lu 2012-12-10 03:03:21 UTC
Hello,

All of your dmesg logs show that the ASM1083 PCIe-to-PCI bridge (1b21:1080) is in use. That chip is known to be buggy and causes IRQ problems for PCI devices attached behind it.

I'll mark this as a duplicate of
https://bugzilla.kernel.org/show_bug.cgi?id=38632

And for more information:
https://lkml.org/lkml/2012/1/30/216

Thanks.

*** This bug has been marked as a duplicate of bug 38632 ***
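
To check whether a machine has the bridge named in comment 7, the PCI IDs can be read from sysfs. The following is a minimal sketch, assuming a Linux system that exposes `/sys/bus/pci/devices/<address>/vendor` and `.../device` as hex strings. The helper name `find_pci_devices` and its `sysfs_root` parameter are illustrative, and the 1b21:1080 ID is taken from that comment.

```python
import os

# Vendor:device ID quoted in comment 7 for the ASM1083 bridge.
ASM1083_VENDOR = 0x1B21
ASM1083_DEVICE = 0x1080


def _read_hex(path):
    """Read a sysfs attribute such as '0x1b21' and return it as an int."""
    with open(path) as handle:
        return int(handle.read().strip(), 16)


def find_pci_devices(vendor, device, sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses whose vendor and device IDs both match."""
    matches = []
    if not os.path.isdir(sysfs_root):
        return matches  # not Linux, or sysfs is not mounted
    for address in sorted(os.listdir(sysfs_root)):
        dev_dir = os.path.join(sysfs_root, address)
        try:
            found_vendor = _read_hex(os.path.join(dev_dir, "vendor"))
            found_device = _read_hex(os.path.join(dev_dir, "device"))
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        if (found_vendor, found_device) == (vendor, device):
            matches.append(address)
    return matches


if __name__ == "__main__":
    bridges = find_pci_devices(ASM1083_VENDOR, ASM1083_DEVICE)
    if bridges:
        print("ASM1083 bridge found at:", ", ".join(bridges))
    else:
        print("No ASM1083 bridge found.")
```

A match only shows that the bridge is present. Whether a given PCI card actually sits behind it still has to be read from the bus topology, for example in the dmesg output.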