Most recent kernel where this bug did *NOT* occur: 2.6.18.3
Distribution: Fedora Core 6
Hardware Environment:
00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82815 815 Chipset AGP Bridge (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 05)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (LPC) (rev 05)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 05)
00:1f.2 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #1) (rev 05)
00:1f.3 SMBus: Intel Corporation 82801BA/BAM SMBus (rev 05)
00:1f.4 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #2) (rev 05)
01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)
02:09.0 FireWire (IEEE 1394): Texas Instruments TSB12LV23 IEEE-1394 Controller
02:0a.0 Multimedia video controller: Brooktree Corporation Bt848 Video Capture (rev 12)
02:0b.0 USB Controller: NEC Corporation USB (rev 43)
02:0b.1 USB Controller: NEC Corporation USB (rev 43)
02:0b.2 USB Controller: NEC Corporation USB 2.0 (rev 04)
02:0c.0 Multimedia audio controller: Ensoniq ES1370 [AudioPCI] (rev 01)
02:0e.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
Software Environment: Mainline kernel, gcc 4.1.1, glibc 2.5
Problem Description:
Since kernel 2.6.19 (today), the message "irq 18: nobody cared ..." has appeared. I do not know how serious this could be, but it does not look good, especially considering that kernel 2.6.18.3 was fine.
Steps to reproduce:
Reboot the box with 2.6.19... :-)

I tried with PNP OS (in BIOS) set to yes and to no, with the same result. I also created dmesg and /proc/interrupts output with 2.6.18.3, 2.6.19 and 2.6.19 with "irqpoll". The last one seems to work properly, i.e. irq 18 is assigned properly. Performance-wise, I don't know. If I find the way :-) I'll attach the files here.
Created attachment 9692 [details] dmesg for kernel 2.6.18.3
Created attachment 9693 [details] /proc/interrupts 2.6.18.3
Created attachment 9694 [details] dmesg for kernel 2.6.19
Created attachment 9695 [details] /proc/interrupts 2.6.19
Created attachment 9696 [details] dmesg for kernel 2.6.19 with irqpoll
Created attachment 9697 [details] /proc/interrupts 2.6.19 with irqpoll
I have the same problem. Here's an extract from dmesg:

APIC error on CPU0: 00(40)
ACPI: AC Adapter [C177] (on-line)
APIC error on CPU0: 40(40)
APIC error on CPU0: 40(40)
APIC error on CPU0: 40(40)
irq 21: nobody cared (try booting with the "irqpoll" option)

Call Trace:
<IRQ> [<ffffffff80194752>] __report_bad_irq+0x30/0x7d
[<ffffffff801949e4>] note_interrupt+0x245/0x281
[<ffffffff801953af>] handle_fasteoi_irq+0xb2/0x101
[<ffffffff801654e3>] do_IRQ+0x60/0x97
[<ffffffff80163765>] default_idle+0x0/0x47
[<ffffffff8015a301>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff8016378e>] default_idle+0x29/0x47
[<ffffffff80144e71>] cpu_idle+0x3c/0x79
[<ffffffff80687673>] start_kernel+0x1cc/0x1d1
[<ffffffff8068713e>] _sinittext+0x13e/0x142
handlers:
[<ffffffff803b89eb>] (ohci_irq_handler+0x0/0x7e5)
Disabling IRQ #21

I also tried using "irqpoll" as a boot option. No effect.
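For anyone triaging similar logs, here is a minimal Python sketch that pulls "nobody cared" reports out of a dmesg dump, along with the handlers the kernel listed and whether the IRQ was disabled. It assumes the message layout shown in the extract above; the function name and the shortened sample text are illustrative only, not part of any kernel tool.

```python
import re

# Matches the kernel's report line, e.g.
#   irq 21: nobody cared (try booting with the "irqpoll" option)
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
# Matches registered-handler entries, e.g.
#   [<ffffffff803b89eb>] (ohci_irq_handler+0x0/0x7e5)
HANDLER = re.compile(r"\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)")
DISABLED = re.compile(r"Disabling IRQ #(\d+)")


def find_unhandled_irqs(dmesg_text):
    """Return one dict per 'nobody cared' report found in dmesg_text."""
    reports = []
    for match in NOBODY_CARED.finditer(dmesg_text):
        irq = int(match.group(1))
        # Restrict the search to the text between this report and the next.
        tail = dmesg_text[match.end():]
        following = NOBODY_CARED.search(tail)
        if following:
            tail = tail[:following.start()]
        handlers_at = tail.find("handlers:")
        handlers = HANDLER.findall(tail[handlers_at:]) if handlers_at != -1 else []
        disabled = any(int(n) == irq for n in DISABLED.findall(tail))
        reports.append({"irq": irq, "handlers": handlers, "disabled": disabled})
    return reports


# Shortened copy of the extract above (most call-trace lines omitted).
SAMPLE = """\
irq 21: nobody cared (try booting with the "irqpoll" option)
Call Trace:
<IRQ> [<ffffffff80194752>] __report_bad_irq+0x30/0x7d
handlers:
[<ffffffff803b89eb>] (ohci_irq_handler+0x0/0x7e5)
Disabling IRQ #21
"""

if __name__ == "__main__":
    for report in find_unhandled_irqs(SAMPLE):
        print(report)
```

Call-trace entries are skipped on purpose: only the parenthesised entries after "handlers:" name the drivers registered on the IRQ.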
Created attachment 9698 [details] dmesg for 2.6.19 (vanilla)
Piergiorgio,
Please attach the output from 'lspci -vv' and also the acpidump output. acpidump can be found in the latest pmtools here:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/

My first guess is that the irq re-write has somehow collided with the (unusual) ACPI SCI override on your system:

> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low level)

means that this is simply wrong:

> 9: 0 IO-APIC-fasteoi acpi

And so as soon as you get an ACPI interrupt, it will pull on IRQ20 and listen to IRQ9 and thus kill whoever is on 20. (err. re-named to 18)

I'm astonished (and somewhat alarmed) that this wasn't discovered earlier, because this kind of setup is unusual, but it isn't rare. (Indeed, I think I've got a box someplace in the back that may have this -- maybe I can reproduce this tomorrow)

Mihai,
For your input to be useful you need to:
1. attach the dmesg all the way back to the beginning, e.g. dmesg -s64000
2. attach the output from 'cat /proc/interrupts'
3. do this for both 2.6.18 and 2.6.19
4. attach lspci -vv and acpidump (only one copy of each) per above.
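The diagnosis above rests on the INT_SRC_OVR line in dmesg. As a rough sketch, and assuming the line format is exactly as quoted in this thread, the following Python extracts the overrides and keeps those that move a bus IRQ to a different global IRQ. The function names are hypothetical helpers for illustration.

```python
import re

# Line format as quoted in this thread:
#   ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low level)
INT_SRC_OVR = re.compile(
    r"ACPI: INT_SRC_OVR \(bus (\d+) bus_irq (\d+) global_irq (\d+) ([^)]*)\)"
)


def parse_overrides(dmesg_text):
    """Return the interrupt source overrides found in dmesg_text."""
    overrides = []
    for bus, bus_irq, global_irq, flags in INT_SRC_OVR.findall(dmesg_text):
        overrides.append({
            "bus": int(bus),
            "bus_irq": int(bus_irq),
            "global_irq": int(global_irq),
            "flags": flags.strip(),
        })
    return overrides


def remapped(overrides):
    """Keep only overrides that move an IRQ to a different number."""
    return [o for o in overrides if o["bus_irq"] != o["global_irq"]]


# The two override lines reported in this thread.
SAMPLE = """\
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 31 low level)
"""

if __name__ == "__main__":
    for o in remapped(parse_overrides(SAMPLE)):
        print("bus_irq %d -> global_irq %d (%s)"
              % (o["bus_irq"], o["global_irq"], o["flags"]))
```

If an override like this is present and /proc/interrupts still lists "acpi" on the original bus IRQ, that matches the mismatch described above.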
Created attachment 9708 [details] SCI patch in 2.6.19
Please try reverting this patch from your 2.6.19 source tree and report if it helps.
(patch -Rp1 < filename)
Created attachment 9709 [details] 2.6.18+2.6.19 info
I've made a lil' archive with everything you asked for. It contains:
2.6.18/2.6.19:
- dmesg -s64000 >dmesg.txt
- cat /proc/interrupts >interrupts.txt
- lspci -vv >lspci.txt
- acpidump >acpidump.tx
I hope it's ok. I'll try your patch in a couple of minutes.
Created attachment 9710 [details] dmesg after reverting the SCI patch in 2.6.19
Things look good. Reverting the patch appears to have fixed the bug. I'll give it a week or so to see if anything else pops up as a result of this fix. Thanks a lot.
Created attachment 9711 [details] /proc/interrupts after reverting SCI patch in 2.6.19
The contents of cat /proc/interrupts
Created attachment 9712 [details] lspci -vv kernel 2.6.19 with irqpoll
Created attachment 9713 [details] lspci -vv kernel 2.6.19 with irqpoll
Created attachment 9714 [details] acpidump with kernel 2.6.19 and irqpoll
I'm also going to try the patch ASAP. BTW, sorry for the double post of lspci, I messed something up...
OK, good, I tested the patch and it seems to work. Now /proc/interrupts shows the same as in 2.6.18.3, except for irq 21 and 22, which are swapped (the devices are swapped, of course). I guess this is not a problem.
I have a couple of questions:
1) Does reverting this patch bring things back to the 2.6.18.x state?
2) In what way does "irqpoll" negatively affect performance?
Thanks, bye
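Comparing /proc/interrupts between two kernels by eye is error-prone, so here is a small Python sketch of the comparison. It assumes the layout used by kernels of this era (a CPU header row, then per-IRQ rows with one count per CPU, a single-token chip type and a comma-separated device list). The sample tables are made up for illustration, with placeholder device names devA and devB; they are not the contents of the attachments.

```python
def parse_interrupts(text):
    """Map each numeric IRQ to the sorted list of devices attached to it."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, LOC, ERR and similar summary rows
        fields = rest.split()
        # fields: one count per CPU, then the chip type, then device names
        devices = " ".join(fields[ncpus + 1:])
        table[int(label)] = sorted(
            d.strip() for d in devices.split(",") if d.strip()
        )
    return table


def moved_devices(before, after):
    """Return {device: (old_irq, new_irq)} for devices whose IRQ changed."""
    def by_device(table):
        return {dev: irq for irq, devs in table.items() for dev in devs}

    old, new = by_device(before), by_device(after)
    return {
        dev: (old[dev], new[dev])
        for dev in old
        if dev in new and old[dev] != new[dev]
    }


# Hypothetical sample data (placeholder devices and counts).
BEFORE = """\
           CPU0
  9:          0   IO-APIC-level    acpi
 21:        100   IO-APIC-level    devA
 22:        200   IO-APIC-level    devB
NMI:          0
"""
AFTER = """\
           CPU0
  9:          0   IO-APIC-fasteoi  acpi
 21:        200   IO-APIC-fasteoi  devB
 22:        100   IO-APIC-fasteoi  devA
NMI:          0
"""

if __name__ == "__main__":
    print(moved_devices(parse_interrupts(BEFORE), parse_interrupts(AFTER)))
```

A device that simply swaps IRQ lines with another, as in this made-up sample, shows up as two entries with mirrored old and new numbers.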
Yes, the patch in comment #10 was added between 2.6.18 and 2.6.19, so reverting it from 2.6.19 changes that code back to how 2.6.18 was.

You should no longer need irqpoll, so please don't use it. Whether you notice a significant performance difference with irqpoll depends on your system and workload -- but it is intended as a workaround only, not something you should use if you can avoid it.
I found an additional system which fails, an Intel STL2 -- an old Dual PIII with a ServerWorks chipset that does this:
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 31 low level)

I've reverted the patch from comment #10 and sent the revert to Linus for 2.6.20 < rc1, as well as to stable for 2.6.19.1.
Hi again, thanks for the answer (Comment #9). Of course I'll not use "irqpoll" anymore, I was just wondering about the performance loss. In fact, with "irqpoll", hdparm -t /dev/hda gives less than 20MB/sec, while in normal conditions it gives more than 50MB/sec, which is quite an impressive difference... Anyway, hopefully this is fixed now.
I do have another question. You got two acpidumps; did you find anything strange there?
Thanks, bye
Piergiorgio, turns out that dmesg printed out the part of acpidump that is needed here, thanks.
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low level)
*** Bug 7628 has been marked as a duplicate of this bug. ***
Shipped in 2.6.19.1 and 2.6.20. Closed.