Bug 1352 - irq 9: nobody cared! (APIC SCI+USB) - SiS645
Status: CLOSED PATCH_ALREADY_AVAILABLE
Product: ACPI
Classification: Unclassified
Component: Config-Interrupts
Hardware: i386 Linux
Importance: P2 normal
Assigned To: Len Brown
Reported: 2003-10-13 21:56 UTC by Joshua Schmidlkofer
Modified: 2004-03-04 12:11 UTC (History)
Kernel Version: 2.6.0-test7
Tree: Mainline
Regression: ---


Attachments
2.6.0-test7 dmesg boot log. (13.71 KB, text/plain)
2003-10-13 22:05 UTC, Joshua Schmidlkofer
dmesg after error (15.04 KB, text/plain)
2003-10-13 22:05 UTC, Joshua Schmidlkofer
lspci -vv output (8.47 KB, text/plain)
2003-10-13 22:06 UTC, Joshua Schmidlkofer
Problem in remissions - added 'noapic' to boot cmdline (670 bytes, text/plain)
2003-10-13 23:51 UTC, Joshua Schmidlkofer
dmesg - with 'noapic' (11.12 KB, text/plain)
2003-10-13 23:52 UTC, Joshua Schmidlkofer
Interrupts from the Working Condition (744 bytes, text/plain)
2003-10-23 20:40 UTC, Joshua Schmidlkofer
Interrupts After rebooting and getting starts. (702 bytes, text/plain)
2003-10-23 20:48 UTC, Joshua Schmidlkofer
Interrupts After rebooting and then moving the mouse. (702 bytes, text/plain)
2003-10-23 20:50 UTC, Joshua Schmidlkofer
dmesg with - borked Interrupts. (14.67 KB, text/plain)
2003-10-23 20:53 UTC, Joshua Schmidlkofer
dmeg - with pci=noacpi (13.07 KB, text/plain)
2003-10-23 20:59 UTC, Joshua Schmidlkofer
print ioapic patch (467 bytes, patch)
2003-10-24 15:56 UTC, Len Brown
dmesg - as much as it saves, with patch. (14.91 KB, text/plain)
2003-11-11 21:05 UTC, Joshua Schmidlkofer
acpidmp output (71.46 KB, text/plain)
2003-11-11 21:06 UTC, Joshua Schmidlkofer
/proc/interrupts (662 bytes, text/plain)
2003-11-11 21:07 UTC, Joshua Schmidlkofer
/proc/interrupts - after hitting power button 3 times. (662 bytes, text/plain)
2003-11-11 21:07 UTC, Joshua Schmidlkofer
cmdline used for boot (18 bytes, text/plain)
2003-11-11 21:08 UTC, Joshua Schmidlkofer
dmesg with cmdline: acpi=off noapic (9.43 KB, text/plain)
2003-11-11 21:26 UTC, Joshua Schmidlkofer
/proc/interrupts with cmdline: acpi=off noapic (624 bytes, text/plain)
2003-11-11 21:26 UTC, Joshua Schmidlkofer
dmesg with cmdline: acpi=off (11.69 KB, text/plain)
2003-11-11 21:27 UTC, Joshua Schmidlkofer
/proc/interrupts with cmdline: acpi=off (624 bytes, text/plain)
2003-11-11 21:27 UTC, Joshua Schmidlkofer
2.6.0-test11 + patch from 1351 - ACPI/USB STILL BORKED (14.25 KB, text/plain)
2003-12-01 11:48 UTC, Joshua Schmidlkofer
INTERRUPTS - 2.6.0-test11 + patch from 1351- ACPI/USB STILL BORKED (673 bytes, text/plain)
2003-12-01 11:48 UTC, Joshua Schmidlkofer
2.6.2 - Behaviour changed, still broken. (428 bytes, text/plain)
2004-02-04 22:39 UTC, Joshua Schmidlkofer
dmesg output from 2.6.2 w/ pci=noacpi (14.02 KB, text/plain)
2004-02-04 22:40 UTC, Joshua Schmidlkofer
dmesg output w/ fullacpi (14.96 KB, text/plain)
2004-02-04 22:53 UTC, Joshua Schmidlkofer
Interrupts w/ fullacpi (570 bytes, text/plain)
2004-02-04 22:55 UTC, Joshua Schmidlkofer
test patch for ACPI interrupt over-ride (1.90 KB, patch)
2004-02-07 18:14 UTC, Len Brown

Description Joshua Schmidlkofer 2003-10-13 21:56:41 UTC
Distribution: Redhat 9
Hardware Environment: 
   Pentium 4 2.4
   Soyo Dragon Lite - SiS 645 chipset
   
Software Environment:
   gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
   module-init-tools version 0.9.14
Problem Description:
   ACPI interrupt is disabled after a while.  It is mysteriously linked to
another interrupt.  In this case, one of the USB controllers.  Previously it was
linked to the NVidia interrupt. Let me be clear: The ACPI controller has always
been alone on interrupt 9, in /proc/interrupts.  However, I can cause the number
of interrupts to climb by using the device that it is linked to.  When the
NVidia card was linked, anything that caused events except moving the mouse
caused the number to climb.  Now, moving my mouse, or writing from my USB drives
cause the number to climb.  When it hits 100002, then I get a Call Trace, and
the kernel disables IRQ #9.  The kernel appears to continue uninterrupted, with
no further problems.
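
The behaviour described above (a counter climbs, then the kernel prints "nobody cared!" and disables the line) can be modelled roughly as follows. This is only a simplified sketch, not the kernel's code: the `IrqLine` class, the `simulate` helper, and both threshold constants are assumptions chosen to sit near the count of roughly 100,000 reported above.

```python
from dataclasses import dataclass

# Assumed thresholds for illustration only; they are not taken from the report.
WINDOW = 100_000          # interrupts counted before the check runs
UNHANDLED_LIMIT = 99_900  # unclaimed interrupts in one window that trigger the disable


@dataclass
class IrqLine:
    number: int
    count: int = 0        # interrupts seen in the current window
    unhandled: int = 0    # of those, how many no handler claimed
    disabled: bool = False

    def note_interrupt(self, handled: bool) -> None:
        """Record one interrupt; disable the line if almost none were claimed."""
        if self.disabled:
            return
        self.count += 1
        if not handled:
            self.unhandled += 1
        if self.count < WINDOW:
            return
        if self.unhandled > UNHANDLED_LIMIT:
            self.disabled = True
            print(f"irq {self.number}: nobody cared!")
        # Start a fresh window whether or not the line was disabled.
        self.count = 0
        self.unhandled = 0


def simulate(irq_number: int, total: int, handled_every: int = 0) -> IrqLine:
    """Feed `total` interrupts; every `handled_every`-th one is claimed (0 = none)."""
    line = IrqLine(irq_number)
    for i in range(1, total + 1):
        handled = handled_every > 0 and i % handled_every == 0
        line.note_interrupt(handled)
    return line
```

Under this model, a line whose registered handler never claims the interrupts (as would happen if USB activity were being delivered to the ACPI handler) is shut off at the end of the first full window, while a line where even half the interrupts are claimed stays enabled.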

Steps to reproduce:
    Boot with full ACPI support.  Move mouse, or use USB drive. It seems to be
similar to report # 905: http://bugme.osdl.org/show_bug.cgi?id=905,  however, in
this case, nothing is shown to share the acpi interrupt.
Comment 1 Joshua Schmidlkofer 2003-10-13 22:05:37 UTC
Created attachment 1041 [details]
2.6.0-test7 dmesg boot log.
Comment 2 Joshua Schmidlkofer 2003-10-13 22:05:59 UTC
Created attachment 1042 [details]
dmesg after error
Comment 3 Joshua Schmidlkofer 2003-10-13 22:06:52 UTC
Created attachment 1043 [details]
lspci -vv output
Comment 4 Joshua Schmidlkofer 2003-10-13 22:11:02 UTC
I just noticed that prior to the crash I have a bunch of
connect-debounce messages, and then afterwards it re-detects and creates my
multi-card reader.
Comment 5 Joshua Schmidlkofer 2003-10-13 23:51:08 UTC
Created attachment 1044 [details]
Problem in remissions - added 'noapic' to boot cmdline

I booted up with noapic. Prior to 2.6.0-test1 this would still have problems;
however, so far I have no USB error messages, and ACPI is working fine!  I
don't know what this means entirely, but I am thankful that I don't have an HT
CPU about now =).
Comment 6 Joshua Schmidlkofer 2003-10-13 23:52:09 UTC
Created attachment 1045 [details]
dmesg - with 'noapic'

Here is the dmesg, if anyone needs to review the changes.
Comment 7 Shaohua 2003-10-14 22:20:07 UTC
Please try the patch in bug 1240 first to see if it's a USB bug. If it still
doesn't work, please attach '/proc/interrupts' and the acpidmp output.
Comment 8 Joshua Schmidlkofer 2003-10-15 18:25:36 UTC
The patch from bug 1240 does not apply at all against 2.6.0-test7 - should I be
dropping to a different revision?

js
Comment 9 Len Brown 2003-10-22 15:09:47 UTC
The 2nd dmesg attachment is a continuation of the 1st, yes? 
 
By remission, do you mean that "noapic" makes it work, 
but that the original APIC-mode problem still persists? 
 
/proc/interrupts from PIC mode shows IRQ9 shared: acpi, ohci-hcd 
 
Can you attach the /proc/interrupts from the APIC mode failure? 
 
It would be interesting to see if ACPI and USB still share an IRQ, because 
the BIOS specifies that in APIC mode IRQ9 should be Edge Triggered Active High: 
 
ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x0] trigger[0x0]) 
 
which _isn't_ conducive to sharing, particularly with an active low PCI interrupt... 
 
It would also be good to know if this problem persists in 2.6.0-test8, 
and interesting to know if ACPI events, such as pressing the power button, 
result in acpi interrupts being received. 
 
thanks, 
-Len 
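
The polarity[0x0] and trigger[0x0] fields in the INT_SRC_OVR line quoted above can be decoded with a small script. This is a hedged sketch: the `decode_override` helper and its regular expression are illustrative and assume the dmesg line has exactly the format shown in this comment. The two-bit encodings follow the MPS INTI flag convention used by the ACPI MADT, where 0 means "conforms to the bus default", and for an ISA source (bus 0) that default is taken here to be edge-triggered, active high.

```python
import re

# Two-bit MPS INTI flag encodings.
POLARITY = {0: "conforms to bus default", 1: "active high", 2: "reserved", 3: "active low"}
TRIGGER = {0: "conforms to bus default", 1: "edge", 2: "reserved", 3: "level"}

# Assumed format: exactly the line quoted in this bug report.
LINE_RE = re.compile(
    r"INT_SRC_OVR \(bus\[(\d+)\] irq\[(0x[0-9a-fA-F]+)\] "
    r"global_irq\[(0x[0-9a-fA-F]+)\] polarity\[(0x[0-9a-fA-F]+)\] "
    r"trigger\[(0x[0-9a-fA-F]+)\]\)"
)


def decode_override(line):
    """Return the decoded fields of one INT_SRC_OVR line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    bus = int(match.group(1))
    irq, gsi, polarity, trigger = (int(g, 16) for g in match.groups()[1:])
    pol_text = POLARITY[polarity & 0x3]
    trig_text = TRIGGER[trigger & 0x3]
    if bus == 0:
        # Bus 0 is ISA: "conforms" resolves to edge-triggered, active high.
        if polarity & 0x3 == 0:
            pol_text = "active high"
        if trigger & 0x3 == 0:
            trig_text = "edge"
    return {"bus": bus, "irq": irq, "global_irq": gsi,
            "polarity": pol_text, "trigger": trig_text}


if __name__ == "__main__":
    sample = ("ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] "
              "polarity[0x0] trigger[0x0])")
    print(decode_override(sample))
```

For the line from this system the sketch resolves IRQ 9 to edge-triggered, active high, which is the combination described above as a poor fit for sharing with an active low PCI interrupt.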
 
Comment 10 Joshua Schmidlkofer 2003-10-23 20:40:30 UTC
Created attachment 1154 [details]
Interrupts from the Working Condition

The only change that I made: 
 - cmdline: ro root=/dev/hda7 pci=noacpi
  I added pci=noacpi
Comment 11 Joshua Schmidlkofer 2003-10-23 20:48:27 UTC
Created attachment 1155 [details]
Interrupts After rebooting and getting starts.

This is after starting the system, X, Firebird and Evolution.
Comment 12 Joshua Schmidlkofer 2003-10-23 20:50:17 UTC
Created attachment 1156 [details]
Interrupts After rebooting and then moving the mouse.

This is after everything was running.  Then I moved the mouse a bunch, then I
re-polled the interrupts.
Comment 13 Joshua Schmidlkofer 2003-10-23 20:52:18 UTC
NOTE:  The "Interrupts After" posts are with the following cmdline:
ro root=/dev/hda7
Comment 14 Joshua Schmidlkofer 2003-10-23 20:53:21 UTC
Created attachment 1157 [details]
dmesg with - borked Interrupts.
Comment 15 Joshua Schmidlkofer 2003-10-23 20:59:24 UTC
Created attachment 1158 [details]
dmeg - with pci=noacpi

This is with pci=noacpi, this seems to work great.  I've run since the release
of 2.6.0-test8 without incident.
Comment 16 Len Brown 2003-10-24 03:28:14 UTC
Please boot with acpi=off and attach dmesg and /proc/interrupts 
Please boot with acpi=off noapic, and attach dmesg and /proc/interrupts. 
 
The MPS/IOAPIC mode results we got above via pci=noacpi look pretty 
much like the PIC IRQ case with the APIC turned on -- and don't match 
the ACPI/IOAPIC mapping at all. 
 
The above should get ACPI completely out of the way (should be the same 
as !CONFIG_ACPI, and !CONFIG_ACPI && !CONFIG_X86_IO_APIC) 
 
Also, please attach the output of acpidmp so we can take a look at your _PRT 
and the output of dmidecode so we can identify your board and BIOS version. 
I'd like to see your MPS tables too, but off-hand I don't see a utility to dump it, 
maybe hwinfo from Suse? 
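
The boot options used in this thread (acpi=off, pci=noacpi, noapic) select different interrupt routing setups, and a small helper can label a given command line when comparing attachments. This is a sketch only: `routing_mode` is a hypothetical helper, the descriptions paraphrase the discussion in this bug, and "IO-APIC mode" assumes a kernel built with IO-APIC support on hardware that has one.

```python
def routing_mode(cmdline: str) -> str:
    """Describe the interrupt routing implied by a kernel command line."""
    params = set(cmdline.split())

    if "acpi=off" in params:
        firmware = "ACPI disabled entirely"
    elif "pci=noacpi" in params:
        firmware = "ACPI enabled, but not used for PCI interrupt routing"
    else:
        firmware = "ACPI used for interrupt routing"

    # noapic keeps the kernel on the legacy PIC instead of the IO-APIC.
    controller = "PIC mode" if "noapic" in params else "IO-APIC mode"
    return f"{firmware}; {controller}"


if __name__ == "__main__":
    for cmdline in (
        "ro root=/dev/hda7",
        "ro root=/dev/hda7 pci=noacpi",
        "ro root=/dev/hda7 acpi=off",
        "ro root=/dev/hda7 acpi=off noapic",
    ):
        print(f"{cmdline!r}: {routing_mode(cmdline)}")
```

Labelling each dmesg and /proc/interrupts attachment with the mode it was captured in makes it easier to see which combination moves or removes the failure.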
 
Comment 17 Joshua Schmidlkofer 2003-10-24 11:09:47 UTC
Len, will do. This is my home system, so I have to get home, then spend time
with fam.... but I will endeavour to get the requested info tonight.  
Comment 18 Len Brown 2003-10-24 15:56:04 UTC
Created attachment 1188 [details]
print ioapic patch

The fact that acpi on IRQ9 seems to have (always) exactly 6 more interrupts
than USB up on IRQ20, even when a boat load of interrupts are added to
IRQ20, surely can't be a coincidence.

Please apply this patch to dump out the IOAPIC _after_ it gets programmed by
ACPI and attach the dmesg output; we need to look at the vectors...

thanks,
-Len
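
The "always exactly 6 more" observation can be checked mechanically across several /proc/interrupts snapshots. The sketch below assumes a single-CPU layout (one count column) like the one on this machine; `parse_interrupts` and `offsets` are hypothetical helpers, and the two sample snapshots contain made-up counts for illustration, not the contents of the attachments.

```python
def parse_interrupts(text):
    """Map each IRQ label (e.g. "9") to its count in the first CPU column."""
    table = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line such as "CPU0" has no colon
        fields = rest.split()
        if not fields or not fields[0].isdigit():
            continue
        table[label.strip()] = int(fields[0])
    return table


def offsets(snapshots, irq_a, irq_b):
    """Return count(irq_a) - count(irq_b) for each snapshot, in order."""
    result = []
    for text in snapshots:
        table = parse_interrupts(text)
        result.append(table[irq_a] - table[irq_b])
    return result


# Hypothetical snapshots: the counts are invented for this example.
BEFORE = """\
           CPU0
  9:        202   IO-APIC-edge  acpi
 20:        196   IO-APIC-level  ohci-hcd
"""
AFTER = """\
           CPU0
  9:       3418   IO-APIC-edge  acpi
 20:       3412   IO-APIC-level  ohci-hcd
"""

if __name__ == "__main__":
    deltas = offsets([BEFORE, AFTER], "9", "20")
    print(deltas, "constant" if len(set(deltas)) == 1 else "varies")
```

A constant offset while both counters grow by thousands would suggest the two lines are receiving the same interrupts rather than merely being busy at the same time.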
Comment 19 Joshua Schmidlkofer 2003-11-11 21:05:47 UTC
Created attachment 1415 [details]
dmesg - as much as it saves, with patch.

dmesg w/ patch
Comment 20 Joshua Schmidlkofer 2003-11-11 21:06:20 UTC
Created attachment 1416 [details]
acpidmp output
Comment 21 Joshua Schmidlkofer 2003-11-11 21:07:21 UTC
Created attachment 1417 [details]
/proc/interrupts
Comment 22 Joshua Schmidlkofer 2003-11-11 21:07:53 UTC
Created attachment 1418 [details]
/proc/interrupts - after hitting power button 3 times.
Comment 23 Joshua Schmidlkofer 2003-11-11 21:08:27 UTC
Created attachment 1419 [details]
cmdline used for boot
Comment 24 Joshua Schmidlkofer 2003-11-11 21:09:25 UTC
I also have the following files that I saved at the time:

acpidmp-2.6.0-test9-with-patch.FACP-acpitbl
acpidmp-2.6.0-test9-with-patch.acpidisasm
acpidmp-2.6.0-test9-with-patch.System.map.gz


Please let me know if you want them.
Comment 25 Joshua Schmidlkofer 2003-11-11 21:26:07 UTC
Created attachment 1420 [details]
dmesg with cmdline: acpi=off noapic
Comment 26 Joshua Schmidlkofer 2003-11-11 21:26:30 UTC
Created attachment 1421 [details]
/proc/interrupts with cmdline: acpi=off noapic
Comment 27 Joshua Schmidlkofer 2003-11-11 21:27:06 UTC
Created attachment 1422 [details]
dmesg with cmdline: acpi=off
Comment 28 Joshua Schmidlkofer 2003-11-11 21:27:40 UTC
Created attachment 1423 [details]
/proc/interrupts with cmdline: acpi=off
Comment 29 Len Brown 2003-11-30 21:08:58 UTC
> ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x0] trigger[0x0])  
 
Please try the two patches in bug #1351, attach the resulting dmesg and /proc/interrupts 
and report on if ACPI events and USB interrupts work.  This should address the 
polarity and trigger issue on IRQ9.  I don't know if it will also address the mysterious 
USB vs ACPI tying -- though this is a root cause and that may have been a decoy. 
 
thanks, 
-Len 
 
Comment 30 Joshua Schmidlkofer 2003-12-01 11:48:30 UTC
Created attachment 1581 [details]
2.6.0-test11 + patch from 1351 - ACPI/USB STILL BORKED

Len,
   I did this this morning, and right after boot, ACPI and USB are still 1-1
(each at 196 in interrupt count).  I did not test the power button,
etc.; I had to go to work.
Comment 31 Joshua Schmidlkofer 2003-12-01 11:48:55 UTC
Created attachment 1582 [details]
INTERRUPTS - 2.6.0-test11 + patch from 1351- ACPI/USB STILL BORKED
Comment 32 Len Brown 2003-12-01 14:49:57 UTC
Looks like the 1st patch did its thing: 
-  9:       3418    IO-APIC-edge  acpi 
+ 9:        261   IO-APIC-level  acpi 
But still this: 
 20:        261   IO-APIC-level  ohci_hcd 
 
I assume that irq9 still gets disabled when you wiggle the mouse enough? 
how about a couple of ACPI button presses in there -- do they register? 
 
the 2nd patch is a link in the text to the print_IO_APIC fix 
 
Comment 33 Joshua Schmidlkofer 2004-02-04 22:39:21 UTC
Created attachment 2021 [details]
2.6.2 - Behaviour changed, still broken.

2.6.2 is still broken.  Now eth0 is sharing an interrupt with AGP.  This
happens w/ or w/o the NVidia driver.  ACPI no longer generates false
interrupts.  I get ACPI errors when loading the ohci-hcd module.
Comment 34 Joshua Schmidlkofer 2004-02-04 22:40:38 UTC
Created attachment 2022 [details]
dmesg output from 2.6.2 w/ pci=noacpi

Note: This is the boot when I used pci=noacpi.  Another attachment follows with
no such line.  w/ Full ACPI, I get no Ethernet, and w/ pci=noacpi I have
identical behaviour.
Comment 35 Joshua Schmidlkofer 2004-02-04 22:53:57 UTC
Created attachment 2023 [details]
dmesg output w/ fullacpi
Comment 36 Joshua Schmidlkofer 2004-02-04 22:55:12 UTC
Created attachment 2024 [details]
Interrupts w/ fullacpi

Here are the interrupts - note: no IRQs are apparently being delivered to the
NIC, or it's not answering.  If I send pings I just get tx errors.
Comment 37 Len Brown 2004-02-07 18:12:13 UTC
So as of 2.6.2 the ACPI SCI is no longer "tied" to USB? 
Can you clarify exactly what is not working at this point? 
Is the "nobody cared!" message gone, or does it come back someplace else? 
Unclear why you attached the /proc/cpuinfo -- did I miss something? 
 
Note that in 2.6.2, you have the boot parameter "acpi_irq_nobalance" 
to tell ACPI not to move interrupts around.  It might be interesting to 
compare /proc/interrupts with and without booting with this flag, 
and also to see if the error moves. 
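
Comparing /proc/interrupts with and without acpi_irq_nobalance comes down to asking which devices changed IRQ. The sketch below does that for two captures, assuming a single-CPU layout (count, controller type, then a comma-separated device list). `irq_owners` and `moved` are hypothetical helpers, the sample captures are invented for illustration, and a device name that appears on more than one IRQ is only recorded once.

```python
def irq_owners(text):
    """Map each device name to the IRQ label it is registered on."""
    owners = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        fields = rest.split()
        # Expect: count, controller type, then the device list.
        if not sep or len(fields) < 3 or not fields[0].isdigit():
            continue
        for device in " ".join(fields[2:]).split(","):
            owners[device.strip()] = label.strip()
    return owners


def moved(before, after):
    """Return {device: (old_irq, new_irq)} for devices whose IRQ differs."""
    old, new = irq_owners(before), irq_owners(after)
    return {dev: (old[dev], new[dev])
            for dev in old.keys() & new.keys()
            if old[dev] != new[dev]}


# Hypothetical captures: device placement is invented for this example.
BALANCED = """\
           CPU0
  9:        261   IO-APIC-level  acpi
 18:       1500   IO-APIC-level  eth0
 20:        261   IO-APIC-level  ohci_hcd
"""
NOBALANCE = """\
           CPU0
  9:        261   IO-APIC-level  acpi
 19:       1500   IO-APIC-level  eth0
 20:        261   IO-APIC-level  ohci_hcd
"""

if __name__ == "__main__":
    print(moved(BALANCED, NOBALANCE))
```

If the failure follows a device to its new IRQ, that points at the device or its driver; if it stays on the old IRQ number, the routing for that line is the more likely suspect.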
 
Comment 38 Len Brown 2004-02-07 18:14:36 UTC
Created attachment 2058 [details]
test patch for ACPI interrupt over-ride

while your failure symptom no longer seems to involve the ACPI SCI, please
apply this patch to fix a known problem with the ACPI SCI -- as it is possible
that it is a side-effect that is troubling your system.
Comment 39 Joshua Schmidlkofer 2004-03-03 21:36:56 UTC
Wahoo!!  All the remaining problems are problems with devfs!!! 

I migrated to 100% udev, and all the problems are gone.  Interrupt routing was
not the issue with 2.6.3. #&^#$&^%^@# devfs was.  On a completely different box I
ran into exactly the same problems, but I was about to reinstall anyway, and after
the re-install all the problems were gone.  I quickly discovered that the only
difference was DevFS, so I migrated my Gentoo box to udev, and boom. Problems gone.

