Bug 1352

Summary: irq 9: nobody cared! (APIC SCI+USB) - SiS645
Product: ACPI
Component: Config-Interrupts
Severity: normal
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.0-test7
Subsystem:
Regression: ---
Bisected commit-id:
Reporter: Joshua Schmidlkofer (menion)
Assignee: Len Brown (lenb)
CC: acpi-bugzilla
Attachments: 2.6.0-test7 dmesg boot log.
dmesg after error
lspci -vv output
Problem in remissions - added 'noapic' to boot cmdline
dmesg - with 'noapic'
Interrupts from the Working Condition
Interrupts After rebooting and getting starts.
Interrupts After rebooting and then moving the mouse.
dmesg with - borked Interrupts.
dmeg - with pci=noacpi
print ioapic patch
dmesg - as much as it saves, with patch.
acpidmp output
/proc/interrupts - after hitting power button 3 times.
cmdline used for boot
dmesg with cmdline: acpi=off noapic
/proc/interrupts with cmdline: acpi=off noapic
dmesg with cmdline: acpi=off
/proc/interrupts with cmdline: acpi=off
2.6.0-test11 + patch from 1351 - ACPI/USB STILL BORKED
INTERRUPTS - 2.6.0-test11 + patch from 1351- ACPI/USB STILL BORKED
2.6.2 - Behaviour changed, still broken.
dmesg output from 2.6.2 w/ pci=noacpi
dmesg output w/ fullacpi
Interrupts w/ fullacpi
test patch for ACPI interrupt over-ride

Description Joshua Schmidlkofer 2003-10-13 21:56:41 UTC
Distribution: Redhat 9
Hardware Environment: 
   Pentium 4 2.4
   Soyo Dragon Lite - SiS 645 chipset
Software Environment:
   gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
   module-init-tools version 0.9.14
Problem Description:
   The ACPI interrupt is disabled after a while.  It is mysteriously linked to
another interrupt, in this case one of the USB controllers.  Previously it was
linked to the NVidia interrupt.  Let me be clear: the ACPI controller has always
been alone on interrupt 9 in /proc/interrupts.  However, I can cause its
interrupt count to climb by using the device that it is linked to.  When the
NVidia card was linked, anything that caused events except moving the mouse
caused the count to climb.  Now, moving my mouse or accessing my USB drives
causes the count to climb.  When it hits 100002, I get a Call Trace and
the kernel disables IRQ #9.  The kernel appears to continue uninterrupted, with
no further problems.

Steps to reproduce:
    Boot with full ACPI support.  Move mouse, or use USB drive. It seems to be
similar to report # 905: http://bugme.osdl.org/show_bug.cgi?id=905,  however, in
this case, nothing is shown to share the acpi interrupt.
Comment 1 Joshua Schmidlkofer 2003-10-13 22:05:37 UTC
Created attachment 1041 [details]
2.6.0-test7 dmesg boot log.
Comment 2 Joshua Schmidlkofer 2003-10-13 22:05:59 UTC
Created attachment 1042 [details]
dmesg after error
Comment 3 Joshua Schmidlkofer 2003-10-13 22:06:52 UTC
Created attachment 1043 [details]
lspci -vv output
Comment 4 Joshua Schmidlkofer 2003-10-13 22:11:02 UTC
I just noticed that prior to the crash I have a bunch of
connect-debounce messages, and afterwards it re-detects and creates my
multi-card reader.
Comment 5 Joshua Schmidlkofer 2003-10-13 23:51:08 UTC
Created attachment 1044 [details]
Problem in remissions - added 'noapic' to boot cmdline

I booted up with noapic.  Prior to 2.6.0-test1 this would still have problems;
however, so far I have no USB error messages, and ACPI is working fine!   I
don't know what this means entirely, but I am thankful that I don't have an HT
CPU about now =).
Comment 6 Joshua Schmidlkofer 2003-10-13 23:52:09 UTC
Created attachment 1045 [details]
dmesg - with 'noapic'

Here is the dmesg, if anyone needs to review the changes.
Comment 7 Shaohua 2003-10-14 22:20:07 UTC
Please try the patch in bug 1240 first to see if it's a USB bug. If it still
doesn't work, please attach '/proc/interrupts' and the acpidmp output.
Comment 8 Joshua Schmidlkofer 2003-10-15 18:25:36 UTC
The patch from bug 1240 does not apply at all against 2.6.0-test7 - should I be
dropping to a different revision?

Comment 9 Len Brown 2003-10-22 15:09:47 UTC
The 2nd dmesg attachment is a continuation of the 1st, yes? 
By remission, do you mean that "noapic" makes it work, 
but that the original APIC-mode problem still persists? 
/proc/interrupts from PIC mode shows IRQ9 shared: acpi, ohci-hcd 
Can you attach the /proc/interrupts from the APIC mode failure? 
It would be interesting to see if ACPI and USB still share an IRQ, because 
the BIOS specifies that in APIC mode IRQ9 should be Edge Triggered Active High: 
ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x0] trigger[0x0]) 
which _isn't_ conducive to sharing, particularly with an active low PCI interrupt... 
It would also be good to know if this problem persists in 2.6.0-test8, 
and interesting to know if ACPI events, such as pressing the power button, 
result in acpi interrupts being received. 
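
For readers following along, here is a minimal sketch of how the polarity and trigger fields in that INT_SRC_OVR line can be read. It assumes the two-bit flag encoding from the ACPI MADT interrupt source override entry (0 = conforms to the bus, 1 = active high / edge, 3 = active low / level) and that "conforms" on the ISA bus means edge triggered, active high. The function names decode_override and shareable_with_pci are made up for illustration and are not part of any kernel or tool.

```python
import re

# Two-bit MPS INTI flag encodings, as assumed from the ACPI MADT definition.
POLARITY = {0: "conforms", 1: "active high", 2: "reserved", 3: "active low"}
TRIGGER = {0: "conforms", 1: "edge", 2: "reserved", 3: "level"}

# Matches name[value] pairs such as irq[0x9] or bus[0].
FIELD_RE = re.compile(r"(\w+)\[([0-9a-fA-Fx]+)\]")


def decode_override(line):
    """Decode one 'ACPI: INT_SRC_OVR (...)' dmesg line (hypothetical helper)."""
    if "INT_SRC_OVR" not in line:
        raise ValueError("not an INT_SRC_OVR line")
    fields = {name: int(value, 0) for name, value in FIELD_RE.findall(line)}
    polarity = POLARITY[fields["polarity"] & 0x3]
    trigger = TRIGGER[fields["trigger"] & 0x3]
    # Overrides describe ISA sources; the ISA default is edge, active high.
    if polarity == "conforms":
        polarity = "active high"
    if trigger == "conforms":
        trigger = "edge"
    return {
        "irq": fields["irq"],
        "global_irq": fields["global_irq"],
        "polarity": polarity,
        "trigger": trigger,
    }


def shareable_with_pci(decoded):
    """PCI interrupts are level triggered and active low."""
    return decoded["trigger"] == "level" and decoded["polarity"] == "active low"


if __name__ == "__main__":
    line = ("ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] "
            "polarity[0x0] trigger[0x0])")
    info = decode_override(line)
    print(info)
    print("shareable with a PCI interrupt:", shareable_with_pci(info))
```

Run against the line quoted above, this reports IRQ 9 as edge triggered and active high, which is why sharing it with a PCI device looks suspect.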
Comment 10 Joshua Schmidlkofer 2003-10-23 20:40:30 UTC
Created attachment 1154 [details]
Interrupts from the Working Condition

The only change that I made: 
 - cmdline: ro root=/dev/hda7 pci=noacpi
  I added pci=noacpi
Comment 11 Joshua Schmidlkofer 2003-10-23 20:48:27 UTC
Created attachment 1155 [details]
Interrupts After rebooting and getting starts.

This is after starting the system, X, Firebird and Evolution.
Comment 12 Joshua Schmidlkofer 2003-10-23 20:50:17 UTC
Created attachment 1156 [details]
Interrupts After rebooting and then moving the mouse.

This is after everything was running.  Then I moved the mouse a bunch, then I
re-polled the interrupts.
Comment 13 Joshua Schmidlkofer 2003-10-23 20:52:18 UTC
NOTE:  The "Interrupts After" posts are with the following cmdline:
ro root=/dev/hda7
Comment 14 Joshua Schmidlkofer 2003-10-23 20:53:21 UTC
Created attachment 1157 [details]
dmesg with - borked Interrupts.
Comment 15 Joshua Schmidlkofer 2003-10-23 20:59:24 UTC
Created attachment 1158 [details]
dmeg - with pci=noacpi

This is with pci=noacpi, this seems to work great.  I've run since the release
of 2.6.0-test8 without incident.
Comment 16 Len Brown 2003-10-24 03:28:14 UTC
Please boot with acpi=off and attach dmesg and /proc/interrupts 
Please boot with acpi=off noapic, and attach dmesg and /proc/interrupts. 
The MPS/IOAPIC mode results we got above via pci=noacpi look pretty 
much like the PIC IRQ case with the APIC turned on -- and doesn't match 
the ACPI/IOAPIC mapping at all. 
The above should get ACPI completely out of the way (should be the same 
Also, please attach the output of acpidmp so we can take a look at your _PRT 
and the output of dmidecode so we can identify your board and BIOS version. 
I'd like to see your MPS tables too, but off-hand I don't see a utility to dump it, 
maybe hwinfo from Suse? 
Comment 17 Joshua Schmidlkofer 2003-10-24 11:09:47 UTC
Len, will do. This is my home system, so I have to get home, then spend time
with fam.... but I will endeavour to get the requested info tonight.  
Comment 18 Len Brown 2003-10-24 15:56:04 UTC
Created attachment 1188 [details]
print ioapic patch

The fact that acpi on IRQ9 seems to have (always) exactly 6 more interrupts
than USB up on IRQ20, even when a boat load of interrupts are added to
 surely can't be a coincidence.

Please apply this patch to dump out the IOAPIC _after_ it gets programmed by
and attach the dmesg output, need to look at the vectors...
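
The constant-offset observation can be checked mechanically. The following is a rough sketch that parses saved /proc/interrupts text and tests whether two IRQ lines keep the same difference across snapshots while both counts grow. The snapshot contents below are invented for illustration and are not taken from the attachments; the helper names parse_interrupts and tracks_in_lockstep are likewise hypothetical.

```python
def parse_interrupts(text):
    """Return {label: total count} for each row of /proc/interrupts text."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the "CPU0 ..." header row has no colon
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # reached the controller type, e.g. IO-APIC-edge
            total += int(field)  # sum the per-CPU counters
        counts[label.strip()] = total
    return counts


def tracks_in_lockstep(snapshots, irq_a="9", irq_b="20"):
    """True if irq_a - irq_b is identical in every snapshot and irq_b grew."""
    parsed = [parse_interrupts(s) for s in snapshots]
    offsets = {p[irq_a] - p[irq_b] for p in parsed}
    grew = parsed[-1][irq_b] > parsed[0][irq_b]
    return len(offsets) == 1 and grew


# Invented sample data, shaped like a single-CPU /proc/interrupts.
BEFORE = """\
           CPU0
  0:      51234    IO-APIC-edge  timer
  9:        106    IO-APIC-edge  acpi
 20:        100   IO-APIC-level  ohci_hcd
"""
AFTER = """\
           CPU0
  0:      98765    IO-APIC-edge  timer
  9:       5006    IO-APIC-edge  acpi
 20:       5000   IO-APIC-level  ohci_hcd
"""

if __name__ == "__main__":
    print("IRQ9 and IRQ20 move together:", tracks_in_lockstep([BEFORE, AFTER]))
```

Feeding it two real captures, one before and one after wiggling the mouse, would show whether the two counters are still tied.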

Comment 19 Joshua Schmidlkofer 2003-11-11 21:05:47 UTC
Created attachment 1415 [details]
dmesg - as much as it saves, with patch.

dmesg w/ patch
Comment 20 Joshua Schmidlkofer 2003-11-11 21:06:20 UTC
Created attachment 1416 [details]
acpidmp output
Comment 21 Joshua Schmidlkofer 2003-11-11 21:07:21 UTC
Created attachment 1417 [details]
Comment 22 Joshua Schmidlkofer 2003-11-11 21:07:53 UTC
Created attachment 1418 [details]
/proc/interrupts - after hitting power button 3 times.
Comment 23 Joshua Schmidlkofer 2003-11-11 21:08:27 UTC
Created attachment 1419 [details]
cmdline used for boot
Comment 24 Joshua Schmidlkofer 2003-11-11 21:09:25 UTC
I also have the following files that I saved at the time:


Please let me know if you want them.
Comment 25 Joshua Schmidlkofer 2003-11-11 21:26:07 UTC
Created attachment 1420 [details]
dmesg with cmdline: acpi=off noapic
Comment 26 Joshua Schmidlkofer 2003-11-11 21:26:30 UTC
Created attachment 1421 [details]
/proc/interrupts with cmdline: acpi=off noapic
Comment 27 Joshua Schmidlkofer 2003-11-11 21:27:06 UTC
Created attachment 1422 [details]
dmesg with cmdline: acpi=off
Comment 28 Joshua Schmidlkofer 2003-11-11 21:27:40 UTC
Created attachment 1423 [details]
/proc/interrupts with cmdline: acpi=off
Comment 29 Len Brown 2003-11-30 21:08:58 UTC
> ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x0]
> trigger[0x0])  
Please try the two patches in bug #1351, attach the resulting dmesg and /proc/interrupts 
and report on if ACPI events and USB interrupts work.  This should address the 
polarity and trigger issue on IRQ9.  I don't know if it will also address the mysterious 
USB vs ACPI tying -- though this is a root cause and that may have been a decoy. 
Comment 30 Joshua Schmidlkofer 2003-12-01 11:48:30 UTC
Created attachment 1581 [details]
2.6.0-test11 + patch from 1351 - ACPI/USB STILL BORKED

   I did this this morning, and right after boot ACPI and USB are still 1-1,
the same (each at 196 in the interrupt count).  I did not test the power button,
etc.  I had to go to work.
Comment 31 Joshua Schmidlkofer 2003-12-01 11:48:55 UTC
Created attachment 1582 [details]
INTERRUPTS - 2.6.0-test11 + patch from 1351- ACPI/USB STILL BORKED
Comment 32 Len Brown 2003-12-01 14:49:57 UTC
Looks like the 1st patch did its thing: 
-  9:       3418    IO-APIC-edge  acpi 
+ 9:        261   IO-APIC-level  acpi 
But still this: 
 20:        261   IO-APIC-level  ohci_hcd 
I assume that irq9 still gets disabled when you wiggle the mouse enough? 
how about a couple of ACPI button presses in there -- do they register? 
the 2nd patch is a link in the text to the print_IO_APIC fix 
Comment 33 Joshua Schmidlkofer 2004-02-04 22:39:21 UTC
Created attachment 2021 [details]
2.6.2 - Behaviour changed, still broken.

2.6.2 is still broken.  Now the eth0 is sharing an interrupt with AGP.  This
happens w/ or w/o the NVidia driver.  ACPI no longer generates false
interrupts.  I get ACPI errors when loading the ohci-hcd module.
Comment 34 Joshua Schmidlkofer 2004-02-04 22:40:38 UTC
Created attachment 2022 [details]
dmesg output from 2.6.2 w/ pci=noacpi

Note: This is the boot when I used pci=noacpi.  Another attachment follows with
no such line.  w/ Full ACPI, I get no Ethernet, and w/ pci=noacpi I have
identical behaviour.
Comment 35 Joshua Schmidlkofer 2004-02-04 22:53:57 UTC
Created attachment 2023 [details]
dmesg output w/ fullacpi
Comment 36 Joshua Schmidlkofer 2004-02-04 22:55:12 UTC
Created attachment 2024 [details]
Interrupts w/ fullacpi

Here are the interrupts - note: no IRQs are apparently being delivered to the
NIC, or it's not answering.  If I send pings I just get tx errors.
Comment 37 Len Brown 2004-02-07 18:12:13 UTC
So as of 2.6.2 the ACPI SCI is no longer "tied" to USB? 
Can you clarify exactly what is not working at this point? 
Is the "nobody cared!" message gone, or does it come back someplace else? 
Unclear why you attached the /proc/cpuinfo -- did i miss something? 
Note that in 2.6.2, you have the boot parameter "acpi_irq_nobalance" 
to tell ACPI not to move interrupts around.  It might be interesting to 
compare /proc/interrupts with and without booting with this flag, 
and also to see if the error moves. 
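
One way to do that comparison is sketched below: build a device-to-IRQ map from each saved /proc/interrupts and list the devices whose IRQ differs between the two boots. The two tables are invented sample data, not this system's output, and device_irqs and moved_devices are hypothetical helper names.

```python
def device_irqs(text):
    """Map each device name to the set of numeric IRQs it appears on."""
    mapping = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skip the CPU header and rows like NMI/LOC/ERR
        fields = rest.split()
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1  # step over the per-CPU counters
        # fields[i] is the controller type; the rest is the device list
        for dev in " ".join(fields[i + 1:]).split(","):
            dev = dev.strip()
            if dev:
                mapping.setdefault(dev, set()).add(int(label))
    return mapping


def moved_devices(text_a, text_b):
    """Return {device: (irqs_in_a, irqs_in_b)} for devices that changed IRQ."""
    a, b = device_irqs(text_a), device_irqs(text_b)
    return {
        dev: (sorted(a[dev]), sorted(b[dev]))
        for dev in sorted(a.keys() & b.keys())
        if a[dev] != b[dev]
    }


# Invented sample data for two boots of the same machine.
DEFAULT_BOOT = """\
           CPU0
  0:      40000          XT-PIC  timer
  9:         12          XT-PIC  acpi
 11:       3000          XT-PIC  eth0, nvidia
NMI:          0
"""
NOBALANCE_BOOT = """\
           CPU0
  0:      41000          XT-PIC  timer
  9:         15          XT-PIC  acpi
 10:       1200          XT-PIC  eth0
 11:       1900          XT-PIC  nvidia
NMI:          0
"""

if __name__ == "__main__":
    for dev, (old, new) in moved_devices(DEFAULT_BOOT, NOBALANCE_BOOT).items():
        print(f"{dev}: IRQ {old} -> {new}")
```

Devices that appear in only one of the two tables are ignored here; they would need a separate check.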
Comment 38 Len Brown 2004-02-07 18:14:36 UTC
Created attachment 2058 [details]
test patch for ACPI interrupt over-ride

while your failure symptom no longer seems to involve the ACPI SCI, please
apply this patch to fix a known problem with the ACPI SCI -- as it is possible
that it is a side-effect that is troubling your system.
Comment 39 Joshua Schmidlkofer 2004-03-03 21:36:56 UTC
Wahoo!!  All the remaining problems are problems with devfs!!! 

I migrated to 100% udev, and all the problems are gone.  Interrupt routing was
not the issue with 2.6.3. #&^#$&^%^@# devfs was.  On a completely different box I
ran into exactly the same problems, but I was about to reinstall anyway; after the
re-install all the problems were gone.  I quickly discovered that the only
difference was devfs, so I migrated my Gentoo box to udev, and boom. Problems gone.