Bug 1581

Summary: disabled PCI Interrupt Link devices.
Product: ACPI
Reporter: Luming Yu (luming.yu)
Component: Config-Interrupts
Assignee: Len Brown (lenb)
Status: CLOSED CODE_FIX
Severity: high
CC: acpi-bugzilla, andi-bz, ccheney, jkohen, lenb, pcnet32, tony
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.0-test9
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: a patch for fixing this issue
same patch -- ported to 2.4.23 and newer 2.6.0
x86_64 VIA chipset IOAPIC fix
debug patch against 2.6.5
eMachines M6807 - 2.6.6-rc2-bk3 + patch - dmesg output
eMachines M6807 - 2.6.6-rc2-bk3 + patch - lspci output
eMachines M6807 - 2.6.6-rc2-bk3 + patch - /proc/interrupts output
updated 2.6.5 debug patch
updated 2.6.5 debug patch
proposed 2.6.5 patch
m6805 hangs with this patch

Description Luming Yu 2003-11-24 01:15:10 UTC
The current implementation simply adds disabled link devices to the acpi_link
list. Such a link device can break acpi_pci_link_allocate, because that function
assumes every link device is a valid one.
Comment 1 Luming Yu 2003-11-24 01:22:45 UTC
Created attachment 1514 [details]
a patch for fixing this issue
Comment 2 Len Brown 2003-12-01 22:52:53 UTC
Created attachment 1591 [details]
same patch -- ported to 2.4.23 and newer 2.6.0
Comment 3 Len Brown 2003-12-22 21:54:14 UTC
One would think this was an obvious fix, but it isn't so simple -- 
we may actually want to do the opposite of the suggestion in this patch, 
to ignore that links are disabled rather than pay closer attention to them 
being disabled. 
 
Experimenting with 2.4.23 on my Intel 440GX with acpi=force and noapic... 
Where device 00:0c.0 is the on-board SCSI, which is covered by link "PRQ3". 
 
pci_link-0405 [20] acpi_pci_link_set     : Link disabled 
ACPI: Unable to set IRQ for PCI Interrupt Link [PRQ3] (likely buggy ACPI BIOS). Aborting 
ACPI-based IRQ routing. Try pci=noacpi or acpi=off 
 pci_irq-0266 [17] acpi_pci_irq_lookup   : Invalid IRQ link routing entry 
 pci_irq-0305 [17] acpi_pci_irq_derive   : Unable to derive IRQ for device 00:0c.0 
PCI: No IRQ known for interrupt pin A of device 00:0c.0 
 
Then the system actually runs properly with the device on IRQ11. 
 
Apply a patch to ignore disabled links: 
 
pci_link-0605 [12] acpi_pci_link_get_irq : Invalid link context 
 pci_irq-0266 [11] acpi_pci_irq_lookup   : Invalid IRQ link routing entry 
 pci_irq-0305 [11] acpi_pci_irq_derive   : Unable to derive IRQ for device 00:0c.0 
PCI: No IRQ known for interrupt pin A of device 00:0c.0 
 
So we followed a _PRT entry to a link device that does not exist. 
 
Do the reverse and apply a patch to acpi_pci_link_set to proceed in the 
face of disabled links: 
 
and use these options to try to program the PIRQ off the default of 11: 
Kernel command line: root=/dev/sda2 console=tty0 console=ttyS0,115200n8 acpi=force noapic 
acpi_irq_balance acpi_irq_isa=11 
 
pci_link-0405 [20] acpi_pci_link_set     : Link disabled 
pci_link-0407 [20] acpi_pci_link_set     : but continuing anyway 
pci_link-0292 [21] acpi_pci_link_try_get_: No active IRQ resource found 
_CRS returns NULL! Using IRQ 10 fordevice (PCI Interrupt Link [PRQ3]). 
ACPI: PCI Interrupt Link [PRQ3] enabled at IRQ 10 
 
and the system successfully routes the SCSI to IRQ10 and runs properly. 
 
The fact that when ACPI couldn't get an IRQ for SCSI it worked at IRQ11 anyway 
suggests that our derive function is broken.  The fact that ACPI successfully programs 
this IRQ when the link is disabled suggests that not only should we not ignore disabled 
link devices, it might be useful to allow programming them when they're referenced 
by an active _PRT entry. 
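
The policy suggested above can be sketched as a small Python model. This is only
an illustration, not the kernel code: the Link class, its fields, and pick_irq
are hypothetical names, the possible-IRQ list for PRQ3 is made up for the
example, and taking the first possible IRQ is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    """Hypothetical model of a PCI Interrupt Link device."""
    name: str
    enabled: bool                 # what the BIOS reports for the link
    current_irq: Optional[int]    # what _CRS reports; None if nothing is active
    possible_irqs: List[int]      # what _PRS reports


def pick_irq(link: Link, referenced_by_prt: bool) -> Optional[int]:
    """Return the IRQ to program, or None to leave the link alone."""
    if not referenced_by_prt:
        # Nothing routes through this link, so there is nothing to program.
        return None
    if link.current_irq is not None:
        return link.current_irq
    # Disabled or not, an active _PRT entry points here, so try a possible
    # IRQ (assumption: simply take the first one listed).
    return link.possible_irqs[0] if link.possible_irqs else None


# Illustrative values only; the real _PRS list for PRQ3 is not in this report.
prq3 = Link("PRQ3", enabled=False, current_irq=None, possible_irqs=[10, 11])
print(pick_irq(prq3, referenced_by_prt=True))
```

Note that the enabled flag is deliberately not consulted: the decision rests on
whether a _PRT entry references the link.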
 
Comment 4 Tony Lindgren 2004-03-25 16:28:58 UTC
Created attachment 2401 [details]
x86_64 VIA chipset IOAPIC fix

The attached patch is needed on x86_64 machines based on VIA chipset. It fixes
ACPI bug 2090, but may need modifications to be safe on other systems.

On x86_64, apic still needs to be specified in the kernel cmdline:

root=/dev/hda3 ro psmouse.proto=imps apic console=tty0

And then cat /proc/interrupts shows:

 0:	 70843	  IO-APIC-edge	timer
 1:	     9	  IO-APIC-edge	i8042
 2:	     0		XT-PIC	cascade
 8:	     0	  IO-APIC-edge	rtc
10:	     0	 IO-APIC-level	acpi
12:	    44	  IO-APIC-edge	i8042
14:	  2734	  IO-APIC-edge	ide0
15:	    19	  IO-APIC-edge	ide1
17:	     0	 IO-APIC-level	yenta
18:	     0	 IO-APIC-level	eth0
21:	   565	 IO-APIC-level	ehci_hcd, uhci_hcd, uhci_hcd, uhci_hcd
22:	     0	 IO-APIC-level	VIA8233
23:	     6	 IO-APIC-level	eth1
NMI:	     12
LOC:	  70752
ERR:	      0
MIS:	      0

And things are just working :)
Comment 5 Andi Kleen 2004-03-31 03:34:56 UTC
Len, what's the status of Tony's link patch? Is it going into mainline any time
soon?

I would like to have some solution for the eMachines laptop
Comment 6 Len Brown 2004-04-01 21:33:12 UTC
Another example of the disabled PCI link device issue from Don Fry: 
 
ACPI: PCI Interrupt Link [LMVI] (IRQs 18) 
ACPI: Unable to set IRQ for PCI Interrupt Link [LMVI] to 18 (likely buggy ACPI BIOS). Try 
pci=noacpi or acpi=off 
ACPI: No IRQ known for interrupt pin A of device 0000:00:06.0 - using IRQ 255 
 
00:06.0 VGA compatible controller: S3 Inc. Savage 4 (rev 04) (prog-if 00 [VGA]) 
        Subsystem: IBM: Unknown device 01c5 
        Flags: bus master, medium devsel, latency 248, IRQ 255 
        Memory at feb00000 (32-bit, non-prefetchable) [size=512K] 
        Memory at f0000000 (32-bit, prefetchable) [size=128M] 
        Expansion ROM at <unassigned> [disabled] [size=64K] 
        Capabilities: [dc] Power Management version 1 
 
Certainly confusing to users to have messages about disabled links on the console... 
 
Comment 7 Len Brown 2004-04-23 15:46:44 UTC
Created attachment 2677 [details]
debug patch against 2.6.5

please apply this debug patch to 2.6.5 and attach the
resulting dmesg and /proc/interrupts.

I've now found in practice all 4 combinations of enabled vs. functional _CRS,
so it is clear that we can't rely on the _STA enabled bit to mean anything.
Comment 8 Chris Cheney 2004-04-24 22:44:35 UTC
Len,

This patch seems to fix most of the problems I reported in bug #2090; I tried 
it with 2.6.6-rc2-bk3. The one remaining issue I see is that it still doesn't 
use the correct IRQ for via-rhine. In WinXP the IRQ is 23, but with the ACPI 
patch the IRQ gets set to 11, which doesn't work. I seem to recall a bug report 
about via-rhine and IRQs before, but I couldn't locate it in bugzilla.

Chris 
Comment 9 Chris Cheney 2004-04-24 22:50:43 UTC
Created attachment 2690 [details]
eMachines M6807 - 2.6.6-rc2-bk3 + patch - dmesg output
Comment 10 Chris Cheney 2004-04-24 22:52:01 UTC
Created attachment 2691 [details]
eMachines M6807 - 2.6.6-rc2-bk3 + patch - lspci output
Comment 11 Chris Cheney 2004-04-24 22:52:44 UTC
Created attachment 2692 [details]
eMachines M6807 - 2.6.6-rc2-bk3 + patch - /proc/interrupts output
Comment 12 Chris Cheney 2004-04-24 23:07:13 UTC
Oh yeah, I also noticed that the VIA IDE PCI device got assigned IRQ 23. 
I'm not sure if that was actually supposed to happen, since normally it 
would be 14/15, right?
Comment 13 Len Brown 2004-04-25 21:39:05 UTC
Re: eMachines M6807 VIA Rhine eth0 dead on IRQ11 instead of IRQ23 
 
lspci: 
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 74) 
        Interrupt: pin A routed to IRQ 11 
 
DSDT: 
Name (APIC, Package (0x15) { ... 
Package (0x04) { 0x0012FFFF, 0x00, \_SB.PCI0.PIB.ALKB, 0x00 }, 
 
Device (ALKB)  {... 
Method (_SRS, 1, NotSerialized) { } 
 
No editing error there, the Set Resource Setting method for ALKB used by eth0 
is a NOP. 
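
For readers unfamiliar with _PRT packages, here is a rough Python sketch of how
the four fields of the entry above break down. The field layout (high word of
the address is the PCI device, low word 0xFFFF means any function, pin 0 to 3
maps to INTA to INTD) follows the ACPI specification; the function and
dictionary key names are made up for illustration.

```python
PIN_NAMES = ["INTA", "INTB", "INTC", "INTD"]


def decode_prt_entry(address, pin, source, source_index):
    """Split a _PRT package into readable fields."""
    device = (address >> 16) & 0xFFFF   # high word: PCI device number
    function = address & 0xFFFF         # low word: 0xFFFF means any function
    return {
        "device": device,
        "function": "any" if function == 0xFFFF else function,
        "pin": PIN_NAMES[pin],
        "source": source,               # link device that routes this pin
        "source_index": source_index,
    }


entry = decode_prt_entry(0x0012FFFF, 0x00, r"\_SB.PCI0.PIB.ALKB", 0x00)
print(entry)
```

Device 0x12 with pin INTA matches the lspci line for 00:12.0, so this is the
entry that routes eth0 through ALKB.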
 
dmesg: 
ACPI: PCI Interrupt Link [ALKB] (IRQs 23) <*11>, disabled. 
 
BIOS bug #1 is that it didn't mark ALKB as enabled in its _STA method. 
BIOS bug #2 is that it returned _CRS (current setting) 11, 
while at the same time it returned _PRS (possible settings)  23. 
One of 'em must be incorrect, which one? 
 
Linux uses the _CRS value (11), we check that it worked 
and _CRS still returns 11, so we believe that we succeeded: 
We got burnt by BIOS bug #2 and the result is a dead ethernet. 
 
I believe that ignoring that _CRS is not in _PRS is a workaround 
for another broken system.  Can't have it both ways, but it seems 
that VIA has a history of an unreliable _CRS... 
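
To illustrate why the read-back check is fooled here, a toy Python model of a
link whose _SRS is empty. NopLink and set_and_verify are hypothetical names for
this sketch and do not correspond to kernel functions.

```python
class NopLink:
    """Hypothetical link whose _SRS does nothing, like ALKB in the DSDT."""

    def __init__(self, crs_irq):
        self._crs_irq = crs_irq

    def srs(self, irq):
        pass  # empty method: the request is silently dropped

    def crs(self):
        return self._crs_irq


def set_and_verify(link, irq):
    """Program the link, then read _CRS back to see whether it stuck."""
    link.srs(irq)
    return link.crs() == irq


alkb = NopLink(crs_irq=11)
# Asking for the value _CRS already reports always "succeeds",
# even though nothing was programmed.
print(set_and_verify(alkb, 11))
# Asking for the only value _PRS lists exposes the mismatch.
print(set_and_verify(alkb, 23))
```

In other words, a successful read-back only proves that _CRS is self-consistent,
not that the hardware is routed to that IRQ.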
 
pci_link-0423 [31] acpi_pci_link_set     : Set IRQ 11 
ACPI: PCI Interrupt Link [ALKB] enabled at IRQ 11 
IOAPIC[0]: Set PCI routing entry (1-11 -> 0x81 -> IRQ 11 Mode:1 Active:1) 
00:00:11[B] -> 1-11 -> IRQ 11 
 
Re: IDE 
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT8233/A/C/VT8235 
        Interrupt: pin A routed to IRQ 23 
 
 Package (0x04) { 0x0011FFFF, 0x00, \_SB.PCI0.PIB.ALKA, 0x00 }, 
 
ACPI: PCI Interrupt Link [ALKA] (IRQs 16 17 18 19 20 21 22 23) <*9>, disabled. 
LENB: extended resource for 23 
LENB: setting disabled link 
pci_link-0416 [29] acpi_pci_link_set     : Attempt to enable at IRQ 23 resulted in IRQ 9, using 23 
pci_link-0423 [29] acpi_pci_link_set     : Set IRQ 23 
ACPI: PCI Interrupt Link [ALKA] enabled at IRQ 23 
IOAPIC[0]: Set PCI routing entry (1-23 -> 0xb1 -> IRQ 23 Mode:1 Active:1) 
00:00:11[A] -> 1-23 -> IRQ 23 
 
BIOS bug #3, IDE uses ALKA, which is not only disabled 
and returns _CRS=9 that is not in _PRS, 
but doesn't even support IRQ14 where IDE really lives. 
IDE is saved by a combination of bugs: 
ALKA _SRS is a big NO-OP, so it doesn't matter we tried to set it to 23. 
IDE driver is hard-coded to IRQ14 and ignores what we tell it. 
 
So what becomes of the devices actually on ALKA?: 
 
LENB: extended resource for 23 
LENB: setting disabled link 
pci_link-0416 [29] acpi_pci_link_set     : Attempt to enable at IRQ 23 resulted in IRQ 9, using 23 
pci_link-0423 [29] acpi_pci_link_set     : Set IRQ 23 
ACPI: PCI Interrupt Link [ALKA] enabled at IRQ 23 
IOAPIC[0]: Set PCI routing entry (1-23 -> 0xb1 -> IRQ 23 Mode:1 Active:1) 
00:00:11[A] -> 1-23 -> IRQ 23 
 
We tell them that they're on 23 because we did a _SRS on 23 -- 
even though _CRS still returns 9.  Which method tells the truth? 
Apparently we escape unscathed because IDE is the only 
device on device 11 pinA. 
 
Please verify that this system is running the latest BIOS, 
because this sure doesn't look production quality. 
 
Also, please verify that the ACPI interrupt is working by seeing 
if it responds to your power button.  (disable acpid first to avoid 
a system shutdown). 
 
Comment 14 Len Brown 2004-04-25 22:38:22 UTC
Created attachment 2707 [details]
updated 2.6.5 debug patch

please test this updated 2.6.5 debug patch.
When the IRQs are being re-programmed anyway (IOAPIC mode),
it verifies that _CRS is a member of _PRS before programming.
This should fix the Rhine on the M6807, hopefully it doesn't break anybody
else.
Comment 15 Len Brown 2004-04-26 00:36:30 UTC
Created attachment 2710 [details]
updated 2.6.5 debug patch

Improved debug patch.  This version validates _CRS against _PRS
for both PIC and IOAPIC mode.
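
The validation described here might look roughly like the Python sketch below.
It is not the attached patch: choose_link_irq is a hypothetical name, and
falling back to the highest-numbered _PRS entry is an assumption chosen because
it matches the "using 23" outcome in the log quoted in comment 13.

```python
from typing import List, Optional


def choose_link_irq(crs_irq: Optional[int], prs_irqs: List[int]) -> Optional[int]:
    """Pick an IRQ for a link, trusting _CRS only if _PRS also lists it."""
    if crs_irq is not None and crs_irq in prs_irqs:
        return crs_irq
    # _CRS is missing or not a legal setting: fall back to a _PRS entry.
    # Assumption for this sketch: take the last one listed; the real code
    # may weigh the candidates differently.
    return prs_irqs[-1] if prs_irqs else None


# Values taken from the dmesg lines quoted in comment 13.
print(choose_link_irq(11, [23]))                             # ALKB
print(choose_link_irq(9, [16, 17, 18, 19, 20, 21, 22, 23]))  # ALKA
```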
Comment 16 Len Brown 2004-04-30 22:31:44 UTC
Created attachment 2767 [details]
proposed 2.6.5 patch

checked this version into the acpi-test tree
Comment 17 Chris Cheney 2004-05-01 16:45:31 UTC
The proposed patch works for me on the eMachines M6807 with the via-rhine as 
well. Thanks! :)
Comment 18 Tony Lindgren 2004-05-04 17:58:10 UTC
Created attachment 2790 [details]
m6805 hangs with this patch

Still some problems on my m6805 laptop. Sorry I could not try this patch
earlier.

System hangs when loading processor.ko if via-rhine is loaded. System also 
hangs during boot if ACPI processor module is compiled in.

ACPI button seems to work now though, at least the machine does not power off
immediately after pressing it.

Poweroff command does not work, just says acpi_power_off.
Comment 19 Chris Cheney 2004-05-04 18:24:32 UTC
Tony, 
 
Did you remember to apply the latest patch from bug #2090 if you are running 
in x86-64 mode? It is still needed afaik. I was running in x86 mode when doing 
my testing. 
 
Chris 
Comment 20 Tony Lindgren 2004-05-04 19:40:47 UTC
Thanks for the tip, but the patch from bug #2090 does not help either.

System still hangs when loading processor module. Yes, I'm running in x86_64 mode.

Tony
Comment 21 Javier Kohen 2004-05-06 01:15:08 UTC
I tried this patch on kernel 2.6.6-rc3 with and without additionally using the
patch supplied for bug 2090, but when I boot with both ioapic and acpi enabled
the CD-ROM drive bundled with the eMachines M6805 is not correctly configured.
The laptop hangs when the ide-cd module is loaded, telling that interrupts are
being lost.

Previously, when the initial information regarding ide interfaces is displayed,
I see it can detect the harddisk and the cd-rom properly, but even then I get
lost interrupts for hdc and hdd (hdc seems to be the interface where the CD unit
is plugged). The error message reads: "hdc: IRQ probe failed (0xbafa)" (idem
with hdd).

Passing the "pci=noacpi noapic" parameters leaves me with a so far working
system, but I don't seem to need either of the aforementioned patches in that case.

I'm not using the processor module (I removed the file, just in case), and I'm
running a 32-bit kernel.
Comment 22 Javier Kohen 2004-05-07 08:52:54 UTC
Also, I don't know if it's related to the ACPI bug, but when I close the
notebook lid it produces a constant flow of the following message and becomes
unusable (the CPU usage scales to 100%):
evgpe-0403: *** Error: acpi_ev_gpe_dispatch: Unable to queue handler for GPE[ B], event is disabled

I'm still using "pci=noacpi", and the noapic parameter doesn't seem to have an
effect on the outcome.
Comment 23 Javier Kohen 2004-05-07 22:49:56 UTC
Sorry, comment #21 is invalid. I thought I had IO-APIC enabled on the kernel,
but it turns out that I didn't. Now ACPI/APIC work.
Comment #22 is still valid, with and without an acpid daemon.
Comment 24 Javier Kohen 2004-05-07 23:51:32 UTC
From http://www.muru.com/linux/amd64/:

"Pavel Machek's patch for fix PCMCIA with ACPI here
(http://www.muru.com/linux/amd64/patches/patch-m680x-pcmcia-acpi-fix).
This fixes the problem where plugging/unplugging the power cord 
with yenta_socket hangs the machine."

This additionally fixed the lid issue I wrote about in comment #22.
Comment 25 Chris Cheney 2004-05-08 01:11:52 UTC
For me yenta socket locks up even when power state isn't changed. The issue is 
that the ACPI registers are located in the 0x4000-0x407F range which Linux 
tries to use for cardbus. The pcmcia patch is useful as a temporary solution 
but I have filed bug #2641 to get the i/o port 0x4000 issue 
resolved properly. It appears that WinXP somehow knows which ports to reserve 
for ACPI, whereas under Linux they get trampled on. My current thought is that 
WinXP uses some combination of the FADT and DSDT tables to reserve the ports. 
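
The conflict amounts to two I/O port ranges overlapping. A minimal Python check,
assuming inclusive ranges; the CardBus window values below are hypothetical and
only there to show the test, since the actual window Linux picks is not given
here.

```python
def ranges_overlap(start_a, end_a, start_b, end_b):
    """True if two inclusive I/O port ranges share at least one port."""
    return start_a <= end_b and start_b <= end_a


ACPI_PORTS = (0x4000, 0x407F)       # range given in the comment above
# Hypothetical CardBus I/O window, chosen only for illustration.
CARDBUS_WINDOW = (0x4000, 0x40FF)

print(ranges_overlap(*ACPI_PORTS, *CARDBUS_WINDOW))
```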
Comment 26 Len Brown 2004-05-11 23:20:12 UTC
closing -- the disabled link patch shipped in 2.6.6, and is on top of 2.4.27-pre2. 
if these boxes still have other problems, open other bugs if you haven't already. 
 
thanks, 
-Len