Bug 2874
Summary: | VIA: IRQ 11 (usb) and IRQ 19 (ethernet) tied together - Via Apollo Pro 266 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Stian Jordet (stian_web) |
Component: | Config-Interrupts | Assignee: | ykzhao (yakui.zhao) |
Status: | REJECTED WILL_NOT_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, sergio, shaohua.li, stern |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.7-rc3 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
dmesg from 2.6.7-rc3 with acpi
interrupts from 2.6.7-rc3 with acpi
lspci -v
dsdt
lspci -vv
IRQ's in Windows 2000
dmesg with acpi=off
nterrupts with acpi=off
dmesg from 2.6.19
Patch that fixes the issue
test the kernel using the workaround patch
dmesg from kernel 2.6.22.6 with debug patch
/proc/interrupts from 2.6.22.6 with debug patch
lspci -xxx
dmesg
interrupts
dmesg using workaround patch in comment #29
interrupts using workaround patch in comment #29
dmesg using noapic
interrupts using noapic
dmesg using acpi_irq_nobalance
interrupts using acpi_irq_nobalance |
Description
Stian Jordet
2004-06-12 07:02:19 UTC
Created attachment 3145 [details]
dmesg from 2.6.7-rc3 with acpi
Created attachment 3146 [details]
interrupts from 2.6.7-rc3 with acpi
Created attachment 3147 [details]
lspci -v
Created attachment 3148 [details]
dsdt
Created attachment 3149 [details]
lspci -vv
Just found out that it is lspci -vv you need. Sorry.
Uhh, I just realised that my interrupts do not match my summary for this bug. The reason is that I made the dmesg and interrupts with the patch from bug #2574, which moved my usb interrupt from 9 to 11. Sorry about that.

/proc/interrupts shows USB getting more interrupts than eth0.
Is this due to devices plugged into USB?
Does USB track eth0 1:1 if you have no USB devices plugged in?
Since you're using the patch from the previous bug report,
all the links are disabled except LNKD on IRQ11.
There is 1 use of LNKD in an APIC mode _PRT:
Package (0x04) { 0x0011FFFF, 0x03, \_SB.LNKD, 0x00 } })
0000:00:11.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 18) (prog-if 00 [UHCI]) pinD
0000:00:11.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 18) (prog-if 00 [UHCI]) pinD
0000:00:11.4 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 18) (prog-if 00 [UHCI]) pinD
eth0 is here:
0000:00:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08) pinA
which uses LNKD in PIC mode, but is hard-coded to IRQ19 in IOAPIC mode.
0000:00:0c.0 Multimedia controller: Philips Semiconductors SAA7134 (rev 01) pinA
also uses LNKD in PIC mode and IRQ19 in IOAPIC mode --
though I don't see a driver loaded for this device.
---
Can you simplify the system and still see the problem?
ie. disable all optional devices in the BIOS and exclude
the drivers from the kernel except usb and eth0?
I'm sort of curious where the yenta devices hang...
Also, if you boot with acpi=off, does MPS configure
the interrupts the same way as ACPI? Do you see
the problem in that case?

>/proc/interrupts shows USB getting more interrupts than eth0.
>Is this due to devices plugged into USB?
>Does USB track eth0 1:1 if you have no USB devices plugged in?
No... USB seems to get some more. With no usb devices I have after a while with network activity:
usb: 107456
eth: 77301
>since you're using the patch from the previous bug report,
>all the links are disabled except LNKD on IRQ11.
>
>There is 1 use of LNKD in an APIC mode _PRT:
>Package (0x04) { 0x0011FFFF, 0x03, \_SB.LNKD, 0x00 } })
>
>0000:00:11.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 18) (prog-if 00 [UHCI]) pinD
>0000:00:11.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 18) (prog-if 00 [UHCI]) pinD
>0000:00:11.4 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 18) (prog-if 00 [UHCI]) pinD
>
>eth0 is here:
>0000:00:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08) pinA
>which uses LNKD in PIC mode, but is hard-coded to IRQ19 in IOAPIC mode.
>
>0000:00:0c.0 Multimedia controller: Philips Semiconductors SAA7134 (rev 01) pinA
>also uses LNKD in PIC mode and IRQ19 in IOAPIC mode --
>though I don't see a driver loaded for this device.
The driver is called saa7134 and is indeed loaded (grep dmesg for saa7134). And if that's not enough, I also get these lines:
saa7134[0]/irq[10,-257154]: r=0x20 s=0x00 PE
saa7134[0]/irq: looping -- clearing enable bits
_when_ network is initialized.
>Can you simplify the system and still see the problem?
>ie. disable all optional devices in the BIOS and exclude
>the drivers from the kernel except usb and eth0?
>I'm sort of curious where the yenta devices hang...
Yes, I removed every expansion card from my system, except VGA, and I disabled everything in BIOS except SCSI (since that's where the root-disk is), and the problem was still there. Do you want to see dmesg and/or interrupts from that scenario? There was no change (except for all the missing devices..)
>Also, if you boot with acpi=off, does MPS configure
>the interrupts the same way as ACPI? Do you see
>the problem in that case?
This system has never booted in MPS mode. Not with kernel 2.4 either, and that has caused me some grief with certain Linux installers and stuff... It hangs after these three lines:
ENABLING IO-APIC IRQs
Setting 2 in the phys_id_present_map
...changing IO-APIC Physical APIC ID to 2 ... ok.
So I have no idea whether I get the problems in MPS-mode...
My girlfriend's pc has exactly the same behaviour as mine. That is a VIA KLE133 socket A chipset from Microstar. So it seems this is VIA specific, and not just a broken BIOS on my computer... Do you want dmesg and interrupts from her pc as well?
Hope this helps...

> saa7134[0]/irq[10,-257154]: r=0x20 s=0x00 PE
> saa7134[0]/irq: looping -- clearing enable bits
This doesn't look healthy. Is it possible to disable
this device (and the others) in the BIOS?
Maybe some clues can be gleaned by loading w/ debug. eg
# modprobe saa7134 irq_debug=2
What is the relationship between the saa7134 and eth0?
saa7134 complains when eth0 is configured, but doesn't complain otherwise?
Do the spurious usb interrupts still happen if saa7134 is configured
but eth0 is disabled?
Well, as I wrote in my last comment, I still get spurious interrupts without the saa7134 card or yenta adapter inserted, so I doubt it has anything to do with them. Btw. saa7134 is a tv-card. It's working fine, even with those messages. eth0 and saa7134 have no relation, except they are sharing the same irq.. I'll try to compile saa7134 as a module and load it with irq-debug later today...
As I said, my girlfriend's pc has the same symptoms, and she doesn't have one single pci-card in her pc.
Thanks for looking into this :)

Created attachment 3493 [details]
IRQ's in Windows 2000
Here are the interrupts from Windows 2000. As you can see, the usb interrupts
are on another irq here. I don't know if that matters.
Anyway, I hope this sheds some light on this issue.
Is it possible to force usb-irq to 9 (since that's what Windows XP uses) and see if it gets better then?

I finally found out what causes the pc to not boot when booting in IO-APIC mode. In IO-APIC mode usb does not work (just like my other VIA-board in IO-APIC mode...), but usb gets IRQ 19, the same as eth0 (which might explain why they are tied together?) I'll attach dmesg and interrupts.

Please look at this bug again.

Created attachment 3693 [details]
dmesg with acpi=off
Created attachment 3694 [details]
nterrupts with acpi=off
same with 2.6.9?

Yes, sorry.. Also with latest 2.6.10-rc1-bk.

As interrupt wires are assigned to slots rather than devices... Is it possible to modify the configuration and move devices between slots to see if the issues stay with the slot or move with the device?

Cool that you have not yet forgotten me :) Happy new year, btw.
Since the problem is with usb and ethernet, and both these devices are integrated, it's not possible to move them, sorry. Anyway, I have earlier tried to remove every add-in card from the pc, and it doesn't change anything. And my girlfriend's pc (which has the same problem, as I've told earlier) doesn't have any add-in cards either.

Hi again,
I just wonder, would it help if I open a new bug with information from my girlfriend's pc? It is still a VIA chipset, but not Apollo Pro. It's an AMD cpu and UP. Maybe that would make it easier to see, if you get a new view of things? :) It exhibits the exact same behavior.
Best regards,
Stian

Still the same with 2.6.18-rc2...

Len; will it help if I ship my motherboard to you for testing? :) I will need it back, but if you think it will help to have it yourself, I'd be more than willing to pay shipping both ways.

Hmm. Could this be the problem:
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt 0000:00:11.2[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
LNKD is enabled at irq 9, still it gets irq 11? And it makes sense if that's the problem, since Windows has usb on irq 9. Is there a way to force usb to irq 9, at least for testing?

dmesg of kernel 2.6.18-rc6
http://lkml.org/lkml/2006/9/10/120
drivers/acpi/pci_link.c, line #583
    acpi_irq_penalty[link->irq.active] += PIRQ_PENALTY_PCI_USING;
    printk(PREFIX "%s [%s] enabled at IRQ %d\n",
           acpi_device_name(link->device),
           acpi_device_bid(link->device), link->irq.active);
bug #2243 may have some clues

>ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
>ACPI: PCI Interrupt 0000:00:11.2[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11
I wonder how this could happen. irq 9 is in LNKD's possible irq list, so
the kernel should not change it. Did you get the output from the stock kernel, or did
you patch it? Can you try the latest kernel?
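The mismatch between the two quoted dmesg lines can be checked mechanically. The following is a minimal sketch, not part of the kernel or of any patch in this report: it assumes the line formats shown above, and the function names (`bios_default_irq`, `assigned_gsi`, `link_was_rebalanced`) are hypothetical helpers made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the IRQ marked with '*' (the one the BIOS programmed) in a line like
 *   "ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)"
 * or -1 if the line has no IRQ list or no starred entry.
 */
int bios_default_irq(const char *line)
{
    assert(line != NULL);
    const char *list = strstr(line, "(IRQs");
    if (list == NULL)
        return -1;
    const char *star = strchr(list, '*');
    if (star == NULL)
        return -1;
    return atoi(star + 1); /* atoi stops at the first non-digit */
}

/*
 * Return the GSI the link was actually routed to, from a line like
 *   "ACPI: PCI Interrupt 0000:00:11.2[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11"
 * or -1 if the line carries no "GSI" field.
 */
int assigned_gsi(const char *line)
{
    assert(line != NULL);
    const char *gsi = strstr(line, "GSI ");
    if (gsi == NULL)
        return -1;
    return atoi(gsi + 4);
}

/* Nonzero when both values were found and the link left the BIOS default. */
int link_was_rebalanced(const char *link_line, const char *route_line)
{
    int def = bios_default_irq(link_line);
    int gsi = assigned_gsi(route_line);
    return def >= 0 && gsi >= 0 && def != gsi;
}
```

Fed the two lines from this report, the helpers report a BIOS default of 9 and an assigned GSI of 11, which is the discrepancy being asked about.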
Created attachment 9751 [details]
dmesg from 2.6.19
This is dmesg from 2.6.19. It's exactly the same. No patches, this is vanilla
from kernel.org.
And yeah, it's really weird, don't see how this can happen. But it's been like
this forever, and I would be incredibly happy to get this finally fixed...
Can you narrow down how the kernel changes it? From the code, I can't get it. The routine is 'acpi_pci_link_allocate'; link->irq.active is 9 at the beginning of the routine, but it gets changed to 11. Can you investigate and check which step changes it? Printing something in the routine might help.

Sorry, that's beyond my skills. If you would whip up a patch, I'd be happy to apply it and test it...

Well, I figured printk out :P It changes with this block:
    /* Attempt to enable the link device at this IRQ. */
    if (acpi_pci_link_set(link, irq)) {
        printk(KERN_ERR PREFIX
               "Unable to set IRQ for %s [%s]. "
               "Try pci=noacpi or acpi=off\n",
               acpi_device_name(link->device),
               acpi_device_bid(link->device));
        return -ENODEV;
    } else {
        acpi_irq_penalty[link->irq.active] += PIRQ_PENALTY_PCI_USING;
        printk(PREFIX "%s [%s] enabled at IRQ %d\n",
               acpi_device_name(link->device),
               acpi_device_bid(link->device), link->irq.active);
    }
Does this help?

Created attachment 10747 [details]
Patch that fixes the issue
Just for the record, the attached patch fixes this issue. But it breaks nforce
chipsets, so it's far from perfect.
Hi Stian,
I assume that 2.6.22.stable un-patched still fails here?

Yes, I'm sorry. Both 2.6.22.6 and 2.6.23-rc5 fail. Isn't it possible to use a dmi-blacklist-thingy to apply this "fix" only to VIA-chipsets? Or something more clever?

Hi, Stian
Will you please test it using kernel 2.6.22.6? (Enable the PCI and ACPI debug options in the kernel configuration.)
Please boot with the options noapic apic=debug initcall_debug and upload the following info.
a. dmesg
b. interrupts.
c. lspci -xxx
Thanks.

Created attachment 13180 [details]
test the kernel using the workaround patch
Hi, Stian
Will you please test the kernel with the workaround patch applied? (The kernel can be 2.6.22.6.)
After applying the workaround patch, please boot the system with the options apic=debug initcall_debug and verify whether the system works well (ACPI enabled, I/O APIC mode used).
It would be helpful if the following info is uploaded.
a. dmesg
b. interrupts.
Thanks.
Hi!
Thanks for looking into this. I'm sorry I didn't respond earlier, I have been out of town this week.
Anyway, 2.6.22.6 with your workaround patch, and booted with apic=debug initcall_debug seems to work very well. As you can see in my /proc/interrupts, usb now shares irq 17 with eth0, so it's hard to "test" if my issue is solved, but now it won't be a problem, I guess.
Please tell me if you'd need anything else. Is this workaround patch something that might end up in the kernel some day?

Created attachment 13201 [details]
dmesg from kernel 2.6.22.6 with debug patch
Created attachment 13202 [details]
/proc/interrupts from 2.6.22.6 with debug patch
Created attachment 13203 [details]
lspci -xxx
Oops, I spoke too fast. USB does not work with this patch :( Sorry about that...

Hi, Stian
The dmesg in comment #35 is incomplete. Will you please enable PCI debug in the kernel configuration and upload the full dmesg?

Hi, Stian
Will you please test the system with the boot options noapic apic=debug? (Not using the workaround patch in comment #33.) At the same time, PCI and ACPI debug should be enabled in the kernel configuration.
Thanks.

Hi, Stian
Will you please confirm whether USB can work well in comment #31? (Not using the workaround patch in #33.)
Thanks.

Hmm. Actually it doesn't. I'll have to investigate this more when I'm back home later today. With the Ubuntu 2.6.22-kernel it works as it should with the same irq-s as with vanilla 2.6.22.6 without your patch, so I don't see why it doesn't work. I'll try to figure it out tonight. I'll also enable more debug and post logs then.
Thanks again :)

Hi, Stian
Will you please confirm whether USB can work well with the boot options noapic apic=debug (not using the workaround patch in comment #33)?
Thanks.

Stian, any response?

Hi Michael and Yakui!
First of all, I'm very, very sorry for replying so late! It's been extremely busy for me lately, and I have bought my first house! But enough of that.
Second, I'm sorry I said usb didn't work with kernel 2.6.22.6 earlier. It was a misconfiguration on my part. (Missing the CONFIG_USB_DEVICE_CLASS / the udev-rule-replacement). That means that 2.6.24-rc4 works just as well as any other kernel. I still have to use "noirqdebug", but with that, usb is working and everything is perfectly well.
Using your debug patch in comment #33, usb still works, albeit extremely slowly. But if I copy some files over the network, the usb transfer speeds up. I guess this is because they share the same irq. Besides, irq 9 (acpi) gets disabled with your debug patch (see attached dmesg).
I'll attach dmesg and interrupts from 2.6.24-rc4 with your debug-patch.

Created attachment 13978 [details]
dmesg
Created attachment 13979 [details]
interrupts
Just FYI, 2.6.24-rc4 also works perfectly using the patch in comment #29.

Hi, Stian
Thanks for the info.
It will be great if you can attach the output of dmesg and /proc/interrupts after the patch in comment #29 is applied on the kernel of 2.6.24-rc4.
Thanks.

Hi, Stian
Will you please try the following test besides the test in comment #49?
Test the kernel 2.6.24-rc4 (not using any patch) and attach the output of dmesg and /proc/interrupts using the following boot options?
a. noapic
b. acpi_irq_nobalance
Thanks.

Created attachment 13990 [details]
dmesg using workaround patch in comment #29
Created attachment 13991 [details]
interrupts using workaround patch in comment #29
Created attachment 13992 [details]
dmesg using noapic
Created attachment 13993 [details]
interrupts using noapic
Created attachment 13994 [details]
dmesg using acpi_irq_nobalance
As far as I can see, the acpi_irq_nobalance kernel option doesn't seem to do anything...
Created attachment 13995 [details]
interrupts using acpi_irq_nobalance
Hi, Stian
From the info in comment #51 and #55 it seems that the boot option acpi_irq_nobalance and the patch in comment #29 have the same effect. They won't change the default setting of the LNK device.
The root cause of this bug is related to the VIA chipset. The analysis of this bug is as follows:
a. acpi=off: The USB host controller and the ethernet controller share the same interrupt. Their interrupts are all routed to pin 19 of the I/O APIC.
b. noapic: The USB host controller and the ethernet controller use the same LINK device (LNKD). The interrupt is routed to pin 9 of the i8259.
c. When the system is booted with ACPI and the I/O APIC is used, the interrupt routes for USB and ethernet are totally different. The interrupt of the USB controller is routed to 9 through the LNKD device when there is no irq balancing (if irq balancing is used, it may be routed to 11). The ethernet device is routed directly to pin 19 of the I/O APIC. (The interrupt route info is found in the ACPI _PRT table.)
d. When the patch in comment #32 is applied, the interrupt of the USB host is forced to be routed to pin 19 of the I/O APIC. Only the acpi interrupt is registered for IRQ 9, and the LNKD device is disabled by calling the _DIS method. But the system reports the error message "IRQ 9: nobody cared". The only explanation for this is that the _DIS method can't disable the LNKD device and the interrupt pins of the USB hosts are still hardrouted to IRQ 9 through the LNKD device.
From the above analysis it seems that there are two routes for the interrupt pin of USB and ethernet in I/O APIC mode. One is routed directly to pin 19 and the other is routed to the I/O APIC through the LNKD device. So it seems that the USB interrupt and the ethernet interrupt are tied together.
This bug is related to the hardware. It can't be fixed by software if there is no available spec for this chipset.

(In reply to comment #57)
> d. When the patch in comment #32 is applied,the interrupt of USB host is
> forced to be routed to the pin 19 of I/O APIC.Only acpi interrupt is registered
> for IRQ 9 and LNKD device is disabled by calling the _DIS method.
> But the system reports that the error message of IRQ 9 nobody cared for. The
> only explanation for this is that _DIS method can't disable the LNKD device and
> the interrupt pins of USB hosts are still hardrouted to IRQ 9 through LNKD
> device.
>
drivers/acpi/pci_link.c, line #583
    acpi_irq_penalty[link->irq.active] += PIRQ_PENALTY_PCI_USING;
    printk(PREFIX "%s [%s] enabled at IRQ %d\n",
           acpi_device_name(link->device),
           acpi_device_bid(link->device), link->irq.active);
This code is very old, from 2002, so maybe it is not needed anymore and is the root of the problem, hardrouting something with IRQ_PENALTY.
Could you take a look at this code and see if it can be deleted? As far as I know, you are saying that ACPI now finds the irq routing for those devices, and IIRC this code (pci_link.c) is for devices for which we don't know the irq routing.
Thanks,

Hi, Sergio
The patch in comment #29 can make the system work. But the root problem of this bug can't be fixed by the patch in comment #29. In fact the boot option acpi_irq_nobalance is equivalent to the patch in comment #29, and both can make the system work.
The problem in comment #58 is explained from the following two views:
a. The LINK device is programmed to a default IRQ number by the BIOS. When the LINK device driver is loaded, the _DIS method is called to disable the LINK device. When a LINK device is used by some PCI devices, it will be reprogrammed. With ACPI irq balancing, the OS selects the IRQ number according to the irq penalty table, so the resulting IRQ number may differ from the default IRQ number. If the LINK device is hardrouted, the problem appears. If no irq balancing is used, the LINK device is reprogrammed with the default IRQ number and there is no problem. If the LINK device is hardrouted, it may be better not to use irq balancing. (In the current kernel, irq balancing is the default mode.)
b. The source pointed to in comment #58 has another purpose. If a LINK device is not programmed by the BIOS (meaning link->irq.active is zero) and is used by some PCI devices, the OS should select a proper IRQ number for this LINK device according to the irq penalty table. Otherwise there will be another problem (some irq is shared by too many PCI devices).
So I think that the source code can't be deleted.
Thanks.

(In reply to comment #59)
Hi, last change ykzhao remind me, is some here on bios setup change the IRQs for usb devices |
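
The two allocation modes contrasted in the reply to Sergio (balancing by penalty table versus keeping the BIOS default, as with acpi_irq_nobalance) can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the code in drivers/acpi/pci_link.c: the function name `choose_link_irq`, the `MAX_IRQS` constant, and any penalty values passed in are hypothetical, and every possible IRQ is assumed to be below `MAX_IRQS`.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IRQS 16 /* assumption: legacy IRQs 0..15 only */

/*
 * Simplified model of picking an IRQ for a PCI interrupt link.
 *   active     IRQ the BIOS programmed into the link, 0 if none
 *   possible   IRQs the link may be set to (each < MAX_IRQS)
 *   n_possible number of entries in possible
 *   penalty    per-IRQ penalty table supplied by the caller
 *   balance    nonzero: rebalance by penalty; zero: nobalance behaviour
 */
int choose_link_irq(int active, const int *possible, size_t n_possible,
                    const int penalty[MAX_IRQS], int balance)
{
    assert(possible != NULL && n_possible > 0);

    /* Start from the BIOS setting if there is one, else the last candidate. */
    int irq = active ? active : possible[n_possible - 1];

    if (balance || !active) {
        /* Walk the list backwards and move only to a strictly cheaper IRQ. */
        for (size_t i = n_possible; i-- > 0;) {
            if (penalty[possible[i]] < penalty[irq])
                irq = possible[i];
        }
    }
    return irq;
}
```

With LNKD's possible list from this report, a BIOS default of 9, and a table in which IRQ 11 is cheaper than IRQ 9, the balanced call returns 11 while the nobalance call keeps 9. On a board where the link is effectively hardrouted, only the second result matches the wiring, which is why the nobalance route is suggested for such hardware.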