Bug 2665
Summary: | Re: hdc: lost interrupt ide-cd: cmd 0x3 timed out ... | ||
---|---|---|---|
Product: | ACPI | Reporter: | Alex Riesen (raa.lkml) |
Component: | Config-Interrupts | Assignee: | Len Brown (lenb) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, as, marcus_brodi, richlv |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.6-rc3-bk8 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | Output of acpidmp; lspci -vv; dmesg; Alt-Sysrq-T for modprobe; patch vs 2.6.5; acpidmp output on Anssi's system; Dmesg output on Anssi's system; lspci -vv output on Anssi's system; dmesg with 2.6.6 vanilla and patch from this bug report | ||
Description
Alex Riesen
2004-05-09 23:52:02 UTC
Created attachment 2829 [details]
Output of acpidmp
Created attachment 2830 [details]
lspci -vv
Created attachment 2831 [details]
dmesg
Created attachment 2832 [details]
Alt-Sysrq-T for modprobe
Created attachment 2840 [details]
patch vs 2.6.5
The problem is triggered by a BIOS bug which sets the current IRQ
to a value outside the list of possible IRQs:
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *9
ACPI handles this and selects a value from the possible list.
But since the active IRQ is set, Linux erroneously doesn't
look at its rules for deciding which IRQ in the list to select.
It chooses 15, which turns out to be a bad idea.
ACPI: PCI Interrupt Link [LNKF] enabled at IRQ 15
The fix is to simply forget the illegal current setting (IRQ 9)
and behave as if the BIOS didn't give us any current value.
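
The reasoning in this patch description can be sketched in a few lines. The Python below is only an illustrative model, not the kernel code: the function names, the `PENALTY` table and its values are hypothetical, and the "take the last entry" fallback is an assumption made to reproduce the IRQ 15 choice reported above.

```python
from typing import Dict, Optional, Sequence


def sanitize_active(active: Optional[int], possible: Sequence[int]) -> Optional[int]:
    """Drop a BIOS-provided current IRQ that is not in the possible list."""
    if active is not None and active in possible:
        return active
    return None


def pick_by_rules(possible: Sequence[int], penalty: Dict[int, int]) -> int:
    """Hypothetical selection rule: lowest penalty, scanning from the end."""
    return min(reversed(possible), key=lambda irq: penalty.get(irq, 0))


def choose_before_fix(active: Optional[int], possible: Sequence[int],
                      penalty: Dict[int, int]) -> int:
    """Model of the reported bug: a set current IRQ suppresses the rules."""
    if active is not None:
        if active in possible:
            return active
        # Illegal current value: fall back to a list entry without the rules.
        return possible[-1]
    return pick_by_rules(possible, penalty)


def choose_after_fix(active: Optional[int], possible: Sequence[int],
                     penalty: Dict[int, int]) -> int:
    """Model of the fix: forget an illegal current IRQ, then apply the rules."""
    active = sanitize_active(active, possible)
    if active is not None:
        return active
    return pick_by_rules(possible, penalty)


# Possible list for LNKF, taken from the log line above.
LNKF_POSSIBLE = [3, 4, 5, 6, 7, 10, 11, 12, 14, 15]
# Hypothetical penalties: IRQs commonly used by legacy devices cost more.
PENALTY = {3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 12: 1, 14: 1, 15: 1}
```

With these made-up penalties, the pre-fix model returns 15 for LNKF (current IRQ 9), while the post-fix model ignores IRQ 9 and lets the rule decide. The IRQ a real kernel picks depends on its own penalty table.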
Same problem here, in a non-SiS environment, the proposed patch also fixes the problem for me. Thanks Len!

-[00]-+-00.0  VIA Technologies, Inc. VT8363/8365 [KT133/KM133]
      +-01.0-[01]----00.0  ATI Technologies Inc Radeon RV200 QW [Radeon 7500]
      +-07.0  VIA Technologies, Inc. VT82C686 [Apollo Super South]
      +-07.1  VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE
      +-07.2  VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
      +-07.3  VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
      +-07.4  VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
      +-09.0  Ensoniq ES1371 [AudioPCI-97]
      +-0d.0  LSI Logic / Symbios Logic 53c810
      +-0f.0  3Com Corporation 3c905B 100BaseTX [Cyclone]
      \-11.0  3Com Corporation 3c905B 100BaseTX [Cyclone]

*** Bug 2689 has been marked as a duplicate of this bug. ***

integrated on top of 2.4.27-pre2 and 2.6.6 ie. will show up in 2.4.27-pre3 and 2.6.7-rc1. closing.

*** Bug 2888 has been marked as a duplicate of this bug. ***

The patch to 2.6.5 which apparently made it to 2.6.6-bk2 makes my system unbootable. I reported my problem on the lkml, see
http://marc.theaimsgroup.com/?l=linux-kernel&m=108793753409268&w=2

Created attachment 3264 [details]
acpidmp output on Anssi's system
Created attachment 3265 [details]
Dmesg output on Anssi's system
This is from running 2.6.7 with the patch removed.
Created attachment 3266 [details]
lspci -vv output on Anssi's system
Anssi, re comment #10: the e-mail at that link mentions that the problem is unchanged with acpi=off. Is that accurate?

lspci shows PCI-id/pin for CMD:
00:09.0 RAID bus controller: CMD Technology Inc PCI0649 (rev 02) - pinA

acpidmp DSDT shows _PRT entry uses LNKC:
Package (0x04) { 0x0009FFFF, 0x00, \_SB.PCI0.LNKC, 0x00 }

dmesg_patch_removed:
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 6 7 10 11 12) *5
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 12
CMD649: 100% native mode on irq 12

dmesg linux_2.6.7_boot_hang:
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 6 7 10 11 12) *5
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 10 (level, low) -> IRQ 10
CMD649: 100% native mode on irq 10

So in the old days, /proc/interrupts would probably show this device happy on IRQ5. For a short time we moved it to IRQ12 (the subject of this bug report), but apparently there is no PS2 mouse on this system so that bug wasn't noticed. That bug was fixed and moved CMD to IRQ10, where it dies.

While the patch in this bug report changes the behaviour of this system, I don't think it is the root cause of this failure.

Perhaps you can apply just the patch above to a vanilla 2.6.6 kernel and see if it moves the CMD to IRQ10, and if it works in that kernel. If yes, then we broke this system later in 2.6.7, probably with the fix for bug #2574 vs PIC mode PCI.

Re: comment 14: Indeed, with acpi=off I didn't get the problem. Sorry. The CMD goes to IRQ 5 then, as you said.

Re: comment 15: I tried 2.6.6 vanilla with just the patch in this bug report and it also puts the CMD on IRQ 10. So does 2.6.6-bk1 with the same patch; 2.6.6-bk2 and later include the patch and have the same problem. I captured the boot messages again from 2.6.6 and will attach them as well.

2.6.6 with the patch from this bug report does boot, but /proc/interrupts doesn't show the CMD anywhere, and actually trying to access a disk attached to it causes a hang.
Created attachment 3270 [details]
dmesg with 2.6.6 vanilla and patch from this bug report
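
When reading dmesg output like the attachments in this report, the links affected by this bug are the ones whose current IRQ (the `*N` value) is not in the possible list. The following is a minimal sketch of a checker for such lines. The helper names are hypothetical, and the handling of a `*` inside the parentheses (current IRQ that is a member of the list) is an assumption about the log format that does not appear in this report.

```python
import re
from typing import List, NamedTuple, Optional

# Matches lines such as:
#   ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 6 7 10 11 12) *5
LINK_RE = re.compile(
    r"ACPI: PCI Interrupt Link \[(\w+)\] \(IRQs ([^)]*)\)(?: \*(\d+))?"
)


class Link(NamedTuple):
    name: str
    possible: List[int]
    active: Optional[int]


def parse_link(line: str) -> Optional[Link]:
    """Parse one 'PCI Interrupt Link' line; return None for other lines."""
    m = LINK_RE.search(line)
    if m is None:
        return None
    name, irq_field, trailing = m.groups()
    possible: List[int] = []
    active: Optional[int] = None
    for token in irq_field.split():
        if token.startswith("*"):
            # Assumed format: current IRQ marked inside the possible list.
            active = int(token[1:])
            possible.append(active)
        else:
            possible.append(int(token))
    if trailing is not None:
        # Current IRQ printed after the list, as in the lines in this report.
        active = int(trailing)
    return Link(name, possible, active)


def has_illegal_active(link: Link) -> bool:
    """True when the BIOS-set current IRQ is not one of the possible IRQs."""
    return link.active is not None and link.active not in link.possible
```

Applied to the two link lines quoted in this report, both LNKF (current IRQ 9) and LNKC (current IRQ 5) are flagged, since neither value appears in the corresponding list.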
Thanks for testing 2.6.6+ patch above. This proves that the issue *is* IRQ10, and not the 2.6.7 mp_parse_prt() changes.

One possibility is that IRQ10 is broken on this motherboard. Boot (latest unpatched kernel) with "acpi_irq_isa=10" and this should move the interrupt someplace else, probably IRQ11. However, I expect the MB is not broken and that we'll see the problem simply move to IRQ11.

I notice that this device uses LNKA, which is also programmed to IRQ10:
00:0a.0 SCSI storage controller: LSI Logic / Symbios Logic (formerly NCR) 53c810 (rev 02)
Is this device being used? Is it possible to disable it in the BIOS?

One way to debug this is to disable all possible devices and see if the problem goes away, then see which device caused the problem. Please include the /proc/interrupts for the success and (if you can) the failure cases.

I should add, Anssi, if you have a WinXP boot disk, it would be interesting to see where Windows assigns the IRQs on this system.

Here are the IRQ assignments from Windows XP:

IRQ 0 System timer OK
IRQ 1 Standard 101/102-Key or Microsoft Natural PS/2 Keyboard OK
IRQ 3 Communications Port (COM2) OK
IRQ 4 Communications Port (COM1) OK
IRQ 5 OK
IRQ 6 Standard floppy disk controller OK
IRQ 7 CMI8738/C3DX PCI Audio Device OK
IRQ 7 CMD PCI-0649 Ultra DMA IDE Controller OK
IRQ 8 System CMOS/real time clock OK
IRQ 9 Microsoft ACPI-Compliant System OK
IRQ 10 MPU-401 Compatible MIDI Device OK
IRQ 11 SAPPHIRE RADEON 9600 PRO ATLANTIS OK
IRQ 11 LSI Logic 53C810 Device OK
IRQ 11 VIA Rev 5 or later USB Universal Host Controller OK
IRQ 11 VIA Rev 5 or later USB Universal Host Controller OK
IRQ 11 VIA Rev 5 or later USB Universal Host Controller OK
IRQ 11 VIA Rev 5 or later USB Universal Host Controller OK
IRQ 12 Broadcom NetXtreme Gigabit Ethernet OK
IRQ 13 Numeric data processor OK
IRQ 14 Primary IDE Channel OK
IRQ 15 Secondary IDE Channel OK

And that really tells the tale.
The thing in IRQ 10 is part of the Winbond Super-IO chip on the board and the midi port is by default enabled and set to use IRQ 10. Giving the acpi_irq_isa=10 option worked fine, as did disabling the midi device in bios setup. IRQs get assigned like this then:

           CPU0
  0:     331726          XT-PIC  timer
  1:        935          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  7:       1126          XT-PIC  parport0
  8:          1          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:         31          XT-PIC  ide2, ehci_hcd
 11:       7670          XT-PIC  uhci_hcd, uhci_hcd, uhci_hcd, uhci_hcd, eth0, sym53c8xx
 14:       7668          XT-PIC  ide0
 15:          1          XT-PIC  ide1
NMI:          0
LOC:     331534
ERR:        217
MIS:          0

Now, I suppose if the midi port is enabled and set to IRQ 10, then it's actually quite right to give the acpi_irq_isa=10 parameter, isn't it? Or do the corresponding thing in bios setup. I guess Windows XP works because it understands a little more about ISA devices than Linux?

By the way, I thought this VIA chipset (KT600 with 8237 southbridge) has an IO-APIC, but apparently not? Or maybe it's just disabled on this board?

> Now, I suppose if the midi port is enabled and set to IRQ 10, then it's actually
> quite right to give the acpi_irq_isa=10 parameter, isn't it?

yes, it is a valid workaround -- though Linux should figure this out automatically...

> Or do the corresponding thing in bios setup.

Yes, disabling it in the BIOS is simpler. Apparently there is no Linux driver bound to this device so you're not using it?

> I guess Windows XP works because it
> understands a little more about ISA devices than Linux?

Yes, apparently Windows is parsing the DSDT and finding this motherboard device:

Device (MIDI) {
    Name (_HID, EisaId ("PNPB006"))

And Linux is ignoring this information, and the IRQ that the device claims.

So the problem on this board is actually the one reported in bug #2733 -- just that this bug fix exposed it. So I'm re-closing this one.

> By the way, I thought this VIA chipset (KT600 with 8237 southbridge) has an
> IO-APIC, but apparently not? Or maybe it's just disabled on this board?

The ACPI table headers at the top of dmesg do not list an MADT, so Linux will not find an IO-APIC. Dunno if this chipset has one -- but you may find that there is a BIOS option to enable/disable it if there is one physically present.
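
The conflict diagnosed in the last few comments (a motherboard MIDI device claiming IRQ 10 while PCI links LNKC and LNKA are also programmed to IRQ 10) can be expressed as a small check. This is a rough sketch only: it assumes the IRQ claims of motherboard devices have already been extracted by some other means (for example from the DSDT), and the function names and data layout are hypothetical. It is not a description of how Linux or Windows actually resolves this.

```python
from typing import Dict, List, Sequence, Tuple


def find_conflicts(link_irqs: Dict[str, int],
                   isa_irqs: Dict[str, int]) -> List[Tuple[str, str, int]]:
    """List (link, device, irq) triples where a PCI link shares an ISA IRQ."""
    conflicts = []
    for link, irq in sorted(link_irqs.items()):
        for device, device_irq in sorted(isa_irqs.items()):
            if irq == device_irq:
                conflicts.append((link, device, irq))
    return conflicts


def usable_irqs(possible: Sequence[int], isa_irqs: Dict[str, int]) -> List[int]:
    """Possible IRQs left once ISA-claimed ones are excluded."""
    reserved = set(isa_irqs.values())
    return [irq for irq in possible if irq not in reserved]


# Values taken from this thread: both links programmed to IRQ 10,
# and the MIDI port of the Super-IO chip set to IRQ 10.
LINK_IRQS = {"LNKC": 10, "LNKA": 10}
ISA_IRQS = {"MIDI": 10}
LNKC_POSSIBLE = [3, 4, 6, 7, 10, 11, 12]
```

Excluding IRQ 10 from the possible list has the same intent as booting with acpi_irq_isa=10: the link is steered away from the IRQ the ISA device already uses.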