Hardware Environment: AMD 64, ATI Xpress 200 chipset (MB MS-482M4)
Software Environment: Linux 2.6.16-rc4 32 bit
Problem Description:

If the NICs (eth0, eth1) are assigned to IRQ5 by ACPI after bootup, they do not work: the kernel receives no interrupts from them. But if another device on IRQ5 (the sound card, for example) generates interrupts, the interrupts from the NICs get through and packets from the network are received.

If eth0 and eth1 are assigned to IRQ15 by ACPI, everything works well. (This can happen if Linux boots after MS-Windows and the PC was not powered off between booting the two OSes.)

If the kernel boots with "noapic nolapic acpi=off", everything works well, in spite of eth0 and eth1 being assigned to IRQ5.

My kernel config and dmesg are in the attachments.

With ACPI on and this /proc/interrupts, the NICs are not working correctly:

           CPU0
  0:      51431   XT-PIC  timer
  1:        693   XT-PIC  i8042
  2:          0   XT-PIC  cascade
  4:         58   XT-PIC  ohci_hcd:usb1, ohci_hcd:usb2, ehci_hcd:usb3
  5:        298   XT-PIC  HDA Intel, eth1, eth0
  8:          2   XT-PIC  rtc
  9:         11   XT-PIC  acpi
 14:       3351   XT-PIC  ide0
 15:          3   XT-PIC  ide1
NMI:          0
ERR:          1

With ACPI on and this /proc/interrupts, the NICs are working correctly:

           CPU0
  0:      51431   XT-PIC  timer
  1:        693   XT-PIC  i8042
  2:          0   XT-PIC  cascade
  4:         58   XT-PIC  ohci_hcd:usb1, ohci_hcd:usb2, ehci_hcd:usb3
  5:        298   XT-PIC  HDA Intel
  8:          2   XT-PIC  rtc
  9:         11   XT-PIC  acpi
 14:       3351   XT-PIC  ide0
 15:          3   XT-PIC  ide1, eth1, eth0
NMI:          0
ERR:          1
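For anyone comparing the two states, the check "which IRQ are eth0 and eth1 sharing, and with what?" can be scripted. The following is only a minimal sketch, assuming the single-CPU /proc/interrupts layout shown in this report (one count column, then the controller name, then a comma-separated device list). The function names parse_interrupts and irqs_of are made up for illustration, and newer kernels print more columns, so the parsing would need adjusting there.

```python
# Sample taken from the failing case in this report (single CPU, XT-PIC mode).
SAMPLE = """\
           CPU0
  0:      51431   XT-PIC  timer
  4:         58   XT-PIC  ohci_hcd:usb1, ohci_hcd:usb2, ehci_hcd:usb3
  5:        298   XT-PIC  HDA Intel, eth1, eth0
 15:          3   XT-PIC  ide1
NMI:          0
ERR:          1
"""


def parse_interrupts(text):
    """Map IRQ number -> (count, controller, [device names]).

    Assumes exactly one CPU column, as in the report above.
    """
    table = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # skip the CPU header and the NMI/ERR rows
        fields = rest.split(None, 2)
        if len(fields) < 2:
            continue
        count, controller = int(fields[0]), fields[1]
        devices = []
        if len(fields) > 2:
            devices = [d.strip() for d in fields[2].split(",")]
        table[int(head)] = (count, controller, devices)
    return table


def irqs_of(table, device):
    """Return the IRQ numbers on which the given device name is listed."""
    return [irq for irq, (_, _, devs) in sorted(table.items()) if device in devs]


if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    for dev in ("eth0", "eth1"):
        for irq in irqs_of(table, dev):
            print(dev, "is on IRQ", irq, "shared with", table[irq][2])
```

On a live system the text would come from reading /proc/interrupts instead of SAMPLE.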
Created attachment 7403 [details] config for my kernel
Created attachment 7404 [details] dmesg for bad situation
Is this new behaviour? If so, what was the most recent kernel which didn't fail in this manner?
I don't know. I've also tried running 2.6.15 and 2.6.16-rc3 on this machine. Those kernels were affected by this bug too.
Firstly, why is this machine running in PIC mode? Does it run any better if you include the IOAPIC support in the kernel?

That said, yes, PIC-mode is supposed to work too...

The two failing devices are on LNKE. The BIOS (is telling us that it) is initializing LNKE to IRQ5:

ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 7 10 11 12 14 15)

but telling us the devices are on IRQ15, and we believe the link, not the PCI config space:

ACPI: PCI Interrupt Link [LNKE] BIOS reported IRQ 15, using IRQ 5
ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 5
ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [LNKE] -> GSI 5 (level, low) -> IRQ 5
eth0: VIA Rhine III at 0x1e800, 00:11:6b:32:94:2f, IRQ 5.
eth0: MII PHY found at address 1, status 0x7869 advertising 05e1 Link 4de1.
eth0: link up, 100Mbps, full-duplex, lpa 0x4DE1
8139too Fast Ethernet driver 0.9.27
ACPI: PCI Interrupt 0000:02:03.0[A] -> Link [LNKE] -> GSI 5 (level, low) -> IRQ 5
eth1: RealTek RTL8139 at 0xdca12800, 00:13:d3:a3:a9:06, IRQ 5
eth1: Identified 8139 chip type 'RTL-8100B/8139D'
eth1: link down

In theory, this shouldn't matter, we program the link to 5, and the interrupts should move to 5, but that appears to be a 2nd level (BIOS) bug...

> If eth0 and eth1 are assigned to IRQ15 by ACPI, all works well.
> (it can be if Linux boots after MS-Windows, and PC was not
> powered off between bootups of this two OSes)

That means that firstly, Windows is not getting fooled by the cold-boot BIOS bug that is putting this link on 5 -- it is re-programming this link to 15. Secondly, the BIOS on warm-reset is probably inheriting the correct setting for the link. Perhaps you can check this line after the reboot after Windows:

ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 7 10 11 12 14 15)

Perhaps the '*' moves from 5 to 15?

Please try booting (still in PIC mode) with "acpi_irq_isa=5" to prevent Linux from assigning that link to 5, and perhaps that will push it to do something that works. (you may need to enable "acpi_irq_balance" for this to work)

> If kernel boots with "noapic nolapic acpi=off" all works well,
> inspite of eth0 and eth1 assigned to IRQ5.

In the "acpi=off" "noapic" case, eth0 and eth1 are on IRQ15, yes?

While you're at it, this may be a good time to check that you've got the latest BIOS, and upgrade to a recent kernel.

Please attach the output from acpidump.

If they are movable, please report if you can get past the error by moving the ethernet devices to different slots (that will likely put them on a different link).
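To make the "did the '*' move from 5 to 15?" check repeatable across cold and warm boots, something like the following could be run over a saved dmesg. This is only a rough sketch: the two regular expressions are written against the exact message wording quoted in this comment, and the helper names link_defaults and link_mismatches are invented for illustration. Other kernel versions may word these messages differently.

```python
import re

# Lines copied from the dmesg excerpt quoted above.
SAMPLE = """\
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] BIOS reported IRQ 15, using IRQ 5
ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 5
"""

# Matches the list of possible IRQs for a link; the BIOS default is starred.
LINK_RE = re.compile(r"ACPI: PCI Interrupt Link \[(\w+)\] \(IRQs ([^)]*)\)")

# Matches the message printed when the BIOS-reported IRQ and the link disagree.
MISMATCH_RE = re.compile(
    r"ACPI: PCI Interrupt Link \[(\w+)\] BIOS reported IRQ (\d+), using IRQ (\d+)"
)


def link_defaults(dmesg):
    """Return {link name: starred IRQ}, or None when no IRQ is starred."""
    result = {}
    for name, irqs in LINK_RE.findall(dmesg):
        starred = [tok for tok in irqs.split() if tok.startswith("*")]
        result[name] = int(starred[0][1:]) if starred else None
    return result


def link_mismatches(dmesg):
    """Return {link name: (BIOS-reported IRQ, IRQ actually used)} where they differ."""
    return {
        name: (int(bios), int(used))
        for name, bios, used in MISMATCH_RE.findall(dmesg)
        if bios != used
    }


if __name__ == "__main__":
    print("starred defaults:", link_defaults(SAMPLE))
    print("mismatches:", link_mismatches(SAMPLE))
```

Comparing the output from a cold boot with the output from a warm reboot after Windows would show whether the starred IRQ for LNKE changes.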
Ping for a response from the bug reporter.
Rejecting the bug due to no response from the bug reporter.