Distribution: Debian unstable
Hardware Environment: TARGA Traveller 826 MT32 (Turion 64 notebook)
Software Environment: 2.6.13-rc2, 2.6.13-rc3, 2.6.13-rc4, current Linus git tree
Problem Description: ehci_hcd causes severe breakage

My notebook is a Turion 64 with an ATI chipset (lspci below). ehci_hcd runs perfectly fine in the 2.6.12.2 and 2.6.12.3 kernels. However, with 2.6.13-rc2 and later (including today's Linus git tree) it causes severe breakage that makes more than just the USB subsystem unusable.

The notebook has OHCI, EHCI and CardBus on one IRQ line (#11). At boot time, first yenta, then ohci-hcd, then ehci-hcd are loaded. With 2.6.12.3 everything works fine, and after bootup /proc/interrupts shows:

 11:          0          XT-PIC  yenta, ohci_hcd:usb1, ohci_hcd:usb2, ehci_hcd:usb3

However, when booting and loading the modules on any later kernel, the kernel says the following after ehci-hcd is loaded:

Jul 29 15:28:04 localhost kernel: ehci_hcd 0000:00:13.2: PCI device 1002:4373 (ATI Technologies Inc)
Jul 29 15:28:04 localhost kernel: ehci_hcd 0000:00:13.2: new USB bus registered, assigned bus number 3
Jul 29 15:28:04 localhost kernel: ehci_hcd 0000:00:13.2: irq 11, io mem 0xfbdff000
Jul 29 15:28:04 localhost kernel: ehci_hcd 0000:00:13.2: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004
Jul 29 15:28:04 localhost kernel: hub 3-0:1.0: USB hub found
Jul 29 15:28:04 localhost kernel: hub 3-0:1.0: 8 ports detected
Jul 29 15:28:04 localhost kernel: irq 11: nobody cared (try booting with the "irqpoll" option)
Jul 29 15:28:04 localhost kernel:
Jul 29 15:28:04 localhost kernel: Call Trace: <IRQ> <ffffffff80156c45>{__report_bad_irq+53} <ffffffff80156e57>{note_interrupt+439}
Jul 29 15:28:04 localhost kernel:        <ffffffff801567bf>{__do_IRQ+207} <ffffffff80111518>{do_IRQ+72}
Jul 29 15:28:04 localhost kernel:        <ffffffff8010ef62>{ret_from_intr+0} <EOI>
Jul 29 15:28:04 localhost kernel: handlers:
Jul 29 15:28:04 localhost kernel: [<ffffffff880ff580>] (yenta_interrupt+0x0/0xc0 [yenta_socket])
Jul 29 15:28:04 localhost kernel: [<ffffffff802965b0>] (usb_hcd_irq+0x0/0x70)
Jul 29 15:28:04 localhost last message repeated 2 times
Jul 29 15:28:04 localhost kernel: Disabling IRQ #11

So IRQ 11 gets issued more than 100,000 times, and the kernel finally disables it. /proc/interrupts at this time:

 11:     100000          XT-PIC  yenta, ohci_hcd:usb1, ohci_hcd:usb2, ehci_hcd:usb3

The same happens when no other drivers are bound to IRQ #11, i.e. when I load only ehci_hcd after booting with "init=/bin/sh".

This is definitely a regression over previous kernels. I've tried to backport the 2.6.12.3 USB code into 2.6.13-rc4, but there are too many device model changes for this to be feasible :(

lspci:
0000:00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5951
0000:00:02.0 PCI bridge: ATI Technologies Inc: Unknown device 5a34
0000:00:13.0 USB Controller: ATI Technologies Inc: Unknown device 4374
0000:00:13.1 USB Controller: ATI Technologies Inc: Unknown device 4375
0000:00:13.2 USB Controller: ATI Technologies Inc: Unknown device 4373
If I do not load ehci-hcd and only use yenta and ohci-hcd on IRQ 11, both CardBus and USB 1.0 devices work fine, with no bogus interrupts whatsoever.
This might be related to bug #4866.
Can you please generate the dmesg output for good and bad kernels and diff them? My money's on acpi :(
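One way to produce such a diff, as a rough sketch: save the dmesg output under each kernel, then compare the two files. The helper names `strip_ts` and `diff_dmesg` and the file names in the usage note are made up for illustration. The timestamp stripping only matters if the kernels were built with printk timestamps enabled; without them it is a no-op.

```shell
# Remove an optional "[   12.345678] " printk timestamp from each line,
# so that pure timing differences do not clutter the diff.
strip_ts() {
    sed -e 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //'
}

# diff_dmesg GOOD BAD
# Print a unified diff of two saved dmesg logs.
# Returns 0 if they match after stripping timestamps, 1 if they differ.
diff_dmesg() {
    good_tmp=$(mktemp) && bad_tmp=$(mktemp) || return 2
    strip_ts < "$1" > "$good_tmp"
    strip_ts < "$2" > "$bad_tmp"
    diff -u "$good_tmp" "$bad_tmp"
    status=$?
    rm -f "$good_tmp" "$bad_tmp"
    return $status
}
```

Usage would be along the lines of running `dmesg > dmesg-good.txt` on 2.6.12.2, `dmesg > dmesg-bad.txt` on 2.6.13-rc3, and then `diff_dmesg dmesg-good.txt dmesg-bad.txt`.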
Agreed, this is likely an ACPI or BIOS problem. That's where problems like this usually come from. Another experiment: try a 32-bit kernel.
Both kernels were booted with "acpi=off" on the kernel command line. ACPI causes so many problems on this device (invalid IRQ routing, 50% softirq load in ACPI code when the CPU is idle, ...) that I don't bother enabling it. I'll post dmesg shortly.
I can't try 32-bit kernels since I don't have the space to install a 32-bit userspace on a separate partition [and this is a notebook] :(
Created attachment 5425: dmesg of kernel 2.6.13-rc3 booting up
Created attachment 5426: dmesg of 2.6.12.2 booting up (usb-ehci working)
Ugh, fun. This looks like it _might_ be a pci resource issue. Since you have git, care to use 'git bisect' to try to see if you can find this bug? It would be most appreciated :)
We need an easy git-bisection HOWTO. I have a few emails from Linus saved away, but they're gobbledygook.
http://www.livejournal.com/users/kernelslacker/22371.html is a good start for such a HOWTO.
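For reference, a bisection session for this bug might look roughly like the sketch below, run inside a clone of Linus's tree. The helper name `start_bisect` is made up for illustration. The choice of v2.6.12 as the good revision is an assumption: the reporter tested 2.6.12.2 and 2.6.12.3, which are stable releases based on it and are not tagged in Linus's tree.

```shell
# start_bisect GOOD BAD
# Begin a bisection between a known-good and a known-bad revision.
# git then checks out a revision roughly halfway between the two.
start_bisect() {
    git bisect start &&
    git bisect bad "$2" &&
    git bisect good "$1"
}

# After each step: build and boot the checked-out revision, load
# ehci-hcd, and report the outcome with one of
#   git bisect good    # no "irq 11: nobody cared" message
#   git bisect bad     # the kernel disables IRQ #11
# Repeat until git names the first bad commit, then return to the
# original branch with
#   git bisect reset
```

With the assumed endpoints this would be started as `start_bisect v2.6.12 v2.6.13-rc2`. Each round halves the remaining range, so even a few thousand commits need only about a dozen build-and-boot cycles.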
I had this very same problem. It should be fixed in the latest 2.6.16-rc3 kernel. It was due to a bug in the EHCI handoff code. Please reopen this bug if it is still present after testing.