Bug 9782
Summary: | IRQ problems in usb_hdc_irq (bluetooth usb device) & in usb | ||
---|---|---|---|
Product: | Drivers | Reporter: | Milan Kocian (milan.kocian) |
Component: | Bluetooth | Assignee: | Greg Kroah-Hartman (greg) |
Status: | REJECTED INVALID | ||
Severity: | normal | CC: | dbrownell, marcel, protasnb |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.24-rc8-git3 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 5089 | ||
Attachments: | boot log |
Description
Milan Kocian
2008-01-20 13:17:51 UTC
I tried the version 2.6.24-rc8-git7 and I don't see this segfault anymore. Interesting. (I only enabled debug statistics for powertop). I will test new 2.6.24 and do report. I tested 2.6.24 and I see this segfault too. I returned to 2.6.24-rc8-git7 and I see segfault again. So the state is that sometimes I see segfault during few minutes after boot and sometimes is all ok till next reboot. I enabled debug function in hci_usb but I see nothing sensible. So what to test next ? I tested 2.6.25: irq 19: nobody cared (try booting with the "irqpoll" option) Pid: 0, comm: swapper Not tainted 2.6.25 #1 [<c0168497>] __report_bad_irq+0x27/0x90 [<c0168756>] note_interrupt+0x256/0x290 [<c0167a71>] ? handle_IRQ_event+0x31/0x60 [<c0168e16>] handle_fasteoi_irq+0x86/0xe0 [<c0106f60>] do_IRQ+0x40/0x80 [<c0104b13>] common_interrupt+0x23/0x28 [<c0299155>] ? acpi_idle_enter_simple+0x165/0x1d0 [<c0298db6>] acpi_idle_enter_bm+0xba/0x2f4 [<c0306ad0>] cpuidle_idle_call+0x70/0xb0 [<c0306a60>] ? cpuidle_idle_call+0x0/0xb0 [<c0102b1a>] cpu_idle+0x5a/0xe0 [<c038f14f>] rest_init+0x5f/0x70 ======================= handlers: [<f88b7d40>] (usb_hcd_irq+0x0/0x60 [usbcore]) Disabling IRQ #19 I play a little with it. And now I can reproduce it immediately: 1) load all modules without ehci-hcd 2) load ehci-hcd 3) segfault appears during 1-2 seconds Unload/load ehci-hcd and I see this segfault again. Sometimes unloading/loading begin to work without segfaults then I do suspend-to-ram/back and I see it again. Second procedure: 1) load all modules without hci_usb (include ehci-hcd) 2) load hci-usb 3) start bluetooth 4) segfault I hope it helps. Milan, Did any earlier kernel version work for you by chance? Note that this is *NOT* a segfault; please correct the summary. This looks to be some kind of IRQ reporting issue. /proc/interrupts output would be relevant. I suspect ACPI is an issue here; notice how sometimes it uses irq 23, sometimes irq 19. Full "dmesg" output during boot, including the BT adapter (and until ehci is loaded) should be useful too. One theory is that the internal BT device has some IRQ wired up, and it happens to share the line used by EHCI. But that BT IRQ isn't being handled. (In reply to comment #5) > Milan, > Did any earlier kernel version work for you by chance? > Sorry, I didn't tested older kernel due to missing some drivers. Created attachment 16387 [details]
boot log
irq 19: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.25.3 #8
[<c016dc87>] __report_bad_irq+0x27/0x90
[<c016df3f>] note_interrupt+0x24f/0x290
[<c016d441>] ? handle_IRQ_event+0x31/0x60
[<c016e64b>] handle_fasteoi_irq+0xab/0xd0
[<c0107090>] do_IRQ+0x40/0x80
[<c0104bf3>] common_interrupt+0x23/0x28
[<c03bba37>] ? _spin_unlock_irqrestore+0x7/0x30
[<c01466ec>] tick_broadcast_oneshot_control+0x6c/0x120
[<c0145f8d>] tick_notify+0x29d/0x3e0
[<c012d5bd>] ? irq_exit+0x5d/0x90
[<c0107095>] ? do_IRQ+0x45/0x80
[<c0140f5d>] notifier_call_chain+0x2d/0x70
[<c014107a>] raw_notifier_call_chain+0x1a/0x20
[<c0145871>] clockevents_notify+0x21/0x80
[<c02aaaae>] acpi_state_timer_broadcast+0x31/0x34
[<c02ab28a>] acpi_idle_enter_simple+0x1aa/0x1ff
[<c028b657>] ? acpi_os_release_lock+0x8/0xa
[<c02aae52>] acpi_idle_enter_bm+0xba/0x348
[<c0141806>] ? pm_qos_requirement+0x26/0x30
[<c031dccb>] cpuidle_idle_call+0x7b/0xc0
[<c031dc50>] ? cpuidle_idle_call+0x0/0xc0
[<c01030a1>] cpu_idle+0x61/0xe0
[<c03b6e82>] start_secondary+0x195/0x1a3
=======================
handlers:
[<f88c41e0>] (usb_hcd_irq+0x0/0x60 [usbcore])
Disabling IRQ #19
(In reply to comment #6) > Note that this is *NOT* a segfault; please correct the summary. ok, summary corrected. > > This looks to be some kind of IRQ reporting issue. /proc/interrupts output > would be relevant. I suspect ACPI is an issue here; notice how sometimes it > uses irq 23, sometimes irq 19. Full "dmesg" output during boot, including > the > BT adapter (and until ehci is loaded) should be useful too. > > One theory is that the internal BT device has some IRQ wired up, and it > happens > to share the line used by EHCI. But that BT IRQ isn't being handled. > milon@ntm:~$ cat /proc/interrupts CPU0 CPU1 0: 47696 51098 IO-APIC-edge timer 1: 573 689 IO-APIC-edge i8042 8: 4 1 IO-APIC-edge rtc 9: 17281 16820 IO-APIC-fasteoi acpi 12: 1395 332 IO-APIC-edge i8042 16: 0 0 IO-APIC-fasteoi yenta, uhci_hcd:usb5, i915@pci:0000:00:02.0 17: 159 157 IO-APIC-fasteoi uhci_hcd:usb6, HDA Intel, firewire_ohci 18: 0 0 IO-APIC-fasteoi sdhc0:slot0 19: 75119 88942 IO-APIC-fasteoi ehci_hcd:usb2 20: 45860 36973 IO-APIC-fasteoi uhci_hcd:usb3 21: 0 0 IO-APIC-fasteoi uhci_hcd:usb4 22: 1 2 IO-APIC-fasteoi ehci_hcd:usb1 220: 17348 223 PCI-MSI-edge eth0 221: 4538 1600 PCI-MSI-edge ahci NMI: 0 0 Non-maskable interrupts LOC: 52961 36619 Local timer interrupts RES: 9716 11084 Rescheduling interrupts CAL: 99 12764 function call interrupts TLB: 180 259 TLB shootdowns TRM: 0 0 Thermal event interrupts SPU: 0 0 Spurious interrupts ERR: 0 MIS: 0 Log is in attachment. For the record, the limited public info on the http://www.broadcom.com/products/Bluetooth/Bluetooth-RF-Silicon-and-Software-Solutions/BCM2045 webpage does indeed suggest that some Bluetooth IRQs could be wired up parallel to the USB signal wires ... alternative data paths include UART and SPI connections, which are routinely done with parallel IRQ paths. I suggest you see whether various BIOS tables -- DMI, PNPACPI, DSDT, even the info in /proc/acpi/wakeup -- list the Bluetooth device. If so, look for IRQ or GPE hookups. If the Bluetooth device lists such a thing, one issue will be how to tell the its device driver that it's got a "real" IRQ. Another will be how to make the hci_usb driver listen to it ... and ideally use the IRQ for something sensible. Can you confirm that there are no high speed USB devices on this system, by the way? Sure doesn't look like there are any. If so, the hardware seems to be sending an absurdly high number of spurious IRQs to the EHCI controller. As in, almost all of those 164K+ irqs are spurious ... one possible explanation is a BIOS that's interfering. Make sure it's told to ignore USB. The /sys/kernel/debug/ehci/*/registers data might help too. If the BT chip doesn't have a "real" IRQ then I'll start wondering about the ACPI IRQ routing. Something switched EHCI from irq 23 to irq 19, which isn't usual. (In reply to comment #10) > I suggest you see whether various BIOS tables -- DMI, PNPACPI, DSDT, even the > info in /proc/acpi/wakeup -- list the Bluetooth device. If so, look for IRQ > or > GPE hookups. If the Bluetooth device lists such a thing, one issue will be > how > to tell the its device driver that it's got a "real" IRQ. Another will be > how > to make the hci_usb driver listen to it ... and ideally use the IRQ for > something sensible. I don't understand what you exactly want. List of /proc/acpi/wakeup ? Here is it: milon@ntm:~$ cat /proc/acpi/wakeup Device S-state Status Sysfs node LID S3 *enabled SLPB S3 *enabled IGBE S4 disabled pci:0000:00:19.0 EXP0 S4 disabled pci:0000:00:1c.0 EXP1 S4 disabled pci:0000:00:1c.1 EXP2 S4 disabled EXP3 S4 disabled EXP4 S4 disabled PCI1 S4 disabled pci:0000:00:1e.0 USB0 S3 disabled pci:0000:00:1d.0 USB1 S3 disabled pci:0000:00:1d.1 USB2 S3 disabled USB3 S3 disabled pci:0000:00:1a.0 USB4 S3 disabled pci:0000:00:1a.1 EHC0 S3 disabled pci:0000:00:1d.7 EHC1 S3 disabled pci:0000:00:1a.7 HDEF S4 disabled pci:0000:00:1b.0 > > Can you confirm that there are no high speed USB devices on this system, by > the > way? Sure doesn't look like there are any. If so, the hardware seems to be > sending an absurdly high number of spurious IRQs to the EHCI controller. As > in, almost all of those 164K+ irqs are spurious ... one possible explanation > is > a BIOS that's interfering. Make sure it's told to ignore USB. The > /sys/kernel/debug/ehci/*/registers data might help too. I have no high speed USB device (I think that Fingerprint and BT controller are not high speed devices. :). I don't have /sys/kernel/debug directory in my system. I will try to enable it. milon@ntm:~$ lsusb Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 003 Device 021: ID 0483:2016 SGS Thomson Microelectronics Fingerprint Reader Bus 003 Device 020: ID 0a5c:2110 Broadcom Corp. Bluetooth Controller Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub > > If the BT chip doesn't have a "real" IRQ then I'll start wondering about the > ACPI IRQ routing. Something switched EHCI from irq 23 to irq 19, which isn't > usual. > Due to kernel's version change ? Hi, I upgraded BIOS in my Thinkpad X61s. Here is piece of changelog: (Fix) Unexpected interrupts from the USB controller may occur. (BIOS) From this time is all ok. Many thanks to Pavel Kysilka (goldenfish@linuxsoft.cz) for pointing to this bug. |