Bug 11472 - Docking Dell D630 prints stack trace for bad IRQ (__report_bad_irq+0x2b/0x90)
Summary: Docking Dell D630 prints stack trace for bad IRQ (__report_bad_irq+0x2b/0x90)
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Hotplug
Hardware: All Linux
Importance: P1 normal
Assignee: Shaohua
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-01 04:28 UTC by Lewis Thompson
Modified: 2008-12-16 21:21 UTC
CC List: 4 users

See Also:
Kernel Version: 2.6.27-rc5-custom (from git August 31 2008)
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
lspci -vvnn (13.58 KB, text/plain)
2008-09-01 04:32 UTC, Lewis Thompson
/var/log/messages during undock/redock (annotated) (7.28 KB, text/plain)
2008-09-01 04:34 UTC, Lewis Thompson
config used to build kernel (85.64 KB, text/plain)
2008-09-01 04:44 UTC, Lewis Thompson
dmesg immediately after boot (46.57 KB, text/plain)
2008-09-10 11:24 UTC, Lewis Thompson
includes detailed stack trace (699.14 KB, text/plain)
2008-09-11 16:31 UTC, Lewis Thompson
unable to reproduce the issue without ehci_hcd (136.82 KB, text/plain)
2008-09-11 16:36 UTC, Lewis Thompson
includes detailed stack trace (158.75 KB, text/plain)
2008-09-11 16:38 UTC, Lewis Thompson
acpidump output; machine booted up docked, never undocked (although suspended/resumed) (125.03 KB, text/plain)
2008-09-22 08:58 UTC, Lewis Thompson
lsusb -v output prior to docking (33.82 KB, text/plain)
2008-09-22 09:04 UTC, Lewis Thompson
lsusb -v output after docking (87.60 KB, application/octet-stream)
2008-09-22 09:05 UTC, Lewis Thompson
able to see the issue without various usb drivers loaded (47.77 KB, text/plain)
2008-09-22 09:51 UTC, Lewis Thompson

Description Lewis Thompson 2008-09-01 04:28:53 UTC
Latest working kernel version: n/a
Earliest failing kernel version: 
Distribution: Ubuntu 8.10
Hardware Environment: Dell D630 with D/Dock docking station
Software Environment:
Problem Description:

Multiple re-docks with Dell D630 fail, printing the following stack trace on dock:

Pid: 0, comm: swapper Not tainted 2.6.27-rc5-custom #2

Call Trace:
 <IRQ>  [<ffffffff802950eb>] __report_bad_irq+0x2b/0x90
 [<ffffffff80295400>] note_interrupt+0x2b0/0x2e0
 [<ffffffff80295b4d>] handle_fasteoi_irq+0xed/0x110
 [<ffffffff8020fa96>] do_IRQ+0x86/0x100
 [<ffffffff8020ce8e>] ret_from_intr+0x0/0x29
 <EOI>  [<ffffffffa00198ea>] ? acpi_idle_enter_simple+0x152/0x190 [processor]
 [<ffffffffa00198e2>] ? acpi_idle_enter_simple+0x14a/0x190 [processor]
 [<ffffffff8042fc09>] ? cpuidle_idle_call+0xb9/0x100
 [<ffffffff8020ae95>] ? cpu_idle+0x75/0x100
 [<ffffffff804d2286>] ? rest_init+0x66/0x70

Steps to reproduce:

1. Boot laptop connected to dock
2. Press undock button on dock, wait for "ACPI: \_SB_.PCI0.PCIE.GDCK - undocking" in /var/log/messages
3. Physically undock laptop
4. Re-dock laptop, wait for "ACPI: \_SB_.PCI0.PCIE.GDCK - docking" in /var/log/messages
5. Repeat steps 2 & 3
6. Re-dock laptop and I usually get the stack trace (from problem description)

Sometimes I can manage to re-dock twice before the third re-dock fails, but usually it bails out at the second attempt
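
When repeating the steps above, it can help to pull just the dock and undock events out of the log so each cycle is easy to count. The following is a minimal sketch, assuming the messages have the same "GDCK - docking" / "GDCK - undocking" form quoted in the steps; dock_events is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: list dock/undock events recorded in a log file.
# Defaults to /var/log/messages, the file named in the steps above.
dock_events() {
    log="${1:-/var/log/messages}"
    # Matches both "GDCK - docking" and "GDCK - undocking".
    grep -E 'GDCK - (un)?docking' "$log"
}
```

Running dock_events with no argument reads /var/log/messages; passing a path reads a saved copy of the log instead.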
Comment 1 Lewis Thompson 2008-09-01 04:31:07 UTC
This looked similar to existing bug http://bugzilla.kernel.org/show_bug.cgi?id=10431 (surprise undock hangs system, ethernet recognition issues on dock/undock - Sony Vaio VGN SZ483N laptop)

As such, I applied the suggested patch (http://bugzilla.kernel.org/attachment.cgi?id=16054) and re-tested: this had no impact on the behaviour I see

It's also worth pointing out that in every case I am always "properly" undocking (i.e. pressing the undock button, waiting for the event in messages)
Comment 2 Lewis Thompson 2008-09-01 04:32:12 UTC
Created attachment 17560
lspci -vvnn
Comment 3 Lewis Thompson 2008-09-01 04:34:39 UTC
Created attachment 17561
/var/log/messages during undock/redock (annotated)
Comment 4 Lewis Thompson 2008-09-01 04:44:40 UTC
Created attachment 17562
config used to build kernel

To clarify, I am running 2.6.27-rc5 from git August 31 2008
This has been tested both with/without the patch from 10431
Comment 5 Shaohua 2008-09-10 01:16:19 UTC
Can you please attach the output of 'dmesg'? The /var/log/messages file appears to have some output filtered out.
Comment 6 Lewis Thompson 2008-09-10 11:24:34 UTC
Created attachment 17713
dmesg immediately after boot

dmesg output immediately after boot
Comment 7 Lewis Thompson 2008-09-11 16:30:34 UTC
Upgraded to 2.6.27-3-generic (Ubuntu) today and have done additional testing with increased klogd (-c 8) logging

I'm attaching two additional outputs:

1. kern.log-with-ehci_hcd: output from a series of dock/undocks until I managed to reproduce the issue.  This includes a much more detailed stack trace, along with an ehci_hcd message:

The following is seen after undocking.  In addition, I'm seeing a lot of USB hub "cannot reset port" messages

Sep 11 23:43:50 hanoi kernel: [ 3072.418455] CPU0 attaching NULL sched-domain.
Sep 11 23:43:50 hanoi kernel: [ 3072.418476] CPU1 attaching NULL sched-domain.
Sep 11 23:43:50 hanoi kernel: [ 3072.440307] CPU0 attaching sched-domain:
Sep 11 23:43:50 hanoi kernel: [ 3072.440328]  domain 0: span 0-1 level MC
Sep 11 23:43:50 hanoi kernel: [ 3072.440333]   groups: 0 1
Sep 11 23:43:50 hanoi kernel: [ 3072.440346]   domain 1: span 0-1 level CPU
Sep 11 23:43:50 hanoi kernel: [ 3072.440355]    groups: 0-1
Sep 11 23:43:50 hanoi kernel: [ 3072.440361]    domain 2: span 0-1 level NODE
Sep 11 23:43:50 hanoi kernel: [ 3072.440371]     groups: 0-1
Sep 11 23:43:50 hanoi kernel: [ 3072.440384] CPU1 attaching sched-domain:
Sep 11 23:43:50 hanoi kernel: [ 3072.440388]  domain 0: span 0-1 level MC
Sep 11 23:43:50 hanoi kernel: [ 3072.440396]   groups: 1 0
Sep 11 23:43:50 hanoi kernel: [ 3072.440407]   domain 1: span 0-1 level CPU
Sep 11 23:43:50 hanoi kernel: [ 3072.440411]    groups: 0-1
Sep 11 23:43:50 hanoi kernel: [ 3072.440421]    domain 2: span 0-1 level NODE
Sep 11 23:43:50 hanoi kernel: [ 3072.440425]     groups: 0-1
Sep 11 23:43:50 hanoi kernel: [ 3072.785007] ehci_hcd 0000:00:1a.7: HC died; cleaning up

and when I redock:

Sep 11 23:44:15 hanoi kernel: [ 3097.228589] irq 22: nobody cared (try booting with the "irqpoll" option)
Sep 11 23:44:15 hanoi kernel: [ 3097.228608] Pid: 57, comm: kacpi_notify Not tainted 2.6.27-3-generic #1
Sep 11 23:44:15 hanoi kernel: [ 3097.228617] 
Sep 11 23:44:15 hanoi kernel: [ 3097.228619] Call Trace:
Sep 11 23:44:15 hanoi kernel: [ 3097.228626]  <IRQ>  [<ffffffff8029e59b>] __report_bad_irq+0x2b/0x90

2. kern.log-without-ehci_hcd: output from a series of dock/undocks where I have specifically not loaded the ehci_hcd module.  After some time I was unable to reproduce the issue

I know very little about the Linux kernel but it seems to me that ehci_hcd is going away on undock.  On redock certain interrupts get fired off but ehci_hcd isn't there to handle them
Comment 8 Lewis Thompson 2008-09-11 16:31:30 UTC
Created attachment 17731
includes detailed stack trace
Comment 9 Lewis Thompson 2008-09-11 16:36:56 UTC
Created attachment 17732
unable to reproduce the issue without ehci_hcd

WORKAROUND to prevent this issue (on Ubuntu 8.10 alpha 5):

# echo "blacklist ehci_hcd" > /etc/modprobe.d/blacklist-usb
# update-initramfs -u

(this disables high speed USB)
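
After rebooting with the blacklist in place, it may be worth confirming that the module really is absent before retesting. A minimal sketch, assuming the standard /proc/modules layout where the module name is the first field; module_loaded is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the named module appears in a
# /proc/modules-style listing (module name is the first field).
module_loaded() {
    name="$1"
    modules="${2:-/proc/modules}"
    # Exit status 0 if a row matched, 1 otherwise.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules"
}
```

For example, "module_loaded ehci_hcd && echo still loaded" prints a message only if the blacklist did not take effect.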
Comment 10 Lewis Thompson 2008-09-11 16:38:01 UTC
Created attachment 17733
includes detailed stack trace
Comment 11 Shaohua 2008-09-21 19:59:06 UTC
It looks like the dock has a USB hub, and undocking the hub is causing a problem.
Please attach the output of 'acpidump', and also the lsusb output after docking.
Comment 12 Shaohua 2008-09-21 20:02:56 UTC
Also, can you try without USB device drivers such as usblp/usbhid, but with the host controller drivers (ehci/uhci) loaded?
Comment 13 Lewis Thompson 2008-09-22 08:58:14 UTC
Created attachment 17946
acpidump output; machine booted up docked, never undocked (although suspended/resumed)
Comment 14 Lewis Thompson 2008-09-22 09:04:26 UTC
Created attachment 17949
lsusb -v output prior to docking
Comment 15 Lewis Thompson 2008-09-22 09:05:16 UTC
Created attachment 17950
lsusb -v output after docking

lewiz@hanoi:~$ sudo lsusb -v > lsusb-v.post-dock
can't get debug descriptor: Connection timed out
Comment 16 Lewis Thompson 2008-09-22 09:51:20 UTC
Created attachment 17951
able to see the issue without various usb drivers loaded

disabled the following modules:

usbhid, snd_usb_audio, snd_usb_lib, usblp, btusb (also removed uvc webcam)

I was unable to remove usb_storage as I am booting from a USB disk

Still able to see the issue.  Included the stack trace, lspci -vvvnn, lsusb -v and lsmod outputs
Comment 17 Shaohua 2008-09-22 19:41:03 UTC
I suspect some drivers are still operating on devices under the hub after the hub is undocked. Currently USB is not hooked into the docking station code.

But let's have our USB guys look at this.
Comment 18 David Brownell 2008-09-22 22:08:39 UTC
Pressing the undock button is followed by USB errors; the many messages that follow are harmless in this context.

The "bad IRQ" thing seems a bit dubious.  Is it really "bad" or is it just that the drivers all said their IRQ handler had nothing to do?  If the latter, that's not really a problem if it happens now and then.  If it happens constantly, then worry; otherwise, it's just one outcome of drivers racing against hardware.
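
One way to tell the occasional case from the constant one is to sample the counters for the affected interrupt line in /proc/interrupts twice and compare them. A minimal sketch, assuming IRQ 22 as reported in comment 7; irq_row is a hypothetical helper name.

```shell
# Hypothetical helper: print the /proc/interrupts row for one IRQ number.
# The row holds the per-CPU counts and the handlers registered on that line.
irq_row() {
    irq="$1"
    file="${2:-/proc/interrupts}"
    # The first field of each row is the IRQ number followed by a colon.
    awk -v n="$irq:" '$1 == n' "$file"
}
```

For example, "irq_row 22; sleep 5; irq_row 22" shows how far the counts moved in five seconds and which drivers share the line.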
Comment 19 Alan Stern 2008-09-23 07:13:03 UTC
There is a pair of patches which might help with this problem, both located at http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/gregkh-02-usb.current/

They are usb-fix-ehci-periodic-transfers.patch and usb-ehci-fix-some-ehci-hangs-and-crashes.patch.
Comment 20 Zhang Rui 2008-11-16 23:30:30 UTC
Lewis,
does the problem still exist in the latest kernel release?
Comment 21 Zhang Rui 2008-12-16 21:21:17 UTC
No response from the bug reporter.
Lewis, please re-open it if you can reproduce the problem in the latest upstream kernel.
