Bug 8432 - USB device keeps resetting while using ehci_hcd
Status: CLOSED CODE_FIX
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: i386 Linux
Importance: P2 high
Assignee: Alan Stern
Reported: 2007-05-05 11:25 UTC by Jakub Schmidtke
Modified: 2007-06-09 13:58 UTC
Kernel Version: 2.6.20

Attachments
Dmesg of 2.6.19.7 version (19.87 KB, text/plain)
2007-05-05 11:26 UTC, Jakub Schmidtke
Dmesg of 2.6.20 version (22.48 KB, text/plain)
2007-05-05 11:26 UTC, Jakub Schmidtke
lsusb and lsusb -v -s from 2.6.20 version (3.26 KB, text/plain)
2007-05-05 11:28 UTC, Jakub Schmidtke
/proc/bus/usb/devices from 2.6.20 with no additional devices (3.90 KB, text/plain)
2007-05-06 16:32 UTC, Jakub Schmidtke
/sys/kernel/debug/usbmon/5t from 2.6.20 (19.04 KB, text/plain)
2007-05-06 16:34 UTC, Jakub Schmidtke
/sys/kernel/debug/usbmon/5t from 2.6.19 (974 bytes, text/plain)
2007-05-06 16:35 UTC, Jakub Schmidtke
Modify bad bInterval values (1.85 KB, patch)
2007-05-08 09:41 UTC, Alan Stern
dmesg | grep 5-7 for 2.6.20 with 'bad bInterval values' patch (1.67 KB, text/plain)
2007-05-08 16:11 UTC, Jakub Schmidtke

Description Jakub Schmidtke 2007-05-05 11:25:49 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.19.7
Distribution: Arch
Hardware Environment: Asus G1 laptop
Software Environment:
Problem Description: When I use the 2.6.20 kernel, dmesg shows a new message
about every second:
"usb 5-7: reset high speed USB device using ehci_hcd and address 6"
This device is (probably) the small OLED display in my laptop, which is not
supported anyway. With 2.6.19.7 (vanilla) there were no such warnings; with
2.6.20 (vanilla) there are. Also, my USB keyboard behaves strangely: when I
hold a key down, it repeats a few times, pauses for a while, repeats again,
pauses, and so on. Sometimes it also produces extra characters, for example
'modproooobee' instead of 'modprobe'.
When I do modprobe -r ehci_hcd, my keyboard stops responding and starts working
again after a while. The OLED device is still detected by the lsusb command,
but it stops producing this warning.
The 2.6.21 version has exactly the same issue.

Steps to reproduce: Boot the laptop with a 2.6.20 or 2.6.21 kernel.
Comment 1 Jakub Schmidtke 2007-05-05 11:26:27 UTC
Created attachment 11399
Dmesg of 2.6.19.7 version
Comment 2 Jakub Schmidtke 2007-05-05 11:26:44 UTC
Created attachment 11400
Dmesg of 2.6.20 version
Comment 3 Jakub Schmidtke 2007-05-05 11:28:17 UTC
Created attachment 11401
lsusb and lsusb -v -s from 2.6.20 version

Output of lsusb and lsusb -v -s 5:6 from 2.6.20. In 2.6.19 it was exactly the
same, apart from the position of my USB printer in the lsusb output.
Comment 4 Alan Stern 2007-05-06 15:29:27 UTC
There are ways to work around this problem.  But it would be better to find the
real cause and fix it, if possible.

A good way to start would be to unplug all your other high-speed USB devices and
use the usbmon facility to see what's happening with this 5-6 device. 
Instructions are in the kernel source file Documentation/usb/usbmon.txt.  You
might also want to turn on CONFIG_USB_DEBUG.
Comment 5 Alan Stern 2007-05-06 15:33:01 UTC
Sorry, I meant 5-7, not 5-6.

In the end we might want to compare the usbmon logs from 2.6.19 and 2.6.20.  For
now just try 2.6.20.

Also, could you attach the contents of /proc/bus/usb/devices?  You might need to
mount the directory first (mount -t usbfs none /proc/bus/usb).
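
The usbmon capture could be scripted roughly as follows. This is only a sketch:
it assumes debugfs is already mounted at /sys/kernel/debug and the usbmon
module is loaded (see usbmon.txt), and that the per-bus text file is named
usbmon/<bus>t, e.g. 5t for bus 5. The function name capture_usbmon and the
DEBUGFS_ROOT override are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch: record usbmon text output for one bus for a fixed time.
# Prerequisites (run as root, per Documentation/usb/usbmon.txt):
#   mount -t debugfs none_debugs /sys/kernel/debug
#   modprobe usbmon
# DEBUGFS_ROOT is a hypothetical override so the function can be tried
# against an ordinary directory instead of the real debugfs.
capture_usbmon() {
    bus=$1
    seconds=$2
    out=$3
    src="${DEBUGFS_ROOT:-/sys/kernel/debug}/usbmon/${bus}t"

    if [ ! -r "$src" ]; then
        echo "cannot read $src (is usbmon loaded?)" >&2
        return 1
    fi

    cat "$src" > "$out" &             # usbmon files block, so read in the background
    pid=$!
    sleep "$seconds"
    kill "$pid" 2>/dev/null || true   # stop the reader; ignore "already exited"
    wait "$pid" 2>/dev/null || true
    return 0
}
```

For example, capture_usbmon 5 15 /tmp/5t.log would record bus 5 for about 15
seconds into /tmp/5t.log.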
Comment 6 Jakub Schmidtke 2007-05-06 16:32:20 UTC
Created attachment 11414
/proc/bus/usb/devices from 2.6.20 with no additional devices

There are two devices: the offending one, made by ASUS, and a Syntek webcam,
which I cannot switch off because it is built in. I booted my laptop with
nothing else connected to USB, and I switched off my built-in USB Bluetooth
device with the ACPI hotkeys.

The same file from 2.6.19.7 is almost identical. The only differences are:

"Manufacturer=Linux 2.6.19.7 uhci_hcd" instead of "Manufacturer=Linux 2.6.20
uhci_hcd" everywhere, and the "E:" line in the Product=EHCI Host Controller
section (the first section in this file):

In 2.6.20 it is:
E:  Ad=81(I) Atr=03(Int.) MxPS=   4 Ivl=256ms

and in 2.6.19.7 it is:
E:  Ad=81(I) Atr=03(Int.) MxPS=   2 Ivl=256ms
Comment 7 Jakub Schmidtke 2007-05-06 16:34:22 UTC
Created attachment 11415
/sys/kernel/debug/usbmon/5t from 2.6.20

I started logging and kept it on for about 10-15 seconds.
Comment 8 Jakub Schmidtke 2007-05-06 16:35:37 UTC
Created attachment 11416
/sys/kernel/debug/usbmon/5t from 2.6.19

There was almost no activity in this file. Something new was printed only when
I ran lsusb. I have attached the file after a few 'lsusb' commands.
Comment 9 Jakub Schmidtke 2007-05-06 16:40:38 UTC
I did those tests for 2.6.19.7 and 2.6.20, both with CONFIG_USB_DEBUG on.
While in 2.6.19 there was no additional dmesg output, in 2.6.20 it constantly
produced the same lines:

usbhid 5-7:1.0: resetting device
usbhid 5-7:1.0: suspend
ehci_hcd 0000:00:1d.7: port 7 high speed
ehci_hcd 0000:00:1d.7: GetStatus port 7 status 001005 POWER sig=se0 PE CONNECT
usb 5-7: reset high speed USB device using ehci_hcd and address 4
ehci_hcd 0000:00:1d.7: port 7 high speed
ehci_hcd 0000:00:1d.7: GetStatus port 7 status 001005 POWER sig=se0 PE CONNECT
 usbdev5.4_ep81: ep_device_release called for usbdev5.4_ep81
 usbdev5.4_ep02: ep_device_release called for usbdev5.4_ep02
usbhid 5-7:1.0: resume status -22
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: retrying intr urb
usbhid 5-7:1.0: resetting device
usbhid 5-7:1.0: suspend
ehci_hcd 0000:00:1d.7: port 7 high speed


and so on. After a while my dmesg contained nothing but those lines, and a new
'block' of them appeared every second.

Comment 10 Alan Stern 2007-05-07 08:58:27 UTC
I see the problem.  The device reports bogus Interval values (i.e., 0) for its
two Interrupt endpoints.  You can see them in the lsusb output.  They show up
even more clearly in the /proc/bus/usb/devices output (note "Ivl" is short for
"Interval"):

   E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=-2147483648us
   E:  Ad=02(O) Atr=03(Int.) MxPS= 512 Ivl=-2147483648us

Obviously they managed to confuse the USB stack.  In fact, it's not clear why
2.6.19 didn't give the same errors.
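
A possible explanation for the odd Ivl figure, offered as a sketch rather than
a reading of the actual kernel source: for a high-speed Interrupt endpoint the
USB 2.0 polling period is 2^(bInterval-1) microframes of 125 us each, with
bInterval expected to be in the range 1..16. If that is computed with a 32-bit
shift, bInterval=0 asks for a shift by -1, which on x86 typically behaves like
a shift by 31 and overflows. The helper names to_int32 and hs_interval_us are
hypothetical, and the 5-bit shift-count masking is an assumption about x86
behaviour.

```shell
#!/bin/sh
# Sketch: emulate 32-bit signed arithmetic to show how bInterval=0 could
# turn into a huge negative interval. Not kernel code.

# Wrap a value to the range of a signed 32-bit integer.
to_int32() {
    v=$(( $1 & 4294967295 ))
    if [ "$v" -ge 2147483648 ]; then
        v=$(( v - 4294967296 ))
    fi
    echo "$v"
}

# Interval in microseconds for a high-speed Interrupt endpoint,
# computed as (1 << (bInterval - 1)) * 125 in emulated 32-bit arithmetic.
hs_interval_us() {
    binterval=$1
    # Assumption: only the low 5 bits of the shift count are used (x86).
    count=$(( (binterval - 1) & 31 ))
    uframes=$(to_int32 $(( 1 << count )))
    to_int32 $(( uframes * 125 ))
}
```

Under these assumptions, hs_interval_us 1 gives 125, and hs_interval_us 0
gives -2147483648, which matches the Ivl value shown above.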

One approach would be to patch the drivers so that they handle Interval=0 values
in some reasonable way.  In fact we should do that anyhow.  But for now, your
easiest solution is to tell the kernel to stop using this device entirely.  You
can do that as follows:

   echo -n 5-7 >/sys/bus/usb/drivers/usb/unbind

If you put this line in a startup script then you should be more or less okay.
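
A startup script along these lines might do it. This is a minimal sketch,
assuming the device always enumerates as 5-7 and that the script runs as root
once ehci_hcd has been loaded. The function name unbind_usb_device and the
SYSFS_ROOT override are hypothetical and exist only to make the sketch easy to
try out.

```shell
#!/bin/sh
# Sketch: unbind one USB device from the generic "usb" driver.
# SYSFS_ROOT is a hypothetical override; on a real system it stays /sys.
unbind_usb_device() {
    dev=$1
    drv="${SYSFS_ROOT:-/sys}/bus/usb/drivers/usb"

    # Nothing to do if the device is not currently bound.
    if [ ! -e "$drv/$dev" ]; then
        return 0
    fi

    # Same effect as: echo -n 5-7 >/sys/bus/usb/drivers/usb/unbind
    printf '%s' "$dev" > "$drv/unbind"
}
```

The script would then end with a call such as unbind_usb_device 5-7. Checking
that the device entry exists first avoids a write error if the device has
already been unbound.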
Comment 11 Jakub Schmidtke 2007-05-07 13:47:23 UTC
I created an additional /etc/rc.d script to do that echo, so that it runs as
soon as possible, just after ehci_hcd has been loaded. (If I run it from
/etc/rc.local, the reset has already happened a few times by then.) It works
fine, thanks! :)

By the way, does this mean that the device does something wrong?
Does it also mean that writing a driver for it could be impossible? I hoped to
learn about USB drivers someday and maybe try to write something, as it should
be a really simple device... ;)
Comment 12 Alan Stern 2007-05-07 14:18:56 UTC
Yes, the device does something wrong.  It has illegal values stored in the
descriptors it sends to the computer.

Writing a driver for it isn't impossible.  In fact, the existing driver should
work okay with a few small changes.

BTW, what happens if you do "rmmod ehci-hcd"?  What does the lsusb output show then?
Comment 13 Jakub Schmidtke 2007-05-07 14:45:49 UTC
When I did rmmod ehci_hcd, my USB keyboard stopped working for a while. After
it came back, the Asus device was back on the lsusb list:

Bus 004 Device 002: ID 0b05:1726 ASUSTek Computer, Inc.
Bus 004 Device 001: ID 0000:0000
Bus 002 Device 001: ID 0000:0000
Bus 001 Device 006: ID 046d:c312 Logitech, Inc.
Bus 001 Device 005: ID 046d:c01d Logitech, Inc.
Bus 001 Device 004: ID 2304:0208 Pinnacle Systems, Inc. [hex] Pinnacle Studio PCTV USB2
Bus 001 Device 003: ID 04b4:6560 Cypress Semiconductor Corp. CY7C65640 USB-2.0 "TetraHub"
Bus 001 Device 001: ID 0000:0000
Bus 003 Device 003: ID 05e1:0501 Syntek Semiconductor Co., Ltd
Bus 003 Device 001: ID 0000:0000

(This time I have more devices connected).

Before I did that, lsusb output was:

Bus 004 Device 001: ID 0000:0000
Bus 002 Device 001: ID 0000:0000
Bus 001 Device 001: ID 0000:0000
Bus 003 Device 001: ID 0000:0000
Bus 005 Device 001: ID 0000:0000
Bus 005 Device 002: ID 04b4:6560 Cypress Semiconductor Corp. CY7C65640 USB-2.0 "TetraHub"
Bus 005 Device 009: ID 046d:c312 Logitech, Inc.
Bus 005 Device 007: ID 046d:c01d Logitech, Inc.
Bus 005 Device 006: ID 2304:0208 Pinnacle Systems, Inc. [hex] Pinnacle Studio PCTV USB2
Bus 005 Device 004: ID 05e1:0501 Syntek Semiconductor Co., Ltd


As for the device driver - for that piece of hardware there is no driver I am
aware of (but I might be wrong)...
Comment 14 Jakub Schmidtke 2007-05-07 14:49:05 UTC
One more thing: dmesg shows that when I removed ehci_hcd, all USB devices were
disconnected and then connected again, this time using the uhci_hcd driver.

After I modprobed ehci_hcd, USB stopped working again, and once it was back I
had the same issue with the Asus device. I had to do that echo -n 5-7 (...)
again to get rid of it.
Comment 15 Alan Stern 2007-05-08 09:41:54 UTC
Created attachment 11435
Modify bad bInterval values

Here's a patch for 2.6.20.  Once you have this installed, you shouldn't need
the special startup script any more.  Let me know how it works.
Comment 16 Jakub Schmidtke 2007-05-08 16:11:00 UTC
Created attachment 11439
dmesg | grep 5-7 for 2.6.20 with 'bad bInterval values' patch

That works great :) I have attached the output of dmesg | grep 5-7, which
suggests the patch is working properly :)
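
One way to confirm that the resets have stopped is to count the reset messages
in the kernel log. A small sketch, assuming the message text matches the line
quoted in the description; count_resets is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count "reset high speed USB device" messages for one device.
# Reads kernel log text on stdin, e.g. from dmesg.
count_resets() {
    dev=$1
    # -F treats the pattern as a fixed string; -c prints the match count.
    # grep exits non-zero when the count is 0, hence "|| true".
    grep -F -c -- "usb $dev: reset high speed USB device" || true
}
```

Running dmesg | count_resets 5-7 twice, a few seconds apart, should give the
same number both times if the device is no longer being reset.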

Will it be included in 2.6.21? (config.c is exactly the same in 2.6.20 and
2.6.21.1)
Comment 17 Alan Stern 2007-05-10 10:39:47 UTC
I'll submit the patch, or a slight variant of it.  It probably won't get into
2.6.21 since there's a simple workaround (that startup script).  It might get
into 2.6.22.
Comment 18 Alan Stern 2007-06-09 13:58:26 UTC
The patch has been accepted.  Closing the bug report.
