Bug 11159
Summary: | reset high speed USB device using ehci_hcd | ||
---|---|---|---|
Product: | Drivers | Reporter: | Balázs Hámorszky (balihb) |
Component: | USB | Assignee: | Greg Kroah-Hartman (greg) |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | andrej, apsetupwvlink, cagnulein, cc, clarkembers, davidgutierrez2253, dencorpos, evanchsa, gryffus, icephoenix.nx1729+kernel, john.m.lang, kernel.org, kes-kes, mail2staisy, maomfamao, mikko.rantalainen, mitch, mywifiapsetup, mywifiextendersetups, mywifiextllc, mywifiextloginhelp, Mywifiextsetupus, mywifiiextsetup.us, peter.volkov, stern, the.ridikulus.rat, uzytkownik2 |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.26 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg
kernel output (before and after)
lsusb
debug output
my kernel config
ehci registers before
ehci registers after
dmesg dump
Dmesg and ohci registers
Fix unending polling in OHCI
Dmesg and USB Registers
dmesg and usb register dumps
Make EHCI retry transaction errors 32 times
Add Clear-TT-Buffer callback
Make ehci-hcd wait for Clear-TT-Buffer to complete
log file with problems
log files of different OSes caused and not caused by this problem |
Description
Balázs Hámorszky
2008-07-24 22:45:58 UTC
Created attachment 16976 [details]
dmesg
Created attachment 16977 [details]
kernel output (before and after)
Created attachment 16978 [details]
lsusb
after the ehci module gets confused
Created attachment 16979 [details]
debug output
debug output with a kernel from git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
with:
CONFIG_USB_DEBUG=y
CONFIG_USB_STORAGE_DEBUG=y
The same problem is present on 2.6.25-2-686 (from Debian sid) and on 2.6.24. On 2.6.24 there is no error message; EHCI just stops working, and it takes longer to die.
Created attachment 16980 [details]
my kernel config
Reply-To: akpm@linux-foundation.org

(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Thu, 24 Jul 2008 22:45:58 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11159
>
> Summary: reset high speed USB device using ehci_hcd
> Product: Drivers
> Version: 2.5
> KernelVersion: 2.6.26
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: USB
> AssignedTo: greg@kroah.com
> ReportedBy: balihb@gmail.com
>
> Latest working kernel version:
> Earliest failing kernel version:
> Distribution: Debian/sid
> Hardware Environment: ASUS M3A (AMD 770/SB600:http://www.asus.com/products.aspx?modelmenu=2&model=1934&l1=3&l2=149&l3=592&l4=0)
> Software Environment:
> Problem Description:
> After using any usb2.0 device I have on my machine (only on linux, as on winxp
> there is no problem) for a few minutes I get this:
> usb 6-5: reset high speed USB device using ehci_hcd and address 8
> usb 6-5: device descriptor read/64, error -110
> usb 6-5: device descriptor read/64, error -110
> usb 6-5: reset high speed USB device using ehci_hcd and address 8
> usb 6-5: device descriptor read/64, error -110
> usb 6-5: device descriptor read/64, error -110
> usb 6-5: reset high speed USB device using ehci_hcd and address 8
> usb 6-5: device not accepting address 8, error -110
> usb 6-5: reset high speed USB device using ehci_hcd and address 8
> usb 6-5: device not accepting address 8, error -110
> usb 6-5: USB disconnect, address 8
> sd 4:0:0:0: Device offlined - not ready after error recovery
> sd 4:0:0:0: [sdc] Result: hostbyte=0x01 driverbyte=0x00
> end_request: I/O error, dev sdc, sector 530560
> sd 4:0:0:0: [sdc] Result: hostbyte=0x01 driverbyte=0x00
> end_request: I/O error, dev sdc, sector 530576
> Buffer I/O error on device sdc1, logical block 138851
> lost page write due to I/O error on sdc1
> sd 4:0:0:0: [sdc] READ CAPACITY failed
> sd 4:0:0:0: [sdc] Result: hostbyte=0x01 driverbyte=0x00
> sd 4:0:0:0: [sdc] Sense not available.
> sd 4:0:0:0: [sdc] Write Protect is off
> sd 4:0:0:0: [sdc] Mode Sense: 00 00 00 00
> sd 4:0:0:0: [sdc] Assuming drive cache: write through
> sd 4:0:0:0: [sdc] READ CAPACITY failed
> sd 4:0:0:0: [sdc] Result: hostbyte=0x01 driverbyte=0x00
> sd 4:0:0:0: [sdc] Sense not available.
> sd 4:0:0:0: [sdc] Write Protect is off
> sd 4:0:0:0: [sdc] Mode Sense: 00 00 00 00
> sd 4:0:0:0: [sdc] Assuming drive cache: write through
> sd 4:0:0:2: [sde] READ CAPACITY failed
> sd 4:0:0:2: [sde] Result: hostbyte=0x01 driverbyte=0x00
> sd 4:0:0:2: [sde] Sense not available.
> sd 4:0:0:2: [sde] Write Protect is off
> sd 4:0:0:2: [sde] Mode Sense: 00 00 00 00
> sd 4:0:0:2: [sde] Assuming drive cache: write through
> sd 4:0:0:1: [sdd] READ CAPACITY failed
> sd 4:0:0:1: [sdd] Result: hostbyte=0x01 driverbyte=0x00
> sd 4:0:0:1: [sdd] Sense not available.
> sd 4:0:0:1: [sdd] Write Protect is off
> sd 4:0:0:1: [sdd] Mode Sense: 00 00 00 00
> sd 4:0:0:1: [sdd] Assuming drive cache: write through
> Buffer I/O error on device sdc1, logical block 138851
> lost page write due to I/O error on sdc1
> sd 4:0:0:3: [sdf] READ CAPACITY failed
> sd 4:0:0:3: [sdf] Result: hostbyte=0x01 driverbyte=0x00
> sd 4:0:0:3: [sdf] Sense not available.
> sd 4:0:0:3: [sdf] Write Protect is off
> sd 4:0:0:3: [sdf] Mode Sense: 00 00 00 00
> sd 4:0:0:3: [sdf] Assuming drive cache: write through
> sd 4:0:0:1: [sdd] READ CAPACITY failed
> sd 4:0:0:1: [sdd] Result: hostbyte=0x01 driverbyte=0x00
> sd 4:0:0:1: [sdd] Sense not available.
> sd 4:0:0:2: [sde] READ CAPACITY failed
> sd 4:0:0:2: [sde] Result: hostbyte=0x01 driverbyte=0x00
> sd 4:0:0:2: [sde] Sense not available.
> ...
> And after that nothing get recognized by the ehci module.
>
> Steps to reproduce:
>

Reply-To: me@felipebalbi.com

Hi,

On Thu, 24 Jul 2008 23:15:56 -0700, Andrew Morton <akpm@linux-foundation.org> wrote:
>> After using any usb2.0 device I have on my machine (only on linux, as on winxp
>> there is no problem) for a few minutes I get this:
>> usb 6-5: reset high speed USB device using ehci_hcd and address 8
>> usb 6-5: device descriptor read/64, error -110
>> ...

What were you doing with the device before it resets ? If it was mounted and it somehow reset, it could be even a flaky cable.

Check, also, that hald-addon-storage is probably polling an unexistent /dev/sdX.
On Fri, Jul 25, 2008 at 12:49, Felipe Balbi <me@felipebalbi.com> wrote:
> Hi,
>
> On Thu, 24 Jul 2008 23:15:56 -0700, Andrew Morton <akpm@linux-foundation.org> wrote:
>>> After using any usb2.0 device I have on my machine (only on linux, as on winxp
>>> there is no problem) for a few minutes I get this:
>>> usb 6-5: reset high speed USB device using ehci_hcd and address 8
>>> usb 6-5: device descriptor read/64, error -110
>>> ...
>
> What were you doing with the device before it resets ?

I was copying from it.

> If it was mounted and it somehow reset, it could be
> even a flaky cable.

It can't be. On Windows I have no problem.

> Check, also, that hald-addon-storage is probably polling
> an unexistent /dev/sdX.

how?

> --
> Best Regards,
>
> Felipe Balbi
> http://blog.felipebalbi.com
> me@felipebalbi.com

Reply-To: me@felipebalbi.com

Hi,

On Fri, 25 Jul 2008 12:58:41 +0200, "Balázs Hámorszky" <balihb@gmail.com> wrote:
>> What were you doing with the device before it resets ?
> > I was copying from it.

Can you rebuild your kernel with CONFIG_USB_DEBUG enabled ?

>> Check, also, that hald-addon-storage is probably polling
>> an unexistent /dev/sdX.
>
> how?

ps aux | grep hald-addon-storage

If any hald-addon-storage is polling after the device has gone offline, you should see something like this:

root 5841 0.0 0.0 24156 1280 ? S Jul16 0:44 hald-addon-storage: polling /dev/scd0 (every 2 sec)

On Fri, Jul 25, 2008 at 13:01, Felipe Balbi <me@felipebalbi.com> wrote:
> Hi,
>
> On Fri, 25 Jul 2008 12:58:41 +0200, "Bal

Reply-To: me@felipebalbi.com

On Fri, 25 Jul 2008 13:15:11 +0200, "Balázs Hámorszky" <balihb@gmail.com> wrote:
> I've already done it and attached to the bug report
> (http://bugzilla.kernel.org/show_bug.cgi?id=11159).

Beats me... I don't have the hardware to test, and I suppose the problem is not in the reset itself. The reset probably happens because the device disconnects first. I suppose it's some sort of current spike during the copy operation. It might be a "quirk" in AMD EHCI only.

Dave, any comments ?

On Thu, 24 Jul 2008, Andrew Morton wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=11159
> >
> > Summary: reset high speed USB device using ehci_hcd
>
> > After using any usb2.0 device I have on my machine (only on linux, as on winxp
> > there is no problem) for a few minutes I get this:
> > usb 6-5: reset high speed USB device using ehci_hcd and address 8
> > usb 6-5: device descriptor read/64, error -110
> > usb 6-5: device descriptor read/64, error -110
> > usb 6-5: reset high speed USB device using ehci_hcd and address 8
> > ...
> > And after that nothing get recognized by the ehci module.

What makes you think that nothing gets recognized by ehci-hcd? Did you try unplugging the device and plugging it back in? Or did you try plugging in another high-speed device?

What happens if you try the suggestion in http://bugzilla.kernel.org/show_bug.cgi?id=9638#c34 ?

Alan Stern

On Fri, Jul 25, 2008 at 15:33, Alan Stern <stern@rowland.harvard.edu> wrote:
> What makes you think that nothing gets recognized by ehci-hcd? Did you
> try unplugging the device and plugging it back in? Or did you try
> plugging in another high-speed device?

In the attachment "kernel output (before and after)", after the ehci failed with the card reader I plugged in a pendrive, which was only recognised by ohci. I've tried unplugging many other times, but most of the time I got the descriptor read error.

> What happens if you try the suggestion in
> http://bugzilla.kernel.org/show_bug.cgi?id=9638#c34 ?

I will try.

> Alan Stern

Reply-To: david-b@pacbell.net

On Friday 25 July 2008, Felipe Balbi wrote:
> Dave, any comments ?

No; doesn't happen for me, never seen it, I have no insights. Beyond the fact that at least one recent failure seems to have been caused by the device getting suspended during enumeration ...

On Fri, 25 Jul 2008, Bal

On Sat, Jul 26, 2008 at 05:57, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Fri, 25 Jul 2008, Bal

On Sun, 27 Jul 2008, Bal

Created attachment 17009 [details]
ehci registers before
Created attachment 17010 [details]
ehci registers after
after a pendrive was attached and the ehci module went wrong
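The before/after register files attached here come from the controller's debugfs directory. A minimal Python sketch of how such snapshots could be captured and compared, assuming debugfs is mounted on /sys/kernel/debug and the controller sits at PCI address 0000:00:13.5 as in this report (both differ per machine, and reading debugfs normally needs root). The helper names `snapshot_debug_dir` and `changed_files` are hypothetical, not part of any kernel tool.

```python
from pathlib import Path

# Path as used in this bug report; adjust the PCI address for your controller.
EHCI_DEBUG_DIR = Path("/sys/kernel/debug/ehci/0000:00:13.5")


def snapshot_debug_dir(directory):
    """Read every regular file in `directory` and return {file name: contents}."""
    snapshot = {}
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():
            # errors="replace" keeps the dump going if a file has odd bytes.
            snapshot[entry.name] = entry.read_text(errors="replace")
    return snapshot


def changed_files(before, after):
    """Return the sorted names whose contents differ between two snapshots."""
    names = before.keys() | after.keys()
    return sorted(n for n in names if before.get(n) != after.get(n))


if __name__ == "__main__":
    try:
        files = snapshot_debug_dir(EHCI_DEBUG_DIR)
    except OSError as exc:
        print(f"cannot read {EHCI_DEBUG_DIR}: {exc}")
    else:
        for name, text in files.items():
            print(f"==> {name} <==")
            print(text)
```

Taking one snapshot right after boot and another once the controller has failed, then passing both to `changed_files`, narrows down which debug files are worth attaching.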
On Mon, Jul 28, 2008 at 00:16, Alan Stern <stern@rowland.harvard.edu> wrote:
> Let's say you have mounted a debugfs filesystem on /sys/kernel/debug
> and your EHCI controller is located at PCI address 0000:00:13.5. Then
> the files of interest are those located in
> /sys/kernel/debug/ehci/0000:00:13.5/.

I've attached the two registers files to the bug report.

On Thu, 31 Jul 2008, Bal

On Thu, Jul 31, 2008 at 18:06, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 31 Jul 2008, Bal

On Thu, 31 Jul 2008, Bal

On Thu, Jul 31, 2008 at 23:30, Alan Stern <stern@rowland.harvard.edu> wrote:
> That confirms the guess that your EHCI hardware has something wrong. I
> have no idea how or why it is able to work under Windows, though.

Would it help if I made a log with http://benoit.papillault.free.fr/usbsnoop/index.php, or should I try another OS (BSD or Solaris)?

On Thu, 31 Jul 2008, Bal

Reply-To: david-b@pacbell.net

On Thursday 31 July 2008, Alan Stern wrote:
> On Thu, 31 Jul 2008, Bal

I've written a mail to ASUS. This was the REALLY useful answer:

Due to the facts that there are various versions of Linux system, we cannot offer support for each version, could you please visit the official website of the OS you are using, and check if there is anything that help? If you still have any problem or the problem still exists, please feel free to contact me. Thank you for the support! Best regards and have a nice day!

On Fri, 1 Aug 2008, Bal

The hub doesn't help. When I only copied from the pendrive (and ran du -s *) it worked, but with a different pendrive and copying to it:

usb 6-8.3: new high speed USB device using ehci_hcd and address 8
usb 6-8.3: configuration #1 chosen from 1 choice
usb 6-8.3: New USB device found, idVendor=13fe, idProduct=1d00
usb 6-8.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 6-8.3: Product: DataTraveler 2.0
usb 6-8.3: Manufacturer: Kingston
usb 6-8.3: SerialNumber: 5B6C13851DC0
Initializing USB Mass Storage driver...
scsi4 : SCSI emulation for USB Mass Storage devices
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 8
usb-storage: waiting for device to settle before scanning
usb-storage: device scan complete
scsi 4:0:0:0: Direct-Access Kingston DataTraveler 2.0 PMAP PQ: 0 ANSI: 0 CCS
sd 4:0:0:0: [sdc] 4030464 512-byte hardware sectors (2064 MB)
sd 4:0:0:0: [sdc] Write Protect is off
sd 4:0:0:0: [sdc] Mode Sense: 23 00 00 00
sd 4:0:0:0: [sdc] Assuming drive cache: write through
sd 4:0:0:0: [sdc] 4030464 512-byte hardware sectors (2064 MB)
sd 4:0:0:0: [sdc] Write Protect is off
sd 4:0:0:0: [sdc] Mode Sense: 23 00 00 00
sd 4:0:0:0: [sdc] Assuming drive cache: write through
sdc: sdc1
sd 4:0:0:0: [sdc] Attached SCSI removable disk
sd 4:0:0:0: Attached scsi generic sg3 type 0
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8
sd 4:0:0:0: [sdc] Result: hostbyte=0x05 driverbyte=0x00
end_request: I/O error, dev sdc, sector 265
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8
usb 6-8.3: device descriptor read/all, error -110
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8
usb 6-8.3: reset high speed USB device using ehci_hcd and address 8

I have an SB600 chipset and am experiencing the same or similar problem. I have a USB hard drive attached as well as a USB keyboard and mouse. When I "exercise" the hard drive via:

for f in `seq 100`; do dd if=/dev/zero of=test bs=64k count=6000; done

One of two things happens:
1. My mouse dies and no amount of unplug/replug works
2. The mouse works but is choppy.

The reset messages stop entering the log if I stop my hard drive exerciser command.
Excerpt from dmesg:

usb 1-2.6: new low speed USB device using ehci_hcd and address 4
usb 1-2.6: configuration #1 chosen from 1 choice
input: Logitech Inc. iFeel Mouse as /devices/pci0000:00/0000:00:13.5/usb1/1-2/1-2.6/1-2.6:1.0/input/input6
input,hidraw2: USB HID v1.00 Mouse [Logitech Inc. iFeel Mouse ] on usb-0000:00:13.5-2.6
usb 1-2.6: New USB device found, idVendor=046d, idProduct=c030
usb 1-2.6: New USB device strings: Mfr=4, Product=32, SerialNumber=0
usb 1-2.6: Product: iFeel Mouse
usb 1-2.6: Manufacturer: Logitech Inc.
process `skype' is using obsolete setsockopt SO_BSDCOMPAT
usb 1-3: new high speed USB device using ehci_hcd and address 5
usb 1-3: configuration #1 chosen from 1 choice
usb 1-3: New USB device found, idVendor=05e3, idProduct=0702
usb 1-3: New USB device strings: Mfr=2, Product=3, SerialNumber=0
usb 1-3: Product: USB Mass Storage Device
usb 1-3: Manufacturer: Genesyslogic
Initializing USB Mass Storage driver...
scsi6 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 5
usb-storage: waiting for device to settle before scanning
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usb-storage: device scan complete
scsi 6:0:0:0: Direct-Access WDC WD32 00JB-00KFA0 0811 PQ: 0 ANSI: 0
sd 6:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 6:0:0:0: [sdb] Test WP failed, assume Write Enabled
sd 6:0:0:0: [sdb] Assuming drive cache: write through
sd 6:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 6:0:0:0: [sdb] Test WP failed, assume Write Enabled
sd 6:0:0:0: [sdb] Assuming drive cache: write through
sdb: sdb1
sd 6:0:0:0: [sdb] Attached SCSI disk
sd 6:0:0:0: Attached scsi generic sg2 type 0
kjournald starting. Commit interval 5 seconds
EXT3 FS on sdb1, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4
usb 1-2.6: reset low speed USB device using ehci_hcd and address 4

Output of uname -a:
Linux saturn 2.6.25.11-97.fc9.i686 #1 SMP Mon Jul 21 01:31:09 EDT 2008 i686 athlon i386 GNU/Linux

I've found the problem. If I detach my hub, EHCI works great. On Windows I used the pendrive while the hub was attached to another USB port without a problem. On Linux, EHCI lasted longer if I attached the USB 2.0 devices to the hub, but if I attach them to another port while the hub is attached, then the problem appears (more slowly with the 2.6.24 kernel). So, if the hub is faulty, then why is there no problem with it on Windows? (At least I can use my USB 2.0 devices again.)

Regards,
Bal

On Tue, 19 Aug 2008, Bal

On Tue, Aug 19, 2008 at 17:59, Alan Stern <stern@rowland.harvard.edu> wrote:
> It might be a problem with the wiring (i.e., crosstalk), not the hub.
> What happens if you use two USB 2.0 devices at the same time (with no
> hub)?

Both of them work great.

> I can't answer your question because I don't know (1) if the hub
> really is faulty, (2) what is wrong with the hub, or (3) how Windows
> works internally.
>
> Alan Stern

On Tue, 19 Aug 2008, Bal

(In reply to comment #35)
> Okay, so probably the hub really is at fault. But I still don't know
> what's wrong with it. And more importantly, I don't know how it
> managed to crash the EHCI controller. That's really strange.
>
> Alan Stern

At the risk of declaring victory prematurely, I too had a USB hub attached which I have since removed and everything is working great.

My previous setup was:
* Direct attached to PC:
  - Disk
  - Keyboard
  - Hub
* Attached to HUB:
  - Dock for digital camera (off)
  - Corded mouse
  - iPod

In my previous setup the USB system would die when doing lots of disk access on the USB disk.

New setup is:
* Direct attached to PC:
  - Disk
  - Keyboard
  - Mouse

The system has been up and stable for hours with lots of very heavy activity on the disk. The hub is currently plugged in to another machine (Win2k) and is working just fine.

Any ideas on what information I can provide?

Stephen

At this point my best suggestion is that you install the latest 2.6.27-rc kernel with CONFIG_USB_DEBUG enabled, go back to your old setup with the hub, and wait to see what happens.

A similar issue here on two machines:
1) Asus M2400N laptop
2) IBM xSeries 330 server with a NEC USB 2.0 controller (in PCI)

Kernel version: 2.6.26.5

The affected device is a new Samsung DVD writer. Badly slow access and dozens of damaged DVD+RWs (grrr...) are the most serious symptoms of this problem.

Resets appear in bunches of about 20 - 100 in the kernel log. There are no other error messages between two resets. (I did not switch any special debugging options on...) Resets occur approximately every 30 seconds. Some bunches of resets are followed by a fatal series of I/O errors and rewritable media destruction. In many cases, however, normal operation is resumed after minutes of resetting and waiting. (Which does not save the DVD+RW from destruction, as unexpected delays seem to be a problem, too.)
I saw some interesting backtraces complaining about bugs in the pktcdvd module. Unfortunately, they occurred on a kernel tainted by the madwifi ath_pci module, so I don't know whether I should post them here or not.

I will try the autosuspend experiment mentioned here [https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/88746], but I don't believe this could work. Autosuspend is disabled in my kernel config. Enabling it and then re-disabling it at run-time does not sound logical to me. But I would do just anything to get rid of this awful problem.

Ooops... The autosuspend trick did not help at all. The problem persists.

One more important fact: the failing USB communication is somewhat dangerous for the whole system when a UDF disk is mounted. I have already experienced several hard lockups where even the magic SysRq key did not help at all.

The other people found that removing their hubs fixed the problem. Does this work for you?

I do not use any hub. The DVD writer is connected directly in both cases (on both machines). It was the only device connected via USB in the whole system at the time of the experiment. (More precisely: resets occur no matter if other USB devices are connected or not.)

Then your problem is different from the ones described in this bug report. You should start another bug report for it, or else just post a description of the problem on the linux-usb mailing list.

I have a USB keyboard and mouse (running through a KVM) and if I try to use an external USB hard-drive or a USB pen-drive I encounter this same problem. The keyboard and mouse stop responding. The USB hard-drive or pen-drive also stops working. Pen-drives are almost always readable and sometimes writeable. Hard-drives are sometimes readable but never writeable.

I've tried hooking things up through hubs and I've used different motherboard ports and tried other cables and it makes no difference. I have also tried multiple USB devices and it again makes no difference.
All these devices work without problem on other systems.

If I log in to the system remotely and do:

rmmod ehci_hcd
sleep 3
modprobe ehci_hcd

the keyboard and mouse will work once again.

I tried booting the latest Ubuntu 8.10 LiveCD beta (via a USB CD-Drive), which supposedly has kernel 2.6.27, and that failed also.

My error logs are similar to the ones others have posted but I can post some if it would help.

My system is:
openSuSE 11.0
Linux abc 2.6.25.16-0.1-default #1 SMP 2008-08-21 00:34:25 +0200 x86_64 x86_64 x86_64 GNU/Linux
ASUS M3A32-MVP motherboard (AMD 790FX/SB600)
Phenom 9850
8GB RAM

Please attach the dmesg log from a kernel built with CONFIG_USB_DEBUG enabled. If possible, make it a 2.6.27-rc8 kernel.

Created attachment 18230 [details]
dmesg dump
Just noticed you wanted 2.6.27-rc8; I used 2.6.27-rc9. Does that work?

Dmesg dump attached. Sequence of events:
- Boot
- Connect 2 ssh sessions
- Turn on USB hard-drive
- Rsync a 1GB directory to the USB drive
- Rsync hangs and keyboard stops responding (assume mouse also, but in runlevel 3)
- Ctrl-C the rsync
- Try to unmount the USB drive but it won't unmount
- Turn the USB drive power off, then it unmounts ok
- Keyboard still not responding
- rmmod ehci_hcd
- modprobe ehci_hcd
- Keyboard now works again

I wonder why your OHCI root hubs aren't autosuspending. Could you mount a debugfs filesystem and then post the contents of the ohci/0000:00:13.0/registers file?

Getting back to the main problem... What happens if you plug the keyboard/mouse and the USB drive directly into the computer, and leave the two hubs disconnected? And what happens if you plug the keyboard/mouse directly into the computer and plug the drive into one of the hubs?

Created attachment 18234 [details]
Dmesg and ohci registers
Sequence of events this time:
- KBD and USB drive connected direct, no hubs, works fine!
- KBD direct and USB drive connect to a DLink hub, works fine!
- Unmounted the USB drive.
- Plugged in the KVM (ATEN CS1784) which has a hub in it also.
Keyboard still plugged in direct.
- Got the error=-110 message.
- Keyboard still working.
- Did a register dump here (reg-a-13.*.txt).
- Tried to connect the USB drive to the ATEN hub,
system did not recognize it.
- Tried to connect the USB drive to the DLink hub again,
system did not recognize it here either.
- Did another register dump here (reg-b-13.*.txt).
- Tried to remove the ehci_hcd module but it wouldn't unload.
I also tried another ATEN CS1784 KVM that I had but it turns out to be broken.
Created attachment 18245 [details]
Fix unending polling in OHCI
First things first. The attached patch will fix the OHCI problem.
Now on to the real issue. Your results so far definitely indicate that the ATEN KVM is the source of the problem. Did the KVM have any keyboard or mouse attached into it when you first plugged it in during the previous test?
If you plug both the USB drive and the keyboard into the DLink hub, leaving the KVM unplugged, do they work?
If you then plug in the KVM (with nothing attached), do they continue to work?
If you then plug a mouse or keyboard into the KVM, does everything continue to work?
I will apply the patch and run some tests. Without a doubt the ATEN KVM triggers the problem, on this system, but on other systems it works without problems.

Not a question, just wanted to re-iterate that the KVM itself does not "always" cause a problem, just on this system.

Created attachment 18274 [details]
Dmesg and USB Registers
Have not run all the tests you asked for yet, but some feedback on the patch:
- Applied the patch.
- Booted with the KBD and Mouse plugged into the ATEN KVM.
- Turned on USB drive (connected direct).
- Got half way through typing the mount command and the keyboard went away.
- Powered off the USB drive.
- Did a dmesg dump and USB register dump (attached).
In addition to running the other requested tests with the unpatched kernel I will try the patched kernel without the ATEN KVM connected.
The ohci register dumps are now unnecessary. What we need to see are the files in the ehci/0000:00:13.5/ debug directory (not just the "registers" file, all of them).

Haven't been able to get to the additional tests yet, but I will soon. Wanted to add some additional info that I forgot to mention, although it's probably not technical enough to help debug anything: when I boot the Linux systems and hit escape to clear the OpenSuSE splash screen and look at the boot messages there's garbage amongst the messages, eg:

^[^[^[^[^[^[^[^[^[^[^[ Your boot message here...

When I hit a backspace the garbage stops.

Also, I recently added a fourth system to the KVM that has a GeForce 8200 chipset. With this system when I boot and go into the BIOS, quite often the BIOS starts flashing screens/popups rapidly as if it's receiving keyboard input. Sometimes by hitting backspace I can get it to stop, sometimes I can't. Once the system boots it's fine and it also is fine with external USB drives.

I've posted a support request with ATEN. Hopefully, I'll get back to testing it some more this week or next.

Created attachment 18766 [details]
dmesg and usb register dumps
Tested four configurations with kernel 2.6.27.5:
1 - With ATEN KVM. Fails as before: can't complete write to USB HD
and keyboard communication is lost. (dump withkvm)
2 - ATEN KVM unplugged. KBD and USB HD plugged into a DLink hub.
This *fails* also: can't complete write to USB HD and keyboard
communication is lost. (dump withdlink)
3 - ATEN KVM unplugged. KBD plugged into DLink hub, HD plugged into MBD.
This works but it looks like (if I'm understanding the log messages)
that it was still losing communications with the keyboard, although
the keyboard continued to work. (dump withdlink-kbd)
4 - ATEN KVM unplugged. KBD plugged into MBD, HD plugged into DLink hub.
This works: write completes, KBD continues to work. (dump withdlink-hd)
Note that the garbage output characters during boot seem to have gone away with the new kernel.
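For anyone trying to read dumps like these, the following is a minimal sketch (not a tool used by anyone in this thread) that counts resets and error codes per device in kernel messages of the kind quoted in this report. The function name `summarize` and the sample text are illustrative assumptions; the error names for -62 and -110 are the standard Linux errno names.

```python
import re
from collections import Counter

# Linux errno names for the two codes that appear in this report.
ERRNO_NAMES = {62: "ETIME", 110: "ETIMEDOUT"}

# Matches lines such as
# "[ 4479.602066] usb 1-1: reset high speed USB device using ehci_hcd and address 2"
# with or without the leading timestamp.
LINE_RE = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?usb (?P<dev>[\d.-]+): (?P<msg>.*)$")
ERROR_RE = re.compile(r"error -(\d+)")


def summarize(log_text):
    """Count resets and error codes per USB device path (hypothetical helper)."""
    resets = Counter()
    errors = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a "usb X-Y: ..." message
        dev, msg = m.group("dev"), m.group("msg")
        if msg.startswith("reset ") and "USB device" in msg:
            resets[dev] += 1
        err = ERROR_RE.search(msg)
        if err:
            code = int(err.group(1))
            errors[(dev, ERRNO_NAMES.get(code, str(code)))] += 1
    return resets, errors


# Sample lines copied from messages quoted elsewhere in this report.
SAMPLE = """\
[ 4479.602066] usb 1-1: reset high speed USB device using ehci_hcd and address 2
[ 4494.714952] usb 1-1: device descriptor read/64, error -110
[ 4509.931822] usb 1-1: device descriptor read/64, error -110
usb 6-5: reset high speed USB device using ehci_hcd and address 8
usb 6-5: device not accepting address 8, error -110
"""

if __name__ == "__main__":
    resets, errors = summarize(SAMPLE)
    for dev, n in sorted(resets.items()):
        print(f"usb {dev}: {n} reset(s)")
    for (dev, name), n in sorted(errors.items()):
        print(f"usb {dev}: {n} x {name}")
```

To use it on a real log, something like `summarize(open("dmesg.txt").read())` should work, where `dmesg.txt` is whatever file the kernel log was saved to.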
Your conclusions look right. And they would explain the disk problems: Many EHCI controllers have a bug which causes them to temporarily drop data on one port when a device on a different port disconnects. I'd say that whatever is causing the keyboard communications to drop out manages, every so often, to interfere with the disk traffic. The computer then tries to reset the disk drive, but many USB drives have a bug which prevents them from resetting correctly when they're in the middle of an I/O operation. As a result, the drive becomes unusable and the system puts it offline.
As for what causes the keyboard problems... I don't know. It could simply be a cabling issue. For example, I've got a USB-to-PS/2 keyboard+mouse converter. It doesn't work at all when plugged directly into my computer, but it works just fine when connected through a USB extension cable. Try and figure that one out!
You could try using a different keyboard. Maybe it wouldn't get all those communications dropouts when plugged into the DLink hub or the ATEN switch.
Hello! I have had the same issue since today's upgrade... What should I do to report it properly? The only connected device is a USB Logitech G5 mouse... It is a KT400 chipset. I have tried unloading and reloading the uhci_hcd, ehci_hcd and ohci_hcd modules, with no difference... The mouse first turns on and after a few seconds it turns off... When I run lsusb it tries to turn on again, but without success... My OS is openSUSE Factory (development version) and I have tried the 2.6.26 and 2.6.27 kernels...
dmesg:
device descriptor read/64, error -62
device descriptor read/64, error -62
new full speed USB device using ohci_hcd and address 3
device descriptor read/64, error -62
device descriptor read/64, error -62
new full speed USB device using ohci_hcd and address 4
device not accepting address 4, error -62
new full speed USB device using ohci_hcd and address 5
device not accepting address 5, error -62
lsusb (with connected mouse):
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Your mouse isn't working at all. This is a very different problem from the other problems in this bug report; you should not have posted it here. Either your mouse is broken or else your computer's OHCI controller is broken.
I have created a (maybe) similar bug in: http://bugzilla.kernel.org/show_bug.cgi?id=12347 but in my case it is only affecting 64bit kernels with the same symptom: ehci_hcd does not work and the kernel reverts to using ohci_hcd instead.
Hello. It seems that I've got the same problem in Debian Lenny, kernel 2.6.26-1-686, on a Dell Inspiron 1501 (RS480 + SB600 chipset). I found this problem with an external HDD box with chip ID 04fc:0c15 Sunplus Technology Co., Ltd (LC Power EH-25BSII). After this log appears in dmesg (it is repeated several times):
[ 4479.602066] usb 1-1: reset high speed USB device using ehci_hcd and address 2
[ 4494.714952] usb 1-1: device descriptor read/64, error -110
[ 4509.931822] usb 1-1: device descriptor read/64, error -110
the box completely freezes and I must forcibly unplug it. After that I must reload the ehci_hcd module. The ohci_hcd and uhci_hcd modules are not loaded at that moment. After that it works, but this is not good for the filesystem and the HDD inside, because it always hangs while it is reading or writing data. Connecting an additional power supply doesn't help.
My other USB->SATA converter, with a JMicron chip (idVendor=152d, idProduct=2338), doesn't cause this problem. I hope this is helpful.
Created attachment 20097 [details]
Make EHCI retry transaction errors 32 times
For those of you with problems caused by hubs or KVMs, you can try out this patch (which is based on 2.6.28, although it might also work with earlier kernels). I have no idea whether it will fix all of your problems, but there's a good chance it will fix some of them.
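The idea named in the patch title (retrying a failed transaction a bounded number of times before giving up) can be illustrated with a toy model. This is not the driver code and does not reflect how ehci-hcd is structured; `submit_with_retries` and `FlakyLink` are hypothetical names, and treating "32 times" as 32 retries after the first attempt is an assumption.

```python
MAX_RETRIES = 32  # the count named in the patch title


def submit_with_retries(attempt_transfer, max_retries=MAX_RETRIES):
    """Toy model: call attempt_transfer() until it succeeds or retries run out.

    Returns (ok, attempts_used). Assumes one initial attempt plus up to
    max_retries retries; the real patch may count differently.
    """
    attempts = 0
    for _ in range(max_retries + 1):
        attempts += 1
        if attempt_transfer():
            return True, attempts
    return False, attempts


class FlakyLink:
    """Hypothetical stand-in for a hub or KVM that drops the first N transactions."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success

    def __call__(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            return False  # transient transaction error
        return True


if __name__ == "__main__":
    # A link that glitches three times is rescued by retrying.
    print(submit_with_retries(FlakyLink(3)))
    # A link that never recovers still ends in an error after the retry budget.
    print(submit_with_retries(FlakyLink(10**6)))
```

The point of the bound is that short glitches are absorbed silently, while a device that has really stopped responding still produces an error instead of an endless loop.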
I applied your patch to the Debian lenny kernel 2.6.26-1-686 and turned on USB debugging. With the test command
for f in `seq 100`; do dd if=/dev/zero of=test bs=64k count=6000; done
it works OK, but if I run
rsync -av --delete --progress /mnt/local_disk/ /mnt/external_disk/
and stop this process with Ctrl+Z in gnome-terminal, leave it in this state for about 2 minutes, and then fg the process, the external HDD freezes after about 10 seconds and dmesg says:
[ 2320.736733] ehci_hcd 0000:00:13.5: port 1 high speed
[ 2320.736743] ehci_hcd 0000:00:13.5: GetStatus port 1 status 001005 POWER sig=se0 PE CONNECT
[ 2320.792730] usb 1-1: reset high speed USB device using ehci_hcd and address 2
[ 2325.793089] usb 1-1: usb-storage timed out on ep0in len=0/64
[ 2330.793390] usb 1-1: usb-storage timed out on ep0in len=0/64
[ 2335.793668] usb 1-1: usb-storage timed out on ep0in len=0/64
[ 2335.849613] ehci_hcd 0000:00:13.5: port 1 high speed
[ 2335.849622] ehci_hcd 0000:00:13.5: GetStatus port 1 status 001005 POWER sig=se0 PE CONNECT
[ 2335.905595] usb 1-1: device descriptor read/64, error -110
while I disconnect the external HDD. Now I don't need to reload the ehci-hcd module, because after reconnecting the HDD it works OK.
(In reply to comment #61) > I applied your patch on debian lenny kernel 2.6.26-1-686 and I turn on usb > debugging and with testing command > > for f in `seq 100`; do dd if=/dev/zero of=test bs=64k count=6000; done > > it works ok, but if I do > > rsync -av --delete --progress /mnt/local_disk/ /mnt/external_disk/ > > and if I stop this process with Ctrl+Z in gnome-terminal, take it in this > state > for about 2 minutes and then fg this process external hdd freezes after about > 10 seconds and dmesg says: > > [ 2320.736733] ehci_hcd 0000:00:13.5: port 1 high speed > [ 2320.736743] ehci_hcd 0000:00:13.5: GetStatus port 1 status 001005 POWER > sig=se0 PE CONNECT > [ 2320.792730] usb 1-1: reset high speed USB device using ehci_hcd and > address > 2 > [ 2325.793089] usb 1-1: usb-storage timed out on ep0in len=0/64 > [ 2330.793390] usb 1-1: usb-storage timed out on ep0in len=0/64 > [ 2335.793668] usb 1-1: usb-storage timed out on ep0in len=0/64 > [ 2335.849613] ehci_hcd 0000:00:13.5: port 1 high speed > [ 2335.849622] ehci_hcd 0000:00:13.5: GetStatus port 1 status 001005 POWER > sig=se0 PE CONNECT > [ 2335.905595] usb 1-1: device descriptor read/64, error -110 > > while I disconect external HDD. Now I don't need to reload ehci-hcd module, > because after reconnecting of HDD it works OK. > I also tested this procedure on Debian Lenny with vanilla kernel 2.6.28.3 without any patch and I transfered about 160GB to/from external HDD without any errors (there was no ehci-hcd error like on 2.6.26-1-686 kernel). So it seems to be OK with this the newest kernel version for me. I can see a similar problem with a Samsung DVD writer. There are many resets and inexplicable I/O errors in the logs. Re-connecting the drive does not help. Tested with *two* Samsung SE-S224Q drives, so drive malfunction is unlikely, and with two computers, a server and a laptop. The frequency of errors was approximately the same in all cases. 
Some more facts:
DVD+RW with UDF:
* Usually mounts and unmounts cleanly with no error messages.
* Device resets always appear in bursts of about ten.
* Writing a large amount of data often leads to media damage.
DVD-RAM with Reiser4:
* Numerous (> 30) resets occur when the FS is mounted.
* Unmounts are usually quick and clean.
* I/O errors are reported approximately once per 10 resets.
* Unlike DVD+RW, files remain readable even after I/O error reports...
Kernels: 2.6.28.7 and 2.6.28.9
I wish I knew of some way to improve the situation, but I don't. It might be a design flaw in all of Samsung's USB interface chips, possibly also present in devices from other manufacturers, especially if they use interface chips with the same firmware.
I have a similar issue with 2.6.29 that I didn't have before (in 2.6.28; however, that might be because the modules used to be loaded in an incorrect order by udev):
[250625.112100] usb 3-4: reset high speed USB device using ehci_hcd and address 2
[250640.646029] usb 3-4: device not accepting address 2, error -110
[250640.748038] usb 3-4: reset high speed USB device using ehci_hcd and address 2
[250656.288076] usb 3-4: device not accepting address 2, error -110
[250656.390089] usb 3-4: reset high speed USB device using ehci_hcd and address 2
[250666.812036] usb 3-4: device not accepting address 2, error -110
[250666.914045] usb 3-4: reset high speed USB device using ehci_hcd and address 2
[250677.336026] usb 3-4: device not accepting address 2, error -110
It is a single USB stick connected to the port. When I have free time I'll check a) verbose output b) 2.6.28.
I applied the EHCI patch to 2.6.28.9 and the DVD writer has worked *fine* since then. There were no device resets and no damaged media. (The OHCI patch was rejected with 2.6.28.9, so I assume it had been merged before that.)
Unfortunately, when I connected an OHCI device (a USB audio adapter) to the same (NEC) host controller as the (Samsung) DVD writer, hundreds of messages like this appeared in dmesg: sr 10:0:0:0: [sr1] Result: hostbyte=0x05 driverbyte=0x00 end_request: I/O error, dev sr1, sector 6363180 Buffer I/O error on device sr1, logical block 1590795 lost page write due to I/O error on sr1 usb 1-2: reset high speed USB device using ehci_hcd and address 4 usb 1-2: reset high speed USB device using ehci_hcd and address 4 usb 1-2: reset high speed USB device using ehci_hcd and address 4 usb 1-2: reset high speed USB device using ehci_hcd and address 4 When these things happen, both the DVD writer and the audio adapter freeze. Programs like alsamixer or dvd+rw-mediainfo remain blocked in an uninterruptible state. Unloading and re-loading both ohci_hcd and ehci_hcd does not help. The devices have to be reconnected physically. All the problems seem to start with this message: ohci_hcd 0000:01:05.0: bad entry 20d00 After the message, the sound card stops responding and the DVD writer has only a few seconds left before the never-ending series of device resets begins. BTW, lsusb freezes as well. (Haven't seen that before.) It surprises me that one misbehaving device (which is probably the Samsung drive) has such an impact on the whole host controller. I have been experiencing similar issues to what is described here, however it involves keyboard and mouse on ubuntu with their kernel 2.6.29. I understand this is not your vanilla kernel, and it's not involving the same type of devices as reported initially here. However please bear with me and read on. 
First I'd like to bring your attention to the fact there has been many similar bug reports about usb resets on various distro, a few examples are: https://bugs.launchpad.net/bugs/124406 : the resets affects keyboard and mouse https://bugs.launchpad.net/bugs/91230 : idem, just older http://www.mail-archive.com/linux-usb-users@lists.sourceforge.net/msg18199.html http://taint.org/2006/12/13/191554a.html http://bugs.gentoo.org/177266 https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.15/+bug/54419 ... There's plenty more if you search the net with these keywords: reset speed USB device using ehci_hcd. I believe it's a long standing bug which is elusive and difficult to track. What I observe across a lot of reports is often the following: - It gets dismissed by being hardware related. Certainly some of these reports are due to bad hardware. However there are many cases where the same config works perfectly well with other non-linux OS. - It's often resolved by trying to connect the affected devices differently. In my case it seems I had to put my external keyboard and mouse to a separate hub. - Many people are getting frustrated with it and either give up or play around like me. I gave up on migrating to linux a couple years ago because of this bug. Being so elusive and widespread, many of these bug reports end up with either no fix, or a workaround like mentioned before that does not really address the issue. And since nobody can do much about it, these reports become inactive/closed. One of the most active thread on this bug (first link I gave above) is probably going to end up being closed because it has been judged too "messy" (I agree but that's no reason to close it, it's just a messy bug). One might argue there are more than one bug here, because different devices are affected. My guess however is that it's one bug affecting various devices (hd, kb, mice, ...). 
In fact, having played around a bit, I would advance the theory that it is related to the presence of devices with different speed on the same bus/hub (lo+hi and lo+full for sure, hi+lo most likely). If you look at this post, you'll see a report of this bug on a usb 1.1 machine: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/124406/comments/254 So it could be (pure speculation) that this bug has been introduced since the addition of ehci, and is affecting usb1.1 as a result of the changes made to accomodate ehci. According to http://www.mjmwired.net/kernel/Documentation/usb/ehci.txt : Note that USB 2.0 support involves more than just EHCI. It requires other changes to the Linux-USB core APIs, including the hub driver, but those changes haven't needed to really change the basic "usbcore" APIs exposed to USB device drivers. Anyway, my point really is this: this bug is probably more widespread than one can imagine, it affects many distro, it's been reported many times over in many places with different facets. This bug report #11159 is probably the only open and recent report of this bug on the kernel list. So please, don't dismiss it because it's too elusive or leads to a messy discussion thread! I consider this bug to a linux killer, and there is a real need for the kernel team to be involved. It's quite possible that this problem is caused by a bug in the Clear-TT-Buffer handling in ehci-hcd. This was brought to my attention just recently, and I wrote a pair of patches to fix it. They have not yet been submitted, but you (or anyone else reading this bug report) can test them. I'll attach them to the bug report. They are against 2.6.30-rc6 but they probably will apply to earlier kernel versions too. You'll need to use both patches, in order. Created attachment 21743 [details]
Add Clear-TT-Buffer callback
Patch 1: modify the Clear-TT-Buffer interface by adding a callback pointer
Created attachment 21744 [details]
Make ehci-hcd wait for Clear-TT-Buffer to complete
Patch 2: Make ehci-hcd wait for Clear-TT-Buffer requests to complete
I have the same issue. I'll try the patches ASAP. Thanks.
I've tried them. After 300 MB of data movement I saw a "reset high speed" :( but it occurred only once... so I think I can say the patches fixed something but not everything... I tried on 2.6.30.2. If I can help, let me know :)
Can you reproduce the reset? If you can't, then there's nothing to worry about. If you can, attach a usbmon log showing what happens starting shortly before the reset.
OK, I'll try it. Thanks for your work :)
I created this file with usbmon while moving the directory /usr/ onto my SSD. After 40 MB it gave me a reset; I pressed CTRL-C as soon as I saw the reset. I hope it can help you. Here is the log: http://www.cagnulein.com/tmp/usbmon_ehci_reset.txt
Unfortunately the usbmon log doesn't show why the reset occurred. Apparently the computer was writing data to the disk when the disk suddenly stopped accepting data. After 30 seconds the computer gave up and reset the disk drive, after which it started working normally again.
Could it be a hardware problem? How can I verify this?
Sure it could. Verify it by trying out different hardware. Also try a different cable. And last but not least: try another non-Linux OS (but don't ask me for suggestions ;)
Most of these weird problems seem to occur if (and only if) I use a PCI USB controller plugged into a PCI-X slot of my IBM xSeries server. Some facts:
* Tried two USB controllers, NEC and VIA.
* NEC works fine when only one mass storage device is plugged in. Resets occur otherwise.
* NEC doesn't like webcams. The image is either choppy or just black; it varies from (kernel) version to version.
* VIA doesn't support mass storage at all. Numerous I/O errors pop up immediately when a device is connected.
* VIA supports webcams, but only at UHCI speeds. When an EHCI-only webcam is plugged in, it is detected, but doesn't work.
But here comes the most important fact of all: There are *no* such problems on other Linux machines I run.
For example, USB now works fine on my laptop. I can connect my DVB-T receiver, pen drive, bluetooth dongle, webcam and other crazy devices at once and nothing bad happens. Perhaps there's something wrong with PCI USB controllers in PCI-X slots. Unfortunately, the server is in use, so I can't bring it down and test USB thoroughly. i've try the usb on another pc and works well. so i've installed windows xp on the pc that has the problem. TADAM: the problem still remains! So it's an hardware issue :( Thanks for your time Greg, I don't think this bug report is helping anybody any more. We haven't heard anything from the OP in almost a year. You might as well close it out. Ok, closing out, thanks. I find the decision to close this bug quick and unfounded. Who has verified that the fix committed earlier has resolved this issue? It's not because the last person talking here said it was hardware-related that there was no bug. Ideally we should find a machine/configuration where this bug has happened, test without the fix, test with the fix, and then conclude. Please read my comment #67 above. This bug has been occuring regularly and people report it mostly against distros bug tracking. Since distro can't do much they often close the report because they find it hard to track and do anything about it. This practice of closing bug because they're too elusive to track isn't right, even less in open source (what's the problem with how long a bug is open?) (In reply to comment #84) > I find the decision to close this bug quick and unfounded. Who has verified > that the fix committed earlier has resolved this issue? I did not see any reports of this problem with the latest Linux kernel. If you do have this problem, with the 2.6.30 or later kernel, please post to the linux-usb@vger.kernel.org mailing list the kernel log messages and we will be glad to work with you there. (In reply to comment #84) > I find the decision to close this bug quick and unfounded. 
The bug report was opened more than a year ago. How can you call that "quick"? Evidently you did not receive the message I sent to you in response to comment #67. The gist of it was that this is not a single bug. Most of the bug reports you cited were either non-reproducible or caused by hardware problems. In others the reporters have stopped responding, making it impossible to find out what was wrong. So far there has been extremely little indication that errors in the software are responsible for these resets. Even if it is true that the software is partly to blame, we have no way to fix it unless we can find out exactly where the errors are. Errors get fixed when they are tracked down, and your rant doesn't contribute towards tracking down any of these problems. (In reply to comment #86) > (In reply to comment #84) > > I find the decision to close this bug quick and unfounded. > The bug report was opened more than a year ago. How can you call that > "quick"? I say quick as in quick-thinking. > Evidently you did not receive the message I sent to you in response to > comment #67. The gist of it was that this is not a single bug. I did receive it, but I was just starting out with linux and couldn't really deal with kernel hacking, I needed to get going with my work. > Most of the bug reports you cited were either non-reproducible or caused by > hardware problems. > In others the reporters have stopped responding, making it impossible to find > out what was wrong. That's what I call hard to track and elusive. Does that mean there's no bug? No. If there's a bug, even if hard to track, I think there should be an entry for it. All this talk about keeping a log only for reproducible bugs does not make sense to me, and only reflect a dislike for unreproducible bug. > So far there has been extremely little indication that errors in the software > are responsible for these resets. How do you explain other OS had no issue in the same configuration? 
> Even if it is true that the software is > partly to blame, we have no way to fix it unless we can find out exactly > where > the errors are. Errors get fixed when they are tracked down, and your rant > doesn't contribute towards tracking down any of these problems. Closing this bug doesn't contribute in nailing it down. That's what happened for the last two years: it gets reported and then closed, it reappers somewhere else but the OP quickly disappears because this bug is a showstopper (you can't work with it). Anyway, there's no need to consider this a rant and get personal about it. I appreciate a lot all the efforts from the so many people in getting GNU/Linux to where it is now. Also, bear in mind that in the past 4 years I really wanted to get away from M$. This bug kept me from making the move. Only recently did I try again, got this bug again, but did not want to get discouraged because of it. So I'm sorry for making so much noise, I'm tired of this bug as much as you are, but I don't want to forget about it. Is the fix you talked about in 2.6.30? If so then it should be in my arch release kernel. I'll rewire my devices (in order to reproduce the bug) and report back in about a week when I have more time to play. > That's what I call hard to track and elusive. Does that mean there's no bug? > No. If there's a bug, even if hard to track, I think there should be an entry > for it. All this talk about keeping a log only for reproducible bugs does not > make sense to me, and only reflect a dislike for unreproducible bug. If a bug isn't reproducible and can't be tracked then there is no hope of fixing it. In such cases, whether the bug report remains open or not doesn't really matter much. > How do you explain other OS had no issue in the same configuration? I can't explain it because I haven't been able to obtain enough information. 
Mostly this is because the reporters stop responding, but occasionally it's because special debugging hardware is needed and the reporter doesn't have it. One possible explanation might be that the drivers for the other OS were written with special knowledge of some bugs in the hardware. Manufacturers often don't share such knowledge with open-source programmers. But I have no way to know if this is the case here.
> Closing this bug doesn't contribute in nailing it down. That's what happened for the last two years: it gets reported and then closed, it reappers somewhere else but the OP quickly disappears because this bug is a showstopper (you can't work with it).
You are making a very common mistake: confusing the bug with the bug report. I keep telling you that what we see reported here, in this one report, is really many different bugs. Several of them have already been tracked down and solved. You don't seem to understand this.
> Only recently did I try again, got this bug again, but did not want to get discouraged because of it. So I'm sorry for making so much noise, I'm tired of this bug as much as you are, but I don't want to forget about it.
If you would like to contribute, please open a new bug report. This is because your problem is almost certainly caused by a different bug from the one that originally led to this report.
> Is the fix you talked about in 2.6.30?
Only partially. The most recent code is available in 2.6.31-rc5 together with this patch: <http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/gregkh-all-2.6.31-rc5.patch>.
I will wait for 2.6.31 to reach the arch release then. I'll test before the new kernel, and then after. If the bug is still there, I'd be happy to help with the tracking, but I won't have time to deal with kernel hacking. I'd be happy to use whatever debug kernel will work on my arch system if someone cares to provide it.
I'll open a new bug report if needed and back link to this one because even if there are other bugs that got mingled, I find that most of the discussion here relates to the resets I experience with my keyboard and mouse. Okay, good. Mostly what I object to is people jumping on board someone else's bug report, with their own bugs that are quite different from the one originally reported, merely because one of the symptoms is the same. I came here because my mouse and keyboard become unusable (jumpy mouse, repeating keys) in various settings (tried 2 different hubs and different cables) and each time dmesg shows that keyboard and/or mouse get reset. Earlier in the thread keyboard and mouse were mentioned as well as resets so I thought it was a good place here. Initially I reported my issue against ubuntu bug 124406 (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/124406?comments=all). However this bug got closed because it was judged too messy. I felt this bug did not get the proper attention it deserved, so I came here because I thought the kernel team would be more able to deal with it. I entered comment #67 and also created another ubuntu bug entry (see https://bugs.launchpad.net/ubuntu/+bug/383722). There you'll find more info on what I experience along with relevant dmesg logs. I understand your point except that in practice when a bug is difficult to track and reproduce it gets reported in many different ways, and there's not much we can do about that. I'm sorry to be so forthcoming but I'm tired of always seeing this bug getting dismissed in similar ways: can't reproduce, OP gone, confusion with other bugs... If we keep sticking to the rules of clean and reproducible bugs then we'll never become good at dealing with concurrency issues and parallel computing! BTW: I can reproduce the bug and so does Rolf Leggewie (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/124406/comments/254). All right. 
We'll wait until you can install the latest kernel with all the existing patches and are able to do some testing. You will need to enable CONFIG_USB_DEBUG, CONFIG_USB_MON, and CONFIG_DEBUG_FS.
I think I've spotted the particular USB configuration in which the problem is present. It seems to occur when copying a large amount of data (say 1.9 GiB) from USB using an SD-card reader. I haven't checked this time, but it seems that it does not occur on a dd of the partition. Reproduced on 2.6.30.4 - I'll check the -rc soon.
I have been hit by this problem too. When I installed a clean Linux Mint 17.0 system, everything was OK. After some updates by 'Update manager' to 17.1, the problem appeared. The hardware has not been changed. This is my post, with messages from the logs: http://askubuntu.com/questions/666601/system-is-loading-about-10-min-what-is-comming-on
This problem does not occur when rebooting, only on a cold start.
Linux keshome 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Updating to 3.16/4.1.3/4.2 has no effect:
Linux keshome 4.2.0-040200-generic #201508301530 SMP Sun Aug 30 19:31:40 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
KES777, do you know of a kernel version that worked okay? And can you attach the log messages to this bug report? The Ubuntu question has been removed.
Created attachment 189931 [details]
log file with problems
The log file where you can see the problem:
[ 3222.072011] usb 1-2: new high-speed USB device number 59 using ehci-pci
When this problem occurs:
the FF tabs with Adobe Flash Player do not work; the whole FF is halted.
Also, I cannot run a VirtualBox guest OS.
Also, the desktop theme is changed to some other theme.
A cold start always has this problem.
Sometimes when rebooting the problem does not occur.
It sounds like your computer has lots of problems, not just with USB. Have you checked to see if a BIOS update is available?
Booting from [linuxmint-17.1-cinnamon-64bit.iso](http://www.linuxmint.com/edition.php?id=172) has none of those problems.
If you know which kernel version works and which one doesn't work, you can use git bisect to find the cause of the problem.
I do not know how to use it, or how to compile and install my own kernel.
Created attachment 190741 [details]
log files of different OSes caused and not caused by this problem
The OSes:
1. Windows 7
2. Windows XP
3. FreeBSD 9.3
are not affected by the "new high-speed USB device number" problem. They work fine.
FreeBSD 9.3 just reports some errors for umass:
Oct 18 00:30:23 kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
Oct 18 00:30:23 kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Oct 18 00:30:23 kernel: (probe0:umass-sim0:0:0:0): Retrying command
Oct 18 00:30:23 kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00
Oct 18 00:30:23 kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Oct 18 00:30:23 kernel: (probe0:umass-sim0:0:0:0): Error 5, Retries exhausted
4. cinnamon mint 15 (3.8.0-19-generic) works fine, but the kern.log file is spammed with these messages:
Oct 21 13:54:17 kes-desktop kernel: [ 4260.504011] usb 1-2: new high-speed USB device number 34 using ehci-pci
Oct 21 13:54:17 kes-desktop kernel: [ 4260.572330] hub 1-0:1.0: unable to enumerate USB device on port 2
Oct 21 13:54:17 kes-desktop kernel: [ 4260.788218] hub 1-0:1.0: unable to enumerate USB device on port 2
5. ubuntu 12.04 also is not affected.
These have problems:
1. ubuntu 14.04. It loads too slowly, and after loading I cannot visit sites that use the adobe_flash plugin. The browser is halted (see print_screen: ubuntu-14.04-ishalted.png).
2. cinnamon-mint-16
3. Fresh installation of 17.1
I also wanted to check how mint 14 and 13 work, but I cannot install them due to an installer error.
For detailed log messages for those systems, see the attachment.
It seems some bug was introduced in the newer kernels. If someone gives me detailed instructions on how to use bisect and how to compile/install kernels from source, I can complete that.
Thank you.
There are lots of tutorials explaining how to use git bisect and how to build/install kernels. Google is your friend.
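For readers unfamiliar with the idea behind git bisect, here is a minimal sketch of the underlying search, written as plain Python rather than as git commands. It assumes an ordered list of versions whose first entry is known good and whose last entry is known bad. The version list and the "first bad" version below are invented for illustration and are not findings from this report; the real tool works on commits and is driven with `git bisect start`, `git bisect good` and `git bisect bad`, with a kernel build and boot test between steps.

```python
def bisect_first_bad(versions, is_bad):
    """Return (first_bad_version, number_of_tests).

    Assumes versions[0] is good, versions[-1] is bad, and that every
    version after the first bad one is also bad.
    """
    good, bad = 0, len(versions) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1  # in practice: build, install and boot this version
        if is_bad(versions[mid]):
            bad = mid
        else:
            good = mid
    return versions[bad], tested


# Hypothetical data: a range of kernel series and an invented culprit.
VERSIONS = ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13"]
HYPOTHETICAL_FIRST_BAD = "3.12"


def is_bad(version):
    # Stand-in for "boot this kernel and check whether the USB problem appears".
    return VERSIONS.index(version) >= VERSIONS.index(HYPOTHETICAL_FIRST_BAD)


if __name__ == "__main__":
    print(bisect_first_bad(VERSIONS, is_bad))
```

Each test halves the remaining range, which is why even a range of thousands of commits needs only a dozen or so build-and-boot cycles.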