Bug 2078

Summary: USB mass storage driver hangs
Product: Drivers
Reporter: Brice Arnould (98111)
Component: USB
Assignee: Matthew Dharm (mdharm-usb)
Status: REJECTED INSUFFICIENT_DATA
Severity: high
CC: dbrownell, greg
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.2 (vanilla)
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  The dmesg before i connect the peripheral
  The dmesg close after i connect it
  The dmesg close after the cat
  The dmesg after cat died
  The dmesg output after reiserfs crash
  Reduce transfer rate

Description Brice Arnould 2004-02-11 16:23:12 UTC
Distribution: Gentoo Linux 1.4
 
Hardware Environment: 
VIA KT400 motherboard (Soltek 75-frv), AMD Athlon XP, "noname" IDE-to-USB 2.0
case, Hitachi 80 GB hard drive.
 
Software Environment: 
GNU, cat from "GNU coreutils" 
 
Problem Description: 
First, please excuse me for bothering you if the problem is on my side, and
for my bad English, and for being a newbie. I promise that I have googled,
RTFM, and everything.
The problem is that a new piece of hardware, certified by the vendor to be
UMS-compliant, is accessible for a short time and then hangs, although it
works "well" (as well as the rest) on a Win2k PC.
My USB mouse works when I connect it to the same port.
The strangest thing is that one time I was even able to mount a partition and
copy 20 GB to it. But then I ran a "cat..." and the disk hung. It also hangs
when reiserfs tries to replay its log. Perhaps it happens when I access
certain sectors of the disk, or too many sectors at the same time?
I have tried with and without "new taskfile code", "preemptible kernel" and
"usbfs" (among others). My brother's USB 1.1 key works on the same
connector.
 
Steps to reproduce: 
1- Connect the peripheral.
2- Do a cat /dev/sda (or a hdparm -t /dev/sda or whatever).
3- Wait. After 3 lines, cat hangs and the kernel seems to repeat the same
thing in a loop.
4- If I wait a very long time, cat dies with "cat: /dev/sda: Erreur
d'entr
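
The reproduction steps above could be approximated with a small script that
reads the device sequentially and records how far it got before an I/O error,
which may help tell whether the failure is tied to particular offsets. This is
only a sketch: the helper name `read_until_error`, the chunk size and the
optional byte limit are illustrative assumptions, not something taken from
this report, and a read that blocks inside the kernel will block the script
as well.

```python
import os


def read_until_error(path, chunk_size=64 * 1024, max_bytes=None):
    """Read `path` sequentially; return (bytes_read, error_or_None).

    Hypothetical helper: reads in fixed-size chunks and stops at end of
    file, at `max_bytes`, or at the first OSError (for example EIO).
    """
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while max_bytes is None or bytes_read < max_bytes:
            want = chunk_size
            if max_bytes is not None:
                # Never ask for more than the remaining byte budget.
                want = min(want, max_bytes - bytes_read)
            try:
                data = os.read(fd, want)
            except OSError as exc:
                # Report the offset reached together with the error.
                return bytes_read, exc
            if not data:
                break  # end of file or end of device
            bytes_read += len(data)
    finally:
        os.close(fd)
    return bytes_read, None
```

Calling `read_until_error("/dev/sda")` as root and comparing the returned
offset across several runs would show whether the read always stops at the
same place.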
Comment 1 Brice Arnould 2004-02-11 16:27:27 UTC
Created attachment 2084 [details]
The dmesg before i connect the peripheral
Comment 2 Brice Arnould 2004-02-11 16:28:12 UTC
Created attachment 2085 [details]
The dmesg close after i connect it
Comment 3 Brice Arnould 2004-02-11 16:30:20 UTC
Created attachment 2086 [details]
The dmesg close after the cat
Comment 4 Brice Arnould 2004-02-11 16:31:14 UTC
Created attachment 2087 [details]
The dmesg after cat died
Comment 5 Brice Arnould 2004-02-12 02:49:11 UTC
If I connect it during the boot phase of the kernel (before EHCI
initialisation?), it works, but at USB 1.1 speed. So I thought that this is
a bug in the EHCI driver and moved it to this category; please correct me if
that is a mistake.
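
One way to confirm the speed a device was enumerated at, on kernels that
expose a `speed` attribute for each device under /sys/bus/usb/devices, is to
read that attribute. The sketch below assumes that sysfs layout and the usual
values in Mbit/s ("1.5", "12", "480"); the function names are illustrative
and not part of any kernel tool.

```python
import glob
import os

# Mbit/s values as reported by the sysfs `speed` attribute (assumed layout).
SPEED_LABELS = {
    "1.5": "low speed (USB 1.x)",
    "12": "full speed (USB 1.1)",
    "480": "high speed (USB 2.0)",
}


def describe_speed(raw):
    """Map the raw contents of a `speed` file to a readable label."""
    value = raw.strip()
    return SPEED_LABELS.get(value, f"unknown ({value} Mbit/s)")


def list_usb_speeds(root="/sys/bus/usb/devices"):
    """Return {device_name: label} for every device that has a speed file."""
    result = {}
    for speed_file in glob.glob(os.path.join(root, "*", "speed")):
        name = os.path.basename(os.path.dirname(speed_file))
        with open(speed_file) as fh:
            result[name] = describe_speed(fh.read())
    return result
```

If the case shows "12" when plugged in during boot and "480" when plugged in
later, that would match the behaviour described in this comment.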
Comment 6 Brice Arnould 2004-02-12 09:13:48 UTC
Created attachment 2096 [details]
The dmesg output after reiserfs crash

I mentioned a crash while mounting a reiserfs partition; this time I have
logged it. The kernel says that there is a bug at "fs/reiserfs/prints.c:339",
but I am not sure about it because the disk could simply have become
unreachable suddenly.
Also, as these reiserfs crashes block all sync and mount/umount operations and
so can lead to loss of data, I have changed the severity to High (according to
the Bugzilla help). Please excuse me if this is exaggerated.
Comment 7 Brice Arnould 2004-02-12 13:29:54 UTC
If it helps, I have found the datasheet of the chip included in this case
(PDF).
http://www.genesyslogic.com/eimages_product/GL811EDatasheet_111.pdf 
Comment 8 Matthew Dharm 2004-02-19 00:10:32 UTC
First, this is a Genesys Logic device.  They have compatibility problems with 
Linux which appear to be chip bugs (in some chips).  However, this could be 
something else....

What makes this case interesting is that the reset-recovery seems to work 
here, at least some.  We can get back into a state where we can exchange 
commands with the device (TEST_UNIT_READY, etc.) repeatedly, but we can't seem 
to get that big data exchange to work.  With most complaints about GL devices, 
we can never get back into a good state.

David -- perhaps this would be a good candidate for your EHCI debugging 
patches?

If those don't show anything, then I'm going to write this off as another bad 
Genesys Logic unit.

Also, the crashes coming from the ReiserFS code should be looked at by the 
ReiserFS people.  It looks like they can't handle journal recovery on flaky 
disks.
Comment 9 Matthew Dharm 2004-03-01 00:21:17 UTC
Created attachment 2258 [details]
Reduce transfer rate

Try this patch, which reduces the transfer throughput.
Comment 10 Matthew Dharm 2004-04-03 02:57:50 UTC
An entire month has passed with no new data.  Closing as "Insufficient Data"