Bug 5173 - USB subsystems hangs when stressing the bulk-only storage device
USB subsystems hangs when stressing the bulk-only storage device
Status: CLOSED CODE_FIX
Product: Drivers
Classification: Unclassified
Component: USB
i386 Linux
: P2 normal
Assigned To: Alan Stern
:
Depends on:
Blocks: USB
  Show dependency treegraph
 
Reported: 2005-09-02 02:21 UTC by Feng-sung Yang
Modified: 2005-10-12 07:34 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.13
Tree: Mainline
Regression: ---


Attachments
1 fail, 3 okay, 3 pending, 1 not found (20.75 KB, application/octet-stream)
2005-09-04 23:38 UTC, Feng-sung Yang
Details

Description Feng-sung Yang 2005-09-02 02:21:21 UTC
Most recent kernel where this bug did not occur: None
Distribution: Mandriva Linux release 2006.0 (Cooker) for i586
Hardware Environment: i686 Intel(R) Pentium(R) 4 CPU 3.00GHz
Software Environment:
Kernel tested: 2.6.13-rc7, 2.6.13
Problem Description:

I write a test program to tests flash cards through card readers. All the flash
cards are identical. The card readers are the same, too. This program relies on
usb-storage to build the block devices for the flash card/card readers. Then
read and write tests with O_SYNC open option are performed.

If only one pair is tested, it seems okay. However, typically several pairs are
tested simultaneously. Inside my program, each transfer unit of read() or
write() is 64K bytes. The data is generated randomly. And the address is also
randomly generated but guaranteed to be valid.

After some long tests (about several hundreds MB for each pair), one pair may
hang. That is, the I/O operation (read() or write()) never returns. Command
lsusb will never return, either. Even if I unplugged all the card readers, the
USB subsystem still hangs. And the block devices /dev/sdX are still present. By
the ps command, it can be observed that all storage and scsi drivers are still
present and one of them is blocked. The following is the ps result of tests for
4 pairs:

[root@fsyang syslog]# ps -eo
pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep scsi
 8635  8635 TS       -   0  19   0  0.0 S    text.lock.scsi_error          
scsi_eh_35
 8649  8649 TS       -   0  19   0  0.0 S    text.lock.scsi_error          
scsi_eh_36
 8685  8685 TS       -   0  24   0  0.0 S    text.lock.scsi_error          
scsi_eh_37
 8713  8713 TS       -   0  24   0  0.0 D    command_abort                 
scsi_eh_38
[root@fsyang syslog]# ps -eo
pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep usb
 8636  8636 TS       -   0  24   0  0.0 S    text.lock.usb                 
usb-storage
 8650  8650 TS       -   0  24   0  0.0 S    text.lock.usb                 
usb-storage
 8687  8687 TS       -   0  24   0  0.0 S    text.lock.usb                 
usb-storage
 9202  9202 TS       -   0  22   0  0.0 D+   text.lock.usb                  lsusb

 




The following is the ver_linux result:

Gnu C                  4.0.1
Gnu make               3.80
binutils               2.16.91.0.2
util-linux             2.12q
mount                  2.12q
module-init-tools      3.2-pre8
e2fsprogs              1.38
reiserfsprogs          line
reiser4progs           line
nfs-utils              1.0.7
Linux C Library        2.3.5
Dynamic linker (ldd)   2.3.5
Procps                 3.2.5
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               5.2.1
udev                   068
Modules Loaded         sd_mod usb_storage nfsd exportfs ipv6 nfs lockd nfs_acl
sunrpc snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec snd_pcm snd_timer
snd soundcore snd_page_alloc sk98lin af_packet ide_cd loop vfat fat nls_cp950
ntfs intel_agp agpgart hw_random ata_piix libata scsi_mod ehci_hcd uhci_hcd
usbcore ext3 jbd





The following is the last few messages of /var/log/kernel/info:
(The devices are 5-1.1, 5-1.2, 5-1.3, and 5-1.4. Only 5-1.4 failed to complete
the testing)

Sep  2 16:17:10 fsyang kernel: usb-storage: Command WRITE_10 (10 bytes)
Sep  2 16:17:10 fsyang kernel: usb-storage:  2a 00 00 0f 11 00 00 00 80 00
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x2150 L
65536 F 0 Trg 0 LUN 0 CL 10
Sep  2 16:17:10 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  2 16:17:10 fsyang kernel: usb-storage: Status code 0; transferred 31/31
Sep  2 16:17:10 fsyang kernel: usb-storage: -- transfer complete
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk command transfer result=0
Sep  2 16:17:10 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer
65536 bytes, 16 entries
Sep  2 16:17:10 fsyang kernel: usb-storage: Status code 0; transferred 65536/65536
Sep  2 16:17:10 fsyang kernel: usb-storage: -- transfer complete
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk data transfer result 0x0
Sep  2 16:17:10 fsyang kernel: usb-storage: Attempting to get CSW...
Sep  2 16:17:10 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 13
bytes
Sep  2 16:17:10 fsyang kernel: usb-storage: Status code 0; transferred 13/13
Sep  2 16:17:10 fsyang kernel: usb-storage: -- transfer complete
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk status result = 0
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk Status S 0x53425355 T 0x2150 R
0 Stat 0x0
Sep  2 16:17:10 fsyang kernel: usb-storage: scsi cmd done, result=0x0
Sep  2 16:17:10 fsyang kernel: usb-storage: *** thread sleeping.
Sep  2 16:17:10 fsyang kernel: usb-storage: queuecommand called
Sep  2 16:17:10 fsyang kernel: usb-storage: *** thread awakened.
Sep  2 16:17:10 fsyang kernel: usb-storage: Command WRITE_10 (10 bytes)
Sep  2 16:17:10 fsyang kernel: usb-storage:  2a 00 00 0f 11 80 00 00 80 00
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x2151 L
65536 F 0 Trg 0 LUN 0 CL 10
Sep  2 16:17:10 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  2 16:17:10 fsyang kernel: usb-storage: Status code 0; transferred 31/31
Sep  2 16:17:10 fsyang kernel: usb-storage: -- transfer complete
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk command transfer result=0
Sep  2 16:17:10 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer
65536 bytes, 15 entries
Sep  2 16:17:10 fsyang kernel: usb-storage: Status code -71; transferred 56832/65536
Sep  2 16:17:10 fsyang kernel: usb-storage: -- unknown error
Sep  2 16:17:10 fsyang kernel: usb-storage: Bulk data transfer result 0x4
Sep  2 16:17:10 fsyang kernel: usb-storage: -- transport indicates error, resetting
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_reset_device returns -19
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_Bulk_reset called
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_control_msg: rq=ff
rqtype=21 value=0000 index=00 len=0
Sep  2 16:17:13 fsyang kernel: usb-storage: Soft reset failed: -19
Sep  2 16:17:13 fsyang kernel: usb-storage: scsi cmd done, result=0x70000
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread sleeping.
Sep  2 16:17:13 fsyang kernel: usb-storage: queuecommand called
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread awakened.
Sep  2 16:17:13 fsyang kernel: usb-storage: Command WRITE_10 (10 bytes)
Sep  2 16:17:13 fsyang kernel: usb-storage:  2a 00 00 0f 11 80 00 00 80 00
Sep  2 16:17:13 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x2152 L
65536 F 0 Trg 0 LUN 0 CL 10
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  2 16:17:13 fsyang kernel: usb-storage: Status code -19; transferred 0/31
Sep  2 16:17:13 fsyang kernel: usb-storage: -- unknown error
Sep  2 16:17:13 fsyang kernel: usb-storage: Bulk command transfer result=4
Sep  2 16:17:13 fsyang kernel: usb-storage: -- transport indicates error, resetting
Sep  2 16:17:13 fsyang kernel: usb-storage: unable to lock device for reset: -19
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_Bulk_reset called
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_control_msg: rq=ff
rqtype=21 value=0000 index=00 len=0
Sep  2 16:17:13 fsyang kernel: usb-storage: Soft reset failed: -19
Sep  2 16:17:13 fsyang kernel: usb-storage: scsi cmd done, result=0x70000
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread sleeping.
Sep  2 16:17:13 fsyang kernel: usb-storage: queuecommand called
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread awakened.
Sep  2 16:17:13 fsyang kernel: usb-storage: Command WRITE_10 (10 bytes)
Sep  2 16:17:13 fsyang kernel: usb-storage:  2a 00 00 0f 11 80 00 00 80 00
Sep  2 16:17:13 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x2153 L
65536 F 0 Trg 0 LUN 0 CL 10
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  2 16:17:13 fsyang kernel: usb-storage: Status code -19; transferred 0/31
Sep  2 16:17:13 fsyang kernel: usb-storage: -- unknown error
Sep  2 16:17:13 fsyang kernel: usb-storage: Bulk command transfer result=4
Sep  2 16:17:13 fsyang kernel: usb-storage: -- transport indicates error, resetting
Sep  2 16:17:13 fsyang kernel: usb-storage: unable to lock device for reset: -19
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_Bulk_reset called
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_control_msg: rq=ff
rqtype=21 value=0000 index=00 len=0
Sep  2 16:17:13 fsyang kernel: usb-storage: Soft reset failed: -19
Sep  2 16:17:13 fsyang kernel: usb-storage: scsi cmd done, result=0x70000
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread sleeping.
Sep  2 16:17:13 fsyang kernel: usb-storage: queuecommand called
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread awakened.
Sep  2 16:17:13 fsyang kernel: usb-storage: Command WRITE_10 (10 bytes)
Sep  2 16:17:13 fsyang kernel: usb-storage:  2a 00 00 0f 11 80 00 00 80 00
Sep  2 16:17:13 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x2154 L
65536 F 0 Trg 0 LUN 0 CL 10
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  2 16:17:13 fsyang kernel: usb-storage: Status code -19; transferred 0/31
Sep  2 16:17:13 fsyang kernel: usb-storage: -- unknown error
Sep  2 16:17:13 fsyang kernel: usb-storage: Bulk command transfer result=4
Sep  2 16:17:13 fsyang kernel: usb-storage: -- transport indicates error, resetting
Sep  2 16:17:13 fsyang kernel: usb-storage: unable to lock device for reset: -19
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_Bulk_reset called
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_control_msg: rq=ff
rqtype=21 value=0000 index=00 len=0
Sep  2 16:17:13 fsyang kernel: usb-storage: Soft reset failed: -19
Sep  2 16:17:13 fsyang kernel: usb-storage: scsi cmd done, result=0x70000
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread sleeping.
Sep  2 16:17:13 fsyang kernel: usb-storage: queuecommand called
Sep  2 16:17:13 fsyang kernel: usb 5-1.4: USB disconnect, address 50
Sep  2 16:17:13 fsyang kernel: usb-storage: storage_disconnect() called
Sep  2 16:17:13 fsyang kernel: usb-storage: usb_stor_stop_transport called
Sep  2 16:17:13 fsyang kernel: usb-storage: *** thread awakened.
Sep  2 16:17:13 fsyang kernel: usb-storage: -- exiting
Sep  2 16:17:18 fsyang kernel: usb-storage: command_abort called
Sep  2 16:17:18 fsyang kernel: usb-storage: usb_stor_stop_transport called


This log isn't affected by unplugging all the USB card readers. So I think the
USB subsystem is dead...

By the way, the possibility of failure depends on the host (PC) and hub used. So
I think the hardware brand isn't the most important factor. The card reader
seems not robust enough to recover from errors. Most of the devices under Kernel
2.6.12.5 will always fail (with messages "No such device or adderss", etc.) when
several are tested simultaneously. By monitoring the USB bus, we know the device
failed to respond correctly when errors happens. For example, it failed to ACK
the bulk reset command or endlessly NAK the bulk in command. As a result, Kernel
2.6.12.5 will put it offline. Kernel 2.6.13 is better. However, isn't good
enough. Sometimes still failed to recover from errors and I need to close the
block device and reopen it for further testings. And this won't always solve the
errors. For example, I re-opened the block device for 10 times and write() still
failed. Then I gave up!
Comment 1 Alan Stern 2005-09-02 07:51:06 UTC
The patches in these two messages will help prevent usb-storage from hanging
when one of these errors occurs.  Note that both patches must be applied, in
this order:

http://marc.theaimsgroup.com/?l=linux-usb-devel&m=112258712200336&w=2
http://marc.theaimsgroup.com/?l=linux-usb-devel&m=112258718111885&w=2

Unforunately they didn't get submitted in time to appear in 2.6.13.  You may
also find that this patch

http://marc.theaimsgroup.com/?l=linux-usb-devel&m=112551468126219&w=2

helps the port resets to work better.

I don't know what else we can do about improving the error recovery.  Perhaps
adding a time delay before doing the port reset would help?  You can try editing
the source code to do this.  In drivers/usb/storage/transport.c, immediately
after the statement label Handle_Errors: in the usb_stor_invoke_transport()
routine, insert a line saying

    msleep(2000);

Make that however many milliseconds you think is needed.
Comment 2 Feng-sung Yang 2005-09-04 23:38:03 UTC
Created attachment 5896 [details]
1 fail, 3 okay, 3 pending, 1 not found
Comment 3 Feng-sung Yang 2005-09-04 23:47:21 UTC
Thanks for the patches.
The situation gets better. But after long stress testings, the USB subsystem
still hangs. I tested 8 pairs of flash card/card readers at the same time. After
the tests are all done, these 8 pairs are removed and then reconnected
simultaneously.

Initially the tests are okay. (By "okay", it means the tests can finish but some
may take much longer time to finish than others. It should be a retry issue.)
After several tests, sometimes one pair won't be detected or it will be detected
much later than others. The port number of this card reader is not specific.
Later I connected another USB disk (with 8 partitions) to the bus while testing
to read some other files.

At the last test, the first card reader (/dev/sda) failed to be opened for write
after some tests. The following 3 are okay although lots of re-opening /dev/sdX
is performed to retry the testing. The mentioned 4 card readers are connected to
the same hub. Another four readers connected to another hub are detected inside
/proc/bus/usb/005/. However, the block devices /dev/sdX never appear. Doing a
"lsusb" hangs.

[root@fsyang fsyang]# ls -l /proc/bus/usb/005
total 0
-rw-rw----  1 fsyang usb 43 Sep  5 10:55 001
-rw-rw-r--  1 root   usb 43 Sep  5 11:22 038
-rw-rw-r--  1 root   usb 43 Sep  5 11:23 040
-rw-rw-r--  1 root   usb 51 Sep  5 13:43 046
-rw-rw-r--  1 root   usb 50 Sep  5 14:03 062
-rw-rw-r--  1 root   usb 50 Sep  5 14:03 063
-rw-rw-r--  1 root   usb 50 Sep  5 14:03 064
-rw-rw-r--  1 root   usb 50 Sep  5 14:03 065
-rw-rw-r--  1 root   usb 50 Sep  5 14:03 066
-rw-rw-r--  1 root   usb 50 Sep  5 14:03 067
-rw-rw-r--  1 root   usb 50 Sep  5 14:03 068


[root@fsyang kernel]# ps -eo
pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep usb
25033 25033 TS       -   0  24   0  0.0 S    -                             
usb-storage
28030 28030 TS       -   0  24   0  0.0 S    -                             
usb-storage
28046 28046 TS       -   0  24   0  0.0 S    -                             
usb-storage
28082 28082 TS       -   0  24   0  0.0 S    -                             
usb-storage
28108 28108 TS       -   0  24   0  0.0 S    -                             
usb-storage
28134 28134 TS       -   0  24   0  0.0 S    -                             
usb-storage
28160 28160 TS       -   0  24   0  0.0 S    -                             
usb-storage
28186 28186 TS       -   0  24   0  0.0 S    -                             
usb-storage
28960 28960 TS       -   0  23   0  0.0 D+   -                              lsusb

[root@fsyang kernel]# ps -eo
pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep scsi
 3445  3445 TS       -   0  24   0  0.0 S    -                             
scsi_eh_5
 3471  3471 TS       -   0  19   0  0.0 S    -                             
scsi_eh_6
 3497  3497 TS       -   0  24   0  0.0 S    -                             
scsi_eh_7
 3523  3523 TS       -   0  24   0  0.0 S    -                             
scsi_eh_8
14934 14934 TS       -   0  24   0  0.0 S    -                             
scsi_eh_73
16122 16122 TS       -   0  19   0  0.0 S    -                             
scsi_eh_77
16137 16137 TS       -   0  19   0  0.0 S    -                             
scsi_eh_78
16174 16174 TS       -   0  19   0  0.0 S    -                             
scsi_eh_79
16200 16200 TS       -   0  19   0  0.0 S    -                             
scsi_eh_80
16226 16226 TS       -   0  19   0  0.0 S    -                             
scsi_eh_81
16252 16252 TS       -   0  19   0  0.0 S    -                             
scsi_eh_82
16278 16278 TS       -   0  19   0  0.0 S    -                             
scsi_eh_83
17901 17901 TS       -   0  24   0  0.0 S    -                             
scsi_eh_91
17953 17953 TS       -   0  19   0  0.0 S    -                             
scsi_eh_93
17979 17979 TS       -   0  19   0  0.0 S    -                             
scsi_eh_94
18005 18005 TS       -   0  19   0  0.0 S    -                             
scsi_eh_95
18033 18033 TS       -   0  19   0  0.0 S    -                             
scsi_eh_96
18062 18062 TS       -   0  24   0  0.0 S    -                             
scsi_eh_97
18774 18774 TS       -   0  19   0  0.0 S    -                             
scsi_eh_98
18801 18801 TS       -   0  19   0  0.0 S    -                             
scsi_eh_99
18827 18827 TS       -   0  19   0  0.0 S    -                             
scsi_eh_100
18853 18853 TS       -   0  19   0  0.0 S    -                             
scsi_eh_101
18879 18879 TS       -   0  19   0  0.0 S    -                             
scsi_eh_102
18905 18905 TS       -   0  19   0  0.0 S    -                             
scsi_eh_103
18931 18931 TS       -   0  19   0  0.0 S    -                             
scsi_eh_104
19650 19650 TS       -   0  24   0  0.0 S    -                             
scsi_eh_105
19676 19676 TS       -   0  24   0  0.0 S    -                             
scsi_eh_106
19702 19702 TS       -   0  24   0  0.0 S    -                             
scsi_eh_107
19728 19728 TS       -   0  24   0  0.0 S    -                             
scsi_eh_108
19754 19754 TS       -   0  19   0  0.0 S    -                             
scsi_eh_109
19780 19780 TS       -   0  24   0  0.0 S    -                             
scsi_eh_110
19806 19806 TS       -   0  19   0  0.0 S    -                             
scsi_eh_111
20275 20275 TS       -   0  19   0  0.0 S    -                             
scsi_eh_112
25031 25031 TS       -   0  19   0  0.0 S    -                             
scsi_eh_129
25786 25786 TS       -   0  19   0  0.0 S    -                             
scsi_eh_130
25801 25801 TS       -   0  19   0  0.0 S    -                             
scsi_eh_131
25838 25838 TS       -   0  19   0  0.0 S    -                             
scsi_eh_132
25864 25864 TS       -   0  19   0  0.0 S    -                             
scsi_eh_133
25890 25890 TS       -   0  19   0  0.0 S    -                             
scsi_eh_134
25916 25916 TS       -   0  19   0  0.0 S    -                             
scsi_eh_135
25942 25942 TS       -   0  19   0  0.0 S    -                             
scsi_eh_136
28029 28029 TS       -   0  24   0  0.0 S    -                             
scsi_eh_145
28045 28045 TS       -   0  19   0  0.0 S    -                             
scsi_eh_146
28081 28081 TS       -   0  19   0  0.0 S    -                             
scsi_eh_147
28107 28107 TS       -   0  19   0  0.0 S    -                             
scsi_eh_148
28133 28133 TS       -   0  19   0  0.0 S    -                             
scsi_eh_149
28159 28159 TS       -   0  19   0  0.0 S    -                             
scsi_eh_150
28185 28185 TS       -   0  19   0  0.0 S    -                             
scsi_eh_151

Since the data amount is too large. I didn't enable the USB storage verbose
debug message option. Only the normal messages are logged
(/var/log/kernel/info). They are attached here (bz2 file). I hope it will be
helpful.
Comment 4 Feng-sung Yang 2005-09-05 00:49:48 UTC
About my previous comment, the scsi_en_XX drivers seem not got released after my
test, so I do some more tests now. I find if I disconnect the card reader when
the program is testing it (I am sure write operations will result in this
phenomemon. But I don't know if read will...). The usb_storage driver will be
gone, however, the scsi_en_XX still present. In my previous comment, I did
remove the card readers several times during the test to test if the
program/Linux is robust enough.
Comment 5 Feng-sung Yang 2005-09-05 00:54:24 UTC
Now.. I am sure the reading operations will also prevent the scsi_en_XX driver
from released if you disconnect the device at the same time.
Comment 6 Alan Stern 2005-09-05 10:52:05 UTC
You're trying to do too much at once.  If you are interested in tracking down
and solving these hangups, then you should test only one storage device at a
time.  Then when something goes wrong, you'll be able to find out where and why.

Also you definitely should turn on CONFIG_USB_STORAGE_DEBUG.  It's the only way
to tell what the usb-storage driver is really doing.  With only one device being
tested, the output won't be too large to handle -- especially if it's one of the
devices that starts getting errors almost immediately.

There have been some patches posted for the SCSI core that might help solve some
of the problems you see.  They aren't included in 2.6.13, but they might be in
the current -mm tree. 
Comment 7 Feng-sung Yang 2005-09-06 02:11:55 UTC
I have tried the -mm1 patch (plus the
http://marc.theaimsgroup.com/?l=linux-usb-devel&m=112551468126219&w=2 reset patch).
This time, only one card reader is used. I remove the card reader while it's
being read (dd if=/dev/sda of=/dev/null). Then the usb-storage driver is gone
with scsi_eh_4 left. I am not sure if I should open another bug report for this
bug since it may have already been discovered by others although I haven't find
the corresponding bug report.

The following is the last trace messages:

Sep  6 17:00:34 fsyang kernel: usb-storage: Command READ_10 (10 bytes)
Sep  6 17:00:34 fsyang kernel: usb-storage:  28 00 00 01 30 40 00 00 40 00
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x4cf L
32768 F 128 Trg 0 LUN 0 CL 10
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 31/31
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transfer complete
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk command transfer result=0
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer
32768 bytes, 1 entries
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 32768/32768
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transfer complete
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk data transfer result 0x0
Sep  6 17:00:34 fsyang kernel: usb-storage: Attempting to get CSW...
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 13
bytes
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 13/13
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transfer complete
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk status result = 0
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk Status S 0x53425355 T 0x4cf R 0
Stat 0x0
Sep  6 17:00:34 fsyang kernel: usb-storage: scsi cmd done, result=0x0
Sep  6 17:00:34 fsyang kernel: usb-storage: *** thread sleeping.
Sep  6 17:00:34 fsyang kernel: usb-storage: queuecommand called
Sep  6 17:00:34 fsyang kernel: usb-storage: *** thread awakened.
Sep  6 17:00:34 fsyang kernel: usb-storage: Command READ_10 (10 bytes)
Sep  6 17:00:34 fsyang kernel: usb-storage:  28 00 00 01 30 80 00 00 40 00
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x4d0 L
32768 F 128 Trg 0 LUN 0 CL 10
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 31/31
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transfer complete
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk command transfer result=0
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer
32768 bytes, 1 entries
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 32768/32768
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transfer complete
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk data transfer result 0x0
Sep  6 17:00:34 fsyang kernel: usb-storage: Attempting to get CSW...
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 13
bytes
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 13/13
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transfer complete
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk status result = 0
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk Status S 0x53425355 T 0x4d0 R 0
Stat 0x0
Sep  6 17:00:34 fsyang kernel: usb-storage: scsi cmd done, result=0x0
Sep  6 17:00:34 fsyang kernel: usb-storage: *** thread sleeping.
Sep  6 17:00:34 fsyang kernel: usb-storage: queuecommand called
Sep  6 17:00:34 fsyang kernel: usb-storage: *** thread awakened.
Sep  6 17:00:34 fsyang kernel: usb-storage: Command READ_10 (10 bytes)
Sep  6 17:00:34 fsyang kernel: usb-storage:  28 00 00 01 30 c0 00 00 40 00
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x4d1 L
32768 F 128 Trg 0 LUN 0 CL 10
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31
bytes
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 31/31
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transfer complete
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk command transfer result=0
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer
32768 bytes, 1 entries
Sep  6 17:00:34 fsyang kernel: usb-storage: Status code -71; transferred 26112/32768
Sep  6 17:00:34 fsyang kernel: usb-storage: -- unknown error
Sep  6 17:00:34 fsyang kernel: usb-storage: Bulk data transfer result 0x4
Sep  6 17:00:34 fsyang kernel: usb-storage: -- transport indicates error, resetting
Sep  6 17:00:34 fsyang kernel: usb 5-1: USB disconnect, address 17
Sep  6 17:00:34 fsyang kernel: usb-storage: storage_disconnect() called
Sep  6 17:00:34 fsyang kernel: usb-storage: usb_stor_stop_transport called
Sep  6 17:00:36 fsyang kernel: usb-storage: No reset during disconnect
Sep  6 17:00:36 fsyang kernel: usb-storage: usb_stor_Bulk_reset called
Sep  6 17:00:36 fsyang kernel: usb-storage: No reset during disconnect
Sep  6 17:00:36 fsyang kernel: usb-storage: scsi cmd done, result=0x70000
Sep  6 17:00:36 fsyang kernel: usb-storage: *** thread sleeping.
Sep  6 17:00:36 fsyang kernel: usb-storage: queuecommand called
Sep  6 17:00:36 fsyang kernel: usb-storage: Fail command during disconnect
Sep  6 17:00:36 fsyang kernel: SCSI error : <4 0 0 0> return code = 0x10000
Sep  6 17:00:36 fsyang kernel: usb-storage: queuecommand called
Sep  6 17:00:36 fsyang kernel: usb-storage: Fail command during disconnect
Sep  6 17:00:36 fsyang kernel: SCSI error : <4 0 0 0> return code = 0x10000
Sep  6 17:00:36 fsyang kernel: usb-storage: -- usb_stor_release_resources
Sep  6 17:00:36 fsyang kernel: usb-storage: -- sending exit command to thread
Sep  6 17:00:36 fsyang kernel: usb-storage: -- dissociate_dev
Sep  6 17:00:36 fsyang kernel: usb-storage: *** thread awakened.
Sep  6 17:00:36 fsyang kernel: usb-storage: -- exiting

As for the hang problem, since I haven't find it happen for only one card
reader. I can only use several card readers simultaneously to generate this
error. I have tried four today and no error happened. I'll try more to log the
verbose debug message.
Comment 8 Feng-sung Yang 2005-09-06 06:19:46 UTC
Hi, I have open another bug report for the orphan scsi_eh_XX problem as
http://bugzilla.kernel.org/show_bug.cgi?id=5195 since it's more serious than I
thought before.
Comment 9 David Brownell 2005-09-14 12:12:40 UTC
Hey, I just noticed this.  I have nothing to contribute except for
some applause at the notion of this kind of stress test!  Linux
needs this kind of testing, as well as the one-drive-at-a-time sort
of testing.  It can be a real pain to _find_ the bugs reported by
this sort of test, but it's important to know how well these work.
Comment 10 Alan Stern 2005-09-28 13:35:36 UTC
I think these problems have been fixed in 2.6.14-rc2.  Try using that kernel and
see what happens.
Comment 11 Feng-sung Yang 2005-10-12 01:19:34 UTC
Hi, I have tried linux-2.6.14-rc4 dozens of times and no hang problem happens
again. Although the device (/dev/sdX) usually failed to be opened or written
when 8 pairs were tested simultaneously, I think it's another problem. Thanks a lot.
Comment 12 Alan Stern 2005-10-12 07:34:24 UTC
Okay, I'll mark this bug CLOSED.

Note You need to log in before you can comment on or make changes to this bug.