Most recent kernel where this bug did not occur: None Distribution: Mandriva Linux release 2006.0 (Cooker) for i586 Hardware Environment: i686 Intel(R) Pentium(R) 4 CPU 3.00GHz Software Environment: Kernel tested: 2.6.13-mm1 with patch from http://marc.theaimsgroup.com/?l=linux-usb-devel&m=112551468126219&w=2 Problem Description: 1. I have tried the -mm1 patch (plus the http://marc.theaimsgroup.com/?l=linux-usb-devel&m=112551468126219&w=2 reset patch). Only one card reader is used. I remove the card reader while it's being read (dd if=/dev/sda of=/dev/null). Then the usb-storage driver is gone with scsi_eh_4 left. The following is the last trace messages: Sep 6 17:00:34 fsyang kernel: usb-storage: Command READ_10 (10 bytes) Sep 6 17:00:34 fsyang kernel: usb-storage: 28 00 00 01 30 40 00 00 40 00 Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x4cf L 32768 F 128 Trg 0 LUN 0 CL 10 Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes Sep 6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 31/31 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transfer complete Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk command transfer result=0 Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer 32768 bytes, 1 entries Sep 6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 32768/32768 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transfer complete Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk data transfer result 0x0 Sep 6 17:00:34 fsyang kernel: usb-storage: Attempting to get CSW... Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes Sep 6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 13/13 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transfer complete Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk status result = 0 Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk Status S 0x53425355 T 0x4cf R 0 Stat 0x0 Sep 6 17:00:34 fsyang kernel: usb-storage: scsi cmd done, result=0x0 Sep 6 17:00:34 fsyang kernel: usb-storage: *** thread sleeping. Sep 6 17:00:34 fsyang kernel: usb-storage: queuecommand called Sep 6 17:00:34 fsyang kernel: usb-storage: *** thread awakened. Sep 6 17:00:34 fsyang kernel: usb-storage: Command READ_10 (10 bytes) Sep 6 17:00:34 fsyang kernel: usb-storage: 28 00 00 01 30 80 00 00 40 00 Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x4d0 L 32768 F 128 Trg 0 LUN 0 CL 10 Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes Sep 6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 31/31 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transfer complete Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk command transfer result=0 Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer 32768 bytes, 1 entries Sep 6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 32768/32768 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transfer complete Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk data transfer result 0x0 Sep 6 17:00:34 fsyang kernel: usb-storage: Attempting to get CSW... Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes Sep 6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 13/13 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transfer complete Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk status result = 0 Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk Status S 0x53425355 T 0x4d0 R 0 Stat 0x0 Sep 6 17:00:34 fsyang kernel: usb-storage: scsi cmd done, result=0x0 Sep 6 17:00:34 fsyang kernel: usb-storage: *** thread sleeping. Sep 6 17:00:34 fsyang kernel: usb-storage: queuecommand called Sep 6 17:00:34 fsyang kernel: usb-storage: *** thread awakened. Sep 6 17:00:34 fsyang kernel: usb-storage: Command READ_10 (10 bytes) Sep 6 17:00:34 fsyang kernel: usb-storage: 28 00 00 01 30 c0 00 00 40 00 Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk Command S 0x43425355 T 0x4d1 L 32768 F 128 Trg 0 LUN 0 CL 10 Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes Sep 6 17:00:34 fsyang kernel: usb-storage: Status code 0; transferred 31/31 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transfer complete Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk command transfer result=0 Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_bulk_transfer_sglist: xfer 32768 bytes, 1 entries Sep 6 17:00:34 fsyang kernel: usb-storage: Status code -71; transferred 26112/32768 Sep 6 17:00:34 fsyang kernel: usb-storage: -- unknown error Sep 6 17:00:34 fsyang kernel: usb-storage: Bulk data transfer result 0x4 Sep 6 17:00:34 fsyang kernel: usb-storage: -- transport indicates error, resetting Sep 6 17:00:34 fsyang kernel: usb 5-1: USB disconnect, address 17 Sep 6 17:00:34 fsyang kernel: usb-storage: storage_disconnect() called Sep 6 17:00:34 fsyang kernel: usb-storage: usb_stor_stop_transport called Sep 6 17:00:36 fsyang kernel: usb-storage: No reset during disconnect Sep 6 17:00:36 fsyang kernel: usb-storage: usb_stor_Bulk_reset called Sep 6 17:00:36 fsyang kernel: usb-storage: No reset during disconnect Sep 6 17:00:36 fsyang kernel: usb-storage: scsi cmd done, result=0x70000 Sep 6 17:00:36 fsyang kernel: usb-storage: *** thread sleeping. Sep 6 17:00:36 fsyang kernel: usb-storage: queuecommand called Sep 6 17:00:36 fsyang kernel: usb-storage: Fail command during disconnect Sep 6 17:00:36 fsyang kernel: SCSI error : <4 0 0 0> return code = 0x10000 Sep 6 17:00:36 fsyang kernel: usb-storage: queuecommand called Sep 6 17:00:36 fsyang kernel: usb-storage: Fail command during disconnect Sep 6 17:00:36 fsyang kernel: SCSI error : <4 0 0 0> return code = 0x10000 Sep 6 17:00:36 fsyang kernel: usb-storage: -- usb_stor_release_resources Sep 6 17:00:36 fsyang kernel: usb-storage: -- sending exit command to thread Sep 6 17:00:36 fsyang kernel: usb-storage: -- dissociate_dev Sep 6 17:00:36 fsyang kernel: usb-storage: *** thread awakened. Sep 6 17:00:36 fsyang kernel: usb-storage: -- exiting 2. When stressing (reading/writing) 7 card readers simultaneously, a similar condition happens. (Actually 8 card readers are connected. But only 7 are found. I am not sure whether it's a driver problem...) I think the signal quality on USB bus seems really bad when multiple card readers are tested at the same time. (The environment is the same as http://bugzilla.kernel.org/show_bug.cgi?id=5173) Sometimes it can be found that some scsi_eh_XX is present without any corresponding USB device. I think when doing error recovery for that USB device, the scsi_eh_XX driver is accidentially replaced by another one with cancelling the original one. When one more orphan scsi_eh_XX is found, I try to stop the testing immediately. I think it still takes two or more seconds to really stop the testing. The ps result is as below: ~ 1012$ ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep scsi 23474 23474 TS - 0 19 0 0.0 S - scsi_eh_119 25649 25649 TS - 0 24 0 0.0 S - scsi_eh_125 25686 25686 TS - 0 19 0 0.0 S - scsi_eh_126 25723 25723 TS - 0 19 0 0.0 S - scsi_eh_127 25760 25760 TS - 0 19 0 0.0 S - scsi_eh_128 25797 25797 TS - 0 24 0 0.0 S - scsi_eh_129 25834 25834 TS - 0 24 0 0.0 S - scsi_eh_130 25871 25871 TS - 0 24 0 0.0 S - scsi_eh_131 27074 27074 TS - 0 19 0 0.0 S - scsi_eh_132 scsi_eh_119 is caused by the previous test before this one since I was unable to save the verbose debug information in time. So that I run the tests again to cause the orphan scsi_eh_128. The device mappings is: /sys/block 1004$ ls -l sd?/device lrwxrwxrwx 1 root root 0 9月 6 20:59 sda/device -> ../../devices/pci0000:00/0000:00:1d.7/usb5/5-1/5-1.1/5-1.1:1.0/host125/target125:0:0/125:0:0:0/ lrwxrwxrwx 1 root root 0 9月 6 20:59 sdb/device -> ../../devices/pci0000:00/0000:00:1d.7/usb5/5-1/5-1.2/5-1.2:1.0/host126/target126:0:0/126:0:0:0/ lrwxrwxrwx 1 root root 0 9月 6 20:58 sdc/device -> ../../devices/pci0000:00/0000:00:1d.7/usb5/5-1/5-1.3/5-1.3:1.0/host127/target127:0:0/127:0:0:0/ lrwxrwxrwx 1 root root 0 9月 6 20:59 sdd/device -> ../../devices/pci0000:00/0000:00:1d.7/usb5/5-1/5-1.4/5-1.4:1.0/host132/target132:0:0/132:0:0:0/ lrwxrwxrwx 1 root root 0 9月 6 20:59 sde/device -> ../../devices/pci0000:00/0000:00:1d.7/usb5/5-2/5-2.1/5-2.1:1.0/host129/target129:0:0/129:0:0:0/ lrwxrwxrwx 1 root root 0 9月 6 20:59 sdf/device -> ../../devices/pci0000:00/0000:00:1d.7/usb5/5-2/5-2.2/5-2.2:1.0/host130/target130:0:0/130:0:0:0/ lrwxrwxrwx 1 root root 0 9月 6 20:59 sdg/device -> ../../devices/pci0000:00/0000:00:1d.7/usb5/5-2/5-2.3/5-2.3:1.0/host131/target131:0:0/131:0:0:0/ /sys/block 1006$ ls -l /proc/bus/usb/005 總計 0 -rw-rw---- 1 fsyang usb 43 9月 6 20:28 001 -rw-rw---- 1 fsyang usb 43 9月 6 20:28 002 -rw-rw---- 1 fsyang usb 43 9月 6 20:28 003 -rw-rw-r-- 1 root usb 50 9月 6 20:56 016 -rw-rw-r-- 1 root usb 50 9月 6 20:56 017 -rw-rw-r-- 1 root usb 50 9月 6 20:56 018 -rw-rw-r-- 1 root usb 50 9月 6 20:56 020 -rw-rw-r-- 1 root usb 50 9月 6 20:56 021 -rw-rw-r-- 1 root usb 50 9月 6 20:56 022 -rw-rw-r-- 1 root usb 50 9月 6 20:59 023 /sys/block 1007$ lsusb Bus 005 Device 023: ID 058f:6369 Alcor Micro Corp. Bus 005 Device 022: ID 058f:6369 Alcor Micro Corp. Bus 005 Device 021: ID 058f:6369 Alcor Micro Corp. Bus 005 Device 020: ID 058f:6369 Alcor Micro Corp. Bus 005 Device 018: ID 058f:6369 Alcor Micro Corp. Bus 005 Device 017: ID 058f:6369 Alcor Micro Corp. Bus 005 Device 016: ID 058f:6369 Alcor Micro Corp. Bus 005 Device 003: ID 05e3:0605 Genesys Logic, Inc. Bus 005 Device 002: ID 05e3:0605 Genesys Logic, Inc. Bus 005 Device 001: ID 0000:0000 Bus 004 Device 001: ID 0000:0000 Bus 003 Device 001: ID 0000:0000 Bus 002 Device 001: ID 0000:0000 Bus 001 Device 001: ID 0000:0000 The final part of /var/log/kernel/info is attached as "info1.bz2". The head is gone since I periodically cut this file when testing so that it can be kept at a reasonable size.
Created attachment 5916 [details] /var/log/kernel/info
Created attachment 5931 [details] Fix errors in the SCSI core Attached is a patch for 2.6.13-mm1 that fixes a number of errors in the SCSI core. Among other things, it should get rid of all those extra error-handler processes. Keep all the other patches and apply this one in addition.
Created attachment 5977 [details] info.050912.bz2
Created attachment 5978 [details] info.050912.bz2
Thanks a lot. I have tried this patch and I haven't encounter any previously reported problems yet. The patches I used in this kernel is: -mm1 http://marc.theaimsgroup.com/?l=linux-usb-devel&m=112551468126219&w=2 msleep(2000) in drivers/usb/storage/transport.c "Fix errors in the SCSI core" patch However, when I use -mm2 patch to replace -mm1. Things become odd. The usb_storage and scsi_eh_X drivers won't get released after the USB device is removed even when only one card reader is plugged-in and then unplugged. And, after some stress testings for multiple card readers, the USB system hangs as before with a lot of driver left. Since the difference of USB drivers between these two patches are small so I am curious if -mm2 shouldn't be used. The following is the ps result when and after the device is disconnected: [root@fsyang ~]# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep usb 3123 3123 TS - 0 24 0 0.0 S - usb-storage [root@fsyang ~]# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep scsi 3120 3120 TS - -5 28 0 0.0 S< - scsi_eh_0 [root@fsyang ~]# # Unplugged...... [root@fsyang ~]# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep usb 3123 3123 TS - 0 24 0 0.0 D kthread_stop usb-storage [root@fsyang ~]# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep scsi 3120 3120 TS - -5 29 0 0.0 S< - scsi_eh_0 And the /var/log/kernel/info is attached as info.050912.bz2.
I think this has been fixed in the latest kernel. Try using 2.6.14-rc2-git6.
Hi, I have tried linux-2.6.14-rc4 dozens of times and this problem didn't happen again. Thanks a lot.
Okay, I'll mark this bug CLOSED.