Bug 5237
| Summary: | scsi_eh not get released by disconnecting the device when doing rescan | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Feng-sung Yang (fsyang_tw) |
| Component: | USB | Assignee: | Alan Stern (stern) |
| Status: | CLOSED CODE_FIX | | |
| Severity: | normal | CC: | stern |
| Priority: | P2 | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.13-mm1, 2.6.14-rc4, 2.6.14 | Subsystem: | |
| Regression: | --- | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 5089 | | |
| Attachments: | info.050913.bz2; /var/log/kernel/info; info.expected; info.hang; rescan.py; kallsyms; info.051101.bz2; kallsyms.051101.bz2; Fix race between sd_rescan and sd_remove | | |
Description
Feng-sung Yang
2005-09-12 21:11:25 UTC
Created attachment 5983 [details]
info.050913.bz2
It seems that even when the Perl sample code is translated into Python, the write() can still finish with "Segmentation fault". A thread must be used so that the write() hangs and causes scsi_eh_XX to hang as well. The code is below:

=====================================================================
#!/usr/bin/python
import os
import sys
import time
import thread

n = int(sys.argv[1])
deviceDir = '/sys/devices/pci0000:00/0000:00:1d.7/usb5/5-1/5-1:1.0/'
rescanFile = deviceDir + 'host%d/target%d:0:0/%d:0:0:0/rescan' % ((n,) * 3)
sizeFile = deviceDir + 'host%d/target%d:0:0/%d:0:0:0/block/size' % ((n,) * 3)
bDone = False

def Run():
    while True:
        if not os.path.isdir(deviceDir):
            print '"%s" is gone!' % deviceDir
            break
        if not os.path.exists(sizeFile):
            print '"%s" is gone!' % sizeFile
            break
        size = file(sizeFile, 'r').read()
        if size == "0\n":
            print "Size becomes 0!"
            break
        print '***open ' + rescanFile
        try:
            fd = os.open(rescanFile, os.O_WRONLY)
            print "write 1 to rescan file"
            os.write(fd, "1")
            print 'try to close'
            os.close(fd)
            print 'close done'
        except (IOError, OSError):
            break
        time.sleep(0.1)  # 0.1 second

if __name__ == "__main__":
    thread.start_new_thread(Run, ())
    while not bDone:
        time.sleep(0.2)
    print 'Finish!!!!!!'
=====================================================================

I don't understand, is the kernel oopsing when you do this? If so, please provide the oops message.

Greg, sorry, I didn't understand what you meant (my English is bad...), and I am replying this late because it seems I was not notified. What I want is for the storage driver to simply remove the /sys files for the device when I try to rescan, without hanging the write() function call.

When the card reader is removed while the rescan operation (the write() function call) is being retried continuously, the usb_storage driver disappears as expected. The corresponding sys files in /sys/class/usb_device/, /sys/class/scsi_host/, /sys/class/scsi_device/, and /sys/class/usb_device/ also disappear. However, the scsi_eh_X thread is not released and my write() stays pending forever, even after my program is terminated by CTRL-C.

Now I have tried linux-2.6.14-rc4 with the squashfs and unionfs patches. The error condition still happens. I tried several times to reproduce the error, and the ps result is different from what I previously reported:

~ 1000$ ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep usb
~ 1001$ ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep scsi
3838 3838 TS - -5 28 0 0.0 S< scsi_error_handler scsi_eh_0
~ 1002$

By the way, I wrongly filled in the "Most recent kernel where this bug did not occur:" field. It should be "2.6.11" instead of "2.6.13-mm1".

The usb storage verbose log "info.051011.bz2" is attached.
Created attachment 6275 [details]
/var/log/kernel/info
Hi, This time I enabled the SCSI logging function to log all SCSI messages. Two logs are attached. The first one (info.expected) is for the non-threaded version: the rescan operation fails and my program terminates as expected. The second one (info.hang) is for the threaded version, where the write() hangs. After I pressed CTRL-C some time later, the log was still the same (no more messages were appended). I have also attached my simple Python program (use the variable bUseThread to enable/disable the thread) and the kernel symbol file.

It seems the problem is in the SCSI module....
Created attachment 6339 [details]
info.expected
Created attachment 6340 [details]
info.hang
Created attachment 6341 [details]
rescan.py
Created attachment 6342 [details]
kallsyms
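For readers who want to try the reproducer on a current system, here is a minimal Python 3 sketch of the same loop. It is not the attached rescan.py: the names `sysfs_paths`, `rescan_loop`, `run`, `use_thread` and `max_iterations` are assumptions made for illustration, and the device directory must be supplied by the caller because it depends on the machine.

```python
#!/usr/bin/env python3
"""Python 3 sketch of the rescan loop (an assumed port, not the attached rescan.py)."""
import os
import threading
import time


def sysfs_paths(device_dir, n):
    """Build the rescan and size file paths for SCSI host number n."""
    lun_dir = os.path.join(device_dir, "host%d/target%d:0:0/%d:0:0:0" % ((n,) * 3))
    return os.path.join(lun_dir, "rescan"), os.path.join(lun_dir, "block", "size")


def rescan_loop(device_dir, n, delay=0.1, max_iterations=None):
    """Write "1" to the rescan file until the device goes away.

    Returns a short string naming the reason the loop stopped.
    max_iterations is a hypothetical safety limit; None means no limit.
    """
    rescan_file, size_file = sysfs_paths(device_dir, n)
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        if not os.path.isdir(device_dir):
            return "device directory gone"
        try:
            with open(size_file) as f:
                size = f.read()
        except OSError:
            return "size file gone"
        if size == "0\n":
            return "size is 0"
        try:
            fd = os.open(rescan_file, os.O_WRONLY)
            try:
                os.write(fd, b"1")  # the call reported to hang in this bug
            finally:
                os.close(fd)
        except OSError:
            return "rescan failed"
        time.sleep(delay)
    return "iteration limit reached"


def run(device_dir, n, use_thread=True):
    """Run the loop directly or in a worker thread and return its result."""
    if not use_thread:
        return rescan_loop(device_dir, n)
    result = []
    worker = threading.Thread(
        target=lambda: result.append(rescan_loop(device_dir, n)), daemon=True)
    worker.start()
    worker.join()  # blocks for as long as the write() blocks
    return result[0]
```

Calling `run(device_dir, n, use_thread=False)` corresponds to the non-threaded case described above, and `use_thread=True` to the threaded case.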
Can you try this again using 2.6.14?
Created attachment 6431 [details]
info.051101.bz2
Hi,
The result is the same for 2.6.14 (with the unionfs and squashfs patches). I
have also updated the kernel log and kallsyms here.
[root@fsyang fsyang]# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep usb
[root@fsyang fsyang]# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:30,comm | grep scsi
4913 4913 TS - -5 28 0 0.0 S< scsi_error_handler scsi_eh_3
By the way, could anybody tell me why I am not notified when this bug is
modified? I received some notifications before, but now I am not
notified, so I must check this bug daily... Thanks...
Created attachment 6432 [details]
kallsyms.051101.bz2
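The ps check used above can be scripted. The following is a small sketch, assuming a procps-style `ps` that accepts `-eo pid,comm`; the function names `find_scsi_eh_threads` and `list_scsi_eh_threads` are hypothetical. It only lists threads whose name starts with `scsi_eh_`. Since each live SCSI host normally has its own error handler thread, a hit counts as a leftover only if its host has already been removed.

```python
import subprocess


def find_scsi_eh_threads(ps_output):
    """Return (pid, name) pairs for scsi_eh_N threads in `ps -eo pid,comm` output."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # skip the header line and blank lines
        pid, comm = int(fields[0]), fields[1].strip()
        if comm.startswith("scsi_eh_"):
            found.append((pid, comm))
    return found


def list_scsi_eh_threads():
    """Run ps and return the scsi_eh_N threads currently present."""
    out = subprocess.run(
        ["ps", "-eo", "pid,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_scsi_eh_threads(out)
```

Running `list_scsi_eh_threads()` before plugging in the card reader and again after removing it gives two lists to compare; an entry that appears only in the second list would match the symptom reported here.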
I am able to reproduce the failure on my computer, and I'm working to fix it.

You should be receiving email notifications about updates to this bug. Bugzilla does send out a message automatically to the address listed as the Submitter.
Created attachment 6452 [details]
Fix race between sd_rescan and sd_remove
Try the attached patch. It fixed the problem on my computer.
Hi, I have tried your patch and things worked as expected so far. Thanks a lot.