Bug 5010

Summary: ehci-hcd does not work for external DVD-RW "Freecom Classic"
Product: Drivers Reporter: Leonid Kof (kof)
Component: USBAssignee: Matthew Dharm (mdharm-usb)
Status: CLOSED CODE_FIX    
Severity: normal CC: greg, stern
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.11 Subsystem:
Regression: --- Bisected commit-id:
Bug Depends on:    
Bug Blocks: 5089    
Attachments: System information (outputs of lsmod, dmesg, etc.) relevant to the bug
Relevant part of /var/log/messages, same kernel configuration as before, just with usb verbose debug and usb mass storage debug turned on
Output of dmesg after the second plugging of the USB device
Log messages for the test described below
Log messages, see comment for details
Log messages with "options usb-storage delay_use=30"
Delay Get-Max-LUN request
drivers/usb/storage/usb.c
Delay Get-Max-LUN request
/var/log/messages after applying the patch
/var/log/messages after applying the patch
/var/log/messages with several USB mass storage devices

Description Leonid Kof 2005-08-06 03:03:20 UTC
Most recent kernel where this bug did not occur: 2.6.3
I did not try kernels 2.6.4-2.6.10, so I cannot exactly identify the last
bugfree kernel version.
The DVD-RW worked on my old laptop with Debian-testing/2.6.3,
it does not work on the new one with SuSE 9.3/2.6.11

Distribution: SuSE 9.3
Hardware Environment:
Laptop Samsung X20, Pentium M 1.6 GHz, 2GB RAM, USB Controller: Intel
Corporation 82801FB/FBM/FR/FW/FRW, External USB2 DVD-RW "Freecom Classic"

Software Environment: SuSE 9.3 with kernel 2.6.11.4-21.7
Problem Description:
When a USB2 burner is plugged just after boot, it works fine.
After unplugging and re-plugging it stops responding.
SCSI driver says something like "scsi1 (0:0): rejecting I/O to offline device".
(The relevant part of /var/log/messages is attached to the bug report)

Steps to reproduce:
Boot with USB2 burner unplugged.
Plug the USB2 burner
Test whether it reacts (I tried "eject sr0")
Unplug the USB2 burner.
Plug the USB2 burner again.
"eject sr0" does not work any more: Either it outputs "cannot open /dev/sr0" or
just hangs.
When eject hangs, the system refuses to hybernate.
"kill -9" does not kill hanging "eject".


Comments:
1. The bug seems to be related to the bugs #4373 and #3007, but is not quite the
same.
2. SuSE seems to be aware of the problem:
http://portal.suse.com/sdb/de/2004/01/USB_2_0.html,
their workaround does not work for me.
(The SuSE page ist in german. If necessary, I can translate it for you)
3. Once I could make it run with a sequence of loading/unloading uhci-hcd,
plugging/unplugging the DVD-RW and loading/unloading ehci-hcd.
Unfortunately, I cannot reproduce the right sequence.
Comment 1 Leonid Kof 2005-08-06 03:05:44 UTC
Created attachment 5525 [details]
System information (outputs of lsmod, dmesg, etc.) relevant to the bug
Comment 2 Alan Stern 2005-08-07 08:18:42 UTC
It will help a lot if you turn on USB verbose debugging (CONFIG_USB_DEBUG) and
USB Mass Storage verbose debugging (CONFIG_USB_STORAGE_DEBUG) in the kernel
configuration.  It might also help if you upgrade to the current 2.6.13-rc release.

(BTW, there was no need to send three separate copies of your dmesg output,
since each copy contained everything that was in the previous copy.)
Comment 3 Leonid Kof 2005-08-20 13:40:21 UTC
Created attachment 5698 [details]
Relevant part of /var/log/messages, same kernel configuration as before, just with usb verbose debug and usb mass storage debug turned on
Comment 4 Leonid Kof 2005-08-20 13:41:53 UTC
Created attachment 5699 [details]
Output of dmesg after the second plugging of the USB device
Comment 5 Alan Stern 2005-08-20 14:19:45 UTC
As far as I can see, the only difference between the first plug-in and the
second is that the first one has a 2-second delay during which the hotplug
system loads the usb-storage driver.  It could be that the drive is defective
and won't work properly without this delay.

Just as a quick test, try doing "rmmod usb-storage" after unplugging the drive
and before replugging it.  It's enough to post the /var/log/messages extract,
since it contains pretty much everything that was in your dmesg output.  If you
would insert a disc (say, an audio CD) in the drive before running your test,
that ought to reduce the total number of messages in the log.
Comment 6 Leonid Kof 2005-08-21 02:32:56 UTC
Have just tried the quick test suggested by Alan Stern.
It does not help :-((
Comment 7 Leonid Kof 2005-08-21 03:07:26 UTC
An augmentation to my previous comment:
"rmmod usb-storage" works only in the situation where the USB drive was
responding when plugged in.
If I unplug the drive in the situation where it does not respond, rmmod just says:
# rmmod usb-storage
ERROR: Module usb_storage is in use

I do not have other usb-storage devices plugged in at the moment.
Comment 8 Alan Stern 2005-08-21 08:24:06 UTC
Another way of testing this occurred to me.  After booting, before you plug in
the DVD device for the first time, do "modprobe usb-storage".  That will
eliminate the hotplug delay.  The question is, will the device still work? 
Remember to attach the system log showing what happens (and put a disc in the
drive before starting the test).

Also, 
Comment 9 Leonid Kof 2005-08-21 14:03:49 UTC
Created attachment 5711 [details]
Log messages for the test described below

As suggested by Alan, I have just tried the following test:
1. booted
2. entered the command "modprobe usb-storage". According to "lsmod" the module
was properly loaded
3. plugged in the USB device, with a disc put in. It did *not* respond.
"eject sr0" resulted in "Cannot open the device /dev/sr0".
Comment 10 Leonid Kof 2005-08-21 14:15:37 UTC
Created attachment 5712 [details]
Log messages, see comment for details

I have just tried again whether the drive would respond just after booting.
I was really surprised: now the drive refused to respond just after first
plugging. The only explanation I have is that there must be race conditions or
timing issues in modules. Otherwise the error would be deterministic.
Comment 11 Alan Stern 2005-08-22 07:09:58 UTC
A more likely explanation is that there is a race condition or timing error in
the drive itself.  Or else the drive is just broken.

From what you've said, it's not fully clear exactly how you've been running your
tests.  Try doing this:  With the USB cable unplugged, turn on the drive and
leave it on for 30 seconds or so (until it has finished its initialization,
reading the disc's table of contents, and so on).  Then plug in the USB cable.

Another thing to try.  Edit /etc/modprobe.conf and add a line saying:

    options usb-storage delay_use=30

Then make sure usb-storage is unloaded, and run your test again.
Comment 12 Leonid Kof 2005-08-22 13:20:41 UTC
Created attachment 5730 [details]
Log messages with "options usb-storage delay_use=30"

As the drive cannot load disk without being plugged in, the test cannot be
carried out in exactly the suggested sequence.

I performed the following test sequence:
1. booted
1.5 added the line "options usb-storage delay_use=30" to /etc/modprobe.conf
2. plugged the drive in.
3. loaded a disk. I did not try "eject" at this step, I loaded the disk just by
pressing the button on the drive
4. unplugged the drive
5. rebooted with the drive unplugged. After reboot I had an unplugged drive
with a disc inside and a kernel with usb-storage unloaded. The drive does not
perform any self-initialization as long as it is unplugged
6. plugged the drive in. It did *not* respond to "eject", just after the first
plugging. ("eject" says "cannot open /dev/sr0")
7. Drive initialization attempts took 5 to 10 minutes (hearable movements of
mechanical parts). After these attempts have been finished, "eject" does not
terminate any more, it just hangs.
Comment 13 Alan Stern 2005-08-23 07:00:17 UTC
Created attachment 5735 [details]
Delay Get-Max-LUN request

No initialization until the USB cable is plugged in?  I was afraid of that.

Okay, here's a patch that will cause the usb-storage driver to delay issuing
the Get-Max-LUN request, which is the first thing it sends to the device. 
Currently that request gets sent immediately, and it has been failing.	(You
can see those failures near the start of your logs.)  I'm guessing that failure
of the Get-Max-LUN causes the later failures somehow.  With this patch, maybe
nothing will fail.

So apply the patch, rebuild the usb-storage driver, and run your test again.
Comment 14 Leonid Kof 2005-08-23 11:22:10 UTC
Created attachment 5736 [details]
drivers/usb/storage/usb.c

Sorry, I cannot apply the patch:
# patch --dry-run usb.c  ~/usb_patch.c
patching file usb.c
Hunk #1 FAILED at 765.
Hunk #2 FAILED at 902.
2 out of 2 hunks FAILED -- saving rejects to file usb.c.rej
Comment 15 Alan Stern 2005-08-23 11:48:13 UTC
Created attachment 5737 [details]
Delay Get-Max-LUN request

I thought you were using kernel 2.6.11, but apparently not.  This patch should
apply against your source file okay.
Comment 16 Leonid Kof 2005-08-23 12:37:25 UTC
Created attachment 5738 [details]
/var/log/messages after applying the patch

The patch does work.
Thank you very much indeed!
You are the best! ;-)

P.S. I attached the log messages for you to see what happens.
P.P.S. As for for my kernel, it's 2.6.11.4-21.7 from the SuSE distribution, not
from kernel.org
Comment 17 Leonid Kof 2005-08-23 12:49:13 UTC
Created attachment 5739 [details]
/var/log/messages after applying the patch

Something went wrong with the previous attachment.
Next try...
Comment 18 Alan Stern 2005-08-23 13:00:29 UTC
That's the first time I've seen this particular problem.  I'll submit the patch
for inclusion in the kernel.  It won't be in time for 2.6.13, but maybe it will
appear in 2.6.14.

You should go ahead and turn off the USB Mass Storage verbose debugging, and you
can remove the usb-storage options line from /etc/modprobe.conf.  The default
delay is 5 seconds, which ought to be enough for your drive.  If it isn't, you
can put the options line back in, with some value for delay_use between 5 and 30
-- try different numbers until you find one that works.

Let me know if everything continues to work okay without the debugging or the
modprobe option.  If it does, you can go ahead and close out this bug.
Comment 19 Leonid Kof 2005-08-25 10:06:04 UTC
Created attachment 5761 [details]
/var/log/messages with several USB mass storage devices

The patch works really fine as long as the DVD-drive is the only USB mass
storage.
Today I (unintentionally) performed a harder test:

1. Plugged in the iPod
2. Mounted the iPod
3. Copied some mp3 files onto the iPod
4. started xmms directly on the iPod file system
5. Plugged in the DVD-drive
6. after the DVD-drive was initialized, the iPod file system gets unmounted due
to USB errors. It cannot be remounted:
"mount /USB"
results in
"/dev/sda1 is not a valid block device"
Unplugging and replugging of the iPod does not solve the problem.
Comment 20 Alan Stern 2005-08-25 13:59:52 UTC
This is a separate problem.  You should close this bug and start a new bug
report.  When you do, include a description of what happens if you just try
plugging in the iPod while using the DVD drive.
Comment 21 Leonid Kof 2005-08-26 10:23:07 UTC
The patch works fine even without the "delay"-option in modprobe.conf and with
verbose debugging turned off.