Bug 14841 - unable to enumerate USB device on port X after suspend/resume
unable to enumerate USB device on port X after suspend/resume
Status: CLOSED DOCUMENTED
Product: Drivers
Classification: Unclassified
Component: USB
All Linux
: P1 normal
Assigned To: Greg Kroah-Hartman
:
Depends on:
Blocks: 7216 14230
  Show dependency treegraph
 
Reported: 2009-12-19 11:45 UTC by Fruhwirth Clemens
Modified: 2010-01-11 19:57 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.32
Tree: Mainline
Regression: Yes


Attachments

Description Fruhwirth Clemens 2009-12-19 11:45:58 UTC
After a suspend/resume cycle my SD card fails to work (attached over usb-storage). 

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/256767 has a lengthly "me-too" list of people affected. So that bug seems very real to me. Also it is not harmless as some people claim, the device fails to work for me unless I cold-boot without uswsusp-resuming.

I found the following message interesting:
http://lkml.org/lkml/2008/10/24/209
and it makes me speculate that this "EHCI most be loaded before UHCI/OHCI" condition is violated when (re-)initializing the USB ports from a cold-boot resume.

That is what's going on my usbmon port for that particular USB port:

eb302780 964861293 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 964861295 C Ci:2:001:0 0 4 = 01050100
eb302780 964861297 S Co:2:001:0 s 23 01 0010 0002 0000 0
eb302780 964861301 C Co:2:001:0 0 0
eb302780 964861303 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 964861306 C Ci:2:001:0 0 4 = 01050000
eb302780 964889554 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 964889559 C Ci:2:001:0 0 4 = 01050000
eb302780 964920138 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 964920148 C Ci:2:001:0 0 4 = 01050000
eb302780 964950136 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 964950146 C Ci:2:001:0 0 4 = 01050000
eb302780 964980134 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 964980144 C Ci:2:001:0 0 4 = 01050000
eb302780 964980153 S Co:2:001:0 s 23 03 0004 0002 0000 0
eb302780 964980156 C Co:2:001:0 0 0
eb302780 965032890 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 965033076 C Ci:2:001:0 0 4 = 03051000
f6c62b80 965033101 C Ii:2:001:1 0:2048 1 = 04
f6c62b80 965033110 S Ii:2:001:1 -115:2048 4 <
eb302780 965087074 S Co:2:001:0 s 23 01 0014 0002 0000 0
eb302780 965087081 C Co:2:001:0 0 0
eb302780 965087091 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
eb302780 965091083 C Ci:2:000:0 -71 0
eb302780 965091093 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
eb302780 965095208 C Ci:2:000:0 -71 0
eb302780 965095216 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
eb302780 965099333 C Ci:2:000:0 -71 0
eb302780 965099341 S Co:2:001:0 s 23 03 0004 0002 0000 0
eb302780 965099344 C Co:2:001:0 0 0
eb302780 965150135 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 965150451 C Ci:2:001:0 0 4 = 00011100
eb302780 965150454 S Co:2:001:0 s 23 01 0014 0002 0000 0
eb302780 965150457 C Co:2:001:0 0 0
eb302780 965150459 S Co:2:001:0 s 23 01 0001 0002 0000 0
eb302780 965150462 C Co:2:001:0 0 0
eb302780 965150473 S Co:2:001:0 s 23 01 0001 0002 0000 0
eb302780 965150476 C Co:2:001:0 0 0
eb302780 965150480 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
eb302780 965150483 C Ci:2:001:0 0 4 = 00010000
f6c62b80 965154400 C Ii:2:001:1 0:2048 1 = 04
f6c62b80 965154402 S Ii:2:001:1 -115:2048 4 <
Comment 1 Andrew Morton 2009-12-22 23:14:59 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sat, 19 Dec 2009 11:45:59 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=14841
> 
>            Summary: unable to enumerate USB device on port X after
>                     suspend/resume
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.32
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: USB
>         AssignedTo: greg@kroah.com
>         ReportedBy: clemens@endorphin.org
>         Regression: Yes

It's a regression from 2.6.31 (I assume?) to 2.6.32.

> 
> After a suspend/resume cycle my SD card fails to work (attached over
> usb-storage). 
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/256767 has a lengthly
> "me-too" list of people affected. So that bug seems very real to me. Also it is
> not harmless as some people claim, the device fails to work for me unless I
> cold-boot without uswsusp-resuming.
> 
> I found the following message interesting:
> http://lkml.org/lkml/2008/10/24/209
> and it makes me speculate that this "EHCI most be loaded before UHCI/OHCI"
> condition is violated when (re-)initializing the USB ports from a cold-boot
> resume.
> 
> That is what's going on my usbmon port for that particular USB port:
> 
> eb302780 964861293 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 964861295 C Ci:2:001:0 0 4 = 01050100
> eb302780 964861297 S Co:2:001:0 s 23 01 0010 0002 0000 0
> eb302780 964861301 C Co:2:001:0 0 0
> eb302780 964861303 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 964861306 C Ci:2:001:0 0 4 = 01050000
> eb302780 964889554 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 964889559 C Ci:2:001:0 0 4 = 01050000
> eb302780 964920138 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 964920148 C Ci:2:001:0 0 4 = 01050000
> eb302780 964950136 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 964950146 C Ci:2:001:0 0 4 = 01050000
> eb302780 964980134 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 964980144 C Ci:2:001:0 0 4 = 01050000
> eb302780 964980153 S Co:2:001:0 s 23 03 0004 0002 0000 0
> eb302780 964980156 C Co:2:001:0 0 0
> eb302780 965032890 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 965033076 C Ci:2:001:0 0 4 = 03051000
> f6c62b80 965033101 C Ii:2:001:1 0:2048 1 = 04
> f6c62b80 965033110 S Ii:2:001:1 -115:2048 4 <
> eb302780 965087074 S Co:2:001:0 s 23 01 0014 0002 0000 0
> eb302780 965087081 C Co:2:001:0 0 0
> eb302780 965087091 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
> eb302780 965091083 C Ci:2:000:0 -71 0
> eb302780 965091093 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
> eb302780 965095208 C Ci:2:000:0 -71 0
> eb302780 965095216 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
> eb302780 965099333 C Ci:2:000:0 -71 0
> eb302780 965099341 S Co:2:001:0 s 23 03 0004 0002 0000 0
> eb302780 965099344 C Co:2:001:0 0 0
> eb302780 965150135 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 965150451 C Ci:2:001:0 0 4 = 00011100
> eb302780 965150454 S Co:2:001:0 s 23 01 0014 0002 0000 0
> eb302780 965150457 C Co:2:001:0 0 0
> eb302780 965150459 S Co:2:001:0 s 23 01 0001 0002 0000 0
> eb302780 965150462 C Co:2:001:0 0 0
> eb302780 965150473 S Co:2:001:0 s 23 01 0001 0002 0000 0
> eb302780 965150476 C Co:2:001:0 0 0
> eb302780 965150480 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> eb302780 965150483 C Ci:2:001:0 0 4 = 00010000
> f6c62b80 965154400 C Ii:2:001:1 0:2048 1 = 04
> f6c62b80 965154402 S Ii:2:001:1 -115:2048 4 <
>
Comment 2 Alan Stern 2009-12-23 16:36:51 UTC
> > http://bugzilla.kernel.org/show_bug.cgi?id=14841
> > 
> >            Summary: unable to enumerate USB device on port X after
> >                     suspend/resume

> It's a regression from 2.6.31 (I assume?) to 2.6.32.

Is it a regression?  Did this work okay under 2.6.31?  Can you provide 
a usbmon trace showing the successful behavior under 2.6.31?

> > After a suspend/resume cycle my SD card fails to work (attached over
> > usb-storage). 

But it works okay before you suspend/resume?  Was the card reader 
plugged in during the suspend?  What happens if you unplug the reader 
after resuming and then plug it in again?

> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/256767 has a lengthly
> > "me-too" list of people affected. So that bug seems very real to me. Also it is
> > not harmless as some people claim, the device fails to work for me unless I
> > cold-boot without uswsusp-resuming.

The people chiming in on that bug report actually described a wide
variety of different kinds of failures.  Some of them have been fixed
and some haven't.

> > I found the following message interesting:
> > http://lkml.org/lkml/2008/10/24/209
> > and it makes me speculate that this "EHCI most be loaded before UHCI/OHCI"
> > condition is violated when (re-)initializing the USB ports from a cold-boot
> > resume.

If you read that log message more closely you'll see that it says
"ehci_hcd should always be loaded before uhci_hcd and ohci_hcd".  
_Should_, not _must_.  Loading the drivers in the wrong order can give
rise to one or two "unable to enumerate" messages, but not an unending
stream of them.

> > That is what's going on my usbmon port for that particular USB port:
> > 
> > eb302780 964861293 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 964861295 C Ci:2:001:0 0 4 = 01050100
> > eb302780 964861297 S Co:2:001:0 s 23 01 0010 0002 0000 0
> > eb302780 964861301 C Co:2:001:0 0 0
> > eb302780 964861303 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 964861306 C Ci:2:001:0 0 4 = 01050000
> > eb302780 964889554 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 964889559 C Ci:2:001:0 0 4 = 01050000
> > eb302780 964920138 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 964920148 C Ci:2:001:0 0 4 = 01050000
> > eb302780 964950136 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 964950146 C Ci:2:001:0 0 4 = 01050000
> > eb302780 964980134 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 964980144 C Ci:2:001:0 0 4 = 01050000

That was a normal debounce.

> > eb302780 964980153 S Co:2:001:0 s 23 03 0004 0002 0000 0
> > eb302780 964980156 C Co:2:001:0 0 0
> > eb302780 965032890 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 965033076 C Ci:2:001:0 0 4 = 03051000

That was a successful port reset.

> > f6c62b80 965033101 C Ii:2:001:1 0:2048 1 = 04
> > f6c62b80 965033110 S Ii:2:001:1 -115:2048 4 <

This indicates that when the reset was over, something happened -- 
probably a disconnect event.  It's possible that the reader wasn't able 
to reset correctly.

> > eb302780 965087074 S Co:2:001:0 s 23 01 0014 0002 0000 0
> > eb302780 965087081 C Co:2:001:0 0 0
> > eb302780 965087091 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
> > eb302780 965091083 C Ci:2:000:0 -71 0
> > eb302780 965091093 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
> > eb302780 965095208 C Ci:2:000:0 -71 0
> > eb302780 965095216 S Ci:2:000:0 s 80 06 0100 0000 0040 64 <
> > eb302780 965099333 C Ci:2:000:0 -71 0

These -71 errors confirm that the device was not connected properly.

> > eb302780 965099341 S Co:2:001:0 s 23 03 0004 0002 0000 0
> > eb302780 965099344 C Co:2:001:0 0 0
> > eb302780 965150135 S Ci:2:001:0 s a3 00 0000 0002 0004 4 <
> > eb302780 965150451 C Ci:2:001:0 0 4 = 00011100

And so does this port status value.

Alan Stern
Comment 3 Fruhwirth Clemens 2009-12-27 21:31:29 UTC
Hello Alan, hello Andrew,

After some more probing, I found that my initial conclusion that this
is related to suspend/resuming has been totally wrong. First, the
device works only under a USB 1.1 root despite the USB description
telling me that this is a USB 2.0 device. The symptoms observed come
merely from the fact that connecting/reconnecting after boot tries to
attach the device under a USB 2.0 root. I have not yet found out, why
after a cold boot the device runs at fullspeed and not highspeed, (USB
handoff?). The suspend/resume cycle triggers a "connect" (or
reconnect) and therefore I get the same symptoms when doing a
reconnect by hand without sleeping inbetween.

In contrast to that, the device works with Windows (7). The strategy
for Windows is to fallback to lower speed.
I can force the device to USB 1.1 by using "companion" of the EHCI
driver, where I can force handoff to the companion controller (thanks
Alan for this feature) so I can live without any changes. However, I'm
wondering whether an automatic fallback is a good idea for devices
that fail to enumerate?

On Wed, Dec 23, 2009 at 5:36 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>> > http://bugzilla.kernel.org/show_bug.cgi?id=14841
>> >
>> >            Summary: unable to enumerate USB device on port X after
>> >                     suspend/resume
>
>> It's a regression from 2.6.31 (I assume?) to 2.6.32.

No, I have seen this behaviour with 2.3.31 too.

> Is it a regression?  Did this work okay under 2.6.31?  Can you provide
> a usbmon trace showing the successful behavior under 2.6.31?

Alan, you have seen this trace already.
I posted a successful SD card probe here involving the capacity-is-wrong-bug
http://article.gmane.org/gmane.linux.scsi/56469
I guess this reader is nothing but broken.

I attached a log where I did the following:

Boot with device detached until the bootloader (so that I don't the
BIOS initialization can't screw things up)
Attach device
Boot into the kernel
Start usbmon (device works) - here the trace starts
Disconnect/Reconnect device
Devices doesn't connect properly after timestamp 20588...
-- 
Fruhwirth Clemens http://clemens.endorphin.org
Comment 4 Alan Stern 2009-12-28 03:03:31 UTC
On Sun, 27 Dec 2009, Clemens Fruhwirth wrote:

> Hello Alan, hello Andrew,
> 
> After some more probing, I found that my initial conclusion that this
> is related to suspend/resuming has been totally wrong. First, the
> device works only under a USB 1.1 root despite the USB description
> telling me that this is a USB 2.0 device.

In theory there's nothing wrong with that.  It's perfectly valid for a
USB-2.0 device to run only at full speed, not at high speed.  Of
course, in your case it's more likely that this indicates the device is
somehow broken.

Have you posted the "lsusb -v" output for the device?  It should 
indicate (if you're using a recent version of lsusb) at what speeds the 
device is meant to run.

>  The symptoms observed come
> merely from the fact that connecting/reconnecting after boot tries to
> attach the device under a USB 2.0 root. I have not yet found out, why
> after a cold boot the device runs at fullspeed and not highspeed, (USB
> handoff?). The suspend/resume cycle triggers a "connect" (or
> reconnect) and therefore I get the same symptoms when doing a
> reconnect by hand without sleeping inbetween.

There's the question of why it works during boot but not afterward.  
This could be caused by the timing of events.  During boot lots of
other stuff is going on, and it could cause the USB initialization
sequence to be delayed.  That in turn could easily affect the outcome
if the device is broken.

> In contrast to that, the device works with Windows (7). The strategy
> for Windows is to fallback to lower speed.

Linux does the same thing.  But it can't when the device disconnects
and then reconnects, which is what happened in your logs.  Each
reconnect is handled independently, starting fresh at high speed.

> I can force the device to USB 1.1 by using "companion" of the EHCI
> driver, where I can force handoff to the companion controller (thanks
> Alan for this feature) so I can live without any changes. However, I'm
> wondering whether an automatic fallback is a good idea for devices
> that fail to enumerate?

We already do it, subject to the limitation described above.

> I attached a log where I did the following:
> 
> Boot with device detached until the bootloader (so that I don't the
> BIOS initialization can't screw things up)
> Attach device
> Boot into the kernel
> Start usbmon (device works) - here the trace starts
> Disconnect/Reconnect device
> Devices doesn't connect properly after timestamp 20588...

The usbmon trace doesn't contain any information about the reconnect.  
Can you do the same thing again, but using the 0u file instead of 2u?

Alan Stern
Comment 5 Fruhwirth Clemens 2010-01-11 00:05:32 UTC
As discussed on the mailing list, the USB reader in question is simply broken. It reports itself as USB 2.0 but is in fact unable to operate at that speed. We could discuss an automatic fallback for devices to USB 1.1 in these cases as Windows implements it.
Comment 6 Rafael J. Wysocki 2010-01-11 19:57:10 UTC
On Monday 11 January 2010, Clemens Fruhwirth wrote:
> On Mon, Jan 11, 2010 at 4:15 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Sun, 10 Jan 2010, Rafael J. Wysocki wrote:
> >
> >> Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=14841
> >> Subject       : unable to enumerate USB device on port X after suspend/resume
> >> Submitter     : Fruhwirth Clemens <clemens@endorphin.org>
> >> Date          : 2009-12-19 11:45 (23 days old)
> >
> > This bug has been resolved as WILL NOT FIX.  It ended up having nothing
> > to do with suspend or resume, it was caused by strange behavior in the
> > device hardware.
> 
> Should we track "automatic fallback to USB 1.1" as feature request?
> You already said that this isn't easily possible with the way the USB
> stack is structured at the moment (connect-centric-view instead of
> port-centric-view). However, Windows does that fallback and I'm afraid
> that this is the default device testing scenario.

Note You need to log in before you can comment on or make changes to this bug.