Bug 68161 - Unstable work of xhci with USB3.0 card reader and UDMA7 CompactFlash card.
Summary: Unstable work of xhci with USB3.0 card reader and UDMA7 CompactFlash card.
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-04 22:17 UTC by tatxarata
Modified: 2015-03-13 16:03 UTC

See Also:
Kernel Version: 3.12.6
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Log when trying to mount card via thunar. (412.12 KB, text/x-log)
2014-01-04 22:18 UTC, tatxarata
Capture from wireshark's device usbmon when trying to mount card via thunar. (2.04 MB, application/octet-stream)
2014-01-04 22:19 UTC, tatxarata
Log when I mount and copy file from shell. (238.16 KB, text/x-log)
2014-01-04 22:20 UTC, tatxarata
Capture from wireshark's device usbmon when I mount and copy file from shell. (1.43 MB, application/octet-stream)
2014-01-04 22:21 UTC, tatxarata
Capture from wireshark's device usbmon when mounting via thunar in USB2.0 port. (3.05 MB, application/octet-stream)
2014-01-04 22:22 UTC, tatxarata

Description tatxarata 2014-01-04 22:17:24 UTC
I have a SanDisk Extreme Pro 32 GB UDMA7 CompactFlash card and a Transcend
TS-RDF8K USB 3.0 card reader with the latest firmware, TS22. The BIOS on my Dell
E6230 is also updated to the latest version.

When the card reader is in a USB 2.0 port, everything works as expected.

When it is plugged into a USB 3.0 port, there are three possible outcomes.
1. In the rarest case everything works as expected. However, I have not managed
to reproduce this case with the latest kernel.
2. Sometimes the card is not recognized and no block device is created.
3. In most cases the card is recognized, and I can mount it from the shell with
the 'mount' command, copy files to and from it with the 'cp' command, etc. But
when I enter the directory where the card is mounted with midnight commander or
thunar (I use xfce), the device hangs for a couple of seconds and then resets.
The same thing happens when I try to mount the card from thunar. In that last
case, the device occasionally does not reset after the freeze, but then it
hangs when I try to work with it.

I also have an older Transcend 4 GB 120x CompactFlash card that works without
any issues in a USB 3.0 port.

I've also tried a Ginzzu GR-336B card reader, with the same results.

ps
Sorry for my poor English. It's not my native language.
Comment 1 tatxarata 2014-01-04 22:18:29 UTC
Created attachment 120891
Log when trying to mount card via thunar.
Comment 2 tatxarata 2014-01-04 22:19:19 UTC
Created attachment 120901
Capture from wireshark's device usbmon when trying to mount card via thunar.
Comment 3 tatxarata 2014-01-04 22:20:25 UTC
Created attachment 120911
Log when I mount and copy file from shell.
Comment 4 tatxarata 2014-01-04 22:21:10 UTC
Created attachment 120921
Capture from wireshark's device usbmon when I mount and copy file from shell.
Comment 5 tatxarata 2014-01-04 22:22:44 UTC
Created attachment 120931
Capture from wireshark's device usbmon when mounting via thunar in USB2.0 port.
Comment 6 Greg Kroah-Hartman 2014-01-05 03:39:50 UTC
On Sat, Jan 04, 2014 at 10:17:24PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=68161
> 
>             Bug ID: 68161
>            Summary: Unstable work of xhci with USB3.0 card reader and
>                     UDMA7 CompactFlash card.

Please send this to the linux-usb@vger.kernel.org mailing list.
Comment 7 tatxarata 2014-01-07 10:04:37 UTC
Since reporting this bug I've spent some time getting familiar with the USB
protocol and analyzing the attached capture files. It seems that the device
reset occurs after the device returns urb_status=-75 (-EOVERFLOW). This can be
seen in attachment https://bugzilla.kernel.org/attachment.cgi?id=120901 in
packet #1987. I've also noticed that the host tries to read the device in
chunks of 240 sectors, while the device returns no more than 120 sectors
(61440 bytes) per request.
The traffic clearly shows that EOVERFLOW occurs after the device is already mounted, while software is trying to browse its contents.
When I do something like 'dd if=/dev/sdb of=/dev/null', where sdb is the CF card, or mount and copy with shell commands, the host<->device communication pattern is the same (240 sectors requested, 120 returned), but it does not lead to EOVERFLOW. In those cases the read speed is about 80 MB/s.
So I suppose that something goes wrong only while software like thunar or midnight commander is browsing the contents of the card (maybe parallel threads trying to access the card simultaneously?).
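
The figures quoted above can be checked with a small Python sketch. It assumes 512-byte sectors (which is what 61440 bytes for 120 sectors implies) and relies on the platform's errno table, where 75 is EOVERFLOW on Linux. The helper names are illustrative only and do not come from any USB tool.

```python
import errno
import os

SECTOR_SIZE = 512  # bytes per sector; assumed, consistent with 61440 / 120


def sectors_to_bytes(sectors, sector_size=SECTOR_SIZE):
    """Convert a sector count into a byte count."""
    return sectors * sector_size


def describe_urb_status(status):
    """Translate a URB status (0 or a negative errno) into a readable string."""
    if status == 0:
        return "OK"
    code = -status
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


print(sectors_to_bytes(240))      # 122880 bytes requested per read
print(sectors_to_bytes(120))      # 61440 bytes seen in the capture
print(describe_urb_status(-75))   # names errno 75 (EOVERFLOW on Linux)
```

The mapping from 75 to EOVERFLOW is specific to Linux; other platforms number their errno values differently.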

With that knowledge I tried tweaking some device parameters in the /sys
filesystem. When I write the value 60 to /sys/block/sd?/queue/max_sectors_kb,
everything operates correctly without any resets. However, in this case the
read speed of the card drops by a factor of two, to around 40 MB/s.
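
A minimal Python sketch of that sysfs tweak follows. The path layout /sys/block/<device>/queue/max_sectors_kb is taken from the comment above; the function names, the device name "sdb", and the configurable sysfs_root parameter are assumptions added for illustration. The dry run writes to a temporary directory that stands in for /sys, so it needs neither root nor a real device.

```python
import tempfile
from pathlib import Path


def queue_attr_path(device, sysfs_root="/sys"):
    """Path of the max_sectors_kb attribute for a block device such as 'sdb'."""
    return Path(sysfs_root) / "block" / device / "queue" / "max_sectors_kb"


def set_max_sectors_kb(device, value_kb, sysfs_root="/sys"):
    """Write a new limit and return the value that was there before."""
    path = queue_attr_path(device, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text(f"{value_kb}\n")
    return previous


# Dry run against a temporary directory standing in for /sys.
with tempfile.TemporaryDirectory() as fake_sys:
    attr = queue_attr_path("sdb", fake_sys)
    attr.parent.mkdir(parents=True)
    attr.write_text("120\n")           # stand-in for the current value
    previous = set_max_sectors_kb("sdb", 60, fake_sys)
    current = int(attr.read_text())

print(previous, current)
```

On a real system the call would be set_max_sectors_kb("sdb", 60) run as root, and the setting does not survive unplugging the device.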

When I set max_sectors_kb to 64, the device still resets when mounted in thunar; surprisingly, however, the mount is not dropped, as it is with the default value of 120. In this case the read speed is about 89.5 MB/s, as expected from the card's specifications. So I've added a udev rule that sets max_sectors_kb to 64 when the device is connected. For now I can live with the 10-second mount latency as long as the device works properly afterwards. However, I think the cause of this issue should be identified and fixed.
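
The udev rule itself is not included in this report. As a hedged sketch of what such a rule could look like, the Python helper below assembles one rule line. The match keys (whole-disk block devices filtered by USB vendor and product ID) are an assumption about how the reader might be identified, and the IDs "0000" are placeholders, not the real IDs of the Transcend reader.

```python
def build_udev_rule(vendor_id, product_id, max_sectors_kb=64):
    """Return one udev rule line that sets queue/max_sectors_kb on a USB disk."""
    matches = [
        'ACTION=="add|change"',
        'SUBSYSTEM=="block"',
        'ENV{DEVTYPE}=="disk"',                   # whole disk, not partitions
        f'ATTRS{{idVendor}}=="{vendor_id}"',      # placeholder USB vendor ID
        f'ATTRS{{idProduct}}=="{product_id}"',    # placeholder USB product ID
    ]
    assignment = f'ATTR{{queue/max_sectors_kb}}="{max_sectors_kb}"'
    return ", ".join(matches + [assignment])


# Placeholder IDs; the real values would come from lsusb for the card reader.
rule = build_udev_rule("0000", "0000")
print(rule)
```

The printed line would go into a file under /etc/udev/rules.d/ (the file name is up to the user), after which the rule applies the next time the reader is plugged in.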

I also tried setting queue/scheduler to noop, with no effect.

With USB 2.0 the host<->device traffic looks much the same as with USB 3.0. The host also tries to read the device in chunks of 240 sectors and the device returns only 120. However, for some reason this does not lead to EOVERFLOW.

The main difference I've managed to find between USB 2.0 and 3.0 traffic is in device initialization. With 3.0 there are some CLEAR_FEATURE/SET_FEATURE requests that are absent with 2.0, so maybe the device operates differently for that reason.

I'm going to investigate further.

ps 
My main kernel for now is 3.10.17-gentoo; everything written above also holds for this version.
Comment 8 tatxarata 2014-01-08 19:19:51 UTC
Oops.. my bad...

It seems that wireshark misses some data while capturing on the usbmon device. Judging by the LBA addresses in consecutive SCSI commands, the host really does get 240 sectors from the device for each 240-sector request. On the other hand, for each such request the capture contains only one URB_BULK packet with data, and that data covers only 120 sectors (61440 bytes). As a consequence, the capture file is about half the size of the files transferred to create it.

In my previous analysis I took into account only the size of the URB_BULK packets and overlooked the difference between consecutive LBA addresses.

For such URB_BULK packets wireshark reports "URB length: 122880" and "Data length: 61440". Reading Documentation/usb/usbmon.txt did not clarify for me what this means: whether it is a limitation of usbmon or a feature of wireshark.

Sorry for the inconvenience.
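
The cross-check described in this comment can be sketched in Python. It assumes 512-byte sectors and reads "Data length" as the payload bytes stored in the capture and "URB length" as the size of the transfer; that reading is an interpretation, not something confirmed in this report. The start LBAs below are made-up values for three back-to-back sequential reads.

```python
SECTOR_SIZE = 512  # bytes per sector; assumed


def sectors_between(lbas):
    """Sectors implied by the start LBAs of consecutive sequential reads."""
    return [b - a for a, b in zip(lbas, lbas[1:])]


def captured_fraction(urb_length, data_length):
    """Share of a transfer's bytes that is present in the capture."""
    return data_length / urb_length


# Hypothetical start LBAs; only the spacing of 240 sectors matters here.
lbas = [2048, 2288, 2528]
steps = sectors_between(lbas)
implied_bytes = [s * SECTOR_SIZE for s in steps]
fraction = captured_fraction(122880, 61440)

print(steps)          # [240, 240]
print(implied_bytes)  # [122880, 122880]
print(fraction)       # 0.5
```

A fraction of 0.5 is consistent with the capture file being about half the size of the data actually transferred.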
Comment 9 Alan Stern 2014-04-09 18:26:12 UTC
Does the 3.14 kernel work any better?  It includes several changes to xhci-hcd.
Comment 10 tatxarata 2014-04-21 19:26:17 UTC
On 04/09/2014 10:26 PM, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=68161
>
> --- Comment #9 from Alan Stern <stern@rowland.harvard.edu> ---
> Does the 3.14 kernel work any better?  It includes several changes to
> xhci-hcd.
>

I've tested my setup with vanilla kernel version 3.14.1. Nothing changed 
at all.
Comment 11 Alan Stern 2015-03-13 16:03:04 UTC
It's been almost a year since there was any news about this bug.  Is it still a problem with the 3.19 or 4.0-rc kernels?
