Bug 14785 - ehci_hcd (usb-storage) not reliable with p55 and Core i7 860
Status: RESOLVED INVALID
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Greg Kroah-Hartman
Reported: 2009-12-10 22:18 UTC by Alexander Holler
Modified: 2012-02-22 21:47 UTC

Kernel Version: 2.6.38.2 2.6.33.2 2.6.32 2.6.31.7 2.6.30.9-96.fc11.x86_64
Regression: No

Attachments
Log of 2.6.32 kernel-messages (bzip2 compressed) (21.38 KB, application/x-bzip)
2009-12-10 22:18 UTC, Alexander Holler
Output of lspci -vvv (bzip2 compressed) (4.58 KB, application/x-bzip)
2009-12-10 22:19 UTC, Alexander Holler
Output of lsusb -vvv (bzip2 compressed) (2.29 KB, application/x-bzip)
2009-12-10 22:19 UTC, Alexander Holler
Log of 2.6.31.6 kernel-messages (bzip2 compressed) (53.10 KB, application/x-bzip)
2009-12-10 23:23 UTC, Alexander Holler
Various Logs (compressed tar.gz file) (26.12 KB, application/x-gzip)
2009-12-13 16:18 UTC, dasnoopy

Description Alexander Holler 2009-12-10 22:18:11 UTC
Created attachment 24145
Log of 2.6.32 kernel-messages (bzip2 compressed)

Using a Gigabyte MB P55M-UD4 (Intel P55 Express chipset) equipped with an Intel Core i7 860 CPU, ehci_hcd fails reproducibly while working with two high-speed usb-storage devices.

I described the problem on the LKML some weeks ago, but I recently discovered that I can also provoke the error by using dd to read from an external DVD drive (USB 2.0) while writing the output to a USB HDD (also USB 2.0). These are two different devices, from other manufacturers, than the ones I described in the mail below.

The (short) thread started here: http://lkml.org/lkml/2009/11/13/427

The problem becomes visible after the transfer of a CSW fails. I've used all these external devices for months without any problems on different HW (e.g. an Intel Core 2 Duo notebook and an Asus MB with a VIA chipset and an AMD64 X2), so I assume this is a problem specific to the P55 chipset (but I'm not sure).

This happens with all kernels I tried, including the stock Fedora 11 kernel, 2.6.31.6, 2.6.31.7 and 2.6.32 (the last three from git).

An excerpt of the log using kernel 2.6.32:

-------------------------
Dec 10 21:56:10 krabat kernel: [ 1544.700855] usb-storage: *** thread sleeping.
Dec 10 21:56:10 krabat kernel: [ 1544.700871] usb-storage: queuecommand called
Dec 10 21:56:10 krabat kernel: [ 1544.700880] usb-storage: *** thread awakened.
Dec 10 21:56:10 krabat kernel: [ 1544.700882] usb-storage: Command READ_10 (10 bytes)
Dec 10 21:56:10 krabat kernel: [ 1544.700884] usb-storage:  28 00 24 16 09 ea 00 00 f0 00
Dec 10 21:56:10 krabat kernel: [ 1544.700893] usb-storage: Bulk Command S 0x43425355 T 0x5f88 L 122880 F 128 Trg 0 LUN 0 CL 10
Dec 10 21:56:10 krabat kernel: [ 1544.700897] usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
Dec 10 21:56:10 krabat kernel: [ 1544.724795] usb-storage: Status code 0; transferred 31/31
Dec 10 21:56:10 krabat kernel: [ 1544.724799] usb-storage: -- transfer complete
Dec 10 21:56:10 krabat kernel: [ 1544.724801] usb-storage: Bulk command transfer result=0
Dec 10 21:56:10 krabat kernel: [ 1544.724805] usb-storage: usb_stor_bulk_transfer_sglist: xfer 122880 bytes, 3 entries
Dec 10 21:56:10 krabat kernel: [ 1544.729842] usb-storage: Status code 0; transferred 122880/122880
Dec 10 21:56:10 krabat kernel: [ 1544.729844] usb-storage: -- transfer complete
Dec 10 21:56:10 krabat kernel: [ 1544.729846] usb-storage: Bulk data transfer result 0x0
Dec 10 21:56:10 krabat kernel: [ 1544.729848] usb-storage: Attempting to get CSW...
Dec 10 21:56:10 krabat kernel: [ 1544.729851] usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
Dec 10 21:56:10 krabat kernel: [ 1544.729934] usb-storage: Status code 0; transferred 122880/122880
Dec 10 21:56:10 krabat kernel: [ 1544.729937] usb-storage: -- transfer complete
Dec 10 21:56:10 krabat kernel: [ 1544.729939] usb-storage: Bulk data transfer result 0x0
Dec 10 21:56:10 krabat kernel: [ 1544.729941] usb-storage: Attempting to get CSW...
Dec 10 21:56:10 krabat kernel: [ 1544.729944] usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
Dec 10 21:56:40 krabat kernel: [ 1574.887650] usb-storage: command_abort called
Dec 10 21:56:40 krabat kernel: [ 1574.887654] usb-storage: usb_stor_stop_transport called
Dec 10 21:56:40 krabat kernel: [ 1574.887656] usb-storage: -- cancelling URB
Dec 10 21:56:40 krabat kernel: [ 1574.899662] usb-storage: command_abort called
Dec 10 21:56:40 krabat kernel: [ 1574.899666] usb-storage: usb_stor_stop_transport called
Dec 10 21:56:40 krabat kernel: [ 1574.899668] usb-storage: -- cancelling URB
Dec 10 21:56:40 krabat kernel: [ 1574.899686] usb-storage: Status code -104; transferred 0/13
Dec 10 21:56:40 krabat kernel: [ 1574.899688] usb-storage: -- transfer cancelled
Dec 10 21:56:40 krabat kernel: [ 1574.899691] usb-storage: Bulk status result = 4
Dec 10 21:56:40 krabat kernel: [ 1574.899693] usb-storage: -- command was aborted
Dec 10 21:56:40 krabat kernel: [ 1574.899734] usb-storage: usb_stor_pre_reset
Dec 10 21:56:40 krabat kernel: [ 1574.911664] usb-storage: Status code -104; transferred 13/13
Dec 10 21:56:40 krabat kernel: [ 1574.911668] usb-storage: -- transfer cancelled
Dec 10 21:56:40 krabat kernel: [ 1574.911670] usb-storage: Bulk status result = 4
Dec 10 21:56:40 krabat kernel: [ 1574.911672] usb-storage: -- command was aborted
Dec 10 21:56:40 krabat kernel: [ 1574.911676] usb-storage: usb_stor_pre_reset
D
-------------------------
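
The READ_10 command and the "Bulk Command" line in the excerpt above can be cross-checked by hand. The following Python sketch decodes the ten CDB bytes from the log and rebuilds the 31-byte command block wrapper (CBW) announced by "xfer 31 bytes". The field layouts follow the SCSI READ(10) and USB Mass Storage Bulk-Only Transport formats; the function names are only illustrative, and the 512-byte block size is an assumption inferred from the numbers in the log, not something the report states.

```python
import struct

def decode_read10(cdb: bytes) -> dict:
    """Decode a 10-byte SCSI READ(10) command descriptor block."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Fields: opcode, flags, 32-bit LBA, group number, 16-bit block count, control.
    # Multi-byte SCSI fields are big-endian.
    _op, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks}

def build_cbw(tag: int, data_len: int, flags: int, lun: int, cdb: bytes) -> bytes:
    """Build a Bulk-Only Transport CBW: little-endian header plus CDB padded to 16 bytes."""
    if not 1 <= len(cdb) <= 16:
        raise ValueError("CDB must be 1 to 16 bytes long")
    return struct.pack("<IIIBBB16s", 0x43425355, tag, data_len, flags, lun, len(cdb), cdb)

# Values copied from the log excerpt (S, T, L, F, LUN and the CDB bytes).
cdb = bytes.fromhex("28 00 24 16 09 ea 00 00 f0 00")
info = decode_read10(cdb)
cbw = build_cbw(tag=0x5F88, data_len=122880, flags=128, lun=0, cdb=cdb)

# Block count, and the byte count under the assumed 512 bytes per block.
print(info["blocks"], info["blocks"] * 512)
# CBW length and its first four bytes (the signature as sent on the wire).
print(len(cbw), cbw[:4])
```

With the values from the log, the block count multiplied by 512 matches the L field of the "Bulk Command" line, and the flag value 128 (0x80) marks a device-to-host transfer, which fits a read.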

As described in the thread on the LKML, ehci_hcd is unusable after that and I have to unload and reload the module (or use unbind to reset ehci_hcd).
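
The unbind approach mentioned above can be scripted through sysfs. This is a minimal Python sketch, assuming the controller is bound to a PCI driver directory named ehci_hcd under /sys/bus/pci/drivers (the name may differ on other kernel versions) and that the script runs as root. The function name is illustrative, and the PCI address in the usage comment is hypothetical; the real one should be taken from the lspci output.

```python
from pathlib import Path

# Assumed sysfs driver directory; check /sys/bus/pci/drivers on the running kernel.
DRIVER_DIR = Path("/sys/bus/pci/drivers/ehci_hcd")

def rebind(pci_address: str, driver_dir: Path = DRIVER_DIR) -> None:
    """Detach the controller from the driver, then attach it again."""
    (driver_dir / "unbind").write_text(pci_address)
    (driver_dir / "bind").write_text(pci_address)

# Example (as root), with a hypothetical controller address:
#   rebind("0000:00:1a.0")
```

Writing the address to unbind and then to bind resets only the one controller, whereas unloading the module affects every EHCI controller in the system.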

I've attached a stripped-down log produced with kernel 2.6.32 (minus ~1GB of debug lines without failures) containing all kernel messages, along with the outputs of lspci -vvv and lsusb -vvv.
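
Finding the failures in a debug log of that size is easier with a small filter. The Python sketch below looks for the pattern visible in the excerpt: an "Attempting to get CSW..." line followed, much later, by "command_abort called". The function name and the 10-second threshold are assumptions chosen for illustration. In the excerpt the gap is about 30 seconds, which would be consistent with a command timeout, although the report does not say so.

```python
import re

# Kernel timestamp such as "[ 1544.729848]".
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")

def find_stalls(lines, min_gap=10.0):
    """Return (csw_time, abort_time) pairs where an abort follows a CSW request late."""
    stalls = []
    last_csw = None
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue
        stamp = float(match.group(1))
        if "Attempting to get CSW" in line:
            # Remember the most recent CSW request.
            last_csw = stamp
        elif "command_abort called" in line and last_csw is not None:
            if stamp - last_csw >= min_gap:
                stalls.append((last_csw, stamp))
            # Report each stall once, even if several aborts follow.
            last_csw = None
    return stalls

# Example: pass an open log file, e.g. find_stalls(open("kernel.log", errors="replace"))
```

Because the function only keeps the latest CSW timestamp, it can stream through a very large log without holding it in memory.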

Kind regards,

Alexander Holler
Comment 1 Alexander Holler 2009-12-10 22:19:12 UTC
Created attachment 24146
Output of lspci -vvv (bzip2 compressed)
Comment 2 Alexander Holler 2009-12-10 22:19:50 UTC
Created attachment 24147
Output of lsusb -vvv (bzip2 compressed)
Comment 3 Alexander Holler 2009-12-10 23:23:54 UTC
Created attachment 24151
Log of 2.6.31.6 kernel-messages (bzip2 compressed)

I've added another log, produced with kernel 2.6.31.6, where I got the error during a dd from a USB DVD drive to a USB HDD.

The log with kernel 2.6.32 was produced using two USB HDDs while running rsync between two partitions in two large encrypted files (using loop and dm-crypt).
Comment 4 Alexander Holler 2009-12-12 12:46:52 UTC
Some more details: I'm using the newest BIOS currently available for this MB, F5 (description: "Update for P55 B3 stepping"), and all devices were attached directly to ports on the MB, so no external USB hubs are involved.
Comment 5 dasnoopy 2009-12-13 16:17:35 UTC
Same problem here. Since I installed the latest stable kernel, 2.6.32, I'm having the same issue on my Linux box with an external high-speed USB 2.0 HDD.

After working with it for some time, the external USB 2.0 HDD stops responding, and in /var/log/messages you can find a "reset high speed USB device using ehci_hcd" message.

For me too, the "ehci_hcd" module is unusable after that, and I have to unload and reload the module (or use unbind to reset ehci_hcd).

Some details about my HW and SW configuration (in case they are useful):

- MB Intel DP43TF (P43 Express chipset) with the latest BIOS available
- Slackware 13.0
- Gnome 2.28.1 (gnome-slackbuild)
- external LaCie Desktop Hard Disk Hi-Speed USB 2.0: the internal disk is a 500GB SATA HDD (Seagate ST3500820AS); the box is connected to one of the four motherboard USB ports (no USB hubs here)
- the external HDD has a FAT32 and an NTFS-5 partition

I noticed the issue only after the kernel update to the latest stable version.
I didn't make any changes to my Linux box, and I have been using the same SW and HW configuration for months. I used different kernels (mainly 2.6.30.5 and 2.6.31.6) without any problem.

Thanks
Andrea from Italy

PS: I'll attach what I think are the most interesting parts of /var/log/messages, daemon, debug and syslog generated on my system when the issue happens.
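
To check whether a given system shows the same symptom, the reset message quoted above can be searched for in the system log. A minimal Python sketch, assuming the message text appears verbatim in each affected line; the function name is illustrative and the log path in the usage comment is simply the one mentioned in this comment.

```python
# Message text as quoted in the comment above.
MARKER = "reset high speed USB device using ehci_hcd"

def find_resets(lines):
    """Return every log line that contains the EHCI reset message."""
    return [line.rstrip("\n") for line in lines if MARKER in line]

# Example: find_resets(open("/var/log/messages", errors="replace"))
```

A single reset is not necessarily a sign of this bug; repeated resets of the same device during sustained transfers would be the pattern to look for.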
Comment 6 dasnoopy 2009-12-13 16:18:50 UTC
Created attachment 24169
Various Logs (compressed tar.gz file)
Comment 7 Timur Maximov 2010-04-03 12:30:30 UTC
Workaround for this bug:

CONFIG_NO_HZ=n
CONFIG_HZ=100
CONFIG_HZ_100=y

HZ must be <1000

This was found on kernel 2.6.27-gentoo-r10 and tested on 2.6.33-zen1. The bug is present in all versions, but it can take a long time to trigger the error (~8h of continuous reading on an otherwise idle, unattended system).

Test system: i5-750/GA-P55M-UD4.
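
Whether a kernel was built with the settings proposed above can be checked from its config file. The Python sketch below parses kernel config text and tests the two conditions from this comment (tickless mode off, HZ below 1000). The function names are illustrative, and where the config text comes from is an assumption: typically a file such as /boot/config-<release>, or /proc/config.gz if the kernel exposes it.

```python
def parse_kconfig(text: str) -> dict:
    """Parse kernel config text into a dict; unset options map to "n"."""
    suffix = " is not set"
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Disabled options appear as "# CONFIG_FOO is not set".
            cfg[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            cfg[key] = value.strip('"')
    return cfg

def matches_workaround(cfg: dict) -> bool:
    """True if tickless mode is off and the timer frequency is below 1000 Hz."""
    tickless = cfg.get("CONFIG_NO_HZ", "n") == "y"
    hz = int(cfg.get("CONFIG_HZ", "0"))
    return not tickless and 0 < hz < 1000

# Example: matches_workaround(parse_kconfig(open("/boot/config-2.6.33").read()))
```

The path in the usage comment is only a placeholder for wherever the running kernel's config is stored.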
Comment 8 Alexander Holler 2010-04-08 15:04:38 UTC
I can't confirm this workaround.

Using 2.6.33.2, I get the same failures as described above with a tickless config (CONFIG_NO_HZ=y) or with CONFIG_HZ=100 or 250. As usual, it takes between 10 and 40 minutes to provoke the error when running rsync between two USB HDDs (as described above).
Comment 9 Siarhei Siamashka 2011-01-31 20:00:58 UTC
I also have USB reliability problems, easily reproducible by testing an SDHC memory card in a card reader with the 'badblocks' tool in write mode (-w option).

Could one of the multiple USB-related errata for the P55 chipset [1] require a workaround in the kernel to resolve this issue?

Test system: i7-860/ASUS P7P55D-E

1. http://www.intel.com/Assets/PDF/specupdate/322170.pdf
Comment 10 Siarhei Siamashka 2011-02-27 20:51:25 UTC
A BIOS upgrade to the recently released version 1504 for the ASUS P7P55D-E fixed the 'badblocks' problem for me. For anyone still suffering from USB reliability issues, I would suggest checking whether a BIOS update is also available for your motherboard.

It's interesting that Intel is probably also not very happy about the quality of BIOS implementations and the lack of (timely) errata fixes for motherboards based on their chips: http://lwn.net/Articles/429812/
Comment 11 Alexander Holler 2011-04-14 08:25:51 UTC
I've upgraded my BIOS (again), and this Intel HW is still unusable for copying/modifying files on two usb-storage devices at the same time (vanilla 2.6.38.2).
Comment 12 Greg Kroah-Hartman 2012-02-22 21:47:36 UTC
All USB bugs should be sent to the linux-usb@vger.kernel.org mailing 
list, and not entered into bugzilla.  Please bring this issue up there,
if it is still a problem in the latest kernel release.
