Created attachment 24145 [details]
Log of 2.6.32 kernel-messages (bzip2 compressed)

On a Gigabyte P55M-UD4 motherboard (Intel P55 Express chipset) with an Intel Core i7 860 CPU, ehci_hcd fails reproducibly while two high-speed usb-storage devices are in use at the same time.

I described the problem on the LKML some weeks ago, but I recently discovered that I can also provoke the error by using dd to read from an external DVD drive (USB 2.0) while writing the output to a USB HDD (also USB 2.0). These are two different devices, from other manufacturers, than the ones I described in the mail below. The (short) thread started here: http://lkml.org/lkml/2009/11/13/427

The problem becomes visible after the transfer of a CSW has failed. I've used all these external devices without any problems for months with different hardware (e.g. an Intel Core 2 Duo notebook and an Asus MB with a VIA chipset and an AMD64 X2), so I assume this is a problem specific to the P55 chipset (but I'm not sure).

This happens with all kernels I tried, including the stock Fedora 11 kernel, 2.6.31.6, 2.6.31.7 and 2.6.32 (the last three from git).

An excerpt of the log using kernel 2.6.32:
-------------------------
Dec 10 21:56:10 krabat kernel: [ 1544.700855] usb-storage: *** thread sleeping.
Dec 10 21:56:10 krabat kernel: [ 1544.700871] usb-storage: queuecommand called
Dec 10 21:56:10 krabat kernel: [ 1544.700880] usb-storage: *** thread awakened.
Dec 10 21:56:10 krabat kernel: [ 1544.700882] usb-storage: Command READ_10 (10 bytes)
Dec 10 21:56:10 krabat kernel: [ 1544.700884] usb-storage: 28 00 24 16 09 ea 00 00 f0 00
Dec 10 21:56:10 krabat kernel: [ 1544.700893] usb-storage: Bulk Command S 0x43425355 T 0x5f88 L 122880 F 128 Trg 0 LUN 0 CL 10
Dec 10 21:56:10 krabat kernel: [ 1544.700897] usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
Dec 10 21:56:10 krabat kernel: [ 1544.724795] usb-storage: Status code 0; transferred 31/31
Dec 10 21:56:10 krabat kernel: [ 1544.724799] usb-storage: -- transfer complete
Dec 10 21:56:10 krabat kernel: [ 1544.724801] usb-storage: Bulk command transfer result=0
Dec 10 21:56:10 krabat kernel: [ 1544.724805] usb-storage: usb_stor_bulk_transfer_sglist: xfer 122880 bytes, 3 entries
Dec 10 21:56:10 krabat kernel: [ 1544.729842] usb-storage: Status code 0; transferred 122880/122880
Dec 10 21:56:10 krabat kernel: [ 1544.729844] usb-storage: -- transfer complete
Dec 10 21:56:10 krabat kernel: [ 1544.729846] usb-storage: Bulk data transfer result 0x0
Dec 10 21:56:10 krabat kernel: [ 1544.729848] usb-storage: Attempting to get CSW...
Dec 10 21:56:10 krabat kernel: [ 1544.729851] usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
Dec 10 21:56:10 krabat kernel: [ 1544.729934] usb-storage: Status code 0; transferred 122880/122880
Dec 10 21:56:10 krabat kernel: [ 1544.729937] usb-storage: -- transfer complete
Dec 10 21:56:10 krabat kernel: [ 1544.729939] usb-storage: Bulk data transfer result 0x0
Dec 10 21:56:10 krabat kernel: [ 1544.729941] usb-storage: Attempting to get CSW...
Dec 10 21:56:10 krabat kernel: [ 1544.729944] usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
Dec 10 21:56:40 krabat kernel: [ 1574.887650] usb-storage: command_abort called
Dec 10 21:56:40 krabat kernel: [ 1574.887654] usb-storage: usb_stor_stop_transport called
Dec 10 21:56:40 krabat kernel: [ 1574.887656] usb-storage: -- cancelling URB
Dec 10 21:56:40 krabat kernel: [ 1574.899662] usb-storage: command_abort called
Dec 10 21:56:40 krabat kernel: [ 1574.899666] usb-storage: usb_stor_stop_transport called
Dec 10 21:56:40 krabat kernel: [ 1574.899668] usb-storage: -- cancelling URB
Dec 10 21:56:40 krabat kernel: [ 1574.899686] usb-storage: Status code -104; transferred 0/13
Dec 10 21:56:40 krabat kernel: [ 1574.899688] usb-storage: -- transfer cancelled
Dec 10 21:56:40 krabat kernel: [ 1574.899691] usb-storage: Bulk status result = 4
Dec 10 21:56:40 krabat kernel: [ 1574.899693] usb-storage: -- command was aborted
Dec 10 21:56:40 krabat kernel: [ 1574.899734] usb-storage: usb_stor_pre_reset
Dec 10 21:56:40 krabat kernel: [ 1574.911664] usb-storage: Status code -104; transferred 13/13
Dec 10 21:56:40 krabat kernel: [ 1574.911668] usb-storage: -- transfer cancelled
Dec 10 21:56:40 krabat kernel: [ 1574.911670] usb-storage: Bulk status result = 4
Dec 10 21:56:40 krabat kernel: [ 1574.911672] usb-storage: -- command was aborted
Dec 10 21:56:40 krabat kernel: [ 1574.911676] usb-storage: usb_stor_pre_reset
D
-------------------------

As described in the thread on the LKML, ehci_hcd is unusable after that, and I have to unload and reload the module (or use unbind to reset ehci_hcd).

I've attached a stripped-down log (minus ~1GB of debug lines without failures) produced with kernel 2.6.32, containing all kernel messages, together with the outputs of lspci -vvv and lsusb -vvv.

Kind regards,
Alexander Holler
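For anyone reading the excerpt above, here is a minimal sketch of how the READ_10 command block and the "S 0x43425355" signature from the log can be decoded. The function name decode_read10 and the 512-byte block size are assumptions made for illustration (they are not taken from the usb-storage driver); the byte layout is the standard SCSI READ(10) format.

```python
import struct


def decode_read10(cdb_hex: str, block_size: int = 512) -> dict:
    """Decode a SCSI READ(10) command block given as space-separated hex bytes.

    block_size is an assumption; 512 bytes is typical for hard disks.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) command block")
    # Big-endian layout: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length in blocks, control byte.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks, "bytes": blocks * block_size}


# The command block from the log line following "Command READ_10 (10 bytes)".
request = decode_read10("28 00 24 16 09 ea 00 00 f0 00")
print(request)

# The CBW signature is stored little-endian on the wire.
print(struct.pack("<I", 0x43425355))
```

With a 512-byte block size, the 240 requested blocks come to 122880 bytes, which agrees with the "L 122880" field of the Bulk Command line and with the size of the data transfer that completes before the CSW request is issued.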
Created attachment 24146 [details] Output of lspci -vvv (bzip2 compressed)
Created attachment 24147 [details] Output of lsusb -vvv (bzip2 compressed)
Created attachment 24151 [details]
Log of 2.6.31.6 kernel-messages (bzip2 compressed)

I've added another log, produced with kernel 2.6.31.6, where I got the error during a dd from a USB DVD drive to a USB HDD. The log with kernel 2.6.32 was produced using two USB HDDs and running rsync between two partitions that live in two large encrypted files (using loop and dm-crypt).
Some more details: I'm using the newest BIOS F5 (description "Update for P55 B3 stepping") which is currently available for this MB, and all devices were attached directly to ports on the MB, so no external USB-hubs are involved.
Same problem here. Since I installed the latest stable kernel, 2.6.32, I'm having the same issue on my Linux box with an external high-speed USB 2.0 HDD. After working on it for some time, the external USB 2.0 HDD stops responding, and /var/log/messages contains a "reset high speed USB device using ehci_hcd" message.

For me too, the "ehci_hcd" module is unusable after that, and I have to unload and reload the module (or use unbind to reset ehci_hcd).

Some details about my hardware and software configuration (in case they are useful):
- MB Intel DP43TF (P43 Express chipset) with the latest BIOS available
- Slackware 13.0
- Gnome 2.28.1 (gnome-slackbuild)
- external LaCie Desktop Hard Disk Hi-Speed USB 2.0: the internal disk is a 500GB SATA HDD (Seagate ST3500820AS); the box is connected to one of the four motherboard USB ports (no USB hubs here)
- the external HDD has a FAT32 and an NTFS-5 partition

I noticed the issue only after the kernel update to the latest stable version. I didn't make any changes to my Linux box, and I've been using the same software and hardware configuration for months. I used different kernels (mainly 2.6.30.5 and 2.6.31.6) without any problem.

Thanks,
Andrea from Italy

PS: I'll attach what I think are the most interesting parts of /var/log/messages, daemon, debug and syslog generated on my system when the issue happens.
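Since the unbind route is mentioned twice in this report, here is a rough sketch of what it amounts to: writing the controller's PCI address to the driver's unbind and bind files in sysfs. The helper name rebind_driver, the default driver directory /sys/bus/pci/drivers/ehci_hcd and the address 0000:00:1a.0 are assumptions for illustration only. The directory name differs between kernel versions (newer kernels use ehci-pci), and the real address has to be taken from lspci. It must be run as root.

```python
import time
from pathlib import Path


def rebind_driver(pci_address: str,
                  driver_dir: Path = Path("/sys/bus/pci/drivers/ehci_hcd"),
                  delay: float = 1.0) -> None:
    """Detach one PCI device from its driver and attach it again via sysfs.

    driver_dir is an assumed default; check which directory exists on
    the running kernel before using it.
    """
    # Writing the address to "unbind" detaches the controller from the driver.
    (driver_dir / "unbind").write_text(pci_address)
    # Short pause before re-attaching; the length is an arbitrary choice.
    time.sleep(delay)
    # Writing the same address to "bind" makes the driver probe it again.
    (driver_dir / "bind").write_text(pci_address)


# Hypothetical usage (address is a placeholder, not taken from this report):
# rebind_driver("0000:00:1a.0")
```

This only resets the host controller driver state; it is a recovery step, not a fix for the underlying failure.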
Created attachment 24169 [details]
Various Logs (compressed tar.gz file)
Workaround for this bug:
CONFIG_NO_HZ=n
CONFIG_HZ=100
CONFIG_HZ_100=y
HZ must be <1000.

I found this with kernel 2.6.27-gentoo-r10 and tested it with 2.6.33-zen1. The bug is present in all versions, but it can take a long time for the error to appear (~8h of continuous reading on an otherwise idle system with no user activity).

Test system: i5-750/GA-P55M-UD4.
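To compare reports, it may help to check which timer options a given kernel was actually built with. The sketch below extracts CONFIG_NO_HZ and CONFIG_HZ from the text of a kernel config file; the function name timer_settings is made up for this example, and where the config text comes from (for instance /proc/config.gz, which exists only when the kernel was built with CONFIG_IKCONFIG_PROC) is left to the caller.

```python
def timer_settings(config_text: str) -> dict:
    """Extract the tick-related options from the text of a kernel .config."""
    wanted = ("CONFIG_NO_HZ", "CONFIG_HZ")
    suffix = " is not set"
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("# ") and line.endswith(suffix):
            # Disabled options appear as "# CONFIG_FOO is not set".
            name, value = line[2:-len(suffix)], "n"
        elif "=" in line and not line.startswith("#"):
            name, value = line.split("=", 1)
        else:
            continue
        if name in wanted:
            found[name] = value
    return found


# Sample text matching the settings listed in the workaround.
sample = "# CONFIG_NO_HZ is not set\nCONFIG_HZ_100=y\nCONFIG_HZ=100\n"
print(timer_settings(sample))
```

On a system that provides /proc/config.gz, the text can be read with gzip.open("/proc/config.gz", "rt") and passed to the function unchanged.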
I can't confirm this workaround. Using 2.6.33.2, I get the same failures as described above with a tickless config (CONFIG_NO_HZ=y) as well as with CONFIG_HZ=100 or 250. As usual, it takes between 10 and 40 minutes to provoke the error when using rsync between two USB HDDs (as described above).
I also have USB reliability problems, easily reproducible just by testing an SDHC memory card in a card reader with the 'badblocks' tool in write mode (-w option). Could one of the multiple P55 chipset USB-related errata [1] require a workaround in the kernel to resolve this issue?

Test system: i7-860/ASUS P7P55D-E

1. http://www.intel.com/Assets/PDF/specupdate/322170.pdf
Upgrading the BIOS to the recently released version 1504 for the ASUS P7P55D-E fixed the 'badblocks' problem for me. For anyone still suffering from USB reliability issues, I would suggest checking whether a BIOS update is also available for your motherboard.

Interestingly, Intel is probably also not very happy about the quality of BIOS implementations and the lack of (timely) errata fixes for the motherboards based on their chips: http://lwn.net/Articles/429812/
I've upgraded my BIOS (again) and this Intel-HW is still unusable to copy/modify files on two usb-storage devices at the same time (vanilla 2.6.38.2).
All USB bugs should be sent to the linux-usb@vger.kernel.org mailing list, and not entered into bugzilla. Please bring this issue up there, if it is still a problem in the latest kernel release.