Bug 3056

Summary: ehci "fatal error" at startup
Product: Drivers
Reporter: Dale K D (dale_d)
Component: USB
Assignee: David Brownell (dbrownell)
Status: CLOSED CODE_FIX
Severity: normal
CC: greg, tht
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: all kernels > 2.6.6
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: dmesg output from boot (2.6.7) with CONFIG_USB_DEBUG=Y
output from lspci -vv (2.6.7)
output of lspci -vv on Dell Poweredge/Tekram602/FedoraCore2-2.6.6
output of dmesg on Dell Poweredge/Tekram602/FedoraCore2-2.6.6
output of lspci -vv
output of lsusb -vv
patch working for ALI and Intel

Description Dale K D 2004-07-12 10:10:08 UTC
Distribution: Gentoo

Hardware Environment: Athlon

Problem Description: errors in dmesg from ehci, and USB 2 HDD will not work.

Steps to reproduce: Plug HDD into USB 2 port.

ehci_hcd 0000:00:10.3: EHCI Host Controller
ehci_hcd 0000:00:10.3: irq 10, pci mem d69fec00
ehci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 3
ehci_hcd 0000:00:10.3: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 6 ports detected
ehci_hcd 0000:00:10.3: fatal error
ehci_hcd 0000:00:10.3: HC died; cleaning up
process `named' is using obsolete setsockopt SO_BSDCOMPAT
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
usb usb3: string descriptor 0 read error: -19
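
The -19 in the repeated "string descriptor 0 read error" lines is a negated errno value. On Linux, errno 19 is ENODEV ("No such device"), which is consistent with the "HC died" line just before it: the root hub can no longer be read once its controller is gone. A minimal Python sketch for decoding such codes follows; the helper name describe_kernel_error is hypothetical and not part of any kernel tool.

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Translate a negated errno value (as printed in kernel logs) to text."""
    number = abs(code)
    # errno.errorcode maps numeric values to symbolic names such as "ENODEV".
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


print(describe_kernel_error(-19))
```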

The drive will detect/work properly with uhci.

I tried this with 2.6.8-rc1, as well as 2.6.7-mm6 and 2.6.7-mm7 and have the
exact same problem.
Comment 1 David Brownell 2004-07-12 11:16:18 UTC
Please post "lspci -vv" output, and the corresponding initialization
output with CONFIG_USB_DEBUG defined.  Also /var/log/dmesg
(dmesg output while the system is booting), including PCI/ACPI/IRQ
setup information.
 
Doesn't look like anything related to the Maxtor; EHCI is just 
getting a fatal error as it starts up.  Which is extremely rare, 
and almost certainly indicates a non-USB problem. 
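
When sifting through a long boot log for this symptom, it can help to list which controllers actually died. The following is a rough Python sketch, assuming the log lines have the same shape as the ones quoted in this report (optionally with a syslog prefix); the function name dead_controllers and the regular expression are illustrative, not an existing tool.

```python
import re
from typing import List

# Matches e.g. "ehci_hcd 0000:00:10.3: HC died; cleaning up", with or without
# a syslog prefix such as "Aug  2 08:38:10 blue2 kernel: " in front of it.
EHCI_LINE = re.compile(
    r"ehci_hcd (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): (?P<msg>.*)$"
)


def dead_controllers(log_text: str) -> List[str]:
    """Return the PCI addresses of EHCI controllers that logged 'HC died'."""
    dead: List[str] = []
    for line in log_text.splitlines():
        match = EHCI_LINE.search(line)
        if not match:
            continue
        if match.group("msg").startswith("HC died") and match.group("dev") not in dead:
            dead.append(match.group("dev"))
    return dead


sample = """\
ehci_hcd 0000:00:10.3: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
hub 3-0:1.0: 6 ports detected
ehci_hcd 0000:00:10.3: fatal error
ehci_hcd 0000:00:10.3: HC died; cleaning up
"""
print(dead_controllers(sample))
```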
Comment 2 Dale K D 2004-07-13 06:26:28 UTC
Created attachment 3342 [details]
dmesg output from boot (2.6.7) with CONFIG_USB_DEBUG=Y
Comment 3 Dale K D 2004-07-13 06:27:17 UTC
Created attachment 3343 [details]
output from lspci -vv (2.6.7)
Comment 4 Dale K D 2004-07-13 09:44:37 UTC
*** Bug 2897 has been marked as a duplicate of this bug. ***
Comment 5 David Brownell 2004-07-21 06:48:48 UTC
I'm guessing this is a PCI setup problem, possibly related 
to those VIA quirks (which break all VIA hardware I've got 
access to, fwiw). 
Comment 6 Jan Drewes 2004-08-01 23:48:00 UTC
Created attachment 3455 [details]
output of lspci -vv on Dell Poweredge/Tekram602/FedoraCore2-2.6.6
Comment 7 Jan Drewes 2004-08-01 23:48:54 UTC
Created attachment 3456 [details]
output of dmesg on Dell Poweredge/Tekram602/FedoraCore2-2.6.6
Comment 8 Jan Drewes 2004-08-01 23:51:40 UTC
I have the same problem using Fedora Core 2, kernel 2.6.6-1.435.2.3, on a Dell
PowerEdge 2600 system (ServerWorks chipset), equipped with a Tekram DC-602T
USB 2.0 PCI card based on a NEC chipset.
Loading ehci_hcd goes just fine, but as soon as I connect my Maxtor OneTouch
USB/FireWire drive to one of the Tekram's ports, I get the error mentioned above.
It is just the same when I modprobe ehci_hcd with the Maxtor already attached,
and it looks like this:
[root@blue2 /]# modprobe ehci_hcd
[root@blue2 /]# Aug  2 08:38:10 blue2 kernel: ehci_hcd 0000:0b:00.2: EHCI Host Controller
Aug  2 08:38:10 blue2 kernel: ehci_hcd 0000:0b:00.2: irq 11, pci mem 42899c00
Aug  2 08:38:10 blue2 kernel: ehci_hcd 0000:0b:00.2: new USB bus registered, assigned bus number 2
Aug  2 08:38:10 blue2 kernel: ehci_hcd 0000:0b:00.2: USB 2.0 enabled, EHCI 0.95, driver 2004-May-10
Aug  2 08:38:10 blue2 kernel: hub 2-0:1.0: USB hub found
Aug  2 08:38:10 blue2 kernel: hub 2-0:1.0: 5 ports detected
Aug  2 08:38:10 blue2 kernel: ehci_hcd 0000:0b:00.2: fatal error
Aug  2 08:38:10 blue2 kernel: ehci_hcd 0000:0b:00.2: HC died; cleaning up
uname -a
Linux blue2 2.6.6-1.435.2.3 #1 Thu Jul 1 08:25:29 EDT 2004 i686 i686 i386 GNU/Linux

I have added attachments with the output of lspci -vv and dmesg. The HDD runs
fine both with uhci on the mainboard's own USB 1 ports and with ohci on the NEC
card, but, of course, only very slowly.
 
Comment 9 Jan Drewes 2004-08-02 00:04:33 UTC
I forgot to add that the very same hard disk runs with ehci in USB 2 mode on the
following two systems:
Acer TravelMate 800 LCib laptop (Intel Centrino chipset)
Pentium 4 2.4 GHz on an ASUS mainboard, also with an Intel chipset.
Tested with different kernel versions from 2.6.5 to 2.6.7, including the very same
Fedora kernel 2.6.6 mentioned above.
Comment 10 Dale K D 2004-08-27 05:36:57 UTC
Is this problem ever going to be worked on?

The problem still exists with 2.6.8.1

I currently have a box stuck on 2.6.6 because of this bug which was introduced
in the 2.6.7 code.
Comment 11 David Brownell 2004-11-18 22:53:20 UTC
Note that bug #3033 has a similar report (along with the 
original bug report) of "hc died" on startup. 
 
FWIW I haven't reproduced this with a Maxtor or a few 
different controllers.  So it's not that widespread, 
though it's sort of puzzling nonetheless. 
 
Does 2.6.10-rc2 still have this failure?  Does the 
newish "usb-handoff" kernel boot parameter change 
the behavior? 
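
When testing a boot parameter like this, it is worth confirming that the running kernel actually received it. A minimal Python sketch follows, assuming the standard Linux /proc/cmdline file; the helper names has_boot_param and running_with_usb_handoff are made up for illustration.

```python
def has_boot_param(cmdline: str, name: str) -> bool:
    """True if `name` appears as a bare flag or as name=value."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def running_with_usb_handoff(path: str = "/proc/cmdline") -> bool:
    """Check the command line of the currently running kernel."""
    with open(path) as handle:
        return has_boot_param(handle.read(), "usb-handoff")


# Example with a made-up command line string:
print(has_boot_param("ro root=/dev/hda1 usb-handoff", "usb-handoff"))
```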
Comment 12 Stefan Hoelldampf 2005-01-05 03:02:54 UTC
Same problem here with Fedora Core 3 with kernel-2.6.9-1.724_FC3 and
kernel-2.6.10-1.727_FC3. The parameter "usb-handoff" did not help.

It's a PCMCIA-to-USB2.0 Adapter with NEC chipset and a USB2.0 HDD case with
Myson Century CS8818 chipset.

--------------------------------------------------------------------------------
PCI: Enabling device 0000:03:00.1 (0000 -> 0002)
ACPI: PCI interrupt 0000:03:00.1[B] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:03:00.1: OHCI Host Controller
PCI: Setting latency timer of device 0000:03:00.1 to 64
ohci_hcd 0000:03:00.1: irq 10, pci mem e0a7a000
ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 5
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
ehci_hcd 0000:03:00.2: fatal error
ehci_hcd 0000:03:00.2: HC died; cleaning up
ohci_hcd 0000:03:00.1: wakeup
usb 5-1: new full speed USB device using address 2
usb 5-1: not running at top speed; connect to a high speed hub
scsi1 : SCSI emulation for USB Mass Storage devices
  Vendor: FUJITSU   Model: MHR2040AT         Rev: 40BA
  Type:   Direct-Access                      ANSI SCSI revision: 02
SCSI device sdb: 78140160 512-byte hdwr sectors (40008 MB)
sdb: assuming drive cache: write through
 sdb: sdb1 sdb2 sdb3
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
USB Mass Storage device found at 2
--------------------------------------------------------------------------------
Comment 13 Stefan Hoelldampf 2005-01-05 03:06:21 UTC
Created attachment 4337 [details]
output of lspci -vv
Comment 14 Stefan Hoelldampf 2005-01-05 03:08:18 UTC
Created attachment 4338 [details]
output of lsusb -vv
Comment 15 David Brownell 2005-01-05 14:05:11 UTC
Created attachment 4340 [details]
patch working for ALI and Intel

I've submitted this patch for 2.6.11, but it should work
with older kernels too.

I'm still not sure what's triggering this chip bug, but for
at least some systems it clearly solves this bug.
Comment 16 Stefan Hoelldampf 2005-01-07 13:19:50 UTC
Thanks, the patch fixes the problem.
Comment 17 David Brownell 2005-01-10 22:00:20 UTC
*** Bug 4007 has been marked as a duplicate of this bug. ***
Comment 18 Thorsten Tasch 2005-01-13 01:19:56 UTC
I've tested the patch and now I can use USB 2.0 high-speed mode on an Acer
TravelMate 4000.

But it doesn't really work with my USB hard disk, since my computer loses the
connection to it very often. I sometimes have this problem with another
computer too, but this time I had to reboot because the whole USB subsystem
stopped working.
Comment 19 David Brownell 2005-01-13 19:45:14 UTC
Thorsten, you should have filed a new bug instead of re-opening
this old one.  As you said, you don't even have the same symptoms. 
So whatever you're seeing, it's a different bug. 
Comment 20 Thorsten Tasch 2005-01-15 02:38:34 UTC
Sorry, I thought the patch had some problems with my system, because without the
patch full-speed mode works without problems and with the patch my system
crashes (kernel 2.6.8.1). But now I've also tested the patch with kernel 2.6.10,
and with this version it works without problems. My USB hard disk is more
stable than ever before!