Bug 8535
Description
Nicolas Mailhot
2007-05-24 11:02:48 UTC
Created attachment 11585 [details]
kernel config
Created attachment 11586 [details]
Xorg logs
Created attachment 11587 [details]
xorg conf
Created attachment 11588 [details]
kernel logs
Created attachment 11589 [details]
lsusb
Created attachment 11590 [details]
lspci
*** Bug 8534 has been marked as a duplicate of this bug. *** Can you duplicate this with CONFIG_USB_DEBUG turned on? It would help also to see a usbmon trace of an incident. Follow the instructions in the kernel source file Documentation/usb/usbmon.txt. There are really only a few minor patches in HID area between 2.6.22-rc2 and 2.6.22-rc2-mm2, I'd be surprised that they could be causing what you see, they are basically just adding some quirks which apply to only a few hardware. But just to make sure - could you please revert git-hid.patch and report if you still can reproduce the problem? Thanks. Ok Will try all this friday evening or next tuesday (bed calls now, and we've got a 3-day weekend) To be honest I first dismissed this as a fluke and hoped the newt mm would just fix it. But it's happened several times already, and never on mainline Report received from Zan Lynx: drivers/hid/hid-core.c: report (size 3) (unnumbered) drivers/hid/hid-core.c: report 0 (size 3) = 00 00 00 hid-debug: input Keyboard.00e0 = 0 hid-debug: input Keyboard.00e1 = 0 hid-debug: input Keyboard.00e2 = 0 hid-debug: input Keyboard.00e3 = 0 hid-debug: input Keyboard.00e4 = 0 hid-debug: input Keyboard.00e5 = 0 hid-debug: input Keyboard.00e6 = 0 hid-debug: input Keyboard.00e7 = 0 drivers/hid/hid-core.c: report (size 4294967284) (unnumbered) drivers/hid/hid-core.c: report 0 (size 4294967284) = hid-debug: input Keyboard.00e0 = 0 hid-debug: input Keyboard.00e1 = 0 hid-debug: input Keyboard.00e2 = 0 hid-debug: input Keyboard.00e3 = 0 hid-debug: input Keyboard.00e4 = 0 hid-debug: input Keyboard.00e5 = 0 hid-debug: input Keyboard.00e6 = 0 hid-debug: input Keyboard.00e7 = 0 drivers/hid/hid-core.c: report (size 6) (numbered) drivers/hid/hid-core.c: report 2 (size 5) = 00 00 00 00 00 hid-debug: input Button.0001 = 0 hid-debug: input Button.0002 = 0 hid-debug: input Button.0003 = 0 hid-debug: input Button.0004 = 0 hid-debug: input Button.0005 = 0 hid-debug: input Button.0006 = 0 hid-debug: input Button.0007 = 0 hid-debug: input Button.0008 = 0 hid-debug: input GenericDesktop.X = 0 hid-debug: input GenericDesktop.Y = 0 hid-debug: input GenericDesktop.Wheel = 0 Hangcheck: hangcheck value past margin! drivers/hid/usbhid/hid-core.c: input irq status -75 received i.e. it seems that we are being passed URB with urb->actual_length == 4294967284 from USB core. Jiri, do you want me to revert git-hid.patch or try CONFIG_USB_DEBUG + usbmon first? It's be great if you would have time to test in this order: - check whether it is reproducible without CONFIG_HIDRAW - check whether it is reproducible with git-hid.patch reverted - if it is still reproducible, then there is something wrong in usb core, so we'd definitely need the usbmon traces and let someone (Alan, Greg, Oliver, etc) to try to check usb core patches Thanks a lot Ok, I'll do this. So far I've only added USB_DEBUG + usbmon I had a mouse dead event but it was not followed by the usual keyboard+system death. So I supposed it's somewhere in my usbmon logs but since it's not timestamped I don't known where (25M of usbmon logs so far) Do you want the USBMON logs by PM anyway ? I'll rebuild a kernel without CONFIG_HIDRAW now (a pity, I suppose CONFIG_HIDRAW is responsible for my keyboard MM keys working) Created attachment 11601 [details]
Kernel logs with usb debug
Here are new logs with USB_DEBUG on as demanded by Alan Stern
The usbmon trace is too big I'll try to send it by PM
Mouse was lost several times (restored by switching to the console and
restarting gpm, which killed X), the last time followed by a system freeze
I'm testing without CONFIG_HIDRAW now
I've been greeted with a keyboard death this morning, which sort of answers if the bug happens without CONFIG_HIDRAW usb 2-8.3: unlink qh8-1c02/ffff810005ad6780 start 7 [1/2 us] Next step on Jiri's list is to try with git-hid.patch reverted Thanks a lot for your effort. Just out of curiosity - do you have any other USB devices, or are the HID keyboard and mice the only ones connected to the system? Please let me know how's the situation with git-hid reverted. Thanks a lot. There is a webcam that is connected but never used (gift, waits for a driver in mainline), a dangling harmony cable waiting for Logitech Linux support, and various stuff (smartmedia reader, ipod, X5) with is only connected for short periods of times every other week. And definitely never on the problem kernels Ok, just had an incident with git-hid.patch reverted 3 min from boot to system death that must be a record 16:36:57 Linux version 2.6.22-0.9.rc2.mm1.cfs14.fc8.nim (mockbuild@rousalka.dyndns.org) (gcc version 4.1.2 20070502 (Red Hat 4.1.2-12)) #1 SMP PREEMPT Sun May 27 05:26:42 EDT 2007 [ OK, so we have ruled out HID-related changes completely. I'd bet this is related to USB core, so let's see if Alan comes with any proposal (in the meantime I'll go through USB-related patches in -mm and see whether I'll be able to spot something alarming). I have been trying to reproduce the hang with your .config, but the kernel panics upon bootup, seems related to some sysfs stuff, I'll debug this separately. I have also looked at the other report of the very same problem, and it seems that both of you have SLUB turned on. Could you please try going back to SLAB instead? It could be that we are getting some corruption of urb->actual_length somewhere, some SLUB race might be at fault (I'd be quite surprised if it affects only this one specific case, though). Thanks a lot for your testing effort! Adding Christoph Lameter if you think it's SLUB-related Note that I use almost the same kernel config on mainline, and it has slub on too. Also I did quite a lot of SLUB testing on pre-2.6.22 mm kernel and hit stuff but not this one. Lastly since the bug takes variable time to trigger (sometimes minutes sometimes hours, even days) I should perhaps test again mainline. I've never hit it there but maybe I just didn't run it long enough and was lucky. So far I've looked at only one of your usbmon logs, the short one without CONFIG_HIDRAW. It didn't contain any errors at all, which wasn't much help. All those link and unlink messages from ehci-hcd are strange; I would expect to see messages from usbhid along with them. The really intriguing messages are the -75 errors; that's what I was hoping to see in the usbmon log. Maybe some will show up in the 3 MB file. One possible source of problems is the fact that your keyboard and mouse are non-high-speed devices running from a high-speed controller. You might try doing "rmmod ehci-hcd", forcing the full-speed controller will be used instead. If you do, do the hangs still occur? Another thing to worry about is CONFIG_USB_EHCI_TT_NEWSCHED. It's still a little experimental. You can try turning it off to see what happens. Keyboard & mice are not connected directly but through a hub (used to be a standalone high-speed one, now it's integrated in the monitor and I don't know its capabilities) I'll try: 1. without SLUB 2. without ehci 3. without CONFIG_USB_EHCI_TT_NEWSCHED. All are enabled on mainline kernels where I don't have the problem (and IIRC they were enabled on pre-2.6.22 mm kernels too) If possible, I'd like to see a usbmon trace covering the period when you get one of those "input irq status -75 received" messages. BTW, the integrated hub's capabilities are listed in your USB_DEBUG kernel log. Search through the log file for the string "2-8:1.0". And for some unclear reason, the mouse describes itself (in its Product string) as a Trackball. I tried some more tests on my laptop with this bug today. The bug seems related to using a USB hub. I believe this because when I plugged the device directly into the laptop it worked for many hours, and is still working as I type now. USB_DEBUG gives this about the hub (snippet from a bugged session): May 30 15:11:21 localhost kernel: PM: Adding info for usb:2-2 May 30 15:11:21 localhost kernel: usb 2-2: uevent May 30 15:11:21 localhost kernel: usb 2-2: usb_probe_device May 30 15:11:21 localhost kernel: PM: Adding info for No Bus:usbdev2.2_ep00 May 30 15:11:21 localhost kernel: usb 2-2: configuration #1 chosen from 1 choice May 30 15:11:21 localhost kernel: usb 2-2: adding 2-2:1.0 (config #1, interface 0) May 30 15:11:21 localhost kernel: PM: Adding info for usb:2-2:1.0 May 30 15:11:21 localhost kernel: usb 2-2:1.0: uevent May 30 15:11:21 localhost kernel: usb 2-2:1.0: uevent May 30 15:11:21 localhost kernel: hub 2-2:1.0: usb_probe_interface May 30 15:11:21 localhost kernel: hub 2-2:1.0: usb_probe_interface - got id May 30 15:11:21 localhost kernel: hub 2-2:1.0: USB hub found May 30 15:11:21 localhost kernel: hub 2-2:1.0: 7 ports detected May 30 15:11:21 localhost kernel: hub 2-2:1.0: standalone hub May 30 15:11:21 localhost kernel: hub 2-2:1.0: ganged power switching May 30 15:11:21 localhost kernel: hub 2-2:1.0: global over-current protection May 30 15:11:21 localhost kernel: PM: Adding info for No Bus:usbdev2.2_ep81 May 30 15:11:21 localhost kernel: hub 2-2:1.0: TT per port May 30 15:11:21 localhost kernel: hub 2-2:1.0: TT requires at most 8 FS bit times (666 ns) May 30 15:11:21 localhost kernel: hub 2-2:1.0: power on to power good time: 100ms May 30 15:11:21 localhost kernel: hub 2-2:1.0: local power source is good May 30 15:11:21 localhost kernel: hub 2-2:1.0: no over-current condition exists May 30 15:11:21 localhost kernel: hub 2-2:1.0: enabling power on all ports May 30 15:11:21 localhost kernel: usb 2-2: link qh256-0001/ffff81000462d240 start 255 [1/0 us] May 30 15:11:21 localhost kernel: drivers/usb/core/inode.c: creating file '002' May 30 15:11:21 localhost kernel: hub 3-0:1.0: state 7 ports 3 chg 0000 evt 0000 May 30 15:11:21 localhost kernel: hub 2-2:1.0: state 7 ports 7 chg 0000 evt 0000 May 30 15:11:21 localhost kernel: hub 2-2:1.0: port 1, status 0301, change 0001, 1.5 Mb/s May 30 15:11:21 localhost kernel: hub 2-2:1.0: debounce: port 1: total 100ms stable 100ms status 0x301 May 30 15:11:21 localhost kernel: hub 2-2:1.0: port 1 not reset yet, waiting 10ms May 30 15:11:21 localhost kernel: usb 2-2.1: new low speed USB device using ehci_hcd and address 3 May 30 15:11:21 localhost kernel: hub 2-2:1.0: port 1 not reset yet, waiting 10ms May 30 15:11:21 localhost kernel: usb 2-2.1: skipped 1 descriptor after interface May 30 15:11:21 localhost kernel: usb 2-2.1: default language 0x0409 May 30 15:11:21 localhost kernel: usb 2-2.1: new device found, idVendor=045e, idProduct=0059 May 30 15:11:21 localhost kernel: usb 2-2.1: new device strings: Mfr=1, Product=2, SerialNumber=0 May 30 15:11:21 localhost kernel: usb 2-2.1: Product: Microsoft Wireless Intellimouse Explorer The bug has now been merged in mainline, and affects post-2.6.22 fedora-devel kernels (kernel-2.6.23-0.28.rc0.git6.fc8.x86_64, kernel-2.6.23-0.29.rc0.git6.fc8.x86_64) Jul 18 20:51:08 rousalka kernel: drivers/hid/usbhid/hid-core.c: input irq status -75 received In itself a -75 error isn't a problem. It means there was a communications failure, which can happen from time to time (especially with high-speed USB). But if it causes a system crash then something really is wrong. There's code in the usbhid driver to retry and recover from those errors; it should be running. I'd still like to see a usbmon trace for a period including one of those -75 errors. The usbmon log should compress really well, up to 4 times or so. Is it 3MB after bzip2 or before? (In reply to comment #28) > The usbmon log should compress really well, up to 4 times or so. Is it 3MB > after bzip2 or before? compressed with bzip2 at maximum level of compression IIRC. I've attached everything that fit under bugzilla's 1M lilit there Attached it where? Created attachment 12091 [details]
'usbmon' run ("cat /sys/kernel/debug/usbmon/5u")
Attached is a 'usbmon' run when 'my cursor got lost'.
System is a Thinkpad X60 running Fedora rawhide (kernel-PAE-2.6.23-0.43.rc0.git16.fc8).
System usually boots up, always 'losing' first character typed on usb keyboard and with cursor 'missing'. Pulling out and reinserting cursor recovers cursor for a while.
I started this log with a functioning cursor, and killed the 'cat' after losing the cursor, pulling it out, and plugging it back in.
Hope this helps....
tom
Created attachment 12092 [details]
cat /sys/kernel/debug/usbmon/5u > /tmp/usbmon-5u.2
Here is another output from "cat /sys/kernel/debug/usbmon/5u > /tmp/usbmon-5u.2"
Here is output of 'lsusb':
[root@localhost tmp]# lsusb
Bus 5 Device 9: ID 0461:4d03 Primax Electronics, Ltd Kensington Mouse-in-a-box
Bus 5 Device 5: ID 05f3:0203 PI Engineering, Inc.
Bus 5 Device 2: ID 17ef:1000
Bus 5 Device 1: ID 0000:0000
Bus 3 Device 1: ID 0000:0000
Bus 4 Device 3: ID 0483:2016 SGS Thomson Microelectronics Fingerprint Reader
Bus 4 Device 2: ID 0a5c:2110 Broadcom Corp.
Bus 4 Device 1: ID 0000:0000
Bus 2 Device 1: ID 0000:0000
Bus 1 Device 1: ID 0000:0000
[root@localhost tmp]#
The log in comment #31 shows no errors, just an unplug and a replug. The log in comment #32 does show one error: dff46c30 1863662787 C Ii:5:008:1 0:8 -4 This occurred 7 seconds before the unplug. What type of USB controller do you have on bus 5? And what is device 2 on bus 5? Is it an external hub? Or is it something built into the Kensington Mouse? Device 2 on bus 5 is a USB mouse (says HP on the top, suspect Kensington makes it). I'm running a Thinkpad X60, believe with ICH7: [root@localhost Downloads]# lspci 00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03) 00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03) 00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03) 00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 02) 00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 02) 00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 (rev 02) 00:1c.2 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3 (rev 02) 00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 02) 00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 02) 00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 02) 00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 02) 00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 02) 00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 02) 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2) 00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge (rev 02) 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02) 00:1f.2 SATA controller: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA AHCI Controller (rev 02) 00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02) 02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller 03:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network Connection (rev 02) 15:00.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev b4) 15:00.1 FireWire (IEEE 1394): Ricoh Co Ltd R5C552 IEEE 1394 Controller (rev 09) 15:00.2 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 18) [root@localhost Downloads]# The mouse is connected directly into a Thinkpad 'dock'. Created attachment 12108 [details]
/var/log/messages showing BUGs, INFOs, etc.
With latest Fedora kernel (kernel-PAE-2.6.23-0.45.rc0.git16.fc8), I now get a crash with log when I remove USB mouse (mouse was dead prior to removal).
Not sure its the same issue as above, but I'm posting it here for completeness.
In fact your Kensington Mouse was device 9 on bus 5. Device 2 must be the dock's internal hub. Maybe the dock is behaving strangely... What does /proc/bus/usb/devices show? The crash when you unplugged the mouse is a completely different issue. If it continues for the next few kernel releases you should file a separate new bug report for it. Sorry for the confusion.... I have 2 configurations: one @home, one @work. The usbmon logs were both from the @home config. The log in comment 35 is from the @work config. Here is 'cat /proc/bus/usb/devices' from @work: (I'll log the one from @home later tonight). The dock behaves 'fine' with earlier kernels, btw. [root@localhost ~]# cat $_ T: Bus=05 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=480 MxCh= 8 B: Alloc= 0/800 us ( 0%), #Int= 4, #Iso= 0 D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE ehci_hcd S: Product=EHCI Host Controller S: SerialNumber=0000:00:1d.7 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 4 Ivl=256ms T: Bus=05 Lev=01 Prnt=01 Port=05 Cnt=01 Dev#= 2 Spd=480 MxCh= 4 D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=02 MxPS=64 #Cfgs= 1 P: Vendor=17ef ProdID=1000 Rev= 0.01 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 2mA I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=01 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=256ms I:* If#= 0 Alt= 1 #EPs= 1 Cls=09(hub ) Sub=00 Prot=02 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=256ms T: Bus=05 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 5 Spd=1.5 MxCh= 0 D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=045e ProdID=0084 Rev= 0.00 S: Manufacturer=Microsoft S: Product=Basic Optical Mouse C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=usbhid E: Ad=81(I) Atr=03(Int.) MxPS= 4 Ivl=10ms T: Bus=05 Lev=02 Prnt=02 Port=01 Cnt=02 Dev#= 6 Spd=480 MxCh= 4 D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0409 ProdID=0058 Rev= 1.00 S: Manufacturer=NEC Corporation S: Product=USB2.0 Hub Controller C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=256ms T: Bus=05 Lev=03 Prnt=06 Port=01 Cnt=01 Dev#= 7 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=03f0 ProdID=1017 Rev= 1.00 S: Manufacturer=Hewlett-Packard S: Product=hp LaserJet 1300 S: SerialNumber=00CNCB954325 C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 2mA I: If#= 0 Alt= 0 #EPs= 3 Cls=07(print) Sub=01 Prot=03 Driver=usblp E: Ad=01(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=255ms I:* If#= 0 Alt= 1 #EPs= 2 Cls=07(print) Sub=01 Prot=02 Driver=usblp E: Ad=01(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I: If#= 0 Alt= 2 #EPs= 1 Cls=07(print) Sub=01 Prot=01 Driver=usblp E: Ad=01(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms T: Bus=05 Lev=03 Prnt=06 Port=02 Cnt=02 Dev#= 8 Spd=1.5 MxCh= 0 D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=413c ProdID=2005 Rev= 1.04 S: Manufacturer=DELL S: Product=DELL USB Keyboard C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=01 Driver=usbhid E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=10ms T: Bus=04 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.3 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 2.00 Cls=e0(unk. ) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0a5c ProdID=2110 Rev= 1.00 S: Manufacturer=Broadcom Corp S: Product=BCM2045B C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I:* If#= 1 Alt= 2 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 32 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 32 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 64 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 64 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 64 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 64 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=84(I) Atr=02(Bulk) MxPS= 32 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 32 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 0 Cls=fe(app. ) Sub=01 Prot=00 Driver=(none) T: Bus=04 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 3 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=0483 ProdID=2016 Rev= 0.01 S: Manufacturer=STMicroelectronics S: Product=Biometric Coprocessor C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 4 Ivl=20ms T: Bus=03 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.2 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms T: Bus=02 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.1 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms T: Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.0 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms [root@localhost ~]# Created attachment 12118 [details]
Output of 'cat /sys/kernel/debug/usbmon/5u
(In reply to comment #38) > Created an attachment (id=12118) [details] > Output of 'cat /sys/kernel/debug/usbmon/5u > Above attachment is another 'lost mouse and system froze a few minutes later'. I'm 'standardizing' the setup as much as I can: same mouse in same USB socket in both my @home and @work setups. Here is output of 'lsusb' from @home setup: [root@localhost ~]# lsusb Bus 5 Device 5: ID 05f3:0203 PI Engineering, Inc. Bus 5 Device 6: ID 045e:0040 Microsoft Corp. Wheel Mouse Optical Bus 5 Device 2: ID 17ef:1000 Bus 5 Device 1: ID 0000:0000 Bus 4 Device 2: ID 0a5c:2110 Broadcom Corp. Bus 4 Device 3: ID 0483:2016 SGS Thomson Microelectronics Fingerprint Reader Bus 4 Device 1: ID 0000:0000 Bus 3 Device 1: ID 0000:0000 Bus 2 Device 1: ID 0000:0000 Bus 1 Device 1: ID 0000:0000 [root@localhost ~]# Here is output of 'cat /proc/bus/usb/devices': [root@localhost ~]# cat /proc/bus/usb/devices T: Bus=05 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=480 MxCh= 8 B: Alloc= 0/800 us ( 0%), #Int= 4, #Iso= 0 D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE ehci_hcd S: Product=EHCI Host Controller S: SerialNumber=0000:00:1d.7 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 4 Ivl=256ms T: Bus=05 Lev=01 Prnt=01 Port=05 Cnt=01 Dev#= 2 Spd=480 MxCh= 4 D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=02 MxPS=64 #Cfgs= 1 P: Vendor=17ef ProdID=1000 Rev= 0.01 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 2mA I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=01 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=256ms I:* If#= 0 Alt= 1 #EPs= 1 Cls=09(hub ) Sub=00 Prot=02 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=256ms T: Bus=05 Lev=02 Prnt=02 Port=01 Cnt=01 Dev#= 5 Spd=1.5 MxCh= 0 D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=05f3 ProdID=0203 Rev= 1.20 S: Manufacturer=P.I. Engineering S: Product=P� Keyboard/Mouse to USB 2dapter C:* #Ifs= 2 Cfg#= 1 Atr=80 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=01 Driver=usbhid E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=10ms I:* If#= 1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=usbhid E: Ad=82(I) Atr=03(Int.) MxPS= 4 Ivl=10ms T: Bus=05 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 6 Spd=1.5 MxCh= 0 D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=045e ProdID=0040 Rev= 3.00 S: Manufacturer=Microsoft S: Product=Microsoft 3-Button Mouse with IntelliEye(TM) C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=usbhid E: Ad=81(I) Atr=03(Int.) MxPS= 4 Ivl=10ms T: Bus=04 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.3 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 2.00 Cls=e0(unk. ) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0a5c ProdID=2110 Rev= 1.00 S: Manufacturer=Broadcom Corp S: Product=BCM2045B C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I:* If#= 1 Alt= 2 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 32 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 32 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 64 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 64 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb E: Ad=83(I) Atr=01(Isoc) MxPS= 64 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 64 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=84(I) Atr=02(Bulk) MxPS= 32 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 32 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 0 Cls=fe(app. ) Sub=01 Prot=00 Driver=(none) T: Bus=04 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 3 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 P: Vendor=0483 ProdID=2016 Rev= 0.01 S: Manufacturer=STMicroelectronics S: Product=Biometric Coprocessor C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 4 Ivl=20ms T: Bus=03 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.2 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms T: Bus=02 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.1 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms T: Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0000 ProdID=0000 Rev= 2.06 S: Manufacturer=Linux 2.6.22-0.15.rc0.git1.fc8PAE uhci_hcd S: Product=UHCI Host Controller S: SerialNumber=0000:00:1d.0 C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms [root@localhost ~]# I can produce these pretty much 'at will'. Let me know if you need more..... Here's something to try. First, make sure the CONFIG_USB_DEBUG is enabled in your kernel's configuration. Then the next time you get a mouse freeze, make copies of the files in /sys/class/usb_host/usb_host5 and attach them to this bug report. OK. Will need to get/make a Fedora kernel with CONFIG_USB_DEBUG enabled: currently not set. I have a feeling that this might be caused by gregkh-usb-usb-ehci-cpufreq-fix.patch... this patch was an attempt to keep split-interrupt transactions from being run to low/full speed devices on the EHCI controller during cpu frequency transitions. You problem seems to be with low/full speed devices on the EHCI controller, with exactly the version of the kernel when this patch was added, and you appear to have cpu_freq enabled and a processor that would use it... Could you back this patch out and test? It may be easier to disable CONFIG_CPU_FREQ than finding the patch. I looked all over the patch, and although it seems rather murky, it also seems not doing anything wrong with the hw_token. I figure that qtd->length somehow gets zeroed (by ehci_qtd_init perhaps), or maybe qh token ends with bigger length than qtd here: if (state == QH_STATE_IDLE && cpu_to_hc32(ehci, qtd->qtd_dma) == qh->hw_current) token = hc32_to_cpu(ehci, qh->hw_token); But without having Nicolas' or Tom's hardware I cannot tinker with it and find it. BTW, Nicolas sent me his logs out of line, and he has the same thing as Tom in comment #32 (Nicolas got an Opteron but otherwise same thing): ffff8100057f6100 1435001203 S Ii:2:004:1 -115:8 4 < ffff8100055ccc00 1436320906 C Ii:2:003:1 0:8 8 = 00000600 00000000 ....... ffff8100055ccc00 1467276116 S Ii:2:003:1 -115:8 8 < ffff8100057f6100 1467276128 C Ii:2:004:1 0:8 -4 Thanks, I was struggling yesterday trying to find/undo the patch. In any case, I built a version of Fedora Rawhide with USB_DEBUB set (kernel-2.6.23-0.44.tblv3.rc0.git16.fc8), and here is what I got on booting. The situation was that the mouse was 'alive' during boot (with 'console mouse services'and gdmgreeter) but almost immediately 'dead' after entering password and rendering of gnome screen. I pulled the mouse and reinserted. Got this in /var/log/messages. Is this useful? I'll post the tar of /sys/class/usb_host/usb_host5 in the next message. Need more? Need me to disable CONFIG_CPU_FREQ? Unless I hear otherwise, I'll attempt to build a CPU_FREQ-less kernel ... tom Jul 25 06:20:26 localhost kernel: hub 5-6:1.0: state 7 ports 4 chg 0000 evt 0008 Jul 25 06:20:26 localhost kernel: hub 5-6:1.0: port 3, status 0100, change 0001, 12 Mb/s Jul 25 06:20:26 localhost kernel: usb 5-6.3: USB disconnect, address 6 Jul 25 06:20:26 localhost kernel: usb 5-6.3: unregistering device Jul 25 06:20:26 localhost kernel: usb 5-6.3: usb_disable_device nuking all URBs Jul 25 06:20:26 localhost kernel: usb 5-6.3: unlink qh8-0e01/f7a56460 start 7 [1/2 us] Jul 25 06:20:26 localhost kernel: ehci_hcd 0000:00:1d.7: shutdown urb f7ebd870 pipe 40408680 ep1in-intr Jul 25 06:20:26 localhost kernel: usb 5-6.3: unregistering interface 5-6.3:1.0 Jul 25 06:20:26 localhost kernel: usb_endpoint usbdev5.6_ep81: ep_device_release called for usbdev5.6_ep81 Jul 25 06:20:26 localhost kernel: usb 5-6.3:1.0: uevent Jul 25 06:20:26 localhost kernel: usb 5-6.3:1.0: uevent Jul 25 06:20:26 localhost kernel: usb_endpoint usbdev5.6_ep00: ep_device_release called for usbdev5.6_ep00 Jul 25 06:20:26 localhost kernel: usb 5-6.3: uevent Jul 25 06:20:26 localhost kernel: hub 5-6:1.0: debounce: port 3: total 100ms stable 100ms status 0x100 Jul 25 06:20:30 localhost kernel: hub 5-6:1.0: state 7 ports 4 chg 0000 evt 0008 Jul 25 06:20:30 localhost kernel: hub 5-6:1.0: port 3, status 0301, change 0001, 1.5 Mb/s Jul 25 06:20:30 localhost kernel: hub 5-6:1.0: debounce: port 3: total 100ms stable 100ms status 0x301 Jul 25 06:20:30 localhost kernel: hub 5-6:1.0: port 3 not reset yet, waiting 10ms Jul 25 06:20:31 localhost kernel: usb 5-6.3: new low speed USB device using ehci_hcd and address 7 Jul 25 06:20:31 localhost kernel: hub 5-6:1.0: port 3 not reset yet, waiting 10ms Jul 25 06:20:31 localhost kernel: usb 5-6.3: skipped 1 descriptor after interface Jul 25 06:20:31 localhost kernel: usb 5-6.3: default language 0x0409 Jul 25 06:20:31 localhost kernel: usb 5-6.3: new device strings: Mfr=1, Product=3, SerialNumber=0 Jul 25 06:20:31 localhost kernel: usb 5-6.3: Product: Microsoft 3-Button Mouse with IntelliEye(TM) Jul 25 06:20:31 localhost kernel: usb 5-6.3: Manufacturer: Microsoft Jul 25 06:20:31 localhost kernel: usb 5-6.3: uevent Jul 25 06:20:31 localhost kernel: usb 5-6.3: usb_probe_device Jul 25 06:20:31 localhost kernel: usb 5-6.3: configuration #1 chosen from 1 choice Jul 25 06:20:31 localhost kernel: usb 5-6.3: adding 5-6.3:1.0 (config #1, interface 0) Jul 25 06:20:31 localhost kernel: usb 5-6.3:1.0: uevent Jul 25 06:20:31 localhost kernel: usb 5-6.3:1.0: uevent Jul 25 06:20:31 localhost kernel: usb 5-6.3:1.0: uevent Jul 25 06:20:31 localhost kernel: usbhid 5-6.3:1.0: usb_probe_interface Jul 25 06:20:31 localhost kernel: usbhid 5-6.3:1.0: usb_probe_interface - got id Jul 25 06:20:31 localhost kernel: input: Microsoft Microsoft 3-Button Mouse with IntelliEye(TM) as /class/input/input9 Jul 25 06:20:31 localhost kernel: usb 5-6.3: link qh8-0e01/f7a56460 start 7 [1/2 us] Jul 25 06:20:31 localhost kernel: input: USB HID v1.10 Mouse [Microsoft Microsoft 3-Button Mouse with IntelliEye(TM)] on usb-0000:00:1d.7-6.3 Jul 25 06:20:31 localhost kernel: drivers/usb/core/inode.c: creating file '007' Jul 25 06:20:55 localhost kernel: hub 5-6:1.0: state 7 ports 4 chg 0000 evt 0000 Created attachment 12135 [details]
'tar cf usb_host5.tar /sys/class/usb_host/usb_host5': mouse dead->mouse reinserted
Created attachment 12136 [details]
'usbmon' run ("cat /sys/kernel/debug/usbmon/5u") for above 'run'
OK.... I built a Fedora Rawhide kernel with USB_DEBUG enabled and CPU_FREQ disabled (kernel-2.6.23-0.44.tblv4.rc0.git16.fc8). This system seems definitely happier: mouse and keyboard seem to come up 'working', I have not observed any 'lost keypresses', and the mouse has not 'dropped off the end of the earth'. On the other hand, I'm clocking my laptop at maximum rate.... Any 'usbmon' info from this that would be useful? What else can I do to help? Are you sure you had the debugging version of the ehci-hcd driver loaded when you created the attachment in comment #45? That directory should contain a bunch of files with names like "registers" and "async". (And by the way, be careful to capture the file contents while the mouse is dead, before you unplug it.) I don't think usbmon will be any use for the non-CPU_FREQ experiment. Stuart will have to tell you how to debug that aspect. No, not sure: I compiled kernel with 'CONFIG_USB_DEBUG', but perhaps I need to run tar with '--ignore-failed-read'..... I do so next time. This is harder than it appears..... Looks like some of the files in /sys/class/usb_host/usb_host5 are all zeros. Here is the scenario and what I did: 1. Booted with kernel built with CONFIG_USB_DEBUG. 2. When gdmgreeter came up, mouse was 'lost' (i.e. cursor on screen didn't move). 3. As per request in #48, I CTL-ALT-F1 to console, logged in as root. 4. I did a 'tar -cf usb_host5-2.tar --ignore-failed-read /sys/class/usb_host/usb_host5' 5. Here is output from running tar: tar: Removing leading `/' from member names tar: /sys/class/usb_host/usb_host5/companion: File shrank by 4096 bytes; padding with zeros tar: /sys/class/usb_host/usb_host5/registers: File shrank by 3396 bytes; padding with zeros tar: /sys/class/usb_host/usb_host5/periodic: File shrank by 728 bytes; padding with zeros tar: /sys/class/usb_host/usb_host5/async: File shrank by 4096 bytes; padding with zeros tar: /sys/class/usb_host/usb_host5/uevent: Warning: Cannot open: Permission denied 6. Here is 'ls -l /sys/class/usb_host/usb_host5': total 0 -r--r--r-- 1 root root 4096 2007-07-25 15:00 async -rw-r--r-- 1 root root 4096 2007-07-25 15:00 companion lrwxrwxrwx 1 root root 0 2007-07-25 14:59 device -> ../../../devices/pci0000:00/0000:00:1d.7 -r--r--r-- 1 root root 4096 2007-07-25 15:00 periodic -r--r--r-- 1 root root 4096 2007-07-25 15:00 registers lrwxrwxrwx 1 root root 0 2007-07-25 14:59 subsystem -> ../../../class/usb_host --w------- 1 root root 4096 2007-07-25 14:59 uevent 7. So, additional files in this directary are 'empty', and not included in tar file. 8. Pulling the plug on the mouse while still in this terminal produced lots of spew, starting with: hub 5-6:1.0: state 7 ports 4 chg 0000 evt 0008 hub 5-6:1.0: port 3, status 0100, change 0001, 12 Mb/s usb 5-6.3: USB disconnect, address 6 usb 5-6.3: unregistering device usb 5-6.3: usb_disable_device nuking all URBs usb 5-6.3: unlink qh8-0e01/dfc855a0 start 7 [1/2 us] ehci_hcd 0000:00:1d.7: shutdown urb f7ef32d0 pipe 40408680 ep1in-intr usb 5-6.3: unregistering interface 5-6.3:1.0 usb_endpoint usbdev5.6_ep81: ep_device_release called for usbdev5.6_ep81 ============================================================================= BUG sysfs_dir_cache (Not tainted): Padding overwritten. 0x64646150-0x20676e69 ----------------------------------------------------------------------------- INFO: Slab 0xc1c3c5d0 used=42 fp=0x00000000 flags=0x40000083 [<c04062b1>] show_trace_log_lvl+0x1a/0x2f [<c0406d4a>] show_trace+0x12/0x14 [<c0406d62>] dump_stack+0x16/0x18 [<c0484896>] slab_err+0x3e/0x44 [<c0484927>] slab_pad_check+0x8b/0xc6 [<c04849fa>] check_slab+0x98/0xa4 [<c0485623>] __slab_free+0xbd/0x251 [<c0485da9>] kmem_cache_free+0x8a/0xbe [<c04c4b75>] release_sysfs_dirent+0x6e/0x83 [<c04c4d98>] sysfs_addrm_finish+0x1dd/0x1f6 [<c04c3ac2>] sysfs_hash_and_remove+0x52/0x7f [<c04c41bf>] sysfs_remove_file+0xd/0xf [<c057160c>] device_remove_file+0x1f/0x2a [<c0571883>] device_del+0x204/0x288 [<c058b517>] usb_disable_device+0xdf/0x142 [<c0586d6e>] usb_disconnect+0xd3/0x16c [<c058819a>] hub_thread+0x64d/0xeb4 [<c043e360>] kthread+0x3b/0x63 [<c0405d53>] kernel_thread_helper+0x7/0x10 ======================= and ending with: BUG: unable to handle kernel paging request at virtual address 5a5a5a76 printing eip: c04c3a90 *pde = 00000000 Oops: 0000 [#1] SMP Modules linked in: i915 drm netconsole ipv6 nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_tcpudp ipt_REJECT iptable_filter ip_tables x_tables cpufreq_ondemand acpi_cpufreq dm_multipath video sbs button bay dock battery ac kvm_intel kvm sr_mod cdrom snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq hci_usb firewire_ohci snd_seq_device irda bluetooth snd_pcm_oss snd_mixer_oss e1000 parport_pc snd_pcm ata_piix parport firewire_core crc_ccitt crc_itu_t i2c_i801 usblp sdhci serio_raw rtc_cmos mmc_core snd_timer ata_generic i2c_core snd iTCO_wdt soundcore iTCO_vendor_support snd_page_alloc sg dm_snapshot dm_zero dm_mirror dm_mod ahci libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd CPU: 0 EIP: 0060:[<c04c3a90>] Not tainted VLI EFLAGS: 00010206 (2.6.23-0.44.tblv3.rc0.git16.fc8 #1) EIP is at sysfs_hash_and_remove+0x20/0x7f eax: f7ef6480 ebx: c06d90dc ecx: f7ef6490 edx: 5a5a5a5a esi: f7ef6480 edi: dfeae44c ebp: c2e8be7c esp: c2e8be60 ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 Process khubd (pid: 204, ti=c2e8b000 task=c2ea6000 task.ti=c2e8b000) Stack: f7ef6480 00000000 00000000 00000000 dfeae534 dfeae44c dfeae44c c2e8be84 c04c5bdb c2e8be94 c05731a8 dfeae430 dfeae44c c2e8beac c0571891 dfe070b8 dfeae430 dfe07060 dfeae44c c2e8bee0 c058b517 c06f01b0 c06f076e dfe071fc Call Trace: [<c04062b1>] show_trace_log_lvl+0x1a/0x2f [<c0406361>] show_stack_log_lvl+0x9b/0xa3 [<c0406521>] show_registers+0x1b8/0x289 [<c0406709>] die+0x117/0x24a [<c06318ac>] do_page_fault+0x492/0x563 [<c06300da>] error_code+0x72/0x78 [<c04c5bdb>] sysfs_remove_link+0xb/0xd [<c05731a8>] bus_remove_device+0x22/0x82 [<c0571891>] device_del+0x212/0x288 [<c058b517>] usb_disable_device+0xdf/0x142 [<c0586d6e>] usb_disconnect+0xd3/0x16c [<c058819a>] hub_thread+0x64d/0xeb4 [<c043e360>] kthread+0x3b/0x63 [<c0405d53>] kernel_thread_helper+0x7/0x10 ======================= Code: 14 89 d0 5b 5e 5f 5d c3 90 90 90 55 89 e5 57 56 89 c6 53 89 d3 83 ec 10 85 c0 74 5f 8d 45 e4 89 f2 e8 6f 0e 00 00 8d 4e 10 eb 24 <80> 7a 1c 00 74 19 8b 72 14 89 df ac ae 75 08 84 c0 75 f8 31 c0 EIP: [<c04c3a90>] sysfs_hash_and_remove+0x20/0x7f SS:ESP 0068:c2e8be60 I tried to repeat this by rebooting, etc. Followed exactly the same steps until this last one: this time I got a hard panic with no sync. I attach the complete dmesg output below. Suggestions, please! Created attachment 12141 [details]
dmesg output showing borkage when 'lost' USB mouse is removed.
Don't worry too much about that bug and crash; they are a completely separate issue. Where's the tar file? Created attachment 12157 [details]
tar of usb_host5
Here it is, but it looks just like the last one.
I just changed CONFIG_USB_DEBUG. Are there additional configs that need to be set as well?
Is there a different BZ that I should be filing the 'crash' to?
Also, another data point: my ThinkPad X60 'dock' has 4 USB jacks; the ThinkPad itself has 3 that are exposed when in the dock.
If I plug the mouse in any of the ports on the dock, the mouse gets connected via something like:
Jul 26 06:18:13 localhost kernel: usb 5-6.2: new low speed USB device using ehci_hcd and address 5
If I plug the mouse in any of the ports on the laptop (not the dock), the mouse gets connected with something like:
Jul 26 07:28:24 localhost kernel: usb 3-1: new low speed USB device using uhci_hcd and address 3
In this configuration, the system (latest Fedora Rawhide: kernel-PAE-2.6.23-0.47.rc1.git1.fc8) seems stable, and the mouse and keyboard seem just fine.
Never mind the tar archive. The most important thing I wanted to check was the "async" file, and it looks like that one really did shrink down to nothing. Besides, the fact that everything works without CPU_FREQ means that Stuart's trail is the correct one to follow. He says that he has been able to reproduce symptoms similar to yours and he's trying to come up with a fix. Don't worry about reporting the BUG; I have already notified the appropriate person. The change in behavior according to whether you plug the mouse into the dock or directly into the computer is expected. This particular problem manifests whenever a low- or full-speed device (like your mouse) is plugged into a high-speed hub (like the one in your dock). Note from Tejun Heo <htejun@gmail.com>: http://article.gmane.org/gmane.linux.kernel/556682/raw Does this patch fix the problem? Please test, thanks. I was able to reproduce these symptoms on a system with an nVidia EHCI controller, and I think I've found the problem. The EHCI 1.0 spec, section 4.12.2.5, says... "If the Active bit is a one and the SplitXState is DoStart (regardless of the value of S-mask), the host controller will simply set Active bit to a zero. The host controller is not required to write the transfer state back to the current qTD." Apparently the Intel & Broadcom controllers *don't* write the transfer state back to the current qTD, but nVidia does... so, with nVidia, when the inactivate bit is set, the controller will set the active bit to 0 in the qTD, and then, when it is reactivated, my patch just sets the active bit back to 1 in the transfer overlay, not in the qTD... so the next time an IRQ comes in, qh_completions sees that the qtd isn't active and assumes it is finished, even though it isn't. I believe the fix will be something like this. It fixes the problem on my system, but I'm still looking to make sure I don't see any problems with it. --- linux-2.6.22-rc1-mm1-orig/drivers/usb/host/ehci-sched.c.orig 2007-07-27 03:47:37.000000000 -0500 +++ linux-2.6.22-rc1-mm1-orig/drivers/usb/host/ehci-sched.c 2007-07-27 03:47:40.000000000 -0500 @@ -566,12 +566,15 @@ static void qh_reactivate_split_intr_qhs if (safe == 0) { not_done = 1; } else if (safe > 0) { + struct ehci_qtd * qtd; + qtd = list_entry (qh->qtd_list.next, struct ehci_qtd, qtd_list); /* See EHCI 1.0 section 4.15.2.4. */ token = qh->hw_token; qh->hw_token = (token | HALT_BIT) & ~ACTIVE_BIT; wmb(); qh->hw_info1 &= ~INACTIVATE_BIT; wmb(); + qtd->hw_token = (token & ~HALT_BIT) | qh->was_active; qh->hw_token = (token & ~HALT_BIT) | qh->was_active; } } Ok, I have a patch to fix this. I'm still testing on a couple systems to make sure it's OK, but there isn't really much to it. Which kernel should I make this patch against? It applies on top of my previous patch, which is in the -mm kernel... should my patch be against the latest -mm kernel? Thanks! Post a patch against 2.6.23-rc1; that kernel has your other changes too. Created attachment 12215 [details]
proposed patch for ehci cpufreq issue with nvidia controllers
This patch is against kernel 2.6.23-rc1.
I assume I need to send this patch separately to linux-usb-devel in the usual manner...? On Tue, Jul 31, 2007 at 02:36:53PM -0700, bugme-daemon@bugzilla.kernel.org wrote: > I assume I need to send this patch separately to linux-usb-devel in the usual > manner...? And be sure to CC: the usb maintainer so he doesn't miss it :) Tested on MacBook Core2Duo against Linus' tree (2.6.23-rc1-g6c8dca5d). Seems to solve issue for me. Created attachment 12228 [details]
Slightly modified patch for Fedora 2.6.23-0.61.rc1.git9
The attached slightly modified patch 'works for me' with 2.6.23-0.61.rc1.git9 on Thinkpad X60.
This laptop does not have nVidia, but Intel chip:
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 02)
But the patch appears to work.
Is that to be expected?
The patch should work ok with any ECHI controller. I'm a little surprised you saw this failure with the Intel EHCI controller, though, because I had tested my previous patch on Intel systems. It's possible my previous testing was inadequate, though--I had been focusing on Broadcom systems--or that there's some other reason you were seeing the issue even though I didn't on Intel EHCI controllers. |