Bug 6584

Summary: USB switch and Microsoft Wireless USB keyboard causes endless loop of error and log flooding
Product: Drivers
Reporter: Vesa Tervo (oh3nwq)
Component: USB
Assignee: Alan Stern (stern)
Status: REJECTED INSUFFICIENT_DATA
Severity: normal
CC: bunk, greg, kernel
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.17-rc3
Subsystem:
Regression: ---
Bisected commit-id:
Bug Depends on:
Bug Blocks: 5089
Attachments:
- syslog and dmesg
- annotated kernel logs
- Handle status overflow errors
- results from lsusb -v as requested by Alan
- Use maxpacket-size input buffer
- Add "kvm" modules parameter to usbhid driver

Description Vesa Tervo 2006-05-19 11:55:01 UTC
Most recent kernel where this bug did not occur: 2.4.*

Distribution: gentoo vanilla-sources, also on gentoo-sources

Hardware Environment: Belkin USB 4-Port switch F1U200, Microsoft Wireless
Multimedia Keyboard 1.1

Software Environment: vanilla 2.6.17-rc3, all gentoo versions between 2.6.9-2.6.16.*

Problem Description: When I switch the keyboard between two computers with the
USB switch, then on switching back there is roughly a 25% chance that the
computer starts an endless loop that floods the log with
[kernel] drivers/usb/input/hid-core.c: input irq status -75 received
The flooding stops only when I switch the keyboard away from the computer again
with the USB switch.
I have also reported this bug on the Gentoo bugzilla at
http://bugs.gentoo.org/show_bug.cgi?id=116037 and the annotated log with dmesg
output is also available at http://666.raapr.org/test2.tar.bz2

Steps to reproduce:
(sometimes 1 and 2 have to be repeated ~5 times)
1. switch USB switch to another computer to use the keyboard
2. switch USB switch back to the computer
3. flooding starts
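
To put a number on the flooding, the repeated message can be tallied per status
code. The following is only a minimal sketch: the function name and the sample
lines are made up for illustration, and the regular expression assumes the
message text is exactly as quoted in the description above.

```python
import re
from collections import Counter

# Matches the hid-core message quoted in the report, for example:
#   [kernel] drivers/usb/input/hid-core.c: input irq status -75 received
IRQ_STATUS = re.compile(r"hid-core\.c: input irq status (-?\d+) received")


def count_irq_statuses(lines):
    """Return a Counter mapping each reported status code to its count."""
    counts = Counter()
    for line in lines:
        match = IRQ_STATUS.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Illustrative input only; real syslog lines carry extra prefixes
# (timestamp, host name), which the search() call simply skips over.
sample = [
    "[kernel] drivers/usb/input/hid-core.c: input irq status -75 received",
    "[kernel] usb 1-1: USB disconnect, address 3",
    "[kernel] drivers/usb/input/hid-core.c: input irq status -75 received",
]
print(count_irq_statuses(sample))
```

Feeding it the lines of a saved syslog (for instance via `open(path)`) shows how
many of the entries are the flooding message and which status they carry.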
Comment 1 Vesa Tervo 2006-05-19 11:57:07 UTC
Created attachment 8148 [details]
syslog and dmesg
Comment 2 Greg Kroah-Hartman 2006-05-19 13:41:00 UTC
Can you try 2.6.17-rc4?  It should have fixed this issue.
Comment 3 Vesa Tervo 2006-05-19 13:59:26 UTC
sure - I'll try that tomorrow with gentoo vanilla-sources 2.6.17-rc4

Vesa
Comment 4 Vesa Tervo 2006-05-20 10:30:25 UTC
Nope,
exactly the same happens with gentoo vanilla-sources 2.6.17-rc4.

Can you work from the previous system logs, or do you want me
to capture a new set of logs on rc4?

Vesa
Comment 5 Daniel Drake 2006-05-20 11:26:16 UTC
Created attachment 8153 [details]
annotated kernel logs

Taken from downstream bug
Comment 6 Greg Kroah-Hartman 2006-08-30 01:00:23 UTC
Looks like a UHCI issue.
Comment 7 Alan Stern 2006-08-30 13:10:45 UTC
Created attachment 8911 [details]
Handle status overflow errors

No, this is not connected with UHCI.  It looks like the keyboard is sending
more data than the usbhid driver expects to receive.

This patch might help.  It won't prevent the errors from occurring, but it
should reset the device after one second.  Of course, it's not clear that
resetting the keyboard will stop it from trying to send too much data...
Comment 8 Jeffrey Singleton 2006-12-02 06:43:09 UTC
This is still happening with all 2.6.17 and 2.6.18 kernels. The messages from
Vesa's attached kernel logs appear consistently under any of the above versions.

It's very annoying to be typing something and have your keyboard freak out and
start auto-repeating characters across the page until the module resets itself.
Eventually the keyboard is simply killed altogether.

I have tried gentoo-vanilla, gentoo-sources, and even vanilla sources from
kernel.org -- any of the above using .17 or .18 sources are broken.

I am currently running 2.6.16-gentoo-r13, where the problem does not exist. So
whatever changed between 2.6.16 and .17 broke uhci_hcd to the point where using
a USB KVM is out of the question.
Comment 9 Alan Stern 2006-12-02 13:21:35 UTC
This has nothing to do with uhci-hcd.  The same thing would happen on a computer
using ohci-hcd instead.

It's a combination of problems.  The first problem is that USB KVM switches
don't work very well, especially when there's a hub plugged into the KVM and the
keyboard is attached to the hub.  But even if the keyboard were attached
directly to the KVM it still wouldn't work right; the keyboard insists on
sending more data than it is supposed to.  That's what -75 means; it is -EOVERFLOW.

Can you attach the output from  "lsusb -v" for the keyboard?
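
For readers following the logs: the negative status values in this report are
negated Linux errno codes. The sketch below decodes the ones that occur in this
thread. The table is hard-coded with the Linux values so the result does not
depend on the platform running the script; the function name is hypothetical.

```python
# Linux errno numbers for the codes that appear in this bug report.
# Hard-coded on purpose: Python's errno module reflects the host platform,
# which may number these differently on non-Linux systems.
LINUX_ERRNO = {
    19: "ENODEV",
    62: "ETIME",
    71: "EPROTO",
    75: "EOVERFLOW",
    84: "EILSEQ",
    110: "ETIMEDOUT",
}


def decode_status(status):
    """Translate a negative status such as -75 into its errno name."""
    if status == 0:
        return "0 (success)"
    name = LINUX_ERRNO.get(-status) if status < 0 else None
    return f"-{name}" if name else f"unknown ({status})"


for code in (-75, -19, -110):
    print(code, "=>", decode_status(code))
```

The table covers only the codes mentioned here; anything else is reported as
unknown rather than guessed.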
Comment 10 Jeffrey Singleton 2006-12-02 14:35:36 UTC
Well, what changed from 2.6.16 to 2.6.17/18? My KVM works great on 2.6.16,
switching between 3 different OSes and 2 different platforms.

I could attach the output, but I don't have either of those kernels installed
anymore. The errors in the 'annotated kernel logs' attached to the original bug
are the same ones I saw in my dmesg.

Something changed, and it breaks KVMs. Is it a big enough problem for the
developers to care?
Comment 11 Jeffrey Singleton 2006-12-03 04:56:26 UTC
Ok so maybe my errors are not the same. In 2.6.18 this is what I see over and
over until the keyboard just dies:

usb 3-1.1: reset low speed USB device using ohci_hcd and address 4
drivers/usb/input/usbkbd.c: can't resubmit intr, 0000:00:03.1-1.1/input0, status -19
usb 3-1.1: failed to restore interface 0 altsetting 0 (error=-110)
usb 3-1.1: USB disconnect, address 4
usb 3-1.1: new low speed USB device using ohci_hcd and address 6
usb 3-1.1: configuration #1 chosen from 1 choice
input: Chicony Saitek Eclipse Keyboard as /class/input/input4
input: Chicony Saitek Eclipse Keyboard as /class/input/input5
input: USB HID v1.11 Device [Chicony Saitek Eclipse Keyboard] on usb-0000:00:03.1-1.1

Also I created an attachment with the results from lsusb -v as you requested.
Comment 12 Jeffrey Singleton 2006-12-03 04:58:07 UTC
Created attachment 9725 [details]
results from lsusb -v as requested by Alan
Comment 13 Alan Stern 2006-12-03 09:21:37 UTC
Created attachment 9728 [details]
Use maxpacket-size input buffer

If you look through the earlier messages in this bug report or the error
messages in the logs, it ought to be obvious that what changed was hid-core.c. 
And by the way, the change didn't break KVMs.  They work just as well (or just
as badly) as they ever did; the only difference is an annoyingly large number
of messages in the system log.

Anyway, this patch ought to go some way toward solving Vesa's problem.  It may
not work perfectly but it should be better than before.  There may still be an
issue remaining about extracting multiple reports from a single packet; I don't
know about that.

Jeffrey, I'm not clear on exactly what your problem is, if any.  The log
extract you included in Comment #11 looks perfectly normal, exactly like what
one would expect given that switching the KVM disconnects the keyboard from the
computer.  In fact, you could try bypassing the KVM entirely and attaching the
keyboard directly to the computer's USB port; then unplug and replug it (as
though you had switched the KVM away and back to that computer) and see if you
don't observe exactly the same messages appearing in the log.

If things still seem wrong, attach a kernel log showing more context and maybe
also a usbmon log.
Comment 14 Vesa Tervo 2006-12-03 09:48:56 UTC
Thanks Alan,
I will try this new patch later this week. I will report if it does the trick. 
Comment 15 Jeffrey Singleton 2006-12-04 17:17:50 UTC
That's just the thing, I didn't switch. Those messages happen over and over,
interrupting my keyboard functions (keys get stuck, sometimes it doesn't work at
all). But only on 2.6.17 or 2.6.18.

On 2.6.16 I have no messages, no disconnects, nothing but the normal
keyboard/mouse messages as it should be. 

The message I pasted in comment #11 is not the same message I get when using my
KVM with the 2.6.16 kernel. Since I am currently on 2.6.16, here is what I see
when I switch between computers:

usb 3-1.4: USB disconnect, address 7
usb 3-1.4: new low speed USB device using ohci_hcd and address 8
usb 3-1.4: configuration #1 chosen from 1 choice
input: Logitech USB-PS/2 Optical Mouse as /class/input/input3

That is a normal message.
The message I pasted in comment #11 is obviously a different message.

Again, something did change between 2.6.16 and 2.6.17, and the change is still
there in 2.6.18, because my USB KVM works, and works flawlessly, under 2.6.16
and all of the previous versions.
Comment 16 Alan Stern 2006-12-05 12:06:28 UTC
Created attachment 9736 [details]
Add "kvm" modules parameter to usbhid driver

Okay, I see.  You are reporting two different problems on two different
systems, but putting them both in the same bug report.  You can understand why
I was confused.

Apparently your KVM has a nasty habit of breaking the connection between the
computer and the keyboard even when you don't change the switch setting.  There
has been at least one other report of the same thing happening.

The change to hid-core had to be made, because some devices really do need to
be reset when they stop returning data.  But a badly-behaved KVM can make it
look like a keyboard needs to be reset when in fact it doesn't.

The attached patch adds a new module parameter to usbhid.  If you install it
with "modprobe usbhid kvm=y" then it should ignore the communication errors,
just like 2.6.16 did.  The patch should apply to 2.6.19 okay; if you want to
use it with 2.6.18 then you'll have to apply part of the patch by hand.
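
The effect of the parameter can be summarized with a small model of the
status handling, based only on the hunk quoted in comment 21. This is a
simplified sketch, not the driver code: the function name and the returned
labels are invented for illustration, and whatever the driver does after
leaving the switch statement is not modeled.

```python
# Status names taken from the switch cases in the quoted hunk.
TRANSMISSION_ERRORS = {"EILSEQ", "EPROTO", "ETIME"}


def handle_in_status(status, use_kvm):
    """Model which branch of the patched switch a given status takes."""
    if status in TRANSMISSION_ERRORS and use_kvm:
        # Corresponds to the added "if (hid_use_kvm) break;"
        return "ignore"
    if status in TRANSMISSION_ERRORS or status == "ETIMEDOUT":
        # Corresponds to clear_bit(HID_IN_RUNNING, ...) and hid_io_error(hid)
        return "io_error"
    # Any case that does not appear in the hunk
    return "other"


for kvm in (False, True):
    print("kvm =", kvm, "->", handle_in_status("EPROTO", kvm))
```

Note that in this reading of the hunk, -ETIMEDOUT still reaches the error
handler even with kvm=y, because its case label comes after the added check.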
Comment 17 Jeffrey Singleton 2006-12-05 15:49:47 UTC
Excellent. I will go grab the .19 sources now. But just in case, what do I have
to do manually if I choose to stay with .18?

And do I just run this from /usr/src/linux ?

Thanks for sticking with this even though I broke the bugzilla rule of one bug
per bug report.
Comment 18 Alan Stern 2006-12-06 08:32:02 UTC
You don't "run" the patch file; you do

    cd /usr/src/linux ; patch -p1 <patch_file_name

You can always apply a patch manually.  Just edit the source files listed in the
patch file: lines marked with a '-' should be removed and lines marked with a
'+' should be added.
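
The manual procedure can be sketched as follows. This is a minimal
illustration assuming unified-diff style hunk lines (each starting with a
space, '-' or '+'); the function name is hypothetical, and unlike the real
patch program it does no fuzzy matching or offset searching.

```python
def apply_hunk(source_lines, start, hunk_lines):
    """Apply one hunk body to source_lines and return the new list.

    start is the 1-based line number where the hunk begins in the original.
    Context (' ') and removed ('-') lines must match the source exactly.
    """
    result = list(source_lines[:start - 1])
    pos = start - 1
    for line in hunk_lines:
        marker, text = line[:1], line[1:]
        if marker == "+":
            result.append(text)          # added line: not in the original
        elif marker in (" ", "-"):
            if pos >= len(source_lines) or source_lines[pos] != text:
                raise ValueError(f"hunk does not match at line {pos + 1}")
            if marker == " ":
                result.append(text)      # context line: keep it
            pos += 1                     # '-' line: consume without keeping
        else:
            raise ValueError(f"unexpected hunk line: {line!r}")
    result.extend(source_lines[pos:])
    return result


original = ["a", "b", "c"]
hunk = [" a", "-b", "+B", "+B2", " c"]
print(apply_hunk(original, 1, hunk))
```

Because context and removed lines are compared exactly, a stray space
introduced while copying a patch is enough to make a hunk fail to apply.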
Comment 19 Jeffrey Singleton 2006-12-06 15:53:24 UTC
Awesome, thanks. I will reply again if things don't work out. Otherwise, consider
my part(s) of this bug satisfied. And THANK-YOU!!!
Comment 20 Adam Sulmicki 2006-12-07 11:18:57 UTC
Couldn't this bug be related?

It involves the same keyboard and the same error, but no KVM, though.

http://bugzilla.kernel.org/show_bug.cgi?id=4724
Comment 21 Jeffrey Singleton 2006-12-09 04:48:32 UTC
Hi .. well .. I patched the stable gentoo 2.6.18-r3 sources myself. I received
no errors during the compile, so I guess I did it right. However, even though the
error messages in the log went away, after about 2 hours of inactivity my
keyboard goes dead. Unplugging and replugging it does not help, and neither
does switching to and from other computers.

So I moved to just plain old 2.6.19. When I tried to apply the patch from
/usr/src/linux, this is what I got:

Myth linux # patch -p1 <../usbhid.patch
 
patching file drivers/usb/input/hid-core.c
Hunk #1 FAILED at 54.           
Hunk #2 FAILED at 1074.         
2 out of 2 hunks FAILED -- saving rejects to file drivers/usb/input/hid-core.c.rej


Here is the rej file:

Myth linux # cat drivers/usb/input/hid-core.c.rej
***************                 
*** 54,59 ****                  
  module_param_named(mousepoll, hid_mousepoll_interval, uint, 0644);
  MODULE_PARM_DESC(mousepoll, "Polling interval of mice");
                                
  /*                            
  * Register a new report for a device.
  */                            
--- 54,63 ----                  
  module_param_named(mousepoll, hid_mousepoll_interval, uint, 0644);
  MODULE_PARM_DESC(mousepoll, "Polling interval of mice");
                                
+ static int hid_use_kvm;       
+ module_param_named(kvm, hid_use_kvm, bool, 0444);
+ MODULE_PARM_DESC(kvm, "Ignore errors caused by a KVM");
+                               
  /*                            
  * Register a new report for a device.
  */                            
***************                 
*** 1070,1075 ****              
                case -EILSEQ:           /* protocol error or unplug */
                case -EPROTO:           /* protocol error or unplug */
                case -ETIME:            /* protocol error or unplug */
                case -ETIMEDOUT:        /* Should never happen, but... */
                        clear_bit(HID_IN_RUNNING, &hid->iofl);
                        hid_io_error(hid);                                     
                        
--- 1074,1088 ----
                case -EILSEQ:           /* protocol error or unplug */
                case -EPROTO:           /* protocol error or unplug */
                case -ETIME:            /* protocol error or unplug */
+                       /*
+                        * Many KVMs break the data pathway between the
+                        * host and the device at various times (some do so
+                        * even when the switch is set to connect the host
+                        * to the device).  If one of these things is being
+                        * used then we need to ignore transmission errors.
+                        */
+                       if (hid_use_kvm)
+                               break;
                case -ETIMEDOUT:        /* Should never happen, but... */
                        clear_bit(HID_IN_RUNNING, &hid->iofl);
                        hid_io_error(hid);

What do you think?                              
Comment 22 Jeffrey Singleton 2006-12-09 04:54:08 UTC
Ok, scratch the bottom part; there must have been an extra space or something in
my copy-n-paste into vim. I redid it pasting into nano instead, and the patch
succeeded. So I am compiling now. But I am kind of expecting the same results:
no error messages in the logs, but my keyboard is gonna die after some idle time.

be back soon.
Comment 23 Alan Stern 2006-12-09 13:09:33 UTC
Try capturing a usbmon trace for the keyboard (see the instructions in the
kernel source file Documentation/usb/usbmon.txt).  You may need to leave the
capture running for quite a while (like until the keyboard dies) to get good
information.  

Attach the trace to this bug report.  Include also the dmesg log showing what
happens when the keyboard dies, and what happens when you then unplug the
keyboard and replug it unsuccessfully.
Comment 24 Daniel Drake 2007-04-29 07:14:24 UTC
no response to comment #23, marking as NEEDINFO
Comment 25 Adrian Bunk 2007-04-29 10:46:23 UTC
Please reopen this bug if:
- it is still present with kernel 2.6.21 and
- you can provide the requested information.