Bug 912

Summary: Autorepeat storm on USB keyboard resulting in X/matroxfb intermingling on G400 system
Product: Drivers
Component: Input Devices
Reporter: Nicolas Mailhot (Nicolas.Mailhot)
Assignee: Vojtech Pavlik (vojtech)
Status: CLOSED CODE_FIX    
Severity: normal    
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.5.75-bk1 (and probably a few versions before) to 2.6.0-test5
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: Screenshot
Kernel config
Same kernel boot dmesg without problems
Same kernel interrupts (not with the bug - sorry)
lspci
lsusb
sysfs dump
XF86Config

Description Nicolas Mailhot 2003-07-12 01:19:46 UTC
Distribution: Red Hat Raw Hide

Hardware Environment: http://www.giga-byte.com/products/7vax.htm (Via KT 400 +
Via VT 8235 + Realtek RTL8100BL), F10 bios, matrox g400, pure ehci input (usb1
mouse & keyboard on external ehci hub), Hercules Fortissimo III 7.1 ...

Software Environment: XFree86-4.3.0-15.1, gcc-3.3-12

Problem Description:

This is an intermittent problem I've had in the last months with whatever 2.5
kernel I was testing at the time. It is rare, does not manifest itself under
any particular load, and renders the computer totally unusable (must do a system
reset to get it back). I've filed it under usb even though it might be a
console/fb/input/whatever bug.

I think I've hit this about half a dozen times already. This time I managed to
photograph the screen.

Suddenly the keyboard locks & the mouse cursor is shadowed by the console cursor (i.e.
the console mouse cursor appears in the middle of the screen and follows the X
cursor movements at a distance). After a few seconds (I think), keyboard input
switches to autorepeat and proceeds to fill whatever field was focused at the
time with garbage. Keyboard stays locked, hitting num lock for example won't
change the num lock status.

Steps to reproduce:

Unknown. Today I had evo open and was sending a message, and this was about the
only activity on the system. It's too rare to get any occurrence pattern - I'm
typing this on the very same kernel that locked for example.

It might be matroxfb related: I don't think I had it when matroxfb was broken
(but then lots of other changes have gone into the kernel since then too).
Comment 1 Nicolas Mailhot 2003-07-12 01:22:47 UTC
Created attachment 507
Screenshot
Comment 2 Nicolas Mailhot 2003-07-12 01:23:48 UTC
Created attachment 508
Kernel config
Comment 3 Nicolas Mailhot 2003-07-12 01:25:47 UTC
Created attachment 509
Same kernel boot dmesg without problems
Comment 4 Nicolas Mailhot 2003-07-12 01:26:45 UTC
Created attachment 510
Same kernel interrupts (not with the bug - sorry)
Comment 5 Nicolas Mailhot 2003-07-12 01:28:08 UTC
Created attachment 511
lspci
Comment 6 Nicolas Mailhot 2003-07-12 01:30:22 UTC
Created attachment 512
lsusb
Comment 7 Nicolas Mailhot 2003-07-12 01:32:05 UTC
Created attachment 513
sysfs dump
Comment 8 Nicolas Mailhot 2003-07-12 01:34:14 UTC
Created attachment 514
XF86Config
Comment 9 Greg Kroah-Hartman 2003-07-12 20:28:38 UTC
I'm going to reject this for now, as there's no real certainty that this
is a kernel bug.

If you can run a serial console and catch a kernel oops, then please add it to 
this bug and re-open it.
Comment 10 Nicolas Mailhot 2003-07-13 03:16:21 UTC
I'm fairly certain this *is* a kernel bug or a bug in userspace triggered by
2.4->2.5 changes.

Despite regularly testing 2.5 I still use 2.4 more often, and I've never had
this with "stable" kernels, even with strange nptl-enabled rawhide ones.

I'll try to run only this kernel now to see if I can reproduce it and get a
trace - yesterday I had a second one and this time logging out of X with the
mouse restored the system. No oops in messages, though.
Comment 11 Greg Kroah-Hartman 2003-07-13 11:22:17 UTC
This really sounds like a XFree issue.

Please leave this bug closed until you get proof that it's a kernel issue.
An oops message would be the best proof.
Comment 12 Nicolas Mailhot 2003-07-28 13:05:52 UTC
Bug cloned in XFree86 bugzilla as
http://bugs.xfree86.org/show_bug.cgi?id=532
Comment 13 Nicolas Mailhot 2003-09-15 01:37:29 UTC
I'm reopening this bug since investigation with the XFree people hasn't turned
up anything so far and one of the symptoms is a stuck keyboard (which a lot of
other people have been plagued with too).
Comment 14 Nicolas Mailhot 2003-09-15 01:38:27 UTC
Seems more input than usb related -> Vojtech Pavlik
Comment 15 Nicolas Mailhot 2003-09-22 14:50:32 UTC
Well, I can confirm the root is an input bug.
Switching the graphics card from a G400 to a Radeon 9200 cured the console/X
incestuous relationship.

However the other symptom (keyboard on autorepeat then dead, needs X restart to
be cured) is still present on the setup.

It seems the autorepeat storm was triggering something in matroxfb/mga; without
it I don't think it would be that easy to hit.
Comment 16 Nicolas Mailhot 2004-01-20 13:47:51 UTC
Seems the latest 2.6.1-mm kernels have some magic that fixes this (or makes it a
lot less frequent? Still testing to see if it's really gone)
Comment 17 Vojtech Pavlik 2004-01-23 15:22:28 UTC
Well, a lot of magic went into 2.6.1-mm and 2.6.2-pre. Anyway, if your USB
keyboard dies (EMI causes the hub to disable it or similar), and a key is
pressed at that time, the only chance to stop the autorepeat is to pull
out the USB keyboard, plug it in again and press any key. Does that help?
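
To illustrate the failure mode described in this comment (a key release that never reaches the input layer because the keyboard dropped off the bus), here is a minimal Python sketch of timer-driven software autorepeat. It is a toy model, not the kernel's input code: the function name `emitted_keystrokes`, the event tuples, and the 250 ms delay / 33 ms period are assumptions chosen for illustration.

```python
from typing import List, Tuple


def emitted_keystrokes(events: List[Tuple[int, str]], until_ms: int,
                       delay_ms: int = 250, period_ms: int = 33) -> List[int]:
    """Return the timestamps (ms) at which a keystroke is delivered.

    `events` holds (time_ms, kind) pairs with kind "press" or "release".
    A press delivers one keystroke and arms the repeat timer; only a
    release disarms it. Timings are illustrative assumptions.
    """
    pending = sorted(events)
    out: List[int] = []
    next_repeat = None  # time of the next scheduled repeat, None if no key is held
    i = 0
    while True:
        next_event = pending[i][0] if i < len(pending) else None
        candidates = [t for t in (next_event, next_repeat) if t is not None]
        if not candidates:
            break  # nothing left to process and no key is held
        now = min(candidates)
        if now > until_ms:
            break  # end of the simulated time window
        if next_event is not None and next_event == now:
            kind = pending[i][1]
            i += 1
            if kind == "press":
                out.append(now)
                next_repeat = now + delay_ms
            elif kind == "release":
                next_repeat = None  # the release is what stops the repeat
        else:
            out.append(now)  # repeat timer fired
            next_repeat = now + period_ms
    return out


# Normal case: the release arrives, so the repeat stops.
normal = emitted_keystrokes([(0, "press"), (300, "release")], until_ms=1000)
print(len(normal))  # 3

# Keyboard "dies" while the key is down: the release is never seen,
# so keystrokes keep being generated until something resets the state.
stuck = emitted_keystrokes([(0, "press")], until_ms=1000)
print(len(stuck))  # 24
```

In this model the only way out of the second case is a fresh press/release pair from the device, which matches the replug-and-press-any-key suggestion above.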
Comment 18 Nicolas Mailhot 2004-01-24 04:18:43 UTC
The only thing that helped was to close X with the mouse and have gdm
reinitialize the resources (and then not always). Lately <- helped a little,
but that was not always the case.

With recent mainline kernels it was getting so bad that sometimes the autorepeat
started when I was mousing only, without any hand on the keyboard (seems
input/usb was getting really confused).

Anyway I'm using -mm only for now. If I don't have a single autorepeat after two
weeks I'll close the bug.
Comment 19 Vojtech Pavlik 2004-02-01 04:39:46 UTC
So, what's the current status? How's 2.6.2-rc?
Comment 20 Nicolas Mailhot 2004-02-01 04:58:14 UTC
Didn't have a single problem occurrence since switching to 2.6.1-mm:

uname -a
Linux rousalka.dyndns.org 2.6.2-rc1-mm3-rous2 #1 Sun Jan 25 15:31:13 CET 2004
i686 athlon i386 GNU/Linux

So either it's fixed or it was made very unlikely by code changes.
I will wait a week still before closing it - I've never been able to pinpoint
exactly what caused the bug and there have been periods of relative lull before
(never so long though)

Do you want me to finish my testing using mainline kernels?
At the time I switched to -mm they were clearly broken and mm wasn't, though
maybe the related changes have been merged since.
Comment 21 Vojtech Pavlik 2004-02-01 05:45:18 UTC
Yes, please try with 2.6.2-rc3. It should contain all the -mm fixes.
Comment 22 Nicolas Mailhot 2004-02-01 10:24:01 UTC
I've started testing 2.6.2-rc3-bk1 now
Comment 23 Nicolas Mailhot 2004-02-04 14:28:11 UTC
Still no problem - I'll mark this one fixed.
I hope there will be no need for reopening it in the future.