Bug 7291

Summary: messed up keyboard events
Product: Drivers Reporter: David Gerber (dg-kernel-bug)
Component: Input Devices Assignee: drivers_input-devices
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: andi-bz, frank, john.stultz, Matt_Domsch
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.19-rc1 Subsystem:
Regression: --- Bisected commit-id:
Attachments: dmesg output
dmidecode output
debug patch
dmesg with debug patch
Fix typo in C3 test

Description David Gerber 2006-10-09 07:12:32 UTC
Most recent kernel where this bug did not occur: 
none found
Distribution: 
ubuntu edgy
Hardware Environment: 
Dell Inspiron 9400 (aka E1705), Intel Core 2 T7200, Radeon x1400
Software Environment:
Problem Description:
When X11 is loaded, keyboard events get messed up. For example I press 'a' and 
I see 'aaa' on the screen. This effectively renders the keyboard unusable. I 
can't seem to reproduce that effect in either the console or gdm, though.

If I boot the kernel with 'noapic' there's no such symptom and the keyboard is 
fine. On the other hand, one of the CPU cores gets disabled.

Steps to reproduce:
Boot, log into X11, try to type something into any text field. It's not 
possible to predict when the sticky keys effect will happen but it usually 
happens quickly (less than 30 seconds).

It's also possible to reproduce that with a plain ubuntu distribution 'live' 
CD and the vesa driver so that means no proprietary drivers get loaded.
Comment 1 David Gerber 2006-10-09 07:15:22 UTC
Created attachment 9190 [details]
dmesg output

dmesg output (I also tested WITHOUT 'fglrx' loaded and the bug happens as well)
Comment 2 David Gerber 2006-10-09 07:16:36 UTC
Created attachment 9191 [details]
dmidecode output
Comment 3 Dmitry Torokhov 2006-10-09 07:34:39 UTC
Does booting with "nolapic" help?
Comment 4 David Gerber 2006-10-09 08:05:02 UTC
No.
With 'nolapic', both cores are enabled and the bug shows up.
Comment 5 Tarmo T 2006-10-09 18:16:08 UTC
I'm not sure if I'm seeing the exact same issue but when using 2.6.18 or
2.6.18-mm1 the keyboard works fine for several hours and then either just stops
working or gets stuck on one key that was last pressed, never releasing that key
and not generating events for any other key that is pressed.

I'm using a lesser-known brand (Sabio Digital model "SD-MW1 PM") laptop with a
1.83GHz Pentium-M Dothan and an nvidia geforce 6600 GO. I haven't tried yet with
'noapic' nor 'nolapic'.

Is it likely that my issue is the same one that this bug report refers to?
Comment 6 David Gerber 2006-10-09 19:00:56 UTC
I don't think so. The bug shows up pretty quickly and effectively renders the
machine unusable (if you use X11, that is.. all the rest seems to work fine).
Comment 7 Frank Sorenson 2006-10-09 19:37:32 UTC
I also see this keyboard stuttering in X.  I've got a Dell Inspiron E1705, like
David Gerber, however booting with 'nolapic' solves the keyboard-repeat problem
for me (but disables one of the CPU cores).

When the extra keyboard events are coming through, I see interrupt 1 in
/proc/interrupts incrementing by the number of characters that X sees, not the
number of key events I really generated:

         CPU0       CPU1
1:       3666       3713    IO-APIC-edge  i8042 

With 'nolapic', all the interrupt triggers are XT-PIC, and no extra interrupts
are received.
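
For anyone who wants to repeat this check, here is a minimal sketch that sums the per-CPU counts for one IRQ line from /proc/interrupts-style text. The function name irq_total and the SAMPLE string are illustrative assumptions; the sample simply reuses the two lines quoted above.

```python
def irq_total(interrupts_text: str, irq: str = "1") -> int:
    """Sum the per-CPU counts on one IRQ line of /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != irq:
            continue  # header line or a different IRQ
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # first non-numeric field starts the controller/device names
            total += int(field)
        return total
    raise KeyError(f"IRQ {irq} not found")


# Sample copied from the output quoted above (hypothetical test data, not a live read).
SAMPLE = """\
         CPU0       CPU1
1:       3666       3713    IO-APIC-edge  i8042
"""

if __name__ == "__main__":
    print(irq_total(SAMPLE))
```

On a live system the text would come from reading /proc/interrupts before and after typing the same short string. A single keystroke normally raises more than one interrupt (at least press and release), so the useful comparison is the delta with and without a given boot option rather than an absolute number.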
Comment 8 Dmitry Torokhov 2006-10-09 19:42:24 UTC
David, I wonder if it is a timer issue. What 
does "cat /sys/devices/system/clocksource/clocksource0/*" show on your system?
Comment 9 David Gerber 2006-10-09 20:13:02 UTC
It shows:
jiffies
jiffies
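
The two lines are presumably the contents of available_clocksource and current_clocksource, in the order the shell glob expands them. A minimal sketch for reading them separately follows; the helper names parse_clocksource and read_clocksource are assumptions made for illustration, and the directory is the one given in the previous comment.

```python
from pathlib import Path

# Directory taken from the command quoted in the thread.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def parse_clocksource(available_text: str, current_text: str) -> dict:
    """Turn the raw contents of the two sysfs files into a small summary."""
    available = available_text.split()
    current = current_text.strip()
    return {
        "available": available,
        "current": current,
        # True when jiffies is the only registered clocksource
        "jiffies_only": available == ["jiffies"],
    }


def read_clocksource(base: Path = CLOCKSOURCE_DIR) -> dict:
    """Read both files if present; missing files are treated as empty."""
    def read(name: str) -> str:
        path = base / name
        return path.read_text() if path.exists() else ""

    return parse_clocksource(read("available_clocksource"),
                             read("current_clocksource"))


if __name__ == "__main__":
    print(read_clocksource())
```

Given the output above, parse_clocksource("jiffies\n", "jiffies\n") reports jiffies as both the only available and the current clocksource.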
Comment 10 Dmitry Torokhov 2006-10-09 21:56:19 UTC
Your clocksource is junk :(

1. What is your CONFIG_NR_CPUS in .config?
2. Can I please see /proc/cpuinfo
3. Please try enabling CONFIG_X86_PM_TIMER
Comment 11 David Gerber 2006-10-09 22:10:51 UTC
1. 2
2. 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU         T7200  @ 2.00GHz
stepping        : 6
cpu MHz         : 1000.000
cache size      : 4096 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc
pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 3999.20
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(this is with 'noapic' though, tell me if you want one with apic enabled.. but I
think it's pretty much the same except there are 2 cores)

3. CONFIG_X86_PM_TIMER is already enabled
Comment 12 john stultz 2006-10-10 10:45:19 UTC
I know it looks odd that there is only the jiffies clocksource installed, but
that is because x86_64 hasn't been moved to use the generic timekeeping code yet
(I'm working on it :)

Note in the dmesg:
[    0.000000] time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.

That means it's using the TSC. You could try booting w/ "notsc" to see if forcing
the acpi_pm to be used changes the behavior. 

Actually, the fact "noapic" fixes it makes me suspicious it is C3 halting
related. You can also try "idle=poll" (this disables idle power management, so
for your battery's sake, I wouldn't recommend this as a workaround) to see if
that resolves it as well.

Comment 13 David Gerber 2006-10-10 11:54:53 UTC
'notsc' fixed the problem here.
Comment 14 David Gerber 2006-10-15 05:38:48 UTC
I forgot to mention that 'idle=poll' does not fix the problem.
But it seems 'notsc' has a serious side effect. When I close my laptop's lid,
the system freezes (complete freeze, e.g. sound buffers loop). Happens with the
acpid daemon disabled too.
Comment 15 john stultz 2006-10-16 10:56:10 UTC
David: Did this occur w/ 2.6.18? What was the last kernel that worked ok for you?

So re-reading the bug, it seems the "noapic" option only avoided the issue by
forcing the system into UP mode, and the issue was not related to the C3 halting.

Regarding the laptop lid closing issue, do you have your system setup to
suspend/hibernate on lid close? That may be a separate issue and might need a
new bug to be opened.
Comment 16 Frank Sorenson 2006-10-16 12:48:01 UTC
Not David, but I've been having the same keyboard issues.  The problem
definitely occurs with 2.6.18.  I tried 2.6.17 (and saw the problems), but I
haven't tried any earlier kernels.

So far, the following all seem to fix the problem for me: noapic, nolapic,
idle=poll, notsc

However, simply starting with 'nosmp' does not solve the problem (and in fact,
will not boot for me).

Frank
Comment 17 David Gerber 2006-10-17 11:58:54 UTC
John: that was with 2.6.17. I never tried with anything lower than that. I 
bought this laptop quite recently.

The lid closing issue also happens when I boot with 'single' and make sure 
there's no acpi daemon running so there shouldn't be any script running when I 
do that (it's not configured to hibernate/suspend anyway). Should I file a new 
bug report against the ACPI category? I don't know if disabling the TSC is 
supposed to work for that stuff, though.

Frank: 'nosmp' hangs for me too.
Comment 18 john stultz 2006-10-17 18:09:20 UTC
David: Sorry to ask for clarification, but you're saying 2.6.17 worked fine without
any boot options, right?  Did the lid-closing issue show up there? The
lid-closing part is probably a separate bug, and should be opened independently
(please add me to the CC).

Have you tried 2.6.18 to see if it has the same issue as well?

Frank: Your symptoms are so similar, but yet quite distinct (idle=poll works for
you, but not David, same w/ lapic, and 2.6.17 is broken for you but not David).
Since you're both on the same hardware, could you both post your BIOS versions
here to see if that's a factor?

Also, just to be clear, both of you are running x86_64 kernels (not i386), right? 

Andi: I'm really not sure what the deal is here. Unless it's one of those C2=>C3
BIOS quirks, I'm not sure why the TSC would be skewing here and still be selected.
Comment 19 David Gerber 2006-10-17 18:16:57 UTC
John: 2.6.17 does need 'notsc' to work. And the lid closing issue shows up here
but only with 'notsc'.

I didn't try 2.6.18 but 2.6.19-rc1 and it needs 'notsc'. I didn't try the lid
closing with it yet. Will do so when filing a new bug report for the lid bug.

BIOS version is A03 for me and I'm running x86_64 kernels.
Comment 20 Frank Sorenson 2006-10-18 16:16:19 UTC
John: I went back and double-checked idle=poll, and I find that perhaps I did
not test extensively enough before.  idle=poll does not fix the keyboard events,
though I think it may reduce them compared with passing no argument at all.

I know for certain that 'nolapic' and 'notsc' both fix the problem.

I will go back and double-check with 2.6.17 to make sure on that count.

I am also running BIOS version A03.
Comment 21 Grant Likely 2006-10-20 22:33:09 UTC
I can confirm this problem on ubuntu-amd64 kernel 2.6.17-10-generic (I know; 
not mainline; sorry but it's what I'm running).

notsc solves the extra keyboard repeat issue, but still leaves delayed 
responses to keypresses.

MB: Asus A8N-VM CSM, BIOS v1001

processor       : 0/1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 43
model name      : AMD Athlon(tm)64 X2 Dual Core Processor  3800+
stepping        : 1
cpu MHz         : 1000.000
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0/1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 
3dnowext 3dnow up pni lahf_lm cmp_legacy
bogomips        : 2011.62
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
Comment 22 David Gerber 2006-10-21 04:49:22 UTC
Grant:
Guess what.. I have 2 of those boards and it was a pain to get them to work 
properly. First, there's a version 1002 beta BIOS which is the one I use. 
Second, I never got the stock Ubuntu (2.6.17) kernel to work properly 
without 'noapic' and didn't know about 'notsc' at the time but there were 
plenty of other issues like missing interrupts for one of my 3com network cards 
and the other box had random lockups.

To solve it, I had to use a custom built kernel with the following config ( 
http://zapek.com/misc/config-a8nvmcsm.txt ). You might want to try this.

64-bit seems to rule :)
Comment 23 Andi Kleen 2006-10-21 07:00:15 UTC
Can we please see the output of cat /proc/acpi/processor/*/* ?
Desktop systems normally don't support C3 and also there is code
in x86-64 to disable TSC automatically when C3 is supported.

Ok, we have had cases where motherboards faked C3 as C2, 
maybe that is the problem.
Comment 24 Frank Sorenson 2006-10-21 09:11:41 UTC
Here's the output from my Dell Inspiron E1705:

cat /proc/acpi/processor/*/*
processor id:            0
acpi id:                 0
bus mastering control:   yes
power management:        yes
throttling control:      yes
limit interface:         yes
active limit:            P0:T0
user limit:              P0:T0
thermal limit:           P0:T0
active state:            C3
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 2000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[001] usage[00000010] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[001] usage[02853700] duration[00000000005592946318]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[057] usage[96456241] duration[00000000196840067209]
state count:             8
active state:            T0
states:
   *T0:                  00%
    T1:                  12%
    T2:                  25%
    T3:                  37%
    T4:                  50%
    T5:                  62%
    T6:                  75%
    T7:                  87%
processor id:            1
acpi id:                 1
bus mastering control:   yes
power management:        yes
throttling control:      yes
limit interface:         yes
active limit:            P0:T0
user limit:              P0:T0
thermal limit:           P0:T0
active state:            C3
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 2000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[001] usage[00000010] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[001] usage[02767764] duration[00000000005358743251]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[057] usage[94040078] duration[00000000196635776570]
state count:             8
active state:            T0
states:
   *T0:                  00%
    T1:                  12%
    T2:                  25%
    T3:                  37%
    T4:                  50%
    T5:                  62%
    T6:                  75%
    T7:                  87%
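
Since the question in this thread is whether the machine exposes C3, a minimal sketch for pulling the C-state entries out of output like the above may be useful. The names cstates, has_c3 and SAMPLE are illustrative assumptions; the code only works on text that has already been read, and it matches on the type[...] field so it does not depend on how the long lines are wrapped.

```python
import re

# Matches lines such as "   *C3:   type[C3] ..."; the leading '*' marks the active state.
STATE_RE = re.compile(r"^\s*(\*?)(C\d+):\s+type\[(C\d+)\]", re.MULTILINE)


def cstates(power_text: str) -> list:
    """Return (name, type, is_active) for every C-state line found."""
    return [(name, ctype, star == "*")
            for star, name, ctype in STATE_RE.findall(power_text)]


def has_c3(power_text: str) -> bool:
    """True if any listed state is reported as type C3."""
    return any(ctype == "C3" for _, ctype, _ in cstates(power_text))


# Excerpt of the output above for processor 0 (trailing fields trimmed for brevity).
SAMPLE = """\
active state:            C3
max_cstate:              C8
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[001]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[001]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[057]
"""

if __name__ == "__main__":
    print(cstates(SAMPLE))
    print(has_c3(SAMPLE))
```

For the output above this lists C1, C2 and C3, with C3 marked as the active state.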
Comment 25 Andi Kleen 2006-10-21 09:52:42 UTC
Ok, yours is a laptop, but you're different from the original poster,
whose machine was a desktop, which shouldn't have C3. Laptops are supposed to have it.

Can the original reporter please add the information i requested?

(it would be easier if each of you with different machines opened a new 
bug. I suspect we have a couple of completely different issues mixed
here that just happen to show similar symptoms) 

Anyway, you need notsc for your laptop to boot a 64-bit kernel, right?
If yes can you please post a dmesg of this without notsc. Thanks.
If you don't need notsc or don't use 64bit please open a new bug.

AMD bugs should also go elsewhere because they're likely unrelated too.

Comment 26 Frank Sorenson 2006-10-21 10:15:44 UTC
No, the original poster of this bug was also reporting a laptop (the same laptop
as I have--Dell Inspiron 9400/E1705), and he posted his dmesg output when he
filed the bug.
Comment 27 Andi Kleen 2006-10-21 10:52:14 UTC
Created attachment 9322 [details]
debug patch

Ok. 

Can you please send dmesg with the following debug patch applied
(and notsc not specified manually)? Thanks.
Comment 28 Frank Sorenson 2006-10-21 12:13:49 UTC
Created attachment 9323 [details]
dmesg with debug patch
Comment 29 Andi Kleen 2006-10-21 12:29:11 UTC
Created attachment 9324 [details]
Fix typo in C3 test

Ah, there was a typo in the C3 test. Does this patch help? (revert the debug
patch first)
Comment 30 Frank Sorenson 2006-10-21 22:57:54 UTC
This patch does seem to fix the issue for me.
Comment 31 Andi Kleen 2006-10-22 06:00:53 UTC
Thanks for testing. I submitted it for 2.6.19


Fixed, but I can't close the bug. Owner please do that.
Comment 32 David Gerber 2006-10-22 07:30:42 UTC
Works for me too. Marking the bug as closed. Many thanks to everyone.