Bug 9327 (ECINTBOUNCING)

Summary: 2.6.24-rc2 (esthetic?) regression: auto select interrupt bouncing
Product: ACPI Reporter: Romano Giannetti (romano.giannetti)
Component: ECAssignee: Alexey Starikovskiy (astarikovskiy)
Status: CLOSED CODE_FIX    
Severity: normal CC: acpi-bugzilla, astarikovskiy, romano.giannetti, spamcan
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24-rc2 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 9243    
Attachments: .config
add printk to see protocol
result of lshw command
output of lspci -vvx
DSDT, created with cat /proc/acpi/dsdt > dsdt.hex
gzipped dmesg of 2.6.34 with the debug patch applied
Disable GPE mode on broken hardware
Try to workaround "optimized" controllers
dmesg-2.6.24-rc3-git1-debug1.txt
dmesg-2.6.24-rc3-git1-debug1-workaround1.txt
debug info patch
Workaround optimized controllers
dmesg-2.6.24-rc3-git1-debug2-workaround2-debug-mode.txt

Description Romano Giannetti 2007-11-08 04:30:42 UTC
Most recent kernel where this bug did not occur: 2.6.24-rc1
Distribution: Ubuntu Feisty
Hardware Environment: Laptop Toshiba Satellite U305-S5077
Problem Description:

After the ACPI changes between 2.6.24-rc1 and -rc2 I have my logs
"spammed" (every 2-3 seconds) by: 

[  423.112903] ACPI: EC: missing IBF_1 confirmations,switch off interrupt mode.
[  423.113020] ACPI: EC: non-query interrupt received, switching to interrupt mode
[  426.078972] ACPI: EC: missing IBF_1 confirmations,switch off interrupt mode.
[  426.079645] ACPI: EC: non-query interrupt received, switching to interrupt mode
[  426.622773] ACPI: EC: missing IBF_1 confirmations,switch off interrupt mode.
[  426.622889] ACPI: EC: non-query interrupt received, switching to interrupt mode

It seems that no harm is done, apart (but it could be another thing)
that gnome-panel is much slower on startup.
Comment 1 Romano Giannetti 2007-11-08 04:31:50 UTC
Created attachment 13458 [details]
.config
Comment 2 Alexey Starikovskiy 2007-11-08 05:38:21 UTC
Created attachment 13460 [details]
add printk to see protocol

Please make a log with this patch applied.
Also, what hardware are you using?
Comment 3 Romano Giannetti 2007-11-08 08:29:12 UTC
I will try as soon as possible your debug patch (I will be at a conference in Munich till monday, so I don't know how responsive I'll be). My hardware is a Toshiba Satellite U305, I'll attach the output of lshw, lspci -vvx and /proc/acpi/dsdt to this bug. Tell me if you need more info.
Comment 4 Romano Giannetti 2007-11-08 08:29:51 UTC
Created attachment 13464 [details]
result of lshw command
Comment 5 Romano Giannetti 2007-11-08 08:30:21 UTC
Created attachment 13465 [details]
output of lspci -vvx
Comment 6 Romano Giannetti 2007-11-08 08:31:44 UTC
Created attachment 13466 [details]
DSDT, created with cat /proc/acpi/dsdt > dsdt.hex
Comment 7 Romano Giannetti 2007-11-08 14:47:06 UTC
I did the test...

[  163.918487] ACPI: EC: transaction start                                                                                                                             
[  163.918588] ACPI: EC: ---> status = 0x 8                                                                                                                            
[  163.918661] ACPI: EC: ---> status = 0x 8                                                                                                                            
[  163.918667] ACPI: EC: <--- data = 0xe0                                                                                                                              
[  164.415613] ACPI: EC: ---> status = 0x 0                                                                                                                            
[  164.415624] ACPI: EC: missing IBF_1 confirmations,switch off interrupt mode.                                                                                        
[  164.415630] ACPI: EC: <--- data = 0x 0                                                                                                                              
[  164.415638] ACPI: EC: ---> status = 0x 2                                                                                                                            
[  164.415644] ACPI: EC: ---> status = 0x 2                                                                                                                            
[  164.415650] ACPI: EC: ---> status = 0x 2                                                                                                                            
[  164.415655] ACPI: EC: ---> status = 0x 2                                                                                                                            
[  164.415661] ACPI: EC: ---> status = 0x 0                                                                                                                            
[  164.415665] ACPI: EC: transaction end                                                                                                                               
[  164.415737] ACPI: EC: ---> status = 0x 0                                                                                                                            
[  164.415741] ACPI: EC: non-query interrupt received, switching to interrupt mode                                                                                     
[  167.197188] ACPI: EC: ---> status = 0x 0                                                                                                                            
[  167.197202] ACPI: EC: <--- command = 0x80 

I will attach the full dmesg (gzipped).
Comment 8 Romano Giannetti 2007-11-08 14:49:19 UTC
Created attachment 13474 [details]
gzipped dmesg of 2.6.34 with the debug patch applied
Comment 9 Alexey Starikovskiy 2007-11-09 12:12:43 UTC
Thanks!
Your controller fails to send interrupt after address (first <---data) in write (command=0x81) transaction, but sends interrupts for every other byte.
Comment 10 Alexey Starikovskiy 2007-11-09 14:35:09 UTC
Created attachment 13489 [details]
Disable GPE mode on broken hardware

Please check if this patch helps.
Comment 11 Romano Giannetti 2007-11-12 03:24:49 UTC
It is triggered; and it seems to works ok (but the laptop worked without the patch, too). Only now I have a *lot* more debugging messages from EC, several per second; I imagine this is normal, since now the driver is polling. Excerpts of dmesg:

Nov 12 12:07:57 rukbat kernel: [   11.450187] ACPI: EC: Look up EC in DSDT
Nov 12 12:07:57 rukbat kernel: [   11.451523] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.451592] ACPI: EC: <--- command = 0x80
Nov 12 12:07:57 rukbat kernel: [   11.451660] ACPI: EC: transaction start
Nov 12 12:07:57 rukbat kernel: [   11.451727] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.451815] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.451880] ACPI: EC: your hardware is broken in some unexpected way, disabling interrupt mode forever.
Nov 12 12:07:57 rukbat kernel: [   11.452001] ACPI: EC: <--- data = 0x65
Nov 12 12:07:57 rukbat kernel: [   11.452072] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.452146] ACPI: EC: ---> status = 0x 1
Nov 12 12:07:57 rukbat kernel: [   11.452232] ACPI: EC: ---> status = 0x 1
Nov 12 12:07:57 rukbat kernel: [   11.452304] ACPI: EC: ---> data = 0x 5
Nov 12 12:07:57 rukbat kernel: [   11.452371] ACPI: EC: transaction end
Nov 12 12:07:57 rukbat kernel: [   11.452440] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.452506] ACPI: EC: <--- command = 0x81
Nov 12 12:07:57 rukbat kernel: [   11.452574] ACPI: EC: transaction start
Nov 12 12:07:57 rukbat kernel: [   11.452659] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.452730] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.452796] ACPI: EC: <--- data = 0x65
Nov 12 12:07:57 rukbat kernel: [   11.452865] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.452932] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.452998] ACPI: EC: <--- data = 0x 5
Nov 12 12:07:57 rukbat kernel: [   11.453067] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.453151] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.453223] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.453289] ACPI: EC: transaction end
Nov 12 12:07:57 rukbat kernel: [   11.453473] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.453538] ACPI: EC: <--- command = 0x80
Nov 12 12:07:57 rukbat kernel: [   11.453606] ACPI: EC: transaction start
Nov 12 12:07:57 rukbat kernel: [   11.453690] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.453762] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.453828] ACPI: EC: <--- data = 0xf0
Nov 12 12:07:57 rukbat kernel: [   11.453898] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.453980] ACPI: EC: ---> status = 0x 1
Nov 12 12:07:57 rukbat kernel: [   11.454053] ACPI: EC: ---> status = 0x 1
Nov 12 12:07:57 rukbat kernel: [   11.454121] ACPI: EC: ---> data = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.454188] ACPI: EC: transaction end
Nov 12 12:07:57 rukbat kernel: [   11.454257] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.454323] ACPI: EC: <--- command = 0x81
Nov 12 12:07:57 rukbat kernel: [   11.454390] ACPI: EC: transaction start
Nov 12 12:07:57 rukbat kernel: [   11.454475] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.454544] ACPI: EC: ---> status = 0x 8
Nov 12 12:07:57 rukbat kernel: [   11.455592] ACPI: EC: <--- data = 0xf0
Nov 12 12:07:57 rukbat kernel: [   11.455661] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.455728] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.455794] ACPI: EC: <--- data = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.455865] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.455932] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456000] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456068] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456135] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456202] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456270] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456337] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456405] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456473] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456540] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456608] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456675] ACPI: EC: ---> status = 0x 2
Nov 12 12:07:57 rukbat kernel: [   11.456743] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.456826] ACPI: EC: ---> status = 0x 0
Nov 12 12:07:57 rukbat kernel: [   11.456896] ACPI: EC: transaction end

and then forever like that. I have recompiled without the debug patch and will reboot. If you need more info tell me. 

Thanks
Comment 12 Alexey Starikovskiy 2007-11-12 07:59:42 UTC
Thanks, debug patch is not needed any longer. Last patch will go into mainline.
Comment 13 Alexey Starikovskiy 2007-11-16 12:24:38 UTC
Created attachment 13581 [details]
Try to workaround "optimized" controllers

Please try if the following patch helps too...
Comment 14 Rafael J. Wysocki 2007-11-16 15:11:58 UTC
commit 66c5f4e7367b0085652931b2f3366de29e7ff5ec
Author: Alexey Starikovskiy <astarikovskiy@suse.de>
Date:   Mon Oct 22 14:18:56 2007 +0400

    ACPI: EC: Add workaround for "optimized" controllers

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=66c5f4e7367b0085652931b2f3366de29e7ff5ec
Comment 15 Alexey Starikovskiy 2007-11-16 15:14:27 UTC
Rafael, this is different.
Comment 16 Romano Giannetti 2007-11-18 04:30:35 UTC
Hmmm,

sorry, I lost contact... :-). Should I try the last patch on top of 2.6.24-rc3? It seems that it's not in. Thanks...
Comment 17 Alexey Starikovskiy 2007-11-18 04:57:28 UTC
Yes, would be great if you do. Len didn't like the first patch :)
Comment 18 Romano Giannetti 2007-11-18 09:31:27 UTC
OK, tried it. It seems it works. If you like you can add a 
Tested-by: Romano Giannetti <romano.giannetti@gmail.com>

Thanks again!
Comment 19 Len Brown 2007-11-18 22:06:29 UTC
f2d68935ba08cf80f151bbdb5628381184e4a498
(ACPI: EC: Workaround for optimized controllers)
applied to acpi test branch
Comment 20 Mats Johannesson 2007-11-20 01:57:36 UTC
The patch does _not_ fix my hardware x86_64 (amd64 Acer Aspire 1524 notebook):

Nov 20 10:23:34 sleipner kernel: ACPI: EC: missing confirmations,switch off interrupt mode.
Nov 20 10:23:34 sleipner kernel: ACPI: EC: non-query interrupt received, switching to interrupt mode
Nov 20 10:24:03 sleipner kernel: ACPI: EC: missing confirmations,switch off interrupt mode.
Nov 20 10:24:03 sleipner kernel: ACPI: EC: non-query interrupt received, switching to interrupt mode
Nov 20 10:24:04 sleipner kernel: ACPI: EC: missing confirmations,switch off interrupt mode.
Nov 20 10:24:04 sleipner kernel: ACPI: EC: non-query interrupt received, switching to interrupt mode
Nov 20 10:24:33 sleipner kernel: ACPI: EC: missing confirmations,switch off interrupt mode.
Nov 20 10:24:33 sleipner kernel: ACPI: EC: non-query interrupt received, switching to interrupt mode
Nov 20 10:24:34 sleipner kernel: ACPI: EC: missing confirmations,switch off interrupt mode.
Nov 20 10:24:34 sleipner kernel: ACPI: EC: non-query interrupt received, switching to interrupt mode

And let me add that this is a real regression - not only cosmetic - since the messages prevent the HD from switching off. Possibly other drawbacks as well.

I'll read up on this whole bug-page and return with requested info etc.
Comment 21 Alexey Starikovskiy 2007-11-20 03:20:03 UTC
Mats, I will need the dmesg with the debug patch from comment #2 applied.
Comment 22 Mats Johannesson 2007-11-20 07:14:58 UTC
Created attachment 13647 [details]
dmesg-2.6.24-rc3-git1-debug1.txt

OK, this one is with your debug patch only.
Comment 23 Mats Johannesson 2007-11-20 07:16:31 UTC
Created attachment 13648 [details]
dmesg-2.6.24-rc3-git1-debug1-workaround1.txt

And this one is with both the debug and the workaround patch applied.
Comment 24 Mats Johannesson 2007-11-20 07:25:35 UTC
Command line option i8042.nomux is to prevent log spamming when accessing eg /proc/acpi/thermal_zone/*/temp* and at the same time using the touchpad (old, old bug which never will be fixed). That spam is in the form of:

psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 4
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 - driver resynched.
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: issuing reconnect request
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 4
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 - driver resynched.

This latest spamming (yours ;-) I patched out of the kernel by /* commenting */ out the messages, but even so there has been a disturbing message a couple of times:

loke@sleipner:~$ grep degraded /var/log/messages
Nov 15 02:29:56 sleipner kernel: ACPI: EC: missing OBF_1 confirmation,switching to degraded mode.
Nov 19 05:11:16 sleipner kernel: ACPI: EC: missing OBF_1 confirmation,switching to degraded mode.

I don't like that at all...
Comment 25 Alexey Starikovskiy 2007-11-20 08:56:52 UTC
Created attachment 13649 [details]
debug info patch
Comment 26 Alexey Starikovskiy 2007-11-20 08:58:06 UTC
Created attachment 13650 [details]
Workaround optimized controllers

Please try if this patch helps.
Comment 27 Mats Johannesson 2007-11-20 10:40:09 UTC
Created attachment 13653 [details]
dmesg-2.6.24-rc3-git1-debug2-workaround2-debug-mode.txt

Took a bit of time to sort out the correct patch order. Is:

Pristine kernel source
+ debug info patch (id=13649)
+ Workaround optimized controllers (id=13650)

Did two kernels (one with #define DEBUG commented out) and goofed around some minutes in pure text consoles, touched the touchpad and catted the acpi thermal zones.

Blissfully silent from the EC!

And there were no differences between debug enabled or disabled in dmesg. Thank you Alexey. I'll run with this patch-set from now on (in a real scenario with wifi etc), and keep an eye on the logs. Will shout if something falls through.

Ping me if anything else needs testing.
Comment 28 Len Brown 2007-12-03 22:41:40 UTC
shipped in linux-2.6.24-rc4 -- closed:

commit e790cc8bbb990df900eabdda18a5a480d22a60c8
Author: Alexey Starikovskiy <astarikovskiy@suse.de>
Date:   Wed Nov 21 03:23:32 2007 +0300

    ACPI: EC: Workaround for optimized controllers (version 3)

    Some controllers fail to send confirmation GPE after address or data write.
    Detect this and don't expect such confirmation in future.
    This is a generalization of previous workaround
    (66c5f4e7367b0085652931b2f3366de29e7ff5ec), which did only read address.

    Reference: http://bugzilla.kernel.org/show_bug.cgi?id=9327

    Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
    Tested-by: Romano Giannetti <romano.giannetti@gmail.com>
    Tested-by: Mats Johannesson
    Signed-off-by: Len Brown <len.brown@intel.com>