Bug 14858 - ACPI events on T20 thinkpad stop being reported
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: EC
Hardware: All Linux
Importance: P1 normal
Assignee: Alexey Starikovskiy
URL:
Keywords:
Duplicates: 10932
Depends on:
Blocks:
 
Reported: 2009-12-21 17:48 UTC by florz
Modified: 2010-01-25 09:20 UTC

See Also:
Kernel Version: 2.6.32
Subsystem:
Regression: No
Bisected commit-id:


Attachments
acpidump (186.30 KB, text/plain)
2009-12-22 01:29 UTC, florz
/sys/firmware/acpi/interrupts/* (1.29 KB, text/plain)
2009-12-22 01:31 UTC, florz
kern.log (232.57 KB, text/plain)
2009-12-22 02:08 UTC, florz
Protect Query execution (1.67 KB, patch)
2009-12-22 23:43 UTC, Alexey Starikovskiy
Protect Query execution #2 (1.69 KB, patch)
2009-12-23 08:19 UTC, Alexey Starikovskiy
Protect Query execution #3 (2.46 KB, patch)
2009-12-23 20:19 UTC, Alexey Starikovskiy
kern.log #2 (389.05 KB, text/plain)
2009-12-23 21:29 UTC, florz
Accelerate query propogation (6.32 KB, patch)
2009-12-24 00:50 UTC, Alexey Starikovskiy

Description florz 2009-12-21 17:48:54 UTC
I have here a T20 thinkpad with a T22 BIOS and a 2.6.32 kernel. After
boot, acpi_listen shows me all the events that are being generated by
the lid switch, hotkeys, batteries, AC, and maybe more. However,
when I disconnect and reconnect AC power a few times in rapid
succession, all of these events stop being reported. Neither
suspend to RAM nor suspend to disk fixes it; only a reboot does.

The BIOS and EC firmware are the newest versions available
(1.12 and 1.06, respectively).

"Commit 2a84cb removed delay needed by some slow controllers (Acer TM4001)" did not solve it.

I don't know whether this is a regression, but the interface offers no "unknown" option. However, Debian's 2.6.25-2-686 kernel is affected, too.
Comment 1 Alexey Starikovskiy 2009-12-21 19:25:30 UTC
Please uncomment #define DEBUG in drivers/acpi/ec.c and attach dmesg output.
Comment 2 Zhang Rui 2009-12-22 01:24:42 UTC
Please attach the acpidump output and the output of "grep . /sys/firmware/acpi/interrupts/*".
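For anyone reproducing this report, the counters requested above can also be compared before and after the unplug/replug test to see which interrupt sources stop counting. The following is only a minimal sketch: the function names are hypothetical, and it assumes each file under /sys/firmware/acpi/interrupts/ begins with an integer event count, possibly followed by status flags.

```python
import os


def read_counters(directory):
    """Return {filename: count} for every counter file in `directory`.

    Assumption: the first whitespace-separated field of each file is the
    event count; anything after it (status flags) is ignored.
    """
    counters = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            fields = f.read().split()
        if fields and fields[0].isdigit():
            counters[name] = int(fields[0])
    return counters


def changed(before, after):
    """Return {name: delta} for counters that increased between snapshots."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }
```

To use it, call read_counters("/sys/firmware/acpi/interrupts") once while events still arrive and again after they stop, then pass both snapshots to changed(). A counter that keeps increasing while acpi_listen stays silent would suggest the interrupts still fire but the events get lost later in the path.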
Comment 3 florz 2009-12-22 01:29:38 UTC
Created attachment 24250 [details]
acpidump
Comment 4 florz 2009-12-22 01:31:17 UTC
Created attachment 24251 [details]
/sys/firmware/acpi/interrupts/*
Comment 5 florz 2009-12-22 02:08:37 UTC
Created attachment 24252 [details]
kern.log

kern.log of one boot.

I hope you don't need any information before the beginning of the file? The ringbuffer seems to have been just a few kB too short ...

Up to second 47, everything worked fine. Then, in second 252 I started unplugging and replugging the power supply until it stopped working. The debug output stopped, too.
Comment 6 florz 2009-12-22 04:49:35 UTC
FYI: The problem still exists in 2.6.32.2. I'll continue any further debugging on that version unless you object.
Comment 7 Alexey Starikovskiy 2009-12-22 23:43:49 UTC
Created attachment 24260 [details]
Protect Query execution 

Just one more wild guess. Please check if it helps.
Comment 8 florz 2009-12-23 01:13:53 UTC
That deadlocks on boot at "ACPI: Using PIC for interrupt routing"
Comment 9 Alexey Starikovskiy 2009-12-23 08:19:23 UTC
Created attachment 24262 [details]
Protect Query execution #2

Typo fixed, sorry. Please check with this one.
Comment 10 florz 2009-12-23 15:04:03 UTC
Unless I misunderstand this interface somehow, there is still a typo in there. I changed the "guery" to "query" and got the same deadlock as before.
Comment 11 Alexey Starikovskiy 2009-12-23 20:19:30 UTC
Created attachment 24270 [details]
Protect Query execution #3

OK, the order of the locks was wrong... Please check if this one makes any difference. I've checked it on my ThinkPad; at least it does not lock up.
Comment 12 florz 2009-12-23 21:29:12 UTC
Created attachment 24273 [details]
kern.log #2

Boots again, but problem persists. If you think that I should increase the log buffer size, let me know ...
Comment 13 Alexey Starikovskiy 2009-12-24 00:50:25 UTC
Created attachment 24274 [details]
Accelerate query propogation

Well, it was ugly anyway... Let's try to approach this from the other side...
From the log, it looks like the EC is working; it is just too busy with traffic to reply to the query in time. This patch splits the acknowledgement of a query from the execution of the query method, which lets us run the acknowledgement on a fast work queue. Please check if it helps...
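The idea described above can be illustrated with a toy timing model. This is not the attached patch, just a sketch under stated assumptions: the cost constants, the function names, and the event numbers are hypothetical, all events are assumed to arrive in one burst at time 0, and the deferred handlers are simply drained at the end rather than run concurrently on a second queue.

```python
from collections import deque

ACK_COST = 1       # hypothetical time units to acknowledge one query
HANDLER_COST = 10  # hypothetical time units to run one query method


def combined(events):
    """Old scheme: each work item acknowledges, then runs the handler.

    Returns the time at which each event is acknowledged.
    """
    clock = 0
    ack_times = []
    for _ in events:
        clock += ACK_COST
        ack_times.append(clock)
        clock += HANDLER_COST  # the next acknowledgement waits for this
    return ack_times


def split(events):
    """New scheme: acknowledge on a fast queue, defer the handlers.

    Returns (ack_times, handled), where `handled` lists the events in
    the order their handlers run.
    """
    clock = 0
    ack_times = []
    deferred = deque()
    for event in events:
        clock += ACK_COST
        ack_times.append(clock)
        deferred.append(event)  # handler runs later, off the fast path
    handled = []
    while deferred:
        handled.append(deferred.popleft())
    return ack_times, handled
```

With a burst of three events, combined() acknowledges them at times 1, 12 and 23, while split() acknowledges them at times 1, 2 and 3 and still runs the handlers in arrival order. In this model, the acknowledgement latency no longer depends on how long the query methods take.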
Comment 14 florz 2009-12-24 01:33:10 UTC
That looks very much like a fix. I have generated probably a few hundred events now, and it's still alive.

FYI: I have reverted all your other patches before applying this one. Let me know if I should test any other combinations (well, only the first one has chances of not conflicting, I guess ;-).

Thanks a lot so far!
Comment 15 Len Brown 2009-12-30 06:30:52 UTC
Marked as RESOLVED, as the patch in comment #13 is in acpi-test.
Comment 16 Len Brown 2010-01-23 19:15:46 UTC
commit a62e8f1978f49e52f87a711ff6711b323d4b12ff
Author: Alexey Starikovskiy <astarikovskiy@suse.de>
Date:   Thu Dec 24 11:34:16 2009 +0300

    ACPI: EC: Accelerate query execution


shipped in Linux-2.6.33-rc5

closed.
Comment 17 Henrique de Moraes Holschuh 2010-01-23 23:37:15 UTC
Is this patch being considered for -stable?
Comment 18 Alexey Starikovskiy 2010-01-25 09:20:49 UTC
*** Bug 10932 has been marked as a duplicate of this bug. ***
