Bug 10932

Summary: no ACPI events after S3 resume - T22
Product: ACPI
Reporter: Andreas Waidler (arandes)
Component: EC
Assignee: Alexey Starikovskiy (astarikovskiy)
Status: CLOSED DUPLICATE
Severity: normal
CC: acpi-bugzilla, astarikovskiy, hmh, lenb, lmctlx, ming.m.lin
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.24.5
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: Output of acpidump, dmesg, dmidecode, /proc/interrupts, lspci and kernel config
Output of dmesg on 2.6.22.19 with acpi.debug_level 0x0f and 0x1f.
dmesg of 2.6.24-tuxonice-r9 with acpi.debug_level=0x0f
Vanilla 2.6.23.17 — config, dmesg with acpi.debug_level 0x0f and 0x1f
Output of `git bisect view` and `git bisect log`
Output of lspci -vxxx
use the attached tool to read I/O port

Description Andreas Waidler 2008-06-17 07:48:41 UTC
Distribution: Gentoo
Hardware Environment: IBM Thinkpad T22
Software Environment: Latest BIOS (1.12), kernel 2.6.24.5 with TuxOnIce 3.0rc7 and Gentoo patchset applied.
Problem Description:
After resuming from suspend while on battery, acpid does not receive any 
ACPI events. The <Fn> combinations stop working since they are no longer caught 
by acpid, and neither are unplugging the AC adapter, opening/closing the lid 
and so on. Before suspending, /proc/interrupts continuously reports many ACPI 
interrupts even without a button being pressed. After waking up, 
/proc/interrupts keeps reporting ACPI interrupts, but only when an event is 
triggered by hand (e.g. an <Fn><F*> keypress), which produces far fewer 
interrupts than before.
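
The before/after comparison of /proc/interrupts described above can be made less error-prone with a small helper. The following is only a sketch: the function name and the sample snapshots are made up for illustration (the numbers are not taken from the attachments), and it assumes the usual layout of an IRQ label, one counter column per CPU, then the controller and handler names.

```python
def acpi_interrupt_count(proc_interrupts_text):
    """Sum the per-CPU counters of every /proc/interrupts line labelled 'acpi'."""
    total = 0
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        # Skip the CPU header line and anything that is not an IRQ row.
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        # Handler names on a shared IRQ are comma separated, e.g. "acpi, uhci_hcd".
        names = [f.lower().strip(",") for f in fields[1:]]
        if "acpi" not in names:
            continue
        # The counters are the leading run of integers after the IRQ label.
        for field in fields[1:]:
            if not field.isdigit():
                break
            total += int(field)
    return total


# Hypothetical snapshots, taken a few seconds apart.
BEFORE = """\
           CPU0
  0:     152340    XT-PIC  timer
  9:       4812    XT-PIC  acpi
 14:       9021    XT-PIC  ide0
"""
AFTER = """\
           CPU0
  0:     153990    XT-PIC  timer
  9:       4815    XT-PIC  acpi
 14:       9100    XT-PIC  ide0
"""

# Number of ACPI interrupts that arrived between the two snapshots.
print(acpi_interrupt_count(AFTER) - acpi_interrupt_count(BEFORE))
```

On a live system the text would come from reading /proc/interrupts twice, once before and once after the key press.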

I am using the module thinkpad-acpi but even the <Fn><F7> combination, which 
works without thinkpad-acpi loaded, stops working after resuming from suspend on 
battery.

Restarting acpid does not help.

If I resume while being on AC the events continue working.

I tried the boot options "acpi=noirq", "pci=noacpi", "noapic" and "nolapic" 
without success.

I don't know if the DSDT is buggy since I didn't find a fixed version on the 
net, but disassembling and reassembling it gave me 9 warnings and 2 remarks. I 
can attach the DSDT if that would help.

There is a similar bug which is marked fixed, but nothing written there helped me:
http://bugzilla.kernel.org/show_bug.cgi?id=2643

Steps to reproduce:
Trigger ACPI event (e.g. <Fn><F3>)
echo mem > /sys/power/state
Wait until the machine sleeps, then wake it up
<Fn><F3> and other ACPI events will not be received by acpid
Comment 1 Andreas Waidler 2008-06-17 07:58:12 UTC
Created attachment 16527 [details]
Output of acpidump, dmesg, dmidecode, /proc/interrupts, lspci and kernel config

The file contains the output of acpidump, dmidecode and the kernel config.
For dmesg and lspci the output before and after suspend is provided.
The contents of /proc/interrupts are supplied before suspend, after waking up and after pressing a key combination (which should trigger an ACPI event) after waking up.
Comment 2 Zhang Rui 2008-06-18 19:02:59 UTC
Do you have any kernel on which S3 used to work?
If yes, what is the latest working kernel version?
Could you please re-attach the dmesg output taken with acpi.debug_level=0x0f?
Comment 3 Andreas Waidler 2008-06-19 12:04:37 UTC
Created attachment 16550 [details]
Output of dmesg on 2.6.22.19 with acpi.debug_level 0x0f and 0x1f.

I can not remember any kernel whose S3 worked properly on my Thinkpad T22.

Yesterday I tried out the vanilla kernels 2.6.25 and 2.6.25.7 with the same 
results.

I just tested vanilla kernel 2.6.22.19. This one does not receive any ACPI 
events at all, but using acpi.debug_level=0x1f I can see that <Fn><F4> (the 
sleep-combination) triggers something but then again this is not recognized by 
acpid. Pressing <Fn><F7> (combination to switch between TFT and external VGA) 
twice turns off my TFT because then the Notebook runs on external VGA only. This 
is very interesting. In any other kernel that I tested this combination did only 
generate ACPI events (well, as long as events could be received) but did not do 
anything to the hardware. I attached the output of dmesg using kernel 2.6.22.19 
with both acpi.debug_level=0x1f and acpi.debug_level=0x0f.

I will test another kernel and attach its dmesg later.
Comment 4 Andreas Waidler 2008-06-19 12:26:29 UTC
Created attachment 16551 [details]
dmesg of 2.6.24-tuxonice-r9 with acpi.debug_level=0x0f

Zhang Rui, here is the dmesg output you requested.

I rebooted the machine, pressed <Fn><F3> (to turn off the TFT), removed the AC 
adapter, turned the TFT on again, logged in and did 'echo mem > /sys/power/state'. 
I wrote this message while the machine was sleeping, and after waking it up, 
pressing <Fn><F3> did create an event which was received by acpid!

But after resuming from S3 the second time the events are dead again.
Comment 5 Zhang Rui 2008-06-19 19:54:02 UTC
Are there any ACPI interrupts when pressing Fn+F3?

While the events still work, you can run "grep . /sys/firmware/acpi/interrupts/*" before and after pressing Fn+F3 to see which GPE the hotkey fires.
When the events stop working, do the same test and check whether the corresponding GPE still fires.

BTW: when you captured the dmesg output in comment #4, the events still worked, right?
Please attach the dmesg output taken AFTER the events are dead.
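
To make the before/after comparison of the grep output easier to read, the two captures can be diffed mechanically. This is a minimal sketch, assuming each output line looks like "path:count", optionally followed by extra fields; the helper names and the sample counter values are hypothetical.

```python
import os


def parse_counters(grep_output):
    """Map each counter file name (e.g. 'gpe09') to its integer count."""
    counters = {}
    for line in grep_output.splitlines():
        path, sep, rest = line.partition(":")
        fields = rest.split()
        # Ignore lines that do not carry a leading integer count.
        if not sep or not fields or not fields[0].isdigit():
            continue
        counters[os.path.basename(path)] = int(fields[0])
    return counters


def fired(before, after):
    """Return only the counters that changed, with the size of the change."""
    b, a = parse_counters(before), parse_counters(after)
    return {name: a[name] - b.get(name, 0)
            for name in a if a[name] != b.get(name, 0)}


# Hypothetical captures around a single hotkey press (made-up values).
BEFORE_KEY = """\
/sys/firmware/acpi/interrupts/gpe09:120
/sys/firmware/acpi/interrupts/gpe0B:2
/sys/firmware/acpi/interrupts/sci:122
"""
AFTER_KEY = """\
/sys/firmware/acpi/interrupts/gpe09:120
/sys/firmware/acpi/interrupts/gpe0B:3
/sys/firmware/acpi/interrupts/sci:123
"""

print(fired(BEFORE_KEY, AFTER_KEY))
```

If a counter keeps rising on its own, as a background source would, it shows up in every diff and has to be discounted by eye.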
Comment 6 Andreas Waidler 2008-06-20 01:09:21 UTC
The output of dmesg in comment #4 has been created after I resumed for the 
second time, ACPI was dead at this time. I did the following:

<Fn><F3> # worked
removed AC adapter # got events
<Fn><F3> # worked
echo mem > /sys/power/state
<Fn><F3> # this worked, too
echo mem > /sys/power/state
<Fn><F3> # this did not work, ACPI is dead again
dmesg > 2.6.24-tuxonice-r9_0x0f_dmesg.txt
Comment 7 Andreas Waidler 2008-06-20 01:47:07 UTC
Created attachment 16560 [details]
Vanilla 2.6.23.17 — config, dmesg with acpi.debug_level 0x0f and 0x1f

Using 2.6.23.17, S3 seems to work without problems. I attached the output of 
dmesg with acpi.debug_level=0x1f and acpi.debug_level=0x0f and the kernel 
config. The outputs were created after I pressed <Fn><F3> a few times, then 
went to sleep (by pressing <Fn><F4>, which I mapped to 'echo mem > /sys/power/state' 
using acpid) and woke up a few times.

In /sys/firmware/acpi using kernel 2.6.23.17 I only have the directory 'tables' 
which contains the files 'BOOT', 'DSDT', 'FACP' and 'FACS'. Using 
2.6.24-tuxonice-r9 the directory 'interrupts' is not there either.
Comment 8 Zhang Rui 2008-06-22 20:22:36 UTC
So this is a regression. It would be great if you could run git bisect to narrow down the problem to a specific commit.
Comment 9 Zhang Rui 2008-07-07 23:48:21 UTC
Would you please rebuild the kernel with the config options CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC set?
Could you also repeat the test from comment #5 on the latest kernel release?
Comment 10 Andreas Waidler 2008-07-08 10:06:30 UTC
Well, I was busy the last weeks, but I had enough time to clone the git 
repository and run some tests. I wanted to run bisect, and so I checked the 
kernel versions 2.6.23, 2.6.24 and 2.6.25 from the repository again. Before 
booting a kernel, I detached the AC adapter and did not plug it in until all 
tests were finished. Then I booted the kernel, pressed FnF4 and resumed a few 
times.

With 2.6.23 the events were dead only a few times (about 1 test in 6), with 
2.6.24 about half of the tests failed, and with 2.6.25 the machine never 
received events after resuming.

Now I don't know how I could bisect the problem, since the issue also 
exists in 2.6.23 and just became more severe in 2.6.24. I may try to bisect the 
problem down to a specific commit after which the failure rate of 1/6 turns 
into 1/2, but I doubt that I will find the origin of that 
behaviour.
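
To illustrate why a few resume cycles per kernel are not enough to tell the two failure rates apart, the sketch below computes how likely it is that every test in a run succeeds. It assumes that each resume fails independently with a fixed probability, which is a simplification, and the function name is made up for this example.

```python
from fractions import Fraction


def chance_all_pass(failure_rate, trials):
    """Probability that `trials` independent resumes all succeed."""
    return (1 - failure_rate) ** trials


MOSTLY_GOOD = Fraction(1, 6)  # roughly the 2.6.23 failure rate reported above
MOSTLY_BAD = Fraction(1, 2)   # roughly the 2.6.24 failure rate reported above

for trials in (1, 3, 5, 10):
    good = chance_all_pass(MOSTLY_GOOD, trials)
    bad = chance_all_pass(MOSTLY_BAD, trials)
    print(f"{trials:2d} resumes: all pass with p={float(good):.3f} (1/6 rate), "
          f"p={float(bad):.3f} (1/2 rate)")
```

Under these assumptions a single failure says little, because the "mostly good" kernel also fails at least once in most runs of five or more; the classification has to rest on the observed failure rate over many resumes.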

With CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC in 2.6.26-rc9 enabled I can see 
that FnF4 fires gpe0B, after resuming the events are dead and pressing FnF4 does 
not fire a gpe0B anymore. FnF3 does not seem to fire a GPE. The value for gpe09 
is constantly rising, both with working acpi and with acpi being dead. 
Comment 11 Andreas Waidler 2008-07-31 15:31:04 UTC
Created attachment 17044 [details]
Output of `git bisect view` and `git bisect log`

I just finished bisect'ing. It was quite difficult sometimes to determine if 
the kernel was a "good" or "bad" one, since the bug also appears in 2.6.23 but 
less frequently. Especially the last kernel seemed to fail more often than the 
other "good" ones, but it did not fail as often as the "bad" ones. (I marked it 
as good). The one before the last one seemed to work perfectly.

Running the last kernel from git-bisect, /proc/acpi/bat and the appropriate menu 
item in make menuconfig are missing. The item for the AC Adapter is missing, too.
Comment 12 Zhang Rui 2008-09-09 19:26:20 UTC
Hi Andreas,
sorry for the delay. Please attach the output of "lspci -vxxx" both before and after S3.

(In reply to comment #10)
> With CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC in 2.6.26-rc9 enabled I can
> see 
> that FnF4 fires gpe0B, after resuming the events are dead and pressing FnF4
> does 
> not fire a gpe0B anymore. 
> FnF3 does not seem to fire a GPE.
Which GPE does FnF3 fire before suspending?
Comment 13 ykzhao 2008-09-09 20:29:30 UTC
Hi, Andreas
    Will you please try the latest kernel (2.6.27-rc5) and confirm whether the problem still exists? Please also confirm whether ACPI events are reported correctly before S3.
    Do you mean that there is no ACPI event when unplugging the AC adapter or changing the LID state after the system is resumed from S3?
    thanks.
Comment 14 Andreas Waidler 2008-09-18 12:15:39 UTC
Created attachment 17866 [details]
Output of lspci -vxxx

I cannot boot the system with either 2.6.27-rc5 or 2.6.27-rc6. I get 
a message that the root filesystem could not be mounted read/write.

FnF3 fires a GPE09 before suspending, so does the LID and unplugging the AC 
adapter.

I just noticed that, using 2.6.26-rc9, ACPI also dies when the AC adapter 
is removed, even without going to S3.

The output of lspci -vxxx is the same before (ACPI working) and after S3 (ACPI 
dead).

>     Do you mean that there is no ACPI event when unplugging AC adapter or
> pressing LID state after the system is resumed from S3? Right?
>     thanks.

Yes, there are no events at all after resuming from S3. Depending on the kernel 
version, ACPI sometimes survives a sleep-resume cycle or detaching the AC 
adapter.
Even when ACPI is dead, /sys/firmware/acpi/interrupts/gpe09 increases every few 
seconds by a value between about ten and some hundreds.
Comment 15 ykzhao 2008-09-23 02:19:19 UTC
Do you have an opportunity to confirm whether ACPI events are reported correctly after resuming on Windows? (Please check whether FnF3/LID work.)
   Thanks.
Comment 16 Zhang Rui 2008-10-31 00:13:32 UTC
ping andreas, any updates?
Comment 17 ykzhao 2008-10-31 01:09:15 UTC
Any updates?
   Will you please try the patch in http://marc.info/?l=linux-acpi&m=122307130419749&w=2 on the 2.6.27 stable kernel or the latest kernel and see whether the problem still exists?
   thanks.
Comment 18 Andreas Waidler 2008-11-05 12:43:43 UTC
Sorry, I'm quite busy these days.

I have no installation of Windows and no unused hard disks, but I could probably try it out with a Windows live CD next week. I have never done that before.

The patch applied to 2.6.27 does not solve the issue. After resuming, the output of dmesg contained the following message: "evxfevnt-0079 [00] enable                : System is already in ACPI mode".
Comment 19 ykzhao 2008-11-19 01:41:46 UTC
Hi, Andreas
    Will you please try the debug patch in http://bugzilla.kernel.org/show_bug.cgi?id=11255#C49 and see whether the problem still exists?
    
    Thanks. 
Comment 20 Zhang Rui 2008-11-24 22:48:21 UTC
ping andreas
Comment 21 Andreas Waidler 2008-11-25 12:16:23 UTC
Using the patch above on linux-2.6.28-rc5, the machine is not able to resume at all. The screen turns on but remains blank, the LEDs for battery (even though there is no AC adapter attached), sleep and power are turned on. Even SysRq keys stop working when trying to resume.
Comment 22 ykzhao 2008-12-02 18:23:48 UTC
Hi, Andreas
    Thanks for the test. The patch makes the problem worse, so please ignore it.
    From the test in comment #7, the system works well on kernel 2.6.23.17, and from the bisect log it seems that the regression is related to the following commit:
    >commit bbc615b16d64643a3d22ab4890fde1a685e86d83
    >Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
    >Date:   Sat Oct 20 00:32:36 2007 +0200
    >    ide: use __ide_end_request() in ide_end_dequeued_request()
    Why would missing ACPI events be related to an IDE patch? It is difficult to understand.
    Will you please try the latest kernel (2.6.28-rc6) and see whether the problem still exists?
    Thanks.
Comment 23 ykzhao 2008-12-02 18:35:04 UTC
Created attachment 19112 [details]
use the attached tool to read I/O port

From the test it seems that the AC event is related to GPE09 and some Fn hotkeys are related to GPE0B. But after the system is resumed from S3, there is no ACPI event. 
    Will you please confirm whether an ACPI event is reported when pressing the power button while the problem is present?
    Will you please use the attached tool to read the following I/O ports when no ACPI event is reported? (The Readme explains how to use the attached tool.)
    a. ./ior --addr 0x1028 --width 16
    b. ./ior --addr 0x100e --width 16
    c. ./ior --addr 0x1004 --width 16
    
    Thanks.
Comment 24 Andreas Waidler 2008-12-04 11:31:29 UTC
> From the test in comment #7, the system can work well on the kernel of
> 2.6.23.17

It does not always work with 2.6.23.17; the problem only occurs less often
with 2.6.23.17 than with the other kernel versions I tested. (see comment #10)

> Why is no ACPI event related with the IDE patch? It is diffictul to
> understand.

Sorry, I don't know why the kernel or the machine behaves this way. This was
just what I found when I tried to find the bad change. In comment #11 I wrote
that I could not always clearly determine whether a set of changes was good or
bad, since the problem occurs with all versions I tried but with different
frequency.

> Will you please try the latest kernel(2.6.28-rc6) and see whether the
> problem still exists?

Events for Fn keys are dead every time the machine resumes from S3.

> Will you please confirm whether ACPI event can be reported if pressing the
> power button when the problem happens?

After resuming with 2.6.28-rc6, there is still an event when pressing the power
button, even when the Fn keys are dead.

> Will you please use the attached tool to read the following I/O port when
> no ACPI event is reported?

With 2.6.28-rc6 I get the following:

err io_op # ./ior --addr 0x1028 --width 16

 the value of IO port 0x1028 is 1f01
err io_op # ./ior --addr 0x100e --width 16

 the value of IO port 0x100e is a00
err io_op # ./ior --addr 0x1004 --width 16

 the value of IO port 0x1004 is 1403


Regards,
Andreas
Comment 25 ykzhao 2008-12-17 00:27:03 UTC
Hi, Andreas
   Thanks for the test. From the info in comment #24 it seems that GPE09 and GPE0B are still enabled after resuming. As the power button event can be generated, the ACPI SCI interrupt is still enabled. But it is very strange that there is no other ACPI event (for example: LID, Fn + hotkey).
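
The register values from comment #24 can be decoded bit by bit. The sketch below rests on two assumptions that are not stated in this report: that I/O port 0x100e is the GPE0 enable register with bit n corresponding to GPE n, and that 0x1004 is the PM1 control register, whose bit 0 is SCI_EN in the ACPI specification. The helper name is made up for this example.

```python
def set_bits(value):
    """Return the positions of all bits that are set in `value`."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Values reported in comment #24.
gpe_enable = 0x0A00   # read from I/O port 0x100e (assumed: GPE0 enable register)
pm1_control = 0x1403  # read from I/O port 0x1004 (assumed: PM1 control register)

# Bits 9 and 11 are set, matching GPE09 and GPE0B under the assumption above.
print([f"GPE{bit:02X}" for bit in set_bits(gpe_enable)])

# Bit 0 of PM1 control is SCI_EN.
print("SCI_EN set:", bool(pm1_control & 0x1))
```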
   From the acpidump it seems that the ACPI event of LID/HKEY device is triggered by GPE09(EC device).
   Now there are a bunch of EC fix patches.
   Will you please try the following debug patch on the latest kernel (2.6.28-rc7) and see whether ACPI events are reported after resuming?
   http://bugzilla.kernel.org/show_bug.cgi?id=11884#C62
   In the above debug patch the EC is initialized after _INI object.
   Thanks.
Comment 26 Andreas Waidler 2008-12-21 08:02:13 UTC
Hi Yakui,

I applied the patch to the kernel versions 2.6.28-rc7, 2.6.28-rc8 and
2.6.28-rc9. Each version has been tested three times, but after resuming, Fn
keys and LID were always dead. The power button did always trigger an event,
though.

Regards, Andreas
Comment 27 ykzhao 2009-02-01 21:25:05 UTC
Hi, Andreas
     Sorry for the late response.
     Will you please not load the thinkpad_acpi driver and see whether the ACPI event can be reported after resume?(clear CONFIG_THINKPAD_ACPI in kernel configuration)
     Thanks.
Comment 28 Andreas Waidler 2009-02-10 09:15:09 UTC
Hi Yakui,

Sorry for keeping you waiting.

Without thinkpad_acpi the system does not trigger events for Fn+F3 and Fn+F12 at
all. Closing the lid or pressing Fn+F4, Fn+F7 or the power button triggers
events which have other names than the events generated by pressing the same
buttons when thinkpad_acpi is loaded. The events did neither die when the AC
adapter was removed nor when resumed from S3 without AC adapter -- I tested it
about 20 times and it worked always.

The tests above have been performed with linux-2.6.23.17 after removing
thinkpad-acpi from /etc/modules.autoload.d/ and rebooting. Simply removing
thinkpad_acpi from the running system did not help.

I have not yet tested whether the above is true for newer kernel versions since
I could not build any of them (make oldconfig fails since something somehow
pukes out 64bit object files).


Regards,
Andreas
Comment 29 ykzhao 2009-02-22 23:56:13 UTC
Hi, Andreas
    Sorry for the late response.
    Do you mean that the ACPI event(LID, Fn+F4, Fn+F7, power button) can be reported correctly after suspend/resume if the thinkpad_acpi driver is not loaded? If the thinkpad_acpi driver is loaded, the ACPI event can't be reported correctly while running on battery. But it can work well while running on the AC. Right?
    If so, maybe this issue will be related with the thinkpad_acpi driver.
    cc: hmh@hmh.eng.br
    
Hi, Henrique
    Any idea about this issue?
    thanks.
    
Comment 30 Henrique de Moraes Holschuh 2009-02-23 05:48:06 UTC
Yes, I do...

It looks like EC interrupt or polling mode is losing events.  ThinkPad ECs seem to have an internal queue (because part of the GPE process is done in SMI, ARGH!), if you don't drain it, it will eventually stop issuing events.

So, all it would take would be Linux not always emptying the EC queue when we poll or service an EC interrupt, and eventually everything stops working.
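
The failure mode described above can be illustrated with a toy model. This is not kernel code and does not describe the real ThinkPad EC: the class, the function and the rule that an interrupt is raised only when the queue goes from empty to non-empty are all assumptions made up for the illustration.

```python
from collections import deque


class ToyEC:
    """Toy EC: queues events, signals only on the empty -> non-empty edge."""

    def __init__(self):
        self.queue = deque()
        self.irq_pending = False

    def hardware_event(self, code):
        if not self.queue:
            self.irq_pending = True  # assumed behaviour, see the lead-in
        self.queue.append(code)

    def query(self):
        """Return the oldest queued event, or None if the queue is empty."""
        return self.queue.popleft() if self.queue else None


def simulate(bursts, drain):
    """Feed bursts of events to the EC, servicing one interrupt per burst."""
    ec = ToyEC()
    delivered = []
    for burst in bursts:
        for code in burst:
            ec.hardware_event(code)
        if ec.irq_pending:
            ec.irq_pending = False
            code = ec.query()
            while code is not None:
                delivered.append(code)
                if not drain:
                    break  # handler reads a single event and returns
                code = ec.query()
    return delivered


BURSTS = [["F3"], ["F4", "lid"], ["F7"], ["ac"]]
print("draining handler:    ", simulate(BURSTS, drain=True))
print("single-event handler:", simulate(BURSTS, drain=False))
```

In this model the single-event handler leaves "lid" in the queue after the second burst, so the queue never becomes empty again, no further interrupt is raised, and all later events are lost, while the draining handler delivers everything.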
Comment 31 Karol Lewandowski 2009-03-12 02:31:11 UTC
*** Bug 12859 has been marked as a duplicate of this bug. ***
Comment 32 Henrique de Moraes Holschuh 2009-03-18 08:12:45 UTC
Guys, there is nothing I can do about this bug.

I will try to bring in the people working with the EC and GPE stuff into this, because AFAIK, our EC control HAS been fixed to attempt to drain the EC at every interrupt or poll cycle, in case it has internal queues like the ThinkPad ECs do.

Switching to EC component, reassigning, and:
Cc: Alexey Starikovskiy <aystarik@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Comment 33 Henrique de Moraes Holschuh 2009-03-18 08:16:13 UTC
Hmm, Lin Ming was in error, not added to Cc.
Comment 34 Henrique de Moraes Holschuh 2009-03-18 19:01:33 UTC
Just thought of something. The T22 is old (ACPI 1.0 kind of old).  Is it possible that using the kernel parameters for the old ordering of calls might fix it?

i.e. "acpi_sleep=old_ordering"

acpi_sleep=s4_nonvs could also be worth a try.

Andreas, can you test that?
Comment 35 Andreas Waidler 2009-03-22 13:04:45 UTC
When booted with "acpi_sleep=old_ordering", the screen does only contain some
garbage after resuming from S3. When pressing some keys, the garbage disappears
and the screen remains blank. The machine can be rebooted using sysrq.

"acpi_sleep=s4_nonvs" seems to not affect the machine at all.

Tests were performed on 2.6.28-rc9.


Regards,
Andreas
Comment 36 Len Brown 2009-08-13 03:11:30 UTC
still a problem with 2.6.30.stable?
Comment 37 Andreas Waidler 2009-08-15 10:55:39 UTC
Yes, using v2.6.30, the problem still seems to occur on every wakeup when running on battery.
Comment 38 Henrique de Moraes Holschuh 2010-01-24 13:14:10 UTC
Please check the fix for bug #14858 : ACPI events on T20 thinkpad stop being reported.  Maybe it also fixes your problem?

I think the easiest way to test that is to temporarily boot the latest 2.6.32 stable kernel with the patch in #14858 applied...
Comment 39 Andreas Waidler 2010-01-24 22:06:22 UTC
Sorry, I can't check that. My machine died some weeks ago.
Comment 40 Karol Lewandowski 2010-01-24 23:37:27 UTC
I can't comment on Andreas' machine, but having exactly the same problem (on a Thinkpad T21), I can say that the patch from bug #14858 (ACPI: EC: Accelerate query execution) applied on top of 2.6.32.5 solves this issue for me.
Comment 41 Henrique de Moraes Holschuh 2010-01-25 01:58:40 UTC
Alexey,

Given the answer from Karol Lewandowski on comment #40, should we mark this bug as a duplicate of bug #14858 ?
Comment 42 Alexey Starikovskiy 2010-01-25 09:20:49 UTC

*** This bug has been marked as a duplicate of bug 14858 ***