Bug 45221 - Wrong ACPI event sent when unplug the AC on thinkpad e530
Summary: Wrong ACPI event sent when unplug the AC on thinkpad e530
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Battery (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Lan Tianyu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-27 16:09 UTC by Enrico Tagliavini
Modified: 2013-05-27 13:36 UTC (History)
3 users (show)

See Also:
Kernel Version: 3.5.0 3.6.8 3.7.* 3.8.6 3.9-rc6
Tree: Mainline
Regression: No


Attachments
acpi dump, dmesg and syslogs (34.49 KB, application/x-gzip)
2012-07-27 16:09 UTC, Enrico Tagliavini
Details
dmesg after a cold boot with thinkpad_acpi blacklisted (24.60 KB, text/plain)
2012-07-31 17:30 UTC, Enrico Tagliavini
Details
syslog generated without thinkpad-acpi loaded (4.06 KB, text/plain)
2012-07-31 17:45 UTC, Enrico Tagliavini
Details
debug messages from acpi (55.64 KB, text/plain)
2012-08-02 17:23 UTC, Enrico Tagliavini
Details
debug messages from acpi dispatcher (77.43 KB, application/x-xz)
2012-08-02 17:28 UTC, Enrico Tagliavini
Details
debug messages from acpi ALL_DRIVERS (9.67 KB, text/plain)
2012-08-02 17:35 UTC, Enrico Tagliavini
Details
acpidump hex-dump (268.03 KB, application/octet-stream)
2012-11-28 14:07 UTC, Enrico Tagliavini
Details
dmesg output with a pr_info of ac->state (65.18 KB, text/plain)
2013-04-13 11:35 UTC, Enrico Tagliavini
Details
dmidecode output (16.19 KB, text/plain)
2013-04-13 17:01 UTC, Enrico Tagliavini
Details
patch (2.79 KB, patch)
2013-04-15 08:39 UTC, Lan Tianyu
Details | Diff

Description Enrico Tagliavini 2012-07-27 16:09:27 UTC
Created attachment 76261 [details]
acpi dump, dmesg and syslogs

When I boot the computer with the AC plugged and then unplug it the system doesn't always catch the change. I run KDE 4.8 and its power management still says the AC is still connected. KDE relies on upower so on ACPI IIRC. I done some edit to the default acpid scripts to be more verbose (it is attached with the other stuff). What I see when I unplug the AC is:
   ac_adapter ACPI0003:00 00000080 00000001
which is indeed the wrong event! This event is the one when you plugin the AC. I checked /sys/class/power_supply/AC/online and it says 0, so the change is detected correctly.

When I boot the computer without the AC attached the correct state is recognized by ACPI and KDE. If I insert the AC and then unplug it again the issue sometimes happens again.

This is a thinkpad e530 (edge series) computer running gentoo linux (mostly the stable branch) with kernel gentoo-source-3.5.0. I can attach my kernel config if needed.

I think this issue is caused by the EC to become confused by something the kernel does. For example in this moment when the AC detector works the F1-12 keys and the multimedia keys are swapped! I have the legacy setting in the BIOS (meaning if I hit the F1 key I get the F1 key, not the Mute speaker even generated), but still sometimes the settings swaps at runtime!

I have attached a tar archive (sorry but they are a lot of files) with acpidump binary output, dmesg after a fresh boot from battery and AC and the syslog when switching from the AC to the battery (from an AC boot) and from battery to AC (from a battery boot).
Comment 1 Len Brown 2012-07-31 02:26:16 UTC
# HERE I DISCONNECT THE AC. Note the wrong signal is detected
Jul 27 17:25:19 ivythink nobody: AC plugged in ac_adapter ACPI0003:00 00000080 00000001

Yes, here the "1" at the end means "online", which seems to be always set,
even for offline AC events:-(

I don't see any error messages from the EC driver in dmesg.

Please try again with the thinkpad_acpi module excluded from
the system to make sure that native driver isn't having
a bad side-effect.
Comment 2 Enrico Tagliavini 2012-07-31 08:45:44 UTC
Thank you for your responde. I forgotten to say I already tried without thinkpad-acpi loaded and the problem is exactly the same. Do you want the logs with thinkpad-acpi blacklisted? Something else I can try?
Comment 3 Enrico Tagliavini 2012-07-31 09:44:35 UTC
ehm... indeed I meant UNloaded, or better to say blacklisted, so it doesn't get loaded at boot. I also removed the AC and battery for one minute before rebooting (just to be sure the EC was clean).
Comment 4 Enrico Tagliavini 2012-07-31 17:30:38 UTC
Created attachment 76581 [details]
dmesg after a cold boot with thinkpad_acpi blacklisted

This is just dmesg without thinkpad-acpi loaded. Note the lines (present even when thinkpad-acpi is loaded booting with AC connected or not):

[    4.230757] ACPI: AC Adapter [AC] (on-line)
[    4.230792] ACPI Warning: 0x000000000000efa0-0x000000000000efbf SystemIO conflicts with Region \_SB_.PCI0.SBUS.SMBI 1 (20120320/utaddress-251)
[    4.230796] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

Doesn't sound nice, but I ignore what this means, so it might just be unrelated to my issue. I will attach the syslog in a minute.
Comment 5 Enrico Tagliavini 2012-07-31 17:45:09 UTC
Created attachment 76591 [details]
syslog generated without thinkpad-acpi loaded

Nothing new here, same wrong event and no difference at all from the previous logs. Please feel free to ask for any test you need. 

If you I will not hear something from you I think I will enable debug layer ACPI_AC_COMPONENT on all level i guess (if this is gonna be a mess I might stop to ACPI_LV_VERBOSE_INFO including all lower levels), and I will report back if something interesting is found. But please if you think this is a good idea and the good way to go, say it, i will be more motivated :).

Also given I compile my own kernel if you want me to test some git branch or whatever custom patch this should not be a problem, feel free to ask.

Needless to say I want to go from NEEDINFO to WEHAVETHEINFO (and hopefully a fix) :).

Cheers
Comment 6 Enrico Tagliavini 2012-08-02 17:23:25 UTC
Created attachment 76681 [details]
debug messages from acpi

Ok this is quite hard, the output is not readable if not used to acpi kernel development, so I'm going quite blind here. Please ask for some debug_layer and _level if you need additional info. This run is with level=0x000FFFFF and layer set to ACPI_EVENTS (0x00000004) only. I tried with only ACPI_AC_COMPONENT and ACPI_POWER_COMPONENT and ACPI_SYSTEM_COMPONENT but they output nothing at all while connecting/disconnecting the AC.

I will soon attach the output of the run with ACPI_DISPATCHER enabled
Comment 7 Enrico Tagliavini 2012-08-02 17:28:59 UTC
Created attachment 76691 [details]
debug messages from acpi dispatcher

This is with debug_level set to 0x000FFFFF and _layer to 0x00000040. It is 1.6 MBs uncompressed, so I'm forced to compress it to be able to attach it.

This is indeed a mess. I cannot even understand the exact moment when the AC is disconnected. Also remeber that some output is not related to the AC, I have periodic output from the dispatcher, but I have no idea how to filter it.

Please let me know if additional debug is needed. Thank you
Comment 8 Enrico Tagliavini 2012-08-02 17:35:38 UTC
Created attachment 76701 [details]
debug messages from acpi ALL_DRIVERS

Ok while trying some random thing I got this output with layer 0xFFFF0000 (ALL_DRIVERS) and level 0x000FFFFF. I attach it just in case it is important. The periodic ACPI poll is thermal indeed (which is not working btw, it always says 30 °C, but coretemp driver is working correctly and I can see the CPU temperature).
Comment 9 Enrico Tagliavini 2012-08-03 08:35:20 UTC
Ok I discovered something new and I think it is important. The bug happens more likely (almost always) when I unplug the ac from the AC socket on the wall (the one with 220 V alternate current). When I unplug the 20V DC socket from the laptop the ACPI even delivered is correct. So this sounds somewhat like a race condition. When the AC socket is disconnected the capacitors in the PSU are able to deliver some power for about a second. In fact the green led on the laptop just at the side of the DC socket stays on for said time. When I unplug the DC socket from the laptop this LED turns off instantly.

I'm trying some debug option to see different logs from the two events now that I'm able to reproduce more easly the two conditions, but I failed for now. Any advice on what layer/level needs to be debugged is very very welcome.
Comment 10 Enrico Tagliavini 2012-09-04 08:07:18 UTC
One month without activity. I have no idea what to try next, if some interest will be rised I will still be here to help debugging/testing. Fear not I will not disappear
Comment 11 Zhang Rui 2012-11-28 13:52:28 UTC
please run acpidump > acpidump.out, without parameter "-b"
Comment 12 Enrico Tagliavini 2012-11-28 14:07:15 UTC
Created attachment 87541 [details]
acpidump hex-dump

Here it is :)
Comment 13 Zhang Rui 2012-11-29 05:08:48 UTC
Please try to reproduce the problem and see attach the output of "grep . /sys/class/power_supply/*/*".
I guess the AC status is totally messed up already., let check it.
Comment 14 Enrico Tagliavini 2012-11-29 10:56:57 UTC
grep: /sys/class/power_supply/AC/device: Is a directory
/sys/class/power_supply/AC/online:0
grep: /sys/class/power_supply/AC/power: Is a directory
grep: /sys/class/power_supply/AC/subsystem: Is a directory
/sys/class/power_supply/AC/type:Mains
/sys/class/power_supply/AC/uevent:POWER_SUPPLY_NAME=AC
/sys/class/power_supply/AC/uevent:POWER_SUPPLY_ONLINE=0
/sys/class/power_supply/BAT0/alarm:2580000
/sys/class/power_supply/BAT0/capacity:96
/sys/class/power_supply/BAT0/cycle_count:0
grep: /sys/class/power_supply/BAT0/device: Is a directory
/sys/class/power_supply/BAT0/energy_full:51600000
/sys/class/power_supply/BAT0/energy_full_design:57720000
/sys/class/power_supply/BAT0/energy_now:49670000
/sys/class/power_supply/BAT0/manufacturer:SANYO
/sys/class/power_supply/BAT0/model_name:45N1051
grep: /sys/class/power_supply/BAT0/power: Is a directory
/sys/class/power_supply/BAT0/power_now:13478000
/sys/class/power_supply/BAT0/present:1
/sys/class/power_supply/BAT0/serial_number:  550
/sys/class/power_supply/BAT0/status:Discharging
grep: /sys/class/power_supply/BAT0/subsystem: Is a directory
/sys/class/power_supply/BAT0/technology:Li-ion
/sys/class/power_supply/BAT0/type:Battery
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_NAME=BAT0
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_STATUS=Discharging
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_PRESENT=1
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_TECHNOLOGY=Li-ion
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_CYCLE_COUNT=0
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_VOLTAGE_MIN_DESIGN=11100000
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_VOLTAGE_NOW=12143000
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_POWER_NOW=13478000
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_ENERGY_FULL_DESIGN=57720000
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_ENERGY_FULL=51600000
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_ENERGY_NOW=49670000
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_CAPACITY=96
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_MODEL_NAME=45N1051
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_MANUFACTURER=SANYO
/sys/class/power_supply/BAT0/uevent:POWER_SUPPLY_SERIAL_NUMBER=  550
/sys/class/power_supply/BAT0/voltage_min_design:11100000
/sys/class/power_supply/BAT0/voltage_now:12143000

It is all good to my eyes and this is weird couse the syslog just said

Nov 29 10:53:38 ivythink logger: AC plugged in ac_adapter ACPI0003:00 00000080 00000001
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:00 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:01 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:02 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:03 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:04 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:05 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:06 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:07 00000081 00000000
Nov 29 10:53:41 ivythink logger: ACPI event unhandled: battery PNP0C0A:00 00000081 00000001

which is wrong!
Comment 15 Lan Tianyu 2013-04-13 05:58:27 UTC
Hi: Please try the following patch and attach the output of dmesg.

diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c
index 6d5bf64..91b9ff4 100644
--- a/drivers/acpi/ac.c
+++ b/drivers/acpi/ac.c
@@ -28,6 +28,7 @@
 #include <linux/slab.h>
 #include <linux/init.h>
 #include <linux/types.h>
+#include <linux/delay.h>
 #ifdef CONFIG_ACPI_PROCFS_POWER
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
@@ -253,6 +254,10 @@ static void acpi_ac_notify(struct acpi_device *device, u32 event)
        case ACPI_NOTIFY_BUS_CHECK:
        case ACPI_NOTIFY_DEVICE_CHECK:
                acpi_ac_get_state(ac);
+               pr_info("%s: ac->state %u\n", __func__, ac->state);
+               msleep(2000);
+               acpi_ac_get_state(ac);
+               pr_info("%s: ac->state %u\n", __func__, ac->state);
                acpi_bus_generate_proc_event(device, event, (u32) ac->state);
                acpi_bus_generate_netlink_event(device->pnp.device_class,
                                                  dev_name(&device->dev), event,
Comment 16 Enrico Tagliavini 2013-04-13 11:35:16 UTC
Created attachment 98481 [details]
dmesg output with a pr_info of ac->state

Hi Lan. You got a point. I attached the whole dmesg, but I guess you are interested in this:

# AC detached
[   82.328764] acpi_ac_notify: ac->state 1
[   84.331240] acpi_ac_notify: ac->state 0
# AC attached
[  102.515453] acpi_ac_notify: ac->state 1
[  104.517753] acpi_ac_notify: ac->state 1
# AC detached
[  116.618425] acpi_ac_notify: ac->state 1
[  118.620669] acpi_ac_notify: ac->state 0
# AC attached
[  134.142452] acpi_ac_notify: ac->state 1
[  136.142782] acpi_ac_notify: ac->state 1

Since you added a sleep of 2 seconds in the kernel code now the userspace also get the correct state [with 2 seconds lag indeed] and the correct power setting is selected. I guess you are expecting this.

The question is: why this happens? Guess this is a nice bug in the Lenovo BIOS / microcontroller, but it would be nice to hear your idea.
Comment 17 Enrico Tagliavini 2013-04-13 11:36:00 UTC
Btw this is on 3.9.0 rc6
Comment 18 Lan Tianyu 2013-04-13 14:41:09 UTC
I check the ACPI table. Found the AC event handler that notify AC driver first and then sleep almost specific time before doing some operations. So this may affect the status reported to user space. So I add 2s sleep to ensure actual operation to be done. So could you try to find a suitable delay for this issue? 2s maybe a little longer. I will add a quirk for this machine. Thanks.

        Method (_Q26, 0, NotSerialized)
                     {
                        Notify (AC, 0x80)
                        Sleep (0x0c3E8)  
                        Store (One, PWRS)
                        PNOT ()
                    }

        Method (_Q27, 0, NotSerialized)
                    {
                        Notify (AC, 0x80)
                        Sleep (0x03E8)
                        Store (Zero, PWRS)
                        PNOT ()
                    }
The sleep in the bios is also weird...
Comment 19 Enrico Tagliavini 2013-04-13 14:56:54 UTC
To be honest, this machine has too much BIOS problems with linux. For example 1 times out of 4 shutdown the machine reboots instead of powering off. It is an extremely annoying machine to work with. I moved to a Dell XPS luckly. Works a way better.

Spending time finding the right time for quirks for a buggy Lenovo is not worth when I have a TODO list with useful projects bugs which is quite long. I'm sorry about that, but I think 2 seconds is acceptable. You can survive 2 seconds with the wrong powersave state set.

Also note that when I unplug the 20V DC socket from the laptop this problem doesn't happen. So it might even be an hardware latency.

I still hope you will add the quirk since I suggested some friend to buy lenovo laptops and they have this issue even if the model is different [if you need more IDs for this let me know, I will ask my friends the needed informations]. I'll also test new patches if you will be so kind to make some more.

Other then that, may I suggest to add a "Lenovo laptops taint kernel"? [This is irony :)]

Thank you for your work on this.
Comment 20 Lan Tianyu 2013-04-13 15:47:05 UTC
Ok. How about test 1s? and please provide the output of dmidecode.
Comment 21 Enrico Tagliavini 2013-04-13 17:01:35 UTC
Created attachment 98511 [details]
dmidecode output

I tested 1 second, it still works, so it is fine to have just 1 second delay here. I attached my dmidecode output.
Comment 22 Lan Tianyu 2013-04-15 08:39:49 UTC
Created attachment 98741 [details]
patch

Please test this new patch, Thanks.
Comment 23 Enrico Tagliavini 2013-04-15 18:50:22 UTC
Removed the previous patch [excluding the pr_info just to be sure]. Applied the one from comment #22 and tested. Indeed it works!

If possible think about backporting it to stable supported kernels. With a bit of luck distributions will do the same.

Thank you very much.

It is sad this is just a workaround and the real bug will never be fixed. Well at least some linux user will have one less bug.
Comment 24 Lan Tianyu 2013-05-06 13:01:30 UTC
Fix patch sent to ACPI maillist
https://patchwork.kernel.org/patch/2523951/
Comment 25 Lan Tianyu 2013-05-27 13:36:45 UTC
The patch has been merged into linux tree.

commit 0ab5bb64937d76c660c29813d8de0f4b47bf7550
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Wed May 8 07:28:46 2013 +0000

    ACPI / AC: Add sleep quirk for Thinkpad e530

    The Thinkpad e530's BIOS notifies the AC device first and then
    sleeps for certain amount of time before doing real work in the
    EC event handler (_Qxx):

Note You need to log in before you can comment on or make changes to this bug.