Created attachment 76261 [details]
acpi dump, dmesg and syslogs
When I boot the computer with the AC plugged and then unplug it the system doesn't always catch the change. I run KDE 4.8 and its power management still says the AC is still connected. KDE relies on upower so on ACPI IIRC. I done some edit to the default acpid scripts to be more verbose (it is attached with the other stuff). What I see when I unplug the AC is:
ac_adapter ACPI0003:00 00000080 00000001
which is indeed the wrong event! This event is the one when you plugin the AC. I checked /sys/class/power_supply/AC/online and it says 0, so the change is detected correctly.
When I boot the computer without the AC attached the correct state is recognized by ACPI and KDE. If I insert the AC and then unplug it again the issue sometimes happens again.
This is a thinkpad e530 (edge series) computer running gentoo linux (mostly the stable branch) with kernel gentoo-source-3.5.0. I can attach my kernel config if needed.
I think this issue is caused by the EC to become confused by something the kernel does. For example in this moment when the AC detector works the F1-12 keys and the multimedia keys are swapped! I have the legacy setting in the BIOS (meaning if I hit the F1 key I get the F1 key, not the Mute speaker even generated), but still sometimes the settings swaps at runtime!
I have attached a tar archive (sorry but they are a lot of files) with acpidump binary output, dmesg after a fresh boot from battery and AC and the syslog when switching from the AC to the battery (from an AC boot) and from battery to AC (from a battery boot).
# HERE I DISCONNECT THE AC. Note the wrong signal is detected
Jul 27 17:25:19 ivythink nobody: AC plugged in ac_adapter ACPI0003:00 00000080 00000001
Yes, here the "1" at the end means "online", which seems to be always set,
even for offline AC events:-(
I don't see any error messages from the EC driver in dmesg.
Please try again with the thinkpad_acpi module excluded from
the system to make sure that native driver isn't having
a bad side-effect.
Thank you for your responde. I forgotten to say I already tried without thinkpad-acpi loaded and the problem is exactly the same. Do you want the logs with thinkpad-acpi blacklisted? Something else I can try?
ehm... indeed I meant UNloaded, or better to say blacklisted, so it doesn't get loaded at boot. I also removed the AC and battery for one minute before rebooting (just to be sure the EC was clean).
Created attachment 76581 [details]
dmesg after a cold boot with thinkpad_acpi blacklisted
This is just dmesg without thinkpad-acpi loaded. Note the lines (present even when thinkpad-acpi is loaded booting with AC connected or not):
[ 4.230757] ACPI: AC Adapter [AC] (on-line)
[ 4.230792] ACPI Warning: 0x000000000000efa0-0x000000000000efbf SystemIO conflicts with Region \_SB_.PCI0.SBUS.SMBI 1 (20120320/utaddress-251)
[ 4.230796] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Doesn't sound nice, but I ignore what this means, so it might just be unrelated to my issue. I will attach the syslog in a minute.
Created attachment 76591 [details]
syslog generated without thinkpad-acpi loaded
Nothing new here, same wrong event and no difference at all from the previous logs. Please feel free to ask for any test you need.
If you I will not hear something from you I think I will enable debug layer ACPI_AC_COMPONENT on all level i guess (if this is gonna be a mess I might stop to ACPI_LV_VERBOSE_INFO including all lower levels), and I will report back if something interesting is found. But please if you think this is a good idea and the good way to go, say it, i will be more motivated :).
Also given I compile my own kernel if you want me to test some git branch or whatever custom patch this should not be a problem, feel free to ask.
Needless to say I want to go from NEEDINFO to WEHAVETHEINFO (and hopefully a fix) :).
Created attachment 76681 [details]
debug messages from acpi
Ok this is quite hard, the output is not readable if not used to acpi kernel development, so I'm going quite blind here. Please ask for some debug_layer and _level if you need additional info. This run is with level=0x000FFFFF and layer set to ACPI_EVENTS (0x00000004) only. I tried with only ACPI_AC_COMPONENT and ACPI_POWER_COMPONENT and ACPI_SYSTEM_COMPONENT but they output nothing at all while connecting/disconnecting the AC.
I will soon attach the output of the run with ACPI_DISPATCHER enabled
Created attachment 76691 [details]
debug messages from acpi dispatcher
This is with debug_level set to 0x000FFFFF and _layer to 0x00000040. It is 1.6 MBs uncompressed, so I'm forced to compress it to be able to attach it.
This is indeed a mess. I cannot even understand the exact moment when the AC is disconnected. Also remeber that some output is not related to the AC, I have periodic output from the dispatcher, but I have no idea how to filter it.
Please let me know if additional debug is needed. Thank you
Created attachment 76701 [details]
debug messages from acpi ALL_DRIVERS
Ok while trying some random thing I got this output with layer 0xFFFF0000 (ALL_DRIVERS) and level 0x000FFFFF. I attach it just in case it is important. The periodic ACPI poll is thermal indeed (which is not working btw, it always says 30 °C, but coretemp driver is working correctly and I can see the CPU temperature).
Ok I discovered something new and I think it is important. The bug happens more likely (almost always) when I unplug the ac from the AC socket on the wall (the one with 220 V alternate current). When I unplug the 20V DC socket from the laptop the ACPI even delivered is correct. So this sounds somewhat like a race condition. When the AC socket is disconnected the capacitors in the PSU are able to deliver some power for about a second. In fact the green led on the laptop just at the side of the DC socket stays on for said time. When I unplug the DC socket from the laptop this LED turns off instantly.
I'm trying some debug option to see different logs from the two events now that I'm able to reproduce more easly the two conditions, but I failed for now. Any advice on what layer/level needs to be debugged is very very welcome.
One month without activity. I have no idea what to try next, if some interest will be rised I will still be here to help debugging/testing. Fear not I will not disappear
please run acpidump > acpidump.out, without parameter "-b"
Created attachment 87541 [details]
Here it is :)
Please try to reproduce the problem and see attach the output of "grep . /sys/class/power_supply/*/*".
I guess the AC status is totally messed up already., let check it.
grep: /sys/class/power_supply/AC/device: Is a directory
grep: /sys/class/power_supply/AC/power: Is a directory
grep: /sys/class/power_supply/AC/subsystem: Is a directory
grep: /sys/class/power_supply/BAT0/device: Is a directory
grep: /sys/class/power_supply/BAT0/power: Is a directory
grep: /sys/class/power_supply/BAT0/subsystem: Is a directory
It is all good to my eyes and this is weird couse the syslog just said
Nov 29 10:53:38 ivythink logger: AC plugged in ac_adapter ACPI0003:00 00000080 00000001
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:00 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:01 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:02 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:03 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:04 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:05 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:06 00000081 00000000
Nov 29 10:53:40 ivythink logger: ACPI event unhandled: processor LNXCPU:07 00000081 00000000
Nov 29 10:53:41 ivythink logger: ACPI event unhandled: battery PNP0C0A:00 00000081 00000001
which is wrong!
Hi: Please try the following patch and attach the output of dmesg.
diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c
index 6d5bf64..91b9ff4 100644
@@ -28,6 +28,7 @@
@@ -253,6 +254,10 @@ static void acpi_ac_notify(struct acpi_device *device, u32 event)
+ pr_info("%s: ac->state %u\n", __func__, ac->state);
+ pr_info("%s: ac->state %u\n", __func__, ac->state);
acpi_bus_generate_proc_event(device, event, (u32) ac->state);
Created attachment 98481 [details]
dmesg output with a pr_info of ac->state
Hi Lan. You got a point. I attached the whole dmesg, but I guess you are interested in this:
# AC detached
[ 82.328764] acpi_ac_notify: ac->state 1
[ 84.331240] acpi_ac_notify: ac->state 0
# AC attached
[ 102.515453] acpi_ac_notify: ac->state 1
[ 104.517753] acpi_ac_notify: ac->state 1
# AC detached
[ 116.618425] acpi_ac_notify: ac->state 1
[ 118.620669] acpi_ac_notify: ac->state 0
# AC attached
[ 134.142452] acpi_ac_notify: ac->state 1
[ 136.142782] acpi_ac_notify: ac->state 1
Since you added a sleep of 2 seconds in the kernel code now the userspace also get the correct state [with 2 seconds lag indeed] and the correct power setting is selected. I guess you are expecting this.
The question is: why this happens? Guess this is a nice bug in the Lenovo BIOS / microcontroller, but it would be nice to hear your idea.
Btw this is on 3.9.0 rc6
I check the ACPI table. Found the AC event handler that notify AC driver first and then sleep almost specific time before doing some operations. So this may affect the status reported to user space. So I add 2s sleep to ensure actual operation to be done. So could you try to find a suitable delay for this issue? 2s maybe a little longer. I will add a quirk for this machine. Thanks.
Method (_Q26, 0, NotSerialized)
Notify (AC, 0x80)
Store (One, PWRS)
Method (_Q27, 0, NotSerialized)
Notify (AC, 0x80)
Store (Zero, PWRS)
The sleep in the bios is also weird...
To be honest, this machine has too much BIOS problems with linux. For example 1 times out of 4 shutdown the machine reboots instead of powering off. It is an extremely annoying machine to work with. I moved to a Dell XPS luckly. Works a way better.
Spending time finding the right time for quirks for a buggy Lenovo is not worth when I have a TODO list with useful projects bugs which is quite long. I'm sorry about that, but I think 2 seconds is acceptable. You can survive 2 seconds with the wrong powersave state set.
Also note that when I unplug the 20V DC socket from the laptop this problem doesn't happen. So it might even be an hardware latency.
I still hope you will add the quirk since I suggested some friend to buy lenovo laptops and they have this issue even if the model is different [if you need more IDs for this let me know, I will ask my friends the needed informations]. I'll also test new patches if you will be so kind to make some more.
Other then that, may I suggest to add a "Lenovo laptops taint kernel"? [This is irony :)]
Thank you for your work on this.
Ok. How about test 1s? and please provide the output of dmidecode.
Created attachment 98511 [details]
I tested 1 second, it still works, so it is fine to have just 1 second delay here. I attached my dmidecode output.
Created attachment 98741 [details]
Please test this new patch, Thanks.
Removed the previous patch [excluding the pr_info just to be sure]. Applied the one from comment #22 and tested. Indeed it works!
If possible think about backporting it to stable supported kernels. With a bit of luck distributions will do the same.
Thank you very much.
It is sad this is just a workaround and the real bug will never be fixed. Well at least some linux user will have one less bug.
Fix patch sent to ACPI maillist
The patch has been merged into linux tree.
Author: Lan Tianyu <firstname.lastname@example.org>
Date: Wed May 8 07:28:46 2013 +0000
ACPI / AC: Add sleep quirk for Thinkpad e530
The Thinkpad e530's BIOS notifies the AC device first and then
sleeps for certain amount of time before doing real work in the
EC event handler (_Qxx):