Bug 16512

Summary: Regression - Immediate wakeup after suspend due to LID event since 2.6.34 - MSI Wind
Product: Power Management
Component: Hibernation/Suspend
Status: CLOSED DOCUMENTED
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.3.0
Subsystem:
Regression: Yes
Bisected commit-id:
Reporter: Dennis Jansen (dennis.jansen)
Assignee: Zhang Rui (rui.zhang)
CC: alan, harviecz, lenb, rjw, rui.zhang
Bug Depends on:
Bug Blocks: 7216
Attachments: dmesg of 2.6.33 after suspend
dmesg of 2.6.34 after trying to suspend
dmesg of 2.6.35 after trying to suspend
dmesg diff -y between 2.6.33 and 2.6.34
lspci
custom DSDT: use \LIDS for LID0 status
dmesg 2.6.36 with patched DSDT from #26
debug patch: enable all the lid device
custom DSDT: show correct lid status (the same as LID0) for LID device.
debug patch to disable LID0

Description Dennis Jansen 2010-08-04 10:09:25 UTC
Some time after kernel 2.6.31 my system stopped entering suspend-to-RAM. It goes all the way to UP mode, but then simply wakes back up again.

(2.6.35 from ubuntu kernel ppa, single user mode)
(...)
[36203.085221] HDA Intel 0000:00:1b.0: PCI INT A disabled
[36203.100162] psb 0000:00:02.0: PCI INT A disabled
[36203.116177] PM: suspend devices took 1.644 seconds
[36203.116368] ehci_hcd 0000:00:1d.7: PME# disabled
[36203.132260] ACPI: Preparing to enter system sleep state S3
[36203.139955] Disabling non-boot CPUs ...
[36203.244033] CPU 1 is now offline
[36203.244040] SMP alternatives: switching to UP code
[36203.249858] CPU0 attaching NULL sched-domain.
[36203.249867] CPU1 attaching NULL sched-domain.
[36203.249880] CPU0 attaching NULL sched-domain.
[36203.250563] CPU1 is down
[36203.250624] Extended CMOS year: 2000
[36203.250624] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[36203.250624] Back to C!
[36203.250624] CPU0: Thermal monitoring enabled (TM1)
[36203.250624] Extended CMOS year: 2000
[36203.250624] Enabling non-boot CPUs ...
[36203.250624] SMP alternatives: switching to SMP code
[36203.256386] Booting processor 1 APIC 0x1 ip 0x6000
[36203.249521] Initializing CPU#1
[36203.249521] Calibrating delay using timer specific routine.. 3191.95 BogoMIPS (lpj=6383919)
(...)

Where do I start debugging? Unfortunately, as you can see, it's a netbook, and it has neither much disk space, nor RAM, nor processor performance. I will add results for the kernels between 2.6.31 and 2.6.35.
Comment 1 Dennis Jansen 2010-08-04 10:10:37 UTC
System: MSI Wind U110 (GMA 500)
OS: Ubuntu 9.10 and Ubuntu 10.04
Comment 2 Dennis Jansen 2010-08-04 10:57:19 UTC
Result: The bug first appears in 2.6.34. I'll attach the dmesgs of 2.6.33 to 35. Please let me know if anything else could help. No, I won't bisect for now.
Comment 3 Dennis Jansen 2010-08-04 10:58:31 UTC
Created attachment 27343 [details]
dmesg of 2.6.33 after suspend

Here suspend is still working as expected.
Comment 4 Dennis Jansen 2010-08-04 11:00:44 UTC
Created attachment 27344 [details]
dmesg of 2.6.34 after trying to suspend

Here nothing seems to happen, except that video is as always broken after suspend until going back into X.
Comment 5 Dennis Jansen 2010-08-04 11:02:05 UTC
Created attachment 27345 [details]
dmesg of 2.6.35 after trying to suspend

Here the system also does not enter suspend.
Comment 6 Dennis Jansen 2010-08-04 11:42:11 UTC
Created attachment 27346 [details]
dmesg diff -y between 2.6.33 and 2.6.34

diff -y -w -B --suppress-common-lines ../dmesg-2.6.33 dmesg 2.6.34
Comment 7 Rafael J. Wysocki 2010-08-04 12:01:09 UTC
The log in comment #5 shows that the system actually has suspended and woken up immediately afterwards.  I guess one of the wakeup devices on your system causes the immediate wakeup to happen and I'd bet on USB controllers.

Please attach:
(1) the contents of /proc/acpi/wakeup
(2) the output of lspci
Comment 8 Dennis Jansen 2010-08-04 12:04:18 UTC
Created attachment 27347 [details]
lspci
Comment 9 Dennis Jansen 2010-08-04 12:23:10 UTC
Genius, Mr. Wysocki!
Though the content of /proc/acpi/wakeup is the same:
EUSB      S4    *disabled  pci:0000:00:1d.7
LID       S4    *enabled

and USB is innocent, for some reason LID is the culprit. echo LID > /proc/acpi/wakeup fixes the issue!

Now how to get this into the kernel? Where is the bug?
Comment 10 Rafael J. Wysocki 2010-08-04 22:57:20 UTC
The LID is marked as a wakeup device, because it is handled by the button driver, which is intentional, but your BIOS generates a LID event every time immediately after suspend (it shouldn't do that).

Please use the "echo LID > /proc/acpi/wakeup" workaround for now.  Perhaps there is a better workaround for this issue, but generally it is a BIOS problem.
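
Note on the workaround: writing a device name to /proc/acpi/wakeup toggles that device's state rather than forcing it off, so running the echo a second time would re-enable LID. The following is a minimal sketch of a guarded version, assuming the table format shown in comment #9. The helper names wakeup_state and disable_wakeup are hypothetical and are not part of any existing tool.

```shell
# Sketch only: wakeup_state and disable_wakeup are hypothetical helpers.

# Print "enabled" or "disabled" for device $1, reading a
# /proc/acpi/wakeup-style table on stdin. Prints nothing if the
# device is not listed.
wakeup_state() {
    awk -v dev="$1" '
        $1 == dev {
            if ($0 ~ /\*enabled/)       print "enabled"
            else if ($0 ~ /\*disabled/) print "disabled"
            exit
        }'
}

# Toggle device $1 in table file $2 (default: /proc/acpi/wakeup),
# but only if it is currently enabled, so repeated calls are harmless.
disable_wakeup() {
    table="${2:-/proc/acpi/wakeup}"
    if [ "$(wakeup_state "$1" < "$table")" = "enabled" ]; then
        echo "$1" > "$table"
    fi
}
```

Run as root, "disable_wakeup LID" before suspending should have the same effect as the plain echo, without flipping LID back on if it has already been disabled.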
Comment 11 Zhang Rui 2010-09-27 01:32:58 UTC
ping Dennis...
Comment 12 Zhang Rui 2010-09-27 01:34:32 UTC
BTW, Dennis, please make sure that your machine can suspend correctly by using the shutdown mune -> suspend button.
Comment 13 Dennis Jansen 2010-09-27 18:15:35 UTC
It works fine with the echo LID workaround.
Any chance for an out of the box fix?

What's shutdown mune?
Comment 14 Zhang Rui 2010-09-28 01:35:13 UTC
(In reply to comment #13)
> It works fine with the echo LID workaround.
> Any chance for an out of the box fix?
> 
> What's shutdown mune?

oh, I mean the shutdown menu.
you don't need to do the test as we have verified the root cause.
And I'm afraid we can not fix it in Linux/kernel.

please attach the content of /proc/acpi/wakeup
Comment 15 Dennis Jansen 2010-09-28 05:27:12 UTC
Again, please see above, #9.
Comment 16 Zhang Rui 2010-09-28 05:33:59 UTC
do you have any chance to boot into a "working" kernel, say 2.6.31, where the problem doesn't exist, and attach the /proc/acpi/wakeup file of that kernel?
Comment 17 Dennis Jansen 2010-09-28 05:46:26 UTC
It's the same. But if you like I can paste it.
Comment 18 Zhang Rui 2010-09-29 02:48:35 UTC
so I'm wondering what makes this a regression.
can you run git-bisect to see which commit introduced this problem please?
Comment 19 Dennis Jansen 2010-09-29 05:53:31 UTC
Sorry, not for about a year.
Comment 20 Rafael J. Wysocki 2010-09-29 10:14:58 UTC
@Rui: Well, I think we simply started to actually use the lid as a wakeup event source on this machine at one point and that revealed the problem.

FWIW: I have a Wind U100 and it is not affected.

I'm going to move the ACPI wakeup devices to the general wakeup infrastructure that is being developed at the moment and that should make it easier to work around this issue, or even provide a fix for it.
Comment 21 Dennis Jansen 2010-10-22 20:09:56 UTC
I just found out by accident that my acpi lid state file always reports the lid to be open. I guess that might be related to this bug.
Comment 22 Rafael J. Wysocki 2010-10-22 21:12:18 UTC
It certainly is.
Comment 23 Dennis Jansen 2010-10-22 21:59:10 UTC
Ok. The details:

there are two entries: LID and LID0.
inotifywatch finds access to LID, but it stays on "open".
LID0 does not generate an inotify event I think, but it changes to closed.
Comment 24 Zhang Rui 2010-10-25 00:18:46 UTC
(In reply to comment #23)
> there are two entries: LID and LID0.
> inotifywatch finds access to LID, but it stays on "open".
> LID0 does not generate an inotify event I think, but it changes to closed.

please attach the acpidump output of your laptop.
Comment 25 Dennis Jansen 2010-10-25 06:00:49 UTC
acpidump: https://bugzilla.kernel.org/attachment.cgi?id=32592

various other information: https://bugzilla.kernel.org/show_bug.cgi?id=19762
Comment 26 Zhang Rui 2010-12-27 07:09:56 UTC
Created attachment 41672 [details]
custom DSDT: use \LIDS for LID0 status

please apply the custom DSDT and see if the problem still exists.
Comment 27 Dennis Jansen 2010-12-27 20:37:04 UTC
Created attachment 41742 [details]
dmesg 2.6.36 with patched DSDT from #26

Sorry, the custom DSDT does not work. It wakes again immediately. I've attached the dmesg.
Comment 28 Rafael J. Wysocki 2010-12-27 23:49:27 UTC
@Rui: According to Dennis, LID0 is not a problem.  LID is.
Comment 29 Zhang Rui 2012-01-18 02:15:24 UTC
It's great that kernel bugzilla is back.

what's the current status of this bug?
Dennis,
can you please verify if the problem still exists in the latest upstream
kernel?
Comment 30 Tomas Mudrunka 2012-01-21 04:19:54 UTC
Can't this be related to this?
https://bugzilla.kernel.org/show_bug.cgi?id=11717

My laptop had suddenly woken up without a reason several times when i've been travelling which is not very comfortable. I'd prefer reliable suspend over having one more button to wake up the system (when i can use any other key to wake it up).
Comment 31 Dennis Jansen 2012-01-25 09:22:02 UTC
Just confirmed in 3.2.1. Still exactly the same. Thanks for checking! I won't have too much time for debugging until August.
Comment 32 Dennis Jansen 2012-01-25 09:24:41 UTC
btw. There is only LID in wakeup with the new kernel, no LID0.

@Tomas: try disabling everything in /proc/acpi/wakeup and see if that solves your problems.
Comment 33 Dennis Jansen 2012-03-20 07:10:48 UTC
(Still in 3.3.0-rc7.)
Comment 34 D. Jansen 2012-06-14 20:31:15 UTC
Does not seem related. I can't test as there are other issues in the
current kernel. I'll try again with the next major release.
Comment 35 Dennis Jansen 2012-09-16 17:59:32 UTC
confirmed in 3.6.

Also this might be interesting: the kernel says it finds more than one lid device:
[    0.909629] ACPI: AC Adapter [ADP1] (off-line)
[    0.912260] input: Lid Switch as /devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0D:00/input/input0
[    0.919659] ACPI: Lid Switch [LID0]
[    0.919899] input: Sleep Button as /devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0E:00/input/input1
[    0.920033] ACPI: Sleep Button [SLPB]
[    0.920247] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input2
[    0.920355] ACPI: Power Button [PWRB]
XXX [    0.920450] ACPI: More than one Lid device found! XXX
[    0.920535] button: probe of PNP0C0D:01 failed with error -17
Comment 36 Zhang Rui 2012-11-28 02:56:00 UTC
Created attachment 87461 [details]
debug patch: enable all the lid device

please try this debug patch.
please attach the dmesg output after system boot as well.
Comment 37 Zhang Rui 2012-12-11 01:28:02 UTC
ping...
Comment 38 Dennis Jansen 2012-12-11 11:26:42 UTC
I can't apply this to 2.6.36, right? Because I haven't upgraded my kernel
since then, due to graphics drivers problems (poulsbo). And with 3.x I'll
have to find a version again where standby works... Could you please port
this patch to 2.6.36? I could respond very quickly then.


>
Comment 39 Alan 2012-12-11 13:25:00 UTC
Poulsbo should work out of the box with the kernel GMA500 driver and the standard Xorg mode setting driver in 3.6. No hardware 3D acceleration however.
Comment 40 Dennis Jansen 2012-12-11 20:14:23 UTC
Ok, tested the patch (backported in 2.6.36). I didn't notice anything
different. I didn't see anything new in dmesg. Anything particular I should
look for?


Comment 41 Zhang Rui 2013-04-13 16:53:48 UTC
Created attachment 98491 [details]
custom DSDT: show correct lid status (the same as LID0) for LID device.

first, please try the custom dsdt attached.
it should show the correct lid status as it returns the same value as LID0.
Comment 42 Zhang Rui 2013-04-13 16:57:53 UTC
Created attachment 98501 [details]
debug patch to disable LID0

if the custom dsdt does not help, please apply this debug patch on top and see if it helps.
Comment 43 Zhang Rui 2013-04-21 14:32:40 UTC
ping...
Comment 44 Dennis Jansen 2013-04-21 18:13:01 UTC
pong. Hi. thank you very much. This will take some time. Sorry about that.


Comment 45 Dennis Jansen 2013-04-22 18:15:10 UTC
The new DSDT does not change anything in behavior. Anything special I should look for? It still suspends and resumes immediately (unless disabled via /proc/acpi/wakeup). Checking the patch next.
Comment 46 Dennis Jansen 2013-04-22 18:30:52 UTC
Oh, I tested the patch by itself, which does not change behavior either. Now I'll try the combination of changed DSDT and patch.
(all tested in 2.6.36)
Comment 47 Dennis Jansen 2013-04-22 18:37:09 UTC
No change either. As a summary, I've tested:
- only dsdt
- only patch
- both dsdt and patch.

Effects: always necessary to manually disable lid in proc, otherwise immediate resume.
Comment 48 Zhang Rui 2013-04-23 02:18:04 UTC
(In reply to comment #23)
> Ok. The details:
> 
> there are two entries: LID and LID0.
> inotifywatch finds access to LID, but it stays on "open".
> LID0 does not generate an inotify event I think, but it changes to closed.

the custom DSDT is to fix the inotifywatch issue.
please check if it works or not.
Comment 49 Zhang Rui 2013-04-23 02:41:53 UTC
(In reply to comment #45)
> The new DSDT does not change anything in behavior. Anything special I should
> look for? It still suspends and resumes immediately (unless disabled via
> /proc/acpi/wakeup).

With echo LID > /proc/acpi/wakeup, the difference is that we disable GPE 0x0D.
So I'm wondering why this is a regression, because 0x0D is always enabled during suspend, even in old kernels.

Could you please double check whether suspend always works in 2.6.30?
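
Note: one way to look at GPE 0x0D from user space is the per-GPE counter file under /sys/firmware/acpi/interrupts/. The following is a rough sketch, assuming the file is named gpe0D and that its first field is the event count. The helper names gpe_count and gpe_delta are hypothetical, and whether the counter actually increments for a wake event on this machine has not been tested in this thread.

```shell
# Sketch only: gpe_count and gpe_delta are hypothetical helpers.

# Print the event count (first field of the first line) from a
# /sys/firmware/acpi/interrupts/gpeXX-style file on stdin.
gpe_count() {
    awk 'NR == 1 { print $1 + 0 }'
}

# Print how many events fired between an earlier reading $1
# and a later reading $2.
gpe_delta() {
    echo $(( $2 - $1 ))
}
```

For example, take before=$(gpe_count < /sys/firmware/acpi/interrupts/gpe0D), attempt a suspend, take a second reading the same way, and compare the two with gpe_delta.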
Comment 50 Dennis Jansen 2013-04-23 07:35:24 UTC
Ok, so regarding inotifywatch:
I still get no notification when closing the lid. Now there's only LID, not LID0. When I use cat, I can see that the lid state correctly changes to closed, though. So that's probably a problem somewhere else.
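
Note: a likely explanation, not confirmed in this thread, is that procfs files are generated when they are read, so inotify does not report a modification when the kernel-side lid state changes. Polling the file is a simple alternative. The following is a minimal sketch, assuming the state file contains a line of the form "state:      open". The path in the usage note and the helper names lid_state and watch_lid are assumptions.

```shell
# Sketch only: lid_state and watch_lid are hypothetical helpers.

# Print "open" or "closed" from the contents of a lid state file
# on stdin (expected form: "state:      open").
lid_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Poll file $1 every $2 seconds (default: 1) and print each change.
# Runs until interrupted.
watch_lid() {
    last=""
    while :; do
        cur=$(lid_state < "$1")
        if [ "$cur" != "$last" ]; then
            echo "lid: $cur"
            last=$cur
        fi
        sleep "${2:-1}"
    done
}
```

Something like "watch_lid /proc/acpi/button/lid/LID/state" should then print a line each time the lid is opened or closed, provided the state file itself reports the correct value.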
Comment 51 Dennis Jansen 2013-04-23 07:35:59 UTC
Would kernel 2.6.31 also work? I don't have a 2.6.30 right now.
Comment 52 Dennis Jansen 2013-04-23 07:37:19 UTC
As I wrote above, the issue first appears in 2.6.34 for this system.
Comment 53 Zhang Rui 2013-04-23 07:40:17 UTC
(In reply to comment #52)
> As I wrote above, the issue first appears in 2.6.34 for this system.

then please check any kernel earlier than 2.6.34
Comment 54 Zhang Rui 2013-04-23 08:13:44 UTC
(In reply to comment #52)
> As I wrote above, the issue first appears in 2.6.34 for this system.

oh, better with 2.6.33
Comment 55 Zhang Rui 2013-04-23 08:37:59 UTC
another question, can opening lid wake up the system from suspend in 2.6.33 kernel?
Comment 56 Zhang Rui 2013-04-23 08:45:41 UTC
this is the code in 2.6.33:
void acpi_enable_wakeup_device(u8 sleep_state)
{
        struct list_head *node, *next;

        /* 
         * Caution: this routine must be invoked when interrupt is disabled 
         * Refer ACPI2.0: P212
         */
        list_for_each_safe(node, next, &acpi_wakeup_device_list) {
                struct acpi_device *dev =
                        container_of(node, struct acpi_device, wakeup_list);

                if (!dev->wakeup.flags.valid)
                        continue;

                /* If users want to disable run-wake GPE,
                 * we only disable it for wake and leave it for runtime
                 */
                if ((!dev->wakeup.state.enabled && !dev->wakeup.prepare_count)
                    || sleep_state > (u32) dev->wakeup.sleep_state) {
                        if (dev->wakeup.flags.run_wake) {
                                /* set_gpe_type will disable GPE, leave it like that */
                                acpi_set_gpe_type(dev->wakeup.gpe_device,
                                                  dev->wakeup.gpe_number,
                                                  ACPI_GPE_TYPE_RUNTIME);
                        }
                        continue;
                }
                if (!dev->wakeup.flags.run_wake)
                        acpi_enable_gpe(dev->wakeup.gpe_device,
                                        dev->wakeup.gpe_number);
        }
}

We can see that acpi_enable_gpe is invoked for all the non-run-wake GPEs.
This is strange, because it seems that the run-wake GPEs were never enabled during suspend previously, or am I missing something?
Comment 57 Zhang Rui 2013-04-23 09:14:53 UTC
here is the code path in 2.6.34:
suspend_ops->
.prepare_late -> acpi_pm_prepare |-> __acpi_pm_prepare
                                 |-> acpi_disable_all_gpes to disable all gpes

.enter -> acpi_enable_wakeup_device -> acpi_set_gpe to enable GPE for all wake up gpes

and here is the code path in 2.6.33:
suspend_ops->
.prepare_late -> acpi_pm_prepare |-> __acpi_pm_prepare
                                 |-> acpi_disable_all_gpes to disable all gpes

.enter -> acpi_enable_wakeup_device -> acpi_enable_gpe for all non-run-wake gpes

This makes me wonder whether this is a software bug, or a software improvement that reveals a hardware problem.

Rafael,
can you comment on this one please? This change was introduced in
commit 9630bdd9b15d2f489c646d8bc04b60e53eb5ec78
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Wed Feb 17 23:41:07 2010 +0100

    ACPI: Use GPE reference counting to support shared GPEs
    
    ACPI GPEs may map to multiple devices.  The current GPE interface
    only provides a mechanism for enabling and disabling GPEs, making
    it difficult to change the state of GPEs at runtime without extensive
    cooperation between devices.
    
    Add an API to allow devices to indicate whether or not they want
    their device's GPE to be enabled for both runtime and wakeup events.
    
    Remove the old GPE type handling entirely, which gets rid of various
    quirks, like the implicit disabling with GPE type setting. This
    requires a small amount of rework in order to ensure that non-wake
    GPEs are enabled by default to preserve existing behaviour.
    
    Based on patches from Matthew Garrett <mjg@redhat.com>.
    
    Signed-off-by: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Comment 58 Dennis Jansen 2013-04-23 09:18:31 UTC
He's already commented, see above, e.g. 
https://bugzilla.kernel.org/show_bug.cgi?id=16512#c10
https://bugzilla.kernel.org/show_bug.cgi?id=16512#c20
Comment 59 Dennis Jansen 2013-04-23 09:22:12 UTC
FWIW: I thank you all very much. The workaround has been doing very well for me. And I expect soon I will no longer use this system much. So if you're interested I might be able to ship it to you for testing. On the other hand, I think this bug is not very important as the issue is not too serious and the workaround is very simple.
Maybe your resources are better used somewhere else? :) I would understand that!

btw. AFAIK waking up from S3 by opening the lid doesn't even work in Windows.
Comment 60 Zhang Rui 2013-04-23 09:23:53 UTC
yeah, I remember these comments.
But I think that is just an assumption in comment #10.
And I just want to confirm his conclusion after some investigation.
Comment 61 Dennis Jansen 2013-04-23 10:52:24 UTC
1. Ok, I've tested 2.6.32. I could suspend the system without changing the wakeup file (which showed one "LID" as open).
2. But the system did not wake up if I opened the lid. I confirmed it does not do this in Windows, either. 
3. Also, in Windows, the detection of closing the lid seems less reliable than in Linux.
Comment 62 Zhang Rui 2013-04-24 01:28:20 UTC
okay. 
I think the problem is clear now.
It is the hardware that generates a LID wakeup GPE during suspend, which causes Linux to resume immediately.
In this case, there is nothing we can do in the kernel, except the workaround to disable LID wakeup GPE, as stated in comment #9.

Bug closed.