Bug 61051 - GPE (0x18) storm after boot - Lenovo Ideapad y560p
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: BIOS
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Lan Tianyu
Reported: 2013-09-09 05:12 UTC by James Tocknell
Modified: 2014-08-13 01:40 UTC

Kernel Version: 3.10
Regression: No


Attachments
acpidump output (277.25 KB, application/octet-stream)
2013-09-09 05:12 UTC, James Tocknell
dmesg output from 2.6.37 (54.00 KB, application/octet-stream)
2013-09-09 05:13 UTC, James Tocknell
dmesg output from 2.6.38 (57.17 KB, application/octet-stream)
2013-09-09 05:14 UTC, James Tocknell
dmesg output from 3.10 (59.78 KB, application/octet-stream)
2013-09-09 05:14 UTC, James Tocknell
contents of /proc/interrupts on 2.6.37 (2.93 KB, application/octet-stream)
2013-09-09 05:15 UTC, James Tocknell
contents of /proc/interrupts on 2.6.38 (3.06 KB, application/octet-stream)
2013-09-09 05:15 UTC, James Tocknell
contents of /proc/interrupts on 3.10 (3.42 KB, application/octet-stream)
2013-09-09 05:16 UTC, James Tocknell
output of "grep . /sys/firmware/acpi/interrupts/*" (3.91 KB, text/plain)
2013-09-10 07:58 UTC, James Tocknell
Samsung NP550P5C ACPI dump (48.18 KB, application/octet-stream)
2013-10-14 10:06 UTC, Agustin Barto
gpe.patch (2.93 KB, patch)
2014-07-01 06:49 UTC, Lan Tianyu
ACPI / scan: Simplify wakeup GPE initialization for buttons (4.72 KB, patch)
2014-07-09 11:44 UTC, Rafael J. Wysocki

Description James Tocknell 2013-09-09 05:12:50 UTC
Created attachment 107721
acpidump output

From Debian bug 655944 (http://bugs.debian.org/655944):

I found (after testing different kernels from the Debian snapshots between 2.6.32 and 3.*; I currently run 3.10) that for kernels 2.6.38 and above, kernel threads used significantly more CPU than normal on my laptop. Digging through logfiles led me to find that the count in /sys/firmware/acpi/interrupts/gpe18 was rising at about 100000 a second (on version 3.1, where I first found the issue).

The other noticeable difference (I don't know if this is important), is that the resolution cannot be set properly for versions 2.6.37 and lower.

I'm able to do some testing on the machine (I'm familiar with the unix CLI and some of the dev tools, but I haven't built my own kernel and I'm not that great with git).

Thanks
James
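
For anyone who wants to reproduce the "about 100000 a second" figure, here is a minimal sketch of how the rate could be measured. It assumes the first whitespace-separated field of the gpe18 counter file is the event count; the function names are made up for illustration and are not part of any kernel tooling.

```python
import time
from pathlib import Path


def parse_gpe_count(text):
    """Return the event count, assumed to be the first field of the counter file."""
    return int(text.split()[0])


def events_per_second(count_before, count_after, elapsed_seconds):
    """Average event rate between two samples of the counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_after - count_before) / elapsed_seconds


def measure_rate(path="/sys/firmware/acpi/interrupts/gpe18", interval=1.0):
    """Sample the counter twice, `interval` seconds apart, and return the rate."""
    counter = Path(path)
    start = time.monotonic()
    before = parse_gpe_count(counter.read_text())
    time.sleep(interval)
    after = parse_gpe_count(counter.read_text())
    return events_per_second(before, after, time.monotonic() - start)
```

Calling measure_rate() on an affected machine should return a value in the tens of thousands while the storm is active; on a healthy machine it should stay near zero.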
Comment 1 James Tocknell 2013-09-09 05:13:39 UTC
Created attachment 107731
dmesg output from 2.6.37
Comment 2 James Tocknell 2013-09-09 05:14:07 UTC
Created attachment 107741
dmesg output from 2.6.38
Comment 3 James Tocknell 2013-09-09 05:14:39 UTC
Created attachment 107751
dmesg output from 3.10
Comment 4 James Tocknell 2013-09-09 05:15:14 UTC
Created attachment 107761
contents of /proc/interrupts on 2.6.37
Comment 5 James Tocknell 2013-09-09 05:15:34 UTC
Created attachment 107771
contents of /proc/interrupts on 2.6.38
Comment 6 James Tocknell 2013-09-09 05:16:11 UTC
Created attachment 107781
contents of /proc/interrupts on 3.10
Comment 7 Aaron Lu 2013-09-09 06:58:04 UTC
Hi James,

Thanks for the report. Do I understand correctly that the high CPU usage begins to occur on v2.6.38, but the flood of gpe18 events occurs only on v3.1 and later? Thanks.
Comment 8 James Tocknell 2013-09-09 07:47:36 UTC
No, gpe18 is still increasing rapidly on v2.6.38, just not as quickly (maybe half the rate). When I installed Debian (unstable) on the laptop, v3.1 was the kernel in unstable then, which was where I found the bug first (and why the Debian bug report mentions v3.1). Sorry for the confusion. Thanks.
Comment 9 Lan Tianyu 2013-09-10 07:08:47 UTC
Please provide the output of "grep . /sys/firmware/acpi/interrupts/*".

BTW, does "echo disabled > /sys/firmware/acpi/interrupts/gpe18" stop gpe?
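
A minimal sketch for summarising that grep output, in case it helps to spot the busiest counter quickly. It assumes each line has the form "path: count status" as printed by grep for multiple files; the sample values in the usage note are invented and are not taken from the attachment.

```python
def parse_grep_output(text):
    """Map counter name (file basename) to its event count."""
    counters = {}
    for line in text.splitlines():
        path, sep, value = line.partition(":")
        fields = value.split()
        # Skip lines without a "path:" prefix or without a numeric first field.
        if not sep or not fields or not fields[0].isdigit():
            continue
        counters[path.rsplit("/", 1)[-1]] = int(fields[0])
    return counters


def busiest(counters, top=3):
    """Return the `top` counters with the highest counts, largest first."""
    return sorted(counters.items(), key=lambda item: item[1], reverse=True)[:top]
```

For example, feeding it three lines with counts 500000 (gpe18), 0 (gpe01) and 500002 (sci) and asking for the top two would list sci first and gpe18 second.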
Comment 10 James Tocknell 2013-09-10 07:58:58 UTC
Created attachment 107971
output of "grep . /sys/firmware/acpi/interrupts/*"
Comment 11 James Tocknell 2013-09-10 08:01:49 UTC
"echo disabled > /sys/firmware/acpi/interrupts/gpe18" appears to reset the count, but gpe18 is still shown as enabled.
Comment 12 Lan Tianyu 2013-09-11 05:55:51 UTC
(In reply to James Tocknell from comment #11)
> "echo disabled > /sys/firmware/acpi/interrupts/gpe18" appears to reset the
> count, but gpe18 is still shown as enabled.

Ok. Please provide the dmesg after disabling gpe18.
Comment 13 Lan Tianyu 2013-09-11 06:13:43 UTC
Sorry, my mistake: the keyword is "disable", not "disabled". Please try "echo disable > /sys/firmware/acpi/interrupts/gpe18" again.
Comment 14 James Tocknell 2013-09-13 04:09:18 UTC
On both 2.6.38 and 3.10 (the latest version in Debian unstable), running "echo disable > /sys/firmware/acpi/interrupts/gpe18" twice seems to fix the problem. Running it only once appears to do nothing (gpe18 is still shown as enabled), unless it takes more than about 30 seconds to take effect.
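
The "write disable until it sticks" workaround could be scripted roughly as below. This is only a sketch: disable_gpe is a made-up helper, the reader and writer are passed in as callables so it can be tried without root, and it assumes the status file contains the word "enabled" as a separate field while the GPE is still enabled.

```python
def disable_gpe(read_status, write_command, max_attempts=3):
    """Write "disable" until the status no longer reports "enabled".

    Returns the number of writes that were needed, or raises RuntimeError
    if the GPE is still enabled after max_attempts writes.
    """
    for attempt in range(1, max_attempts + 1):
        write_command("disable")
        if "enabled" not in read_status().split():
            return attempt
    raise RuntimeError("GPE still enabled after %d writes" % max_attempts)


# Possible real usage (needs root; the path is the one from this report):
#   from pathlib import Path
#   gpe = Path("/sys/firmware/acpi/interrupts/gpe18")
#   disable_gpe(gpe.read_text, gpe.write_text)
```

On the machine described here the function would be expected to return 2, matching the observation that a single write is not enough.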
Comment 15 Lan Tianyu 2013-09-23 01:31:02 UTC
From the DSDT, GPE 0x18 is the LID wakeup GPE (see the _PRW below), so you can run "echo LID > /proc/acpi/wakeup" to disable it. I have no idea why it fires so often, though; this looks like a hardware or BIOS issue.

Device (LID)
{
     Name (_HID, EisaId ("PNP0C0D"))  // _HID: Hardware ID
     Name (_PRW, Package (0x02)  // _PRW: Power Resources for Wake
     {
        0x18,
        0x03
     })
     ...
}
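
To make the link between the _PRW above and the sysfs file explicit, here is a small sketch. It assumes the GPE counter files are named "gpe" followed by the GPE number as two uppercase hex digits, so GPE 0x18 is gpe18 (not decimal 18), and it only handles the simple integer form of _PRW shown above. Both helper names are hypothetical.

```python
import re

# Matches "Name (_PRW, Package (0x..) { <gpe>, <sleep state>" in disassembled ASL.
_PRW_RE = re.compile(
    r"Name\s*\(_PRW,\s*Package\s*\(0x[0-9A-Fa-f]+\)[^{]*\{"
    r"\s*(0x[0-9A-Fa-f]+)\s*,\s*(0x[0-9A-Fa-f]+)"
)


def parse_prw(asl_text):
    """Return (gpe_number, deepest_wake_sleep_state) or None if no _PRW is found."""
    match = _PRW_RE.search(asl_text)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def gpe_sysfs_name(gpe_number):
    """Counter file name for a GPE number, assuming two-digit uppercase hex."""
    return "gpe%02X" % gpe_number
```

Applied to the LID device above, parse_prw gives GPE 0x18 with wake possible from S3, and gpe_sysfs_name(0x18) gives "gpe18", which is the counter that is storming.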
Comment 16 acpibug 2013-09-23 07:37:04 UTC
Could you try to do system suspend/resume and check whether the issue will be fixed?
Comment 17 Zhang Rui 2013-10-14 07:52:37 UTC
ping...
can you please do the test in comment #15 and #16?
Comment 18 Agustin Barto 2013-10-14 10:06:40 UTC
Created attachment 110921
Samsung NP550P5C ACPI dump

Binary ACPI dump of Samsung NP550P5C which has a similar problem described in http://marc.info/?l=linux-acpi&m=138170536231873&w=2
Comment 19 Agustin Barto 2013-10-14 11:28:53 UTC
I found a workaround for my problem. Enabling the "USB S3 Wake-Up" BIOS option seems to fix this problem. I had it turned off since I bought the machine, but I guess something was introduced in 3.9 that triggers the issue. Judging from the amount of ACPI related issues on Samsung laptops, a true solution for the problem would be to never install Linux on them.
Comment 20 Zhang Rui 2013-10-14 12:22:18 UTC
Agustin,
I'm not sure if they are the same problem or not.
will you please file a new bug report and attach the acpidump output both with the "USB S3 Wake-up" BIOS option set and with it cleared?

James,
comment #19 may be a clue. Can you please check your BIOS options to see if there is anything wakeup-related, and check whether changing the option helps?
Comment 21 Agustin Barto 2013-10-14 13:06:35 UTC
As per Zhang Rui's suggestion, a new bug was filed [0]. I apologize for the confusion.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=63021
Comment 22 James Tocknell 2013-10-15 07:50:34 UTC
Sorry for the slow response.

On both the 2.6.38 and 3.10 kernels I found that "echo LID > /proc/acpi/wakeup" didn't disable gpe18 (according to /sys/firmware/acpi/interrupts/gpe18), but did disable LID in /proc/acpi/wakeup (both were enabled before running "echo LID > /proc/acpi/wakeup"). Suspending via pm-suspend (which worked) with LID enabled or disabled had no effect.
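
For checking the LID state in /proc/acpi/wakeup from a script, something like the following sketch might do. It assumes each device line starts with the device name, the S-state and a status of "enabled" or "disabled" (optionally prefixed with "*"); the sample in the usage note is illustrative, not copied from this machine. Keep in mind that writing a device name to /proc/acpi/wakeup toggles it, so writing "LID" a second time enables it again.

```python
def parse_wakeup(text):
    """Map device name to (s_state, enabled) for each device line."""
    devices = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        name, s_state, status = fields[0], fields[1], fields[2].lstrip("*")
        # The header line and anything else without a valid status is skipped.
        if status not in ("enabled", "disabled"):
            continue
        devices[name] = (s_state, status == "enabled")
    return devices
```

With a file that lists "LID S3 *disabled" and "PWRB S4 *enabled", parse_wakeup would report LID as disabled and PWRB as enabled.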

In the BIOS, I have only 4 options (besides system date/time and boot order): Legacy USB support (toggle), SATA mode (AHCI/IDE), Power Beep (toggle, description says "A special beep as alarm for external power supply changes.") and Intel Virtual Technology (toggle). I found that toggling Legacy USB support had no effect. I haven't tested the other settings.
Comment 23 Muhammed YILDIRIM 2013-12-09 22:52:17 UTC
Hello.
I have a Lenovo y560p too. There are system-wide lags. For example, while watching a video you see small freezes, and you see the same freezes while scrolling down a web page. Adding noapic or nolapic to the kernel command line in GRUB fixes the freezing, but the high CPU usage persists.

@James Tocknell did you notice these freezes?

I am going to add some info here: http://pastebin.com/CEBiTGeF
Comment 24 James Tocknell 2013-12-15 23:20:21 UTC
I'm not sure. I haven't noticed small freezes in videos, but that may be because I don't watch that many videos. I may have noticed that scrolling can stutter, but I had put that down to setting the sensitivity to the wrong level.
Comment 25 Muhammed YILDIRIM 2013-12-16 12:19:32 UTC
Can you share your lspci output? Do we have same hardware?
Comment 26 James Tocknell 2013-12-17 00:31:19 UTC
Output of lspci -v: http://dpaste.com/hold/1508817/
Comment 27 Muhammed YILDIRIM 2013-12-17 09:12:40 UTC
The only difference is the network controller. You have: Intel Corporation Centrino Wireless-N 1000. I have: Qualcomm Atheros AR9285 Wireless Network Adapter.

The freezes are system-wide, but they are easiest to see while watching a video.

Using a kernel > v2.6: There are freezes. Adding noapic in GRUB fixes the freezing. The CPU usage problem persists.

Using a kernel <= v2.6: No freezes and no strange CPU usage.

One other noticeable thing is that this machine (y560p) doesn't have 2 graphics cards.
Comment 28 Muhammed YILDIRIM 2013-12-26 13:27:04 UTC
After some googling I found that other people with a y560p have the same issue, so I think this is not a hardware failure that only I have. Also, the factory-installed Windows 7 doesn't have any issue. I am able to test anything that might solve this issue with the Linux kernel.
Comment 29 Agustin Barto 2013-12-27 11:24:35 UTC
Judging by the ACPI dump, perhaps the patch that was proposed on another issue [1] might be worth trying.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=63021
Comment 30 Muhammed YILDIRIM 2013-12-28 14:19:07 UTC
Compiled with that patch. Does not fix our problem.
After running "echo disable > /sys/firmware/acpi/interrupts/gpe18" twice, CPU usage returns normal.
Comment 31 Muhammed YILDIRIM 2014-03-24 20:16:27 UTC
Can anyone contact Lenovo? I have opened a topic on Lenovo forum but no response: http://forums.lenovo.com/t5/Linux-Discussion/y560p-BIOS-Bug/td-p/1483406
Comment 32 Zhang Rui 2014-06-03 05:17:00 UTC
(In reply to Muhammed YILDIRIM from comment #30)
> Compiled with that patch. Does not fix our problem.
> After running "echo disable > /sys/firmware/acpi/interrupts/gpe18" twice,
> CPU usage returns normal.

first, I'm wondering why we need to run the command twice to make it work.
Second, please run "echo clear > /sys/firmware/acpi/interrupts/gpe18", and then "echo enable > /sys/firmware/acpi/interrupts/gpe18", does the storm come back again?
Comment 33 Muhammed YILDIRIM 2014-06-03 09:30:42 UTC
(In reply to Zhang Rui from comment #32)
> first, I'm wondering why we need to run the command twice to make it work.
> Second, please run "echo clear > /sys/firmware/acpi/interrupts/gpe18", and
> then "echo enable > /sys/firmware/acpi/interrupts/gpe18", does the storm
> come back again?

Yes it comes back.
Comment 34 Lan Tianyu 2014-07-01 02:18:03 UTC
(In reply to Zhang Rui from comment #32)
> (In reply to Muhammed YILDIRIM from comment #30)
> > Compiled with that patch. Does not fix our problem.
> > After running "echo disable > /sys/firmware/acpi/interrupts/gpe18" twice,
> > CPU usage returns normal.
> 
> first, I'm wondering why we need to run the command twice to make it work.
> Second, 

This has been resolved in the latest upstream kernel.
Comment 35 Lan Tianyu 2014-07-01 02:44:47 UTC
Hi Muhammed:
     Could you try the following patch? Test system suspend and check whether this patch will affect the wakeup via LID on your machine. Thanks.

diff --git a/drivers/acpi/wakeup.c b/drivers/acpi/wakeup.c
index 1638401..3d1c9d6 100644
--- a/drivers/acpi/wakeup.c
+++ b/drivers/acpi/wakeup.c
@@ -87,8 +87,8 @@ int __init acpi_wakeup_device_init(void)
                                                       wakeup_list);
                if (device_can_wakeup(&dev->dev)) {
                        /* Button GPEs are supposed to be always enabled. */
-                       acpi_enable_gpe(dev->wakeup.gpe_device,
-                                       dev->wakeup.gpe_number);
+//                     acpi_enable_gpe(dev->wakeup.gpe_device,
+//                                     dev->wakeup.gpe_number);
                        device_set_wakeup_enable(&dev->dev, true);
                }
        }
Comment 36 Lan Tianyu 2014-07-01 06:49:40 UTC
Created attachment 141701
gpe.patch

Please ignore the previous one and try this patch.
Comment 37 Lan Tianyu 2014-07-04 06:00:45 UTC
ping...
Comment 38 Lv Zheng 2014-07-04 08:45:57 UTC
Hi,

In acpi_ev_gpe_dispatch(), if a GPE doesn't have a handler or method to run, acpi_hw_low_set_gpe() is invoked to disable it and acpi_ev_finish_gpe() isn't called, so the GPE is disabled automatically.
So does this bug mean this facility is not working on your platform?

Thanks
Comment 39 Rafael J. Wysocki 2014-07-08 12:02:05 UTC
I think it does, but acpi_ev_asynch_execute_gpe_method() is executed
due to ACPI_GPE_DISPATCH_NOTIFY.

I wonder if we need to install a notify handler for all wakeup GPEs, then.

[Of course, the problem is a missing GPE method on the Lenovo machine
in question, but we need a workaround.]

Tianyu's patch from comment #36 is still worth trying, so could someone
who can reproduce the problem please give it a go?
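
To illustrate the difference between the two cases discussed in comments #38 and #39, here is a toy model, not kernel code. It assumes that a GPE with no handler or method stays disabled after it first fires, while a GPE dispatched via a method or a notify is re-enabled after servicing; if nothing clears the event source in that case, it fires again immediately. All names are invented for this sketch.

```python
from enum import Enum


class Dispatch(Enum):
    NONE = "none"        # no handler, method or notify installed
    METHOD = "method"    # a GPE control method exists
    NOTIFY = "notify"    # implicit notify for a wakeup device


def count_gpe_firings(dispatch, source_cleared_by_servicing, max_events=5):
    """Count how often a level-asserted GPE fires before it goes quiet.

    max_events caps the loop so that a storm terminates in this model.
    """
    enabled = True
    asserted = True
    fired = 0
    while enabled and asserted and fired < max_events:
        fired += 1
        enabled = False  # the GPE is disabled while it is being serviced
        if dispatch is Dispatch.NONE:
            break  # nothing to run: the GPE simply stays disabled
        if source_cleared_by_servicing:
            asserted = False  # the method or driver quiesced the hardware
        enabled = True  # re-enabled once the method/notify has run
    return fired
```

In this model Dispatch.NONE fires once and stops, Dispatch.METHOD with a method that clears the source also fires once, and Dispatch.NOTIFY with nothing clearing the source hits the max_events cap, which corresponds to the storm seen here.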
Comment 40 Lan Tianyu 2014-07-09 06:30:03 UTC
(In reply to Rafael J. Wysocki from comment #39)
> [Of course, the problem is a missing GPE method on the Lenovo machine
> in question, but we need a workaround.]

Hi Rafael:
         This is not a special case. I checked the DSDTs of some other machines, and they also don't have a GPE method for the LID wakeup GPE.

E.g. the acpidumps from bug 72641 and bug 77431:
    https://bugzilla.kernel.org/attachment.cgi?id=139061
    https://bugzilla.kernel.org/attachment.cgi?id=132091&action=edit
Comment 41 Rafael J. Wysocki 2014-07-09 11:40:28 UTC
Yes, the problem is in our code.  We call acpi_setup_gpe_for_wake()
for buttons just because they are buttons even though there's no _PRW.
Comment 42 Rafael J. Wysocki 2014-07-09 11:44:34 UTC
Created attachment 142581
ACPI / scan: Simplify wakeup GPE initialization for buttons

This patch should help too if I'm not mistaken, so if someone with a reproducer can test it, please do so.
