Bug 117481

Summary: GPE flooding prevention - Problems with gpe06 interrupt storm on a 2016 Macbook Pro
Product: ACPI
Reporter: Attila-Mihaly Balazs (dify.ltd)
Component: Config-Interrupts
Assignee: Lv Zheng (lv.zheng)
Status: CLOSED CODE_FIX
Severity: normal
CC: ben.kero, charles.noneman, gokcen.eraslan, igb, juha, lenb, mattst88, rui.zhang
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 4.4.0-21
Subsystem:
Regression: No
Bisected commit-id:
Attachments: acpidump output
acpidump -z output
acpidump -b output
copy of /sys/firmware/dmi/tables/DMI
Output of the "grep . -r /sys/firmware/acpi/interrupts" command
Output of "sudo perf record -g -a sleep 10"

Description Attila-Mihaly Balazs 2016-05-01 17:02:51 UTC
This seems similar to #53071 and #25412; however, it is on a different GPE and is still occurring.

Symptoms:

- A CPU core is 80-90% pegged with kworker:
12772 root      20   0       0      0      0 R  84,1  0,0  15:37.79 kworker/0:6

- When doing grep . -r /sys/firmware/acpi/interrupts/ I see a lot of interrupts on gpe06:
/sys/firmware/acpi/interrupts/gpe06:222309848   enabled

- Disabling gpe06 frees up the core without any noticeable side-effects:
# echo disable > /sys/firmware/acpi/interrupts/gpe06
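
A minimal sketch for spotting which GPE is storming, assuming each gpeXX file begins with its decimal count as in the gpe06 line above. The function name busiest_gpes and its optional directory and limit arguments are made up for illustration:

```shell
#!/bin/sh
# Print "<count> <gpe name>" for the per-GPE counters in a directory,
# highest count first. Only files named gpe + two hex digits are read,
# so aggregate entries such as gpe_all are skipped.
busiest_gpes() {
    dir="${1:-/sys/firmware/acpi/interrupts}"
    limit="${2:-5}"
    for f in "$dir"/gpe[0-9A-Fa-f][0-9A-Fa-f]; do
        [ -r "$f" ] || continue
        # The first whitespace-separated field is taken as the count.
        read -r count rest < "$f"
        printf '%s %s\n' "$count" "${f##*/}"
    done | sort -rn | head -n "$limit"
}
```

Called with no arguments, busiest_gpes reads /sys/firmware/acpi/interrupts and lists the five highest counters.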
Comment 1 Attila-Mihaly Balazs 2016-05-01 17:13:47 UTC
Created attachment 214831 [details]
acpidump output
Comment 2 Attila-Mihaly Balazs 2016-05-01 17:14:26 UTC
Created attachment 214841 [details]
acpidump -z output
Comment 3 Attila-Mihaly Balazs 2016-05-01 17:14:57 UTC
Created attachment 214851 [details]
acpidump -b output
Comment 4 Attila-Mihaly Balazs 2016-05-01 17:22:26 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=85881 also seems to be related
Comment 5 Attila-Mihaly Balazs 2016-05-01 17:26:55 UTC
Created attachment 214861 [details]
copy of /sys/firmware/dmi/tables/DMI

I attached a copy of /sys/firmware/dmi/tables/DMI since running "sudo dmidecode" gives the following error:

# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 2.4 present.
45 structures occupying 2597 bytes.
Table at 0x7AD14000.

mmap: Can't map beyond end of file /sys/firmware/dmi/tables/DMI
Table is unreachable, sorry.

(dmidecode version 3.0)
Comment 6 Len Brown 2016-05-03 00:04:39 UTC
        Method (_L06, 0, NotSerialized)  // _Lxx: Level-Triggered GPE
        {
            If ((\_SB.PCI0.IGPU.GSSE && !GSMI))
            {
                \_SB.PCI0.IGPU.GSCI ()
            }
            Else
            {
                \_SB.PCI0.IGPU.GEFC = 0x00
                SCIS = 0x01
                \_SB.PCI0.IGPU.GSSE = 0x00
                \_SB.PCI0.IGPU.SCIE = 0x00
            }
        }

GPE 06 seems to be display related:

           Device (IGPU)
            {
                Name (_ADR, 0x00020000)  // _ADR: Address
                OperationRegion (GFXH, PCI_Config, 0x00, 0x40)
                Field (GFXH, ByteAcc, NoLock, Preserve)
                {
                    VID0,   16,
                    DID0,   16
                }

                Method (_DSM, 4, NotSerialized)  // _DSM: Device-Specific Method
                {
                    If ((Arg0 == ToUUID ("a0b5b7c6-1318-441c-b0c9-fe695eaf949b")))
                    {
                        If (((VID0 & 0xFFFF) != 0xFFFF))
                        {
                            Local0 = Package (0x02)
                                {
                                    "hda-gfx",
                                    Buffer (0x0A)
                                    {
                                        "onboard-1"
                                    }
                                }
                            DTGP (Arg0, Arg1, Arg2, Arg3, RefOf (Local0))
                            Return (Local0)
                        }
                    }

                    Return (0x00)
                }
....

just for grins... what happens if you boot with "acpi_osi="
(and later, when the patch is upstream to allow it, "acpi_osi=!Darwin" )
Comment 7 Attila-Mihaly Balazs 2016-05-03 02:57:00 UTC
Created attachment 214991 [details]
Output of the "grep . -r /sys/firmware/acpi/interrupts" command
Comment 8 Attila-Mihaly Balazs 2016-05-03 02:57:48 UTC
Created attachment 215011 [details]
Output of "sudo perf record -g -a sleep 10"
Comment 9 Attila-Mihaly Balazs 2016-05-03 03:09:59 UTC
@Len: thank you for your quick reply.

- If I boot with "acpi_osi=" nothing seems to change

- I also attached the output of "grep . -r /sys/firmware/acpi/interrupts/" to make sure I didn't accidentally omit relevant details (for example both "interrupts/sci" and "interrupts/gpe_all" are also high in addition to "interrupts/gpe06" but I assumed they were some kind of "totals" perhaps)

- I also attached the output of "sudo perf record -g -a sleep 10".

- I worked around the issue by putting this in my root crontab:
@reboot echo disable > /sys/firmware/acpi/interrupts/gpe06
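
One way to check the guess that gpe_all is a total is to add up the per-GPE counters and compare the result with it. A sketch, assuming every gpeXX file starts with a decimal count; sum_gpe_counts is a made-up helper name:

```shell
#!/bin/sh
# Sum the counts of all gpeXX files (two hex digits) in a directory.
sum_gpe_counts() {
    dir="${1:-/sys/firmware/acpi/interrupts}"
    total=0
    for f in "$dir"/gpe[0-9A-Fa-f][0-9A-Fa-f]; do
        [ -r "$f" ] || continue
        read -r count rest < "$f"
        # Skip anything that is not a plain decimal number.
        case "$count" in
            ''|*[!0-9]*) continue ;;
        esac
        total=$((total + count))
    done
    echo "$total"
}
```

If the sum is close to the value in gpe_all, the "totals" reading is plausible. The counters keep moving during a storm, so an exact match is not expected.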
Comment 10 Lv Zheng 2016-05-10 00:46:17 UTC
Is it possible for you to download this git repo:

https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=linux-next

# git clone git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
# cd linux-pm
# git checkout linux-next

And try to boot the kernel with acpi_osi=!Darwin?

Thanks and best regards
-Lv
Comment 11 Lv Zheng 2016-05-16 22:01:10 UTC
We posted the GPE flooding prevention code on the ACPI mailing list.
https://patchwork.kernel.org/patch/9099221/
https://patchwork.kernel.org/patch/9099251/
https://patchwork.kernel.org/patch/9099271/
Please apply them and try acpi_block_gpe=0x06

Our strategy is not to introduce DMI-based quirks into the kernel, but to let users handle this from the command line, because this kind of flooding normally indicates a gap in the kernel.
Such quirks would grow endlessly if we started adding them for unknown gaps.

Thanks
-Lv
Comment 12 Lv Zheng 2016-05-16 22:24:51 UTC
*** Bug 105781 has been marked as a duplicate of this bug. ***
Comment 13 Lv Zheng 2016-05-17 04:51:52 UTC
*** Bug 105491 has been marked as a duplicate of this bug. ***
Comment 14 Gökçen Eraslan 2016-09-28 05:24:02 UTC
Is it acpi.block_gpe=0x00 or acpi_block_gpe=0x00? The former is written in the patch and latter is written here in the bug report.

Another question, can you tell in which kernel version this will be included?
Comment 15 Lv Zheng 2016-09-28 05:57:39 UTC
Neither of the patches (the boot parameter part) was upstreamed.
For now, IMO you can only do this at runtime.
After booting, run "echo mask > /sys/firmware/acpi/interrupts/gpeX".
This may not be convenient for distribution vendors who want to provide boot quirks.

Thanks
Lv
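
The runtime masking described above can be wrapped in a small helper. This is only a sketch: mask_gpe and its arguments are hypothetical names, and the "mask" keyword is taken from the command quoted in this comment, so it needs a kernel that accepts it.

```shell
#!/bin/sh
# Write "mask" to a GPE's interrupt file, then print the file back.
# $1 is the two-digit GPE number (for example 06); $2 optionally
# overrides the directory.
mask_gpe() {
    gpe="$1"
    dir="${2:-/sys/firmware/acpi/interrupts}"
    file="$dir/gpe$gpe"
    if [ ! -w "$file" ]; then
        echo "cannot write $file (wrong GPE number, or not root?)" >&2
        return 1
    fi
    echo mask > "$file" && cat "$file"
}
```

Run as root, "mask_gpe 06" would correspond to the echo command above with X replaced by 06.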
Comment 16 Lv Zheng 2016-09-28 06:12:22 UTC
(In reply to Gökçen Eraslan from comment #14)
> Is it acpi.block_gpe=0x00 or acpi_block_gpe=0x00? The former is written in
> the patch and latter is written here in the bug report.

Should be acpi_block_gpe=0x00.
The patch really contains problems in the comments...

Thanks
Lv
Comment 17 Matt Turner 2016-12-07 07:13:28 UTC
I don't think this is RESOLVED/CODE_FIX. I'm experiencing the problem with 4.9.0-rc6, and Lv Zheng's patches did not go upstream. Please reopen.
Comment 18 Lv Zheng 2016-12-08 02:32:28 UTC
It's not closed.
It's still on my radar as long as the status of the bug is not "closed".

I'll try to re-submit the last patch after doing necessary cleanup on it.
I did not get feedback on why Rafael disliked this patch, so I may still fail to produce the final version he expects.
However I'll try again.

Thanks
Lv
Comment 19 Lv Zheng 2016-12-08 04:53:53 UTC
Submitted here:
https://patchwork.kernel.org/patch/9465793/
Let's wait for Rafael's review.

Thanks and best regards
Lv
Comment 20 Lv Zheng 2016-12-20 00:44:28 UTC
Refreshed the patch to match the changed Documentation structure, and improved its declarators and patch description:
https://patchwork.kernel.org/patch/9477259/
Comment 21 Gökçen Eraslan 2017-04-04 06:08:42 UTC
Any news? Is this now merged to the kernel?
Comment 22 Gökçen Eraslan 2017-06-06 18:37:32 UTC
Doesn't it make more sense to mark it as IN_PROGRESS? It's RESOLVED when the fix is in upstream.
Comment 23 Lv Zheng 2017-07-04 00:42:23 UTC
Closing as:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9c4aa1eecb48cfac18ed5e3aca9d9ae58fbafc11

No platform quirks will be upstreamed; you can use kernel boot parameters for your platforms. An example of using this mechanism:
 acpi_mask_gpe=0x06

Thanks
Lv
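
After adding the parameter to the bootloader configuration and rebooting, it may help to confirm that the running kernel actually received it. A small sketch, assuming the command line is a single space-separated line as in /proc/cmdline; masked_gpes_from_cmdline is a made-up name:

```shell
#!/bin/sh
# Print the value of every acpi_mask_gpe= argument found on a kernel
# command line, one per line. Reads /proc/cmdline unless a file is given.
masked_gpes_from_cmdline() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each argument on its own line, then keep only the values of
    # the acpi_mask_gpe= ones.
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^acpi_mask_gpe=//p'
}
```

No output means the parameter was not passed to the running kernel.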