Bug 217315 - Unable to wake from lid in 6.3-rc5 or later
Summary: Unable to wake from lid in 6.3-rc5 or later
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Input Devices (show other bugs)
Hardware: AMD Linux
: P1 normal
Assignee: Mario Limonciello (AMD)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-04-10 04:56 UTC by Mario Limonciello (AMD)
Modified: 2023-08-21 20:49 UTC (History)
5 users (show)

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
acpidump (1.50 MB, text/plain)
2023-04-10 04:58 UTC, Mario Limonciello (AMD)
Details
/sys/kernel/debug/gpio output from various cases (7.14 KB, application/gzip)
2023-04-11 01:26 UTC, Mario Limonciello (AMD)
Details
dmesg with revert and AML tracing (1.03 MB, text/plain)
2023-04-11 17:10 UTC, Mario Limonciello (AMD)
Details
possible patch v1 (5.84 KB, patch)
2023-04-13 20:36 UTC, Mario Limonciello (AMD)
Details | Diff
possible patch v2 (2.85 KB, patch)
2023-04-19 05:26 UTC, Mario Limonciello (AMD)
Details | Diff

Description Mario Limonciello (AMD) 2023-04-10 04:56:45 UTC
I've found a regression in 6.3-rc5 and 6.3-rc6 where a Lenovo Z13 can no longer wake from lid.

Lid wakeup is enabled in /sys/bus/acpi/drivers/button/PNP0C0D:00/power/wakeup.

I bisected it down to:

b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume")

6.3-rc6 *without a revert* shows the following wake snippet when pm_debug_messages is enabled.

[   87.062077] PM: suspend-to-idle
[   87.062114] ACPI: EC: ACPI EC GPE status set
[   87.062131] ACPI: PM: Rearming ACPI SCI for wakeup
[   87.062145] amd_pmc AMDI0007:00: SMU idlemask s0i3: 0x8fff9eb5
[   87.066192] Timekeeping suspended for 2.915 seconds
[   87.066192] PM: Triggering wakeup from IRQ 9
[   87.066192] ACPI: EC: ACPI EC GPE status set
[   87.066192] ACPI: EC: ACPI EC GPE dispatched
[   87.108716] ACPI: EC: ACPI EC work flushed
[   87.108738] ACPI: PM: Rearming ACPI SCI for wakeup
[   87.108759] amd_pmc AMDI0007:00: SMU idlemask s0i3: 0x8fff9eb5
[   87.108803] PM: Triggering wakeup from IRQ 9
[   87.108906] ACPI: EC: ACPI EC GPE status set
[   87.108935] ACPI: EC: ACPI EC GPE dispatched
[   87.167874] ACPI: EC: ACPI EC work flushed
[   87.257968] ACPI: EC: ACPI EC work flushed
[   87.257974] ACPI: PM: Rearming ACPI SCI for wakeup
[   87.257987] PM: Triggering wakeup from IRQ 9
[   87.257996] amd_pmc AMDI0007:00: SMU idlemask s0i3: 0x8fff9eb5
[   87.258026] ACPI: EC: ACPI EC GPE status set
[   87.258036] ACPI: PM: Rearming ACPI SCI for wakeup
[   87.258046] amd_pmc AMDI0007:00: SMU idlemask s0i3: 0x8fff9eb5
[   87.262060] Timekeeping suspended for 3.804 seconds
[   87.262060] PM: Triggering wakeup from IRQ 9
[   87.262060] ACPI: EC: ACPI EC GPE status set
[   87.262060] ACPI: EC: ACPI EC GPE dispatched
[   87.304695] ACPI: EC: ACPI EC work flushed
[   87.304711] ACPI: PM: Rearming ACPI SCI for wakeup
[   87.304733] amd_pmc AMDI0007:00: SMU idlemask s0i3: 0x8fff9eb5

So the kernel keeps on processing the ACPI SCI but never actually breaks the s2idle loop.

Reverting this commit on top of 6.3-rc6 fixes the problem.

Here is the same snippet:

[  137.945857] PM: suspend-to-idle
[  137.945887] ACPI: EC: ACPI EC GPE status set
[  137.945904] ACPI: PM: Rearming ACPI SCI for wakeup
[  137.945916] amd_pmc AMDI0007:00: SMU idlemask s0i3: 0x8fff9eb5
[  137.949957] Timekeeping suspended for 5.025 seconds
[  137.949957] PM: Triggering wakeup from IRQ 9
[  137.950162] PM: Triggering wakeup from IRQ 7
[  137.949957] ACPI: EC: ACPI EC GPE status set
[  137.949957] ACPI: EC: ACPI EC GPE dispatched
[  137.992769] ACPI: EC: ACPI EC work flushed
[  137.992788] ACPI: PM: Wakeup after ACPI Notify sync
[  137.992792] PM: resume from suspend-to-idle

You can see that after the ACPI SCI is processed the IRQ controller is also active and the system wakes up.
Comment 1 Mario Limonciello (AMD) 2023-04-10 04:58:45 UTC
Created attachment 304104 [details]
acpidump
Comment 2 Mario Limonciello (AMD) 2023-04-11 01:26:39 UTC
Created attachment 304114 [details]
/sys/kernel/debug/gpio output from various cases

Requested in the thread was the output from /sys/kernel/debug/gpio before/after suspend.

Attaching 3 cases:
1) 6.3-rc6 (broken resume from lid; so used power button to resume)
2) 6.3-rc6 + patch suggested inline in thread on April 10 (broken resume from lid; so used power button to resume)
3) 6.3-rc6 + revert (resumed from lid successfully)
Comment 3 Kornel Dulęba 2023-04-11 10:36:38 UTC
Hello Mario,


I have a hypothesis as to what is going on:

1. The ACPI logic for lid wakeup doesn't request a wakeup irq from pin controller. (irq_set_irq_wake)
From what I can see it doesn't interact with it at all.
Yet we're relying on the pin controller to detect the wakeup event.

2. If I understand correctly the support for that was added in a patch that you wrote - 2d54067fcd23 ("pinctrl: amd: Fix wakeups when IRQ is shared with SCI") 

3. The interrupt itself has to be armed in FW, since the kernel doesn't know anything about it.

In summary the kernel is expected to detect an interrupt that was configured somewhere else.

Could you confirm whether I'm correct here?
If that's the case we need to revert my patch and think of a different approach.
Comment 4 Kornel Dulęba 2023-04-11 11:38:02 UTC
One more thing.
Could you check /proc/interrupts with this patch reverted, and after resume.
I wonder if that approach wouldn't result in an interrupt storm.
Comment 5 Mario Limonciello (AMD) 2023-04-11 13:36:15 UTC
> Could you confirm whether I'm correct here?

I don't think that's all correct. This particular Lenovo platform doesn't share ACPI SCI IRQ with the GPIO controller.
For this platform ACPI SCI is IRQ 9, GPIO controller is IRQ 7.

> Could you check /proc/interrupts with this patch reverted, and after resume.

Here's how I checked to avoid letting the power button trigger GPIO for the failure:

1) Run this script:
#!/bin/sh
cat /proc/interrupts > interrupts_before.txt
rtcwake --seconds 30 -m mem
cat /proc/interrupts > interrupts_before.txt

2) Open lid before the 30 seconds
3) Compare acpi and pinctrl_amd interrupts.

I noticed that with reverted patch ACPI goes up a handful of times and pinctrl_amd goes up by one.
With 6.3-rc6 alone it shows rtc0 increases (as that actually broke suspend loop) and ACPI has storm.
Comment 6 Kornel Dulęba 2023-04-11 13:43:28 UTC
```
I don't think that's all correct. This particular Lenovo platform doesn't share ACPI SCI IRQ with the GPIO controller.
For this platform ACPI SCI is IRQ 9, GPIO controller is IRQ 7.
```

Ok, I see.
The only thing that this patch changes is that we might no longer be able to detect a wakeup event through amd_gpio_check_wake.
Since we're at rc6 right now it might be best just to revert my patch for the time being.
Comment 7 Mario Limonciello (AMD) 2023-04-11 13:44:56 UTC
> Since we're at rc6 right now it might be best just to revert my patch for the
> time being.

Yeah; I agree.  I'm happy to iterate with you during 6.4 on a variation of this patch so we can get something that works for everyone.
Comment 8 Kornel Dulęba 2023-04-11 13:46:12 UTC
>Yeah; I agree.  I'm happy to iterate with you during 6.4 on a variation of
>this > patch so we can get something that works for everyone.

Thanks, I'll send a reverting change right away.
Comment 9 Mario Limonciello (AMD) 2023-04-11 17:10:24 UTC
Created attachment 304121 [details]
dmesg with revert and AML tracing

BTW I did confirm that `do_amd_gpio_irq_handler` ISR isn't executed in the working case with the revert until after the s2idle loop is broken.

Also with dynamic debugging on the ISR doesn't show any GPIO active message (GPIO %d is active).

So to me it seems the GPIO controller IRQ is active w/ the revert as a result of ASL that executed whilest in the s2idle loop.
Comment 10 Kornel Dulęba 2023-04-12 09:58:51 UTC
Interesting.
But then why didn't we exit the s2idle loop with the patch applied?
The only thing that my patch changed was the logic inside pinctrl resume function, which then could affect the return value from `do_amd_gpio_irq_handler`.
Comment 11 Mario Limonciello (AMD) 2023-04-12 14:02:04 UTC
IRQ7 wasn't active during resume with the commit applied.
I'm just as confused as you...

I am starting to wonder if we're actually seeing a timing problem only exposed by your change and the original commit you were basing around to workaround issues on Surface is problematic.
Comment 12 Mario Limonciello (AMD) 2023-04-12 22:24:15 UTC
With the commit reverted on 6.3-rc6 things work, but then put this in place and they stop working again:

--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -1164,7 +1164,7 @@ static int amd_gpio_probe(struct platform_device *pdev)
        }

        /* Disable and mask interrupts */
-       amd_gpio_irq_init(gpio_dev);
+       //amd_gpio_irq_init(gpio_dev);

        girq = &gpio_dev->gc.irq;
        gpio_irq_chip_set_chip(girq, &amd_gpio_irqchip);

I am starting to feel like the lid in the first place working might be a "house of cards" here.

Seeing this comment:
	/*
	 * If WAKE_INT_MASTER_REG.MaskStsEn is set, a software write to the
	 * debounce registers of any GPIO will block wake/interrupt status
	 * generation for *all* GPIOs for a length of time that depends on
	 * WAKE_INT_MASTER_REG.MaskStsLength[11:0].  During this period the
	 * INTERRUPT_ENABLE bit will read as 0.
	 *
	 * We temporarily enable irq for the GPIO whose configuration is
	 * changing, and then wait for it to read back as 1 to know when
	 * debounce has settled and then disable the irq again.
	 * We do this polling with the spinlock held to ensure other GPIO
	 * access routines do not read an incorrect value for the irq enable
	 * bit of other GPIOs.  We keep the GPIO masked while polling to avoid
	 * spurious irqs, and disable the irq again after polling.
	 */

I experimented with clearing MaskStsEn with this patch in case it factored in:

index b5cd1f98f62e..f76eea44e393 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -932,8 +932,16 @@ static int amd_gpio_suspend(struct device *dev)
        struct amd_gpio *gpio_dev = dev_get_drvdata(dev);
        struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
        unsigned long flags;
+       u32 regval;
        int i;

+       raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+       regval = readl(gpio_dev->base + WAKE_INT_MASTER_REG);
+       dev_info(dev, "WAKE_INT_MASTER_REG: 0x%x\n", regval);
+       regval &= ~EN_STS_MASK;
+       writel(regval, gpio_dev->base + WAKE_INT_MASTER_REG);
+       raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+
        for (i = 0; i < desc->npins; i++) {
                int pin = desc->pins[i].number;

But that still doesn't help.  I also confirmed that MaskStsEn had the same value during resume so the BIOS at least didn't muck with it.

@@ -973,6 +982,14 @@ static int amd_gpio_resume(struct device *dev)

        dev_info(&gpio_dev->pdev->dev, "Resume GPIO interrupt status: 0x%llx", status);

+
+       raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+       regval = readl(gpio_dev->base + WAKE_INT_MASTER_REG);
+       dev_info(dev, "WAKE_INT_MASTER_REG: 0x%x\n", regval);
+       regval |= EN_STS_MASK;
+       writel(regval, gpio_dev->base + WAKE_INT_MASTER_REG);
+       raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+
        for (i = 0; i < desc->npins; i++) {
                int pin = desc->pins[i].number;
Comment 13 Mario Limonciello (AMD) 2023-04-12 22:29:18 UTC
CC Mark Pearson @ Lenovo

Mark - can you please confirm how the lid event is "supposed" to be working for Z13?

The BIOS ASL sure makes it seem like it *should be* GPIO 0, but at least with the way the pinctrl-amd driver is configuring we're not seeing GPIO 0 activate, just the GPIO controller interrupt is going active with no GPIOs active and only if the init process happens "just right".
Comment 14 Mario Limonciello (AMD) 2023-04-13 03:18:28 UTC
Using the lid switch at runtime, I might have an explanation why this is behaving so funny compared to what the code is doing....

"""
Method Begin [0x00000000da304238:\SMI] execution.
"""

With 0x80 in trace_debug_layer and 0x10 in trace_debug_state and enable in trace_state and pinctrl-amd enabled in dynamic debug when I close the lid I don't see any GPIO whatsoever.

Here is what I see:

[  706.357266]   extrace-0138 ex_trace_point        : Method Begin [0x000000009813f389:\_SB.PCI0.LPC0.EC0._Q2B] execution.
[  706.357838]   extrace-0138 ex_trace_point        : Method Begin [0x00000000460877d4:\SCMS] execution.
[  706.357864]   extrace-0138 ex_trace_point        : Method Begin [0x00000000da304238:\SMI] execution.
[  706.358289]   extrace-0138 ex_trace_point        : Method End [0x00000000da304238:\SMI] execution.
[  706.358299]   extrace-0138 ex_trace_point        : Method End [0x00000000460877d4:\SCMS] execution.
[  706.358325]   extrace-0138 ex_trace_point        : Method Begin [0x000000006ccfcd76:\_SB.LID._LID] execution.
[  706.358545]   extrace-0138 ex_trace_point        : Method End [0x000000006ccfcd76:\_SB.LID._LID] execution.
[  706.358555]   extrace-0138 ex_trace_point        : Method Begin [0x0000000099f37191:\VCMS] execution.
[  706.358579]   extrace-0138 ex_trace_point        : Method Begin [0x00000000da304238:\SMI] execution.
[  706.358835]   extrace-0138 ex_trace_point        : Method End [0x00000000da304238:\SMI] execution.
[  706.358843]   extrace-0138 ex_trace_point        : Method End [0x0000000099f37191:\VCMS] execution.
[  706.358855]   extrace-0138 ex_trace_point        : Method Begin [0x000000008b8a4084:\_SB.PCI0.LPC0.EC0.AC._PSR] execution.
[  706.359111]   extrace-0138 ex_trace_point        : Method End [0x000000008b8a4084:\_SB.PCI0.LPC0.EC0.AC._PSR] execution.
[  706.359129]   extrace-0138 ex_trace_point        : Method Begin [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution.
[  706.359602]   extrace-0138 ex_trace_point        : Method Begin [0x00000000db501c9c:\_SB.PCI0.LPC0.EC0.HKEY.SCPF] execution.
[  706.370313]   extrace-0138 ex_trace_point        : Method Begin [0x00000000ad42bd4b:\FLPF] execution.
[  706.370331]   extrace-0138 ex_trace_point        : Method Begin [0x00000000da304238:\SMI] execution.
[  706.370814]   extrace-0138 ex_trace_point        : Method End [0x00000000da304238:\SMI] execution.
[  706.370824]   extrace-0138 ex_trace_point        : Method End [0x00000000ad42bd4b:\FLPF] execution.
[  706.370835]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b8384036:\_SB.PCI0.LPC0.EC0.HKEY.SSTT] execution.
[  706.370953]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.370998]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.371221]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.371460]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.371543]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.371549]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.371610]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.371652]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.371871]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.372453]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.372530]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.372537]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.372597]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.372639]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.372866]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.373770]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.373847]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.373855]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.373924]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.373967]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.374199]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.374805]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.374955]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.374969]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.375050]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.375096]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.375338]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.376035]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.376101]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.376108]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.376168]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.376209]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.376442]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.377496]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.377561]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.377568]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.377639]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.377682]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.377924]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.378947]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.378992]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.378997]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.379042]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.379074]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.379259]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.380091]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.380143]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.380149]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.380208]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.380249]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.380491]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.380934]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.380984]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.380990]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.381091]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.381140]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.381396]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.382019]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.382065]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.382072]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.382132]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.382174]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.382424]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.382644]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.382687]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.382693]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.382752]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.382793]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.383047]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.383305]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.383343]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.383349]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.383407]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.383448]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.383707]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.384487]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.384522]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.384528]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.384586]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.384628]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.384889]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.385620]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.385654]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.385661]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.385728]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.385770]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.386039]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.386230]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.386256]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.386263]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.386322]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.386363]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.386525]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.386820]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.386952]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.386958]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.387017]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.387058]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.387267]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.388071]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.388228]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.388239]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.388349]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.388424]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.388709]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.389258]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.389537]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.389550]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.389666]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.389744]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.390042]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.390755]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.390990]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.391002]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.391112]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.391186]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.391458]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.391764]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.392020]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.392032]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.392136]   extrace-0138 ex_trace_point        : Method Begin [0x0000000006e95060:\_SB.ALIB] execution.
[  706.392210]   extrace-0138 ex_trace_point        : Method Begin [0x00000000149e3037:\_SB.A025] execution.
[  706.392714]   extrace-0138 ex_trace_point        : Method Begin [0x00000000b187f742:\_SB.A011] execution.
[  706.393158]   extrace-0138 ex_trace_point        : Method End [0x00000000b187f742:\_SB.A011] execution.
[  706.393187]   extrace-0138 ex_trace_point        : Method End [0x00000000149e3037:\_SB.A025] execution.
[  706.393199]   extrace-0138 ex_trace_point        : Method End [0x0000000006e95060:\_SB.ALIB] execution.
[  706.393249]   extrace-0138 ex_trace_point        : Method Begin [0x00000000e81a8a92:\_SB.WMEM] execution.
[  706.393419]   extrace-0138 ex_trace_point        : Method End [0x00000000e81a8a92:\_SB.WMEM] execution.
[  706.393437]   extrace-0138 ex_trace_point        : Method End [0x00000000b8384036:\_SB.PCI0.LPC0.EC0.HKEY.SSTT] execution.
[  706.461239]   extrace-0138 ex_trace_point        : Method End [0x00000000db501c9c:\_SB.PCI0.LPC0.EC0.HKEY.SCPF] execution.
[  706.461522]   extrace-0138 ex_trace_point        : Method Begin [0x000000008bc2aee4:\_SB.PMF.ATST] execution.
[  706.461539]   extrace-0138 ex_trace_point        : Method Begin [0x000000003f6f4cf6:\_SB.PMF.ATS1] execution.
[  706.461582]   extrace-0138 ex_trace_point        : Method Begin [0x000000001cdd56ec:\M460] execution.
[  706.461603]   extrace-0138 ex_trace_point        : Method End [0x000000001cdd56ec:\M460] execution.
[  706.461658]   extrace-0138 ex_trace_point        : Method End [0x000000003f6f4cf6:\_SB.PMF.ATS1] execution.
[  706.461662]   extrace-0138 ex_trace_point        : Method End [0x000000008bc2aee4:\_SB.PMF.ATST] execution.
[  706.461684]   extrace-0138 ex_trace_point        : Method Begin [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution.
[  706.461716]   extrace-0138 ex_trace_point        : Method End [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution.
[  706.461724]   extrace-0138 ex_trace_point        : Method End [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution.
[  706.461753]   extrace-0138 ex_trace_point        : Method Begin [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution.
[  706.461777]   extrace-0138 ex_trace_point        : Method End [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution.
[  706.461805]   extrace-0138 ex_trace_point        : Method End [0x000000009813f389:\_SB.PCI0.LPC0.EC0._Q2B] execution.
[  706.461816]   extrace-0138 ex_trace_point        : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.461838]   extrace-0138 ex_trace_point        : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.461845]   extrace-0138 ex_trace_point        : Method Begin [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution.
[  706.462098]   extrace-0138 ex_trace_point        : Method End [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution.
[  706.462103]   extrace-0138 ex_trace_point        : Method Begin [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution.
[  706.462356]   extrace-0138 ex_trace_point        : Method End [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution.
[  706.462376]   extrace-0138 ex_trace_point        : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.462395]   extrace-0138 ex_trace_point        : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.462398]   extrace-0138 ex_trace_point        : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.462416]   extrace-0138 ex_trace_point        : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.462442]   extrace-0138 ex_trace_point        : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.462459]   extrace-0138 ex_trace_point        : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution.
[  706.462467]   extrace-0138 ex_trace_point        : Method Begin [0x000000006ccfcd76:\_SB.LID._LID] execution.

I think we need Lenovo's BIOS team to comment on this behavior on what their expectations are on handling the lid event and why this is working at all with the current pinctrl-amd code (it seems like it might be a fluke that it works IMO).

IOW - I'm wondering if the SMI is what causes the IRQ7, and it's got nothing to do with the GPIO controller.
Comment 15 Mark Pearson 2023-04-13 16:28:46 UTC
Ack - asking the FW team. Internal ticket LO-2406
Comment 16 Mario Limonciello (AMD) 2023-04-13 20:36:38 UTC
Created attachment 304132 [details]
possible patch v1

Thanks Mark!

I have a possible patch that fixes the issue for me even on 6.3-rc6 without the revert.  The description of why is described in depth in the commit message so I won't repeat it here.

But having spent a lot of time looking at this issue the last few days, I think that:

4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")

was probably wrong in the first place.  I found both on the Z13 and on reference hardware there are two GPIOs that the hardware starts up with the interrupts masked by the firmware.  They're used exclusively by microcontrollers in the SOC and could have been masking them all the time could lead to other difficult to debug issues.

So here's my plan of attack:
0) Get some confirmation this idea makes sense for what's going on from you guys on CC as well as Lenovo's firmware team.  Hopefully Lenovo's firmware team can validate my ideas and findings.

1) More testing of that patch.  It changes the flow quite a bit and could have unexpected implications because general purpose Linux doesn't have a concept of Dark Resume like ChromeOS does.  Without dark resume I could easily see this leading to more spurious interrupts and different behaviors.

For example; before this patch I couldn't wake from USB keyboard but now I can but ONLY when the lid is open.  This is at least an improvement.

2) Create a series that will include this patch and revert 4e5a04be88fe, looking for testing from that author.  It's quite possible the new modern standby _DSM handling behavior helps them too.

If you still think b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume") makes sense given the above transgressions I'd like to hear more about the issue it fixes.  I worry it's papering over something else that should be dug into.  Specifically there are extra dynamic debugging messages the kernel can emit to explain which GPIO is triggering and we can dig more into those.

3) I think we really need something like what ChromeOS does for Dark Resume in systemd.  While doing this we may want something userspace can set to let the kernel treat all SCI wakeups as intentional wakeups when it's doing this dark resume type of functionality.

Then all the policy decisions on what to wake up for can be left to userspace software and policy can be done more hollistically.

I have a bug opened in systemd to track this idea.
https://github.com/systemd/systemd/issues/27077
Comment 17 Kornel Dulęba 2023-04-18 12:46:09 UTC
I've found an issue on Intel pinctrl that looks very similar to what I've tried to fix: https://lore.kernel.org/linux-gpio/20221124222926.72326-1-andriy.shevchenko@linux.intel.com/.

If I understand correctly they detect whether a pin is used by firmware, and if that's the case they save/restore its value over resume/suspend.
I wonder if we could do something similar.

Mario,

Do you know whether documentation of this pinctrl is publicly available?
Comment 18 Mario Limonciello (AMD) 2023-04-18 17:57:26 UTC
> If I understand correctly they detect whether a pin is used by firmware, and
> if that's the case they save/restore its value over resume/suspend.
> I wonder if we could do something similar.

Similarly - all of the GPIO pads can be configured for I2C, GPIO, or I3C (on supported platforms) functionality.  This support was added to the pinctrl-amd driver by 72440158f70f2c61e6e5b22b7409d48de62cc914.

I think we probably want the output of the mux configuration in the debugfs output to better understand if that's contributing to this problem.

> Do you know whether documentation of this pinctrl is publicly available?

I am afraid GPIO pad documentation for "current" platforms is in the NDA version of the PPR not the public one.  The BKDG for "previous" platforms documents the way that the FCH GPIO registers and IOMUX registers work though which for the purpose of this bug hasn't changed in a significant way.  You'll notice we don't have different pinctrl-amd drivers for different SOCs.
So if you can't access it under NDA you can reference that.

Is there by chance Chromium bug associated with your patch?  Can you check if I can be included on it?  If so, please loop in mario.limonciello@amd.corp-partner.google.com.
Comment 19 Mario Limonciello (AMD) 2023-04-18 21:50:49 UTC
> I think we probably want the output of the mux configuration in the debugfs
> output to better understand if that's contributing to this problem.

TIL /sys/kernel/debug/pinctrl/<controller>, but no maps are being applied so I don't believe the functions are being changed at all.

> Ack - asking the FW team. Internal ticket LO-2406

I had a try today with toggling the lid for an AMD reference board in hopes I could better dig into what's really going on.  Unfortunately 6.3-rc6 and 6.3-rc7 both successfully woke up the system.

> 1) More testing of that patch.  It changes the flow quite a bit and could
> have unexpected implications because general purpose Linux doesn't have a
> concept of Dark Resume like ChromeOS does.  Without dark resume I could
> easily see this leading to more spurious interrupts and different behaviors.

The testing so far has lead positive results on the combination of that patch + the revert of 4e5a04be88fe on everything tested except for the Dell system that Richard Gong reported the regression on 6.3-rc6 that matched the Z13 issue I reported.

The Dell system actually *does* wake up from GPIO 0.  Using just my modern standby flow change patch but not reverting 4e5a04be88fe works for that system.  So my next goal is to figure out why 4e5a04be88fe is needed.  It shouldn't be.
Comment 20 Mario Limonciello (AMD) 2023-04-19 05:26:53 UTC
Created attachment 304162 [details]
possible patch v2

OK I've got a solution now.  I can solve it on 6.3-rc6 with the attached patch.  I'm going to re-introduce a variation of your patch back in a series that includes this after some more testing.
Comment 21 Mario Limonciello (AMD) 2023-04-19 14:51:49 UTC
The new patch tests well on everything that broke that I know of.  I've got the series ready to go, and will post it once Kornel gets a chance to test it on the bug that prompted his work.  I intend to CC the relevant bits back to stable as well.
Comment 22 Mario Limonciello (AMD) 2023-04-21 20:20:04 UTC
The patches to fix this have been submitted here: https://lore.kernel.org/linux-gpio/20230421120625.3366-1-mario.limonciello@amd.com/T/#m6d69abe8e00bbcf81550ea1d1d92bacbea6433bd

@Mark:

From the debugging it appears that GPIO 0 is the wake source for the lid.  The reason for the regression is described in the patches.  Although functional it appears that the power button and lid haven't been working "as designed".  There is a special debounce behavior for GPIO 0 that they're intended to be using.

The patches now fix the bug and align that debounce behavior as intended, which matches what it does on Windows too.

As this was all masked because of 
4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")

Part of the series reverts it.

Hopefully this ends up aligning with what your firmware team says.
Comment 23 Mario Limonciello (AMD) 2023-07-12 04:46:29 UTC
Series is merged in 6.5-rc1.
Comment 24 Björn Bidar 2023-08-20 22:11:54 UTC
Will the patches by applied to Linux-stable?
Comment 25 Mario Limonciello (AMD) 2023-08-21 13:12:05 UTC
Yeah they're in stable now too, but I do want to note that one ptach is going to be reverted in 6.6-rc1 because of a regression in the series causing increased power consumption in some systems.

https://lore.kernel.org/linux-gpio/20230818144850.1439-1-mario.limonciello@amd.com/

This will come to stable as well.
Comment 26 Björn Bidar 2023-08-21 17:39:22 UTC
Does that mean that the patches were on an impartial bugfix?
Comment 27 Mario Limonciello (AMD) 2023-08-21 17:42:38 UTC
> Does that mean that the patches were on an impartial bugfix?

Call it whatever you want to.

They had a side effect on some systems that the boot firmware configured the pins unexpectedly.

I tried on a variety of systems on my end, both OEM and reference systems and didn't see this behavior.  But as it is confirmed to fix the power issue reported I've submitted it.
Comment 28 Björn Bidar 2023-08-21 19:36:38 UTC
I'm more asking what part of the fix is gone after the revert.
Comment 29 Mario Limonciello (AMD) 2023-08-21 20:49:02 UTC
The part about trusting boot firmware for the GPIO wake bit configurations.

Note You need to log in before you can comment on or make changes to this bug.