Bug 217315
Summary: | Unable to wake from lid in 6.3-rc5 or later | ||
---|---|---|---|
Product: | Drivers | Reporter: | Mario Limonciello (AMD) (mario.limonciello) |
Component: | Input Devices | Assignee: | Mario Limonciello (AMD) (mario.limonciello) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | bjorn.bidar, korneld, mpearson-lenovo, shyam-sundar.s-k, triad |
Priority: | P1 | ||
Hardware: | AMD | ||
OS: | Linux | ||
Kernel Version: | Subsystem: | ||
Regression: | No | Bisected commit-id: | |
Attachments: |
acpidump
/sys/kernel/debug/gpio output from various cases dmesg with revert and AML tracing possible patch v1 possible patch v2 |
Description
Mario Limonciello (AMD)
2023-04-10 04:56:45 UTC
Created attachment 304104 [details]
acpidump
Created attachment 304114 [details]
/sys/kernel/debug/gpio output from various cases
Requested in the thread was the output from /sys/kernel/debug/gpio before/after suspend.
Attaching 3 cases:
1) 6.3-rc6 (broken resume from lid; so used power button to resume)
2) 6.3-rc6 + patch suggested inline in thread on April 10 (broken resume from lid; so used power button to resume)
3) 6.3-rc6 + revert (resumed from lid successfully)
Hello Mario, I have a hypothesis as to what is going on: 1. The ACPI logic for lid wakeup doesn't request a wakeup irq from pin controller. (irq_set_irq_wake) From what I can see it doesn't interact with it at all. Yet we're relying on the pin controller to detect the wakeup event. 2. If I understand correctly the support for that was added in a patch that you wrote - 2d54067fcd23 ("pinctrl: amd: Fix wakeups when IRQ is shared with SCI") 3. The interrupt itself has to be armed in FW, since the kernel doesn't know anything about it. In summary the kernel is expected to detect an interrupt that was configured somewhere else. Could you confirm whether I'm correct here? If that's the case we need to revert my patch and think of a different approach. One more thing. Could you check /proc/interrupts with this patch reverted, and after resume. I wonder if that approach wouldn't result in an interrupt storm. > Could you confirm whether I'm correct here? I don't think that's all correct. This particular Lenovo platform doesn't share ACPI SCI IRQ with the GPIO controller. For this platform ACPI SCI is IRQ 9, GPIO controller is IRQ 7. > Could you check /proc/interrupts with this patch reverted, and after resume. Here's how I checked to avoid letting the power button trigger GPIO for the failure: 1) Run this script: #!/bin/sh cat /proc/interrupts > interrupts_before.txt rtcwake --seconds 30 -m mem cat /proc/interrupts > interrupts_before.txt 2) Open lid before the 30 seconds 3) Compare acpi and pinctrl_amd interrupts. I noticed that with reverted patch ACPI goes up a handful of times and pinctrl_amd goes up by one. With 6.3-rc6 alone it shows rtc0 increases (as that actually broke suspend loop) and ACPI has storm. ``` I don't think that's all correct. This particular Lenovo platform doesn't share ACPI SCI IRQ with the GPIO controller. For this platform ACPI SCI is IRQ 9, GPIO controller is IRQ 7. ``` Ok, I see. The only thing that this patch changes is that we might no longer be able to detect a wakeup event through amd_gpio_check_wake. Since we're at rc6 right now it might be best just to revert my patch for the time being. > Since we're at rc6 right now it might be best just to revert my patch for the
> time being.
Yeah; I agree. I'm happy to iterate with you during 6.4 on a variation of this patch so we can get something that works for everyone.
>Yeah; I agree. I'm happy to iterate with you during 6.4 on a variation of
>this > patch so we can get something that works for everyone.
Thanks, I'll send a reverting change right away.
Created attachment 304121 [details]
dmesg with revert and AML tracing
BTW I did confirm that `do_amd_gpio_irq_handler` ISR isn't executed in the working case with the revert until after the s2idle loop is broken.
Also with dynamic debugging on the ISR doesn't show any GPIO active message (GPIO %d is active).
So to me it seems the GPIO controller IRQ is active w/ the revert as a result of ASL that executed whilest in the s2idle loop.
Interesting. But then why didn't we exit the s2idle loop with the patch applied? The only thing that my patch changed was the logic inside pinctrl resume function, which then could affect the return value from `do_amd_gpio_irq_handler`. IRQ7 wasn't active during resume with the commit applied. I'm just as confused as you... I am starting to wonder if we're actually seeing a timing problem only exposed by your change and the original commit you were basing around to workaround issues on Surface is problematic. With the commit reverted on 6.3-rc6 things work, but then put this in place and they stop working again: --- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -1164,7 +1164,7 @@ static int amd_gpio_probe(struct platform_device *pdev) } /* Disable and mask interrupts */ - amd_gpio_irq_init(gpio_dev); + //amd_gpio_irq_init(gpio_dev); girq = &gpio_dev->gc.irq; gpio_irq_chip_set_chip(girq, &amd_gpio_irqchip); I am starting to feel like the lid in the first place working might be a "house of cards" here. Seeing this comment: /* * If WAKE_INT_MASTER_REG.MaskStsEn is set, a software write to the * debounce registers of any GPIO will block wake/interrupt status * generation for *all* GPIOs for a length of time that depends on * WAKE_INT_MASTER_REG.MaskStsLength[11:0]. During this period the * INTERRUPT_ENABLE bit will read as 0. * * We temporarily enable irq for the GPIO whose configuration is * changing, and then wait for it to read back as 1 to know when * debounce has settled and then disable the irq again. * We do this polling with the spinlock held to ensure other GPIO * access routines do not read an incorrect value for the irq enable * bit of other GPIOs. We keep the GPIO masked while polling to avoid * spurious irqs, and disable the irq again after polling. */ I experimented with clearing MaskStsEn with this patch in case it factored in: index b5cd1f98f62e..f76eea44e393 100644 --- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -932,8 +932,16 @@ static int amd_gpio_suspend(struct device *dev) struct amd_gpio *gpio_dev = dev_get_drvdata(dev); struct pinctrl_desc *desc = gpio_dev->pctrl->desc; unsigned long flags; + u32 regval; int i; + raw_spin_lock_irqsave(&gpio_dev->lock, flags); + regval = readl(gpio_dev->base + WAKE_INT_MASTER_REG); + dev_info(dev, "WAKE_INT_MASTER_REG: 0x%x\n", regval); + regval &= ~EN_STS_MASK; + writel(regval, gpio_dev->base + WAKE_INT_MASTER_REG); + raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); + for (i = 0; i < desc->npins; i++) { int pin = desc->pins[i].number; But that still doesn't help. I also confirmed that MaskStsEn had the same value during resume so the BIOS at least didn't muck with it. @@ -973,6 +982,14 @@ static int amd_gpio_resume(struct device *dev) dev_info(&gpio_dev->pdev->dev, "Resume GPIO interrupt status: 0x%llx", status); + + raw_spin_lock_irqsave(&gpio_dev->lock, flags); + regval = readl(gpio_dev->base + WAKE_INT_MASTER_REG); + dev_info(dev, "WAKE_INT_MASTER_REG: 0x%x\n", regval); + regval |= EN_STS_MASK; + writel(regval, gpio_dev->base + WAKE_INT_MASTER_REG); + raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); + for (i = 0; i < desc->npins; i++) { int pin = desc->pins[i].number; CC Mark Pearson @ Lenovo Mark - can you please confirm how the lid event is "supposed" to be working for Z13? The BIOS ASL sure makes it seem like it *should be* GPIO 0, but at least with the way the pinctrl-amd driver is configuring we're not seeing GPIO 0 activate, just the GPIO controller interrupt is going active with no GPIOs active and only if the init process happens "just right". Using the lid switch at runtime, I might have an explanation why this is behaving so funny compared to what the code is doing.... """ Method Begin [0x00000000da304238:\SMI] execution. """ With 0x80 in trace_debug_layer and 0x10 in trace_debug_state and enable in trace_state and pinctrl-amd enabled in dynamic debug when I close the lid I don't see any GPIO whatsoever. Here is what I see: [ 706.357266] extrace-0138 ex_trace_point : Method Begin [0x000000009813f389:\_SB.PCI0.LPC0.EC0._Q2B] execution. [ 706.357838] extrace-0138 ex_trace_point : Method Begin [0x00000000460877d4:\SCMS] execution. [ 706.357864] extrace-0138 ex_trace_point : Method Begin [0x00000000da304238:\SMI] execution. [ 706.358289] extrace-0138 ex_trace_point : Method End [0x00000000da304238:\SMI] execution. [ 706.358299] extrace-0138 ex_trace_point : Method End [0x00000000460877d4:\SCMS] execution. [ 706.358325] extrace-0138 ex_trace_point : Method Begin [0x000000006ccfcd76:\_SB.LID._LID] execution. [ 706.358545] extrace-0138 ex_trace_point : Method End [0x000000006ccfcd76:\_SB.LID._LID] execution. [ 706.358555] extrace-0138 ex_trace_point : Method Begin [0x0000000099f37191:\VCMS] execution. [ 706.358579] extrace-0138 ex_trace_point : Method Begin [0x00000000da304238:\SMI] execution. [ 706.358835] extrace-0138 ex_trace_point : Method End [0x00000000da304238:\SMI] execution. [ 706.358843] extrace-0138 ex_trace_point : Method End [0x0000000099f37191:\VCMS] execution. [ 706.358855] extrace-0138 ex_trace_point : Method Begin [0x000000008b8a4084:\_SB.PCI0.LPC0.EC0.AC._PSR] execution. [ 706.359111] extrace-0138 ex_trace_point : Method End [0x000000008b8a4084:\_SB.PCI0.LPC0.EC0.AC._PSR] execution. [ 706.359129] extrace-0138 ex_trace_point : Method Begin [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution. [ 706.359602] extrace-0138 ex_trace_point : Method Begin [0x00000000db501c9c:\_SB.PCI0.LPC0.EC0.HKEY.SCPF] execution. [ 706.370313] extrace-0138 ex_trace_point : Method Begin [0x00000000ad42bd4b:\FLPF] execution. [ 706.370331] extrace-0138 ex_trace_point : Method Begin [0x00000000da304238:\SMI] execution. [ 706.370814] extrace-0138 ex_trace_point : Method End [0x00000000da304238:\SMI] execution. [ 706.370824] extrace-0138 ex_trace_point : Method End [0x00000000ad42bd4b:\FLPF] execution. [ 706.370835] extrace-0138 ex_trace_point : Method Begin [0x00000000b8384036:\_SB.PCI0.LPC0.EC0.HKEY.SSTT] execution. [ 706.370953] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.370998] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.371221] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.371460] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.371543] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.371549] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.371610] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.371652] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.371871] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.372453] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.372530] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.372537] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.372597] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.372639] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.372866] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.373770] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.373847] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.373855] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.373924] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.373967] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.374199] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.374805] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.374955] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.374969] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.375050] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.375096] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.375338] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.376035] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.376101] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.376108] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.376168] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.376209] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.376442] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.377496] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.377561] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.377568] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.377639] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.377682] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.377924] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.378947] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.378992] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.378997] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.379042] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.379074] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.379259] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.380091] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.380143] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.380149] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.380208] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.380249] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.380491] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.380934] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.380984] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.380990] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.381091] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.381140] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.381396] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.382019] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.382065] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.382072] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.382132] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.382174] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.382424] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.382644] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.382687] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.382693] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.382752] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.382793] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.383047] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.383305] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.383343] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.383349] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.383407] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.383448] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.383707] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.384487] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.384522] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.384528] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.384586] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.384628] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.384889] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.385620] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.385654] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.385661] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.385728] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.385770] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.386039] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.386230] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.386256] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.386263] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.386322] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.386363] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.386525] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.386820] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.386952] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.386958] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.387017] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.387058] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.387267] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.388071] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.388228] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.388239] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.388349] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.388424] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.388709] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.389258] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.389537] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.389550] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.389666] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.389744] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.390042] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.390755] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.390990] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.391002] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.391112] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.391186] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.391458] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.391764] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.392020] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.392032] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.392136] extrace-0138 ex_trace_point : Method Begin [0x0000000006e95060:\_SB.ALIB] execution. [ 706.392210] extrace-0138 ex_trace_point : Method Begin [0x00000000149e3037:\_SB.A025] execution. [ 706.392714] extrace-0138 ex_trace_point : Method Begin [0x00000000b187f742:\_SB.A011] execution. [ 706.393158] extrace-0138 ex_trace_point : Method End [0x00000000b187f742:\_SB.A011] execution. [ 706.393187] extrace-0138 ex_trace_point : Method End [0x00000000149e3037:\_SB.A025] execution. [ 706.393199] extrace-0138 ex_trace_point : Method End [0x0000000006e95060:\_SB.ALIB] execution. [ 706.393249] extrace-0138 ex_trace_point : Method Begin [0x00000000e81a8a92:\_SB.WMEM] execution. [ 706.393419] extrace-0138 ex_trace_point : Method End [0x00000000e81a8a92:\_SB.WMEM] execution. [ 706.393437] extrace-0138 ex_trace_point : Method End [0x00000000b8384036:\_SB.PCI0.LPC0.EC0.HKEY.SSTT] execution. [ 706.461239] extrace-0138 ex_trace_point : Method End [0x00000000db501c9c:\_SB.PCI0.LPC0.EC0.HKEY.SCPF] execution. [ 706.461522] extrace-0138 ex_trace_point : Method Begin [0x000000008bc2aee4:\_SB.PMF.ATST] execution. [ 706.461539] extrace-0138 ex_trace_point : Method Begin [0x000000003f6f4cf6:\_SB.PMF.ATS1] execution. [ 706.461582] extrace-0138 ex_trace_point : Method Begin [0x000000001cdd56ec:\M460] execution. [ 706.461603] extrace-0138 ex_trace_point : Method End [0x000000001cdd56ec:\M460] execution. [ 706.461658] extrace-0138 ex_trace_point : Method End [0x000000003f6f4cf6:\_SB.PMF.ATS1] execution. [ 706.461662] extrace-0138 ex_trace_point : Method End [0x000000008bc2aee4:\_SB.PMF.ATST] execution. [ 706.461684] extrace-0138 ex_trace_point : Method Begin [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution. [ 706.461716] extrace-0138 ex_trace_point : Method End [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution. [ 706.461724] extrace-0138 ex_trace_point : Method End [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution. [ 706.461753] extrace-0138 ex_trace_point : Method Begin [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution. [ 706.461777] extrace-0138 ex_trace_point : Method End [0x00000000494b551c:\_SB.PCI0.LPC0.EC0.HKEY.MHKQ] execution. [ 706.461805] extrace-0138 ex_trace_point : Method End [0x000000009813f389:\_SB.PCI0.LPC0.EC0._Q2B] execution. [ 706.461816] extrace-0138 ex_trace_point : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.461838] extrace-0138 ex_trace_point : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.461845] extrace-0138 ex_trace_point : Method Begin [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution. [ 706.462098] extrace-0138 ex_trace_point : Method End [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution. [ 706.462103] extrace-0138 ex_trace_point : Method Begin [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution. [ 706.462356] extrace-0138 ex_trace_point : Method End [0x0000000018fe83ac:\_SB.PCI0.LPC0.EC0.HKEY.DYTC] execution. [ 706.462376] extrace-0138 ex_trace_point : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.462395] extrace-0138 ex_trace_point : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.462398] extrace-0138 ex_trace_point : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.462416] extrace-0138 ex_trace_point : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.462442] extrace-0138 ex_trace_point : Method Begin [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.462459] extrace-0138 ex_trace_point : Method End [0x0000000055f8d7c1:\_SB.PCI0.LPC0.EC0.HKEY.MHKP] execution. [ 706.462467] extrace-0138 ex_trace_point : Method Begin [0x000000006ccfcd76:\_SB.LID._LID] execution. I think we need Lenovo's BIOS team to comment on this behavior on what their expectations are on handling the lid event and why this is working at all with the current pinctrl-amd code (it seems like it might be a fluke that it works IMO). IOW - I'm wondering if the SMI is what causes the IRQ7, and it's got nothing to do with the GPIO controller. Ack - asking the FW team. Internal ticket LO-2406 Created attachment 304132 [details] possible patch v1 Thanks Mark! I have a possible patch that fixes the issue for me even on 6.3-rc6 without the revert. The description of why is described in depth in the commit message so I won't repeat it here. But having spent a lot of time looking at this issue the last few days, I think that: 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe") was probably wrong in the first place. I found both on the Z13 and on reference hardware there are two GPIOs that the hardware starts up with the interrupts masked by the firmware. They're used exclusively by microcontrollers in the SOC and could have been masking them all the time could lead to other difficult to debug issues. So here's my plan of attack: 0) Get some confirmation this idea makes sense for what's going on from you guys on CC as well as Lenovo's firmware team. Hopefully Lenovo's firmware team can validate my ideas and findings. 1) More testing of that patch. It changes the flow quite a bit and could have unexpected implications because general purpose Linux doesn't have a concept of Dark Resume like ChromeOS does. Without dark resume I could easily see this leading to more spurious interrupts and different behaviors. For example; before this patch I couldn't wake from USB keyboard but now I can but ONLY when the lid is open. This is at least an improvement. 2) Create a series that will include this patch and revert 4e5a04be88fe, looking for testing from that author. It's quite possible the new modern standby _DSM handling behavior helps them too. If you still think b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume") makes sense given the above transgressions I'd like to hear more about the issue it fixes. I worry it's papering over something else that should be dug into. Specifically there are extra dynamic debugging messages the kernel can emit to explain which GPIO is triggering and we can dig more into those. 3) I think we really need something like what ChromeOS does for Dark Resume in systemd. While doing this we may want something userspace can set to let the kernel treat all SCI wakeups as intentional wakeups when it's doing this dark resume type of functionality. Then all the policy decisions on what to wake up for can be left to userspace software and policy can be done more hollistically. I have a bug opened in systemd to track this idea. https://github.com/systemd/systemd/issues/27077 I've found an issue on Intel pinctrl that looks very similar to what I've tried to fix: https://lore.kernel.org/linux-gpio/20221124222926.72326-1-andriy.shevchenko@linux.intel.com/. If I understand correctly they detect whether a pin is used by firmware, and if that's the case they save/restore its value over resume/suspend. I wonder if we could do something similar. Mario, Do you know whether documentation of this pinctrl is publicly available? > If I understand correctly they detect whether a pin is used by firmware, and > if that's the case they save/restore its value over resume/suspend. > I wonder if we could do something similar. Similarly - all of the GPIO pads can be configured for I2C, GPIO, or I3C (on supported platforms) functionality. This support was added to the pinctrl-amd driver by 72440158f70f2c61e6e5b22b7409d48de62cc914. I think we probably want the output of the mux configuration in the debugfs output to better understand if that's contributing to this problem. > Do you know whether documentation of this pinctrl is publicly available? I am afraid GPIO pad documentation for "current" platforms is in the NDA version of the PPR not the public one. The BKDG for "previous" platforms documents the way that the FCH GPIO registers and IOMUX registers work though which for the purpose of this bug hasn't changed in a significant way. You'll notice we don't have different pinctrl-amd drivers for different SOCs. So if you can't access it under NDA you can reference that. Is there by chance Chromium bug associated with your patch? Can you check if I can be included on it? If so, please loop in mario.limonciello@amd.corp-partner.google.com. > I think we probably want the output of the mux configuration in the debugfs > output to better understand if that's contributing to this problem. TIL /sys/kernel/debug/pinctrl/<controller>, but no maps are being applied so I don't believe the functions are being changed at all. > Ack - asking the FW team. Internal ticket LO-2406 I had a try today with toggling the lid for an AMD reference board in hopes I could better dig into what's really going on. Unfortunately 6.3-rc6 and 6.3-rc7 both successfully woke up the system. > 1) More testing of that patch. It changes the flow quite a bit and could > have unexpected implications because general purpose Linux doesn't have a > concept of Dark Resume like ChromeOS does. Without dark resume I could > easily see this leading to more spurious interrupts and different behaviors. The testing so far has lead positive results on the combination of that patch + the revert of 4e5a04be88fe on everything tested except for the Dell system that Richard Gong reported the regression on 6.3-rc6 that matched the Z13 issue I reported. The Dell system actually *does* wake up from GPIO 0. Using just my modern standby flow change patch but not reverting 4e5a04be88fe works for that system. So my next goal is to figure out why 4e5a04be88fe is needed. It shouldn't be. Created attachment 304162 [details]
possible patch v2
OK I've got a solution now. I can solve it on 6.3-rc6 with the attached patch. I'm going to re-introduce a variation of your patch back in a series that includes this after some more testing.
The new patch tests well on everything that broke that I know of. I've got the series ready to go, and will post it once Kornel gets a chance to test it on the bug that prompted his work. I intend to CC the relevant bits back to stable as well. The patches to fix this have been submitted here: https://lore.kernel.org/linux-gpio/20230421120625.3366-1-mario.limonciello@amd.com/T/#m6d69abe8e00bbcf81550ea1d1d92bacbea6433bd @Mark: From the debugging it appears that GPIO 0 is the wake source for the lid. The reason for the regression is described in the patches. Although functional it appears that the power button and lid haven't been working "as designed". There is a special debounce behavior for GPIO 0 that they're intended to be using. The patches now fix the bug and align that debounce behavior as intended, which matches what it does on Windows too. As this was all masked because of 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe") Part of the series reverts it. Hopefully this ends up aligning with what your firmware team says. Series is merged in 6.5-rc1. Will the patches by applied to Linux-stable? Yeah they're in stable now too, but I do want to note that one ptach is going to be reverted in 6.6-rc1 because of a regression in the series causing increased power consumption in some systems. https://lore.kernel.org/linux-gpio/20230818144850.1439-1-mario.limonciello@amd.com/ This will come to stable as well. Does that mean that the patches were on an impartial bugfix? > Does that mean that the patches were on an impartial bugfix?
Call it whatever you want to.
They had a side effect on some systems that the boot firmware configured the pins unexpectedly.
I tried on a variety of systems on my end, both OEM and reference systems and didn't see this behavior. But as it is confirmed to fix the power issue reported I've submitted it.
I'm more asking what part of the fix is gone after the revert. The part about trusting boot firmware for the GPIO wake bit configurations. |