Bug 215770
| Summary: | Spurious wakeup from s2idle - AMD Ryzen 7 5825U with Radeon Graphics | | |
|---|---|---|---|
| Product: | ACPI | Reporter: | Kai-Heng Feng (kai.heng.feng) |
| Component: | Power-Sleep-Wake | Assignee: | acpi_power-sleep-wake |
| Status: | RESOLVED DOCUMENTED | | |
| Severity: | normal | CC: | mario.limonciello, rjw, rui.zhang, superm1 |
| Priority: | P1 | | |
| Hardware: | AMD | | |
| OS: | Linux | | |
| Kernel Version: | mainline, linux-next | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | dmesg; patch to add more verbose logging; dmesg with the patch applied | | |
Description
Kai-Heng Feng
2022-03-29 11:40:26 UTC
Not sure if this matters:

[ 0.663660] ACPI Error: No handler for Region [ECRM] (00000000dee7d46d) [EmbeddedControl] (20211217/evregion-130)
[ 0.663679] ACPI Error: Region EmbeddedControl (ID=3) has no handler (20211217/exfldio-261)
[ 0.663692] No Local Variables are initialized for Method [_EVT]
[ 0.663693] Initialized Arguments for Method [_EVT]: (1 arguments defined for method invocation)
[ 0.663693] Arg0: 00000000fdd19f23 <Obj> Integer 0000000000000000
[ 0.663697] ACPI Error: Aborting method \_SB.GPIO._EVT due to previous error (AE_NOT_EXIST) (20211217/psparse-529)

Created attachment 300637 [details]
dmesg
Can you please see if this still occurs with https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=9946e39fe8d0a5da9eb947d8e40a7ef204ba016e applied?

(In reply to Mario Limonciello (AMD) from comment #3)
> Can you please see if this still occurs with
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/
> ?h=linux-next&id=9946e39fe8d0a5da9eb947d8e40a7ef204ba016e applied?

The issue persists with the patch applied.

Created attachment 301485 [details]
patch to add more verbose logging
Exactly the same? IRQ 1 shows at /sys/power/pm_wakeup_irq?
Can you please add this patch as well to your kernel and share full log with pm_debug_messages set? I wonder if in your circumstance you have more than 2 sources.
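
The two sysfs knobs mentioned in this comment can be driven from a small script. The following is only a sketch, assuming a kernel that exposes `/sys/power/pm_wakeup_irq` and `/sys/power/pm_debug_messages`; the function names and the `power_dir` parameter are illustrative and not part of any kernel tooling.

```python
from pathlib import Path


def read_wakeup_irq(power_dir="/sys/power"):
    """Return the IRQ number reported for the last wakeup, or None.

    The kernel returns an error when no wakeup IRQ was recorded,
    so any OSError is treated as "nothing to report".
    """
    try:
        text = Path(power_dir, "pm_wakeup_irq").read_text().strip()
    except OSError:
        return None
    return int(text) if text.isdigit() else None


def enable_pm_debug_messages(power_dir="/sys/power"):
    """Turn on the extra suspend/resume messages in the kernel log (needs root)."""
    Path(power_dir, "pm_debug_messages").write_text("1\n")
```

A typical sequence would be to call `enable_pm_debug_messages()` as root, suspend and resume once, and then compare `read_wakeup_irq()` with the dmesg output.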
(In reply to Mario Limonciello (AMD) from comment #5)
> Created attachment 301485 [details]
> patch to add more verbose logging
>
> Exactly the same? IRQ 1 shows at /sys/power/pm_wakeup_irq?

Yes. Some laptops show IRQ 1, some show IRQ 9. But as you can see in the log the IRQ 1 is the one that triggers wakeup. As soon as i8042 wakeup is disabled the issue goes away.

> Can you please add this patch as well to your kernel and share full log with
> pm_debug_messages set? I wonder if in your circumstance you have more than
> 2 sources.

Created attachment 301486 [details]
dmesg with the patch applied
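
For anyone wanting to repeat the "disable i8042 wakeup" experiment described above, a minimal sketch follows. It assumes the keyboard port is `serio0` under the i8042 platform device, which is common but not guaranteed; the path constant and the `set_wakeup` helper are illustrative names, not taken from this report.

```python
from pathlib import Path

# Assumption: the keyboard sits on serio0; check the sysfs tree on the
# machine under test before relying on this path.
I8042_KBD_WAKEUP = "/sys/devices/platform/i8042/serio0/power/wakeup"


def set_wakeup(enabled, attr=I8042_KBD_WAKEUP):
    """Write "enabled" or "disabled" to a device wakeup attribute (needs root).

    Returns the previous value so the caller can restore it afterwards.
    """
    path = Path(attr)
    previous = path.read_text().strip()
    path.write_text("enabled\n" if enabled else "disabled\n")
    return previous
```

Calling `set_wakeup(False)` before suspending reproduces the test case where the spurious wakeup no longer occurs; it also means the keyboard can no longer wake the system, so it is a diagnostic step rather than a fix.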
Thanks. It was a bit of a long shot that this helped this design, but that commit does fix some IRQ1 related problems on other designs.

I'll send up this patch for the extra debugging message separately for Rafael to take a look at, I think it's useful for issues like this.

I still do think this issue at its core is a platform firmware issue not a kernel issue. One of 3 things to me:
1) Either the EC asserting i8042
2) The polarity is wrong for IRQ1 (like I mentioned for that coreboot design in AMD gitlab issue).
3) Some other source in this design asserting IRQ1 that is not EC.

(In reply to Mario Limonciello (AMD) from comment #8)
> Thanks. It was a bit of a long shot that this helped this design, but that
> commit does fix some IRQ1 related problems on other designs.
>
> I'll send up this patch for the extra debugging message separately for
> Rafael to take a look at, I think it's useful for issues like this.

Yes this will be quite useful.

> I still do think this issue at its core is a platform firmware issue not a
> kernel issue. One of 3 things to me:
> 1) Either the EC asserting i8042

Or maybe it's from IO-APIC? The EC folks guaranteed that i8042 doesn't raise the IRQ.

> 2) The polarity is wrong for IRQ1 (like I mentioned for that coreboot design
> in AMD gitlab issue).

If polarity is wrong, the keyboard won't work at all. So I don't think it's the case here.

> 3) Some other source in this design asserting IRQ1 that is not EC.

I think AMD Taipei is trying to find the root cause here.

> If polarity is wrong, the keyboard won't work at all. So I don't think it's
> the case here.
Actually - It can be set in either direction as long as it's consistent with rest of platform firmware configuration. If another source that is part of the AND gate to IRQ1 has a different polarity coming out of s0i3 then that could lead to this mismatch. This direction makes sense why to investigate IO-APIC configuration.
(In reply to Mario Limonciello (AMD) from comment #10)
> > If polarity is wrong, the keyboard won't work at all. So I don't think it's
> > the case here.
>
> Actually - It can be set in either direction as long as it's consistent with
> rest of platform firmware configuration. If another source that is part of
> the AND gate to IRQ1 has a different polarity coming out of s0i3 then that
> could lead to this mismatch. This direction makes sense why to investigate
> IO-APIC configuration.

Shouldn't IRQ1 be exclusive to i8042? Does APIC use the same hardware IRQ line for different IRQs?

> Shouldn't IRQ1 be exclusive to i8042? Does APIC use the same hardware IRQ
> line for different IRQs?
It depends on OEM's design.
(In reply to Mario Limonciello (AMD) from comment #12)
> > Shouldn't IRQ1 be exclusive to i8042? Does APIC use the same hardware IRQ
> > line for different IRQs?
>
> It depends on OEM's design.

I was told that the issue also happens on AMD's CRB, so what's the other IRQ that shares with IRQ 1?

That's not as straightforward a question as you may think. A bunch of non-obvious devices can also generate IRQ 1 such as PCI SMBUS controller.

CRB doesn't generate EC events during S0i3 very frequently due to using _BTP (which many HP designs don't use). I've not seen it first hand on CRB myself.

If it can also reproduce on CRB, it should need to be dug into by BIOS guys who can look more closely.

(In reply to Mario Limonciello (AMD) from comment #14)
> That's not as straightforward a question as you may think. A bunch of
> non-obvious devices can also generate IRQ 1 such as PCI SMBUS controller.

I didn't know that. So when PCI SMBUS controller raises IRQ, the kernel calls i8042's IRQ handler as usual?

> CRB doesn't generate EC events during S0i3 very frequently due to using _BTP
> (which many HP designs don't use). I've not seen it first hand on CRB
> myself.
>
> If it can also reproduce on CRB, it should need to be dug into by BIOS guys
> who can look more closely.

I think they are getting close to the root cause, are you in the discussion loop?

> So when PCI SMBUS controller raises IRQ, the kernel calls i8042's IRQ handler
> as usual?

In this case that's what I would expect happens. It's just like shared IRQ in the kernel between two drivers.

> I think they are getting close to the root cause, are you in the discussion
> loop?

No, but I'll reach out to some of them to find out more.

Having looked over that discussion; I'm confident this is a hardware/firmware bug, nothing for Linux to do here unless a W/A is bandaged in for it for this system.
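
The shared-IRQ comparison above can be checked from userspace for the driver-visible part: `/proc/interrupts` lists every handler name registered on a line. The sketch below assumes the common x86 layout (per-CPU counts, then a chip name, then a `hwirq-trigger` column, then comma-separated handler names); `handlers_for_irq` and the sample text are hypothetical and not taken from the attached logs.

```python
def handlers_for_irq(interrupts_text, irq):
    """Return the handler names registered on `irq` in /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() != str(irq):
            continue
        fields = rest.split()
        # Assumption: skip the per-CPU counts plus the chip-name and
        # hwirq-trigger columns; what remains are the handler names.
        names = " ".join(fields[ncpus + 2:]).split(",")
        return [name.strip() for name in names if name.strip()]
    return []


# Made-up sample for illustration only; "example_dev" is a hypothetical driver.
SAMPLE = (
    "           CPU0       CPU1\n"
    "  1:          0        152  IR-IO-APIC    1-edge      i8042\n"
    " 16:          0          0  IR-IO-APIC   16-fasteoi   example_dev, ehci_hcd:usb1\n"
)
```

On a live system the input would be the contents of `/proc/interrupts`. This only shows drivers that registered a handler; a hardware source asserting the line without a driver attached, as suspected in this report, would not appear here.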
(In reply to Mario Limonciello (AMD) from comment #17)
> Having looked over that discussion; I'm confident this is a
> hardware/firmware bug, nothing for Linux to do here unless a W/A is bandaged
> in for it for this system.

Is it possible to have an erratum for this issue?

I don't know. It depends on which place the bug ends up living. It's not a Linux bug, is all I'm saying by closing this issue.