Bug 216230

Summary: "irq9: nobody cared" on Thinkpad T14 Gen1 (AMD) when s2idle is enabled
Product: Drivers Reporter: madcatx
Component: Platform_x86Assignee: drivers_platform_x86 (drivers_platform_x86)
Status: RESOLVED DOCUMENTED    
Severity: normal CC: jwrdegoede, mario.limonciello, mpearson-lenovo
Priority: P1    
Hardware: AMD   
OS: Linux   
Kernel Version: 5.18.9 Subsystem:
Regression: No Bisected commit-id:
Attachments: /sys/kernel/debug/gpio when the issue does manifest
/sys/kernel/debug/gpio when there is no issue
Full dmesg log with suspend cycle
dmesg withpinctrl debugging

Description madcatx 2022-07-09 18:24:25 UTC
Whenever I power up my T14 Gen1 TP with AMD chip, I get the following error about 4 seconds after boot

[    4.304639] kernel: irq 9: nobody cared (try booting with the "irqpoll" option)
[    4.304652] kernel: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.18.9-arch1-1 #1 137f0035b2ece06cb65382579db27e9de66af504
[    4.304654] kernel: Hardware name: LENOVO 20UDS02D00/20UDS02D00, BIOS R1BET71W(1.40 ) 04/05/2022
[    4.304655] kernel: Call Trace:
[    4.304656] kernel:  <IRQ>
[    4.304658] kernel:  dump_stack_lvl+0x48/0x5d
[    4.304662] kernel:  __report_bad_irq+0x35/0xaa
[    4.304664] kernel:  note_interrupt.cold+0xa/0x65
[    4.304666] kernel:  handle_irq_event+0x72/0x80
[    4.304669] kernel:  handle_fasteoi_irq+0x8e/0x1f0
[    4.304671] kernel:  __common_interrupt+0x46/0xa0
[    4.304674] kernel:  common_interrupt+0x81/0xa0
[    4.304677] kernel:  </IRQ>
[    4.304677] kernel:  <TASK>
[    4.304678] kernel:  asm_common_interrupt+0x22/0x40
[    4.304681] kernel: RIP: 0010:cpuidle_enter_state+0xdc/0x380
[    4.304684] kernel: Code: 00 00 31 ff e8 95 70 80 ff 45 84 ff 74 16 9c 58 0f 1f 40 00 f6 c4 02 0f 85 8f 02 00 00 31 ff e8 fa 44 86 ff fb 0f 1f 44 00 00 <45> 85 f6 0f 88 22 01 00 00 49 63 ce 48 8d 04 49 48 8d 04 81 49 8d
[    4.304685] kernel: RSP: 0018:ffffb77f401a7e90 EFLAGS: 00000246
[    4.304687] kernel: RAX: ffff8f606f872cc0 RBX: 0000000000000003 RCX: 0000000000000000
[    4.304688] kernel: RDX: 0000000100930767 RSI: fffffffc2980382f RDI: 0000000000000000
[    4.304688] kernel: RBP: ffff8f5980bcfc00 R08: 0000000000000002 R09: 000000004b6f7f5c
[    4.304689] kernel: R10: 0000000000000d4c R11: 0000000000000002 R12: ffffffff94d51300
[    4.304690] kernel: R13: 0000000100930767 R14: 0000000000000002 R15: 0000000000000000
[    4.304692] kernel:  cpuidle_enter+0x2d/0x40
[    4.304693] kernel:  do_idle+0x1ba/0x220
[    4.304695] kernel:  cpu_startup_entry+0x1d/0x20
[    4.304696] kernel:  start_secondary+0x11c/0x140
[    4.304699] kernel:  secondary_startup_64_no_verify+0xd5/0xdb
[    4.304702] kernel:  </TASK>

When the machine boots up, I can check /proc/interrupts and get this for IRQ 9

  9:          0     100076          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    9-fasteoi   acpi, pinctrl_amd

Rather interestingly, when I reboot the machine, I don't get any IRQ-related errors during boot and everything works fine. /proc/interrupts now looks like this

 9:          0         87          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    9-fasteoi   acpi, pinctrl_amd

The number of interrupts very slowly increases as the computer runs.

This odd IRQ9 error that happens only on the first cold boot happens only when I enable "Windows sleep", aka s2idle, mode in BIOS. Switching to S3 ("Linux sleep") takes care of the IRQ problem but I'd rather stick to s2idle because the latest Lenovo firmware has some issues that go away when s2idle is enabled.

Thanks!
Comment 1 Mario Limonciello (AMD) 2022-07-10 15:52:40 UTC
I've got a handful of questions that should help clarify what's going on.

1) Is this a regression from a kernel upgrade?  What kernel introduced it?
2) Is it a regression from a BIOS upgrade?  What BIOS didn't have it?  Can you downgrade to that and it goes away?
3) Does s2idle still work?  What indicates it did or didn't is whether this message appears on resume:
https://github.com/torvalds/linux/blob/master/drivers/platform/x86/amd-pmc.c#L361

In addition to those questions the logs I would like to see:
* A full dmesg log including a suspend cycle.
* /sys/kernel/debug/gpio from a cold boot (that the problem reproduces)
* /sys/kernel/debug/gpio from a warm boot (that the problem doesn't reproduce)
Comment 2 madcatx 2022-07-11 06:56:21 UTC
Created attachment 301385 [details]
/sys/kernel/debug/gpio when the issue does manifest
Comment 3 madcatx 2022-07-11 06:56:56 UTC
Created attachment 301386 [details]
/sys/kernel/debug/gpio when there is no issue
Comment 4 madcatx 2022-07-11 06:57:28 UTC
Created attachment 301387 [details]
Full dmesg log with suspend cycle
Comment 5 madcatx 2022-07-11 07:02:00 UTC
Thanks for looking into this.

It turns out that the issue is more difficult to reproduce than I'd originally thought. I saw a few cold boots that went fine and one warm boot when the irq9 fault happened. It looks like the cold/warm boot was just a red herring.

My machine is affected by the NVMe slow wakeup bug so I only started to use s2idle since kernel 5.18. As such I don't know if this is a regression. I'll see if I can downgrade the firmware if you think that'd help you track this down. FTR I'm on version 1.40 which seems to be the latest one as of now.

I attached the requested logs.
Comment 6 Mario Limonciello (AMD) 2022-07-11 16:09:53 UTC
The tough thing with this bug is that in this design the IRQ is shared both for the ACPI SCI as well as for the GPIO controller.  There are patches in the kernel's s2idle handling created specifically for that case.  But it means when you're dealing with a spurious interrupt you don't know which device it was REALLY destined for.

Looking at your GPIO output it seems that pin 32 switches from output pin in the good scenario to input pin in the bad scenario.  Can you try to correlate if this happens every single time you see a problem or it's also a red herring?

Good 
> pin32     interrupt is disabled| interrupt is masked| disable wakeup in S0i3
> state| disable wakeup in S3 state|
 disable wakeup in S4/S5 state|     pull-up is disabled| Pull-down is disabled| output is high| output is enabled| debouncing filter disabled|   0xc50000

Bad
> pin32     interrupt is disabled| interrupt is masked| disable wakeup in S0i3
> state| disable wakeup in S3 state|
 disable wakeup in S4/S5 state| input is high|   pull-up is disabled| Pull-down is disabled|   output is disabled| debouncing filter disabled|   0x50000

> My machine is affected by the NVMe slow wakeup bug so I only started to use
> s2idle since kernel 5.18.

Sorry what bug?  Are you talking about https://github.com/torvalds/linux/commit/455cd867b85b53fd3602345f9b8a8facc551adc9 ?

> I'll see if I can downgrade the firmware if you think that'd help you track
> this down. FTR I'm on version 1.40 which seems to be the latest one as of
> now.

> I attached the requested logs.

I didn't see the issue listed there, I was hoping to understand the timing of it.

Something I'm wondering is if maybe it's an IRQ destined for the GPIO controller but the GPIO controller driver isn't yet loaded?  This could be particularly relevant if it's not in your initramfs but rather loaded after you entered your encryption key (noticed root=/dev/mapper/cryptroot)

Could you please turn on dynamic debugging for pinctrl_amd (pinctrl.dyndbg=+p on your kernel command line )and share a log with it reproducing?

> pcie_aspm=force

FYI - this is generally not a safe thing to do.  If you remove this from your kernel command line can you reproduce the issue?

> As such I don't know if this is a regression. I'll see if I can downgrade the
> firmware if you think that'd help you track this down. FTR I'm on version
> 1.40 which seems to be the latest one as of now.

I think Mark will have to comment whether it's something they're aware of, tracking, etc regression wise.
Comment 7 madcatx 2022-07-11 16:30:58 UTC
(In reply to Mario Limonciello (AMD) from comment #6)
> The tough thing with this bug is that in this design the IRQ is shared both
> for the ACPI SCI as well as for the GPIO controller.  There are patches in
> the kernel's s2idle handling created specifically for that case.  But it
> means when you're dealing with a spurious interrupt you don't know which
> device it was REALLY destined for.
> 
> Looking at your GPIO output it seems that pin 32 switches from output pin in
> the good scenario to input pin in the bad scenario.  Can you try to
> correlate if this happens every single time you see a problem or it's also a
> red herring?
> 
> Good 
> > pin32     interrupt is disabled| interrupt is masked| disable wakeup in
> S0i3
> > state| disable wakeup in S3 state|
>  disable wakeup in S4/S5 state|     pull-up is disabled| Pull-down is
> disabled| output is high| output is enabled| debouncing filter disabled|  
> 0xc50000
> 
> Bad
> > pin32     interrupt is disabled| interrupt is masked| disable wakeup in
> S0i3
> > state| disable wakeup in S3 state|
>  disable wakeup in S4/S5 state| input is high|   pull-up is disabled|
> Pull-down is disabled|   output is disabled| debouncing filter disabled|  
> 0x50000

I noticed the difference there. I'll keep an eye on it whenever I run into this,

> > My machine is affected by the NVMe slow wakeup bug so I only started to use
> > s2idle since kernel 5.18.
> 
> Sorry what bug?  Are you talking about
> https://github.com/torvalds/linux/commit/
> 455cd867b85b53fd3602345f9b8a8facc551adc9 ?

Yes, that one, at least I think this is what made s2idle usable for me.

> > I'll see if I can downgrade the firmware if you think that'd help you track
> > this down. FTR I'm on version 1.40 which seems to be the latest one as of
> > now.
> 
> > I attached the requested logs.
> 
> I didn't see the issue listed there, I was hoping to understand the timing
> of it.

Apologies, I'll grab a fresh dmesg when it happens again. Although I'm afraid it won't be very interesting.

> 
> Something I'm wondering is if maybe it's an IRQ destined for the GPIO
> controller but the GPIO controller driver isn't yet loaded?  This could be
> particularly relevant if it's not in your initramfs but rather loaded after
> you entered your encryption key (noticed root=/dev/mapper/cryptroot)

Okay. Would adding pinctrl_amd to initramfs be enough to rule out this possibility?

> 
> Could you please turn on dynamic debugging for pinctrl_amd
> (pinctrl.dyndbg=+p on your kernel command line )and share a log with it
> reproducing?

Will do.

> 
> > pcie_aspm=force
> 
> FYI - this is generally not a safe thing to do.  If you remove this from
> your kernel command line can you reproduce the issue?

I'm aware but I've been using this machine with forced ASPM for about 2 years with no apparent issues related to that. I'll give it a go.

> > As such I don't know if this is a regression. I'll see if I can downgrade
> the
> > firmware if you think that'd help you track this down. FTR I'm on version
> > 1.40 which seems to be the latest one as of now.
> 
> I think Mark will have to comment whether it's something they're aware of,
> tracking, etc regression wise.

Thanks a lot! I'll see what I can do and let you know...
Comment 8 Mario Limonciello (AMD) 2022-07-11 16:36:08 UTC
> Although I'm afraid it won't be very interesting.

It might not be, but once we know the timing relative to pinctrl-amd loading we can know if it's that.  If it's not, then our next step will be to crank up ACPI debugging output to see if it correlates with an ACPI event.

> Okay. Would adding pinctrl_amd to initramfs be enough to rule out this
> possibility?

Yeah that or compiling it as built-in.

> I'm aware but I've been using this machine with forced ASPM for about 2 years
> with no apparent issues related to that. I'll give it a go.

I guess  the reason you do it is for improved power consumption.  I think you should contrast how much you save with a watt meter and what devices forcing ASPM actually changes.  If If you find that you have improved power consumption from something it's better to ask Lenovo to change the defaults in their BIOS.  They're more well equipped to do stress testing and analysis that PCIe is still performing properly using the proper equipment.
Comment 9 madcatx 2022-07-11 17:34:14 UTC
Luckily I managed to reproduce this very quickly this time.

The state of the GPIO pin32 is now reported as:
pin32       interrupt is disabled| interrupt is masked| disable wakeup in S0i3 state| disable wakeup in S3 state|
 disable wakeup in S4/S5 state|     pull-up is disabled| Pull-down is disabled| output is high| output is enabled| debouncing filter disabled|   0xc50000

I attached a new dmesg log with the extended pinctrl debugging enabled.
Comment 10 madcatx 2022-07-11 17:34:40 UTC
Created attachment 301392 [details]
dmesg withpinctrl debugging
Comment 11 Mario Limonciello (AMD) 2022-07-11 17:40:34 UTC
I think you turned on the wrong pinctrl debugging.  I'm looking for the timing of these statements:

drivers/pinctrl/pinctrl-amd.c:1061 [pinctrl_amd]amd_gpio_probe =_ "amd gpio driver loaded\012"
drivers/pinctrl/pinctrl-amd.c:655 [pinctrl_amd]_amd_gpio_irq_handler =_ "Disabling spurious GPIO IRQ %d\012"
drivers/pinctrl/pinctrl-amd.c:632 [pinctrl_amd]_amd_gpio_irq_handler =_ "Waking due to GPIO register 0x%p: 0x%x"
Comment 12 madcatx 2022-07-11 19:10:23 UTC
Is it pinctrl.dyndbg=+p or pinctrl_amd.dyndbg=+p? I used the former to produce the last dmesg dump, the latter doesn't seem to do anything.
Comment 13 Mario Limonciello (AMD) 2022-07-11 19:14:05 UTC
It should be the latter.  You can confirm the statements were turned on by grepping /sys/kernel/debug/dynamic_debug/control.

If it's not doing anything now is the right time to ask - you do have CONFIG_PINCTRL_AMD set in your kernel config right?
Comment 14 Mario Limonciello (AMD) 2022-07-11 19:37:09 UTC
Try using this instead:

dyndbg="module pinctrl_amd +p"
Comment 15 madcatx 2022-07-12 06:52:14 UTC
I made a small discovery today. After adding pinctrl_amd module to my initramfs I haven't been able to reproduce the problem so far. That may not be indicative of anything because the problem does not appear every time. A bit more interesting is my current output of /proc/interrupts. For IRQ9 it currently shows

 9:          0      11461          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    9-fasteoi   acpi, pinctrl_amd

Notice that the number of recorded interrupts is around 11000. Before I added pinctrl_amd to initramfs, there were two distinct cases. Either over 100000 interrupts with the kernel complaining about IRQ9 not being serviced or about 90 interrupts and no apparent issues. All of these numbers are from /proc/interrupts dump done right after logging into DE after a reboot.

I was able to reproduce these figures multiple times. This suggests that some HW starts to blast IRQs and doesn't stop until it gets serviced. Knowing this, is there anything else you'd like me to try besides your suggestions above?

My CONFIG_PINCTRL_AMD is set to CONFIG_PINCTRL_AMD=m in my kernel config and the respective module is loaded.
Comment 16 Mario Limonciello (AMD) 2022-07-12 15:34:05 UTC
> Knowing this, is there anything else you'd like me to try besides your
> suggestions above?

Actually this does make something pop out in my mind.

In your failing log above you had shown this:
> [    4.295526] kernel: Disabling IRQ #9

By this happening before pinctrl_amd is loaded might mean that IRQs are not being serviced anymore for the ACPI SCI driver.  When pinctrl_amd gets loaded it tries to service IRQ 9 stuff, but it's not destined for that driver and then you see an IRQ storm from ACPI interrupts.

So IMO continue to monitor to see if putting it into the initramfs helped the problem afterall, but please leave the dynamic debugging for pinctrl-amd in place on your kernel command line in case it comes back and we need to look at the timing of the error and pinctrl-amd loading.
Comment 17 madcatx 2022-07-13 06:33:49 UTC
My output from /sys/kernel/debug/dynamic_debug/control lists pinctrl_amd like this

drivers/pinctrl/pinctrl-amd.c:1056 [pinctrl_amd]amd_gpio_probe =p "amd gpio driver loaded\012"
drivers/pinctrl/pinctrl-amd.c:655 [pinctrl_amd]do_amd_gpio_irq_handler =p "Disabling spurious GPIO IRQ %d\012"
drivers/pinctrl/pinctrl-amd.c:632 [pinctrl_amd]do_amd_gpio_irq_handler =p "Waking due to GPIO %d: 0x%x"

IIRC this means that those debugging statements are enabled. The only relevant line from dmesg is this though:

[    0.668575] amd_gpio AMDI0030:00: amd gpio driver loaded

I just noticed another case where the number of interrupts reached about 12000. With a bit of careful optimism I think that loading pinctrl_amd early during boot puts an early stop to the IRQ storm before the kernel intervenes and disables the interrupt channel.

As for reproducibility, I think that the machine needs to run for some time before the problem can be "re-reproduced". This is why it looked like it was happening only on cold boots. Assuming there an EC connected to these GPIOs I guess it's some kind of timer overflow...?
Comment 18 Mario Limonciello (AMD) 2022-07-13 17:18:24 UTC
> IIRC this means that those debugging statements are enabled. The only
> relevant line from dmesg is this though:

Yeah the other two messages you won't be seeing generally.  They're relevant to certain circumstances.

> [    0.668575] amd_gpio AMDI0030:00: amd gpio driver loaded

The really important thing here is the kernel timestamp.  If you compare it to the failing log you have above, you can see that the failure happens a lot later in the boot.

[    4.295441] kernel: irq 9: nobody cared (try booting with the "irqpoll" option)

> I just noticed another case where the number of interrupts reached about
> 12000. With a bit of careful optimism I think that loading pinctrl_amd early
> during boot puts an early stop to the IRQ storm before the kernel intervenes
> and disables the interrupt channel.

> As for reproducibility, I think that the machine needs to run for some time
> before the problem can be "re-reproduced". This is why it looked like it was
> happening only on cold boots. Assuming there an EC connected to these GPIOs I
> guess it's some kind of timer overflow...?

Careful optimism sounds appropriate, but fingers crossed!

As a general statement it seems that by using an encrypted partition you exacerbated this problem because you will by definition have a longer boot that you don't have access to pinctrl-amd's kernel object while getting your passphrase entered.

I'd say if you don't hit this in the next few days with this change we should close it as "DOCUMENTED".  Mayyybe we should add something to the Kconfig text to warn about this?  Not sure what else can really be done from the kernel.
Comment 19 Hans de Goede 2022-07-13 17:29:51 UTC
> I'd say if you don't hit this in the next few days with this change we should
> close it as "DOCUMENTED".  Mayyybe we should add something to the Kconfig
> text to warn about this?  Not sure what else can really be done from the
> kernel.

We could consider changing the amd pinctrl option into a bool instead of a tristate, forcing it to be builtin. In general ACPI tables may rely on the pinctrl through GPIO OpRegions in various ways. So making sure that these are available early usually is a good idea. I've also seen issues on some Intel models when the pinctrl driver is not builtin.

Note I just checked and the Fedora kernels already have:

CONFIG_PINCTRL_AMD=y
Comment 20 Mario Limonciello (AMD) 2022-07-13 18:04:14 UTC
Great suggestion, thanks!
I've sent up https://lore.kernel.org/linux-gpio/20220713175950.964-1-mario.limonciello@amd.com/T/#u for just that.
Comment 21 madcatx 2022-07-14 11:31:04 UTC
Just FTR, I just reproduced another case where the problem most likely would have occurred - almost 33000 interrupts before the driver took action - but having pinctrl_amd loaded early seems to be an effective solution.

I guess that Lenovo people may want to look into this, 33000 intrs is sort of cutting it close to the kernel disabling the interrupt regardless of the driver being available.

Thanks a lot for all the guidance how to track this down!
Comment 22 Mario Limonciello (AMD) 2022-07-14 11:34:56 UTC
Were you still using the initramfs method when 33000 came through?
If so; please switch to built-in.  That should allow the driver to load as early as possible.

> Thanks a lot for all the guidance how to track this down!

Sure!
Comment 23 madcatx 2022-07-14 13:15:13 UTC
(In reply to Mario Limonciello (AMD) from comment #22)
> Were you still using the initramfs method when 33000 came through?
> If so; please switch to built-in.  That should allow the driver to load as
> early as possible.

Just tried that and the number of unserviced interrupts dropped down to less than 8000; that's the lowest value I've seen so far. pinctrl_amd gets loaded about 0.48 secs into boot, it loaded at about 0.67 sec when it was a module.

Anyway, it looks like we pinned the problem down pretty well. Thanks!
Comment 24 Mario Limonciello (AMD) 2022-07-14 13:48:20 UTC
OK thanks.  I'll close it, and we'll work on getting that patch landed to prevent users from compiling as a module.