Bug 201817

Summary: irq 7: nobody cared for laptops with AMD processors
Product: Platform Specific/Hardware Reporter: Bram Coenen (bram.coenen96)
Component: x86-64 Assignee: platform_x86_64 (platform_x86_64)
Status: NEEDINFO ---    
Severity: normal CC: albertogomezmarin, callofdutypsp, CaptainSifff, carnil, clemens, damkrat, drake, greg, hikaph+kernel, jan, joakim.rosenqvist, jvdelisle2, jwrdegoede, konoha02, mario.limonciello, mkj, mruize85, nospamming11+kernel, nuriilengir, openproggerfreak, paul.richards, philipp.list, postix, rventura.pt, samy, sh200105, shtetldik, superm1, t.neish, texstar, yinette.hodge
Priority: P1    
Hardware: AMD   
OS: Linux   
Kernel Version: 5.10 Subsystem:
Regression: No Bisected commit-id:
Bug Depends on: 198715    
Bug Blocks: 208469    
Attachments: dmesg after boot, 4.19.4, no touchscreen
dmesg after boot, 4.19.3, working touchscreen
cat /proc/interrupts
modified function
modified - dmesg - working
[RFC] pinctrl/amd: Clear interrupt enable bits on probe
dmesg with leonard patch and a non-working touchscreen
dmesg with leonard patch and a working touchscreen
dmesg_leonard_patch_no_touchscreen
cat /sys/kernel/debug/gpio - linux-zen 5.5.8
cat /sys/kernel/gpio result
cat /sys/kernel/debug/gpio
cat /sys/kernel/debug/gpio. Cuz why not
dmesg
interrupts
dmesg output of HP M01-F0001 System
Part of the Mainboard "Erica2" from HP
dmesg with apic=verbose kernel 5.12.5
dmesg apic=verbose, ryzen 3500u, thinkpad e495
Attempt to work between rock and a hard place
dmesg after paultest1 build
dmesg paultest2 build
acpidump kernel paultest2
Lenovo E595 dmesg output using paultest3 kernel
Lenovo E595 acpidump output using paultest3 kernel
dmesg from paultest3 build
acpidump from paultest3
Lenovo E595 dmesg output using paultest4 kernel
Lenovo E595 acpidump output using paultest4 kernel
dmesg from paultest4
acpidump from paultest4
attachment-2695-0.html
multiples oops for 5.16.11-arch1-1,5.16.14-arch1-1,5.15.28-1-lts,6.0.1-arch2-1
1-dmesg-oops-T495-Ryzen7PRO3700U-6.0.1-arch2-1 with dyndbg="module pinctrl_amd +p"
2-dmesg-oops-T495-Ryzen7PRO3700U-6.0.1-arch2-1 with dyndbg="module pinctrl_amd +p"

Description Bram Coenen 2018-11-29 18:40:08 UTC
Created attachment 279741 [details]
dmesg after boot, 4.19.4, no touchscreen

The Elan touchscreen on HP laptops with an AMD processor just got fixed and worked properly for a while in 4.19.3. However after installing a new kernel version, 4.19.4, the touchscreen stopped working and new errors appeared.

I once got this in 4.19.3 as well, but after a few shutdowns, the last of them by holding down the power button, this was resolved. I did have issues logging in as well (Don't)

I'm on a HP ENVY x360 Convertible 15-bq0xx/8311, BIOS F.08 with only Fedora 28.

The ACPI config was fixed in https://bugzilla.kernel.org/show_bug.cgi?id=198715 .
Comment 1 Bram Coenen 2018-11-29 18:51:30 UTC
Created attachment 279743 [details]
dmesg after boot, 4.19.3, working touchscreen
Comment 2 Bram Coenen 2018-11-29 18:58:55 UTC
Just checked. The kernel version doesn't matter; a decent number of reboots does the trick to get the touchscreen working, even on 4.19.4.
Comment 3 Bram Coenen 2018-11-29 19:11:53 UTC
[   16.587361] irq 7: nobody cared (try booting with the "irqpoll" option)
[   16.587366] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C        4.19.4-200.fc28.x86_64 #1
[   16.587367] Hardware name: HP HP ENVY x360 Convertible 15-bq0xx/8311, BIOS F.08 03/30/2018
[   16.587368] Call Trace:
[   16.587372]  <IRQ>
[   16.587381]  dump_stack+0x5c/0x80
[   16.587385]  __report_bad_irq+0x37/0xae
[   16.587388]  note_interrupt.cold.9+0xa/0x69
[   16.587390]  handle_irq_event_percpu+0x6a/0x80
[   16.587392]  handle_irq_event+0x27/0x44
[   16.587394]  handle_fasteoi_irq+0x7f/0x120
[   16.587398]  handle_irq+0xbf/0x100
[   16.587400]  do_IRQ+0x49/0xd0
[   16.587403]  common_interrupt+0xf/0xf
[   16.587405]  </IRQ>
[   16.587409] RIP: 0010:native_safe_halt+0x2/0x10
[   16.587411] Code: ff ff 7f c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 75 c4 eb 8c 90 90 90 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
[   16.587412] RSP: 0018:ffffffffbd203e18 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd8
[   16.587415] RAX: 0000000080000000 RBX: ffff887df5a01c00 RCX: 0000000000000034
[   16.587416] RDX: 4ec4ec4ec4ec4ec5 RSI: ffffffffbd2dd200 RDI: ffff887df5a01c64
[   16.587418] RBP: ffff887df5a01c64 R08: 0000000000000002 R09: 0000000000020800
[   16.587419] R10: 0000000f66cccf99 R11: ffff887df721fde8 R12: 0000000000000001
[   16.587420] R13: 0000000000000001 R14: 0000000000000001 R15: 00000000d0c2fae0
[   16.587424]  acpi_safe_halt+0x1b/0x30
[   16.587427]  acpi_idle_enter+0x104/0x2a0
[   16.587431]  cpuidle_enter_state+0x71/0x320
[   16.587435]  do_idle+0x226/0x260
[   16.587438]  cpu_startup_entry+0x6f/0x80
[   16.587442]  start_kernel+0x523/0x543
[   16.587446]  secondary_startup_64+0xa4/0xb0
[   16.587448] handlers:
[   16.587453] [<00000000e6019074>] amd_gpio_irq_handler [pinctrl_amd]
[   16.587455] Disabling IRQ #7
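Editorial aside, not part of the original comment: traces of this shape are pasted repeatedly in this bug. As a rough triage aid, the following Python sketch pulls the IRQ number, hardware name and registered handlers out of a dmesg capture. The function name `summarize_bad_irq` and the regular expressions are illustrative assumptions derived only from the log format shown above; other kernel versions may print these lines differently.

```python
import re

# Patterns are assumptions based on the trace format quoted in this bug.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HARDWARE = re.compile(r"Hardware name: (.+)$")
HANDLER = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+)(?:\s+\[(\S+)\])?")
DISABLED = re.compile(r"Disabling IRQ #(\d+)")


def summarize_bad_irq(dmesg_text):
    """Return one dict per 'nobody cared' report found in dmesg_text."""
    reports = []
    current = None
    for line in dmesg_text.splitlines():
        match = NOBODY_CARED.search(line)
        if match:
            # A new report starts here.
            current = {
                "irq": int(match.group(1)),
                "hardware": None,
                "handlers": [],
                "disabled": False,
            }
            reports.append(current)
            continue
        if current is None:
            continue
        match = HARDWARE.search(line)
        if match:
            current["hardware"] = match.group(1).strip()
            continue
        match = HANDLER.search(line)
        if match:
            # (handler function, module name or None)
            current["handlers"].append((match.group(1), match.group(2)))
            continue
        match = DISABLED.search(line)
        if match and int(match.group(1)) == current["irq"]:
            current["disabled"] = True
            current = None  # the report ends with the "Disabling IRQ" line
    return reports
```

Calling `summarize_bad_irq()` on the text of a saved dmesg file makes it easy to check whether `amd_gpio_irq_handler` is the only handler on the line in each report.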
Comment 4 Hans de Goede 2018-11-30 09:45:15 UTC
I discussed this a bit on the mailing list. Here are the relevant parts of the discussion:

Me:

The amd_gpio chip/driver appears to be the only driver
connected to IRQ 7, so I think there is an issue with the
amd_gpio driver where it does not properly clear the interrupt
source. E.g. it might be that the BIOS requested interrupts
on a GPIO which Linux does not monitor and that the driver
does not disable this GPIO-IRQ on probe and since it is not
handling that pin in IRQ mode also does not clear it.

Anyways that is just a theory. It would greatly help if
someone who knows the amd_gpio driver better could take
a look.

Reply by Daniel Drake:

Sorry that I can't be much help here - I don't have access to any
useful info beyond the source code already present in Linux.

Maybe you could explore your theory by dumping the GPIO/GPIO-INT
enable regs, see if any of them are marked as enabled by something
other than Linux.

###

I'm afraid I don't have time to look into this myself atm. Maybe someone can add some printk calls to drivers/pinctrl/pinctrl-amd.c to dump relevant register values as Daniel suggested and see if that yields any useful info?
Comment 5 mruize85 2018-12-16 15:20:55 UTC
Forcing `amd_gpio_irq_handler()` from `drivers/pinctrl/pinctrl-amd.c` to always return `IRQ_HANDLED` makes the touchscreen work, but now the system is generating interrupts at a very high rate (around 100 k/s). I am a complete newbie to kernel development, so I don't really know what I'm doing... Any ideas?
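Editorial aside: an interrupt rate like the "around 100 k/s" figure above can be estimated from two snapshots of `/proc/interrupts`. This is a minimal sketch under assumptions: the helper names `irq_totals`, `irq_rate` and `measure` are hypothetical, and the parser assumes the usual layout of a header row of CPU columns followed by one `label: count count ...` row per interrupt.

```python
import time


def irq_totals(proc_interrupts_text):
    """Map each interrupt label (e.g. "7") to its count summed over all CPUs."""
    lines = proc_interrupts_text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        # Only the first ncpus fields are per-CPU counters.
        fields = rest.split()[:ncpus]
        totals[label.strip()] = sum(int(f) for f in fields if f.isdigit())
    return totals


def irq_rate(before_text, after_text, seconds, irq="7"):
    """Interrupts per second for one IRQ between two snapshots."""
    before = irq_totals(before_text)
    after = irq_totals(after_text)
    return (after[irq] - before[irq]) / seconds


def measure(irq="7", seconds=1.0, path="/proc/interrupts"):
    """Take two live snapshots `seconds` apart and return the rate."""
    with open(path) as f:
        before = f.read()
    time.sleep(seconds)
    with open(path) as f:
        after = f.read()
    return irq_rate(before, after, seconds, irq)
```

On an affected machine, `measure("7")` would give a rough per-second figure for IRQ 7; it only works while the line has not yet been disabled by the kernel.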
Comment 6 nospamming11+kernel 2018-12-20 21:09:07 UTC
Created attachment 280107 [details]
cat /proc/interrupts

Seeing the exact same problem on 4.19.10 (HP Envy x360 13-ag0004ng)

Attached the output of /proc/interrupts
Comment 7 Lukas Kahnert 2018-12-24 20:01:51 UTC
The only case where I got the "nobody cared" panic on my HP Envy x360 bq-1xx was if I used Windows 10 on my last boot and rebooted into Linux.
My theory is that Windows puts IRQ 7 into a state that persists across reboots (and triggers the panic in Linux) and only gets cleared if you hold down the power button.
My laptop is now Linux only, and since then I have never had this issue again (using 4.19.5 now).
Comment 8 JerryD 2019-01-15 02:33:41 UTC
This issue occurs regardless of whether Windows 10 was the previous boot. From a cold power-up, boot fails about 2 out of 3 attempts. Is there any way to remove this touchscreen driver? I think it should be backed out until the regression is fixed.
Comment 9 Bram Coenen 2019-01-15 07:22:10 UTC
(In reply to JerryD from comment #8)
> This issue occurs without regard to Windows 10 previous boot. From cold
> powerup I get failure to boot about 2 out of 3 attempts. Anyway to remove
> this touchscreen driver? I think it should be backed out until the
> regression is fixxed.

I don't even have Windows any more and get the bug sometimes. However, I do not agree that the driver should be removed. I still use the touchscreen daily because for me it is working most of the time. Besides, it does not hurt having it there, does it?
Comment 10 JerryD 2019-01-17 02:22:07 UTC
Are there any workarounds?
Comment 11 nospamming11+kernel 2019-01-17 13:34:38 UTC
Some suggest here: https://github.com/linuxwacom/wacom-hid-descriptors/issues/12
that switching to Legacy BIOS boot helped them.
Comment 12 nospamming11+kernel 2019-01-18 18:19:59 UTC
I can confirm that always returning IRQ_HANDLED fixes the error, but spams the system with a lot of these interrupts (it never stops)

Sometimes booting with the kernel option noirqdebug helped me to get the touchscreen up and running again.

I then noticed the following: see the three attached files, one with a modified amd_gpio_irq_handler and two dmesg outputs, one from a boot with a working touchscreen and one where the touchscreen does not work. I can see that in the working case all interrupts are handled correctly, while in the "not-working" case there are A LOT of interrupts that are not handled at all.
Comment 13 nospamming11+kernel 2019-01-18 18:20:41 UTC
Created attachment 280583 [details]
modified function
Comment 14 nospamming11+kernel 2019-01-18 18:22:35 UTC
Created attachment 280585 [details]
modified - dmesg - working
Comment 15 nospamming11+kernel 2019-01-18 18:27:24 UTC
Not working - dmesg (too large to attach directly): https://bit.ly/2FHChBb
Comment 16 JerryD 2019-01-20 00:21:56 UTC
(In reply to nospamming11+kernel from comment #12)
> I can confirm that always returning IRQ_HANDLED fixes the error, but spams
> the system with a lot of these interrupts (it never stops)
> 
> Sometimes booting with the kernel option noirqdebug helped me to get the
> touchscreen up and running again.
> 
> I then noticed the following: See attached three files. One with a modified
> amd_gpio_irq_handler and two dmesg outputs. One of them with a working
> touchscreen and one where the touchscreen does not work. I can see that in
> the working case all interrupts are handled correctly while in the
> "not-working"-case there are A LOT of interrupts handled at all.

What does this imply? Does the driver need to actually handle this interrupt? What hardware is actually generating this interrupt? Or is IRQ 7 an unused pin that is floating and therefore must be disabled?
Comment 17 JerryD 2019-02-09 19:35:50 UTC
Appears to be fixed on Fedora kernel 4.20.6-200.fc29.x86_64
Comment 18 nospamming11+kernel 2019-02-09 19:59:22 UTC
Sadly, I can't confirm that. Neither on 4.20.6.arch1-1-ARCH nor on 4.20.7-arch1-1-ARCH the problem is fixed for me. (Laptop Firmware: F.32)

Still noirqdebug is a workaround.
Comment 19 JerryD 2019-02-10 01:11:38 UTC
(In reply to nospamming11+kernel from comment #18)
> Sadly, I can't confirm that. Neither on 4.20.6.arch1-1-ARCH nor on
> 4.20.7-arch1-1-ARCH the problem is fixed for me. (Laptop Firmware: F.32)
> 
> Still noirqdebug is a workaround.

I am on HP Envy Laptop with Bios F19 which HP pulled evidently because it has some other big problem. But since everything is stable for me at the moment I am just leaving it alone. From what I am reading the only way to back that bios out is with a special USB stick which they will send you. My current kernel boot line is:

BOOT_IMAGE=/vmlinuz-4.20.6-200.fc29.x86_64 root=/dev/mapper/fedora_localhost--live-root ro resume=/dev/mapper/fedora_localhost--live-swap rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap rhgb quiet LANG=en_US.UTF-8 idle=nomwait processor.max_cstate=5
Comment 20 Bram Coenen 2019-02-14 04:48:29 UTC
(In reply to JerryD from comment #17)
> Appears to be fixed on Fedora kernel 4.20.6-200.fc29.x86_64

It isn't fixed for me on the 4.20.6-200.fc29.x86_64. But it doesn't always happen, sometimes it can go days without the bug appearing.
Comment 21 JerryD 2019-02-16 03:16:40 UTC
(In reply to Bram Coenen from comment #20)
> (In reply to JerryD from comment #17)
> > Appears to be fixed on Fedora kernel 4.20.6-200.fc29.x86_64
> 
> It isn't fixed for me on the 4.20.6-200.fc29.x86_64. But it doesn't always
> happen, sometimes it can go days without the bug appearing.

You are right, it is intermittent; sometimes I get it and sometimes I don't.

A side note: I upgraded to Bios F20. Don't do this. 4.20.xx fails to boot. 4.18 seems to boot fine.
Comment 22 nospamming11+kernel 2019-02-16 14:59:58 UTC
Can you guys confirm that "noirqdebug" as a kernel boot param works for you too?
Comment 23 JerryD 2019-02-17 16:53:37 UTC
(In reply to nospamming11+kernel from comment #22)
> Can you guys confirm that "noirqdebug" as a kernel boot param works for you
> too?

With Linux version 4.18.16-300.fc29.x86_64 (mockbuild@bkernel04.phx2.fedoraproject.org)

I see no irq7 issue.

With 4.20.7-200.fc29.x86_64 it locks up right away and I am unable to get any sort of backtrace, with or without "noirqdebug". ABRT reports insufficient information to generate a report and says to contact the kernel mailing list.
Comment 24 JerryD 2019-02-17 16:59:33 UTC
For clarity on comment 23: HP HP ENVY x360 Convertible 15-bq1xx/83C6, BIOS F.20 12/25/2018
Comment 25 Hans de Goede 2019-02-19 13:24:40 UTC
Created attachment 281207 [details]
[RFC] pinctrl/amd: Clear interrupt enable bits on probe

Good news, Leonard Crestez has come up with a patch which likely fixes this.

I'm attaching the patch here, please give it a try.
Comment 26 nospamming11+kernel 2019-02-19 18:35:13 UTC
Unfortunately this still does not fix it for me.

I applied it to 4.20.10-arch1-1 (Archlinux kernel) and I still get the error:

"irq 7:nobody cared (try booting with the "irqpoll" option)"
with a Call Trace afterwards.

I can see that the new function is getting called and tells me that a bunch of PINs get disabled

"amd_gpio: AMD10030:00: Pin 67 interrupt enabled on boot: disable
.....
"

But right after that, IRQ 7 starts to spam my logs again (I still have the log outputs described above added).
Comment 27 Bram Coenen 2019-02-20 14:26:55 UTC
Created attachment 281231 [details]
dmesg with leonard patch and a non-working touchscreen

Dmesg output when trying Leonard Crestez's patch the first time and it didn't work. It worked when booting into the kernel the second time and I think the relevant error message was the one shown below. I'll come back and give an update if/when the touchscreen stops working again. 

The nobody cared error is gone at least! 

[    2.964272] i2c_hid i2c-ELAN0732:00: HID over i2c has not been provided an Int IRQ
[    2.964330] i2c_hid: probe of i2c-ELAN0732:00 failed with error -22

(This is my first time compiling a kernel in fedora. So I could also have done something wrong.)
Comment 28 nospamming11+kernel 2019-02-20 14:47:32 UTC
I have seen the i2c-error too - when my touchscreen worked, but the patch does not work for me (even after several linux-only-boots).

Do you have a windows system installed @Bram Coenen? Can you boot into it and test if the touchscreen still works, when you boot back into linux?
Comment 29 Bram Coenen 2019-02-20 16:39:29 UTC
(In reply to nospamming11+kernel from comment #28)
> I have seen the i2c-error too - when my touchscreen worked, but the patch
> does not work for me (even after several linux-only-boots).
> 
> Do you have a windows system installed @Bram Coenen? Can you boot into it
> and test if the touchscreen still works, when you boot back into linux?

No, unfortunately I don't have windows installed. However the bug happens now and again anyways
Comment 30 Bram Coenen 2019-02-20 16:41:30 UTC
Created attachment 281233 [details]
dmesg with leonard patch and a working touchscreen

I think the interesting parts here are these prints.

[    2.988099] i2c_hid i2c-ELAN0732:00: i2c-ELAN0732:00 supply vdd not found, using dummy regulator
[    2.988146] i2c_hid i2c-ELAN0732:00: Linked as a consumer to regulator.0
[    2.988148] i2c_hid i2c-ELAN0732:00: i2c-ELAN0732:00 supply vddl not found, using dummy regulator
[    2.993265] audit: type=1130 audit(1550670938.923:9): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=plymouth-start comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[    3.015329] acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[    3.015354] wmi_bus wmi_bus-PNP0C14:01: WQBJ data block query control method not found
[    3.017236] input: ELAN0732:00 04F3:24BC Touchscreen as /devices/platform/AMD0010:00/i2c-0/i2c-ELAN0732:00/0018:04F3:24BC.0004/input/input13
[    3.017377] input: ELAN0732:00 04F3:24BC as /devices/platform/AMD0010:00/i2c-0/i2c-ELAN0732:00/0018:04F3:24BC.0004/input/input14
[    3.017434] input: ELAN0732:00 04F3:24BC as /devices/platform/AMD0010:00/i2c-0/i2c-ELAN0732:00/0018:04F3:24BC.0004/input/input15
[    3.017492] input: ELAN0732:00 04F3:24BC as /devices/platform/AMD0010:00/i2c-0/i2c-ELAN0732:00/0018:04F3:24BC.0004/input/input16
[    3.017589] hid-generic 0018:04F3:24BC.0004: input,hidraw3: I2C HID v1.00 Device [ELAN0732:00 04F3:24BC] on i2c-ELAN0732:00
[    3.064691] nvme nvme0: pci function 0000:03:00.0
[    3.107262] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    3.213210] input: ELAN0732:00 04F3:24BC as /devices/platform/AMD0010:00/i2c-0/i2c-ELAN0732:00/0018:04F3:24BC.0004/input/input18
[    3.213350] input: ELAN0732:00 04F3:24BC as /devices/platform/AMD0010:00/i2c-0/i2c-ELAN0732:00/0018:04F3:24BC.0004/input/input21
[    3.213445] hid-multitouch 0018:04F3:24BC.0004: input,hidraw2: I2C HID v1.00 Device [ELAN0732:00 04F3:24BC] on i2c-ELAN0732:00
Comment 31 Bram Coenen 2019-02-20 19:52:49 UTC
I got the bug again, but no print from the patch. So I probably messed up the compilation of the kernel with the patch. I'll try to patch it correctly tomorrow!
Comment 32 JerryD 2019-02-21 03:22:58 UTC
(In reply to JerryD from comment #23)
> (In reply to nospamming11+kernel from comment #22)
> > Can you guys confirm that "noirqdebug" as a kernel boot param works for you
> > too?
> 
> With Linux version 4.18.16-300.fc29.x86_64
> mockbuild@bkernel04.phx2.fedoraproject.org)
> 
> I see no irq7 issue.
> 
> With 4.20.7-200.fc29.x86_64 it locks up right away and unable to get any
> sort of backtrace with ot without "noirqdebug" ABRT reports insufficiant
> information to generate a report and to contact kernel mailing list.

Turns out the problem I have with the 4.20 kernel is not bios related and bios F.20 is OK. I ran into this bug:

https://bugs.freedesktop.org/show_bug.cgi?id=109206

Kernel 4.19.15-300.fc29.x86_64 is working fine.
Comment 33 Bram Coenen 2019-02-21 04:55:25 UTC
(In reply to JerryD from comment #32)
> (In reply to JerryD from comment #23)
> > (In reply to nospamming11+kernel from comment #22)
> > > Can you guys confirm that "noirqdebug" as a kernel boot param works for
> you
> > > too?
> > 
> > With Linux version 4.18.16-300.fc29.x86_64
> > mockbuild@bkernel04.phx2.fedoraproject.org)
> > 
> > I see no irq7 issue.
> > 
> > With 4.20.7-200.fc29.x86_64 it locks up right away and unable to get any
> > sort of backtrace with ot without "noirqdebug" ABRT reports insufficiant
> > information to generate a report and to contact kernel mailing list.
> 
> Turns out the problem I have with the 4.20 kernel is not bios related and
> bios F.20 is OK. I ran into this bug:
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=109206
> 
> Kernel 4.19.15-300.fc29.x86_64 is working fine.

Try a higher version of 4.20. Mine is working on 4.20.10 for example.
Comment 34 Bram Coenen 2019-02-21 05:05:04 UTC
Created attachment 281253 [details]
dmesg_leonard_patch_no_touchscreen

I applied the patch made by Leonard correctly this time. The touchscreen does not work and prints the following for pin 67 to 148.

[    3.080467] amd_gpio AMD0030:00: Pin 67 interrupt enabled on boot: disable
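Editorial aside: to see which pins the RFC patch reported on a given boot, the message above can be collected from a dmesg capture. A minimal sketch, assuming the message text is exactly as printed above; `pins_disabled_on_boot` and `to_ranges` are hypothetical helper names, not part of the patch.

```python
import re

# Message format as printed by the RFC patch in the log line above.
PIN_DISABLED = re.compile(r"Pin (\d+) interrupt enabled on boot: disable")


def pins_disabled_on_boot(dmesg_text):
    """Return the sorted, de-duplicated pin numbers reported by the patch."""
    return sorted({int(m.group(1)) for m in PIN_DISABLED.finditer(dmesg_text)})


def to_ranges(pins):
    """Collapse pin numbers into inclusive (first, last) runs."""
    ranges = []
    for pin in sorted(set(pins)):
        if ranges and pin == ranges[-1][1] + 1:
            ranges[-1][1] = pin  # extend the current run
        else:
            ranges.append([pin, pin])  # start a new run
    return [tuple(r) for r in ranges]
```

Comparing `to_ranges(pins_disabled_on_boot(text))` between a working and a non-working boot shows quickly whether the set of pins left enabled by the firmware differs.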
Comment 35 Bram Coenen 2019-02-21 05:11:12 UTC
(In reply to Bram Coenen from comment #34)
> Created attachment 281253 [details]
> dmesg_leonard_patch_no_touchscreen
> 
> I applied the patch made by Leonard correctly this time. The touchscreen
> does not work and prints the following for pin 67 to 148.
> 
> [    3.080467] amd_gpio AMD0030:00: Pin 67 interrupt enabled on boot: disable

"irq 7: nobody cared" is also still present.
Comment 36 Bram Coenen 2019-02-21 10:24:27 UTC
(In reply to Bram Coenen from comment #35)
> (In reply to Bram Coenen from comment #34)
> > Created attachment 281253 [details]
> > dmesg_leonard_patch_no_touchscreen
> > 
> > I applied the patch made by Leonard correctly this time. The touchscreen
> > does not work and prints the following for pin 67 to 148.
> > 
> > [    3.080467] amd_gpio AMD0030:00: Pin 67 interrupt enabled on boot:
> disable
> 
> "irq 7: nobody cared" is also still present.

Now my touchscreen is working with the patch. It still prints that the same pins are disabled and the nobody cared is gone. So I don't think the patch made any difference.
Comment 37 nospamming11+kernel 2019-07-01 18:54:43 UTC
Giving a quick update:

On Linux 5.1.15 (archlinux) with Firmware F.32 the bug still persists reliably (Linux boot after Windows boot)
Comment 38 JerryD 2019-07-28 15:45:57 UTC
Still present on 5.1.19-300 (Fedora 30)
Comment 39 Daniel Drake 2019-08-14 09:17:38 UTC
I just faced a similar problem on a new platform: the BIOS boots with a GPIO IRQ enabled, the boot-time GPIO state causes the IRQ to fire. pinctrl-amd tries to handle the IRQ, but doesn't call any handler, and as a result we get a boot-time interrupt storm.

I considered the patch here, but I checked Windows vs Linux. After boot, both OSes have the same 41 GPIO IRQs enabled. This is many more than the ones listed in the DSDT. I believe this means that Windows does not disable all GPIO IRQs at boot time, and hence the patch posted here (based on an earlier suggestion that I made) is somewhat risky.

I guess from the comments above that approach didn't help either.

I took an alternative approach, to just disable the spurious IRQ, which solves the issue I was facing:
   [PATCH] pinctrl/amd: disable spurious-firing GPIO IRQs
but if the disable-all approach didn't work then I don't think this patch will help your case either.

If anyone wants to investigate further, I think the next step here is to dump the contents of the "status" variable in amd_gpio_irq_handler(). The fact that you are getting "nobody cared" indicates that the pinctrl-amd driver itself doesn't even know which GPIO is causing the interrupt to be fired. From that angle it's understandable that disabling all the GPIO interrupts didn't make a difference. Checking the exact value of "status" will give us more clarity.
Comment 40 nospamming11+kernel 2019-08-14 09:34:51 UTC
You can already find logs with the status variable dumped in this thread. 

I attached them: one version with a boot where the touchscreen was working and one version with a boot where the touchscreen was not working. (and the source code of the modified handler that creates the dump)

The problem still persists, and a workaround is booting with the kernel option noirqdebug or leaving the device powered off for a few days. Then the touchscreen works again (until you boot to Windows).
Comment 41 Daniel Drake 2019-08-15 02:58:49 UTC
Ah, I see it in Comment #15.

So when your issue bites, the pinctrl-amd status register is zero.
The GPIO controller is indicating that it did not generate any interrupt at all.

So as you probably already grasped, the cause of your issue almost certainly resides outside of pinctrl-amd.

To look for clues here, two more ideas:

1. Check closely when the interrupt spam starts appearing. Is it right after pinctrl-amd loads, or is it triggered somewhat later in boot? If it's triggered later then you can try to figure out precisely what causes it.

2. Look for differences elsewhere between the working & non-working setups. Perhaps differences in the APIC registers (or something like that) could explain the weird interrupts on irq 7. I'm a bit out of my knowledge area there though.
Comment 42 Shmerl 2019-11-05 20:23:16 UTC
I have a similar problem with Lenovo Thinkpad E495 (see below for dmesg snippet). This laptop doesn't even have a touchscreen display.

The error doesn't really cause any hangs or slowdown (I'm running 5.4-rc6), only that message in the log (shown during boot). Is there a way though to get rid of it without using Windows? I don't have access to Windows installation on this computer anymore.

Thanks!

Here is what I see in dmesg:

[    2.380193] irq 7: nobody cared (try booting with the "irqpoll" option)
[    2.380216] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E     5.4.0-rc5+ #25
[    2.380217] Hardware name: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET30W (1.10 ) 10/11/2019
[    2.380218] Call Trace:
[    2.380220]  <IRQ>
[    2.380225]  dump_stack+0x5c/0x80
[    2.380228]  __report_bad_irq+0x38/0xad
[    2.380230]  note_interrupt.cold+0xb/0x6e
[    2.380232]  handle_irq_event_percpu+0x72/0x80
[    2.380233]  handle_irq_event+0x3c/0x5c
[    2.380234]  handle_fasteoi_irq+0xa3/0x160
[    2.380236]  do_IRQ+0x53/0xe0
[    2.380238]  common_interrupt+0xf/0xf
[    2.380238]  </IRQ>
[    2.380241] RIP: 0010:cpuidle_enter_state+0xc4/0x450
[    2.380243] Code: e8 c1 94 ad ff 80 7c 24 0f 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 61 03 00 00 31 ff e8 e3 b1 b3 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 8c 02 00 00 49 63 cc 4c 2b 6c 24 10 48 8d 04 49 48
[    2.380243] RSP: 0018:ffffaea10015fe68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
[    2.380245] RAX: ffff8f99708ea6c0 RBX: ffffffff9babcb00 RCX: 000000008dcbf47f
[    2.380246] RDX: 000000008e08fd75 RSI: 000000008dcbf47f RDI: 0000000000000000
[    2.380246] RBP: ffff8f996ddcf800 R08: 000000008dcbf8a5 R09: 00000049b4d1ac6b
[    2.380246] R10: ffff8f99708e95a0 R11: ffff8f99708e9580 R12: 0000000000000002
[    2.380247] R13: 000000008dcbf8a5 R14: 0000000000000002 R15: ffff8f996efe8000
[    2.380249]  ? cpuidle_enter_state+0x9f/0x450
[    2.380251]  cpuidle_enter+0x29/0x40
[    2.380253]  do_idle+0x1dc/0x270
[    2.380254]  cpu_startup_entry+0x19/0x20
[    2.380257]  start_secondary+0x15f/0x1b0
[    2.380259]  secondary_startup_64+0xa4/0xb0
[    2.380260] handlers:
[    2.380270] [<0000000027c08871>] amd_gpio_irq_handler
[    2.380283] Disabling IRQ #7
Comment 43 nospamming11+kernel 2019-11-06 16:08:59 UTC
(In reply to Shmerl from comment #42)
> I have a similar problem with Lenovo Thinkpad E495 (see below for dmesg
> snippet). This laptop doesn't even have a touchscreen display.
> 
> The error doesn't really cause any hangs or slowdown (I'm running 5.4-rc6),
> only that message in the log (shown during boot). Is there a way though to
> get rid of it without using Windows? I don't have access to Windows
> installation on this computer anymore.
> 
> Thanks!

Did you try to start your kernel with noirqdebug (as a kernel option), which is a workaround for the issue with touchscreens?
Comment 44 Shmerl 2019-11-11 16:30:12 UTC
(In reply to nospamming11+kernel from comment #43)
> 
> Did you try to start your kernel with noirqdebug (as kernel option) which is
> a workaround for the issue with touchscreens

Just tried it. With noirqdebug, the message doesn't pop up anymore during boot, but on next boot returns if I boot without noirqdebug, so it just masks the issue, rather than changing something permanently like in some examples above.
Comment 45 nospamming11+kernel 2019-11-11 17:08:29 UTC
(In reply to Shmerl from comment #44)
> (In reply to nospamming11+kernel from comment #43)
> > 
> > Did you try to start your kernel with noirqdebug (as kernel option) which
> is
> > a workaround for the issue with touchscreens
> 
> Just tried it. With noirqdebug, the message doesn't pop up anymore during
> boot, but on next boot returns if I boot without noirqdebug, so it just
> masks the issue, rather than changing something permanently like in some
> examples above.

For the touchscreen: with noirqdebug the touchscreen works; without it (and thus with the error appearing) the touchscreen does not work. So it has to do more than just suppress the error message in dmesg.
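Editorial aside: one plausible explanation for this observation is the kernel's spurious-interrupt detector, which does more than print a message: once almost every interrupt in a window goes unhandled, it disables the whole IRQ line, and anything sharing that line stops receiving interrupts. The Python model below is a simplified sketch of that behaviour, not kernel code. The window of 100,000 interrupts and the threshold of 99,900 unhandled are quoted from memory of `kernel/irq/spurious.c` and should be treated as assumptions; the real code also has time-based resets that are omitted here, and the class and function names are invented for illustration.

```python
IRQ_NONE = 0      # handler says "not mine"
IRQ_HANDLED = 1   # handler says "I dealt with it"


class SpuriousDetector:
    """Simplified model of the 'nobody cared' accounting for one IRQ line."""

    WINDOW = 100_000           # assumed: interrupts per evaluation window
    UNHANDLED_LIMIT = 99_900   # assumed: unhandled count that trips the check

    def __init__(self, noirqdebug=False):
        self.noirqdebug = noirqdebug
        self.irq_count = 0
        self.irqs_unhandled = 0
        self.disabled = False

    def note_interrupt(self, result):
        if self.noirqdebug or self.disabled:
            return  # with noirqdebug the accounting is skipped entirely
        if result == IRQ_NONE:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < self.WINDOW:
            return
        if self.irqs_unhandled > self.UNHANDLED_LIMIT:
            # Corresponds to "irq N: nobody cared" and "Disabling IRQ #N".
            self.disabled = True
        self.irq_count = 0
        self.irqs_unhandled = 0


def run_storm(detector, handler_result, interrupts):
    """Deliver up to `interrupts` interrupts; return how many got through."""
    delivered = 0
    for _ in range(interrupts):
        if detector.disabled:
            break  # line is masked, nothing is delivered any more
        delivered += 1
        detector.note_interrupt(handler_result)
    return delivered
```

In this model a storm where the handler returns `IRQ_NONE` gets the line disabled after one window, while forcing `IRQ_HANDLED` or setting `noirqdebug=True` keeps the line enabled at the cost of the storm continuing. That matches the reports earlier in this bug that both changes bring the touchscreen back without stopping the interrupts.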
Comment 46 Fruhwirth Clemens 2019-12-17 09:10:33 UTC
Another data point when waking up from suspend for Thinkpad T495 on Ryzen 3700 which has an amdgpu:

[  373.071812] irq 7: nobody cared (try booting with the "irqpoll" option)
[  373.071822] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.5.0-rc2 #1-NixOS
[  373.071823] Hardware name: LENOVO 20NJCTO1WW/20NJCTO1WW, BIOS R12ET46W(1.16 ) 10/28/2019
[  373.071824] Call Trace:
[  373.071828]  <IRQ>
[  373.071837]  dump_stack+0x66/0x90
[  373.071842]  __report_bad_irq+0x37/0xb1
[  373.071846]  note_interrupt.cold.10+0xa/0x6d
[  373.071849]  handle_irq_event_percpu+0x6a/0x80
[  373.071852]  handle_irq_event+0x3c/0x5c
[  373.071855]  handle_fasteoi_irq+0xa3/0x150
[  373.071859]  do_IRQ+0x51/0xe0
[  373.071862]  common_interrupt+0xf/0xf
[  373.071863]  </IRQ>
[  373.071868] RIP: 0010:cpuidle_enter_state+0xbe/0x3f0
[  373.071872] Code: e8 27 c6 b3 ff 80 7c 24 13 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d5 02 00 00 31 ff e8 f9 d3 b9 ff fb 66 0f 1f 44 00 00 <85> ed 0f 88 42 02 00 00 48 63 c5 4c 8b 3c 24 4c 2b 7c 24 08 48 8d
[  373.071873] RSP: 0018:ffffffff94203e48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffc8
[  373.071877] RAX: ffff98d838a2c300 RBX: ffff98d83677c000 RCX: 000000000000001f
[  373.071878] RDX: 00000056dccfecc1 RSI: 0000000037b5a6f8 RDI: 0000000000000000
[  373.071879] RBP: 0000000000000001 R08: 0000000000000002 R09: 000000000002bb80
[  373.071880] R10: 0000000285c499ea R11: ffff98d838a2b3e4 R12: ffffffff942b9da0
[  373.071881] R13: ffffffff942b9e20 R14: 0000000000000001 R15: 0000000000000001
[  373.071886]  ? cpuidle_enter_state+0x99/0x3f0
[  373.071889]  cpuidle_enter+0x29/0x40
[  373.071894]  do_idle+0x22b/0x260
[  373.071898]  cpu_startup_entry+0x19/0x20
[  373.071901]  start_kernel+0x4e2/0x504
[  373.071906]  secondary_startup_64+0xb6/0xc0
[  373.071908] handlers:
[  373.071916] [<0000000085173049>] amd_gpio_irq_handler [pinctrl_amd]
[  373.071918] Disabling IRQ #7
Comment 47 Alois Nespor 2020-01-07 15:22:26 UTC
I have the same problem: HP desktop M01-F0xxx, Ryzen 5 3400G, no touchscreen:

[    5.799887] irq 7: nobody cared (try booting with the "irqpoll" option)
[    5.799890] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE     5.4.8-zen1-1-zen #1
[    5.799891] Hardware name: HP HP Desktop M01-F0xxx/8643, BIOS F.11 11/26/2019
[    5.799892] Call Trace:
[    5.799894]  <IRQ>
[    5.799898]  dump_stack+0x66/0x90
[    5.799901]  __report_bad_irq+0x35/0xaa
[    5.799902]  note_interrupt.cold+0xb/0x69
[    5.799903]  handle_irq_event+0xa9/0xb0
[    5.799904]  handle_fasteoi_irq+0xcc/0x1e0
[    5.799906]  do_IRQ+0x84/0x140
[    5.799907]  common_interrupt+0xf/0xf
[    5.799908]  </IRQ>
[    5.799910] RIP: 0010:cpuidle_enter_state+0xc4/0xa20
[    5.799911] Code: e8 41 cb 87 ff 80 7c 24 0f 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 06 09 00 00 31 ff e8 13 10 8f ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 86 02 00 00 49 63 cc 4c 2b 6c 24 10 48 8d 04 49 48
[    5.799912] RSP: 0018:ffffafbf40177e50 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9
[    5.799913] RAX: ffff936a509c0000 RBX: ffffffffbb4c1ba0 RCX: 000000000000001f
[    5.799913] RDX: 0000000000000000 RSI: 0000000022a8e515 RDI: 0000000000000000
[    5.799914] RBP: ffff936a481a9800 R08: 0000000159b326b3 R09: 00000001687d2800
[    5.799914] R10: 0000000000000008 R11: 0000000000000008 R12: 0000000000000002
[    5.799915] R13: 0000000159b326b3 R14: 0000000000000002 R15: ffff936a4efe9e40
[    5.799917]  cpuidle_enter+0x29/0x40
[    5.799919]  do_idle+0x202/0x2b0
[    5.799920]  cpu_startup_entry+0x19/0x20
[    5.799922]  start_secondary+0x1c6/0x220
[    5.799923]  secondary_startup_64+0xb6/0xc0
[    5.799924] handlers:
[    5.799927] [<00000000abfe7a71>] amd_gpio_irq_handler [pinctrl_amd]
[    5.799928] Disabling IRQ #7

If I try adding the irqpoll option at boot, my hpet breaks:
hpet: Lost 9601 RTC interrupts

Arch Linux, kernel linux-zen 5.4.8-zen1.
Comment 48 Rui Ventura 2020-01-24 10:24:10 UTC
Ditto, Archlinux kernel 5.4.14-arch1-1, ThinkPad E595 Ryzen 3500U

[   11.749810] irq 7: nobody cared (try booting with the "irqpoll" option)
[   11.749814] CPU: 5 PID: 444 Comm: systemd-journal Not tainted 5.4.14-arch1-1 #1
[   11.749816] Hardware name: LENOVO 20NFCTO1WW/20NFCTO1WW, BIOS R11ET31W (1.11 ) 11/20/2019
[   11.749817] Call Trace:
[   11.749820]  <IRQ>
[   11.749827]  dump_stack+0x66/0x90
[   11.749831]  __report_bad_irq+0x35/0xaa
[   11.749834]  note_interrupt.cold+0xb/0x69
[   11.749836]  handle_irq_event_percpu+0x6f/0x80
[   11.749839]  handle_irq_event+0x37/0x54
[   11.749842]  handle_fasteoi_irq+0xb5/0x160
[   11.749845]  do_IRQ+0x84/0x140
[   11.749848]  common_interrupt+0xf/0xf
[   11.749849]  </IRQ>
[   11.749851] RIP: 0033:0x7f3e23058524
[   11.749854] Code: 48 89 ca 4c 89 c1 45 85 c9 41 0f 9f c0 48 85 f6 74 0e 45 0f b6 c0 47 8d 44 00 04 e9 16 a6 f6 ff 50 e8 a0 04 00 00 f3 0f 1e fa <48> 81 ec d8 00 00 00 41 89 d2 4c 89 c2 4c 89 4c 24 48 84 c0 74 >
[   11.749855] RSP: 002b:00007ffd0eedf6d8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
[   11.749858] RAX: 0000000000000000 RBX: 000000000000001a RCX: 0000000000000030
[   11.749859] RDX: 0000000000000001 RSI: 000000000000002c RDI: 00005637261d73f0
[   11.749860] RBP: 0000000000000000 R08: 00005637254a7008 R09: 000000000000004d
[   11.749861] R10: 0000000000000000 R11: 00007f3e2310ba40 R12: 0000000000000000
[   11.749861] R13: 00007ffd0eedfa40 R14: 00005637261d7360 R15: 00007ffd0eedfd28
[   11.749864] handlers:
[   11.749869] [<000000000dea8798>] amd_gpio_irq_handler [pinctrl_amd]
[   11.749870] Disabling IRQ #7
Comment 49 JerryD 2020-03-06 17:18:35 UTC
ditto 5.6.0-0.rc3.git0.1.fc32.x86_64

This went away for a while and then the 5.5 kernels started showing it again for me.

The irq 7: nobody cared statement is quite true. I have no touchscreen function now either.
Comment 50 YinH 2020-03-11 00:25:01 UTC
Hi There!

Just to advise those that were having this issue with Lenovo ThinkPads, my
T495 in particular had this IRQ#7 issue with 5.5.7-200.fc31.x86_64 and lower.

I updated the BIOS version to 1.19 (for T495) and the issue appears to have now
gone away. I'll continue to observe and report back if the problem returns.

This update also fixed Linux not being able to see the Battery's status (unsure if related to IRQ#7)

Thankfully, Lenovo are releasing stand-alone .ISOs for BIOS updating, just make
sure to disable Secure Boot before booting into it off a USB.

Just quickly checking some of the E series Ryzen ThinkPads they've also had
a BIOS update in the last week or so.

I hope other hardware vendors have released, or will release a fix for this! I'm wondering if AMD have pushed some newer byte-code for these chips?

Cheers,
Comment 51 philipp_023 2020-03-11 16:28:54 UTC
Hi,
For me this issue is still present, even though there is not much impact as far as I know.

Machine:
Lenovo ThinkPad E495, model 20NECTO1WW, ThinkPad BIOS R11ET35W (1.15 ), EC R11HT35W

OS:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=19.10
DISTRIB_CODENAME=eoan

Kernel:
Linux e495 5.5.8-050508-generic #202003051633 SMP Thu Mar 5 16:37:27 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Exception:
[Wed Mar 11 08:39:44 2020] Hardware name: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET35W (1.15 ) 02/19/2020
[Wed Mar 11 08:39:44 2020] Call Trace:
[Wed Mar 11 08:39:44 2020]  <IRQ>
[Wed Mar 11 08:39:44 2020]  dump_stack+0x6d/0x9a
[Wed Mar 11 08:39:44 2020]  __report_bad_irq+0x3a/0xaf
[Wed Mar 11 08:39:44 2020]  note_interrupt.cold+0xb/0x61
[Wed Mar 11 08:39:44 2020]  handle_irq_event_percpu+0x73/0x80
[Wed Mar 11 08:39:44 2020]  handle_irq_event+0x3b/0x5a
[Wed Mar 11 08:39:44 2020]  handle_fasteoi_irq+0x9c/0x150
[Wed Mar 11 08:39:44 2020]  do_IRQ+0x55/0xf0
[Wed Mar 11 08:39:44 2020]  common_interrupt+0xf/0xf
[Wed Mar 11 08:39:44 2020]  </IRQ>
[Wed Mar 11 08:39:44 2020] RIP: 0010:__call_rcu+0xd3/0x1d0
[Wed Mar 11 08:39:44 2020] Code: 0f a3 05 e0 5b 72 01 73 1c 49 8b 94 24 98 00 00 00 48 8b 05 1f ab 54 01 49 03 84 24 b0 00 00 00 48 39 c2 7f 77 4c 89 ef 57 9d <0f> 1f 44 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 d4
[Wed Mar 11 08:39:44 2020] RSP: 0018:ffffa8f2405f7e08 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
[Wed Mar 11 08:39:44 2020] RAX: 0000000000002710 RBX: 000000000002df00 RCX: ffffffffa3ae5b90
[Wed Mar 11 08:39:44 2020] RDX: 0000000000000014 RSI: ffff91bb81d92e00 RDI: 0000000000000246
[Wed Mar 11 08:39:44 2020] RBP: ffffa8f2405f7e40 R08: ffff91bb8d5d3a40 R09: 0000000000000064
[Wed Mar 11 08:39:44 2020] R10: 0000000040000010 R11: ffff91bb8d009068 R12: ffff91bb8f2edf00
[Wed Mar 11 08:39:44 2020] R13: 0000000000000246 R14: ffff91bb8f2edf50 R15: ffff91bb81d92e00
[Wed Mar 11 08:39:44 2020]  ? get_max_files+0x20/0x20
[Wed Mar 11 08:39:44 2020]  ? get_max_files+0x20/0x20
[Wed Mar 11 08:39:44 2020]  call_rcu+0x10/0x20
[Wed Mar 11 08:39:44 2020]  __fput+0x150/0x260
[Wed Mar 11 08:39:44 2020]  ____fput+0xe/0x10
[Wed Mar 11 08:39:44 2020]  task_work_run+0x8f/0xb0
[Wed Mar 11 08:39:44 2020]  exit_to_usermode_loop+0x131/0x160
[Wed Mar 11 08:39:44 2020]  do_syscall_64+0x170/0x1b0
[Wed Mar 11 08:39:44 2020]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Wed Mar 11 08:39:44 2020] RIP: 0033:0x7f2302ff1ab7
[Wed Mar 11 08:39:44 2020] Code: ff ff e8 3c 13 02 00 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 33 5e f8 ff
[Wed Mar 11 08:39:44 2020] RSP: 002b:00007ffdf4a630e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[Wed Mar 11 08:39:44 2020] RAX: 0000000000000000 RBX: 00007f2302a887a0 RCX: 00007f2302ff1ab7
[Wed Mar 11 08:39:44 2020] RDX: 00007ffdf4a63150 RSI: 00007ffdf4a63150 RDI: 0000000000000000
[Wed Mar 11 08:39:44 2020] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000562965a2f720
[Wed Mar 11 08:39:44 2020] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[Wed Mar 11 08:39:44 2020] R13: 0000000000000005 R14: 00007ffdf4a63238 R15: 0000000000000005
[Wed Mar 11 08:39:44 2020] handlers:
[Wed Mar 11 08:39:44 2020] [<000000003328b550>] amd_gpio_irq_handler
[Wed Mar 11 08:39:44 2020] Disabling IRQ #7

Philipp
Comment 52 Thomas Gleixner 2020-03-11 21:22:04 UTC
Philipp,

bugzilla-daemon@bugzilla.kernel.org writes:
> Kernel:
> Linux e495 5.5.8-050508-generic #202003051633 SMP Thu Mar 5 16:37:27 UTC 2020
> x86_64 x86_64 x86_64 GNU/Linux
>
> Exception:
> [Wed Mar 11 08:39:44 2020] Hardware name: LENOVO 20NECTO1WW/20NECTO1WW, BIOS
> R11ET35W (1.15 ) 02/19/2020
> [Wed Mar 11 08:39:44 2020] handlers:
> [Wed Mar 11 08:39:44 2020] [<000000003328b550>] amd_gpio_irq_handler
> [Wed Mar 11 08:39:44 2020] Disabling IRQ #7

Can you please enable CONFIG_DEBUG_FS and provide the output of

    cat /sys/kernel/debug/gpio

Thanks,

        tglx
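
For anyone unsure whether the file requested above is available on their system, here is a minimal Python sketch. It assumes debugfs is mounted at the conventional /sys/kernel/debug location and that the script runs as root; `read_gpio_debug` is a hypothetical helper name, not part of any kernel tooling.

```python
from pathlib import Path

# Path taken from the request above; assumes debugfs is mounted at /sys/kernel/debug.
GPIO_DEBUG = Path("/sys/kernel/debug/gpio")


def read_gpio_debug(path=GPIO_DEBUG):
    """Return the contents of the GPIO debug file, or None if it cannot be read."""
    try:
        return path.read_text(errors="replace")
    except OSError:
        # Covers a missing file (no CONFIG_DEBUG_FS or debugfs not mounted)
        # as well as insufficient permissions (not running as root).
        return None
```

Calling `read_gpio_debug()` and getting `None` back suggests checking CONFIG_DEBUG_FS, the debugfs mount, and permissions before attaching anything to the bug.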
Comment 53 Alois Nespor 2020-03-11 21:34:19 UTC
(In reply to Alois Nespor from comment #47)
> i have same problem, HP desktop M01-F0xxx, Ryzen 5 3400G, no touchscreen:
> 
> [    5.799887] irq 7: nobody cared (try booting with the "irqpoll" option)
> [    5.799890] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE    
> 5.4.8-zen1-1-zen #1
> [    5.799891] Hardware name: HP HP Desktop M01-F0xxx/8643, BIOS F.11
> 11/26/2019
> [    5.799892] Call Trace:
> [    5.799894]  <IRQ>
> [    5.799898]  dump_stack+0x66/0x90
> [    5.799901]  __report_bad_irq+0x35/0xaa
> [    5.799902]  note_interrupt.cold+0xb/0x69
> [    5.799903]  handle_irq_event+0xa9/0xb0
> [    5.799904]  handle_fasteoi_irq+0xcc/0x1e0
> [    5.799906]  do_IRQ+0x84/0x140
> [    5.799907]  common_interrupt+0xf/0xf
> [    5.799908]  </IRQ>
> [    5.799910] RIP: 0010:cpuidle_enter_state+0xc4/0xa20
> [    5.799911] Code: e8 41 cb 87 ff 80 7c 24 0f 00 74 17 9c 58 0f 1f 44 00
> 00 f6 c4 02 0f 85 06 09 00 00 31 ff e8 13 10 8f ff fb 66 0f 1f 44 00 00 <45>
> 85 e4 0f 88 86 02 00 00 49 63 cc 4c 2b 6c 24 10 48 8d 04 49 48
> [    5.799912] RSP: 0018:ffffafbf40177e50 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffffd9
> [    5.799913] RAX: ffff936a509c0000 RBX: ffffffffbb4c1ba0 RCX:
> 000000000000001f
> [    5.799913] RDX: 0000000000000000 RSI: 0000000022a8e515 RDI:
> 0000000000000000
> [    5.799914] RBP: ffff936a481a9800 R08: 0000000159b326b3 R09:
> 00000001687d2800
> [    5.799914] R10: 0000000000000008 R11: 0000000000000008 R12:
> 0000000000000002
> [    5.799915] R13: 0000000159b326b3 R14: 0000000000000002 R15:
> ffff936a4efe9e40
> [    5.799917]  cpuidle_enter+0x29/0x40
> [    5.799919]  do_idle+0x202/0x2b0
> [    5.799920]  cpu_startup_entry+0x19/0x20
> [    5.799922]  start_secondary+0x1c6/0x220
> [    5.799923]  secondary_startup_64+0xb6/0xc0
> [    5.799924] handlers:
> [    5.799927] [<00000000abfe7a71>] amd_gpio_irq_handler [pinctrl_amd]
> [    5.799928] Disabling IRQ #7
> 
> If i try  add irqpoll option for boot, my hpet will be broken: 
> hpet: Lost 9601 RTC interrupts
> 
> Archlinux, kernel linux-zen 5.4.8-zen1,


@Thomas Gleixner,

In case it helps, the output of 'cat /sys/kernel/debug/gpio' on linux-zen 5.5.8 is attached; please see gpio.txt.
Comment 54 Alois Nespor 2020-03-11 21:35:16 UTC
Created attachment 287877 [details]
cat /sys/kernel/debug/gpio - linux-zen 5.5.8
Comment 55 Jose Silva 2020-03-12 00:25:13 UTC
Issue still persists in my Lenovo ThinkPad E595, BIOS (1.15).


I have attached my /sys/kernel/debug/gpio and will run again tomorrow with CONFIG_DEBUG_FS on.

Kernel 5.5.8-arch1-1.
Comment 56 Jose Silva 2020-03-12 00:27:32 UTC
Created attachment 287879 [details]
cat /sys/kernel/gpio result
Comment 57 philipp_023 2020-03-12 08:09:26 UTC
Created attachment 287883 [details]
cat /sys/kernel/debug/gpio

Output of cat /sys/kernel/debug/gpio

Machine:
Hardware name: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET35W (1.15 ) 02/19/2020

OS:
Ubuntu 19.10

Kernel: 
Linux e495 5.5.8-050508-generic #202003051633 SMP Thu Mar 5 16:37:27 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Comment 58 Thomas Gleixner 2020-03-12 15:32:46 UTC
bugzilla-daemon@bugzilla.kernel.org writes:

Thanks for the files!

This starts to be really puzzling.

The files from Alois and Philipp have dozens of interrupts enabled, but
ALL of them are masked, which means they cannot raise an interrupt unless
the masking is broken.

Jose's file has not a single interrupt line enabled, which means that the
interrupt fires for completely different reasons.

I'm trying to get some more detailed information about the inner working
of these chips. Will take a while.

Thanks,

        tglx
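
The enabled-versus-masked comparison described above can be sketched in Python. This is only an illustration under stated assumptions: the marker strings are assumed to match what the pinctrl-amd debug output prints per pin, the `classify_pins` helper is hypothetical, and the sample input is synthetic rather than taken from any attachment in this bug.

```python
# ASSUMPTION: each pin line contains these phrases; adjust if your kernel prints
# the pin state differently.
ENABLED = "interrupt is enabled"
MASKED = "interrupt is masked"


def classify_pins(gpio_text):
    """Count pins by interrupt state in a /sys/kernel/debug/gpio dump."""
    counts = {"enabled_masked": 0, "enabled_unmasked": 0, "disabled": 0}
    for line in gpio_text.splitlines():
        if not line.startswith("pin"):
            continue  # skip bank headers and other non-pin lines
        if ENABLED not in line:
            counts["disabled"] += 1
        elif MASKED in line:
            counts["enabled_masked"] += 1  # enabled, but should not be able to fire
        else:
            counts["enabled_unmasked"] += 1  # the only pins expected to raise IRQs
    return counts


# Synthetic lines for illustration only, not real output from any machine.
SYNTHETIC = "\n".join([
    "pin0\tinterrupt is enabled|\tinterrupt is masked|",
    "pin1\tinterrupt is disabled|\tinterrupt is masked|",
    "pin2\tinterrupt is enabled|\tinterrupt is unmasked|",
])
```

A dump where `enabled_unmasked` is zero while IRQ 7 still fires would match the puzzling situation described above.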
Comment 59 Nelson G 2020-03-27 20:54:46 UTC
Similar issue.  Fedora 31, Linux version 5.5.11.  Thinkpad E495.

[    5.809023] irq 7: nobody cared (try booting with the "irqpoll" option)
[    5.809027] CPU: 4 PID: 31 Comm: ksoftirqd/4 Not tainted 5.5.11-200.fc31.x86_64 #1
[    5.809028] Hardware name: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET35W (1.15 ) 02/19/2020
[    5.809029] Call Trace:
[    5.809033]  <IRQ>
[    5.809044]  dump_stack+0x66/0x90
[    5.809051]  __report_bad_irq+0x35/0xa7
[    5.809053]  note_interrupt.cold+0xb/0x63
[    5.809056]  handle_irq_event_percpu+0x6f/0x80
[    5.809058]  handle_irq_event+0x36/0x53
[    5.809060]  handle_fasteoi_irq+0x8b/0x130
[    5.809063]  do_IRQ+0x50/0xe0
[    5.809067]  common_interrupt+0xf/0xf
[    5.809069]  </IRQ>
[    5.809072] RIP: 0010:finish_task_switch+0x80/0x2a0
[    5.809074] Code: 8b 1c 25 c0 8b 01 00 0f 1f 44 00 00 0f 1f 44 00 00 41 c7 45 38 00 00 00 00 4c 89 e7 c6 07 00 0f 1f 40 00 fb 66 0f 1f 44 00 00 <65> 48 8b 04 25 c0 8b 01 00 0f 1f 44 00 00 4d 85 f6 74 21 65 48 8b
[    5.809076] RSP: 0018:ffffbb45c025be40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdb
[    5.809078] RAX: ffff9efd68482680 RBX: ffff9efd7260a680 RCX: 0000000000000000
[    5.809079] RDX: 0000000002068000 RSI: 0000000000000000 RDI: ffff9efd74b2ae00
[    5.809080] RBP: ffffbb45c025be68 R08: ffff9efd68482730 R09: ffff9efd68482730
[    5.809080] R10: 00000000000000bb R11: ffff9efd74b2aeb8 R12: ffff9efd74b2ae00
[    5.809081] R13: ffff9efd68482680 R14: 0000000000000000 R15: 0000000000000000
[    5.809087]  __schedule+0x2cf/0x740
[    5.809089]  schedule+0x4a/0xb0
[    5.809093]  smpboot_thread_fn+0x10b/0x160
[    5.809097]  kthread+0xf9/0x130
[    5.809099]  ? sort_range+0x20/0x20
[    5.809100]  ? kthread_park+0x90/0x90
[    5.809102]  ret_from_fork+0x22/0x40
[    5.809104] handlers:
[    5.809109] [<00000000463020c1>] amd_gpio_irq_handler [pinctrl_amd]
[    5.809110] Disabling IRQ #7
Comment 60 Nelson G 2020-03-27 21:10:14 UTC
Created attachment 288097 [details]
cat /sys/kernel/debug/gpio.  Cuz why not
Comment 61 damkrat 2020-04-21 20:03:16 UTC
Same problem here with IRQ 7 on a Lenovo E495 with BIOS 1.15, running CentOS 8 Stream, Linux 5.6.3-1.el8.elrepo.x86_64.

Booting with the noapic kernel option makes the "IRQ 7" warning disappear.

I haven't noticed any issues with the system; it seems to run the same as without this tweak.
Comment 62 Jan Sordid 2020-05-03 12:28:29 UTC
To all here who also have an E495, BIOS 1.15 was removed from Lenovo's pages some time ago, but there is now a newer 1.16 available.
https://pcsupport.lenovo.com/us/de/products/laptops-and-netbooks/thinkpad-edge-laptops/thinkpad-e495-type-20ne/downloads/DS539418
If someone feels like trying this one out, please ping me if you had success.

I'm still on BIOS 1.10 for now, since I have no real problems and the laptop really needs to continue working until I finished my thesis ;)

Just for reference
- E495 with BIOS 1.10
- IRQ 7 nobody cared
- Never booted a Windows, only Linux (Xubuntu 19.10)
- Kernel 5.3.0-51-generic #44-Ubuntu SMP
Comment 63 Nelson G 2020-05-03 18:05:34 UTC
(In reply to Jan Sordid from comment #62)
> If someone feels like trying this one out, please ping me if you had success.

[    4.985889] irq 7: nobody cared (try booting with the "irqpoll" option)
[    4.985894] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.6.8-200.fc31.x86_64 #1
[    4.985895] Hardware name: LENOVO 20NECTO1WW/20NECTO1WW, BIOS R11ET36W (1.16 ) 03/30/2020
[    4.985896] Call Trace:
[    4.985900]  <IRQ>
[    4.985911]  dump_stack+0x66/0x90
[    4.985917]  __report_bad_irq+0x35/0xa7
[    4.985920]  note_interrupt.cold+0xb/0x63
[    4.985923]  handle_irq_event_percpu+0x4f/0x60
[    4.985925]  handle_irq_event+0x36/0x53
[    4.985927]  handle_fasteoi_irq+0x8b/0x130
[    4.985931]  do_IRQ+0x50/0xe0
[    4.985934]  common_interrupt+0xf/0xf
[    4.985935]  </IRQ>
[    4.985941] RIP: 0010:cpuidle_enter_state+0xc9/0x3e0
[    4.985943] Code: e8 3c 2d 91 ff 80 7c 24 0f 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 ea 02 00 00 31 ff e8 3e 5f 97 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 40 02 00 00 49 63 d5 4c 2b 64 24 10 48 8d 04 52 48
[    4.985944] RSP: 0018:ffffad9c4015fe78 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdb
[    4.985947] RAX: ffff9258f4b2ae80 RBX: ffff9258ea173400 RCX: 00000001292e3489
[    4.985948] RDX: 00000001293d76b5 RSI: 00000001292e3489 RDI: 0000000000000000
[    4.985949] RBP: ffffffff90769f40 R08: 00000001292e349d R09: 000000007fffffff
[    4.985950] R10: 0000000000000001 R11: ffff9258f4b29ca4 R12: 00000001292e349d
[    4.985950] R13: 0000000000000002 R14: 0000000000000002 R15: ffff9258f2ff0000
[    4.985955]  ? cpuidle_enter_state+0xa4/0x3e0
[    4.985957]  cpuidle_enter+0x29/0x40
[    4.985961]  do_idle+0x1c0/0x260
[    4.985964]  cpu_startup_entry+0x19/0x20
[    4.985967]  start_secondary+0x152/0x190
[    4.985972]  secondary_startup_64+0xb6/0xc0
[    4.985974] handlers:
[    4.985979] [<00000000cf504361>] amd_gpio_irq_handler [pinctrl_amd]
[    4.985980] Disabling IRQ #7



Nothing new.
Comment 64 Alberto 2020-05-21 23:55:44 UTC
I have this problem too on a Lenovo ThinkPad E595 with Linux 5.6, using both Manjaro and Arch kernels. It does not appear to cause any problems, but I am not sure.
Comment 65 Alberto 2020-05-21 23:57:01 UTC
Created attachment 289205 [details]
dmseg with linux 5.6ck
Comment 66 derectus 2020-06-06 01:58:45 UTC
Created attachment 289537 [details]
dmesg
Comment 67 derectus 2020-06-06 02:11:45 UTC
Created attachment 289539 [details]
interrupts
Comment 68 derectus 2020-06-06 02:12:35 UTC
Hi,
For me this issue is still present. I use a ThinkPad E595 with an AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx.


Machine:
Lenovo ThinkPad E595, model 20NF001QTX, ThinkPad BIOS  R11ET36W (1.16 )

Kernel:
5.6.15-arch1-1 #1 SMP PREEMPT Wed, 27 May 2020 23:42:26 +0000 x86_64 GNU/Linux


The dmesg errors look like this:

[    1.048960] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
[    3.693667] snd_pci_acp3x 0000:05:00.5: Invalid ACP audio mode : 2
[    3.814530] tpm tpm0: tpm_try_transmit: send(): error -5
[    3.814532] tpm tpm0: [Firmware Bug]: TPM interrupt not working, polling instead
[    5.117604] irq 7: nobody cared (try booting with the "irqpoll" option)
[    5.117669] handlers:
[    5.117673] [<000000009876dd1a>] amd_gpio_irq_handler [pinctrl_amd]

and also at emerg level:

[    5.117674] Disabling IRQ #7

I have attached full dmesg and interrupts output above.
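
Since most reports in this thread boil down to the same few lines, here is a rough Python sketch that pulls the IRQ number and the registered handlers out of a dmesg dump. The function name `summarize_spurious_irq` is hypothetical; the sample text is the excerpt quoted in this comment.

```python
import re

NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
# Matches e.g. "[<000000009876dd1a>] amd_gpio_irq_handler [pinctrl_amd]";
# the trailing module name is optional because some reports omit it.
HANDLER = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)(?:\s+\[(\w+)\])?")


def summarize_spurious_irq(dmesg_text):
    """Return (irq, [(handler, module_or_None), ...]) for the last report found."""
    irq = None
    handlers = []
    in_handlers = False
    for line in dmesg_text.splitlines():
        match = NOBODY_CARED.search(line)
        if match:
            irq, handlers, in_handlers = int(match.group(1)), [], False
            continue
        if irq is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            handler = HANDLER.search(line)
            if handler:
                handlers.append((handler.group(1), handler.group(2)))
            else:
                in_handlers = False  # e.g. the "Disabling IRQ #7" line ends the list
    return irq, handlers


SAMPLE = """\
[    5.117604] irq 7: nobody cared (try booting with the "irqpoll" option)
[    5.117669] handlers:
[    5.117673] [<000000009876dd1a>] amd_gpio_irq_handler [pinctrl_amd]
[    5.117674] Disabling IRQ #7
"""
```

Feeding it the output of `dmesg` from the affected boot gives a compact summary that is easier to compare across machines than the full trace.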
Comment 69 Flo 2020-06-06 17:57:22 UTC
Hi all,
Just to add my 2 cents...
I basically have the same system as Alois from comment 47 just with a Ryzen-3 instead of Ryzen-5 (I'll attach a dmesg).
This is an HP desktop system, not a notebook, and hence does not feature a touchscreen. The "description" of HP's custom mainboard Erica2 is here:
https://support.hp.com/us-en/product/hp-desktop-pc-m01-f0000i/29014486/model/31450166/document/c06418906
To describe the system a little:
Most of the hardware is onboard.
These are AMD systems with an AMD Promontory B550A chipset (which seems to be a rebranded 450 series chipset). The system has 4 back and 4 front USB ports and an SD card reader. Wi-Fi, Bluetooth and Ethernet are onboard. The only other connectors are a VGA port, an HDMI port, an audio-out and a mic-in. Apart from the power button there are no other external LEDs.

I get the kernel crash that is described here (see dmesg) during boot, but the system seems to work normally afterwards. If I use the irqpoll option as advised, the desktop system becomes unusable: YouTube videos hang, LibreOffice misses key presses and scrolling lags, all due to very high (100%) Xorg CPU usage.

Given the price point of this system, I would not count on HP releasing any BIOS update that helps in that regard...

Feel free to ask for any help I can do with the stock OpenSUSE kernel, but booting different ones is not quite an option since I want to put the system into productive use.
Comment 70 Flo 2020-06-06 18:02:19 UTC
Created attachment 289547 [details]
dmesg output of HP M01-F0001 System

The dmesg output of Kernel 5.3.18 from OpenSuse 15.2 on an HP M01-F0001 System.
Comment 71 Flo 2020-06-08 00:07:18 UTC
Created attachment 289565 [details]
Part of the Mainboard  "Erica2" from HP

Part of the Mainboard  "Erica2" from HP. Note that it has a large SPI field of Pins.
Comment 72 Flo 2020-06-08 00:09:21 UTC
I added a shot of the mainboard. Maybe it's interesting that it has a large Pin field called SPI - Debug.
Comment 73 Bram Coenen 2020-07-20 11:51:02 UTC
Hello everyone,

It's been a while since I used Fedora and newer kernel versions, but it seems like the "irq 7: nobody cared" is gone from my dmesg output! Has anyone else noticed the same thing?

[bram@loki ~]$ dmesg | grep irq
[    0.067335] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.067337] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[    0.146820] NR_IRQS: 524544, nr_irqs: 1000, preallocated irqs: 16
[    1.312580] AMD0020:00: ttyS4 at MMIO 0xfedc6000 (irq = 10, base_baud = 3000000) is a 16550A
[    1.314985] ata1: SATA max UDMA/133 abar m1024@0xf0d6c000 port 0xf0d6c100 irq 19
[    1.315627] ehci-pci 0000:00:12.0: irq 18, io mem 0xf0d6d000
[    1.330943] i8042: PNP: PS/2 Controller [PNP0303:KBD0,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
[    1.336886] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.336890] serio: i8042 AUX port at 0x60,0x64 irq 12
[    1.337381] rtc_cmos 00:01: alarms up to one month, 114 bytes nvram, hpet irqs


Kernel version:
5.7.8-200.fc32.x86_64 

I'll see if it shows up again after a few shutdowns ;)
Comment 74 Alberto 2020-08-02 01:31:09 UTC
I have to test it in my lenovo e595 to see if it is fixed or not.
Comment 75 Rui Ventura 2020-08-02 01:39:01 UTC
I can further comment on the issue still being present for the Lenovo e595. A recent boot using 5.7.10 still prompted me with the same 'disabling IRQ#7' message.
Comment 76 Martin Jørgensen 2020-10-24 12:02:01 UTC
*** Bug 208469 has been marked as a duplicate of this bug. ***
Comment 77 Martin Jørgensen 2020-10-24 12:04:18 UTC
I'm still experiencing this on my ThinkPad X395 running Debian Testing, with a maybe slightly different dmesg error:

[ 3382.895489] PM: suspend exit
[ 3382.921981] irq 7: nobody cared (try booting with the "irqpoll" option)
[ 3382.921985] CPU: 0 PID: 9977 Comm: i3bar Tainted: G           OE     5.7.0-1-amd64 #1 Debian 5.7.6-1
[ 3382.921986] Hardware name: LENOVO 20NL001RMX/20NL001RMX, BIOS R13ET44W(1.18 ) 05/08/2020
[ 3382.921987] Call Trace:
[ 3382.921990]  <IRQ>
[ 3382.921997]  dump_stack+0x66/0x90
[ 3382.922001]  __report_bad_irq+0x38/0xad
[ 3382.922003]  note_interrupt.cold+0xb/0x6e
[ 3382.922005]  handle_irq_event_percpu+0x72/0x80
[ 3382.922006]  handle_irq_event+0x3c/0x5c
[ 3382.922008]  handle_fasteoi_irq+0xa3/0x160
[ 3382.922013]  do_IRQ+0x53/0xe0
[ 3382.922015]  common_interrupt+0xf/0xf
[ 3382.922016]  </IRQ>
[ 3382.922018] RIP: 0033:0x7f663e593401
[ 3382.922020] Code: 00 b8 ff ff ff ff c3 48 83 ec 08 bf 10 00 00 00 e8 e4 09 ff ff b8 ff ff ff ff 48 83 c4 08 c3 66 2e 0f 1f 84 00 00 00 00 00 55 <53> 48 83 ec 18 83 fe 05 77 3d 41 f6 c0 03 75 4f 44 89 44 24 04 81
[ 3382.922021] RSP: 002b:00007ffe3c655da0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffd9
[ 3382.922023] RAX: 00007f663e616608 RBX: 00007ffe3c655eb8 RCX: 000000000000000c
[ 3382.922024] RDX: 0000000000000008 RSI: 0000000000000002 RDI: 000055e7513fde30
[ 3382.922024] RBP: 000055e7513fde30 R08: 0000000000000008 R09: 000055e7513fde90
[ 3382.922025] R10: 000000000000000c R11: 0000000000000008 R12: 00007ffe3c655f60
[ 3382.922025] R13: 0000000000000002 R14: 000000000000000c R15: 0000000000000008
[ 3382.922028] handlers:
[ 3382.922030] [<00000000af5d2529>] amd_gpio_irq_handler
[ 3382.922031] Disabling IRQ #7
....
Comment 78 Joakim R 2021-01-15 13:40:30 UTC
I have been tackling this same problem on my Thinkpad E595 for at least a year, trying a plethora of boot options. The noapic setting worked well for some kernel versions, but lately it has messed with the ability to resume after suspension. The irqpoll option gives a constant 100% load on one of the CPU threads.

However, when using fwupd I noticed that "capsule updates" were disabled, which in the BIOS settings corresponds to allowing "Windows UEFI update". After enabling that setting, my IRQ 7 problem has disappeared.
 

Hopefully this is of use to someone else.
Comment 79 Hans de Goede 2021-01-15 14:02:27 UTC
(In reply to Joakim R from comment #78)
> I have been tackling this same problem on my Thinkpad E595 for at least a
> year, trying a plethora of boot options. The noapic setting worked well for
> some kernel versions, but lately it has messed with the ability to resume
> after suspension. The irqpoll option gives a constant 100% load on one of
> the CPU threads.
> 
> However, when using fwupd I noticed that "capsule updates" were disabled,
> which in the BIOS settings corresponds to allowing "Windows UEFI update".
> After enabling that setting, my IRQ 7 problem has disappeared.
>  
> 
> Hopefully this is of use to someone else.

Interesting, did you perhaps also boot a newer kernel at the same time? Either a newer "z" release (as in kernel version expressed as x.y.z) or maybe a 5.11-rc kernel? There have been some recent AMD GPIO interrupt handling changes which might be related.

If you did also boot a new kernel and have an older kernel still installed it would be interesting to know if the problem is also gone with the older kernel.
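
When comparing kernels by their x.y.z version as suggested above, plain string comparison is misleading (for example, "5.9.16" sorts after "5.10.6" as text). A minimal Python sketch, assuming release strings shaped like the ones quoted in this thread; `kernel_version` is a hypothetical helper name:

```python
import re

# Leading "x.y" or "x.y.z"; distro suffixes such as "-arch1-1" or "-200.fc31.x86_64"
# are ignored.
_RELEASE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?")


def kernel_version(release):
    """Parse a kernel release string into a comparable (x, y, z) tuple."""
    match = _RELEASE.match(release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    x, y, z = match.groups()
    return int(x), int(y), int(z or 0)  # a missing z (e.g. "5.11-rc1") counts as 0
```

With this, `kernel_version("5.9.16") < kernel_version("5.10.6")` holds, which makes it easier to say which of two installed kernels is actually the older one.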
Comment 80 Hans de Goede 2021-01-15 14:03:24 UTC
p.s. You want "capsule updates" to be enabled anyway, since these are used by lvfs / fwupd; and in general it is best to change as few BIOS options as possible, since typically only the default settings are properly tested.
Comment 81 Joakim R 2021-01-15 14:10:00 UTC
(In reply to Hans de Goede from comment #79)
> (In reply to Joakim R from comment #78)
> > I have been tackling this same problem on my Thinkpad E595 for at least a
> > year, trying a plethora of boot options. The noapic setting worked well for
> > some kernel versions, but lately it has messed with the ability to resume
> > after suspension. The irqpoll option gives a constant 100% load on one of
> > the CPU threads.
> > 
> > However, when using fwupd I noticed that "capsule updates" were disabled,
> > which in the BIOS settings corresponds to allowing "Windows UEFI update".
> > After enabling that setting, my IRQ 7 problem has disappeared.
> >  
> > 
> > Hopefully this is of use to someone else.
> 
> Interesting, did you perhaps also boot a newer kernel at the same time?
> Either a newer "z" release (as in kernel version expresses as x.y.z) or
> maybe a 5.11-rc kernel? There have been some recent AMD GPIO interrupt
> handling changes which might be related.
> 
> If you did also boot a new kernel and have an older kernel still installed
> it would be interesting to know if the problem is also gone with the older
> kernel.

I have been testing this with kernel 5.9.16 and 5.10.6. Initially I started to fool around with all this again because suspend wasn't working properly after going to 5.10. I have currently booted from kernel 5.10.6 and can verify that suspend is now working as it should, as well as no "irq 7 nobody cared" error.
Comment 82 Joakim R 2021-01-15 14:26:44 UTC
(In reply to Joakim R from comment #81)
> (In reply to Hans de Goede from comment #79)
> > (In reply to Joakim R from comment #78)
> > > I have been tackling this same problem on my Thinkpad E595 for at least a
> > > > year, trying a plethora of boot options. The noapic setting worked well for
> > > some kernel versions, but lately it has messed with the ability to resume
> > > after suspension. The irqpoll option gives a constant 100% load on one of
> > > the CPU threads.
> > > 
> > > However, when using fwupd I noticed that "capsule updates" were disabled,
> > > which in the BIOS settings corresponds to allowing "Windows UEFI update".
> > > After enabling that setting, my IRQ 7 problem has disappeared.
> > >  
> > > 
> > > Hopefully this is of use to someone else.
> > 
> > Interesting, did you perhaps also boot a newer kernel at the same time?
> > Either a newer "z" release (as in kernel version expresses as x.y.z) or
> > maybe a 5.11-rc kernel? There have been some recent AMD GPIO interrupt
> > handling changes which might be related.
> > 
> > If you did also boot a new kernel and have an older kernel still installed
> > it would be interesting to know if the problem is also gone with the older
> > kernel.
> 
> I have been testing this with kernel 5.9.16 and 5.10.6. Initially I started
> to fool around with all this again because suspend wasn't working properly
> after going to 5.10. I have currently booted from kernel 5.10.6 and can
> verify that suspend is now working as it should, as well as no "irq 7 nobody
> cared" error.

It seems like I may have spoken too soon. The suspend error is still there on kernel 5.10, so it seems to be intermittent. Also, dmesg still shows the irq 7 error, but abrt no longer reports it, which led me to believe it was gone. So I guess this wasn't very helpful after all.
Comment 83 Martin Jørgensen 2021-01-15 14:41:28 UTC
The same IRQ 7 error still shows up on my ThinkPad X395, even with the latest 5.10 kernel and BIOS updates.
Comment 84 Bill Reyolds 2021-05-19 21:15:53 UTC
My brand new AMD 4700G CPU with included Radeon GPU shows this too.

[   35.466776] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P           O      5.12.5-pclos1 #1
[   35.466778] Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.03 09/23/2020
[   35.466779] Call Trace:
[   35.466781]  <IRQ>
[   35.466782]  dump_stack+0x64/0x7c
[   35.466787]  __report_bad_irq+0x35/0xaa
[   35.466789]  note_interrupt.cold+0xb/0x64
[   35.466791]  handle_irq_event+0xa0/0xb0
[   35.466793]  handle_fasteoi_irq+0x7f/0x1d0
[   35.466795]  __common_interrupt+0x3e/0xa0
[   35.466797]  common_interrupt+0x7e/0xa0
[   35.466800]  </IRQ>
[   35.466800]  asm_common_interrupt+0x1e/0x40
[   35.466803] RIP: 0010:cpuidle_reflect+0x10/0x20
[   35.466805] Code: fc ff ff 48 c7 43 08 00 00 00 00 5b 5d 41 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 05 ac a7 40 01 48 8b 40 40 <48> 85 c0 74 09 85 f6 78 05 e9 e2 4b 46 00 c3 90 0f 1f 44 00 00 48
[   35.466807] RSP: 0018:ffffa6bcc017ff00 EFLAGS: 00000292
[   35.466808] RAX: ffffffffae79ef40 RBX: 0000000000000003 RCX: 0000000000000002
[   35.466809] RDX: ffff8ebc831a00c0 RSI: 0000000000000003 RDI: ffff8ebc831a0000
[   35.466810] RBP: ffff8ebc80893c80 R08: 0000000841f946f4 R09: 0000000000000008
[   35.466811] R10: 0000000000000003 R11: 0000000000000002 R12: ffffffffaf53c460
[   35.466811] R13: ffff8ebc831a0000 R14: 0000000000000003 R15: 0000000000000000
[   35.466812]  ? ladder_select_state+0x1a0/0x1a0
[   35.466814]  do_idle+0x1ed/0x290
[   35.466817]  cpu_startup_entry+0x19/0x20
[   35.466818]  secondary_startup_64_no_verify+0xb0/0xbb
[   35.466820] handlers:
[   35.466821] [<00000000466e3e82>] amd_gpio_irq_handler
[   35.466824] Disabling IRQ #7
Comment 85 Thomas Gleixner 2021-05-19 21:42:12 UTC
On Wed, May 19 2021 at 21:15, bugzilla-daemon wrote:
> --- Comment #84 from Bill Reyolds (texstar@gmail.com) ---
> My brand new AMD 4700G GPU with included Radeon GPU.
>
> [   35.466820] handlers:
> [   35.466821] [<00000000466e3e82>] amd_gpio_irq_handler
> [   35.466824] Disabling IRQ #7

Can you please add 'apic=verbose' to the kernel command line and provide
the full output of dmesg?

Thanks,

        tglx
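
Before collecting dmesg it can help to confirm which of the options discussed in this bug (irqpoll, noapic, apic=verbose) were actually active for the current boot. A minimal Python sketch; `parse_cmdline` and `watched_options` are hypothetical helper names, and the example command line in the usage note is made up for illustration.

```python
import shlex

# Options mentioned in this bug report.
WATCHED = ("irqpoll", "noapic", "apic")


def parse_cmdline(cmdline):
    """Split a kernel command line into {option: value or None}."""
    opts = {}
    # shlex keeps quoted values such as dyndbg="module pinctrl_amd +p" together.
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        opts[key] = value if sep else None
    return opts


def watched_options(cmdline):
    """Return only the options relevant to this bug that are present."""
    opts = parse_cmdline(cmdline)
    return {key: opts[key] for key in WATCHED if key in opts}
```

On a running system the input would be the contents of /proc/cmdline; for a hypothetical line such as `root=/dev/sda1 apic=verbose irqpoll` the result contains `apic` with value `verbose` and `irqpoll` with no value.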
Comment 86 Bill Reyolds 2021-05-19 22:30:51 UTC
Created attachment 296887 [details]
dmesg with apic=verbose kernel 5.12.5
Comment 87 Nelson G 2021-05-23 18:43:51 UTC
Created attachment 296959 [details]
dmesg apic=verbose,  ryzen 3500u, thinkpad e495

I'd like to share mine too.  Hope it's useful.
Comment 88 Bill Reyolds 2021-05-28 20:36:49 UTC
Kernel 5.12.8. I doubt I will have much success in getting HP to update their locked-down BIOS to fix this.

[   35.884729] irq 7: nobody cared (try booting with the "irqpoll" option)
[   35.884734] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P S O 5.12.8-pclos1 #1
[   35.884736] Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.03 09/23/2020
[   35.884737] Call Trace:
[   35.884739]  <IRQ>
[   35.884741]  dump_stack+0x64/0x7c
[   35.884745]  __report_bad_irq+0x35/0xaa
[   35.884747]  note_interrupt.cold+0xb/0x64
[   35.884749]  handle_irq_event+0xa0/0xb0
[   35.884752]  handle_fasteoi_irq+0x7f/0x1d0
[   35.884754]  __common_interrupt+0x3e/0xa0
[   35.884756]  common_interrupt+0x7e/0xa0
[   35.884758]  </IRQ>
[   35.884759]  asm_common_interrupt+0x1e/0x40
[   35.884761] RIP: 0010:do_idle+0x65/0x290
[   35.884764] Code: 48 8b 45 00 89 db a8 08 74 28 e9 e9 00 00 00 e8 c1 24 06 00 e8 9c c6 91 00 e8 67 ff ff ff 65 48 8b 04 25 00 6d 01 00 48 8b 00 <a8> 08 0f 85 c6 00 00 00 0f ae e8 fa 48 0f a3 1d 67 23 51 01 0f 83
[   35.884765] RSP: 0018:ffffab1c8017ff08 EFLAGS: 00000202
[   35.884767] RAX: 0000000000204000 RBX: 0000000000000003 RCX: 0000000000000002
[   35.884768] RDX: 0000000000000007 RSI: 0000000000000003 RDI: ffff9c1a438bdc00
[   35.884768] RBP: ffff9c1a40895ac0 R08: 0000000000000000 R09: 0000000000000008
[   35.884769] R10: 0000000000000003 R11: 0000000000000003 R12: ffffffffa553c480
[   35.884770] R13: ffff9c1a438bdc00 R14: 0000000000000003 R15: 0000000000000000
[   35.884771]  ? do_idle+0x59/0x290
[   35.884773]  cpu_startup_entry+0x19/0x20
[   35.884775]  secondary_startup_64_no_verify+0xb0/0xbb
[   35.884777] handlers:
[   35.884777] [<00000000bbf5ab51>] amd_gpio_irq_handler
[   35.884781] Disabling IRQ #7
Comment 89 Mario Limonciello 2021-07-23 01:50:00 UTC
I don't know if this is actually helpful to the situation, but I noticed recently that the DSDT for SMB00001 includes IRQ7.

IOW commit 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa may be part of the reason that IRQ7 isn't serviced by anything.  It might be useful however as a data point for someone involved here if reverting helps.  As there is a lot of history behind that though, I don't think it can simply be reverted without causing problems for a number of older machines.
Comment 90 Paul Richards 2021-08-24 19:55:21 UTC
Another +1 from me.  Linux 5.13.12-200.fc34.x86_64 on a Lenovo E595 (Ryzen 3500U).


[    1.830326] irq 7: nobody cared (try booting with the "irqpoll" option)
[    1.830328] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.13.12-200.fc34.x86_64 #1
[    1.830330] Hardware name: LENOVO 20NFCTO1WW/20NFCTO1WW, BIOS R11ET40W (1.20 ) 11/17/2020
[    1.830331] Call Trace:
[    1.830334]  <IRQ>
[    1.830336]  dump_stack+0x76/0x94
[    1.830341]  __report_bad_irq+0x35/0xa7
[    1.830344]  note_interrupt.cold+0xb/0x61
[    1.830346]  handle_irq_event+0x88/0x90
[    1.830350]  handle_fasteoi_irq+0x78/0x1c0
[    1.830352]  __common_interrupt+0x3e/0xa0
[    1.830355]  common_interrupt+0x7e/0xa0
[    1.830359]  </IRQ>
[    1.830359]  asm_common_interrupt+0x1e/0x40
[    1.830363] RIP: 0010:note_page+0x66/0x660
[    1.830366] Code: 85 c9 74 08 48 63 c2 4c 8b 64 c7 30 8b 43 18 48 8b 53 20 48 8b 4b 28 83 f8 ff 0f 84 a4 02 00 00 48 39 d5 0f 95 c2 3b 44 24 04 <0f> 95 c0 08 c2 75 09 49 39 cc 0f 84 65 03 00 00 80 7b 71 00 74 2a
[    1.830367] RSP: 0018:ffffb21d00067cb8 EFLAGS: 00000246
[    1.830369] RAX: 0000000000000004 RBX: ffffb21d00067ea8 RCX: 0000000000000000
[    1.830371] RDX: 0000000000000001 RSI: ffffff64261a8000 RDI: ffffb21d00067ea8
[    1.830372] RBP: 8000000000000161 R08: ffffffff8407dda0 R09: 0000000000000000
[    1.830372] R10: ffff880000000000 R11: ffffffff85d468c8 R12: 8000000000000000
[    1.830373] R13: ffffff64261a8000 R14: ffffff64261a8000 R15: 0000000000000000
[    1.830374]  ? ptdump_walk_pgd_level_debugfs+0x40/0x40
[    1.830377]  ? hugetlb_get_unmapped_area+0x300/0x300
[    1.830378]  ptdump_pte_entry+0x57/0x60
[    1.830382]  __walk_page_range+0xb85/0xc70
[    1.830386]  walk_page_range_novma+0x57/0x70
[    1.830388]  ptdump_walk_pgd+0x48/0xb0
[    1.830390]  ptdump_walk_pgd_level_core+0xb3/0xd0
[    1.830391]  ? ptdump_walk_pgd_level_debugfs+0x40/0x40
[    1.830392]  ? hugetlb_get_unmapped_area+0x300/0x300
[    1.830394]  ? rest_init+0xb4/0xb4
[    1.830395]  ? rest_init+0xb4/0xb4
[    1.830397]  kernel_init+0x36/0x11c
[    1.830398]  ret_from_fork+0x22/0x30
[    1.830402] handlers:
[    1.830402] [<00000000e88fc910>] amd_gpio_irq_handler
[    1.830406] Disabling IRQ #7
Comment 91 Paul Richards 2021-09-10 09:41:52 UTC
I've done some testing with respect to commit 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa, and confirm that reverting it fixes the "irq 7: nobody cared" message for me.  I compared the current Fedora 34 kernel (v5.13.14), against a build with the commit reverted.

If others would like to test, my kernel build is available here (use at your own risk): http://static.pauldoo.com/kernel/kernel-5.13.14-0.paultest1.fc34.x86_64.tar.gz

After superficial testing the laptop (Lenovo E595) works normally with this commit reverted and there are no new errors appearing in dmesg output.
Comment 92 Mario Limonciello (AMD) 2021-09-10 15:18:28 UTC
Created attachment 298741 [details]
Attempt to work between rock and a hard place

Paul - thanks for trying that.

Looking at this some today, the workaround from commit 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa to fix the touchscreen on the BIOS using both legacy IRQ and extended IRQ notation has non-obvious side effects.  In addition to the IRQ7 nobody cared, I would suspect this is the reason that touchpads that support SMBUS aren't binding on a number of laptops.

Reverting it will cause the breakage from https://bugzilla.kernel.org/show_bug.cgi?id=198715 to return, but leaving it in place causes this bug.  As it's stuck in a very subtle workaround for very old BIOS from over 13 years ago, I have a thought on how we can avoid it.  Can some folks affected by this please try the attached patch?  If this helps I'll send it out to the mailing lists for further feedback.
Comment 93 Paul Richards 2021-09-10 19:40:44 UTC
I built a kernel using attachment 298741 [details] from comment 92, and unfortunately the "irq 7: nobody cared" message is present again.

Here is my build should anyone want to test: http://static.pauldoo.com/kernel/kernel-5.13.14-0.paultest2.fc34.x86_64.tar.gz
Comment 94 Mario Limonciello (AMD) 2021-09-10 19:52:34 UTC
That's... surprising considering the patch has the revert of 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa which worked for you.  Are you sure it helped and you're sure you ran the right test kernel?

I did see the patch included in your source rpm (within patch-5.13-redhat.patch).

Assuming it's all built up right and you tested the right thing, can you please share your dmesg output with that patch in place and an acpidump?
Comment 95 Paul Richards 2021-09-10 20:28:04 UTC
I am new to the workflows of building kernels, so if the result is suspicious it’s very likely I messed up.

I’ll do a rebuild and update again in a day or two.
Comment 96 Nelson G 2021-09-10 21:48:47 UTC
Created attachment 298745 [details]
dmesg after paultest1 build

Hello, I tried the 5.13.14-0.paultest1.fc34.x86_64 rpm packages, and the irq 7 message is gone.

Thinkpad E495 ryzen 3500u

I also installed the paultest2 rpm packages, but with them the "irq 7: nobody cared" message is present.

NOW.  Just to be clear: I don't have a touchscreen on my ThinkPad ;)



Unrelated: hopefully Lenovo soon fixes our HPET stuck CPUs and patches the AMD memory clock being stuck at 100%.
Comment 97 Mario Limonciello (AMD) 2021-09-10 21:50:58 UTC
@Neil,

I see in your dmesg this:
>[    0.406508] ACPI: IRQ 7 override to edge, high
which is what I expect from the first test (revert 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa)

Can you please share your dmesg from the second test kernel and an acpidump please?
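
For anyone comparing test kernels, here is a minimal sketch for pulling the relevant lines out of a dmesg log. It assumes the log text is piped in on stdin; the function name `irq7_lines` is hypothetical and not part of any existing tool. The patterns are the messages quoted in this bug.

```shell
#!/bin/sh
# Hypothetical helper: print only the IRQ 7 related lines from dmesg text
# read on stdin (the override notice, the warning, and the disable message).
irq7_lines() {
    grep -E 'ACPI: IRQ 7 override|irq 7: nobody cared|Disabling IRQ #7'
}

# Usage (reading the kernel log may require root on some distributions):
#   dmesg | irq7_lines
```

If the override line is present and the warning is absent, that matches the result described above for the first test kernel.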
Comment 98 Nelson G 2021-09-10 23:15:51 UTC
Created attachment 298747 [details]
dmesg paultest2 build
Comment 99 Nelson G 2021-09-10 23:18:24 UTC
Created attachment 298749 [details]
acpidump kernel paultest2

There you go, @Mario.
Later if needed I can share the acpidump from the first build.
Comment 100 Mario Limonciello 2021-09-11 00:03:18 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=213031 is actually pretty much the same problem, with the legacy IRQ getting set up wrong.  However, the commit from that bug was reverted because it caused regressions.

So I suspect that reverting 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa and applying 0ec4e55e9f571f08970ed115ec0addc691eda613 on top of master again would have helped this bug as well.
Comment 101 Bram Coenen 2021-09-11 06:35:37 UTC
(In reply to Neil from comment #96)
> 
> NOW.  Just to be clear: I don't have a touchscreen on my ThinkPad ;)
> 

Fixed the title  ;)
Comment 102 Paul Richards 2021-09-12 19:07:33 UTC
@Mario,

Since the build for "paultest2" was suspicious, I created another build "paultest3" using the same diff from attachment 298741 [details] in comment 92.  I used a clean checkout of the code to rule out the possibility of me not rebuilding correctly last time.

The result is that the message "irq 7: nobody cared" is present - which is the same as "paultest2".

I'll attach the dmesg and acpidump outputs (note to Fedora folks: this tool is found in the "acpica-tools" package).

Here is a link to the build should anyone wish to test: http://static.pauldoo.com/kernel/kernel-5.13.14-0.paultest3.fc34.x86_64.tar.gz (it should give identical results to paultest2)
Comment 103 Paul Richards 2021-09-12 19:09:38 UTC
Created attachment 298755 [details]
Lenovo E595 dmesg output using paultest3 kernel
Comment 104 Paul Richards 2021-09-12 19:10:17 UTC
Created attachment 298757 [details]
Lenovo E595 acpidump output using paultest3 kernel
Comment 105 Nelson G 2021-09-12 21:51:46 UTC
Created attachment 298761 [details]
dmesg from paultest3 build

Tried paultest3 build.
Comment 106 Nelson G 2021-09-12 21:52:18 UTC
Created attachment 298763 [details]
acpidump from paultest3
Comment 107 Paul Richards 2021-09-19 19:40:46 UTC
I'm just wondering, is there another patch I can test?
Comment 108 Mario Limonciello (AMD) 2021-09-21 21:13:05 UTC
Does the failure happen on every boot for you, or just on some boots?  Can you please double check your 'paultest1' kernel again?  I'm still perplexed why reverting 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa helped but a commit with 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa plus ignoring the "bad" codepath for legacy devices didn't.
Comment 109 Paul Richards 2021-09-24 08:39:52 UTC
With an unmodified Fedora 34 kernel, the warning occurs on every boot.

I'll do another build with only 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa reverted and get back to you.
Comment 110 Paul Richards 2021-09-24 14:49:45 UTC
Another build to report.  F34 kernel (v5.13.16) with only 2bbb5fa37475d7aa5fa62f34db1623f3da2dfdfa reverted.

"paultest4": http://static.pauldoo.com/kernel/kernel-5.13.16-0.paultest4.fc34.x86_64.tar.gz

I confirm that this eliminates the "irq 7: nobody cared" message.  I'll attach fresh dmesg and acpidump output under this build.

This confirms what we saw with "paultest1".
Comment 111 Paul Richards 2021-09-24 14:51:26 UTC
Created attachment 298953 [details]
Lenovo E595 dmesg output using paultest4 kernel
Comment 112 Paul Richards 2021-09-24 14:52:03 UTC
Created attachment 298955 [details]
Lenovo E595 acpidump output using paultest4 kernel
Comment 113 Nelson G 2021-09-28 22:30:07 UTC
Created attachment 299015 [details]
dmesg from paultest4

Howdy
Comment 114 Nelson G 2021-09-28 22:31:58 UTC
Created attachment 299017 [details]
acpidump from paultest4
Comment 115 philipp_023 2022-03-11 14:26:25 UTC
Hi,
For me the issue is gone.

Machine:
Lenovo ThinkPad E495, model 20NECTO1WW, ThinkPad BIOS R11ET44W (1.24 ), EC R11HT44W

OS:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=21.10
DISTRIB_CODENAME=impish

Kernel:
Linux e495 5.13.0-35-generic #40-Ubuntu SMP Mon Mar 7 08:03:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Comment 116 Bill Reyolds 2022-03-11 16:28:56 UTC
Hi,
For me the issue is gone too.

HP HP Desktop M01-F1xxx/87D6, BIOS F.13 03/29/2021
EFI v2.70 by American Megatrends
Linux version 5.16.13-pclos1 
OS: PCLinuxOS 2022
Comment 117 Alexander Kernozhitsky 2022-03-15 21:07:32 UTC
For me, the issue is still present.

[    1.573670] irq 7: nobody cared (try booting with the "irqpoll" option)
[    1.573694] CPU: 3 PID: 115 Comm: modprobe Not tainted 5.16.0-4-amd64 #1  Debian 5.16.12-1
[    1.573697] Hardware name: LENOVO 20NE001QRT/20NE001QRT, BIOS R11ET36W (1.16 ) 03/30/2020
[    1.573698] Call Trace:
...
[    1.573745] handlers:
[    1.573755] [<00000000ba607cbb>] amd_gpio_irq_handler
[    1.573772] Disabling IRQ #7

I am using Debian GNU/Linux bookworm/sid, with kernel 5.16.12-1.
Comment 118 Alexander Kernozhitsky 2022-03-15 21:08:56 UTC
Forgot to mention that I am using Lenovo ThinkPad E495, model 20NE001QRT.
Comment 119 Alexander Kernozhitsky 2022-03-15 21:33:21 UTC
The bug is gone for me after updating BIOS from 1.16 to 1.24.
Comment 120 Paul Richards 2022-03-17 20:02:20 UTC
Fixed for me also on my Lenovo E595, after updating the BIOS from 1.21 to 1.24.


BIOS v1.21:
```
> uname -a ;and dmesg | grep -P -i 'nobody cared|Disabling IRQ|DMI:'
Linux len 5.16.14-200.fc35.x86_64 #1 SMP PREEMPT Fri Mar 11 20:31:18 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
[    0.000000] DMI: LENOVO 20NFCTO1WW/20NFCTO1WW, BIOS R11ET41W (1.21 ) 06/07/2021
[    1.463917] irq 7: nobody cared (try booting with the "irqpoll" option)
[    1.464069] Disabling IRQ #7
```

BIOS v1.24:
```
> uname -a ;and dmesg | grep -P -i 'nobody cared|Disabling IRQ|DMI:'
Linux len 5.16.14-200.fc35.x86_64 #1 SMP PREEMPT Fri Mar 11 20:31:18 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
[    0.000000] DMI: LENOVO 20NFCTO1WW/20NFCTO1WW, BIOS R11ET44W (1.24 ) 01/26/2022
```
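
The commands above appear to use fish-style chaining (`;and`). A rough POSIX sh sketch of the same check follows, assuming dmesg text is piped in on stdin. The function name `irq7_status` is hypothetical; it only summarizes the DMI line and whether the warning was logged.

```shell
#!/bin/sh
# Hypothetical helper: summarize dmesg text read from stdin.
# Prints the DMI (board/BIOS) string and whether the IRQ 7 warning appeared.
irq7_status() {
    awk '
        /DMI:/ { sub(/^.*DMI: /, ""); dmi = $0 }   # keep text after "DMI: "
        /irq 7: nobody cared/ { hit = 1 }          # remember the warning
        END {
            print "DMI: " (dmi == "" ? "unknown" : dmi)
            print "irq7: " (hit ? "nobody cared" : "not reported")
        }'
}

# Usage:
#   uname -a && dmesg | irq7_status
```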
Comment 121 Lahfa Samy 2022-04-08 13:49:54 UTC
I'm still facing this issue on both kernel 5.17.1-arch1-1 and 5.15.32-1-lts on a T495 with an AMD Ryzen 7 PRO 3700U.
irq 7: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Tainted: P           OE     5.15.32-1-lts #1 bb8765a1c0d>
Hardware name: LENOVO 20NKS28F00/20NKS28F00, BIOS R12ET55W(1.25 ) 07/06/2020
Call Trace:
 <IRQ>
 dump_stack_lvl+0x46/0x5a
 __report_bad_irq+0x35/0xaa
 note_interrupt.cold+0xb/0x64
 handle_irq_event+0xab/0xc0
 handle_fasteoi_irq+0x8a/0x1f0
 __common_interrupt+0x41/0xa0
 common_interrupt+0x7b/0xa0
 </IRQ>
 <TASK>
 asm_common_interrupt+0x1e/0x40
RIP: 0010:native_sched_clock+0x34/0x70
Code: c2 65 44 8b 05 0d c4 3f 46 44 89 c0 83 e0 01 48 c1 e0 04 48 8d 88 40 0a 03 00 65>
RSP: 0018:ffffffffbb803df8 EFLAGS: 00000256
RAX: 0000000000000000 RBX: ffffffffbb81a940 RCX: 000000000000001f
RDX: 000000020fb62397 RSI: 0000000037c1aab9 RDI: 0000000fe790ff1c
RBP: ffff955a435bb000 R08: 0000000000000002 R09: 0000000000000007
R10: 0000000000000001 R11: 0000000000000000 R12: 00000010cd4f834a
R13: 0000000000000000 R14: 0000000000098968 R15: 0000000000000000
 sched_clock_cpu+0x9/0xa0
 poll_idle+0xa5/0xb3
 cpuidle_enter_state+0x89/0x350
 cpuidle_enter+0x29/0x40
 do_idle+0x1e1/0x270
 cpu_startup_entry+0x19/0x20
 start_kernel+0x9bb/0x9e2
 secondary_startup_64_no_verify+0xc2/0xcb
 </TASK>
handlers:
[<000000009e238fe9>] amd_gpio_irq_handler [pinctrl_amd]
Disabling IRQ #7

Does anyone have an idea how this issue could be investigated further? I've read many comments here, and it seemed that in the end this issue was fixed, but I don't know how, or why I'm still facing it.
Comment 122 Alberto 2022-04-11 16:22:22 UTC
Created attachment 300742 [details]
attachment-2695-0.html

The problem was fixed with a BIOS update. Have you checked whether one is available for your machine?

Comment 123 Lahfa Samy 2022-04-11 20:23:58 UTC
I'm on the third-latest BIOS from Lenovo. I wasn't very keen on upgrading because the second-newest BIOS update removes the option to choose how much memory is assigned to the iGPU.
The changelog item in question: Remove UMA buffer size item  128MB/256MB/512MB on BIOS setup.
This made me a bit skeptical about upgrading, even though I'm aware it's not good to skip BIOS upgrades, especially considering how rare they are.

Here are the changelogs of the last two updates:
<1.28>
- [Important] Update Phoenix Security Issue.
- [Important] Sync system base board version to SMBIOS type2.
- [Important] Update CopyRight to 2021.
- (Fix) Fixed an issue that BTS SPI protection fail.

<1.27>
- [Important] Remove UMA buffer size item  128MB/256MB/512MB on BIOS setup.
- [Important] Remove no BIOS setup item into WMI list. 
- [Important] Modify Strange diagnostics error.
- [New] Added Support SMBios release to support country code.
- [New] Added back flash prevention for Type11 Country code.
- [New] Added support for Nuvoton TPM firmware update function.	 
- [New] Added (LEN-27581) Security:changed SmmOEMInt15 to return an error after ExitBootServices.Removed USB-API function.
- (Fix) Fixed an issue that Fn+Tab have no function.
- (Fix) Fixed an issue that it can't be waked by WOL in S4/S5 from TBT TR dock(TBT work station).
- (Fix) Fixed an issue that battery icon in System task tray shows yellow mark when plug out 65W/90W AC adapter then plug in 135W AC adapter.

Is it the "Remove no BIOS setup item into WMI list" change that fixes this bug? Or the (Fix) in 1.28? Or neither of these?

Finally, do you have a T495, or has anyone with a ThinkPad T495 tested that these BIOS updates do fix the issue? Or is it only assumed that they do? I could always update and roll back, but I'm not a fan of flashing low-level firmware; if any operation fails, I'm in for a rollercoaster.
Comment 124 Lahfa Samy 2022-04-11 20:35:36 UTC
There is one comment (#50) that says that updating the BIOS to 1.19 on a T495 got rid of this issue, so I don't know whether a regression was introduced by a later BIOS upgrade or a kernel upgrade, or whether the issue lies elsewhere: https://bugzilla.kernel.org/show_bug.cgi?id=201817#c50
Comment 125 Mario Limonciello (AMD) 2022-07-12 15:38:24 UTC
A very similar issue to this has popped up (https://bugzilla.kernel.org/show_bug.cgi?id=216230) which has an interesting finding.  When pinctrl-amd loads late, some IRQs are not getting serviced.  Moving it into the initramfs appears to help in that case.

Anyone who is still experiencing this issue can you please do the following:
1) First reproduce on latest 5.18.y (or 5.19-rc)
2) Modify your kernel config for CONFIG_PINCTRL_AMD to be built-in.
3) See if you can still reproduce it.

If you can still reproduce it, please turn on dynamic debugging for pinctrl-amd (dyndbg="module pinctrl_amd +p" on kernel command line) and then share another updated dmesg.
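
A minimal sketch for checking steps 2 and the dynamic debug setting before collecting logs. It assumes the running kernel exposes its config at /proc/config.gz (this needs CONFIG_IKCONFIG_PROC, which not every distribution enables). The function names `pinctrl_amd_mode` and `has_pinctrl_dyndbg` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read kernel config text on stdin and report how
# CONFIG_PINCTRL_AMD is set.
pinctrl_amd_mode() {
    case "$(grep '^CONFIG_PINCTRL_AMD=')" in
        CONFIG_PINCTRL_AMD=y) echo "built-in" ;;
        CONFIG_PINCTRL_AMD=m) echo "module" ;;
        *)                    echo "not enabled" ;;
    esac
}

# Hypothetical helper: read the kernel command line on stdin and succeed
# if it carries a dyndbg option that mentions pinctrl_amd.
has_pinctrl_dyndbg() {
    grep -q 'dyndbg=.*pinctrl_amd'
}

# Usage:
#   zcat /proc/config.gz | pinctrl_amd_mode
#   has_pinctrl_dyndbg < /proc/cmdline && echo "dyndbg for pinctrl_amd is set"
```

Only the "built-in" result corresponds to step 2; "module" means the driver can still load late.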
Comment 126 Lahfa Samy 2022-10-15 11:48:51 UTC
A somewhat similar issue has popped up again on the Arch Linux kernel 6.0.1-arch2-1. It wasn't showing up before (though I don't know when it stopped showing up on 5.19.x). Looking in journalctl, the issue was happening on 5.16.2-arch1-1, 5.16.8-arch1-1, 5.16.10-arch1-1, 5.16.11-arch1-1, 5.16.15-arch1-1, 5.15.32-lts-1, 5.15.35-lts-1, 5.15.55-lts-1, 5.15.69-lts, 5.17.1-arch1-1, 5.17.5-arch1-1, 5.17.9-arch1-1, 5.18.7-arch1-1, 5.19.2-arch1-1, 5.19.4-arch1-1, 5.19.7-arch1-1, 5.19.10-arch1-1 and finally just now on 6.0.1-arch2-1.

On the latest kernel 6.0.1-arch2-1 :

kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: P           OE      6.0.1-arch2-1 #1 ca4c4b1e174d24f1d562eb4d3f9de5bf01e98574
kernel: Hardware name: LENOVO 20NKS28F00/20NKS28F00, BIOS R12ET55W(1.25 ) 07/06/2020
kernel: Call Trace:
kernel:  <IRQ>
kernel:  dump_stack_lvl+0x48/0x60
kernel:  __report_bad_irq+0x35/0xaa
kernel:  note_interrupt.cold+0xa/0x65
kernel:  handle_irq_event+0x75/0x80
kernel:  handle_fasteoi_irq+0x8e/0x1f0
kernel:  __common_interrupt+0x46/0xa0
kernel:  common_interrupt+0x43/0xa0
kernel:  asm_common_interrupt+0x26/0x40
kernel: RIP: 0010:__do_softirq+0x7c/0x2ca
kernel: Code: 14 81 67 2c ff f7 ff ff be 00 01 00 00 e8 4c c3 2f ff c7 44 24 10 0a 00 00 00 65 66 c7 05 ca 24 e3 4c 00 00 fb 0f 1f 44 00 00 >
kernel: RSP: 0018:ffffb60bc0003f90 EFLAGS: 00000246
kernel: RAX: 0000000000000000 RBX: ffffffffb4003de8 RCX: 00000001000f634f
kernel: RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffffffffb401a9c0
kernel: RBP: 0000000000000043 R08: 000000023ece886f R09: 8996bb55954eaf17
kernel: R10: 0000000000000040 R11: ffff933375ca7300 R12: 0000000000000001
kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040
kernel:  ? handle_edge_irq+0x9a/0x260
kernel:  __irq_exit_rcu+0xb7/0xe0
kernel:  common_interrupt+0x86/0xa0
kernel:  </IRQ>
kernel:  <TASK>
kernel:  asm_common_interrupt+0x26/0x40
kernel: RIP: 0010:tick_nohz_idle_enter+0x45/0x50
kernel: Code: 81 35 ab 4d 48 83 bb b0 00 00 00 00 75 22 80 4b 4c 01 e8 5e f0 fe ff 80 4b 4c 04 48 89 43 78 e8 71 cf f9 ff fb 0f 1f 44 00 00 >
kernel: RSP: 0018:ffffffffb4003e90 EFLAGS: 00000282
kernel: RAX: 000003548cd93c16 RBX: ffff9335f0a24800 RCX: 000000000000260a
kernel: RDX: 000003548cd93c16 RSI: 000003548cd93c16 RDI: 000003548cd93c16
kernel: RBP: 0000000000000000 R08: ffffffffffcdab20 R09: 0000000037c1c8f8
kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff9335feff2001
kernel: R13: 0000000000000000 R14: ffffffffb401a120 R15: 00000000bdf51000
kernel:  do_idle+0x42/0x270
kernel:  cpu_startup_entry+0x1d/0x20
kernel:  rest_init+0xc8/0xd0
kernel:  arch_call_rest_init+0xe/0x1c
kernel:  start_kernel+0x97a/0x9a3
kernel:  secondary_startup_64_no_verify+0xe5/0xeb
kernel:  </TASK>
kernel: handlers:
kernel: [<00000000d191bbef>] amd_gpio_irq_handler
kernel: Disabling IRQ #7

I will add dynamic debugging for pinctrl-amd and hopefully trigger the exact same issue reliably again (there are 2 different oopses that seem to be triggered randomly; see the file attached to the next comment), then share another dmesg, in the hope that a fix or a patch can come.
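
For collecting which kernel versions logged the warning, a rough sketch follows. It assumes one boot's kernel log is piped in on stdin and that the log contains the usual "Linux version ..." banner line. The function name `affected_kernel` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read one boot's kernel log on stdin and print the
# kernel version only if "irq 7: nobody cared" was logged during that boot.
affected_kernel() {
    awk '
        # Take the version from the first "Linux version X" banner seen.
        match($0, /Linux version [^ ]+/) && ver == "" {
            ver = substr($0, RSTART + 14, RLENGTH - 14)
        }
        /irq 7: nobody cared/ { hit = 1 }
        END { if (hit) print (ver == "" ? "unknown" : ver) }'
}

# Usage (previous boot; use -b 0 for the current boot, -b -2 and so on for
# older ones):
#   journalctl -k -b -1 | affected_kernel
```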
Comment 127 Lahfa Samy 2022-10-15 11:53:19 UTC
Created attachment 303006 [details]
multiples oops for 5.16.11-arch1-1,5.16.14-arch1-1,5.15.28-1-lts,6.0.1-arch2-1

Oops traces all made on a Thinkpad T495 Ryzen 7 3700U with Vega RX 10, BIOS R12ET55W(1.25 )
Comment 128 Lahfa Samy 2022-10-15 12:21:02 UTC
Created attachment 303007 [details]
1-dmesg-oops-T495-Ryzen7PRO3700U-6.0.1-arch2-1 with dyndbg="module pinctrl_amd +p"

T495-Ryzen 7 PRO 3700U with Vega RX10, 6.0.1-arch2-1

cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-linux zfs=zroot-ext/djqdje_arch/ROOT/default rw
> radeon.si_support=0 amdgpu.si_support=1 radeon.cik_support=0
> amdgpu.cik_support=1 loglevel=3 quiet "dyndbg=module pinctrl_amd +p"

zcat /proc/config.gz | rg PINCTRL_AMD
> CONFIG_PINCTRL_AMD=y
Comment 129 Lahfa Samy 2022-10-15 12:24:31 UTC
Created attachment 303008 [details]
2-dmesg-oops-T495-Ryzen7PRO3700U-6.0.1-arch2-1 with dyndbg="module pinctrl_amd +p"

Same settings as the 1-dmesg, but the oops trace is longer; I've compared/diffed them using meld.
Comment 130 Mario Limonciello (AMD) 2022-12-05 22:03:16 UTC
Is there a chance that this is tied to only happening on warm boot vs happening on cold boot too?

If it's happening on warm boot, maybe we are missing some cleanup on shutdown for the GPIO controller.