Bug 201817 - irq 7: nobody cared for HP laptop with touchscreen and AMD processor
Summary: irq 7: nobody cared for HP laptop with touchscreen and AMD processor
Status: NEW
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on: 198715
Blocks:
Reported: 2018-11-29 18:40 UTC by Bram Coenen
Modified: 2019-01-18 18:27 UTC
CC List: 6 users

See Also:
Kernel Version: 4.19.4
Tree: Mainline
Regression: No


Attachments
dmesg after boot, 4.19.4, no touchscreen (75.64 KB, text/plain)
2018-11-29 18:40 UTC, Bram Coenen
dmesg after boot, 4.19.3, working touchscreen (73.19 KB, text/plain)
2018-11-29 18:51 UTC, Bram Coenen
cat /proc/interrupts (1.22 KB, text/plain)
2018-12-20 21:09 UTC, nospamming11+kernel
modified function (1.81 KB, text/plain)
2019-01-18 18:20 UTC, nospamming11+kernel
modified - dmesg - working (322.06 KB, text/plain)
2019-01-18 18:22 UTC, nospamming11+kernel

Description Bram Coenen 2018-11-29 18:40:08 UTC
Created attachment 279741 [details]
dmesg after boot, 4.19.4, no touchscreen

The Elan touchscreen on HP laptops with an AMD processor was recently fixed and worked properly for a while in 4.19.3. However, after installing a new kernel version, 4.19.4, the touchscreen stopped working and new errors appeared.

I once got this in 4.19.3 as well, but it went away after a few shutdowns, the last of which I forced by holding the power button. I did have issues logging in as well (Don't)

I'm on an HP ENVY x360 Convertible 15-bq0xx/8311, BIOS F.08, with only Fedora 28 installed.

The ACPI config was fixed in https://bugzilla.kernel.org/show_bug.cgi?id=198715 .
Comment 1 Bram Coenen 2018-11-29 18:51:30 UTC
Created attachment 279743 [details]
dmesg after boot, 4.19.3, working touchscreen
Comment 2 Bram Coenen 2018-11-29 18:58:55 UTC
Just checked. The kernel version doesn't matter; a decent number of reboots does the trick and gets the touchscreen working even on 4.19.4.
Comment 3 Bram Coenen 2018-11-29 19:11:53 UTC
[   16.587361] irq 7: nobody cared (try booting with the "irqpoll" option)
[   16.587366] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C        4.19.4-200.fc28.x86_64 #1
[   16.587367] Hardware name: HP HP ENVY x360 Convertible 15-bq0xx/8311, BIOS F.08 03/30/2018
[   16.587368] Call Trace:
[   16.587372]  <IRQ>
[   16.587381]  dump_stack+0x5c/0x80
[   16.587385]  __report_bad_irq+0x37/0xae
[   16.587388]  note_interrupt.cold.9+0xa/0x69
[   16.587390]  handle_irq_event_percpu+0x6a/0x80
[   16.587392]  handle_irq_event+0x27/0x44
[   16.587394]  handle_fasteoi_irq+0x7f/0x120
[   16.587398]  handle_irq+0xbf/0x100
[   16.587400]  do_IRQ+0x49/0xd0
[   16.587403]  common_interrupt+0xf/0xf
[   16.587405]  </IRQ>
[   16.587409] RIP: 0010:native_safe_halt+0x2/0x10
[   16.587411] Code: ff ff 7f c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 75 c4 eb 8c 90 90 90 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
[   16.587412] RSP: 0018:ffffffffbd203e18 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd8
[   16.587415] RAX: 0000000080000000 RBX: ffff887df5a01c00 RCX: 0000000000000034
[   16.587416] RDX: 4ec4ec4ec4ec4ec5 RSI: ffffffffbd2dd200 RDI: ffff887df5a01c64
[   16.587418] RBP: ffff887df5a01c64 R08: 0000000000000002 R09: 0000000000020800
[   16.587419] R10: 0000000f66cccf99 R11: ffff887df721fde8 R12: 0000000000000001
[   16.587420] R13: 0000000000000001 R14: 0000000000000001 R15: 00000000d0c2fae0
[   16.587424]  acpi_safe_halt+0x1b/0x30
[   16.587427]  acpi_idle_enter+0x104/0x2a0
[   16.587431]  cpuidle_enter_state+0x71/0x320
[   16.587435]  do_idle+0x226/0x260
[   16.587438]  cpu_startup_entry+0x6f/0x80
[   16.587442]  start_kernel+0x523/0x543
[   16.587446]  secondary_startup_64+0xa4/0xb0
[   16.587448] handlers:
[   16.587453] [<00000000e6019074>] amd_gpio_irq_handler [pinctrl_amd]
[   16.587455] Disabling IRQ #7
Comment 4 Hans de Goede 2018-11-30 09:45:15 UTC
I discussed this a bit on the mailing list. Here are the relevant parts of the discussion:

Me:

The amd_gpio chip/driver appears to be the only driver
connected to IRQ 7, so I think there is an issue with the
amd_gpio driver where it does not properly clear the interrupt
source. E.g. it might be that the BIOS requested interrupts
on a GPIO which Linux does not monitor and that the driver
does not disable this GPIO-IRQ on probe and since it is not
handling that pin in IRQ mode also does not clear it.

Anyways that is just a theory. It would greatly help if
someone who knows the amd_gpio driver better could take
a look.

Reply by Daniel Drake:

Sorry that I can't be much help here - I don't have access to any
useful info beyond the source code already present in Linux.

Maybe you could explore your theory by dumping the GPIO/GPIO-INT
enable regs, see if any of them are marked as enabled by something
other than Linux.

###

I'm afraid I don't have time to look into this myself atm. Maybe someone can add some printk calls to drivers/pinctrl/pinctrl-amd.c to dump relevant register values as Daniel suggested and see if that yields any useful info?
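
Once raw per-pin register values have been printed from the driver, they could be decoded offline along the lines Daniel suggests. The following is only a sketch in Python, not kernel code. The bit positions (11 for interrupt enable, 12 for interrupt mask, 28 for interrupt status) are assumptions based on my reading of drivers/pinctrl/pinctrl-amd.h and should be verified against the tree in use. The function names and the sample values are hypothetical.

```python
# Assumed bit positions, believed to match drivers/pinctrl/pinctrl-amd.h;
# verify them against the kernel tree you are debugging.
INTERRUPT_ENABLE_BIT = 11
INTERRUPT_MASK_BIT = 12   # assumed: bit set means the interrupt is unmasked
INTERRUPT_STS_BIT = 28


def decode_pin(value):
    """Return the interrupt-related flags of one 32-bit pin register value."""
    return {
        "enabled": bool((value >> INTERRUPT_ENABLE_BIT) & 1),
        "unmasked": bool((value >> INTERRUPT_MASK_BIT) & 1),
        "pending": bool((value >> INTERRUPT_STS_BIT) & 1),
    }


def suspicious_pins(regs, requested):
    """List pins whose interrupt is enabled although Linux never requested it.

    regs: mapping of pin number -> raw register value (e.g. from printk output)
    requested: set of pin numbers that a Linux driver requested as an IRQ
    """
    return sorted(
        pin for pin, value in regs.items()
        if decode_pin(value)["enabled"] and pin not in requested
    )


if __name__ == "__main__":
    # Hypothetical dump: pin 5 was requested by Linux, pin 9 was not.
    dump = {5: 0x00001800, 9: 0x10001800, 12: 0x00000000}
    print(suspicious_pins(dump, requested={5}))
```

Any pin reported by `suspicious_pins` would be a candidate for the "enabled by something other than Linux" case described above.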
Comment 5 mruize85 2018-12-16 15:20:55 UTC
Forcing `amd_gpio_irq_handler()` from `drivers/pinctrl/pinctrl-amd.c` to always return `IRQ_HANDLED` makes the touchscreen work, but now the system receives interrupts at a very high rate (around 100 k/s). I am a complete newbie to kernel development, so I don't really know what I'm doing... Any ideas?
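
For context on why forcing the return value changes the symptom: the "nobody cared" check counts how many interrupts on a line come back unhandled and disables the line when nearly all of a large batch do. The Python model below is a simplified sketch, not the kernel code. The thresholds (a batch of 100,000 interrupts, more than 99,900 of them unhandled) are as I recall them from kernel/irq/spurious.c and should be treated as assumptions; the function name is invented for illustration.

```python
def irq_gets_disabled(results, batch=100_000, limit=99_900):
    """Simplified model of the kernel's spurious-IRQ detector.

    results: iterable of booleans, True if a handler returned IRQ_HANDLED.
    Returns True if the line would be disabled ("nobody cared").
    The batch and limit defaults are assumptions, see the text above.
    """
    count = unhandled = 0
    for handled in results:
        count += 1
        if not handled:
            unhandled += 1
        if count < batch:
            continue
        if unhandled > limit:
            return True
        count = unhandled = 0  # start counting a new batch
    return False


if __name__ == "__main__":
    print(irq_gets_disabled([False] * 100_000))  # handler never claims the IRQ
    print(irq_gets_disabled([True] * 100_000))   # handler always claims it
```

In this model, a handler that always reports the interrupt as handled never trips the detector, so the line stays enabled. That would be consistent with the touchscreen working while the interrupt source itself keeps firing.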
Comment 6 nospamming11+kernel 2018-12-20 21:09:07 UTC
Created attachment 280107 [details]
cat /proc/interrupts

Seeing the exact same problem on 4.19.10 (HP Envy x360 13-ag0004ng)

Attached the output of /proc/interrupts
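
To put a number on the interrupt rate, two snapshots of /proc/interrupts taken a known interval apart can be compared. The Python sketch below assumes the usual layout (IRQ label, per-CPU counters, then chip and handler names); the helper names and the sample lines in the demo are hypothetical and not taken from the attachment.

```python
def irq_count(proc_interrupts_text, irq):
    """Sum the per-CPU counters for one IRQ line in /proc/interrupts output."""
    for line in proc_interrupts_text.splitlines():
        label, _, rest = line.partition(":")
        if label.strip() != str(irq):
            continue
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # counters end where the chip/handler names begin
            total += int(field)
        return total
    raise KeyError(f"IRQ {irq} not found")


def irq_rate(before, after, seconds, irq=7):
    """Interrupts per second between two snapshots taken `seconds` apart."""
    return (irq_count(after, irq) - irq_count(before, irq)) / seconds


if __name__ == "__main__":
    # Hypothetical snapshots taken two seconds apart.
    header = "           CPU0       CPU1\n"
    first = header + "   7:     100000          0   IO-APIC    7-fasteoi   pinctrl_amd\n"
    second = header + "   7:     300000         50   IO-APIC    7-fasteoi   pinctrl_amd\n"
    print(irq_rate(first, second, seconds=2.0))
```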
Comment 7 Lukas Kahnert 2018-12-24 20:01:51 UTC
The only case where I got the "nobody cared" panic on my HP Envy x360 bq-1xx was when I had used Windows 10 on my last boot and then rebooted into Linux.
My theory is that Windows leaves IRQ 7 in a state that persists across a reboot (and triggers the panic in Linux) and is only cleared if you hold down the power button.
My laptop is now Linux-only, and since then I have never had this issue again (using 4.19.5 now).
Comment 8 JerryD 2019-01-15 02:33:41 UTC
This issue occurs regardless of whether Windows 10 was booted previously. From a cold power-up, about 2 out of 3 boot attempts fail. Is there any way to remove this touchscreen driver? I think it should be backed out until the regression is fixed.
Comment 9 Bram Coenen 2019-01-15 07:22:10 UTC
(In reply to JerryD from comment #8)
> This issue occurs without regard to Windows 10 previous boot. From cold
> powerup I get failure to boot about 2 out of 3 attempts. Any way to remove
> this touchscreen driver? I think it should be backed out until the
> regression is fixed.

I don't even have Windows any more and get the bug sometimes. However, I do not agree that the driver should be removed. I still use the touchscreen daily because for me it is working most of the time. Besides, it does not hurt having it there, does it?
Comment 10 JerryD 2019-01-17 02:22:07 UTC
Are there any workarounds?
Comment 11 nospamming11+kernel 2019-01-17 13:34:38 UTC
Some suggest here: https://github.com/linuxwacom/wacom-hid-descriptors/issues/12
that switching to Legacy BIOS boot helped them.
Comment 12 nospamming11+kernel 2019-01-18 18:19:59 UTC
I can confirm that always returning IRQ_HANDLED fixes the error, but it floods the system with these interrupts (they never stop).

Sometimes booting with the kernel option noirqdebug helped me to get the touchscreen up and running again.

I then noticed the following (see the three attached files: one with a modified amd_gpio_irq_handler and two dmesg outputs, one with a working touchscreen and one where the touchscreen does not work). In the working case all interrupts are handled correctly, while in the "not-working" case there are A LOT of interrupts that are not handled at all.
Comment 13 nospamming11+kernel 2019-01-18 18:20:41 UTC
Created attachment 280583 [details]
modified function
Comment 14 nospamming11+kernel 2019-01-18 18:22:35 UTC
Created attachment 280585 [details]
modified - dmesg - working
Comment 15 nospamming11+kernel 2019-01-18 18:27:24 UTC
Not working - dmesg (too large to attach directly): https://bit.ly/2FHChBb