Created attachment 277381 [details] Devices available
My i2c Elantech touchpad stops working after a while on my Acer F5-573. The dmesg diff shows "irq 16: nobody cared (try booting with the "irqpoll" option)". I reported the bug to the libinput developers and we came to the conclusion that it's probably a kernel bug. I can reproduce it on a lot of different kernels and Linux distros; the kernel version I'm using right now is "4.17.4-1-default". I must say the touchpad works fine under Windows, and I only get this bug on Linux.

Here's the link to the libinput GitLab issue with more information: https://gitlab.freedesktop.org/libinput/libinput/issues/77

I'll attach some logs, new ones made with kernel 4.17.4-1 instead of the ones at the link above. Thanks!
Created attachment 277383 [details] Dmesg after boot
Created attachment 277385 [details] Dmesg diff with the bug
Created attachment 277387 [details] Xorg log
Created attachment 277389 [details] Full dmesg log with bug
I should also say that sometimes when it stops, if I run...

    modprobe -r i2c_hid
    modprobe i2c_hid

...it starts working again, but that workaround doesn't always work. Usually when the cursor stops moving and I move my finger, it starts clicking like crazy, as noted in my bug report to libinput.

It frequently happens when I'm playing on the Dolphin emulator (every time, actually), but it has also happened while I was looking at some screensavers. So far it hasn't happened when using the laptop for other activities.

I found some other people on the internet with the exact same problem:
https://askubuntu.com/questions/1004769/touchpad-works-slow-disabling-irq-16
https://bbs.archlinux.org/viewtopic.php?id=221463

Thanks.
It seems to be the same kind of issue we have in https://bugzilla.kernel.org/show_bug.cgi?id=198473

In the two links in the previous comment, we see:

cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 16:     126908      27190     205425      53703  IR-IO-APIC  16-fasteoi  idma64.0, i801_smbus, i2c_designware.0
 82:          4          2         10          1  IR-IO-APIC  82-edge     SYNA7DB5:00

And I believe this is far too many interrupts for an I2C chip.

Mika, Hans, any ideas if the pinctrl is at fault here too? I would have expected c41eb2c7f93531b8, which is in v4.17, to fix that, but it seems that's not the case.
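To put numbers on "far too many", the per-CPU counters can be summed for each IRQ line. The following is only a minimal sketch: the function name `irq_totals` is made up for illustration, and the sample text is the snippet quoted above rather than output from the reporter's machine.

```python
def irq_totals(text):
    """Sum the per-CPU counts of each numbered IRQ in /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        fields = rest.split()
        counts = fields[:ncpu]
        # Skip named rows (NMI, LOC, ...) and rows without one count per CPU.
        if not label.isdigit() or len(counts) < ncpu:
            continue
        if not all(c.isdigit() for c in counts):
            continue
        totals[int(label)] = (sum(int(c) for c in counts), " ".join(fields[ncpu:]))
    return totals


# Sample taken from the comment above (not from the reporter's machine).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 16:     126908      27190     205425      53703  IR-IO-APIC  16-fasteoi  idma64.0, i801_smbus, i2c_designware.0
 82:          4          2         10          1  IR-IO-APIC  82-edge  SYNA7DB5:00
"""

totals = irq_totals(SAMPLE)
for irq, (count, description) in sorted(totals.items()):
    print(irq, count, description)
```

On a live system the same function could be fed the contents of /proc/interrupts, read twice a few seconds apart, to see which line is growing.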
(In reply to Benjamin Tissoires from comment #7)
> It seems to be the same kind of issue we have in
> https://bugzilla.kernel.org/show_bug.cgi?id=198473

Well, in my case the touchpad works normally until it stops moving and starts clicking every time I move my finger.

> In the two links in the previous comment, we see:
> cat /proc/interrupts
> CPU0 CPU1 CPU2 CPU3
> 16: 126908 27190 205425 53703 IR-IO-APIC 16-fasteoi
> idma64.0, i801_smbus, i2c_designware.0
> 82: 4 2 10 1 IR-IO-APIC 82-edge
> SYNA7DB5:00

I can't post my "cat /proc/interrupts" output right now, but if I remember correctly it looks exactly the same on my computer.
Looking at /proc/interrupts, the touchpad seems to use an APIC interrupt (not GPIO), so I don't think the pinctrl driver is involved in this at all.
Hmm, the IOAPIC being used for the interrupt is weird, can you attach an acpidump for the machine please?
Created attachment 277611 [details] My computer's interrupts
Created attachment 277613 [details] ACPI
I don't know if I'm doing this correctly, but here's my acpidump (learned here: http://smackerelofopinion.blogspot.com/2009/10/dumping-acpi-tables-using-acpidump-and.html)
Created attachment 277615 [details] Extracted
I guess all the files are here.
Created attachment 277663 [details] All hardware log
Here's the SUSE hardware log. Interestingly enough, some parts say it's a PS/2 mouse, but if I remember correctly it doesn't show that on Mint.
Created attachment 277665 [details] DMESG Again
The bug happened again; here's a new dmesg log. I don't know if this one is any different, but this time I was running the Unigine Valley GPU benchmark to confirm a suspicion I had, and it was confirmed. Before that I had used the computer about 3 times, for about 40 minutes each, without any problems.

I'm certain the bug only happens when I'm stressing the GPU (possibly the CPU too). It has happened with both OpenGL and Vulkan in the Dolphin emulator, with the Valley benchmark, and while I was looking at some 3D screensavers. When I use the computer to browse the web or do any other simple task, the bug isn't triggered.
Created attachment 277667 [details] Dmesg with Irqpool boot option is different.
I have now tried the irqpoll boot option, and dmesg looks different after the bug happens. Before, it would show "CPU: 1 PID: 0 Comm: swapper/1 Not tainted"; now it shows "CPU: 1 PID: 3280 Comm: Audio thread - Not tainted". This part is also different:

Before irqpoll:
[ 1022.761949] RIP: 0010:cpuidle_enter_state+0xbc/0x2e0
[ 1022.761950] RSP: 0018:ffff97d000d23eb0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
[ 1022.761952] RAX: 0000000000000001 RBX: 000000ee215c3334 RCX: 000000000000001f
[ 1022.761953] RDX: 000000ee215c3334 RSI: 0000000000022740 RDI: 0000000000000000
[ 1022.761953] RBP: 0000000000000001 R08: 0000024532af73c8 R09: 0000000000000005
[ 1022.761954] R10: 00000000ffffffff R11: ffff89a8d14a1ea8 R12: ffffb7cfffc90870
[ 1022.761955] R13: ffffffff920d96d8 R14: 000000ee215c1668 R15: 0000000000000000
[ 1022.761960] do_idle+0x21d/0x270
[ 1022.761962] cpu_startup_entry+0x5f/0x70
[ 1022.761964] start_secondary+0x1a0/0x1e0
[ 1022.761966] secondary_startup_64+0xa5/0xb0

After irqpoll:
[ 1011.718836] RIP: 0033:0x7f1a9dcc7720
[ 1011.718837] RSP: 002b:00007f1a40de63c8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
[ 1011.718839] RAX: 0000000000000000 RBX: 00007f1a3c003640 RCX: 00007f1a3c000d58
[ 1011.718840] RDX: 0000000000000001 RSI: 00007f1a40de63d7 RDI: 000000000000002a
[ 1011.718841] RBP: 0000000000000400 R08: 00000000000000fb R09: 0000000000000000
[ 1011.718841] R10: 0000000000000fb0 R11: 0000000000000202 R12: 0000000000000000
[ 1011.718842] R13: 0000000000000000 R14: 00007f1a3c012cd0 R15: 0000000000000000

With or without irqpoll, the bug still happens.
If I execute "modprobe -r i2c_hid" and "modprobe -f i2c_hid", the following shows up in dmesg (tried several times, doesn't work):

[ 7869.023842] i2c_designware i2c_designware.0: controller timed out
[ 7869.053169] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7869.053181] hid (null): reading report descriptor failed
[ 7869.053192] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7869.053322] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7870.845461] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7934.239998] i2c_designware i2c_designware.0: controller timed out
[ 7934.269769] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7934.269781] hid (null): reading report descriptor failed
[ 7934.269792] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7934.269992] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7935.837163] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7956.607886] i2c_designware i2c_designware.0: controller timed out
[ 7956.637430] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7956.637442] hid (null): reading report descriptor failed
[ 7956.637453] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7956.637662] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7957.853402] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7968.314580] i2c_hid: module_layout: kernel tainted.
[ 7968.314582] Disabling lock debugging due to kernel taint
[ 7968.314639] i2c_hid: module verification failed: signature and/or required key missing - tainting kernel
[ 7970.431926] i2c_designware i2c_designware.0: controller timed out
[ 7970.463044] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7970.463059] hid (null): reading report descriptor failed
[ 7970.463071] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7970.463289] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7971.870741] i2c_designware i2c_designware.0: timeout in disabling adapter
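For anyone reading the log above: the "-5" in the probe failures is a negated errno value. The sketch below pulls those codes out of dmesg text and names them using Python's own errno table, on the assumption that it matches the kernel's numbering (which should hold on Linux). The helper name `decode_probe_failures` is hypothetical.

```python
import errno
import os
import re

# Matches driver-core messages such as "probe of <device> failed with error -5".
PROBE_FAIL = re.compile(r"probe of (\S+) failed with error (-\d+)")


def decode_probe_failures(log_text):
    """Return (device, errno name, description) for each probe failure found."""
    results = []
    for match in PROBE_FAIL.finditer(log_text):
        code = -int(match.group(2))  # the kernel logs the negated errno
        name = errno.errorcode.get(code, "unknown")
        results.append((match.group(1), name, os.strerror(code)))
    return results


# One line copied from the log above.
SAMPLE = "[ 7869.053322] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5\n"

failures = decode_probe_failures(SAMPLE)
for device, name, description in failures:
    print(device, name, description)
```

Here the code resolves to EIO, a generic I/O error, which fits the preceding "controller timed out" lines: the I2C transfer to the touchpad never completed, so the report descriptor could not be read.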
Sorry to revive such an old bug, but in the past few days I've been using the same laptop again and the same thing happened. I ran the command

    journalctl --since "2021-04-24 16:20" | grep i2c

and this line from the time the bug happened could indicate something:

i2c_hid i2c-ELAN0501:00: i2c_hid_get_input: IRQ triggered but there's no data

I'm on kernel 5.4.0-72-generic, using KDE Neon with Ubuntu 20.04 LTS as a base. Please note I'm only reporting this now because the bug may affect more people, so it's mostly for informational purposes, as this laptop is on its last legs anyway.
I found a workaround for this bug. I'll post it here in case someone else suffers from the same problem in the future:

In the UEFI setup, change the touchpad from Advanced to Basic mode. Acer laptops, like the one where I experienced this bug (Aspire F5-573 series), have this option.

On my particular laptop, this created another problem: sometimes the CPU would be flooded with IRQs and would run at 100% load, mostly while playing games or doing anything with 3D. To fix that, blacklist the intel_lpss_pci module by adding the file /etc/modprobe.d/lpss.conf with the following line:

    blacklist intel_lpss_pci

This fixes the high CPU usage problem, and the touchpad works fine this way.

Please note that if you keep the touchpad in Advanced mode, it will stop working if you blacklist that module. Only do this if your BIOS allows changing it to Basic. Also note that if you dual boot with Windows, the touchpad will lose some functionality there.
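After rebooting with the blacklist in place, one way to confirm the module really stayed out is to look for it in /proc/modules. This is a minimal sketch only: the helper name `loaded_modules` is made up, and the sample text is illustrative rather than taken from this laptop.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


# Illustrative sample only; real contents come from /proc/modules.
SAMPLE = """\
i2c_hid 28672 0 - Live 0x0000000000000000
hid 131072 2 i2c_hid,hid_generic, Live 0x0000000000000000
"""

mods = loaded_modules(SAMPLE)
print("intel_lpss_pci loaded:", "intel_lpss_pci" in mods)
```

On a real system, pass in the result of `open("/proc/modules").read()`. Keep in mind that a `blacklist` line only stops the module from being loaded automatically through its aliases; an explicit `modprobe intel_lpss_pci` would still load it.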
I also have this Acer laptop. I'm on kernel 6.6.47 on Gentoo and I'm having the exact same problem, and have been for quite a while on other distros. If you need more info for debugging, please contact me.