Bug 200529 - Elantech touchpad stops working after a while, shows "irq 16: nobody cared"
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: I2C
Hardware: Intel Linux
Importance: P1 normal
Assignee: Drivers/I2C virtual user
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-07-17 19:41 UTC by guimarcalsilva
Modified: 2018-08-02 15:51 UTC
CC List: 4 users

See Also:
Kernel Version: 4.17.4-1-default
Tree: Mainline
Regression: No


Attachments
Devices available (4.09 KB, text/plain)
2018-07-17 19:41 UTC, guimarcalsilva
Dmesg after boot (67.66 KB, text/plain)
2018-07-17 19:49 UTC, guimarcalsilva
Dmesg diff with the bug (3.43 KB, text/plain)
2018-07-17 19:50 UTC, guimarcalsilva
Xorg log (46.57 KB, text/plain)
2018-07-17 19:51 UTC, guimarcalsilva
Full dmesg log with bug (71.08 KB, text/plain)
2018-07-17 19:53 UTC, guimarcalsilva
My computer's interrupts (3.26 KB, text/plain)
2018-07-30 17:22 UTC, guimarcalsilva
ACPI (651.41 KB, video/mpeg)
2018-07-30 17:28 UTC, guimarcalsilva
Extracted (53.02 KB, application/zip)
2018-07-30 17:30 UTC, guimarcalsilva
All hardware log (1.03 MB, text/plain)
2018-08-02 11:24 UTC, guimarcalsilva
DMESG Again (69.19 KB, text/plain)
2018-08-02 13:05 UTC, guimarcalsilva
Dmesg with Irqpool boot option is different. (72.41 KB, text/plain)
2018-08-02 13:43 UTC, guimarcalsilva

Description guimarcalsilva 2018-07-17 19:41:36 UTC
Created attachment 277381 [details]
Devices available
Comment 1 guimarcalsilva 2018-07-17 19:48:43 UTC
My I2C Elantech touchpad stops working after a while on my Acer F5-573. The dmesg diff shows "irq 16: nobody cared (try booting with the "irqpoll" option)". I reported the bug to the libinput developers and we concluded that it's probably a kernel bug. I can reproduce it on many different kernels and Linux distributions; the kernel version I'm using right now is "4.17.4-1-default". The touchpad works fine under Windows; I only get this bug on Linux.

Here's the libinput GitLab issue with more information: https://gitlab.freedesktop.org/libinput/libinput/issues/77

I'll attach some logs. These are new ones made with kernel 4.17.4-1, replacing the ones in the link above.

Thanks!
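
For anyone who wants to check their own logs for the same symptom, here is a minimal Python sketch that scans dmesg text for the "nobody cared" warning quoted above. The function name `find_unhandled_irqs` is made up for this illustration, and the sample string is just the message from this report without its timestamp.

```python
import re

# Matches the kernel's unhandled-interrupt warning, e.g.
# 'irq 16: nobody cared (try booting with the "irqpoll" option)'
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")


def find_unhandled_irqs(dmesg_text):
    """Return the sorted, de-duplicated IRQ numbers reported as 'nobody cared'."""
    return sorted({int(m.group(1)) for m in NOBODY_CARED.finditer(dmesg_text)})


# Sample input: the message from this report (timestamp omitted).
sample = 'irq 16: nobody cared (try booting with the "irqpoll" option)'
print(find_unhandled_irqs(sample))
```

To use it on a real log, pass the contents of a saved dmesg file as `dmesg_text`.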
Comment 2 guimarcalsilva 2018-07-17 19:49:26 UTC
Created attachment 277383 [details]
Dmesg after boot
Comment 3 guimarcalsilva 2018-07-17 19:50:03 UTC
Created attachment 277385 [details]
Dmesg diff with the bug
Comment 4 guimarcalsilva 2018-07-17 19:51:35 UTC
Created attachment 277387 [details]
Xorg log
Comment 5 guimarcalsilva 2018-07-17 19:53:05 UTC
Created attachment 277389 [details]
Full dmesg log with bug
Comment 6 guimarcalsilva 2018-07-17 20:03:24 UTC
I also need to say that sometimes when it stops, if I run...
modprobe -r i2c_hid
modprobe i2c_hid

...it starts working again, but that workaround doesn't always work.

Usually, once the cursor stops moving, moving my finger on the touchpad makes it start clicking like crazy, as noted in my bug report to libinput.

It frequently happens when I'm playing on the Dolphin emulator (every time, actually), but it has also happened while I was looking at some screensavers. So far it hasn't happened when using the laptop for other activities.

I found some other people with the exact same problem as me on the internet:

https://askubuntu.com/questions/1004769/touchpad-works-slow-disabling-irq-16
https://bbs.archlinux.org/viewtopic.php?id=221463

Thanks.
Comment 7 Benjamin Tissoires 2018-07-19 08:13:51 UTC
It seems to be the same kind of issue we have in https://bugzilla.kernel.org/show_bug.cgi?id=198473

In the two links in the previous comment, we see:
cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3       
  16:     126908      27190     205425      53703  IR-IO-APIC   16-fasteoi   idma64.0, i801_smbus, i2c_designware.0
  82:          4          2         10          1  IR-IO-APIC   82-edge      SYNA7DB5:00

And I believe this is far too many interrupts for an I2C chip.
Mika, Hans, any ideas if the pinctrl is at fault here too? I would have expected c41eb2c7f93531b8, which is in v4.17, to fix that, but it seems that's not the case.
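
To put a number on "far too many", the per-CPU columns can be summed for each IRQ line. Below is a rough Python sketch that uses the output quoted above as sample input. The helper name `irq_totals` is made up for this illustration, and the parsing assumes the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ).

```python
def irq_totals(interrupts_text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = interrupts_text.strip().splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:n_cpus]
        if len(counts) < n_cpus or not all(c.isdigit() for c in counts):
            continue  # skip rows that do not carry one count per CPU
        totals[label.strip()] = (sum(map(int, counts)), " ".join(fields[n_cpus:]))
    return totals


# Sample input: the /proc/interrupts excerpt quoted in this comment.
SAMPLE = """\
            CPU0       CPU1       CPU2       CPU3
  16:     126908      27190     205425      53703  IR-IO-APIC   16-fasteoi   idma64.0, i801_smbus, i2c_designware.0
  82:          4          2         10          1  IR-IO-APIC   82-edge      SYNA7DB5:00
"""

for irq, (total, desc) in irq_totals(SAMPLE).items():
    print(irq, total, desc)
```

On the quoted sample this gives 413226 interrupts on the shared IRQ 16 line versus 17 on the SYNA7DB5:00 line.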
Comment 8 guimarcalsilva 2018-07-19 10:20:27 UTC
(In reply to Benjamin Tissoires from comment #7)
> It seems to be the same kind of issue we have in
> https://bugzilla.kernel.org/show_bug.cgi?id=198473

Well, in my case the touchpad works normally until it stops moving and starts clicking every time I move my finger.


> In the two links in the previous comment, we see:
> cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3       
>   16:     126908      27190     205425      53703  IR-IO-APIC   16-fasteoi   idma64.0, i801_smbus, i2c_designware.0
>   82:          4          2         10          1  IR-IO-APIC   82-edge      SYNA7DB5:00

I can't post my "cat /proc/interrupts" output right now, but if I remember correctly it looks exactly the same on my computer.
Comment 9 Mika Westerberg 2018-07-23 16:32:41 UTC
Looking at /proc/interrupts, the touchpad seems to use an APIC interrupt (not GPIO), so I don't think the pinctrl driver is involved in this at all.
Comment 10 Hans de Goede 2018-07-27 12:46:28 UTC
Hmm, the IOAPIC being used for the interrupt is weird. Can you attach an acpidump for the machine, please?
Comment 11 guimarcalsilva 2018-07-30 17:22:26 UTC
Created attachment 277611 [details]
My computer's interrupts
Comment 12 guimarcalsilva 2018-07-30 17:28:40 UTC
Created attachment 277613 [details]
ACPI

I don't know if I'm doing this correctly, but here's my acpidump (learned here: http://smackerelofopinion.blogspot.com/2009/10/dumping-acpi-tables-using-acpidump-and.html)
Comment 13 guimarcalsilva 2018-07-30 17:30:04 UTC
Created attachment 277615 [details]
Extracted

I guess all the files are here.
Comment 14 guimarcalsilva 2018-08-02 11:24:46 UTC
Created attachment 277663 [details]
All hardware log

Here's the SUSE hardware log. Interestingly, some parts say it's a PS/2 mouse, but if I remember correctly Mint doesn't show that.
Comment 15 guimarcalsilva 2018-08-02 13:05:20 UTC
Created attachment 277665 [details]
DMESG Again

The bug happened again; here's a new dmesg log. I don't know if this one is any different, but this time I was running the Unigine Valley GPU benchmark to confirm a suspicion I had, and I confirmed it:

Before that I used the computer about 3 times, for about 40 minutes each, without any problems. I'm certain the bug only happens when I'm stressing the GPU (possibly the CPU too). It has happened when using both OpenGL and Vulkan on Dolphin-emulator, when running the Valley benchmark, and when I was looking at some 3D screensavers. When I use the computer to browse the web or do any other simple task, the bug doesn't get triggered.
Comment 16 guimarcalsilva 2018-08-02 13:43:29 UTC
Created attachment 277667 [details]
Dmesg with Irqpool boot option is different.

Now I tried booting with the irqpoll option, and dmesg looks different after the bug happens. Before, it would show "CPU: 1 PID: 0 Comm: swapper/1 Not tainted"; now it shows "CPU: 1 PID: 3280 Comm: Audio thread -  Not tainted".

This part is also different:

Without irqpoll:

[ 1022.761949] RIP: 0010:cpuidle_enter_state+0xbc/0x2e0
[ 1022.761950] RSP: 0018:ffff97d000d23eb0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
[ 1022.761952] RAX: 0000000000000001 RBX: 000000ee215c3334 RCX: 000000000000001f
[ 1022.761953] RDX: 000000ee215c3334 RSI: 0000000000022740 RDI: 0000000000000000
[ 1022.761953] RBP: 0000000000000001 R08: 0000024532af73c8 R09: 0000000000000005
[ 1022.761954] R10: 00000000ffffffff R11: ffff89a8d14a1ea8 R12: ffffb7cfffc90870
[ 1022.761955] R13: ffffffff920d96d8 R14: 000000ee215c1668 R15: 0000000000000000
[ 1022.761960]  do_idle+0x21d/0x270
[ 1022.761962]  cpu_startup_entry+0x5f/0x70
[ 1022.761964]  start_secondary+0x1a0/0x1e0
[ 1022.761966]  secondary_startup_64+0xa5/0xb0

With irqpoll:

[ 1011.718836] RIP: 0033:0x7f1a9dcc7720
[ 1011.718837] RSP: 002b:00007f1a40de63c8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
[ 1011.718839] RAX: 0000000000000000 RBX: 00007f1a3c003640 RCX: 00007f1a3c000d58
[ 1011.718840] RDX: 0000000000000001 RSI: 00007f1a40de63d7 RDI: 000000000000002a
[ 1011.718841] RBP: 0000000000000400 R08: 00000000000000fb R09: 0000000000000000
[ 1011.718841] R10: 0000000000000fb0 R11: 0000000000000202 R12: 0000000000000000
[ 1011.718842] R13: 0000000000000000 R14: 00007f1a3c012cd0 R15: 0000000000000000


With or without irqpoll the bug still happens.
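
One thing the two register dumps above do show is what the CPU was running when the warning was printed. On x86-64 the value before the colon in the RIP line is the CS selector, and its low two bits are the privilege level: 0010 is kernel mode (here the idle loop in cpuidle_enter_state), while 0033 is user mode (here the "Audio thread" process). So the difference may only reflect which task happened to be interrupted, not necessarily a change in the failure itself. A small Python sketch of that check follows; the function name `interrupted_context` is made up for this illustration.

```python
import re

# Captures the CS selector from a line such as
# "[ 1022.761949] RIP: 0010:cpuidle_enter_state+0xbc/0x2e0"
RIP_LINE = re.compile(r"RIP: ([0-9a-f]{4}):")


def interrupted_context(rip_line):
    """Return 'kernel' or 'user' based on the CS selector in an x86-64 RIP line."""
    m = RIP_LINE.search(rip_line)
    if m is None:
        raise ValueError("no RIP line found")
    cs = int(m.group(1), 16)
    # The low two bits of CS hold the privilege level: 0 = kernel, 3 = user.
    return "kernel" if cs & 3 == 0 else "user"


# The two RIP lines from the dumps above.
print(interrupted_context("[ 1022.761949] RIP: 0010:cpuidle_enter_state+0xbc/0x2e0"))
print(interrupted_context("[ 1011.718836] RIP: 0033:0x7f1a9dcc7720"))
```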
Comment 17 guimarcalsilva 2018-08-02 15:51:00 UTC
If I execute "modprobe -r i2c_hid" and then "modprobe -f i2c_hid", the following shows up in dmesg (I tried several times; it doesn't work):

[ 7869.023842] i2c_designware i2c_designware.0: controller timed out
[ 7869.053169] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7869.053181] hid (null): reading report descriptor failed
[ 7869.053192] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7869.053322] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7870.845461] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7934.239998] i2c_designware i2c_designware.0: controller timed out
[ 7934.269769] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7934.269781] hid (null): reading report descriptor failed
[ 7934.269792] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7934.269992] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7935.837163] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7956.607886] i2c_designware i2c_designware.0: controller timed out
[ 7956.637430] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7956.637442] hid (null): reading report descriptor failed
[ 7956.637453] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7956.637662] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7957.853402] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7968.314580] i2c_hid: module_layout: kernel tainted.
[ 7968.314582] Disabling lock debugging due to kernel taint
[ 7968.314639] i2c_hid: module verification failed: signature and/or required key missing - tainting kernel
[ 7970.431926] i2c_designware i2c_designware.0: controller timed out
[ 7970.463044] i2c_designware i2c_designware.0: timeout in disabling adapter
[ 7970.463059] hid (null): reading report descriptor failed
[ 7970.463071] i2c_hid i2c-ELAN0501:00: can't add hid device: -5
[ 7970.463289] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5
[ 7971.870741] i2c_designware i2c_designware.0: timeout in disabling adapter
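
For reference, the "-5" in the probe failures above is a negated errno value, and errno 5 is EIO (I/O error), which fits the controller timeouts logged just before each failure. Here is a small Python sketch that pulls the probe failures out of dmesg text and names the error. The helper name `probe_failures` is made up for this illustration, and the sample line is copied from the log above.

```python
import errno
import re

# Matches driver-core messages such as
# "i2c_hid: probe of i2c-ELAN0501:00 failed with error -5"
PROBE_FAIL = re.compile(r"(\S+): probe of (\S+) failed with error (-?\d+)")


def probe_failures(dmesg_text):
    """Return (driver, device, code, errno name) for each failed probe."""
    results = []
    for m in PROBE_FAIL.finditer(dmesg_text):
        code = int(m.group(3))
        # Kernel return codes are negated errno values, so look up abs(code).
        name = errno.errorcode.get(abs(code), "unknown")
        results.append((m.group(1), m.group(2), code, name))
    return results


# Sample input: one line from the log in this comment.
sample = "[ 7869.053322] i2c_hid: probe of i2c-ELAN0501:00 failed with error -5"
print(probe_failures(sample))
```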
