Bug 210895 - Touchpad touches are not read randomly for several seconds (Elantech)
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Input Devices
Hardware: Intel Linux
Importance: P1 normal
Assignee: drivers_input-devices
Reported: 2020-12-25 04:37 UTC by John
Modified: 2021-03-28 07:16 UTC

Kernel Version: 5.11.5
Regression: No


Attachments
dmesg capture right after incident, within a few seconds of fresh boot (67.85 KB, text/plain)
2020-12-29 06:42 UTC, John
# hexdump /dev/input/event9 | tee /tmp/hexdumpEvent9.txt (188.00 KB, text/plain)
2020-12-31 05:14 UTC, John

Description John 2020-12-25 04:37:39 UTC
OVERVIEW:
I'm having an issue where my touchpad works perfectly and then stops moving the cursor for several seconds at a time. Clicking still works, but anything tap-related is inoperable. When the problem arises, it happens several times in a row and is triggered when my finger lands on the pad. A single finger on the pad is enough to trigger it, but I cannot predict when it will happen or what makes the touchpad work normally again for a while. I have noticed, however, that the touchpad seems to resume normal operation if I stop using it for 5 seconds or so.

I recorded this phenomenon.  It can be seen here (Debug output on the screen is from # libinput debug-events):
https://youtu.be/dyyGGoo7c6M

I used libinput's debug tools to check what was happening there.  The debug output stops when the cursor stops responding, which indicates to me that the stream from the /dev/input/event* kernel device stops.  This is the bug that I am bringing forward with this bug report.

There are no error messages related to this in the dmesg output. It seems the kernel is completely unaware that input from the touchpad is apparently being ignored.
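
A minimal sketch of how the pauses could be measured from a raw capture of the event device, rather than judged by eye. It assumes a 64-bit little-endian system, where each evdev record is 24 bytes (two 64-bit timestamp fields followed by type, code and value); the function name and the 1-second threshold are illustrative choices, not part of any existing tool. Note that a gap is also expected whenever no finger is on the pad, so the result only means something for a capture taken while a finger stayed in contact.

```python
import struct

# struct input_event on a 64-bit little-endian kernel (assumption):
# tv_sec (8 bytes), tv_usec (8 bytes), type (u16), code (u16), value (s32).
EVENT_FORMAT = "<qqHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)  # 24 bytes


def find_gaps(raw, threshold=1.0):
    """Return (previous, next) timestamp pairs that are more than
    `threshold` seconds apart in a raw evdev byte capture."""
    gaps = []
    previous = None
    # Ignore a trailing partial record, e.g. from an interrupted capture.
    usable = len(raw) - len(raw) % EVENT_SIZE
    for offset in range(0, usable, EVENT_SIZE):
        sec, usec, _type, _code, _value = struct.unpack_from(
            EVENT_FORMAT, raw, offset)
        stamp = sec + usec / 1_000_000
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps
```

The function could be fed the bytes of a file captured from the device node (for example by redirecting /dev/input/event7 to a file as root; the capture file name is up to the user), and each returned pair marks one silent window in the stream.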

STEPS TO REPRODUCE:
It happens repeatedly, but not predictably.
1. Bring single finger to touchpad.
2. Drag finger.  No cursor movement will result for approx. 3 seconds.
3. Lift finger and repeat from step 1.  Result will repeat.

ACTUAL RESULT:
Cursor will not respond to touchpad inputs for several seconds.
Output stream of /dev/input/event* handle is still, even though finger is touching the pad.

EXPECTED RESULTS:
Cursor will follow finger movements for as long as the finger is on the touchpad. No noticeable hesitation will occur when the finger lands on the pad.
Kernel event info through the /dev/input/event* handle will be reported continuously while the finger is on the pad, even if the finger does not move.

BUILD DATE AND HARDWARE:
computer identifying info from dmesg:
kernel: DMI: ASUSTeK COMPUTER INC. VivoBook_ASUSLaptop X513IA_M513IA/X513IA, BIOS X513IA.303 07/08/2020

$ uname -a
Linux hostname 5.9.14-arch1-1 #1 SMP PREEMPT Sat, 12 Dec 2020 14:37:12 +0000 x86_64 GNU/Linux

# libinput list-devices
Device:           ELAN1300:00 04F3:3104 Touchpad
Kernel:           /dev/input/event7
Group:            7
Seat:             seat0, default
Size:             103x69mm
Capabilities:     pointer gesture
Tap-to-click:     disabled
Tap-and-drag:     enabled
Tap drag lock:    disabled
Left-handed:      disabled
Nat.scrolling:    disabled
Middle emulation: disabled
Calibration:      n/a
Scroll methods:   *two-finger edge 
Click methods:    *button-areas clickfinger 
Disable-w-typing: enabled
Accel profiles:   flat *adaptive
Rotation:         n/a

OTHER PLATFORMS:
No such issues with Windows 10 on the same hardware. I have therefore ruled out hardware failure.
Comment 1 sganes 2020-12-29 06:25:49 UTC
Hi,

Can you share the dmesg log?

Thanks,
Sganes
Comment 2 John 2020-12-29 06:42:53 UTC
Created attachment 294387 [details]
dmesg capture right after incident, within a few seconds of fresh boot
Comment 3 John 2020-12-29 07:21:28 UTC
I've been experimenting with loaded modules to see which ones are essential on my hardware.

Apologies if the following details are already well known, but I'm including them to be thorough.
The i2c_hid module is required on my hardware. No touchpad input works without it (which makes sense, since the touchpad is an I2C device).
hid_multitouch loads along with i2c_hid. The behavior described in the main bug report happens with these two HID modules loaded at a minimum.
I can unload hid_multitouch and load hid_generic, but with hid_generic the touchpad touches are interpreted as absolute positions and every tap is treated as a click. It is practically unusable, but it works nevertheless.
Comment 4 sganes 2020-12-29 08:06:06 UTC
Hi John,

Can you try the same experiment with the command below instead of libinput's debug tool?

 $ hexdump /dev/input/event7


If you get the same issue, can you check whether the following sysfs entries exist under the path below?

/sys/bus/serio/drivers/psmouse/serio/*
debug
paritycheck
crc_enabled

If they exist, can you disable all of these features using echo?

I suspect the issue may be due to debug log overhead.

echo 0 > /sys/bus/serio/drivers/psmouse/serio/debug
echo OFF > /sys/bus/serio/drivers/psmouse/serio/paritycheck
echo 0 > /sys/bus/serio/drivers/psmouse/serio/crc_enabled

Then repeat the same experiment.

Thanks,
Sganes
Comment 5 John 2020-12-30 05:00:38 UTC
(In reply to sganes from comment #4)

What is the experiment to which you are referring?  The one where I am watching the output of "libinput debug-events"?
I will record the output when the error happens and hopefully trim the file to a relevant window of time.

As for the sysfs entries, I do not have /sys/bus/serio/drivers/psmouse/serio/*
The path on my computer ends at /sys/bus/serio/drivers/; there is no psmouse folder. Instead, I have serio_raw and atkbd, and I don't see the debug, paritycheck, or crc_enabled files in either of those folders.
Comment 6 John 2020-12-31 05:14:13 UTC
Created attachment 294439 [details]
# hexdump /dev/input/event9 | tee /tmp/hexdumpEvent9.txt

The hexdump output.

I piped the output to tee so I could watch the hexdump output in the terminal while it was being written to the file.

I started the recording as soon as I felt the hesitation. I captured 3 or 4 moments where the hexdump output was silent while my finger was on the pad. As soon as I judged that the touchpad was likely to behave for another minute or two, I stopped the recording.
Comment 7 André Menrath 2021-01-03 08:48:00 UTC
I can reproduce this bug on slightly different hardware. Note that we both have an ELAN touchpad and an AMD Ryzen 4000-series CPU.

# inxi -Fz
Machine:   Type: Laptop System: LENOVO product: 82A2 v: Yoga Slim 7 14ARE05 serial: <filter> 
           Mobo: LENOVO model: LNVNB161216 v: SDK0J40709
           UEFI: LENOVO v: DMCN34WW date: 08/17/2020 
CPU:       Info: 8-Core model: AMD Ryzen 7 4800U with Radeon Graphics bits: 64 

$ uname -a
Linux yoga 5.10.2-2-MANJARO #1 SMP PREEMPT Tue Dec 22 08:14:42 UTC 2020 x86_64 GNU/Linux

# libinput list-devices
Device:           ELAN0634:00 04F3:3124 Touchpad
Kernel:           /dev/input/event8
Group:            7
Seat:             seat0, default
Size:             100x61mm
Capabilities:     pointer gesture
Tap-to-click:     disabled
Tap-and-drag:     enabled
Tap drag lock:    disabled
Left-handed:      disabled
Nat.scrolling:    disabled
Middle emulation: disabled
Calibration:      n/a
Scroll methods:   *two-finger edge 
Click methods:    *button-areas clickfinger 
Disable-w-typing: enabled
Accel profiles:   flat *adaptive
Rotation:         n/a

One additional observation:
When the freeze happens, it lasts at least as long as I keep at least one finger in contact with the touchpad. It may only resolve when I lift all fingers, and most of the time lifting all fingers does end the freeze.

My dmesg dumps etc look similar to those provided by John.
Comment 8 André Menrath 2021-01-04 19:39:12 UTC
I just came across https://www.spinics.net/lists/platform-driver-x86/msg24007.html

Jiaxun Yang writes:
"[...] ELAN0634 touchpad do not use EC to switch touchpad.
Reading VPCCMD_R_TOUCHPAD will return zero thus touchpad may be blocked
unexpectedly.
Writing VPCCMD_W_TOUCHPAD may cause a spurious key press."

Maybe this could be connected with our issue of a "suddenly randomly blocked" touchpad.
Comment 9 John 2021-01-07 07:47:58 UTC
It sure sounds related. Grepping the kernel source shows that VPCCMD_R_TOUCHPAD is only defined in ideapad-laptop.c, so it is not directly applicable here. Still, investigating how it is used there could provide good insight into what might be happening here, or what needs to change.

I threw some debug messages in the i2c-hid module to learn how it's put together and this is what I got:

[15945.561802] i2c_hid: entered function i2c_hid_irq
[15945.561803] i2c_hid: calling i2c_hid_get_input from i2c_hid_irq.
[15945.561803] i2c_hid: entered i2c_hid_get_input
[15945.562405] i2c_hid: return value after calling i2c_master_recv: 16
[15945.562420] ret_size: 16
[15945.562421] i2c_hid i2c-ELAN1300:00: input: 10 00 04 03 0e 06 64 04 38 48 01 80 10 34 00 00
[15945.568310] i2c_hid: entered function i2c_hid_irq
[15945.568311] i2c_hid: calling i2c_hid_get_input from i2c_hid_irq.
[15945.568312] i2c_hid: entered i2c_hid_get_input
[15945.568913] i2c_hid: return value after calling i2c_master_recv: 16
[15945.568914] ret_size: 16
[15945.568917] i2c_hid i2c-ELAN1300:00: input: 10 00 04 03 0e 06 63 04 7e 48 01 80 0e 34 00 00

When the module is first loaded into the kernel, i2c_hid_probe is called. Among other things, it calls i2c_hid_init_irq, which calls request_threaded_irq(), which registers i2c_hid_irq() as the interrupt thread function.

The debug output shows that this is happening: touch events trigger direct calls to i2c_hid_irq, which spits out the 16-byte data that is piped to /dev/input/event*.

This is where my digging has stopped. I haven't yet discovered what triggers the interrupt, or why touch events randomly fail to trigger interrupts for seconds at a time.
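
A minimal sketch of how the debug lines above could be turned into numbers, assuming the locally added message "i2c_hid: entered function i2c_hid_irq" is present in the log (it is a custom debug print, not an upstream kernel message). It extracts the dmesg timestamps of those lines and reports the spacing between consecutive interrupt entries; the function names and the 1-second threshold are illustrative.

```python
import re

# Matches only the custom "entered function" debug line, capturing the
# dmesg timestamp in seconds.
IRQ_LINE = re.compile(
    r"^\[\s*(\d+\.\d+)\] i2c_hid: entered function i2c_hid_irq")


def irq_intervals(log_text):
    """Return seconds elapsed between consecutive i2c_hid_irq entries."""
    stamps = []
    for line in log_text.splitlines():
        match = IRQ_LINE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


def stalls(log_text, threshold=1.0):
    """Return only the intervals longer than `threshold` seconds."""
    return [gap for gap in irq_intervals(log_text) if gap > threshold]
```

For the two interrupt entries in the excerpt above the spacing is roughly 6.5 ms. A stall of several seconds while a finger is on the pad would show up as an entry in the list returned by stalls().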
Comment 10 André Menrath 2021-01-07 21:46:45 UTC
Maybe we are looking at the wrong end. Do you also have an i8042 PS/2 controller?

I stumbled upon this in the Gentoo Wiki:
https://wiki.gentoo.org/wiki/Synaptics#Troubleshooting

I also have this line in dmesg, even though my touchpad is recognized:
i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp

I will have to test this further until I can say for sure if this is related to the bug.
Comment 11 John 2021-01-08 03:32:24 UTC
I do have that line as well:
i8042: PNP: PS/2 Controller [PNP030b:PS2K] at 0x60,0x64 irq 1
i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
serio: i8042 KBD port at 0x60,0x64 irq 1
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input3

It looks like the PS/2 device is the keyboard, not the touchpad, so it doesn't seem related to me.
Comment 12 John 2021-03-18 05:45:12 UTC
After some digging in the i2c-hid module code (i2c-hid-core.c), I found that i2c_hid_irq() is the function registered through request_threaded_irq() to handle the interrupts. On my computer, client->irq = 76, which corresponds to the entry in /sys/kernel/irq/76. The hwirq mapped to the kernel-space IRQ (76 in this case) is consistently 16.

Here's the progression of data as far as I've been able to determine:
Hardware interrupt (hwirq) 16 --> maps to kernel IRQ 76 --> which is linked to client->name ($ cat /sys/kernel/irq/76/actions) --> which triggers a call to i2c_hid_irq() in i2c-hid-core.c --> which calls i2c_hid_get_input() in i2c-hid-core.c --> which calls i2c_master_recv() --> which returns the 16-byte buffer of data that libinput can use.

Using i2c-hid with debug enabled, I can see when i2c_hid_irq is called. When the cursor no longer responds to touchpad input, this function is no longer being called. The output of "hexdump /dev/input/event*" (and of hid-recorder) pauses as well, which indicates the same thing.

Since libinput is receiving no buffer data from the kernel, its output stops because there is no incoming data stream to process. So it appears that the hardware interrupt is not reaching the registered function in i2c-hid-core.c, and therefore no input events get processed.

So it looks like this issue could be in AMD's interrupt handler?
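
A minimal sketch of one way to check this from user space, assuming the touchpad uses kernel IRQ 76 as on this machine: sum the per-CPU counters for that IRQ in /proc/interrupts before and after a touch. The helper names are illustrative. If the counter does not advance while a finger is moving on the pad during a freeze, that would be consistent with the interrupt never being delivered to the kernel, although it would not show whether the touchpad failed to raise it or the interrupt controller dropped it.

```python
def irq_count(interrupts_text, irq):
    """Sum the per-CPU counters for one IRQ in /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    if not lines:
        raise KeyError(f"IRQ {irq} not found")
    # The header row lists one column per CPU: CPU0 CPU1 ...
    cpu_columns = len(lines[0].split())
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == f"{irq}:":
            return sum(int(n) for n in fields[1:1 + cpu_columns])
    raise KeyError(f"IRQ {irq} not found")


def counter_advanced(before, after, irq):
    """True if the IRQ fired at least once between the two snapshots."""
    return irq_count(after, irq) > irq_count(before, irq)
```

In practice the two snapshots would come from reading /proc/interrupts twice, for example with open("/proc/interrupts").read(), once at the start of a freeze and once a second or two later with the finger still moving.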
Comment 13 Michael 2021-03-28 07:16:43 UTC
I can confirm this issue:
ASUS TUF Gaming FX505DT-BQ455 (AMD Ryzen 5 3550H)

Reloading the module solves the issue until the next reboot:
$ modprobe -r i2c_hid
$ modprobe i2c_hid
