With the update to Linux kernel 4.11.x on Arch Linux on my Dell XPS 9333, left and right click events no longer work. The touchpad itself works fine and tap-to-click works, but the physical left and right clicks don't do anything. This makes the system impractical to use for any length of time. Reverting to kernel 4.10.x brought back left/right clicking, so it seems to be kernel specific. Quite a few people are reporting this issue on the Arch Linux forums, and they all seem to have XPS 13 9333 laptops; at least one person with an XPS 13 9343 is NOT seeing the issue. I am guessing a device-specific regression is behind this bug.
It appears that on 4.11.x my laptop is trying to make use of hid_rmi to control the touchpad, whereas on kernels <4.11.x hid_multitouch was used (at least from what I can tell from lsmod).
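For anyone who wants to compare kernels the same way, here is a minimal sketch that reports which of the two modules mentioned above are loaded. It only reads /proc/modules, the same data lsmod shows. The function name is made up for illustration, and a loaded module is not proof that it is the driver actually bound to the touchpad.

```python
from pathlib import Path

# Module names as they appear in lsmod / /proc/modules (underscores, not hyphens).
TOUCHPAD_MODULES = ("hid_rmi", "hid_multitouch")


def loaded_touchpad_modules(proc_modules_text: str) -> list[str]:
    """Return which of the two HID touchpad modules appear in /proc/modules text."""
    # The first whitespace-separated field of each line is the module name.
    loaded = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in TOUCHPAD_MODULES if name in loaded]


if __name__ == "__main__":
    path = Path("/proc/modules")
    if path.exists():
        print(loaded_touchpad_modules(path.read_text()))
```

Running it once on 4.10.x and once on 4.11.x should show whether the set of loaded modules really changes between the two.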
I can also confirm this on a Dell 3542.
On kernel 4.10 my touchpad was identified as: 'DLL0651:00 06CB:2985'
With Synaptics Capabilities (306): 1, 0, 0, 1, 1, 1, 0
On kernel 4.11 my touchpad was identified as: Device 'Synaptics s3203'
With Synaptics Capabilities (307): 0, 0, 0, 1, 1, 1, 0
I think the regression first appeared in v4.11-rc1:
HID: rmi: Make hid-rmi a transport driver for synaptics-rmi4
HID: rmi: Handle all Synaptics touchpads using hid-rmi
HID: rmi: fallback to generic/multitouch if hid-rmi is not built
That's weird. "HID: rmi: Handle all Synaptics touchpads using hid-rmi" has been reverted in 84379d83d8e536aef2c and it has been shipped since v4.11-rc7. Looking at the code in v4.11.3 shows that the faulty commit has been reverted and you still should be using hid-multitouch. Are you using a vanilla v4.11.x or one shipped by your distribution?
4.11.3-1-ARCH #1 SMP PREEMPT Sun May 28 10:40:17 CEST 2017 x86_64 GNU/Linux
I didn't know it had been reverted. I was on openSUSE Tumbleweed until it reached 4.11 (when the regression started), then hopped to Ubuntu 17.10 thinking it was a distro bug, and the same thing happened on 4.11. (I tried v4.11-rc1 to see when the regression began, and it was that kernel.) I have since started using Antergos.
Created attachment 256879 [details] dmesg after cold boot on kernel 4.11.3
Created attachment 256885 [details] Xorg log after cold boot 4.11.3
I have the same issue on my Dell 9Q33 (similar innards to 9333). After bisecting, the issue does indeed appear after the introduction of hid_rmi. It seems that the f54 probe has a race condition that corrupts a future read, and when the probing tries to read the next function's data, it gets a garbled mess. I'm attaching a hacky patch as a workaround while I look into the root cause, can you try it on a 9333?
Created attachment 256895 [details] Workaround via delay
I confirm that the patch helps on 9333 too. I applied it to v4.11 and now my touchpad has clickfinger listed as click method in libinput-list-devices output and I can click by pressing the touchpad. This is on Dell XPS 13 model 9333.
Yes, it does sound like the F54 probe may be causing an issue here. There have been previous problems with the 9333 i2c bus. We even have a function called rmi_check_sanity(), added to hid-rmi, which works around instances where we get incorrect data. The F54 probe does a larger than normal read (27 bytes), which results in the response being reported in two consecutive HID reports. It's possible that these two reports are causing the issue. It looks like F54 only looks at the first 6 bytes, so we could reduce the size of the read to 6 bytes and get the response in a single report. That might be worth trying just to confirm that this is what is causing the issue. But if this system has issues with reading lots of data, we should probably just disable F54 for this device. F54 is used for diagnostics and tries to read hundreds of bytes when it is being used. We can add a flag to prevent it from loading.
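To make the report-count reasoning above concrete, here is a small sketch of the arithmetic. The per-report payload size is an assumption: the thread does not state the real value, and 16 bytes is only picked because it is consistent with a 27-byte read spilling into a second report. The names are hypothetical and this is not the hid-rmi code.

```python
import math

# Hypothetical payload capacity of one HID input report, in bytes.
# The real value is not given in this thread; 16 is an assumption that is
# merely consistent with a 27-byte read needing two reports.
ASSUMED_PAYLOAD_PER_REPORT = 16


def reports_needed(read_len: int, payload_per_report: int = ASSUMED_PAYLOAD_PER_REPORT) -> int:
    """Number of consecutive reports needed to carry read_len bytes."""
    if read_len <= 0 or payload_per_report <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(read_len / payload_per_report)


# Compare the current F54 query read (27 bytes) with the proposed one (6 bytes).
for length in (27, 6):
    print(length, "bytes ->", reports_needed(length), "report(s)")
```

Under that assumption the 6-byte read fits in a single report, which is the case the proposed change is meant to reach.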
Can confirm that both changing F54_QUERY_LEN to 6 and disabling F54 outright result in proper behavior on my 9Q33. Either would work as a fix.
I can also confirm that setting F54_QUERY_LEN to 6 'fixed' the issue on my XPS 9333. The touchpad sometimes works without any modifications (in about 1 out of 10 boots), so this seems to be a timing problem.
OK, thanks for testing. I will prepare a patch that limits the reads in F54 to 6 bytes.
I am having the same issues on a Dell Latitude E7270 running Fedora 25. Two finger scrolling also seems to drop off intermittently and have a delay as well. Could anyone tell me how to install the patch? I am not very familiar with Bugzilla.
The easiest way to fix this is to simply disable the RMI4_F54 option in your kernel config, no patch needed (though you must still rebuild it). As it stands, F54 won't work right on our machines without some more extensive quirks anyways. I haven't been experiencing the two-finger scrolling issue though, so YMMV.
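Before rebuilding, it may help to check whether the running kernel has the option enabled at all. This is a rough sketch: it assumes the config is exposed either as /proc/config.gz or as /boot/config-<release>, which depends on the distribution, and the helper names are made up for illustration.

```python
import gzip
import platform
import re
from pathlib import Path


def kconfig_state(config_text: str, option: str) -> str:
    """Return the value of CONFIG_<option> (e.g. 'y' or 'm'), or 'n' if unset or absent."""
    match = re.search(rf"^CONFIG_{re.escape(option)}=(\S+)$", config_text, re.MULTILINE)
    return match.group(1) if match else "n"


def read_running_config():
    """Best-effort read of the running kernel's config; returns None if not found."""
    proc = Path("/proc/config.gz")  # only present when the kernel exposes its config
    if proc.exists():
        return gzip.decompress(proc.read_bytes()).decode()
    boot = Path(f"/boot/config-{platform.release()}")  # common on some distributions
    if boot.exists():
        return boot.read_text()
    return None


if __name__ == "__main__":
    text = read_running_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        print("CONFIG_RMI4_F54 =", kconfig_state(text, "RMI4_F54"))
```

If it reports 'n', F54 is already out of the picture and the problem lies elsewhere.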
Created attachment 257071 [details] 4.11 s3203 Workaround F54_QUERY_LEN
@allen832008@gmail.com: You can apply this patch: https://bugzilla.kernel.org/attachment.cgi?id=257071&action=diff
with:
patch -p1 -i "${srcdir}/4.11_s3203_workaround.patch"
and rebuild your kernel afterwards.
I posted my version of the patch to the kernel mailing list: https://lkml.org/lkml/2017/6/20/1020
Any chance that this fix will make it into a 4.11.x release? I'd hate to wait till 4.12 to have a working touchpad
Problem fixed in 4.12.x, but now the touchpad is disabled after suspend (the touchscreen works fine). dmesg:
[ 226.363005] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 227.376382] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 228.389699] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 229.402955] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 230.416357] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 230.416363] rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed to read current IRQ mask.
[ 230.416377] dpm_run_callback(): i2c_hid_resume+0x0/0xf0 [i2c_hid] returns -11
[ 230.416385] PM: Device i2c-DLL060A:00 failed to resume async: error -11
[ 230.416478] PM: resume of devices complete after 5193.346 msecs
Don't know if it's a coincidence, though. Other people experienced it with 4.11.x as well: https://bugzilla.redhat.com/show_bug.cgi?id=1442699
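As a quick way to read the log above, the sketch below pulls the timestamps of the rmi_hid_read_block timeouts out of the excerpt and computes the gaps between them. The excerpt is copied from the dmesg output in this comment; the function name and regular expression are just for illustration.

```python
import re

# Timeout lines copied from the dmesg output in this report.
DMESG_EXCERPT = """\
[ 226.363005] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 227.376382] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 228.389699] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 229.402955] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
[ 230.416357] hid-rmi 0018:06CB:2734.0001: rmi_hid_read_block: timeout elapsed
"""

# "[ seconds.micros] message" as printed by dmesg.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def timeout_stamps(log: str) -> list[float]:
    """Timestamps (seconds since boot) of every read-block timeout line."""
    stamps = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m and "rmi_hid_read_block: timeout elapsed" in m.group(2):
            stamps.append(float(m.group(1)))
    return stamps


stamps = timeout_stamps(DMESG_EXCERPT)
gaps = [b - a for a, b in zip(stamps, stamps[1:])]
print(len(stamps), "timeouts, gaps in seconds:", [round(g, 3) for g in gaps])
```

The timeouts come roughly one second apart, and five of them in a row would account for most of the 5193.346 msecs the resume took, so the slow resume looks like the read being retried until it gives up.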