Bug 208171

Summary: lenovo ideapad 5 14ARE05 touchpad randomly failing
Product: Drivers
Reporter: Ole Petersen (peteole2707)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: REOPENED
Severity: normal
CC: andri, bjorn.forsman, bjorn, dev, fgrieco, gouge.tristan, jml86khakons, kxra, louismichel, marius.andreiana, matteo.mazzarelli, norbertrom01, peteole2707, radupantiru, tim, tinozzo123, voqelfrei
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.7
Subsystem:
Regression: No
Bisected commit-id:
Attachments: acpidump.txt, dmesg-patched, attachment-26413-0.html

Description Ole Petersen 2020-06-14 22:21:02 UTC
On my Lenovo IdeaPad 5 14ARE05, whether the touchpad works seems to be decided at random at boot, and the state stays the same until the next boot. Unlike the 15-inch version, it has the following touchpad:
Device 'MSFT0004:00 06CB:CD98 Touchpad':
When it does not work, I get the following kernel error every few seconds:
i2c_designware AMDI0010:00: controller timed out
xinput still lists the touchpad even when it is not working.
However, the behavior is not totally random. Chances are much better if I shut the computer down and then power it on after a few seconds compared to hitting the reboot button.
I am using Manjaro Linux, but I also tried Ubuntu and Fedora; the behavior is the same on all of them, as well as on the 5.6 kernel.
I have been wondering whether it is possible to completely restart the touchpad until it works, but I could not find out which component needs to be restarted for that or how to do it. This would be a great way to work around all kinds of hardware bugs.
Another idea would be to find out what puts the touchpad into the working or non-working state. I would be grateful for any ideas.
Comment 1 Louis-Michel Raynauld 2020-06-27 07:47:57 UTC
I have the same touchpad issues with the same laptop IdeaPad 5 14ARE05

I added the kernel boot parameters:
"dyndbg=file drivers/input/* +pt" i8042.debug=1 i8042.nopnp=1 log_buf_len=32M

Then I could get the following logs:
On kernel 5.7.6:
(no touchpad at all)
i2c_hid i2c-MSFT0004:00: HID over i2c has not been provided an Int IRQ
i2c_hid: probe of i2c-MSFT0004:00 failed with error -22

On kernel 5.8-rc2 mainline:
(touchpad acting limited)
i2c_hid i2c-MSFT0004:00: supply vdd not found, using dummy regulator
i2c_hid i2c-MSFT0004:00: supply vddl not found, using dummy regulator
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.
input: MSFT0004:00 06CB:CD98 Mouse as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input13
input: MSFT0004:00 06CB:CD98 Touchpad as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input14
hid-generic 0018:06CB:CD98.0001: input,hidraw0: I2C HID v1.00 Mouse [MSFT0004:00 06CB:CD98] on i2c-MSFT0004:00
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.
input: MSFT0004:00 06CB:CD98 Mouse as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input16
input: MSFT0004:00 06CB:CD98 Touchpad as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input17
hid-multitouch 0018:06CB:CD98.0001: input,hidraw0: I2C HID v1.00 Mouse [MSFT0004:00 06CB:CD98] on i2c-MSFT0004:00

I hope that helps us all find a fix for this issue.
Comment 2 Ole Petersen 2020-06-27 12:08:57 UTC
Thanks! I also realized that the touchpad is more likely to work if it worked on the previous boot. I believe the kernel does strange things with it when it spams me with "controller timed out" messages on shutdown. I wrote a script that prevents these shutdown issues by unloading some modules and then reloading them. Unfortunately, it only helps for the next boot; the touchpad is never even recognized after running it.


sudo modprobe -r i2c_hid 
sudo modprobe -r i2c_piix4
sudo modprobe -r hid_multitouch
sudo modprobe i2c_hid 
sudo modprobe i2c_piix4
sudo modprobe hid_multitouch

It helps to break "bad boot streaks"
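
For anyone who wants to try the same reload, here is a minimal sketch of the script above as a function with a dry-run switch. The names `run`, `reload_touchpad_modules` and `DRY_RUN` are made up for this example. Unlike the original commands, it stops at the first modprobe that fails, and it assumes it is run as root instead of calling sudo.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload the three modules named in the comment above, in the same order,
# then load them again. Stops at the first command that fails.
reload_touchpad_modules() {
    modules="i2c_hid i2c_piix4 hid_multitouch"
    for m in $modules; do
        run modprobe -r "$m" || return 1
    done
    for m in $modules; do
        run modprobe "$m" || return 1
    done
}
```

Setting `DRY_RUN=1` before calling `reload_touchpad_modules` only prints the six modprobe commands, which is a safe way to check the order first.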
Comment 3 Louis-Michel Raynauld 2020-06-28 06:56:38 UTC
Interesting find indeed. After running
sudo modprobe -r i2c_hid
sudo modprobe i2c_hid
the "controller timed out" messages stop. Although that does not fix the touchpad, it does suggest that this module is at fault.

After further digging, I got another set of logs after booting mainline (appended below). Basically, the following shows up first, before the "timed out" spam starts:

i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
i2c_designware AMDI0010:00: controller timed out
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.

Now, I found these 3 exact log messages in this commit to fix another DELL Win8 touchpad in 2015:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.8-rc2&id=6d4f5440a3a2bb2e9d0d582bbf98234e9e9bb095

Maybe something learned from this commit can lead us to a fix for the MSFT touchpad.

Here is my latest log on mainline:
---------------
i2c_hid i2c-MSFT0004:00: supply vdd not found, using dummy regulator
i2c_hid i2c-MSFT0004:00: supply vddl not found, using dummy regulator
i2c_designware AMDI0010:00: i2c_dw_handle_tx_abort: lost arbitration
i2c_designware AMDI0010:00: controller timed out
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.
input: MSFT0004:00 06CB:CD98 Mouse as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input13
input: MSFT0004:00 06CB:CD98 Touchpad as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input14
hid-generic 0018:06CB:CD98.0001: input,hidraw0: I2C HID v1.00 Mouse [MSFT0004:00 06CB:CD98] on i2c-MSFT0004:00
i2c_designware AMDI0010:00: controller timed out
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.
i2c_designware AMDI0010:00: controller timed out
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.
i2c_designware AMDI0010:00: controller timed out
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.
i2c_designware AMDI0010:00: controller timed out
i2c_designware AMDI0010:00: controller timed out
i2c_hid i2c-MSFT0004:00: failed to retrieve report from device.
input: MSFT0004:00 06CB:CD98 Mouse as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input17
input: MSFT0004:00 06CB:CD98 Touchpad as /devices/platform/AMDI0010:00/i2c-0/i2c-MSFT0004:00/0018:06CB:CD98.0001/input/input18
hid-multitouch 0018:06CB:CD98.0001: input,hidraw0: I2C HID v1.00 Mouse [MSFT0004:00 06CB:CD98] on i2c-MSFT0004:00
i2c_designware AMDI0010:00: controller timed out
i2c_designware AMDI0010:00: controller timed out
i2c_hid i2c-MSFT0004:00: failed to set a report to device.
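
To compare logs from other machines against the pattern described above, a small sketch like the following could report which of the two i2c_designware messages shows up first. The function name `lost_arbitration_first` is made up for this example, and the match strings are simply the messages quoted in this comment.

```shell
#!/bin/sh
# Read a kernel log on stdin and print:
#   "yes"  if a "lost arbitration" line appears before any "controller timed out"
#   "no"   if a "controller timed out" line appears first
#   "none" if neither message is present
lost_arbitration_first() {
    awk '
        /i2c_dw_handle_tx_abort: lost arbitration/ { print "yes"; found = 1; exit }
        /controller timed out/                     { print "no";  found = 1; exit }
        END { if (!found) print "none" }
    '
}
```

It can be fed from `dmesg` or `journalctl -k` through a pipe, for example `dmesg | lost_arbitration_first`.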
Comment 4 Ole Petersen 2020-07-11 11:19:32 UTC
Interestingly, there are phases lasting a few days where it works perfectly. I am wondering which events trigger these phases. Maybe some Windows update that makes it go into another shutdown state? Or some updates deep in the kernel? It once changed after trying to install a DKMS module to get VirtualBox running...
Comment 5 Tim Richardson 2020-07-11 12:02:12 UTC
Created attachment 290225 [details]
attachment-6042-0.html

In my case, the laptop is my son's. Tonight, I gave up on Manjaro and
installed Ubuntu. The touchpad didn't work. I booted into Windows, and it also
didn't work there (which was new; my son, who has been using it for a week and
a half, had not seen this before). It was not visible in Device Manager, and
scanning for new hardware did not reveal anything. I could not find any
Lenovo drivers to download. I rebooted a few times. No touchpad in
Windows. But there was a BIOS update (1.06 I think, from July 8; the change
note says something about SI03 fixes). I applied that. Secure Boot was
re-enabled, and the trackpad worked again in Windows. But still no good in
Ubuntu. This is with mainline 5.7.8 and 5.8-rc4. The BIOS update readme
said nothing about the touchpad, but it looks like flashing the BIOS fixed the
problem (in Windows).

Comment 6 Tim Richardson 2020-07-13 11:25:17 UTC
I think that the touchpad can be reliably reset to a working state, in either Windows or Linux, by shutting down with AC power disconnected. In other circumstances, a restart can still leave the laptop with a non-working touchpad, including in Windows.
Comment 7 Alex Maras 2020-08-01 07:44:50 UTC
I have the same issue and am happy to help with logs or debugging if needed. With either 5.7 or 5.8-rc7, and with no dual-boot at all (Linux only), the trackpad will sometimes fail to come up on boot.

The state will sometimes change when coming out of hibernation as well, without a full shutdown/reboot. Today, the trackpad wasn't working. After hibernating, waiting a while, then starting up again, it returned from hibernation successfully with a working trackpad.
Comment 8 kxra 2020-08-04 18:50:12 UTC
Looping in some other discussion:
* https://bugs.launchpad.net/linux/+bug/1884981
Comment 9 Ole Petersen 2020-09-08 06:40:18 UTC
I just updated to 5.9-rc1 and it seems to have fixed the issue. I have not had a single failure since the update. However, I cannot be sure because of the random behavior.
Comment 10 Tim Richardson 2020-09-08 08:12:17 UTC
We've been using 5.9-rc since release, now on rc4. The problem is fixed.
Comment 11 matteo.mazzarelli 2020-10-14 13:08:11 UTC
Note that the problem hasn't been fully fixed. It still persists when resuming from S3 deep sleep (suspend): it's random whether the touchpad works after resume. I can, however, confirm that with 5.9-rc6 the touchpad always works as expected on a fresh start or on a reboot.
Comment 12 Marius 2021-03-12 20:56:40 UTC
Same issue on lenovo ideapad 5 14ARE05
Linux fedora 5.11.3-300.fc34.x86_64 #1 SMP 

# dmesg | grep i2c
[    1.569895] i2c_hid i2c-MSFT0004:00: HID over i2c has not been provided an Int IRQ
[    1.569930] i2c_hid: probe of i2c-MSFT0004:00 failed with error -22
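
So far this thread shows two different failure signatures: the probe failure above (no IRQ, error -22) and the "controller timed out" spam from the earlier comments. As a rough sketch, a log could be sorted into one of the two like this; the function name and the output labels are made up for this example.

```shell
#!/bin/sh
# Read a kernel log on stdin and print which of the two failure signatures
# reported in this thread it contains. The probe failure is checked first.
touchpad_failure_mode() {
    log=$(cat)
    case $log in
        *'HID over i2c has not been provided an Int IRQ'*)
            echo "probe-failed-no-irq" ;;
        *'controller timed out'*)
            echo "controller-timeout" ;;
        *)
            echo "no-known-error" ;;
    esac
}
```

For example, `dmesg | touchpad_failure_mode` (reading dmesg may need root, depending on the distribution).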
Comment 13 tinozzo123 2022-05-17 17:28:20 UTC
I have the same laptop, and while on Linux 5.17 I never witnessed the touchpad fail "randomly", I found a consistent way to make it fail: suspend the computer, close the lid, and wake it up while the lid is closed (a way to do so is by plugging/unplugging the charger).
Et voilà, the touchpad won't work, but it will work again by suspending the computer another time and waking it up with the lid open.
Comment 14 Bjorn Helgaas 2022-12-02 21:10:10 UTC
I think this is caused by the PCI core not assigning space for a device leading to the touchpad, the same as bug 216565.

You probably see a line like this in the dmesg log:

  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]

On Ideapads, it should be fixed by https://git.kernel.org/linus/d341838d776a ("x86/PCI: Disable E820 reserved region clipping via quirks"), which appeared in v5.19.

The patch at https://bugzilla.kernel.org/attachment.cgi?id=303237 should fix other machines.  If you test this patch, I'd appreciate a note about whether it works and what machine it was.
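
A quick way to look for the line mentioned above is a filter like the following sketch. The function name is made up for this example, and the pattern only matches the message format quoted in this comment; other kernel versions may word it differently.

```shell
#!/bin/sh
# Read a kernel log on stdin and print every line where the PCI core reports
# that it found no space for a BAR, in the format quoted above. Prints nothing
# (and still returns success) if there is no such line.
pci_unassigned_bars() {
    grep -E 'pci [0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]: BAR [0-9]+: no space for' || true
}
```

For example, `dmesg | pci_unassigned_bars`.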
Comment 15 tinozzo123 2023-03-22 19:11:01 UTC
I wrote before that on my Lenovo IdeaPad 5 14ARE05 I never got my touchpad failing randomly.
I did write however that by plugging/unplugging the charger the laptop would wake up from suspend, and if done while the lid was closed, then the touchpad would stop working until you suspend and wake up the laptop again.
On kernel 6.2, though, plugging/unplugging the charger doesn't wake up the laptop anymore (or, to be more precise, it does wake it up, but then it puts it back to sleep immediately after), meaning I don't experience this bug in any form anymore.
Comment 16 Bjorn Helgaas 2023-03-22 20:35:52 UTC
(In reply to tinozzo123 from comment #15)
> ... meaning I don't experience this bug in
> any form anymore.

Thanks very much for testing this out and reporting the results!  Please don't hesitate to report any other issues you trip over!

I'm going to close this issue as "resolved" by these commits:

  https://git.kernel.org/linus/d341838d776a ("x86/PCI: Disable E820 reserved region clipping via quirks") (appeared in v5.19)
  https://git.kernel.org/linus/07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map") (appeared in v6.2)

If anybody still sees this issue with v6.2 or later, please re-open this and attach the complete dmesg log when booted with "efi=debug".
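
Before re-opening, it may help to confirm both conditions from this comment: the running kernel is v6.2 or later, and "efi=debug" is actually on the kernel command line. A minimal sketch, with made-up function names, assuming a release string that starts with `major.minor` (an -rc kernel counts as its base version here):

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds if RELEASE (e.g. "6.5.9-arch2-1") is at least MAJOR.MINOR.
kernel_at_least() {
    ver=${1%%-*}        # drop everything from the first "-" on
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# cmdline_has PARAM CMDLINE
# Succeeds if PARAM appears as a whole word in the command line text.
cmdline_has() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage would be something like `kernel_at_least "$(uname -r)" 6 2 && cmdline_has efi=debug "$(cat /proc/cmdline)" && dmesg > dmesg.txt`.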
Comment 17 radupantiru 2023-11-17 18:23:22 UTC
I am having the same issue. I will enable "efi=debug" to hopefully capture more logs.

[root@r-laptop ~]# uname -a
Linux r-laptop 6.5.9-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 26 Oct 2023 00:52:20 +0000 x86_64 GNU/Linux

Nov 17 18:11:24 r-laptop kernel: i2c_designware AMDI0010:00: controller timed out
Nov 17 18:11:24 r-laptop systemd[1]: Using hardware watchdog 'SP5100 TCO timer', version 0, device /dev/watchdog0
Nov 17 18:11:24 r-laptop systemd[1]: Watchdog running with a timeout of 10min.
Nov 17 18:11:24 r-laptop kernel: watchdog: watchdog0: watchdog did not stop!
Comment 18 Bjorn Helgaas 2023-11-17 19:02:01 UTC
radupantiru@: could you also capture the "acpidump" output, please?

We have a couple other reports like https://bugzilla.kernel.org/show_bug.cgi?id=218050 that seem related to 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map").
Comment 19 radupantiru 2023-11-17 19:12:36 UTC
Created attachment 305416 [details]
acpidump.txt

There you are and thanks for looking into this!

On Fri, Nov 17, 2023 at 7:02 PM <bugzilla-daemon@kernel.org> wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=208171
>
> Bjorn Helgaas (bjorn@helgaas.com) changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|RESOLVED                    |REOPENED
>          Resolution|CODE_FIX                    |---
Comment 20 Bjorn Helgaas 2023-11-20 23:21:11 UTC
radupantiru@: Your MCFG says there's ECAM space at [mem 0xf8000000-0xfbffffff] for [bus 00-3f].  That space *should* be reserved via a PNP0C01 or PNP0C02 device in the ACPI namespace, but it isn't, so this is likely a BIOS defect.

Can you try the patch I attached here?  https://bugzilla.kernel.org/show_bug.cgi?id=218050#c6 .  If you can, please attach the dmesg log here.
Comment 21 radupantiru 2023-11-23 06:38:00 UTC
Created attachment 305462 [details]
dmesg-patched

Hello,

This is the dmesg for the patched kernel.
6.5.9-arch2-1-custom #1 SMP PREEMPT_DYNAMIC Tue, 21 Nov 2023 17:01:41 +0000
x86_64 GNU/Linux

Thank you!

Comment 22 Bjorn Helgaas 2023-11-24 03:38:20 UTC
(In reply to radupantiru from comment #21)
Thanks very much for testing this!  The dmesg shows what I expected:

  PCI: [Firmware Info]: ECAM at [mem 0xf8000000-0xfbffffff] not reserved in ACPI motherboard resources
  PCI: ECAM at [mem 0xf8000000-0xfbffffff] reserved as EfiMemoryMappedIO
  PCI: ECAM [mem 0xf8000000-0xfbffffff] reserved to work around lack of ACPI motherboard _CRS

Just to double-check, I think you were seeing a touchpad issue, right?  And I assume that problem is fixed by this patch?
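
For others testing the same patch, a sketch like the following could pull the affected ECAM range out of a dmesg log. The function name is made up for this example, and it only understands the "[Firmware Info]" message format quoted above.

```shell
#!/bin/sh
# Read a kernel log on stdin and print the address range of every ECAM region
# reported as "not reserved in ACPI motherboard resources".
ecam_not_reserved_ranges() {
    sed -n 's/.*ECAM at \[mem \(0x[0-9a-f]*-0x[0-9a-f]*\)] not reserved in ACPI motherboard resources.*/\1/p'
}
```

With the dmesg from comment 21 this should print the range shown in the first quoted line; an empty result means the message is not in the log.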
Comment 23 radupantiru 2023-11-24 08:49:21 UTC
Created attachment 305467 [details]
attachment-26413-0.html

It was indeed the touchpad that failed to work at random intervals. It
will be quite obvious if it fails again. My question is whether the patch
will make its way into future kernels, since until then I will have to
avoid upgrading my OS in order to keep monitoring the fix.

Thanks,
Radu