Bug 200737 - i801: driver SMBus access protection breaks touchpad on ThinkPad T560
Summary: i801: driver SMBus access protection breaks touchpad on ThinkPad T560
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: I2C (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: Drivers/I2C virtual user
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-08-05 21:51 UTC by Yussuf Khalil
Modified: 2018-11-12 13:53 UTC (History)
4 users (show)

See Also:
Kernel Version: 4.18-rc7
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
acpidump output (556.08 KB, text/plain)
2018-08-05 21:51 UTC, Yussuf Khalil
Details
Uninstall OpRegion handler on suspend (2.06 KB, patch)
2018-08-17 08:36 UTC, Mika Westerberg
Details | Diff
dmesg output with patch applied (642.59 KB, text/plain)
2018-08-19 19:37 UTC, Yussuf Khalil
Details
Allow AML OpRegion access during suspend (1.48 KB, patch)
2018-08-20 11:21 UTC, Mika Westerberg
Details | Diff
dmesg output with "Allow AML OpRegion access during suspend" applied (83.85 KB, text/plain)
2018-08-20 19:34 UTC, Yussuf Khalil
Details
dmidecode output (15.95 KB, text/plain)
2018-08-21 10:06 UTC, Yussuf Khalil
Details
Allow IO access for non-SMBus ports (1.34 KB, patch)
2018-08-21 14:32 UTC, Mika Westerberg
Details | Diff
dmesg output with "Allow IO access for non-SMBus ports" applied (83.57 KB, text/plain)
2018-08-21 18:12 UTC, Yussuf Khalil
Details

Description Yussuf Khalil 2018-08-05 21:51:09 UTC
Created attachment 277691 [details]
acpidump output

The ThinkPad T560 uses a Synaptics touchpad that supports both PS/2 and the modern RMI4 protocol using SMBus, however, enabling RMI4 causes the touchpad to completely stop working after resuming from suspend. I've found that the root cause seems to lie in i801, which prohibits the RMI4 driver from accessing the touchpad as the ACPI code touches it shortly before entering suspend.

The following snippet from dmesg demonstrates the issue (I've added a call to dump_stack() to i801_acpi_io_handler):
[   71.757981] PM: suspend entry (deep)
[   71.757983] PM: Syncing filesystems ... done.
[   71.871368] Freezing user space processes ... (elapsed 0.003 seconds) done.
[   71.874975] OOM killer disabled.
[   71.874975] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   71.876147] Suspending console(s) (use no_console_suspend to debug)
[   71.917145] sd 1:0:0:0: [sda] Synchronizing SCSI cache
[   71.918676] sd 1:0:0:0: [sda] Stopping disk
[   71.979574] e1000e: EEE TX LPI TIMER: 00000011
[   71.982200] CPU: 0 PID: 179 Comm: kworker/0:3 Tainted: G     U            4.18.0-rc7+ #3
[   71.982203] Hardware name: LENOVO 20FH002RGE/20FH002RGE, BIOS N1KET39W (1.26 ) 05/28/2018
[   71.982214] Workqueue: kacpid acpi_os_execute_deferred
[   71.982218] Call Trace:
[   71.982232]  dump_stack+0x5c/0x80
[   71.982243]  i801_acpi_io_handler.cold.15+0x5/0x40 [i2c_i801]
[   71.982251]  acpi_ev_address_space_dispatch+0x2de/0x380
[   71.982257]  ? i801_suspend+0x30/0x30 [i2c_i801]
[   71.982264]  acpi_ex_access_region+0x449/0x4e8
[   71.982270]  ? acpi_ex_prep_field_value+0x4ce/0x4ce
[   71.982277]  acpi_ex_field_datum_io+0x181/0x41a
[   71.982283]  acpi_ex_write_with_update_rule+0xab/0x1fd
[   71.982288]  ? acpi_ns_search_and_enter+0x39a/0x4e3
[   71.982294]  acpi_ex_insert_into_field+0x241/0x29e
[   71.982300]  acpi_ex_write_data_to_field+0x464/0x49f
[   71.982308]  acpi_ex_store_object_to_node+0x2cf/0x324
[   71.982317]  acpi_ex_store+0xb0/0x441
[   71.982323]  ? acpi_ut_trace_str+0x26/0x68
[   71.982329]  acpi_ex_opcode_1A_1T_1R+0x429/0x5b1
[   71.982337]  acpi_ds_exec_end_op+0x110/0x6bd
[   71.982342]  acpi_ps_parse_loop+0xa3b/0xaf6
[   71.982349]  ? acpi_ds_result_push+0x82/0x1d2
[   71.982354]  acpi_ps_parse_aml+0x1a2/0x4af
[   71.982360]  acpi_ps_execute_method+0x1ef/0x2ab
[   71.982365]  acpi_ns_evaluate+0x2e4/0x42c
[   71.982370]  acpi_ev_asynch_execute_gpe_method+0xb5/0x14e
[   71.982375]  acpi_os_execute_deferred+0x16/0x20
[   71.982382]  process_one_work+0x1a1/0x350
[   71.982389]  worker_thread+0x30/0x380
[   71.982396]  ? wq_update_unbound_numa+0x1a0/0x1a0
[   71.982401]  kthread+0x112/0x130
[   71.982406]  ? kthread_create_worker_on_cpu+0x70/0x70
[   71.982412]  ret_from_fork+0x35/0x40
[   71.982420] i801_smbus 0000:00:1f.4: BIOS is accessing SMBus registers
[   71.982423] i801_smbus 0000:00:1f.4: Driver SMBus register access inhibited

After resume, the RMI4 driver spews all kinds of SMBus-related errors which can be traced to i801_access() returning EBUSY due to acpi_reserved being set.

This feature was introduced by commit a7ae81952cdab56a1277bd2f9ed7284c0f575120 ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR") to fix https://bugzilla.kernel.org/show_bug.cgi?id=110041

Everything works fine if I comment the priv->acpi_reserved = true line in i801_acpi_io_handler().

Not sure whether it's helpful, but I've attached the output of acpidump.
Comment 1 Mika Westerberg 2018-08-17 08:36:18 UTC
Created attachment 277907 [details]
Uninstall OpRegion handler on suspend

Can you try the attached patch? Regardless whether it works or not, please attach full dmesg of the test run to the bug as well.
Comment 2 Yussuf Khalil 2018-08-19 19:37:16 UTC
Created attachment 277979 [details]
dmesg output with patch applied

Thank you very much for the patch. Indeed, the touchpad keeps working fine after resuming from suspend with the patch applied, however, I am getting a bunch of ACPI errors in dmesg.
Comment 3 Mika Westerberg 2018-08-20 11:21:00 UTC
Created attachment 277983 [details]
Allow AML OpRegion access during suspend

Indeed I forgot that acpi_remove_address_space_handler() does not reset the default handler. Here's a new patch that allows OpRegion access during suspend. Can you give it a try?
Comment 4 Yussuf Khalil 2018-08-20 19:34:27 UTC
Created attachment 277989 [details]
dmesg output with "Allow AML OpRegion access during suspend" applied

This one doesn't work for me. I've attached the dmesg output again. Notably, I am now getting the "Driver SMBus register access inhibited" message right after resume.
Comment 5 Mika Westerberg 2018-08-21 09:04:37 UTC
I looked more closely ACPI tables of this system and it seems that the BIOS AML code actually does not touch the SMBbus I/O registers at all. It just reads iTCO BAR and then tries to write there. I think here the correct fix would be to not to install this special OpRegion handler at all.

Can you attach output of dmidecode to the bug?
Comment 6 Yussuf Khalil 2018-08-21 10:06:49 UTC
Created attachment 277999 [details]
dmidecode output

Sure, here you go.
Comment 7 Mika Westerberg 2018-08-21 14:32:53 UTC
Created attachment 278011 [details]
Allow IO access for non-SMBus ports

Here is another patch to try out. This one should allow IO access from AML if it does not touch SMBus ports.
Comment 8 Yussuf Khalil 2018-08-21 18:12:31 UTC
Created attachment 278015 [details]
dmesg output with "Allow IO access for non-SMBus ports" applied

Thank you, this one indeed fixes the issue without breaking anything as far as I can see. I've attached the dmesg output again.
Comment 9 Mika Westerberg 2018-08-22 13:28:37 UTC
Great, thanks for testing. I'll send a formal patch out soon.
Comment 10 mirh 2018-09-01 13:53:33 UTC
Ehrm.. Just for the records, are you sure bug 110041 would still be a thing in the first place?

I ask because while investigating bug 200713, I found out my almost identical warning message to get solved since ACPICA improved module level code handling.
Comment 11 Mika Westerberg 2018-09-03 05:45:44 UTC
Yes it is because this is not about wrong ASL code but the fact that the BIOS uses the same I/O space with the SMBus device and we need to be sure that they don't mess with each other.
Comment 12 Jean Delvare 2018-11-12 13:53:00 UTC
Fix has been committed meanwhile, so closing:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=7fd6d98b89f382d414e1db528e29a67bbd749457

Note You need to log in before you can comment on or make changes to this bug.