Bug 215604 - rmi4: Error in irq_dispose_mapping
Summary: rmi4: Error in irq_dispose_mapping
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Input Devices
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_input-devices
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-14 06:47 UTC by torsten.hilbrich
Modified: 2023-06-20 22:09 UTC

See Also:
Kernel Version: 5.15.22
Subsystem:
Regression: Yes
Bisected commit-id: 24d28e4f1271


Attachments
Logs from the crash on 5.15.22 kernel (107.10 KB, text/plain)
2022-02-14 06:47 UTC, torsten.hilbrich
Patch fixing the problem (604 bytes, patch)
2023-06-14 06:26 UTC, torsten.hilbrich

Description torsten.hilbrich 2022-02-14 06:47:02 UTC
Created attachment 300449
Logs from the crash on 5.15.22 kernel

The error was first seen in kernel 5.4.42 on both a Lenovo T470s and a Lenovo T14 G1. It is still present in kernel 5.15.22.

The kernel has the grsecurity patches applied, which cause freed memory to be poisoned with the byte value 0xfe (visible in RAX in some traces). This seems to indicate a use-after-free.
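As a rough illustration of why the 0xfe pattern suggests a use-after-free: the general protection fault address in the logs in this report, 0xfefefefefefeff46, can be read as a pointer made entirely of 0xfe bytes plus a small offset. The sketch below is a user-space helper, not kernel code. The names `poison_word` and `poison_offset` are hypothetical, and the idea that the offset corresponds to a structure field is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a 64-bit word in which every byte equals the poison byte. */
static uint64_t poison_word(uint8_t byte)
{
    return 0x0101010101010101ULL * byte;
}

/*
 * Check whether addr lies at most max_offset bytes above an all-poison
 * pointer. On success, store the distance in *offset and return true.
 * Assumption: a fault at "poison pointer + small offset" means a pointer
 * was loaded from freed memory and a field of it was then dereferenced.
 */
static bool poison_offset(uint64_t addr, uint8_t byte,
                          uint64_t max_offset, uint64_t *offset)
{
    uint64_t base = poison_word(byte);

    if (addr < base || addr - base > max_offset)
        return false;

    *offset = addr - base;
    return true;
}
```

Calling `poison_offset(0xfefefefefefeff46ULL, 0xfe, 4096, &off)` returns true with `off` equal to 0x48, which would be consistent with an access to a field 0x48 bytes into an object reached through a pointer read from freed memory.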

These logs were created with the touchpad disabled in the UEFI setup, but the bug also occurs with it enabled.

The kernel crash does not happen on every boot. I would estimate roughly one crash in every 20 boots of the system.

The kernel logs for 5.15.22 on a Lenovo T14 G1 are attached:

<3>[   12.943063] rmi4_physical rmi4-00: Failed to read irqs, code=-6
<4>[   12.946167] ------------[ cut here ]------------
<4>[   12.946168] remove_proc_entry: removing non-empty directory 'irq/1', leaking at least 'i8042'
<4>[   12.946176] WARNING: CPU: 2 PID: 128 at fs/proc/generic.c:841 remove_proc_entry+0x16b/0x180
<4>[   12.946183] Modules linked in: pinctrl_cannonlake pinctrl_intel dm_crypt_sina(O) dm_mod efivarfs
...

<4>[   12.946333] WARNING: CPU: 2 PID: 128 at kernel/irq/irqdomain.c:875 irq_dispose_mapping+0x113/0x140
<4>[   12.946337] Modules linked in: pinctrl_cannonlake pinctrl_intel dm_crypt_sina(O) dm_mod efivarfs
<4>[   12.946340] CPU: 2 PID: 128 Comm: kworker/2:3 Tainted: G        W  O    T  5.15.22-grsec+ #1
<4>[   12.946341] Hardware name: LENOVO 20S1S19N00/20S1S19N00, BIOS N2XET00S (3.99 ) 10/08/2020
<4>[   12.946342] Workqueue: events ffffffff81e0b330
<4>[   12.946343] RIP: 0010:[<ffffffff814db363>] irq_dispose_mapping+0x113/0x140
...

<4>[   12.946880] general protection fault, probably for non-canonical address 0xfefefefefefeff46: 0000 [#1] SMP NOPTI
<4>[   12.946926] CPU: 2 PID: 128 Comm: kworker/2:3 Tainted: G        W  O    T  5.15.22-grsec+ #1
<4>[   12.946960] Hardware name: LENOVO 20S1S19N00/20S1S19N00, BIOS N2XET00S (3.99 ) 10/08/2020
<4>[   12.946993] Workqueue: events ffffffff81e0b330
<4>[   12.947012] RIP: 0010:[<ffffffff814db2ec>] irq_dispose_mapping+0x9c/0x140
Comment 1 torsten.hilbrich 2023-06-14 06:26:12 UTC
Created attachment 304419
Patch fixing the problem

I received a patch from the grsecurity team for this problem. The problem seems to be that the device may still be used in the irq_dispose_mapping call after its reference has been put.
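The ordering issue described above can be sketched in user-space C. This is not the attached patch and not the rmi4 driver code. `struct fake_fn`, `fake_put`, `fake_dispose_mapping` and `fake_unregister` are hypothetical stand-ins, and the sketch assumes the final put frees the object. It only shows the general rule that every use of a refcounted object must come before the last reference is dropped.

```c
#include <stdlib.h>

/* Hypothetical refcounted object; not the real rmi4 structures. */
struct fake_fn {
    int refcount;
    int num_irqs;
    int irqs[4];
};

/* Counts how many mappings were disposed, for illustration only. */
static int disposed_count;

/* Stand-in for disposing one IRQ mapping. */
static void fake_dispose_mapping(int irq)
{
    (void)irq;
    disposed_count++;
}

/* Drop one reference; the final put frees the object. */
static void fake_put(struct fake_fn *fn)
{
    if (--fn->refcount == 0)
        free(fn);
}

/* Allocate an object holding one reference and num_irqs fake IRQ numbers. */
static struct fake_fn *fake_create(int num_irqs)
{
    struct fake_fn *fn;
    int i;

    if (num_irqs < 0 || num_irqs > 4)
        return NULL;

    fn = calloc(1, sizeof(*fn));
    if (!fn)
        return NULL;

    fn->refcount = 1;
    fn->num_irqs = num_irqs;
    for (i = 0; i < num_irqs; i++)
        fn->irqs[i] = 100 + i;
    return fn;
}

/*
 * Safe ordering: read fn->num_irqs and fn->irqs[] while the reference
 * is still held, and drop the reference last. Calling fake_put() first
 * would leave the loop reading freed (possibly poisoned) memory.
 */
static void fake_unregister(struct fake_fn *fn)
{
    int i;

    for (i = 0; i < fn->num_irqs; i++)
        fake_dispose_mapping(fn->irqs[i]);

    fake_put(fn);
}
```

With the put moved to the end, the loop never touches the object after it may have been freed. In the reversed order the loop bound and the IRQ numbers would be read from released memory, which under 0xfe poisoning would yield values like those seen in the traces.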
Comment 2 torsten.hilbrich 2023-06-14 06:30:03 UTC
Based on the fix, the problem was likely introduced by the commit:

24d28e4f1271 Input: synaptics-rmi4 - convert irq distribution to irq_domain

in v4.18.
