Bug 215269 - change in cros_ec_typec from linux 5.14 to linux 5.15 causes kernel oops on every boot
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Platform
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_platform@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-12-08 20:58 UTC by jannf
Modified: 2022-04-20 22:38 UTC
CC List: 3 users

See Also:
Kernel Version: 5.15
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel messages of the oops (5.40 KB, text/plain)
2021-12-08 20:58 UTC, jannf
dmesg from 5.17.0 RC3 with oops (134.56 KB, text/plain)
2022-02-14 00:34 UTC, Jason M.

Description jannf 2021-12-08 20:58:30 UTC
Created attachment 299945
kernel messages of the oops

Since updating from Linux 5.14 to Linux 5.15, I get a kernel oops on every boot. The system also behaves strangely on shutdown and can only be turned off with a long button press.
SysRq sync still works, but SysRq unmount and reboot no longer work.

I believe I have also found the commit that causes this: https://github.com/torvalds/linux/commit/a8db7a3f8ac69e558c7bfbd04802201c39a104ad

The kernel messages from boot are attached as the text file oops.txt; a full dmesg can be provided on request.

I am using Arch Linux with the linux package, version 5.15.6.arch2-1, on a CTL Chromebox (WUKONG) with an Intel i7 and an NVMe drive.
Comment 1 bleung 2021-12-09 16:38:30 UTC
Thank you for the report.

It looks like you're using a custom bios as well. Do you have any information about that?
Comment 2 jannf 2021-12-09 17:26:06 UTC
Yes, of course. It is the latest public coreboot/EDK2 build from MrChromebox (Matt DeVillier).

https://github.com/MrChromebox/coreboot/commits/2021.07.25
https://github.com/MrChromebox/edk2/commits/uefipayload_202107
https://github.com/MrChromebox/chrome-ec/branches/all

The GSC firmware should be the latest as of CrOS 94 when I flashed the custom firmware.

If you need the output of the EC console, I can also provide that.
Comment 3 Jason M. 2022-02-14 00:33:22 UTC
I see this on a Google Pixelbook running Fedora as well. If I either leave the cros_ec_typec module out of the build or blacklist it, things work normally. Otherwise boots are slow, with an oops, and the machine can't shut down properly.

I started seeing this with 5.15 as well.
Comment 4 Jason M. 2022-02-14 00:34:28 UTC
Created attachment 300448
dmesg from 5.17.0 RC3 with oops
Comment 6 Jason M. 2022-03-02 20:58:01 UTC
It looks like the fix for the kernel is in linux-tree. Is there any chance of this making it into 5.17?
Comment 7 Jason M. 2022-03-02 20:58:49 UTC
It looks like the fix for the kernel is in linux-next, rather.
Comment 8 Jason M. 2022-04-08 16:30:36 UTC
I am still seeing an oops with 5.17.2 and 5.18-rc1, both of which appear to include this commit:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/platform/chrome?h=v5.17.2&id=4015c654a5f33a85e592447374e264a9bc842590

[    4.092229] BUG: kernel NULL pointer dereference, address: 0000000000000304
[    4.092235] #PF: supervisor read access in kernel mode
[    4.092237] #PF: error_code(0x0000) - not-present page
[    4.092240] PGD 0 P4D 0 
[    4.092243] Oops: 0000 [#1] PREEMPT SMP PTI
[    4.092247] CPU: 3 PID: 624 Comm: systemd-udevd Not tainted 5.17.2-300.fc36.x86_64 #1
[    4.092250] Hardware name: Google Eve/Eve, BIOS MrChromebox-4.14 08/06/2021
[    4.092252] RIP: 0010:cros_ec_check_features+0xa/0xa0
[    4.092258] Code: c4 10 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc bd f4 ff ff ff eb e7 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 fd 53 <83> bf 04 03 00 00 ff 89 f3 74 23 85 db 8d 43 1f 89 d9 0f 49 c3 83
[    4.092262] RSP: 0018:ffffac190094bc60 EFLAGS: 00010246
[    4.092265] RAX: ffff96c08314a400 RBX: 0000000000000000 RCX: 0000000000000001
[    4.092267] RDX: 000000000bc6e003 RSI: 0000000000000029 RDI: 0000000000000000
[    4.092269] RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000004
[    4.092271] R10: ffff96c085974660 R11: 0000000000000003 R12: ffff96c08142fc10
[    4.092273] R13: 0000000000000000 R14: 00007fc91518a43c R15: ffffac190094be80
[    4.092276] FS:  00007fc914477580(0000) GS:ffff96c3eed80000(0000) knlGS:0000000000000000
[    4.092279] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.092281] CR2: 0000000000000304 CR3: 0000000108682003 CR4: 00000000003706e0
[    4.092284] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    4.092286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    4.092288] Call Trace:
[    4.092290]  <TASK>
[    4.092294]  cros_typec_probe+0xca/0x53f [cros_ec_typec]
[    4.092302]  platform_probe+0x44/0x90
[    4.092307]  really_probe+0x1f3/0x3c0
[    4.092311]  __driver_probe_device+0xfc/0x170
[    4.092314]  driver_probe_device+0x1f/0x90
[    4.092317]  __driver_attach+0xbb/0x190
[    4.092321]  ? __device_attach_driver+0xe0/0xe0
[    4.092324]  bus_for_each_dev+0x62/0x90
[    4.092327]  bus_add_driver+0x14e/0x1f0
[    4.092330]  driver_register+0x89/0xd0
[    4.092334]  ? 0xffffffffc0a68000
[    4.092340]  do_one_initcall+0x44/0x200
[    4.092345]  ? do_init_module+0x22/0x250
[    4.092348]  ? kmem_cache_alloc_trace+0x161/0x2c0
[    4.092353]  do_init_module+0x4a/0x250
[    4.092356]  __do_sys_init_module+0x127/0x180
[    4.092360]  do_syscall_64+0x3a/0x80
[    4.092365]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    4.092370] RIP: 0033:0x7fc91502fa5e
[    4.092373] Code: 48 8b 0d bd 03 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8a 03 0e 00 f7 d8 64 89 01 48
[    4.092376] RSP: 002b:00007ffe3a4121d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[    4.092379] RAX: ffffffffffffffda RBX: 000055c0e3e27d00 RCX: 00007fc91502fa5e
[    4.092381] RDX: 00007fc91518a43c RSI: 000000000000b1b6 RDI: 000055c0e464e4b0
[    4.092383] RBP: 00007fc91518a43c R08: 000055c0e3d6e6f0 R09: 00007ffe3a40f236
[    4.092385] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000020000
[    4.092387] R13: 000055c0e3e20b20 R14: 0000000000000000 R15: 000055c0e3e25ca0
[    4.092392]  </TASK>
[    4.092393] Modules linked in: cros_ec_typec(+) ecdh_generic(+) joydev mc rfkill cros_usbpd_notify processor_thermal_device_pci_legacy snd_timer idma64(+) processor_thermal_device typec snd intel_xhci_usb_role_switch processor_thermal_rfim processor_thermal_mbox soundcore cros_ec_i2c cros_ec_lpcs intel_vbtn processor_thermal_rapl sparse_keymap cros_ec soc_button_array intel_rapl_common int3403_thermal int340x_thermal_zone intel_pch_thermal intel_soc_dts_iosf int3400_thermal acpi_thermal_rel cros_kbd_led_backlight zram xfs i915 sdhci_pci cqhci crct10dif_pclmul sdhci crc32_pclmul nvme crc32c_intel hid_multitouch nvme_core mmc_core ghash_clmulni_intel serio_raw ttm i2c_hid_acpi i2c_hid video pinctrl_sunrisepoint ip6_tables ip_tables fuse
[    4.092438] CR2: 0000000000000304
[    4.092441] ---[ end trace 0000000000000000 ]---
[    4.092443] RIP: 0010:cros_ec_check_features+0xa/0xa0
[    4.092448] Code: c4 10 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc bd f4 ff ff ff eb e7 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 fd 53 <83> bf 04 03 00 00 ff 89 f3 74 23 85 db 8d 43 1f 89 d9 0f 49 c3 83
[    4.092451] RSP: 0018:ffffac190094bc60 EFLAGS: 00010246
[    4.092453] RAX: ffff96c08314a400 RBX: 0000000000000000 RCX: 0000000000000001
[    4.092455] RDX: 000000000bc6e003 RSI: 0000000000000029 RDI: 0000000000000000
[    4.092457] RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000004
[    4.092459] R10: ffff96c085974660 R11: 0000000000000003 R12: ffff96c08142fc10
[    4.092461] R13: 0000000000000000 R14: 00007fc91518a43c R15: ffffac190094be80
[    4.092463] FS:  00007fc914477580(0000) GS:ffff96c3eed80000(0000) knlGS:0000000000000000
[    4.092466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.092468] CR2: 0000000000000304 CR3: 0000000108682003 CR4: 00000000003706e0
[    4.092470] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    4.092472] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
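
For anyone reading the trace above, here is a minimal sketch of how the faulting instruction can be cross-checked against the registers. It is only an illustration, not part of any kernel tooling: the helper names parse_oops and decode_faulting_cmp are made up for this comment, and the decoder handles only the one instruction form seen here (opcode 83 /7 with a 32-bit displacement, i.e. a cmp of an 8-bit immediate against memory). The input lines are copied from the oops with the timestamps dropped.

```python
import re

# Lines copied from the oops above (timestamps removed).
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000304
RIP: 0010:cros_ec_check_features+0xa/0xa0
Code: c4 10 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc bd f4 ff ff ff eb e7 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 fd 53 <83> bf 04 03 00 00 ff 89 f3 74 23 85 db 8d 43 1f 89 d9 0f 49 c3 83
RDX: 000000000bc6e003 RSI: 0000000000000029 RDI: 0000000000000000
CR2: 0000000000000304 CR3: 0000000108682003 CR4: 00000000003706e0
"""

# Register numbering used by the ModRM r/m field when no REX prefix is present.
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def parse_oops(text):
    """Hypothetical helper: extract fault address, RIP symbol, registers, code bytes."""
    fault = int(re.search(r"address: ([0-9a-f]+)", text).group(1), 16)
    symbol = re.search(r"RIP: 0010:(\S+)", text).group(1)
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(R[A-Z]{2}|CR\d): ([0-9a-f]{16})", text)
    }
    code = re.search(r"Code: (.*)", text).group(1).split()
    return fault, symbol, regs, code


def decode_faulting_cmp(code):
    """Hypothetical helper: decode the instruction at the <..> marker.

    Assumes the form `83 /7 ib` with mod=10 (32-bit displacement, no SIB byte).
    """
    start = next(i for i, byte in enumerate(code) if byte.startswith("<"))
    raw = bytes(int(byte.strip("<>"), 16) for byte in code[start:start + 7])
    opcode, modrm = raw[0], raw[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x83 or mod != 2 or reg != 7 or rm == 4:
        raise ValueError("not the simple cmp form this sketch handles")
    disp = int.from_bytes(raw[2:6], "little", signed=True)
    imm = int.from_bytes(raw[6:7], "little", signed=True)
    return REG_NAMES[rm], disp, imm


fault, symbol, regs, code = parse_oops(OOPS)
base, disp, imm = decode_faulting_cmp(code)
effective = regs[base.upper()] + disp

print(f"{symbol}: cmpl ${imm}, {disp:#x}(%{base})")
print(f"%{base} = {regs[base.upper()]:#x}, effective = {effective:#x}, CR2 = {regs['CR2']:#x}")
```

The decoded instruction reads memory at RDI plus 0x304, and RDI is zero, so the computed address matches both CR2 and the reported fault address. RDI carries the first function argument in the x86-64 calling convention, which suggests that cros_typec_probe called cros_ec_check_features with a NULL pointer and the function then read a field at offset 0x304 from it.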
Comment 9 Jason M. 2022-04-20 22:38:04 UTC
MrChromebox-4.16 has been released and fixes the issue.
