Bug 218493
Summary: | MMC hang / deadlock on reboot | ||
---|---|---|---|
Product: | Drivers | Reporter: | Gabriel C (gabrielcorneanu) |
Component: | MMC/SD | Assignee: | drivers_mmc-sd |
Status: | NEW --- | ||
Severity: | normal | ||
Priority: | P3 | ||
Hardware: | ARM | ||
OS: | Linux | ||
Kernel Version: | 6.5, 6.1, 5.1 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Gabriel C
2024-02-14 11:59:22 UTC
Pulling the plug... everything is updated. Waiting... I get further traces (not from me): [ 612.280732] INFO: task swupdate-progre:181 blocked for more than 245 seconds. [ 612.287905] Tainted: G O 6.5.13 #4 [ 612.293262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 612.301125] task:swupdate-progre state:D stack:0 pid:181 ppid:175 flags:0x00000004 [ 612.309498] __schedule from schedule+0x58/0xa8 [ 612.314108] schedule from schedule_timeout+0xe4/0x160 [ 612.319279] schedule_timeout from __wait_for_common+0x90/0x1b0 [ 612.325249] __wait_for_common from __flush_work+0x194/0x324 [ 612.330982] __flush_work from __cancel_work_timer+0x11c/0x1fc [ 612.336851] __cancel_work_timer from mmc_host_classdev_shutdown+0x14/0x1c [ 612.343774] mmc_host_classdev_shutdown from device_shutdown+0x14c/0x230 [ 612.350510] device_shutdown from kernel_restart+0x40/0x98 [ 612.356042] kernel_restart from __do_sys_reboot+0x10c/0x200 [ 612.361742] __do_sys_reboot from ret_fast_syscall+0x0/0x54 [ 612.367334] Exception stack(0xf0985fa8 to 0xf0985ff0) [ 612.372414] 5fa0: 00013131 00000004 fee1dead 28121969 01234567 00000010 [ 612.380594] 5fc0: 00013131 00000004 00000000 00000058 00000000 00013124 00000004 00000064 [ 612.388787] 5fe0: b6ecbe3c bec921bc 0001147c b6ecbe5c Tried to login on console!: [ 1103.890654] INFO: task login:205 blocked for more than 491 seconds. [ 1103.896930] Tainted: G O 6.5.13 #4 [ 1103.902285] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1103.910131] task:login state:D stack:0 pid:205 ppid:1 flags:0x00000004 [ 1103.918519] __schedule from schedule+0x58/0xa8 [ 1103.923098] schedule from io_schedule+0x3c/0x4c [ 1103.927770] io_schedule from bit_wait_io+0x14/0x5c [ 1103.932683] bit_wait_io from __wait_on_bit+0x7c/0x13c [ 1103.937893] __wait_on_bit from out_of_line_wait_on_bit+0x78/0x9c [ 1103.944021] out_of_line_wait_on_bit from __ext4_find_entry+0x2d4/0x61c [ 1103.950699] __ext4_find_entry from ext4_lookup+0xa8/0x24c [ 1103.956217] ext4_lookup from __lookup_slow+0x8c/0x134 [ 1103.961428] __lookup_slow from lookup_one_unlocked+0xa4/0xb8 [ 1103.967209] lookup_one_unlocked from ovl_lookup_single+0x54/0x334 [ 1103.973443] ovl_lookup_single from ovl_lookup_layer+0x144/0x188 [ 1103.979487] ovl_lookup_layer from ovl_lookup+0x114/0xd18 [ 1103.984907] ovl_lookup from __lookup_slow+0x8c/0x134 [ 1103.990006] __lookup_slow from walk_component+0xf4/0x164 [ 1103.995429] walk_component from path_lookupat+0x7c/0x1a4 [ 1104.000865] path_lookupat from filename_lookup+0xbc/0x154 [ 1104.006367] filename_lookup from user_path_at_empty+0x4c/0x64 [ 1104.012233] user_path_at_empty from do_faccessat+0x120/0x2b4 [ 1104.018029] do_faccessat from ret_fast_syscall+0x0/0x54 [ 1104.023360] Exception stack(0xf097dfa8 to 0xf097dff0) [ 1104.028442] dfa0: b6faab8c 000cf1a8 000b9d1e 00000000 00000000 00000001 [ 1104.036619] dfc0: b6faab8c 000cf1a8 00000000 00000021 00000000 00000003 000ce6f0 000b9cbc [ 1104.044816] dfe0: 000ce3e8 be96ac5c 0001d060 b6f27b30 [ 1104.049897] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings Sorry, I made a mistake: "exfat" should read "ext4" everywhere (p2, p3)... I think isolated the problem... nothing to do with the write/update. mmc_rescan (polling) during shutdown hangs: Alive fails (timeout?), triggers a card remove, and mmc_remove_card hangs in device_del() call. __mmc_stop_host (reboot context) also hangs in cancel_delayed_work_sync() call The "rescan_disable" flag in mmc/core.c usually helps, but it's too late. I could easily force the bug (mmc/core.c) by delaying the rescan_disable flag. -at the end of mmc_rescan, requeue for faster polling: out: if (host->caps & MMC_CAP_NEEDS_POLL) mmc_schedule_delayed_work(&host->detect, HZ / 10); //gc: poll often -in __mmc_stop_host(), add a delay to allow mm_rescan to enter void __mmc_stop_host(struct mmc_host *host) { pr_info("__mmc_stop_host\n"); msleep(100); //gc: force bug! .... Then just reboot from console. |