Bug 207159
Summary: | Kernel trace at crypto_wait_for_test with follow-up lockups for kernel task cryptomgr_probe and systemd-udev | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Kai Lüke (kai) |
Component: | x86-64 | Assignee: | platform_x86_64 (platform_x86_64) |
Status: | NEW --- | ||
Severity: | high | CC: | bp, carnil, ebiggers3, gerow, herbert, kai |
Priority: | P1 | ||
Hardware: | Other | ||
OS: | Linux | ||
Kernel Version: | 5.6.2 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Kai Lüke
2020-04-08 09:05:33 UTC
I've seen what looks like the same issue come up on 5.6.7 and 5.6.14. This kernel includes some debian patches, for what it's worth. I rebuilt it with lock debugging and it looks like some notifier-related semaphore lock is being held by a bunch of different tasks. [ 183.821049] WARNING: CPU: 0 PID: 138 at crypto/algapi.c:404 crypto_wait_for_test+0x59/0x70 [ 183.829429] Modules linked in: crc32c_intel(+) efivarfs crc_t10dif(+) crct10dif_generic crct10dif_pclmul crct10dif_common virtio_scsi scsi_mod virtio_net net_failover failover aesni_intel(+) libaes crypto_simd cryptd glue_helper virtio_pci i2c_piix4 psmouse virtio_ring virtio button [ 183.854635] CPU: 0 PID: 138 Comm: systemd-udevd Not tainted 5.6.14 #1 [ 183.861296] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 183.870633] RIP: 0010:crypto_wait_for_test+0x59/0x70 [ 183.875710] Code: 8d fd ff ff 48 8d bd 88 01 00 00 e8 21 fc 48 00 85 c0 75 16 48 89 ee bf 02 00 00 00 e8 40 db ff ff 48 89 ef 5d e9 67 e0 ff ff <0f> 0b eb f3 0f 0b eb ef 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 [ 183.894612] RSP: 0018:ffff9fb44022fc08 EFLAGS: 00010286 [ 183.899948] RAX: 00000000fffffe00 RBX: ffffffffc02342b8 RCX: 0000000000000000 [ 183.907200] RDX: 00000000fffffe00 RSI: 0000000000000006 RDI: ffff8d805a5f0040 [ 183.914447] RBP: ffff8d7ff4822c00 R08: 0000000000000000 R09: 0000000000000000 [ 183.921706] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 183.928957] R13: ffffffffc023423a R14: ffff8d8047dec000 R15: ffffffffc0234200 [ 183.936205] FS: 00007fe19d5f1880(0000) GS:ffff8d805a400000(0000) knlGS:0000000000000000 [ 183.944421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 183.950289] CR2: 000056440e23f070 CR3: 0000000134888006 CR4: 00000000003606f0 [ 183.957550] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 183.964799] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 183.972046] Call Trace: [ 183.974610] crypto_register_alg+0x56/0x70 [ 183.978833] simd_aead_create_compat+0x15d/0x180 [crypto_simd] [ 183.984784] simd_register_aeads_compat+0x64/0x630 [crypto_simd] [ 183.990905] ? 0xffffffffc0242000 [ 183.994349] aesni_init+0xff/0x1000 [aesni_intel] [ 183.999171] ? do_one_initcall+0x5d/0x2c0 [ 184.003304] ? do_init_module+0x23/0x230 [ 184.007341] ? rcu_read_lock_sched_held+0x52/0x60 [ 184.012155] ? do_init_module+0x23/0x230 [ 184.016200] ? kmem_cache_alloc_trace+0x260/0x2a0 [ 184.021017] ? do_init_module+0x5c/0x230 [ 184.025062] ? load_module+0x2381/0x2570 [ 184.029109] ? __do_sys_finit_module+0xaa/0x110 [ 184.033761] ? __do_sys_finit_module+0xaa/0x110 [ 184.038421] ? do_syscall_64+0x5a/0x230 [ 184.042379] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 184.047724] irq event stamp: 9906 [ 184.051148] hardirqs last enabled at (9905): [<ffffffff8ecdbb54>] _raw_spin_unlock_irq+0x24/0x30 [ 184.060137] hardirqs last disabled at (9906): [<ffffffff8e40391c>] trace_hardirqs_off_thunk+0x1a/0x1c [ 184.069471] softirqs last enabled at (9782): [<ffffffff8f00035d>] __do_softirq+0x35d/0x3f1 [ 184.077957] softirqs last disabled at (9767): [<ffffffff8e494949>] irq_exit+0xb9/0xc0 [ 184.085958] ---[ end trace ec497009adfc1f9a ]--- [ 242.923659] INFO: task systemd-udevd:137 blocked for more than 120 seconds. [ 242.931295] Tainted: G W 5.6.14 #1 [ 242.936766] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 242.944731] systemd-udevd D 0 137 124 0x80004004 [ 242.950390] Call Trace: [ 242.953007] ? __schedule+0x370/0x960 [ 242.956805] ? schedule+0x4a/0xb0 [ 242.960287] ? rwsem_down_write_slowpath+0x32c/0x640 [ 242.965431] ? blocking_notifier_chain_register+0x26/0x60 [ 242.970997] ? blocking_notifier_chain_register+0x26/0x60 [ 242.976562] ? 0xffffffffc0188000 [ 242.980043] ? crc_t10dif_mod_init+0xc/0x1000 [crc_t10dif] [ 242.985665] ? do_one_initcall+0x5d/0x2c0 [ 242.989806] ? do_init_module+0x23/0x230 [ 242.993911] ? rcu_read_lock_sched_held+0x52/0x60 [ 242.998819] ? do_init_module+0x23/0x230 [ 243.002877] ? kmem_cache_alloc_trace+0x260/0x2a0 [ 243.007754] ? do_init_module+0x5c/0x230 [ 243.011815] ? load_module+0x2381/0x2570 [ 243.015925] ? __do_sys_finit_module+0xaa/0x110 [ 243.020675] ? __do_sys_finit_module+0xaa/0x110 [ 243.025438] ? do_syscall_64+0x5a/0x230 [ 243.029439] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 243.034808] INFO: task cryptomgr_probe:201 blocked for more than 120 seconds. [ 243.042116] Tainted: G W 5.6.14 #1 [ 243.047472] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.055559] cryptomgr_probe D 0 201 2 0x80004000 [ 243.061211] Call Trace: [ 243.063824] ? __schedule+0x370/0x960 [ 243.067622] ? schedule+0x4a/0xb0 [ 243.071060] ? rwsem_down_read_slowpath+0x27e/0x600 [ 243.076112] ? down_read+0x99/0x120 [ 243.079788] ? down_read+0x99/0x120 [ 243.083430] ? blocking_notifier_call_chain+0x34/0x70 [ 243.088769] ? crypto_probing_notify+0x20/0x50 [ 243.093456] ? crypto_wait_for_test+0x50/0x70 [ 243.097949] ? crypto_register_instance+0xe9/0x110 [ 243.102882] ? cryptd_create+0x450/0x490 [cryptd] [ 243.107724] ? cryptomgr_probe+0x38/0xc0 [ 243.111783] ? kthread+0xf9/0x130 [ 243.115229] ? cryptomgr_notify+0x580/0x580 [ 243.119582] ? kthread_park+0x90/0x90 [ 243.123504] ? ret_from_fork+0x3a/0x50 [ 243.127475] [ 243.127475] Showing all locks held in the system: [ 243.133797] 1 lock held by khungtaskd/18: [ 243.138058] #0: ffffffff8f85fd00 (rcu_read_lock){....}, at: debug_show_all_locks+0x15/0x161 [ 243.146786] 1 lock held by systemd-udevd/137: [ 243.151905] #0: ffffffff8f8d4568 ((crypto_chain).rwsem){++++}, at: blocking_notifier_chain_register+0x26/0x60 [ 243.162332] 1 lock held by cryptomgr_test/195: [ 243.167107] #0: ffffffff8f8d4568 ((crypto_chain).rwsem){++++}, at: blocking_notifier_call_chain+0x34/0x70 [ 243.176922] 1 lock held by cryptomgr_probe/201: [ 243.181591] #0: ffffffff8f8d4568 ((crypto_chain).rwsem){++++}, at: blocking_notifier_call_chain+0x34/0x70 [ 243.191427] 1 lock held by modprobe/255: [ 243.195480] #0: ffffffff8f8d4568 ((crypto_chain).rwsem){++++}, at: blocking_notifier_call_chain+0x34/0x70 [ 243.205274] [ 243.206884] ============================================= Worryingly blocking_notifier_chain_register gets a write lock while blocking_notifier_call_chain gets a read lock, which if I'm reading the "held locks" lines correctly that means the semaphore is simultaneously held by read holders and write holders (it's entirely possible I'm reading this wrong and all these tasks are just simultaneously waiting for something else). Can you blacklist crc-t10dif and see if this still happens? There is a race condition in crc-t10dif in that it registers the notifier too early and it may race against its own crypto alloc call. > Can you blacklist crc-t10dif and see if this still happens? There is a race
> condition in crc-t10dif in that it registers the notifier too early and it
> may race against its own crypto alloc call.
I don't think that can be it, because the stack traces show that crc_t10dif_mod_init() is blocked on crypto_register_notifier(). So it never got to crypto_alloc_shash().
I'm more inclined to blame the fact that cryptomgr_schedule_probe() waits for the requested module to be loaded, while it holds crypto_chain.rwsem for read via blocking_notifier_call_chain().
That will deadlock if loading the requested module needs to wait for crc-t10dif.ko to also be loaded, for any reason. I don't immediately see why that would happen, as crc-t10dif.ko is a library module, not a crypto API module. Maybe userspace is serializing the module loading? It would be helpful to have a way to reproduce this.
Also, I'm guessing that crc-t10dif.ko is being inserted as a dependency of the SCSI disk module sd.ko, so blacklisting it wouldn't work.
Can you try this patch?
diff --git a/crypto/algboss.c b/crypto/algboss.c
index 535f1f87e6c1..5ebccbd6b74e 100644
--- a/crypto/algboss.c
+++ b/crypto/algboss.c
@@ -178,8 +178,6 @@ static int cryptomgr_schedule_probe(struct crypto_larval *larval)
if (IS_ERR(thread))
goto err_put_larval;
- wait_for_completion_interruptible(&larval->completion);
-
return NOTIFY_STOP;
Applying that patch seems to fix the lockup for me. I can look to try and find a minimal repro. I have other configurations that run in GCE that don't hit this lockup, so something unique about the order modules are loaded or userspace comes up sounds like a plausible trigger. OK, after some more experimentation I think the trigger here was Debian moving some of the CRYPTO_* libs into modules. Moving them back into the kernel image seems to make the problem go away. Which I guess still means there might be a bug here, but unfortunately I can't bound it within v5.5..v5.6 because the trigger wasn't a commit but a config change. I'll add another comment once I figure out what the actual config changes are. Thanks Eric. Your patch looks correct, I have no idea why I put that wait in there at all :) The actual larval users would be waiting after the notify returns anyway. Please send it to the list when you're ready. Ah, I assumed the patch was just for debugging purposes, if that's the actual fix that's fantastic. Thanks a lot! |