We have been seeing blocked tasks fairly frequently (around one per day across roughly 500 VMs), with tracebacks always similar to this. First we get:

[656898.010322] INFO: task kworker/u2:1:458736 blocked for more than 122 seconds.
[656898.011988] Not tainted 5.10.70 #1-NixOS
[656898.012981] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[656898.014679] task:kworker/u2:1 state:D stack: 0 pid:458736 ppid: 2 flags:0x00004000
[656898.016530] Workqueue: writeback wb_workfn (flush-253:0)
[656898.017698] Call Trace:
[656898.018310] __schedule+0x271/0x860
[656898.019208] schedule+0x46/0xb0
[656898.019944] rwsem_down_read_slowpath+0x169/0x490
[656898.021008] ? mempool_alloc+0x62/0x170
[656898.022032] xfs_map_blocks+0xb9/0x400 [xfs]
[656898.022998] iomap_do_writepage+0x163/0x850
[656898.023587] ? __mod_memcg_lruvec_state+0x21/0xe0
[656898.024044] write_cache_pages+0x186/0x3d0
[656898.024463] ? iomap_migrate_page+0xc0/0xc0
[656898.024882] ? submit_bio_noacct+0x3a9/0x420
[656898.025303] iomap_writepages+0x1c/0x40
[656898.025712] xfs_vm_writepages+0x64/0x90 [xfs]
[656898.026146] do_writepages+0x34/0xc0
[656898.026513] __writeback_single_inode+0x39/0x2a0
[656898.026969] writeback_sb_inodes+0x200/0x470
[656898.027388] __writeback_inodes_wb+0x4c/0xe0
[656898.027805] wb_writeback+0x1d8/0x290
[656898.028161] wb_workfn+0x29b/0x4d0
[656898.028526] ? __switch_to_asm+0x42/0x70
[656898.028915] ? __switch_to+0x7b/0x3e0
[656898.029285] process_one_work+0x1df/0x370
[656898.029696] worker_thread+0x50/0x400
[656898.030054] ? process_one_work+0x370/0x370
[656898.030475] kthread+0x11b/0x140
[656898.030798] ? __kthread_bind_mask+0x60/0x60
[656898.031214] ret_from_fork+0x22/0x30
[656898.031592] INFO: task nix-daemon:459204 blocked for more than 122 seconds.
[656898.032259] Not tainted 5.10.70 #1-NixOS
[656898.032695] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[656898.033451] task:nix-daemon state:D stack: 0 pid:459204 ppid: 1067 flags:0x00000000
[656898.034261] Call Trace:
[656898.034538] __schedule+0x271/0x860
[656898.034881] schedule+0x46/0xb0
[656898.035226] xfs_log_commit_cil+0x6a4/0x800 [xfs]
[656898.035709] ? wake_up_q+0xa0/0xa0
[656898.036073] __xfs_trans_commit+0x9d/0x310 [xfs]
[656898.036561] xfs_setattr_nonsize+0x342/0x520 [xfs]
[656898.037027] notify_change+0x348/0x4c0
[656898.037407] ? chmod_common+0xa1/0x150
[656898.037779] chmod_common+0xa1/0x150
[656898.038139] do_fchmodat+0x5a/0xb0
[656898.038491] __x64_sys_chmod+0x17/0x20
[656898.038860] do_syscall_64+0x33/0x40
[656898.039221] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[656898.039722] RIP: 0033:0x7f3d9b3a3b77
[656898.040072] RSP: 002b:00007ffc388e2b18 EFLAGS: 00000206 ORIG_RAX: 000000000000005a
[656898.040797] RAX: ffffffffffffffda RBX: 00007ffc388e2e10 RCX: 00007f3d9b3a3b77
[656898.041542] RDX: 0000000000008049 RSI: 0000000000008124 RDI: 00000000016a2400
[656898.042222] RBP: 00007ffc388e32a0 R08: 00007ffc388e4628 R09: 0000000000000000
[656898.042913] R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffc388e4620
[656898.043604] R13: 0000000000000000 R14: 00007ffc388e2df0 R15: 00007ffc388e2e10

and then a while later we see this:

[657020.886633] INFO: task qemu-ga:750 blocked for more than 122 seconds.
[657020.888346] Not tainted 5.10.70 #1-NixOS
[657020.889482] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[657020.891072] task:qemu-ga state:D stack: 0 pid: 750 ppid: 1 flags:0x00000080
[657020.891906] Call Trace:
[657020.892188] __schedule+0x271/0x860
[657020.892547] ? kvm_sched_clock_read+0xd/0x20
[657020.892983] schedule+0x46/0xb0
[657020.893319] rwsem_down_write_slowpath+0x218/0x480
[657020.893802] thaw_super+0x12/0x20
[657020.894130] __x64_sys_ioctl+0x62/0xb0
[657020.894491] do_syscall_64+0x33/0x40
[657020.894860] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[657020.895396] RIP: 0033:0x7f69a55f2b17
[657020.895762] RSP: 002b:00007ffe5687b548 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[657020.896470] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f69a55f2b17
[657020.897144] RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000006
[657020.897815] RBP: 0000000000000000 R08: 00007f69a5477c08 R09: 0000000000000000
[657020.898479] R10: 0000000000000000 R11: 0000000000000246 R12: 000055ef74a15190
[657020.899155] R13: 00007ffe5687b5c0 R14: 00007ffe5687b560 R15: 0000000000000006
[657020.899850] INFO: task kworker/u2:1:458736 blocked for more than 245 seconds.
[657020.900513] Not tainted 5.10.70 #1-NixOS
[657020.900942] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[657020.901672] task:kworker/u2:1 state:D stack: 0 pid:458736 ppid: 2 flags:0x00004000
[657020.902469] Workqueue: writeback wb_workfn (flush-253:0)
[657020.902976] Call Trace:
[657020.903220] __schedule+0x271/0x860
[657020.903556] schedule+0x46/0xb0
[657020.903870] rwsem_down_read_slowpath+0x169/0x490
[657020.904321] ? mempool_alloc+0x62/0x170
[657020.904996] xfs_map_blocks+0xb9/0x400 [xfs]
[657020.905412] iomap_do_writepage+0x163/0x850
[657020.905823] ? __mod_memcg_lruvec_state+0x21/0xe0
[657020.906269] write_cache_pages+0x186/0x3d0
[657020.906664] ? iomap_migrate_page+0xc0/0xc0
[657020.907079] ? submit_bio_noacct+0x3a9/0x420
[657020.907487] iomap_writepages+0x1c/0x40
[657020.907942] xfs_vm_writepages+0x64/0x90 [xfs]
[657020.908376] do_writepages+0x34/0xc0
[657020.908760] __writeback_single_inode+0x39/0x2a0
[657020.909222] writeback_sb_inodes+0x200/0x470
[657020.909635] __writeback_inodes_wb+0x4c/0xe0
[657020.910058] wb_writeback+0x1d8/0x290
[657020.910410] wb_workfn+0x29b/0x4d0
[657020.910756] ? __switch_to_asm+0x42/0x70
[657020.911135] ? __switch_to+0x7b/0x3e0
[657020.911489] process_one_work+0x1df/0x370
[657020.911912] worker_thread+0x50/0x400
[657020.912264] ? process_one_work+0x370/0x370
[657020.912667] kthread+0x11b/0x140
[657020.913008] ? __kthread_bind_mask+0x60/0x60
[657020.913413] ret_from_fork+0x22/0x30
[657020.913791] INFO: task nix-daemon:459204 blocked for more than 245 seconds.
[657020.914440] Not tainted 5.10.70 #1-NixOS
[657020.914879] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[657020.915612] task:nix-daemon state:D stack: 0 pid:459204 ppid: 1067 flags:0x00000000
[657020.916399] Call Trace:
[657020.916654] __schedule+0x271/0x860
[657020.917006] schedule+0x46/0xb0
[657020.917343] xfs_log_commit_cil+0x6a4/0x800 [xfs]
[657020.917806] ? wake_up_q+0xa0/0xa0
[657020.918162] __xfs_trans_commit+0x9d/0x310 [xfs]
[657020.918633] xfs_setattr_nonsize+0x342/0x520 [xfs]
[657020.919101] notify_change+0x348/0x4c0
[657020.919461] ? chmod_common+0xa1/0x150
[657020.919831] chmod_common+0xa1/0x150
[657020.920180] do_fchmodat+0x5a/0xb0
[657020.920509] __x64_sys_chmod+0x17/0x20
[657020.920878] do_syscall_64+0x33/0x40
[657020.921222] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[657020.921703] RIP: 0033:0x7f3d9b3a3b77
[657020.922055] RSP: 002b:00007ffc388e2b18 EFLAGS: 00000206 ORIG_RAX: 000000000000005a
[657020.922772] RAX: ffffffffffffffda RBX: 00007ffc388e2e10 RCX: 00007f3d9b3a3b77
[657020.923435] RDX: 0000000000008049 RSI: 0000000000008124 RDI: 00000000016a2400
[657020.924115] RBP: 00007ffc388e32a0 R08: 00007ffc388e4628 R09: 0000000000000000
[657020.924787] R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffc388e4620
[657020.925450] R13: 0000000000000000 R14: 00007ffc388e2df0 R15: 00007ffc388e2e10

IO is subsequently stalled on ALL disks (we have vda/vdb/vdc, two of them running XFS and one running swap). I am pretty sure that this is not primarily an issue in the hypervisor (Qemu 4.1) or storage (Ceph Jewel) because I can warm-reset the virtual machine using "system-reset" and the machine then boots properly. This could be related to https://bugzilla.kernel.org/show_bug.cgi?id=207053 but it doesn't look quite right.
Also note that I'm not sure it's related to the thaw at all: the stuck-thaw report comes exactly 122 seconds after the first blocked-task report, which IMHO indicates that it's a subsequent error, not a cause.
Here's another excerpt that I dug out from a different VM with the same issue. No thawing happening AFAICT around this time:

[848865.353541] INFO: task nix-daemon:1855245 blocked for more than 122 seconds.
[848865.355625] Not tainted 5.10.70 #1-NixOS
[848865.356912] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[848865.359117] task:nix-daemon state:D stack: 0 pid:1855245 ppid: 2165 flags:0x00000000
[848865.360999] Call Trace:
[848865.361480] __schedule+0x271/0x860
[848865.362069] schedule+0x46/0xb0
[848865.362657] xfs_log_commit_cil+0x6a4/0x800 [xfs]
[848865.363436] ? wake_up_q+0xa0/0xa0
[848865.364026] __xfs_trans_commit+0x9d/0x310 [xfs]
[848865.364818] xfs_create+0x472/0x560 [xfs]
[848865.365517] xfs_generic_create+0x247/0x320 [xfs]
[848865.366310] ? xfs_lookup+0x55/0x100 [xfs]
[848865.366984] path_openat+0xdd7/0x1070
[848865.367617] do_filp_open+0x88/0x130
[848865.368199] ? getname_flags.part.0+0x29/0x1a0
[848865.368925] do_sys_openat2+0x97/0x150
[848865.369557] __x64_sys_openat+0x54/0x90
[848865.370176] do_syscall_64+0x33/0x40
[848865.370787] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[848865.371619] RIP: 0033:0x7f48b24c0ea8
[848865.372196] RSP: 002b:00007ffed91cee30 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[848865.373398] RAX: ffffffffffffffda RBX: 00000000000800c1 RCX: 00007f48b24c0ea8
[848865.374530] RDX: 00000000000800c1 RSI: 00000000015c4750 RDI: 00000000ffffff9c
[848865.375661] RBP: 00000000015c4750 R08: 0000000000000000 R09: 0000000000000003
[848865.376790] R10: 00000000000001b6 R11: 0000000000000293 R12: 00007ffed91cf220
[848865.377920] R13: 00007ffed91ceee0 R14: 00007ffed91ceed0 R15: 00007ffed91d0c50

and I can see it not making any progress:

[848988.235884] INFO: task nix-daemon:1855245 blocked for more than 245 seconds.
[848988.238567] Not tainted 5.10.70 #1-NixOS
[848988.240254] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[848988.242358] task:nix-daemon state:D stack: 0 pid:1855245 ppid: 2165 flags:0x00000000
[848988.243756] Call Trace:
[848988.244183] __schedule+0x271/0x860
[848988.244780] schedule+0x46/0xb0
[848988.245366] xfs_log_commit_cil+0x6a4/0x800 [xfs]
[848988.246153] ? wake_up_q+0xa0/0xa0
[848988.246775] __xfs_trans_commit+0x9d/0x310 [xfs]
[848988.247572] xfs_create+0x472/0x560 [xfs]
[848988.248269] xfs_generic_create+0x247/0x320 [xfs]
[848988.249084] ? xfs_lookup+0x55/0x100 [xfs]
[848988.249781] path_openat+0xdd7/0x1070
[848988.250388] do_filp_open+0x88/0x130
[848988.251000] ? getname_flags.part.0+0x29/0x1a0
[848988.251746] do_sys_openat2+0x97/0x150
[848988.252366] __x64_sys_openat+0x54/0x90
[848988.253015] do_syscall_64+0x33/0x40
[848988.253610] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[848988.254445] RIP: 0033:0x7f48b24c0ea8
[848988.255071] RSP: 002b:00007ffed91cee30 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[848988.256292] RAX: ffffffffffffffda RBX: 00000000000800c1 RCX: 00007f48b24c0ea8
[848988.257460] RDX: 00000000000800c1 RSI: 00000000015c4750 RDI: 00000000ffffff9c
[848988.258622] RBP: 00000000015c4750 R08: 0000000000000000 R09: 0000000000000003
[848988.259777] R10: 00000000000001b6 R11: 0000000000000293 R12: 00007ffed91cf220
[848988.260933] R13: 00007ffed91ceee0 R14: 00007ffed91ceed0 R15: 00007ffed91d0c50

Both stack traces are identical in every regard, which might be useful information.
I have another machine that managed to break free after 20 minutes; it did around 3 thaws in that time:

Oct 20 18:05:22 pixometerstag04 kernel: INFO: task nix-daemon:1387736 blocked for more than 1228 seconds.
Oct 20 18:05:22 pixometerstag04 kernel: Not tainted 5.10.70 #1-NixOS
Oct 20 18:05:22 pixometerstag04 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 20 18:05:22 pixometerstag04 kernel: task:nix-daemon state:D stack: 0 pid:1387736 ppid: 1356 flags:0x00000000
Oct 20 18:05:22 pixometerstag04 kernel: Call Trace:
Oct 20 18:05:22 pixometerstag04 kernel: __schedule+0x271/0x860
Oct 20 18:05:22 pixometerstag04 kernel: schedule+0x46/0xb0
Oct 20 18:05:22 pixometerstag04 kernel: xfs_log_commit_cil+0x6a4/0x800 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel: ? wake_up_q+0xa0/0xa0
Oct 20 18:05:22 pixometerstag04 kernel: __xfs_trans_commit+0x9d/0x310 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel: xfs_create+0x472/0x560 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel: xfs_generic_create+0x247/0x320 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel: ? xfs_lookup+0x55/0x100 [xfs]
Oct 20 18:05:22 pixometerstag04 kernel: path_openat+0xdd7/0x1070
Oct 20 18:05:22 pixometerstag04 kernel: do_filp_open+0x88/0x130
Oct 20 18:05:22 pixometerstag04 kernel: ? getname_flags.part.0+0x29/0x1a0
Oct 20 18:05:22 pixometerstag04 kernel: do_sys_openat2+0x97/0x150
Oct 20 18:05:22 pixometerstag04 kernel: __x64_sys_openat+0x54/0x90
Oct 20 18:05:22 pixometerstag04 kernel: do_syscall_64+0x33/0x40
Oct 20 18:05:22 pixometerstag04 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 20 18:05:22 pixometerstag04 kernel: RIP: 0033:0x7f3e3114dea8
Oct 20 18:05:22 pixometerstag04 kernel: RSP: 002b:00007ffd92f7df10 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
Oct 20 18:05:22 pixometerstag04 kernel: RAX: ffffffffffffffda RBX: 00000000000800c1 RCX: 00007f3e3114dea8
Oct 20 18:05:22 pixometerstag04 kernel: RDX: 00000000000800c1 RSI: 000000000195cb50 RDI: 00000000ffffff9c
Oct 20 18:05:22 pixometerstag04 kernel: RBP: 000000000195cb50 R08: 0000000000000000 R09: 0000000000000003
Oct 20 18:05:22 pixometerstag04 kernel: R10: 00000000000001b6 R11: 0000000000000293 R12: 00007ffd92f7e300
Oct 20 18:05:22 pixometerstag04 kernel: R13: 00007ffd92f7dfc0 R14: 00007ffd92f7dfb0 R15: 00007ffd92f7fd30

However, the first thaw came while the machine was already reporting 122s and 245s of hangs, and the last thaw came after it broke free. Interestingly, it also logged a variety of operations that succeeded during that period, so maybe the hang only affected the /tmp filesystem and not /. Postgresql was apparently still writing to the disk in that period, so I'd guess this only happened on /tmp. This is square in the middle of the reported blocks:

Oct 20 17:52:00 pixometerstag04 postgres[1008]: user=,db= LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.004 s, sync=0.001 s, total=0.018 s; sync fi>
Another thing I noticed: it always hangs the nix-daemon (nixos.org) which does some dance around managing an overlay read-only mount that gets re-mounted rw for a short period while changing things in there. Here is what it looks like normally:

/dev/vda1 on / type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/vda1 on /nix/store type xfs (ro,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
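For illustration, the rw/ro flip described above boils down to something like the following sketch. This is an assumption on my part: the helper name with_writable_store and the MOUNT_CMD override are hypothetical (the override just lets the sequencing be exercised without root); the real logic lives inside Nix itself.

```shell
#!/usr/bin/env bash
# MOUNT_CMD is a hypothetical override so the sequencing can be shown
# without actually remounting anything (defaults to the real mount).
MOUNT_CMD=${MOUNT_CMD:-mount}

with_writable_store() {
    # Re-mount the store writable, run the given command, then restore ro.
    "$MOUNT_CMD" -o remount,rw /nix/store || return 1
    "$@"
    local rc=$?
    "$MOUNT_CMD" -o remount,ro /nix/store
    return $rc
}
```

Any crash or hang between the two remounts would leave the store writable, which is why the window matters here.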
I've also raised this with the folks from NixOS in case they can shed some light on what the dance is that the nix-daemon performs here: https://github.com/NixOS/nixpkgs/issues/142394
You might want to do an "echo w > /proc/sysrq-trigger" when this happens, to get all blocked tasks. The thaw seems a little suspicious; what all is going on there with freezing and thawing?
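The suggestion above can be wrapped as a tiny helper. A sketch, with one assumption: the SYSRQ_TRIGGER override is hypothetical and only exists so the write can be pointed somewhere harmless; the real target is /proc/sysrq-trigger, which needs root and CONFIG_MAGIC_SYSRQ.

```shell
#!/usr/bin/env bash
# Real target is /proc/sysrq-trigger; overridable for demonstration.
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

dump_blocked_tasks() {
    # 'w' asks the kernel to log all tasks in uninterruptible (D) state;
    # the stack traces then show up in dmesg / the journal.
    echo w > "$SYSRQ_TRIGGER"
}
```

After calling it on a real system, the traces can be collected with dmesg or journalctl -k.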
Thanks, I'll try to do something like that, likely by using a qemu monitor command to send in the trigger via keyboard the next time I catch this happening. The freezing/thawing is part of our image backup structure; over the years we have established the practice of regularly calling 'thaw' gratuitously, because thawing can once in a blue moon leave VMs stuck indefinitely if an initial thaw doesn't go through. This has been reliable for at least 2-3 years by now, and I'm not sure whether it will turn out to be a red herring or not.
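The "gratuitous thaw" practice described above can be sketched as a retry loop around the thaw. Assumptions: fsfreeze is the util-linux tool (its --unfreeze flag is real), while the function name and the FREEZE_CMD override are hypothetical, added so the retry logic can be exercised without root.

```shell
#!/usr/bin/env bash
# FREEZE_CMD is a hypothetical override for demonstration; the real tool
# is fsfreeze from util-linux (xfs_freeze works the same way for XFS).
FREEZE_CMD=${FREEZE_CMD:-fsfreeze}

thaw_with_retries() {
    # Retry the thaw a few times, since a single thaw occasionally
    # fails to take effect and leaves the filesystem frozen.
    local mountpoint=$1 tries=${2:-3} i
    for ((i = 1; i <= tries; i++)); do
        if "$FREEZE_CMD" --unfreeze "$mountpoint"; then
            return 0
        fi
        sleep 1
    done
    return 1
}
```

Note that thawing an already-thawed filesystem simply fails with an error, which is what makes the gratuitous extra calls safe.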
On Wed, Oct 20, 2021 at 07:36:07AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> [656898.034261] Call Trace:
> [656898.034538] __schedule+0x271/0x860
> [656898.034881] schedule+0x46/0xb0
> [656898.035226] xfs_log_commit_cil+0x6a4/0x800 [xfs]

This looks like it is stuck on the hard throttle. Which means this is the likely fix:

19f4e7cc8197 xfs: Fix CIL throttle hang when CIL space used going backwards

Cheers,

Dave.
Thanks Dave! From what I can tell this fix was merged into 5.14 with a largish batch of other changes (via 9f7b640f001f). I checked that the fix you mentioned applies cleanly to the current 5.10 codebase (it does), but I'm not sure whether we are missing further backports for 5.10. I wonder whether "we" (I don't think I'm expert enough to do this reliably) need to check for other fixes that didn't make it back to 5.10. Do you have a recommendation for how, or whom, to approach? As I don't do this often, I would contact GKH and ask for a backport of the patch you mentioned (assuming you consider it appropriate to apply to 5.10 from a semantic perspective), but maybe you have better advice.
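Checking whether an upstream commit cherry-picks cleanly onto a stable tree can be done with a throwaway branch. A sketch only; the function name check_backport is made up, and it assumes a local clone of the relevant tree (the commit id would be 19f4e7cc8197 from Dave's comment).

```shell
#!/usr/bin/env bash
check_backport() {
    # Try a cherry-pick on a throwaway branch without committing, report
    # the result, then leave the tree clean again.
    local tree=$1 base=$2 commit=$3
    git -C "$tree" checkout -q -B backport-test "$base" || return 1
    if git -C "$tree" cherry-pick -n "$commit" >/dev/null 2>&1; then
        echo "applies cleanly"
    else
        echo "needs manual backport"
    fi
    git -C "$tree" cherry-pick --quit >/dev/null 2>&1
    git -C "$tree" reset --hard -q
}
```

A clean apply is only the mechanical half, of course; whether the fix is semantically correct on the older base is the part that needs maintainer review.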
Alright, I managed to catch a machine that logged a blocked request for only the /tmp partition. I logged in, ran a 'sync', and issued the 'w' sysrq. I'm attaching the log here: at first you see a number of blocked requests, then a while later I log in and run the sync, and you can see the sync stuck as well, along with a large number of workers.
Going through the log I can see that, aside from 'sync' and the workers, there is a tar process that has been stuck since 13:20 and is not making any progress. I also cannot see any IO on the block device.
Created attachment 299295 [details] log of a vm with only vdc1 (/tmp) stuck
We have run into a similar issue. We use xfs_freeze to take backup snapshots of our filesystems, and occasionally the xfs_freeze/thaw process turns into a zombie. We used echo w > /proc/sysrq-trigger to get the stack traces:

[736826.666940] task:kswapd0 state:D stack: 0 pid: 576 ppid: 2 flags:0x00004000
[736826.674439] Call Trace:
[736826.677686] __schedule+0x1f9/0x660
[736826.681593] schedule+0x46/0xb0
[736826.685141] percpu_rwsem_wait+0xa9/0x180
[736826.688867] ? percpu_free_rwsem+0x30/0x30
[736826.692764] __percpu_down_read+0x49/0x60
[736826.696549] xfs_trans_alloc+0x15d/0x170
[736826.700367] xfs_free_eofblocks+0x130/0x1e0
[736826.807546] xfs_fs_destroy_inode+0xa8/0x1b0
[736826.811610] destroy_inode+0x3b/0x70
[736826.815150] dispose_list+0x48/0x60
[736826.818796] prune_icache_sb+0x54/0x80
[736826.822476] super_cache_scan+0x161/0x1e0
[736826.826341] do_shrink_slab+0x145/0x240
[736826.830294] shrink_slab_memcg+0xcd/0x1e0
[736826.834371] shrink_node_memcgs+0x186/0x1c0
[736826.838491] shrink_node+0x14f/0x570
[736826.842088] balance_pgdat+0x232/0x510
[736826.845820] kswapd+0xe2/0x170
[736826.849154] ? balance_pgdat+0x510/0x510
[736826.852944] kthread+0x11b/0x140
[736826.856326] ? __kthread_bind_mask+0x60/0x60
[736826.860313] ret_from_fork+0x22/0x30
[736826.863839] task:systemd-logind state:D stack: 0 pid: 2073 ppid: 1 flags:0x00000084
[736826.871453] Call Trace:
[736826.874898] __schedule+0x1f9/0x660
[736826.878898] ? kvm_sched_clock_read+0xd/0x20
[736826.883265] schedule+0x46/0xb0
[736826.887003] rwsem_down_write_slowpath+0x234/0x4b0
[736826.891114] ? __kmalloc+0x154/0x2b0
[736826.894794] prealloc_shrinker+0x4c/0xf0
[736826.898526] alloc_super+0x29a/0x2f0
[736826.902210] ? set_anon_super+0x40/0x40
[736826.905867] sget_fc+0x6c/0x240
[736826.909206] ? shmem_create+0x20/0x20
[736826.912912] get_tree_nodev+0x23/0xb0
[736826.916533] ? bpf_lsm_capset+0x10/0x10
[736826.920310] vfs_get_tree+0x25/0xb0
[736826.923848] do_new_mount+0x152/0x1b0
[736826.927548] __x64_sys_mount+0x103/0x140
[736826.931531] do_syscall_64+0x33/0x40
[736826.935597] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[736826.940323] RIP: 0033:0x7f49f8efcdba
[736826.944298] RSP: 002b:00007ffdb98f2658 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[736826.952238] RAX: ffffffffffffffda RBX: 000055f21a7113a0 RCX: 00007f49f8efcdba
[736826.959940] RDX: 000055f218672af0 RSI: 000055f21a721de0 RDI: 000055f218672af0
[736826.967721] RBP: 00007ffdb98f2690 R08: 000055f21a7a51e0 R09: 0000000000000000
[736826.975503] R10: 0000000000000006 R11: 0000000000000202 R12: 0000000000000000
[736826.983253] R13: 00007ffdb98f2740 R14: 00007ffdb98f2820 R15: 0000000000000000
[736826.990979] task:dockerd state:D stack: 0 pid: 6639 ppid: 1 flags:0x00000080
[736826.999289] Call Trace:
[736827.002695] __schedule+0x1f9/0x660
[736827.006708] schedule+0x46/0xb0
[736827.010512] rwsem_down_write_slowpath+0x234/0x4b0
[736827.015214] ? __kmalloc+0x154/0x2b0
[736827.019291] prealloc_shrinker+0x4c/0xf0
[736827.023495] alloc_super+0x29a/0x2f0
[736827.027505] sget+0x91/0x220
[736827.031089] ? destroy_super_rcu+0x40/0x40
[736827.035268] ? ovl_get_lowerstack+0x190/0x190 [overlay]
[736827.040172] mount_nodev+0x26/0x90
[736827.044070] legacy_get_tree+0x27/0x40
[736827.048169] vfs_get_tree+0x25/0xb0
[736827.052088] do_new_mount+0x152/0x1b0
[736827.056076] __x64_sys_mount+0x103/0x140
[736827.060176] do_syscall_64+0x33/0x40
[736827.064158] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[736827.068950] RIP: 0033:0x55c5416cc2aa
[736827.072915] RSP: 002b:000000c0015b2768 EFLAGS: 00000216 ORIG_RAX: 00000000000000a5
[736827.080676] RAX: ffffffffffffffda RBX: 000000c00005e800 RCX: 000055c5416cc2aa
[736827.088375] RDX: 000000c000df5320 RSI: 000000c0012db140 RDI: 000000c000df52e8
[736827.095311] RBP: 000000c0015b2800 R08: 000000c0015a0600 R09: 0000000000000000
[736827.102300] R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000133
[736827.109100] R13: 0000000000000132 R14: 0000000000000200 R15: 0000000000000055
[737030.525416] task:sh state:D stack: 0 pid:27526 ppid: 27517 flags:0x00000080
[737030.532905] Call Trace:
[737030.536188] __schedule+0x1f9/0x660
[737030.540060] schedule+0x46/0xb0
[737030.543385] percpu_rwsem_wait+0xa9/0x180
[737030.547220] ? percpu_free_rwsem+0x30/0x30
[737030.551093] __percpu_down_read+0x49/0x60
[737030.555220] mnt_want_write+0x66/0x90
[737030.558966] open_last_lookups+0x30d/0x3f0
[737030.562914] ? path_init+0x2bd/0x3e0
[737030.681735] path_openat+0x88/0x1d0
[737030.685600] ? xfs_iunlock+0x94/0xe0
[737030.689575] do_filp_open+0x88/0x130
[737030.693496] ? __check_object_size.part.0+0x11f/0x140
[737030.698201] do_sys_openat2+0x97/0x150
[737030.702232] __x64_sys_openat+0x54/0x90
[737030.706301] do_syscall_64+0x33/0x40
[737030.710248] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[737030.714945] RIP: 0033:0x7f4502c0d14e
[737030.718849] RSP: 002b:00007ffd066be7a0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[737030.726628] RAX: ffffffffffffffda RBX: 00000000009ba360 RCX: 00007f4502c0d14e
[737030.734177] RDX: 0000000000000241 RSI: 00000000009bb8c0 RDI: ffffffffffffff9c
[737030.741759] RBP: 00007ffd066be8a0 R08: 0000000000000020 R09: 00000000009bb8c0
[737030.749452] R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000
[737030.757142] R13: 0000000000000001 R14: 0000000000000001 R15: 00000000009bad60
[737030.764836] task:bash state:D stack: 0 pid:27580 ppid: 1 flags:0x00000084
[737030.773068] Call Trace:
[737030.776437] __schedule+0x1f9/0x660
[737030.780354] schedule+0x46/0xb0
[737030.784080] percpu_rwsem_wait+0xa9/0x180
[737030.788264] ? percpu_free_rwsem+0x30/0x30
[737030.792476] __percpu_down_read+0x49/0x60
[737030.796642] mnt_want_write+0x66/0x90
[737030.800660] open_last_lookups+0x30d/0x3f0
[737030.804923] ? path_init+0x2bd/0x3e0
[737030.808895] path_openat+0x88/0x1d0
[737030.812800] ? xfs_iunlock+0x94/0xe0
[737030.816745] do_filp_open+0x88/0x130
[737030.820694] ? getname_flags.part.0+0x29/0x1a0
[737030.825097] ? __check_object_size.part.0+0x11f/0x140
[737030.829846] ? audit_alloc_name+0x8c/0xe0
[737030.834010] do_sys_openat2+0x97/0x150
[737030.838056] __x64_sys_openat+0x54/0x90
[737030.842164] do_syscall_64+0x33/0x40
[737030.846146] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[737030.850855] RIP: 0033:0x7f2072a9414e
[737030.854822] RSP: 002b:00007ffd855832e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[737030.862655] RAX: ffffffffffffffda RBX: 000000000178aab0 RCX: 00007f2072a9414e
[737030.870259] RDX: 0000000000000241 RSI: 000000000178af20 RDI: ffffffffffffff9c
[737030.877820] RBP: 00007ffd855833e0 R08: 0000000000000020 R09: 000000000178af20
[737030.885463] R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000
[737030.893083] R13: 0000000000000003 R14: 0000000000000001 R15: 000000000178aa60
[736992.593491] task:k state:D stack: 0 pid:11166 ppid: 29014 flags:0x00000184
[736992.600911] Call Trace:
[736992.604234] __schedule+0x1f9/0x660
[736992.607842] schedule+0x46/0xb0
[736992.611295] percpu_rwsem_wait+0xa9/0x180
[736992.614999] ? percpu_free_rwsem+0x30/0x30
[736992.618901] __percpu_down_read+0x49/0x60
[736992.622614] mnt_want_write+0x66/0x90
[736992.626323] open_last_lookups+0x30d/0x3f0
[736992.630064] ? path_init+0x2bd/0x3e0
[736992.633795] path_openat+0x88/0x1d0
[736992.637360] do_filp_open+0x88/0x130
[736992.640911] ? getname_flags.part.0+0x29/0x1a0
[736992.645006] ? __check_object_size.part.0+0x11f/0x140
[736992.649268] ? audit_alloc_name+0x8c/0xe0
[736992.653173] do_sys_openat2+0x97/0x150
[736992.656832] __x64_sys_openat+0x54/0x90
[736992.660619] do_syscall_64+0x33/0x40
[736992.664181] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[736992.668543] RIP: 0033:0x7f81dc2d11ae
[736992.672208] RSP: 002b:00007ffcff615c00 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[736992.679509] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f81dc2d11ae
[736992.686481] RDX: 0000000000000042 RSI: 00007f81d757e6b8 RDI: 00000000ffffff9c
[736992.693723] RBP: 00007f81d7ba14c4 R08: 000000000000003a R09: 0000000000000000
[736992.700684] R10: 00000000000001b6 R11: 0000000000000246 R12: 00007f81d757e6b8
[736992.708136] R13: 00007f81d7ba14c4 R14: 00007f81dbfb45b8 R15: 00007f81d7f3f8fc
[736992.715824] task:k state:D stack: 0 pid:11273 ppid: 29014 flags:0x00000184
[736992.724256] Call Trace:
[736992.727698] __schedule+0x1f9/0x660
[736992.731616] schedule+0x46/0xb0
[736992.735367] percpu_rwsem_wait+0xa9/0x180
[736992.739512] ? percpu_free_rwsem+0x30/0x30
[736992.743681] __percpu_down_read+0x49/0x60
[736992.747837] mnt_want_write+0x66/0x90
[736992.751896] open_last_lookups+0x30d/0x3f0
[736992.756191] ? path_init+0x2bd/0x3e0
[736992.760175] path_openat+0x88/0x1d0
[736992.764055] do_filp_open+0x88/0x130
[736992.768115] ? getname_flags.part.0+0x29/0x1a0
[736992.772563] ? __check_object_size.part.0+0x11f/0x140
[736992.777391] ? audit_alloc_name+0x8c/0xe0
[736992.781658] do_sys_openat2+0x97/0x150
[736992.785794] __x64_sys_openat+0x54/0x90
[736992.789856] do_syscall_64+0x33/0x40
[736992.793913] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[736992.798746] RIP: 0033:0x7fb7d337f1ae
[736992.802878] RSP: 002b:00007ffd9b2dbda0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[736992.810824] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fb7d337f1ae
[736992.818496] RDX: 0000000000000042 RSI: 00007fb7d2e2aa88 RDI: 00000000ffffff9c
[736992.826359] RBP: 00007fb7cecdc7dc R08: 0000000000000008 R09: 0000000000000000
[736992.833753] R10: 00000000000001b6 R11: 0000000000000246 R12: 00007fb7d2e2aa88
[736992.840727] R13: 00007fb7d2e2aa7c R14: 00007fb7d3150920 R15: 0000000000000004
[736992.847580] task:tail state:D stack: 0 pid:11608 ppid: 9603 flags:0x00004186
[736992.855146] Call Trace:
[736992.858238] __schedule+0x1f9/0x660
[736992.861943] schedule+0x46/0xb0
[736992.865387] percpu_rwsem_wait+0xa9/0x180
[736992.869138] ? percpu_free_rwsem+0x30/0x30
[736992.873134] __percpu_down_read+0x49/0x60
[736992.876948] xfs_trans_alloc+0x15d/0x170
[736992.880854] xfs_free_eofblocks+0x130/0x1e0
[736992.885085] xfs_release+0x13d/0x160
[736992.888807] __fput+0x96/0x240
[736992.892138] task_work_run+0x5f/0x90
[736992.895834] do_exit+0x22c/0x3b0
[736992.899189] ? timerqueue_del+0x1e/0x40
[736992.902967] do_group_exit+0x33/0xa0
[736992.906469] get_signal+0x15d/0x5b0
[736993.013638] arch_do_signal+0x25/0xf0
[736993.017223] exit_to_user_mode_loop+0x8d/0xc0
[736993.021148] exit_to_user_mode_prepare+0x6a/0x70
[736993.025371] syscall_exit_to_user_mode+0x22/0x140
[736993.029417] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[736993.033852] RIP: 0033:0x7f907725e6f4
[736993.037391] RSP: 002b:00007ffcf7722f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000023
[736993.044403] RAX: fffffffffffffdfc RBX: 0000000000000001 RCX: 00007f907725e6f4
[736993.051384] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007ffcf7722f40
[736993.058190] RBP: 0000000000000000 R08: 00007ffcf7723070 R09: 00007f9077312e80
[736993.065109] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcf7722f40
[736993.071934] R13: 00007ffcf77256bf R14: 0000560396137440 R15: 0000000000000000
[736996.659414] task:process-exporte state:D stack: 0 pid:30396 ppid: 2204 flags:0x00004082
[736996.666720] Call Trace:
[736996.669864] __schedule+0x1f9/0x660
[736996.673371] schedule+0x46/0xb0
[736996.676730] rwsem_down_write_slowpath+0x234/0x4b0
[736996.680991] unregister_memcg_shrinker.isra.0+0x18/0x40
[736996.685377] unregister_shrinker+0x7b/0x80
[736996.689295] deactivate_locked_super+0x29/0xa0
[736996.693241] cleanup_mnt+0x12d/0x190
[736996.696912] task_work_run+0x5f/0x90
[736996.700444] do_exit+0x22c/0x3b0
[736996.703996] do_group_exit+0x33/0xa0
[736996.707581] get_signal+0x15d/0x5b0
[736996.711191] arch_do_signal+0x25/0xf0
[736996.714771] exit_to_user_mode_loop+0x8d/0xc0
[736996.718759] exit_to_user_mode_prepare+0x6a/0x70
[736996.722740] syscall_exit_to_user_mode+0x22/0x140
[736996.726869] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[736996.835174] RIP: 0033:0x4a2c50
[736996.838687] RSP: 002b:000000c000213798 EFLAGS: 00000216 ORIG_RAX: 0000000000000000
[736996.845651] RAX: 0000000000000200 RBX: 000000c000032500 RCX: 00000000004a2c50
[736996.852636] RDX: 0000000000000200 RSI: 000000c0008ef600 RDI: 0000000000000005
[736996.859446] RBP: 000000c0002137e8 R08: 0000000000000000 R09: 0000000000000000
[736996.866347] R10: 0000000000000000 R11: 0000000000000216 R12: ffffffffffffffff
[736996.873124] R13: 000000000000000c R14: 000000000000000b R15: 0000000000000010
Here is the result of echo t > /proc/sysrq-trigger:

1 I root 578 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfsalloc]
1 I root 579 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs_mru_cache]
1 I root 1251 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-buf/nvme0n1]
1 I root 1252 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-conv/nvme0n]
1 I root 1253 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-cil/nvme0n1]
1 I root 1254 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-reclaim/nvm]
1 I root 1255 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-eofblocks/n]
1 I root 1256 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-log/nvme0n1]
1 S root 1257 2 0 80 0 - 0 - Oct20 ? 00:02:06 [xfsaild/nvme0n1]
1 I root 1990 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-buf/dm-0]
1 I root 1991 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-conv/dm-0]
1 I root 1992 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-cil/dm-0]
1 I root 1993 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-reclaim/dm-]
1 I root 1994 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-eofblocks/d]
1 I root 1995 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-log/dm-0]
1 S root 1996 2 0 80 0 - 0 - Oct20 ? 00:00:00 [xfsaild/dm-0]
1 I root 1999 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-buf/dm-1]
1 I root 2000 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-conv/dm-1]
1 I root 2001 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-cil/dm-1]
1 I root 2002 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-reclaim/dm-]
1 I root 2003 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-eofblocks/d]
1 I root 2004 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-log/dm-1]
1 S root 2005 2 0 80 0 - 0 - Oct20 ? 00:01:53 [xfsaild/dm-1]
1 I root 2008 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-buf/dm-2]
1 I root 2009 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-conv/dm-2]
1 I root 2010 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-cil/dm-2]
1 I root 2011 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-reclaim/dm-]
1 I root 2012 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-eofblocks/d]
1 I root 2013 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-log/dm-2]
1 S root 2014 2 0 80 0 - 0 - Oct20 ? 00:00:00 [xfsaild/dm-2]
0 S root 2226 1942 0 80 0 - 30496 - 17:41 ? 00:00:00 /bin/sh -f /usr/sbin/xfs_freeze -u /usr/local/appian
4 D root 2230 2226 0 80 0 - 29825 - 17:41 ? 00:00:00 /usr/sbin/xfs_io -F -r -p xfs_freeze -x -c thaw /usr/local/appian
1 I root 4068 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-buf/dm-3]
1 I root 4069 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-conv/dm-3]
1 I root 4070 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-cil/dm-3]
1 I root 4071 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-reclaim/dm-]
1 I root 4072 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-eofblocks/d]
1 I root 4073 2 0 60 -20 - 0 - Oct20 ? 00:00:00 [xfs-log/dm-3]
1 S root 4074 2 0 80 0 - 0 - Oct20 ? 00:02:01 [xfsaild/dm-3]
0 S root 10903 13809 0 80 0 - 29855 - 17:57 pts/0 00:00:00 grep --color=auto xfs
1 I root 11221 2 0 80 0 - 0 - 17:44 ? 00:00:00 [kworker/1:4-xfs]
4 D root 22209 1 0 80 0 - 29825 - 16:08 ? 00:00:00 /usr/sbin/xfs_io -F -r -p xfs_freeze -x -c thaw /usr/local/appian
1 I root 23187 2 0 80 0 - 0 - 17:37 ? 00:00:00 [kworker/1:3-xfs]
1 I root 30083 2 0 80 0 - 0 - 17:29 ? 00:00:00 [kworker/1:1-xfs]
@Pedram: not sure those are related. I'm 50/50 on whether my issue is freeze-related, and all my stack traces have xfs_log_commit_cil in them, which I don't see in yours; this could be a separate issue.
I started trying out the fix that Dave suggested and am using it with 5.10.76 (it applied cleanly with a bit of fuzz). @Dave, do you happen to know whether there's a helper that can stress test live systems in this regard?
On Wed, Nov 10, 2021 at 03:16:35PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=214767
>
> --- Comment #16 from Christian Theune (ct@flyingcircus.io) ---
> I started trying out the fix that Dave and am using it with 5.10.76 (applied
> clean with a bit of fuzzing).
>
> @Dave do you happen do know whether there's a helper that can stress test
> live
> systems in this regard?

fstests has some tests in the "freeze" group that stress freeze/thaw. And it has lots of other tests in it that will tell you if there's a regression in your backport, so might be an idea to run a full "auto" group pass rather than just the freeze group....

Cheers,

Dave.
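The fstests run Dave describes amounts to something like the sketch below. Assumptions: an xfstests-dev checkout already configured (local.config pointing at real test and scratch devices); the CHECK override and the run_freeze_tests helper are hypothetical, added so the sequencing can be shown without a test rig. The ./check script and the "freeze" and "auto" group names are real fstests conventions.

```shell
#!/usr/bin/env bash
# CHECK defaults to the real fstests driver; overridable for demonstration.
CHECK=${CHECK:-./check}

run_freeze_tests() {
    # First the targeted freeze/thaw stress group, then the full auto
    # group to catch regressions elsewhere in the backport.
    "$CHECK" -g freeze && "$CHECK" -g auto
}
```

Running the auto group takes hours on most setups, so the freeze group alone is a reasonable quick first pass before committing to the full run.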