I'm on NixOS unstable 19.03, with an Intel i7-4820K, 20 GB RAM, and 10 GB zram (though it reproduces without zram as well). I run several dozen Docker containers through Kubernetes, each with an NFS mount inside. Randomly, however (it can take as long as a day or as little as 10 minutes to occur), all reads hang as nfsd locks up:

```
Nov 10 15:29:55 rem kernel: INFO: task htop:6608 blocked for more than 120 seconds.
Nov 10 15:29:55 rem kernel: Tainted: P O 4.19.1 #1-NixOS
Nov 10 15:29:55 rem kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 10 15:29:55 rem kernel: htop D 0 6608 6591 0x00000000
Nov 10 15:29:55 rem kernel: Call Trace:
Nov 10 15:29:55 rem kernel: ? __schedule+0x1d3/0x6f0
Nov 10 15:29:55 rem kernel: schedule+0x28/0x80
Nov 10 15:29:55 rem kernel: rwsem_down_read_failed+0xee/0x150
Nov 10 15:29:55 rem kernel: ? call_rwsem_down_read_failed+0x14/0x30
Nov 10 15:29:55 rem kernel: call_rwsem_down_read_failed+0x14/0x30
Nov 10 15:29:55 rem kernel: down_read+0x1c/0x30
Nov 10 15:29:55 rem kernel: __access_remote_vm+0x55/0x2e0
Nov 10 15:29:55 rem kernel: proc_pid_cmdline_read+0x1c0/0x310
Nov 10 15:29:55 rem kernel: __vfs_read+0x36/0x170
Nov 10 15:29:55 rem kernel: vfs_read+0x89/0x130
Nov 10 15:29:55 rem kernel: ksys_read+0x52/0xc0
Nov 10 15:29:55 rem kernel: do_syscall_64+0x4e/0x100
Nov 10 15:29:55 rem kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 10 15:29:55 rem kernel: RIP: 0033:0x7fee3e824281
Nov 10 15:29:55 rem kernel: Code: Bad RIP value.
Nov 10 15:29:55 rem kernel: RSP: 002b:00007fff11de02a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Nov 10 15:29:55 rem kernel: RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fee3e824281
Nov 10 15:29:55 rem kernel: RDX: 0000000000001000 RSI: 00007fff11de0370 RDI: 0000000000000005
Nov 10 15:29:55 rem kernel: RBP: 0000000000001000 R08: 00007fee3eaeae00 R09: 000000000000001e
Nov 10 15:29:55 rem kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff11de0370
Nov 10 15:29:55 rem kernel: R13: 0000000000000000 R14: 0000000000000005 R15: 00000000008c5760
Nov 10 15:29:55 rem kernel: INFO: task nfsd:7464 blocked for more than 120 seconds.
Nov 10 15:29:55 rem kernel: Tainted: P O 4.19.1 #1-NixOS
Nov 10 15:29:55 rem kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 10 15:29:55 rem kernel: nfsd D 0 7464 2 0x80000000
Nov 10 15:29:55 rem kernel: Call Trace:
Nov 10 15:29:55 rem kernel: ? __schedule+0x1d3/0x6f0
Nov 10 15:29:55 rem kernel: schedule+0x28/0x80
Nov 10 15:29:55 rem kernel: rwsem_down_write_failed+0x15e/0x350
Nov 10 15:29:55 rem kernel: ? call_rwsem_down_write_failed+0x13/0x20
Nov 10 15:29:55 rem kernel: call_rwsem_down_write_failed+0x13/0x20
Nov 10 15:29:55 rem kernel: down_write+0x29/0x40
Nov 10 15:29:55 rem kernel: ext4_file_write_iter+0x91/0x3d0 [ext4]
Nov 10 15:29:55 rem kernel: ? nfsd_proc_write+0x160/0x160 [nfsd]
Nov 10 15:29:55 rem kernel: ? exportfs_decode_fh+0xf2/0x2b0
Nov 10 15:29:55 rem kernel: do_iter_readv_writev+0x12a/0x1a0
Nov 10 15:29:55 rem kernel: do_iter_write+0x80/0x190
Nov 10 15:29:55 rem kernel: nfsd_vfs_write+0xed/0x470 [nfsd]
Nov 10 15:29:55 rem kernel: nfsd4_write+0x18a/0x220 [nfsd]
Nov 10 15:29:55 rem kernel: nfsd4_proc_compound+0x4f3/0x6e0 [nfsd]
Nov 10 15:29:55 rem kernel: nfsd_dispatch+0xf5/0x230 [nfsd]
Nov 10 15:29:55 rem kernel: svc_process_common+0x4c4/0x720 [sunrpc]
Nov 10 15:29:55 rem kernel: ? svc_recv+0x2c2/0xa10 [sunrpc]
Nov 10 15:29:55 rem kernel: ? nfsd_destroy+0x60/0x60 [nfsd]
Nov 10 15:29:55 rem kernel: svc_process+0xd7/0xf0 [sunrpc]
Nov 10 15:29:55 rem kernel: nfsd+0xe3/0x150 [nfsd]
Nov 10 15:29:55 rem kernel: kthread+0x113/0x130
Nov 10 15:29:55 rem kernel: ? kthread_create_worker_on_cpu+0x70/0x70
Nov 10 15:29:55 rem kernel: ret_from_fork+0x35/0x40
```

For the time being, I have reverted to kernel 4.14.78, which does not exhibit the issue.
Those are the only processes it reports blocked? I'm guessing it's the inode_lock() at the start of ext4_file_write_iter() that's blocking. On a quick look I don't see any of the callers taking any locks, so I'd expect there's a process elsewhere holding that inode lock and blocking on something else. Based just on this, my first guess would be a vfs or maybe ext4 bug rather than an nfsd bug, but I'm not seeing how to reassign it. It may be worth reporting to the relevant mailing lists.
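To make the suspected pattern concrete, here is a minimal userspace sketch, plain pthreads rather than kernel code (the kernel's rwsem differs in detail, and `holder`, `nfsd_like`, and `reader_like` are purely illustrative names): one thread takes the write side of a lock standing in for the inode lock and then stalls on something else, a second blocks the way nfsd does in down_write(), and a third blocks the way htop's read does in down_read().

```
/* Build: cc -pthread hang_sketch.c
 * A userspace analogue of the suspected hang, not kernel code. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t inode_lock = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t something_else = PTHREAD_MUTEX_INITIALIZER;

static void *holder(void *arg)          /* holds the "inode lock", then stalls */
{
    pthread_rwlock_wrlock(&inode_lock);
    pthread_mutex_lock(&something_else); /* never returns: main() holds it */
    return NULL;
}

static void *nfsd_like(void *arg)       /* blocks like nfsd in down_write() */
{
    sleep(1);
    pthread_rwlock_wrlock(&inode_lock); /* hangs: write side already held */
    return NULL;
}

static void *reader_like(void *arg)     /* blocks like htop in down_read() */
{
    sleep(2);
    pthread_rwlock_rdlock(&inode_lock); /* hangs behind the held write lock */
    return NULL;
}

int main(void)
{
    pthread_t a, b, c;
    pthread_mutex_lock(&something_else); /* the "something else" never frees up */
    pthread_create(&a, NULL, holder, NULL);
    pthread_create(&b, NULL, nfsd_like, NULL);
    pthread_create(&c, NULL, reader_like, NULL);
    sleep(3);
    puts("all three threads are now blocked, as in the reported traces");
    pause();
}
```

The point of the sketch is that neither blocked thread in the traces need be the culprit; finding the task that successfully took the lock and then stalled is the interesting part.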
No, there are more blocked processes; looking at htop when this happens (assuming htop itself doesn't freeze), every process reading from NFS blocks. Thanks for the analysis! I'll try posting on the mailing lists about it.
From my reply on linux-ext4:

Sorry, this isn't something I've seen before, and it's not at all obvious from the information in Bugzilla what's causing the deadlock. The down_read() appears to be in mm/memory.c, in __access_remote_vm() getting called from proc_pid_cmdline_read(); __access_remote_vm() is apparently trying to take a shared lock on &mm->mmap_sem. How this would get involved with the inode_lock() is not immediately obvious. Things I would suggest:

1) Try running your kernel console log through ./scripts/decode_stacktrace.sh so we can be sure we've correctly assessed where the kernel is grabbing which lock. Enabling CONFIG_DEBUG_INFO and CONFIG_DEBUG_INFO_REDUCED will be helpful.

2) Try turning on CONFIG_LOCKDEP and see if it reports some potential deadlock.

3) Try using sysrq-d to find all held locks (running the resulting kernel console output through decode_stacktrace.sh will also be helpful).
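For item 3, sysrq-d ("show all locks held") can be triggered from a shell as root with `echo d > /proc/sysrq-trigger`, assuming the kernel has CONFIG_MAGIC_SYSRQ and lockdep enabled. The same thing expressed as a small C program, just as an illustrative sketch:

```
/* Trigger sysrq-d programmatically: the equivalent of
 * `echo d > /proc/sysrq-trigger`. Needs root, plus CONFIG_MAGIC_SYSRQ
 * and lockdep in the running kernel. The resulting lock dump lands in
 * the kernel log, where decode_stacktrace.sh can pick it up. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/proc/sysrq-trigger", O_WRONLY);
    if (fd < 0) {
        perror("open /proc/sysrq-trigger (run as root?)");
        return 1;
    }
    if (write(fd, "d", 1) != 1) {
        perror("write");
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}
```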
I have a (similar?) problem. Where should I start debugging?

```
[Mo Sep 30 21:37:47 2019] INFO: task nfsd:1557 blocked for more than 122 seconds.
[Mo Sep 30 21:37:47 2019] Not tainted 5.2.14-artix1-1-ARTIX #1
[Mo Sep 30 21:37:47 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Mo Sep 30 21:37:47 2019] nfsd D 0 1557 2 0x80004000
[Mo Sep 30 21:37:47 2019] Call Trace:
[Mo Sep 30 21:37:47 2019] ? __schedule+0x27f/0x6d0
[Mo Sep 30 21:37:47 2019] schedule+0x43/0xd0
[Mo Sep 30 21:37:47 2019] rwsem_down_write_failed+0x192/0x280
[Mo Sep 30 21:37:47 2019] ? btrfs_lookup_fs_root+0x32/0x40 [btrfs]
[Mo Sep 30 21:37:47 2019] ? __d_obtain_alias+0x31/0x80
[Mo Sep 30 21:37:47 2019] down_write+0x38/0x40
[Mo Sep 30 21:37:47 2019] btrfs_file_write_iter+0xc1/0x630 [btrfs]
[Mo Sep 30 21:37:47 2019] do_iter_readv_writev+0x166/0x1e0
[Mo Sep 30 21:37:47 2019] do_iter_write+0x7d/0x190
[Mo Sep 30 21:37:47 2019] nfsd_vfs_write+0x127/0x4b0 [nfsd]
[Mo Sep 30 21:37:47 2019] nfsd4_write+0xf1/0x230 [nfsd]
[Mo Sep 30 21:37:47 2019] nfsd4_proc_compound+0x334/0x740 [nfsd]
[Mo Sep 30 21:37:47 2019] nfsd_dispatch+0xc1/0x200 [nfsd]
[Mo Sep 30 21:37:47 2019] svc_process_common.isra.0+0x3cf/0x750 [sunrpc]
[Mo Sep 30 21:37:47 2019] ? svc_sock_secure_port+0x12/0x30 [sunrpc]
[Mo Sep 30 21:37:47 2019] ? nfsd_svc+0x2d0/0x2d0 [nfsd]
[Mo Sep 30 21:37:47 2019] svc_process+0xe2/0x120 [sunrpc]
[Mo Sep 30 21:37:47 2019] nfsd+0xed/0x150 [nfsd]
[Mo Sep 30 21:37:47 2019] kthread+0xfb/0x130
[Mo Sep 30 21:37:47 2019] ? nfsd_destroy+0x60/0x60 [nfsd]
[Mo Sep 30 21:37:47 2019] ? kthread_park+0x80/0x80
[Mo Sep 30 21:37:47 2019] ret_from_fork+0x1f/0x40
```
For about a month now I have been hitting this bug too, under Ubuntu 20.04:

```
Jan 25 09:11:32 o9 kernel: [ 604.872233] INFO: task nfsd:894 blocked for more than 120 seconds.
Jan 25 09:11:32 o9 kernel: [ 604.872273] Not tainted 5.8.0-40-generic #45~20.04.1-Ubuntu
Jan 25 09:11:32 o9 kernel: [ 604.872310] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 25 09:11:32 o9 kernel: [ 604.872358] nfsd D 0 894 2 0x00004000
Jan 25 09:11:32 o9 kernel: [ 604.872360] Call Trace:
Jan 25 09:11:32 o9 kernel: [ 604.872365] __schedule+0x394/0xa60
Jan 25 09:11:32 o9 kernel: [ 604.872369] schedule+0x55/0xc0
Jan 25 09:11:32 o9 kernel: [ 604.872372] rwsem_down_write_slowpath+0x239/0x4b0
Jan 25 09:11:32 o9 kernel: [ 604.872377] down_write+0x41/0x50
Jan 25 09:11:32 o9 kernel: [ 604.872385] ovl_dir_release+0x2f/0x70 [overlay]
Jan 25 09:11:32 o9 kernel: [ 604.872388] __fput+0xe9/0x250
Jan 25 09:11:32 o9 kernel: [ 604.872390] flush_delayed_fput+0x23/0x30
Jan 25 09:11:32 o9 kernel: [ 604.872407] nfsd_file_close_inode_sync+0x15a/0x170 [nfsd]
Jan 25 09:11:32 o9 kernel: [ 604.872421] nfsd_unlink+0x256/0x2b0 [nfsd]
Jan 25 09:11:32 o9 kernel: [ 604.872437] nfsd3_proc_remove+0x74/0x110 [nfsd]
Jan 25 09:11:32 o9 kernel: [ 604.872450] nfsd_dispatch+0xe2/0x220 [nfsd]
Jan 25 09:11:32 o9 kernel: [ 604.872476] svc_process_common+0x47b/0x6f0 [sunrpc]
Jan 25 09:11:32 o9 kernel: [ 604.872501] ? svc_sock_secure_port+0x16/0x30 [sunrpc]
Jan 25 09:11:32 o9 kernel: [ 604.872513] ? nfsd_svc+0x330/0x330 [nfsd]
Jan 25 09:11:32 o9 kernel: [ 604.872538] svc_process+0xbc/0xf0 [sunrpc]
Jan 25 09:11:32 o9 kernel: [ 604.872551] nfsd+0xe8/0x150 [nfsd]
Jan 25 09:11:32 o9 kernel: [ 604.872555] kthread+0x114/0x150
Jan 25 09:11:32 o9 kernel: [ 604.872568] ? nfsd_destroy+0x60/0x60 [nfsd]
Jan 25 09:11:32 o9 kernel: [ 604.872571] ? kthread_park+0x90/0x90
Jan 25 09:11:32 o9 kernel: [ 604.872575] ret_from_fork+0x22/0x30
```

This box serves the root file system via NFS for several small boards. Once all nfsd processes hang, I have to restart the machine to get everything running again.
Comments 4 and 5 are for different filesystem operations on different filesystems, with different NFS versions. It's possible they could all come down to the same nfsd bug somehow, but my first guess is that they're unrelated to each other and to the originally reported bug, so I'd prefer separate bugs. That said, the debugging steps Ted describes in comment 3 look like good starting points for all three reports.
Assuming this is no longer an issue; reopen if it actually is. (And please consider filing a new bug if you're in any doubt as to whether yours is the same as the bug in the original description.)