Should be extremely easy to reproduce in 5 commands with kdevops [0], leave everything with the defaults, and then just enable fstests. [0] https://github.com/mcgrof/kdevops make bringup make fstests Just ssh to kdevops-xfs and run: cd /var/lib/xfstests/ ./gendisks.sh -m ./check generic/074 Aug 14 18:27:34 kdevops-xfs-dev kernel: XFS (loop16): Mounting V5 Filesystem Aug 14 18:27:34 kdevops-xfs-dev kernel: XFS (loop16): Ending clean mount Aug 14 18:27:34 kdevops-xfs-dev kernel: xfs filesystem being mounted at /media/test supports timestamps until 2038 (0x7fffffff) Aug 14 18:28:16 kdevops-xfs-dev kernel: nvme nvme1: I/O 128 QID 2 timeout, aborting Aug 14 18:28:16 kdevops-xfs-dev kernel: nvme nvme1: Abort status: 0x4001 Aug 14 18:28:47 kdevops-xfs-dev kernel: nvme nvme1: I/O 128 QID 2 timeout, reset controller Aug 14 18:28:54 kdevops-xfs-dev kernel: sched: RT throttling activated Aug 14 18:31:12 kdevops-xfs-dev kernel: INFO: task xfsaild/nvme1n1:289 blocked for more than 120 seconds. Aug 14 18:31:12 kdevops-xfs-dev kernel: Not tainted 5.7.0-2-amd64 #1 Debian 5.7.10-1 Aug 14 18:31:12 kdevops-xfs-dev kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 14 18:31:12 kdevops-xfs-dev kernel: xfsaild/nvme1n1 D 0 289 2 0x80004000 Aug 14 18:31:12 kdevops-xfs-dev kernel: Call Trace: Aug 14 18:31:12 kdevops-xfs-dev kernel: __schedule+0x2da/0x770 Aug 14 18:31:12 kdevops-xfs-dev kernel: ? xlog_cil_force_lsn+0xc0/0x220 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: schedule+0x4a/0xb0 Aug 14 18:31:12 kdevops-xfs-dev kernel: xlog_wait_on_iclog+0x113/0x130 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? wake_up_q+0xa0/0xa0 Aug 14 18:31:12 kdevops-xfs-dev kernel: xfsaild+0x1bd/0x810 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? __switch_to+0x80/0x3c0 Aug 14 18:31:12 kdevops-xfs-dev kernel: kthread+0xf9/0x130 Aug 14 18:31:12 kdevops-xfs-dev kernel: ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? kthread_park+0x90/0x90 Aug 14 18:31:12 kdevops-xfs-dev kernel: ret_from_fork+0x35/0x40 Aug 14 18:31:12 kdevops-xfs-dev kernel: INFO: task loop16:912 blocked for more than 120 seconds. Aug 14 18:31:12 kdevops-xfs-dev kernel: Not tainted 5.7.0-2-amd64 #1 Debian 5.7.10-1 Aug 14 18:31:12 kdevops-xfs-dev kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 14 18:31:12 kdevops-xfs-dev kernel: loop16 D 0 912 2 0x80004000 Aug 14 18:31:12 kdevops-xfs-dev kernel: Call Trace: Aug 14 18:31:12 kdevops-xfs-dev kernel: __schedule+0x2da/0x770 Aug 14 18:31:12 kdevops-xfs-dev kernel: ? xlog_cil_force_lsn+0xc0/0x220 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: schedule+0x4a/0xb0 Aug 14 18:31:12 kdevops-xfs-dev kernel: xlog_wait_on_iclog+0x113/0x130 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? wake_up_q+0xa0/0xa0 Aug 14 18:31:12 kdevops-xfs-dev kernel: xfsaild+0x1bd/0x810 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? __switch_to+0x80/0x3c0 Aug 14 18:31:12 kdevops-xfs-dev kernel: kthread+0xf9/0x130 Aug 14 18:31:12 kdevops-xfs-dev kernel: ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? kthread_park+0x90/0x90 Aug 14 18:31:12 kdevops-xfs-dev kernel: ret_from_fork+0x35/0x40 Aug 14 18:31:12 kdevops-xfs-dev kernel: INFO: task loop16:912 blocked for more than 120 seconds. Aug 14 18:31:12 kdevops-xfs-dev kernel: Not tainted 5.7.0-2-amd64 #1 Debian 5.7.10-1 Aug 14 18:31:12 kdevops-xfs-dev kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 14 18:31:12 kdevops-xfs-dev kernel: loop16 D 0 912 2 0x80004000 Aug 14 18:31:12 kdevops-xfs-dev kernel: Call Trace: Aug 14 18:31:12 kdevops-xfs-dev kernel: __schedule+0x2da/0x770 Aug 14 18:31:12 kdevops-xfs-dev kernel: schedule+0x4a/0xb0 Aug 14 18:31:12 kdevops-xfs-dev kernel: xlog_wait_on_iclog+0x113/0x130 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? wake_up_q+0xa0/0xa0 Aug 14 18:31:12 kdevops-xfs-dev kernel: __xfs_log_force_lsn+0x10a/0x1d0 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? xfs_file_fsync+0x1f4/0x230 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: xfs_log_force_lsn+0x91/0x120 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: xfs_file_fsync+0x1f4/0x230 [xfs] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? __switch_to_asm+0x34/0x70 Aug 14 18:31:12 kdevops-xfs-dev kernel: loop_queue_work+0x47d/0xa50 [loop] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? __switch_to+0x80/0x3c0 Aug 14 18:31:12 kdevops-xfs-dev kernel: ? __schedule+0x2e2/0x770 Aug 14 18:31:12 kdevops-xfs-dev kernel: kthread_worker_fn+0x73/0x1d0 Aug 14 18:31:12 kdevops-xfs-dev kernel: kthread+0xf9/0x130 Aug 14 18:31:12 kdevops-xfs-dev kernel: ? loop_info64_to_compat+0x220/0x220 [loop] Aug 14 18:31:12 kdevops-xfs-dev kernel: ? kthread_park+0x90/0x90 Aug 14 18:31:12 kdevops-xfs-dev kernel: ret_from_fork+0x35/0x40 Aug 14 18:31:12 kdevops-xfs-dev kernel: INFO: task umount:2212 blocked for more than 120 seconds. Aug 14 18:31:12 kdevops-xfs-dev kernel: Not tainted 5.7.0-2-amd64 #1 Debian 5.7.10-1 Aug 14 18:31:12 kdevops-xfs-dev kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 14 18:31:12 kdevops-xfs-dev kernel: umount D 0 2212 2208 0x00004000
This seems relevant, no? > Aug 14 18:28:16 kdevops-xfs-dev kernel: nvme nvme1: I/O 128 QID 2 timeout, > aborting > Aug 14 18:28:16 kdevops-xfs-dev kernel: nvme nvme1: Abort status: 0x4001 > Aug 14 18:28:47 kdevops-xfs-dev kernel: nvme nvme1: I/O 128 QID 2 timeout, > reset controller then 2.5 minutes later, > Aug 14 18:31:12 kdevops-xfs-dev kernel: INFO: task xfsaild/nvme1n1:289 > blocked for more than 120 seconds.
On Fri, Aug 14, 2020 at 06:43:18PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > Just ssh to kdevops-xfs and run: > > cd /var/lib/xfstests/ > ./gendisks.sh -m > ./check generic/074 > > Aug 14 18:27:34 kdevops-xfs-dev kernel: XFS (loop16): Mounting V5 Filesystem > Aug 14 18:27:34 kdevops-xfs-dev kernel: XFS (loop16): Ending clean mount > Aug 14 18:27:34 kdevops-xfs-dev kernel: xfs filesystem being mounted at > /media/test supports timestamps until 2038 (0x7fffffff) > Aug 14 18:28:16 kdevops-xfs-dev kernel: nvme nvme1: I/O 128 QID 2 timeout, > aborting > Aug 14 18:28:16 kdevops-xfs-dev kernel: nvme nvme1: Abort status: 0x4001 > Aug 14 18:28:47 kdevops-xfs-dev kernel: nvme nvme1: I/O 128 QID 2 timeout, > reset controller Hardware lost an IO. I'm guessing the error handling that reset the controller failed to error out the bio the lost IO belonged to, so XFS has hung waiting for it... Cheers, Dave.