Bug 207821 - NBD hangs indefinitely on stuck requests
Summary: NBD hangs indefinitely on stuck requests
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Jens Axboe
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-21 09:55 UTC by Roman Mamedov
Modified: 2023-03-20 01:41 UTC (History)
1 user (show)

See Also:
Kernel Version: 5.4.32
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Roman Mamedov 2020-05-21 09:55:04 UTC
I was running a "btrfs balance" on an NBD mounted remote block device. This got stuck for whatever reason, and now I see in dmesg, repeating every few seconds:

[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000005f7012a2: control (write@1921269760,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 0000000060d42bfb: control (write@1921302528,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 0000000095dd1d9f: control (write@1921368064,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 0000000019275fab: control (write@1921400832,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000005e5801dd: control (write@1921433600,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 00000000f24ee337: control (write@1921515520,32768B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000005b7b51ef: control (write@1921581056,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000003c5fed95: control (write@1921679360,32768B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000000d920921: control (write@1921728512,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 00000000c3dbc3d7: control (write@1922498560,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 0000000039488620: control (write@1922596864,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000003a50be0e: control (write@1922826240,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 00000000d574bfc3: control (write@1921843200,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000008c69cf1d: control (write@1921875968,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 00000000d4eab2c6: control (write@1921974272,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000000383f720: control (write@1922138112,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 0000000084ce5d7c: control (write@1922187264,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000004049df4f: control (write@1922318336,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 0000000060b53f86: control (write@1922564096,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 00000000a461a25c: control (write@1922629632,16384B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 00000000e7ec3ded: control (write@1922711552,32768B). Runtime 6240 seconds
[Thu May 21 14:40:05 2020] block nbd3: Possible stuck request 000000001faacc79: control (write@1922859008,16384B). Runtime 6240 seconds
[Thu May 21 14:40:10 2020] block nbd3: Possible stuck request 00000000d4212a40: control (write@1922973696,16384B). Runtime 6240 seconds
[Thu May 21 14:40:10 2020] block nbd3: Possible stuck request 00000000d22a14e5: control (write@1923268608,16384B). Runtime 6240 seconds
[Thu May 21 14:40:10 2020] block nbd3: Possible stuck request 0000000024cb472f: control (write@1923334144,16384B). Runtime 6240 seconds
[Thu May 21 14:40:10 2020] block nbd3: Possible stuck request 0000000069f8e449: control (write@1923366912,16384B). Runtime 6240 seconds
[Thu May 21 14:40:10 2020] block nbd3: Possible stuck request 000000006d72d304: control (write@1923399680,16384B). Runtime 6240 seconds
[Thu May 21 14:40:10 2020] block nbd3: Possible stuck request 00000000b473109a: control (write@1923137536,16384B). Runtime 6240 seconds

Putting aside why it got stuck in first place, as it states, almost 2 hours have passed since then -- and looking at source code, there's no mechanism to ever make this time out.

	if (!nbd->tag_set.timeout) {
		/*
		 * Userspace sets timeout=0 to disable socket disconnection,
		 * so just warn and reset the timer.
		 */
		cmd->retries++;
		dev_info(nbd_to_dev(nbd), "Possible stuck request %p: control (%s@%llu,%uB). Runtime %u seconds\n",
			req, nbdcmd_to_ascii(req_to_nbd_cmd_type(req)),
			(unsigned long long)blk_rq_pos(req) << 9,
			blk_rq_bytes(req), (req->timeout / HZ) * cmd->retries);

		mutex_unlock(&cmd->lock);
		nbd_config_put(nbd);
		return BLK_EH_RESET_TIMER;
	}

It will always "just" warn, and reset the timer. The "retries" isn't checked anywhere for any maximum value either.

I was looking for ways to adjust the timeout being checked, but the comment is wrong here, userspace does not "set" this to 0, userspace has no way to set this, because the setting only appears if CONFIG_DEBUG_FS is enabled in the kernel, and in mine it isn't. So it just defaults to 0 with no way to adjust from userspace. 

Also tried making the NBD server entirely inaccessible with ip6tables REJECT, hoping it will make the requests bail out -- it doesn't.

Now also "sync" stalls, making other processes hang on unrelated operations such as new package install. It looks like on any stuck NBD request like this, there is no way out, other than a hard reboot via the reset button.

If nothing else, then at least the timeout setting absolutely needs to be available not depending on CONFIG_DEBUG_FS.
Comment 1 Roman Mamedov 2020-05-21 10:02:45 UTC
[3135160.111944] sysrq: Show Blocked State
[3135160.112048]   task                        PC stack   pid father
[3135160.112592] dmcrypt_write/2 D    0  5421      2 0x80004000
[3135160.112596] Call Trace:
[3135160.112606]  ? __schedule+0x2e4/0x6f0
[3135160.112610]  ? wbt_exit+0x30/0x30
[3135160.112612]  ? __wbt_done+0x30/0x30
[3135160.112618]  ? crypt_iv_lmk_ctr+0xc0/0xc0 [dm_crypt]
[3135160.112621]  schedule+0x2f/0xa0
[3135160.112623]  io_schedule+0x12/0x40
[3135160.112626]  rq_qos_wait+0xfa/0x170
[3135160.112629]  ? karma_partition+0x230/0x230
[3135160.112631]  ? wbt_exit+0x30/0x30
[3135160.112633]  wbt_wait+0xaf/0xf0
[3135160.112636]  __rq_qos_throttle+0x23/0x30
[3135160.112638]  blk_mq_make_request+0x12a/0x5d0
[3135160.112643]  ? crypt_iv_lmk_ctr+0xc0/0xc0 [dm_crypt]
[3135160.112645]  generic_make_request+0xcf/0x310
[3135160.112650]  dmcrypt_write+0x145/0x170 [dm_crypt]
[3135160.112654]  kthread+0x113/0x130
[3135160.112657]  ? kthread_park+0x80/0x80
[3135160.112660]  ret_from_fork+0x22/0x40
[3135160.112671] btrfs-cleaner   D    0  6240      2 0x80004000
[3135160.112674] Call Trace:
[3135160.112677]  ? __schedule+0x2e4/0x6f0
[3135160.112724]  ? __clear_extent_bit+0x2c7/0x430 [btrfs]
[3135160.112727]  schedule+0x2f/0xa0
[3135160.112730]  schedule_preempt_disabled+0xa/0x10
[3135160.112732]  __mutex_lock.isra.8+0x2b5/0x4a0
[3135160.112771]  btrfs_delete_unused_bgs+0xcf/0x5c0 [btrfs]
[3135160.112804]  ? __btrfs_btree_balance_dirty+0x60/0x60 [btrfs]
[3135160.112838]  ? run_delayed_iput_locked+0xab/0xb0 [btrfs]
[3135160.112871]  ? __btrfs_btree_balance_dirty+0x60/0x60 [btrfs]
[3135160.112903]  cleaner_kthread+0x118/0x120 [btrfs]
[3135160.112906]  kthread+0x113/0x130
[3135160.112908]  ? kthread_park+0x80/0x80
[3135160.112910]  ret_from_fork+0x22/0x40
[3135160.112913] btrfs-transacti D    0  6241      2 0x80004000
[3135160.112915] Call Trace:
[3135160.112918]  ? __schedule+0x2e4/0x6f0
[3135160.112921]  schedule+0x2f/0xa0
[3135160.112923]  io_schedule+0x12/0x40
[3135160.112926]  wait_on_page_bit+0x15c/0x230
[3135160.112929]  ? file_fdatawait_range+0x20/0x20
[3135160.112932]  __filemap_fdatawait_range+0x88/0xf0
[3135160.112969]  ? __clear_extent_bit+0x2c7/0x430 [btrfs]
[3135160.112972]  filemap_fdatawait_range+0xe/0x20
[3135160.113005]  __btrfs_wait_marked_extents.isra.21+0xce/0x110 [btrfs]
[3135160.113039]  btrfs_wait_extents+0x15/0x30 [btrfs]
[3135160.113073]  btrfs_write_and_wait_transaction.isra.25+0x74/0xc0 [btrfs]
[3135160.113076]  ? __wake_up_common_lock+0x87/0xc0
[3135160.113110]  btrfs_commit_transaction+0x71c/0x9b0 [btrfs]
[3135160.113113]  ? finish_wait+0xb0/0xb0
[3135160.113145]  transaction_kthread+0x146/0x180 [btrfs]
[3135160.113179]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
[3135160.113181]  kthread+0x113/0x130
[3135160.113183]  ? kthread_park+0x80/0x80
[3135160.113185]  ret_from_fork+0x22/0x40
[3135160.113188] btrfs           D    0 11704  11701 0x00004004
[3135160.113190] Call Trace:
[3135160.113193]  ? __schedule+0x2e4/0x6f0
[3135160.113196]  schedule+0x2f/0xa0
[3135160.113229]  wait_for_commit+0x58/0x80 [btrfs]
[3135160.113232]  ? finish_wait+0xb0/0xb0
[3135160.113265]  btrfs_commit_transaction+0x142/0x9b0 [btrfs]
[3135160.113304]  prepare_to_merge+0x1b0/0x260 [btrfs]
[3135160.113340]  relocate_block_group+0x1ae/0x630 [btrfs]
[3135160.113363]  btrfs_relocate_block_group+0x1b2/0x310 [btrfs]
[3135160.113383]  btrfs_relocate_chunk+0x36/0xa0 [btrfs]
[3135160.113403]  btrfs_balance+0x7b4/0xef0 [btrfs]
[3135160.113405]  ? _cond_resched+0x15/0x30
[3135160.113425]  btrfs_ioctl_balance+0x2bc/0x370 [btrfs]
[3135160.113445]  btrfs_ioctl+0x15de/0x2da0 [btrfs]
[3135160.113448]  ? mem_cgroup_commit_charge+0x74/0x4e0
[3135160.113449]  ? mem_cgroup_try_charge+0x7f/0x1b0
[3135160.113450]  ? mem_cgroup_throttle_swaprate+0x24/0x14a
[3135160.113452]  ? __handle_mm_fault+0xadc/0x12c0
[3135160.113454]  do_vfs_ioctl+0xb5/0x660
[3135160.113456]  ksys_ioctl+0x82/0xa0
[3135160.113457]  __x64_sys_ioctl+0x16/0x20
[3135160.113459]  do_syscall_64+0x52/0x170
[3135160.113460]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[3135160.113462] RIP: 0033:0x7f00494a8427
[3135160.113465] Code: Bad RIP value.
[3135160.113466] RSP: 002b:00007ffcb4982dc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[3135160.113467] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f00494a8427
[3135160.113468] RDX: 00007ffcb4982e68 RSI: 00000000c4009420 RDI: 0000000000000003
[3135160.113468] RBP: 00007ffcb4982e68 R08: 0000000000000003 R09: 0000000000000078
[3135160.113469] R10: fffffffffffffb66 R11: 0000000000000202 R12: 0000000000000003
[3135160.113469] R13: 00007ffcb49848c1 R14: 0000000000000001 R15: 0000000000000000
[3135160.113473] btrfs           D    0 23577  23576 0x00000004
[3135160.113474] Call Trace:
[3135160.113476]  ? __schedule+0x2e4/0x6f0
[3135160.113477]  schedule+0x2f/0xa0
[3135160.113496]  btrfs_cancel_balance+0xf2/0x170 [btrfs]
[3135160.113498]  ? finish_wait+0xb0/0xb0
[3135160.113518]  btrfs_ioctl+0x290/0x2da0 [btrfs]
[3135160.113519]  ? mem_cgroup_commit_charge+0x74/0x4e0
[3135160.113521]  ? mem_cgroup_try_charge+0x7f/0x1b0
[3135160.113522]  ? mem_cgroup_throttle_swaprate+0x24/0x14a
[3135160.113523]  ? __handle_mm_fault+0xadc/0x12c0
[3135160.113524]  do_vfs_ioctl+0xb5/0x660
[3135160.113526]  ksys_ioctl+0x82/0xa0
[3135160.113527]  __x64_sys_ioctl+0x16/0x20
[3135160.113528]  do_syscall_64+0x52/0x170
[3135160.113529]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[3135160.113530] RIP: 0033:0x7f9135b20427
[3135160.113532] Code: Bad RIP value.
[3135160.113533] RSP: 002b:00007ffef74c02e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[3135160.113533] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f9135b20427
[3135160.113534] RDX: 0000000000000002 RSI: 0000000040049421 RDI: 0000000000000003
[3135160.113534] RBP: 00007ffef74c08b0 R08: 0000000000000003 R09: 0000000000000078
[3135160.113535] R10: fffffffffffffb66 R11: 0000000000000206 R12: 00007ffef74c18d6
[3135160.113535] R13: 0000000000000001 R14: 000055afd4652105 R15: 0000000000000000
[3135160.113540] sync            D    0  6623   1870 0x00004000
[3135160.113541] Call Trace:
[3135160.113543]  ? __schedule+0x2e4/0x6f0
[3135160.113544]  ? __switch_to_asm+0x34/0x70
[3135160.113546]  schedule+0x2f/0xa0
[3135160.113547]  io_schedule+0x12/0x40
[3135160.113548]  wait_on_page_bit+0x15c/0x230
[3135160.113550]  ? file_fdatawait_range+0x20/0x20
[3135160.113551]  __filemap_fdatawait_range+0x88/0xf0
[3135160.113553]  filemap_fdatawait_keep_errors+0x1a/0x40
[3135160.113555]  sync_inodes_sb+0x230/0x2d0
[3135160.113557]  ? default_file_splice_write+0x20/0x20
[3135160.113558]  iterate_supers+0x97/0x100
[3135160.113559]  ksys_sync+0x40/0xb0
[3135160.113561]  __ia32_sys_sync+0xa/0x10
[3135160.113562]  do_syscall_64+0x52/0x170
[3135160.113563]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[3135160.113564] RIP: 0033:0x7f0a1cd69287
[3135160.113565] Code: Bad RIP value.
[3135160.113566] RSP: 002b:00007fff6cac4eb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a2
[3135160.113567] RAX: ffffffffffffffda RBX: 00007fff6cac4fe8 RCX: 00007f0a1cd69287
[3135160.113567] RDX: 00007f0a1ce36001 RSI: 0000000000000000 RDI: 00007f0a1cdfab3a
[3135160.113568] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[3135160.113568] R10: fffffffffffffbe3 R11: 0000000000000206 R12: 0000000000000001
[3135160.113569] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[3135160.113571] cron-hourly.php D    0  7821   7820 0x00004000
[3135160.113572] Call Trace:
[3135160.113574]  ? __schedule+0x2e4/0x6f0
[3135160.113575]  schedule+0x2f/0xa0
[3135160.113576]  io_schedule+0x12/0x40
[3135160.113578]  wait_on_page_bit+0x15c/0x230
[3135160.113579]  ? file_fdatawait_range+0x20/0x20
[3135160.113581]  migrate_pages+0x1cd/0xb70
[3135160.113583]  ? isolate_freepages_block+0x380/0x380
[3135160.113584]  ? __bpf_trace_mm_compaction_kcompactd_sleep+0x10/0x10
[3135160.113585]  compact_zone+0x708/0xcc0
[3135160.113587]  compact_node+0x86/0xc0
[3135160.113589]  ? cpumask_next+0x16/0x20
[3135160.113590]  sysctl_compaction_handler+0x40/0x70
[3135160.113592]  proc_sys_call_handler+0x1e2/0x1f0
[3135160.113594]  vfs_write+0xb9/0x1c0
[3135160.113595]  ksys_write+0x6e/0x100
[3135160.113597]  do_syscall_64+0x52/0x170
[3135160.113598]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[3135160.113598] RIP: 0033:0x7fae90dea504
[3135160.113600] Code: Bad RIP value.
[3135160.113600] RSP: 002b:00007ffd3d1f0068 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[3135160.113601] RAX: ffffffffffffffda RBX: 00007fae8e4ae300 RCX: 00007fae90dea504
[3135160.113602] RDX: 0000000000000001 RSI: 00007fae8e462bd8 RDI: 0000000000000003
[3135160.113602] RBP: 0000000000000001 R08: 000000000000cff8 R09: 00007fae90e3bb40
[3135160.113603] R10: 79732f636f72702f R11: 0000000000000246 R12: 00007fae8e462bd8
[3135160.113603] R13: 0000000000000000 R14: 00007fae8e462d60 R15: 0000000000000000
[3135160.113609] cron-hourly.php D    0 13721  13720 0x00004000
[3135160.113610] Call Trace:
[3135160.113612]  ? __schedule+0x2e4/0x6f0
[3135160.113614]  schedule+0x2f/0xa0
[3135160.113615]  io_schedule+0x12/0x40
[3135160.113616]  wait_on_page_bit+0x15c/0x230
[3135160.113618]  ? file_fdatawait_range+0x20/0x20
[3135160.113619]  migrate_pages+0x1cd/0xb70
[3135160.113620]  ? isolate_freepages_block+0x380/0x380
[3135160.113622]  ? __bpf_trace_mm_compaction_kcompactd_sleep+0x10/0x10
[3135160.113623]  compact_zone+0x708/0xcc0
[3135160.113624]  ? check_preempt_wakeup+0x189/0x230
[3135160.113626]  compact_node+0x86/0xc0
[3135160.113627]  ? cpumask_next+0x16/0x20
[3135160.113628]  sysctl_compaction_handler+0x40/0x70
[3135160.113629]  proc_sys_call_handler+0x1e2/0x1f0
[3135160.113631]  vfs_write+0xb9/0x1c0
[3135160.113632]  ksys_write+0x6e/0x100
[3135160.113633]  do_syscall_64+0x52/0x170
[3135160.113635]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[3135160.113635] RIP: 0033:0x7fdbcb082504
[3135160.113637] Code: Bad RIP value.
[3135160.113637] RSP: 002b:00007ffe4f604e88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[3135160.113638] RAX: ffffffffffffffda RBX: 00007fdbc86ae300 RCX: 00007fdbcb082504
[3135160.113639] RDX: 0000000000000001 RSI: 00007fdbc8662bd8 RDI: 0000000000000003
[3135160.113639] RBP: 0000000000000001 R08: 000000000000cff8 R09: 00007fdbcb0d3b40
[3135160.113640] R10: 79732f636f72702f R11: 0000000000000246 R12: 00007fdbc8662bd8
[3135160.113640] R13: 0000000000000000 R14: 00007fdbc8662d60 R15: 0000000000000000
[3135160.113669] cfdisk          D    0 19465  18342 0x00000004
[3135160.113670] Call Trace:
[3135160.113672]  ? __schedule+0x2e4/0x6f0
[3135160.113673]  ? _cond_resched+0x15/0x30
[3135160.113674]  ? wbt_exit+0x30/0x30
[3135160.113675]  ? __wbt_done+0x30/0x30
[3135160.113676]  schedule+0x2f/0xa0
[3135160.113677]  io_schedule+0x12/0x40
[3135160.113678]  rq_qos_wait+0xfa/0x170
[3135160.113679]  ? karma_partition+0x230/0x230
[3135160.113680]  ? wbt_exit+0x30/0x30
[3135160.113681]  wbt_wait+0xaf/0xf0
[3135160.113683]  __rq_qos_throttle+0x23/0x30
[3135160.113684]  blk_mq_make_request+0x12a/0x5d0
[3135160.113685]  generic_make_request+0xcf/0x310
[3135160.113686]  submit_bio+0x4d/0x1e0
[3135160.113688]  submit_bio_wait+0x53/0x80
[3135160.113690]  blkdev_issue_flush+0x8d/0xc0
[3135160.113692]  blkdev_fsync+0x3a/0x50
[3135160.113693]  do_fsync+0x43/0x80
[3135160.113694]  __x64_sys_fsync+0x10/0x20
[3135160.113695]  do_syscall_64+0x52/0x170
[3135160.113696]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[3135160.113697] RIP: 0033:0x7fdb04339214
[3135160.113699] Code: Bad RIP value.
[3135160.113699] RSP: 002b:00007ffc9aa33d98 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[3135160.113700] RAX: ffffffffffffffda RBX: 0000000002212490 RCX: 00007fdb04339214
[3135160.113701] RDX: 006e776f6e6b6e55 RSI: 00007fdb04d29b23 RDI: 0000000000000003
[3135160.113701] RBP: 00007fdb04d29b1b R08: 0000000002214060 R09: 0000000000000000
[3135160.113702] R10: fffffffffffffcba R11: 0000000000000246 R12: 0000000002213a80
[3135160.113702] R13: 0000000002212490 R14: 0000000000000000 R15: 0000000000000000
Comment 2 Roman Mamedov 2023-03-20 01:41:39 UTC
For the record, there is a -t (timeout) option in nbd-client (using the "NBD_SET_TIMEOUT" ioctl), which probably should help in cases like this. But this has to be set beforehand at mount-time, there is no way to adjust it on the fly, at least not without CONFIG_DEBUG_FS.

Note You need to log in before you can comment on or make changes to this bug.