Environment:
  host kernel  : 6.11.0-rc3
  guest kernel : 6.11.0-rc5
  qemu version : QEMU emulator version 9.0.92 (v9.1.0-rc2-28-g2eefd4fcec)

Bug description:
When booting a KVM guest, call traces appear while the guest kernel is loading.

Reproduce steps:
1. Set up an NFS server and put the guest image "centos9.img" into the NFS server's /images folder.
2. Mount the NFS server folder locally:
   <NFS-server>:/images /share/guest_imgs nfs defaults,tcp,nolock 0 0
3. Create a qcow2 image backed by the image on the NFS server (a quick sanity check of the resulting chain is sketched after the error log below):
   qemu-img create -b /share/guest_imgs/centos9.img -f qcow2 centos9.qcow2 -F raw
4. Boot the KVM guest with:
   qemu-system-x86_64 -accel kvm \
       -m 4096 \
       -smp 4 \
       -cpu host \
       -drive file=centos9.qcow2,if=none,id=virtio-disk0 \
       -device virtio-blk-pci,drive=virtio-disk0,bootindex=0 \
       -device virtio-net-pci,netdev=nic0 \
       -netdev user,id=nic0,hostfwd=tcp::10022-:22 \
       -nographic

Error log:
Call traces appear during KVM guest boot. The guest does boot up in the end, but boot time is long.

INFO: task kworker/u8:1:29 blocked for more than 122 seconds.
[ 246.987403]       Not tainted 6.11.0-rc5 #2
[ 246.989674] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.992986] task:kworker/u8:1 state:D stack:0 pid:29 tgid:29 ppid:2 flags:0x00004000
[ 246.996842] Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs]
[ 247.000029] Call Trace:
[ 247.001707]  <TASK>
[ 247.003263]  __schedule+0x276/0x710
[ 247.005381]  schedule+0x27/0xa0
[ 247.007300]  xlog_state_get_iclog_space+0x102/0x2a0 [xfs]
[ 247.010298]  ? __pfx_default_wake_function+0x10/0x10
[ 247.012800]  xlog_write+0x7b/0x260 [xfs]
[ 247.015379]  ? _raw_spin_unlock+0xe/0x30
[ *** ] (5 of 5) A start job is running for…ernel arming (3min 39s / no limit)
[ 247.023496]  xlog_cil_push_work+0x50e/0x6f0 [xfs]
[ 247.026329]  process_one_work+0x158/0x360
[ 247.028584]  worker_thread+0x235/0x340
[ 247.030710]  ? __pfx_worker_thread+0x10/0x10
[ 247.032997]  kthread+0xd0/0x100
[ 247.034937]  ? __pfx_kthread+0x10/0x10
[ 247.037063]  ret_from_fork+0x31/0x50
[ 247.039161]  ? __pfx_kthread+0x10/0x10
[ 247.041302]  ret_from_fork_asm+0x1a/0x30
[ 247.043479]  </TASK>
[ 247.045019] INFO: task kworker/0:3:255 blocked for more than 122 seconds.
[ 247.047775]       Not tainted 6.11.0-rc5 #2
[ 247.049802] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 247.052903] task:kworker/0:3 state:D stack:0 pid:255 tgid:255 ppid:2 flags:0x00004000
[ 247.056403] Workqueue: xfs-sync/dm-0 xfs_log_worker [xfs]
[ 247.058880] Call Trace:
[ 247.060293]  <TASK>
[ 247.061638]  __schedule+0x276/0x710
[ 247.063332]  schedule+0x27/0xa0
[ 247.064964]  schedule_timeout+0x14e/0x160
[ 247.066778]  __wait_for_common+0x8f/0x1d0
[ 247.068569]  ? __pfx_schedule_timeout+0x10/0x10
[ 247.070332]  __flush_workqueue+0x13e/0x3f0
[ 247.072011]  xlog_cil_push_now.isra.0+0x25/0x90 [xfs]
[ 247.074185]  xlog_cil_force_seq+0x79/0x240 [xfs]
[ 247.076253]  ? rpm_suspend+0x1ad/0x5e0
[ 247.077831]  ? _raw_spin_unlock+0xe/0x30
[ 247.079442]  ? rpm_suspend+0x5b6/0x5e0
[ 247.080987]  xfs_log_force+0x7a/0x230 [xfs]
[ 247.082764]  xfs_log_worker+0x3d/0xc0 [xfs]
[ 247.084564]  process_one_work+0x158/0x360
[ 247.086075]  worker_thread+0x235/0x340
[ 247.087462]  ? __pfx_worker_thread+0x10/0x10
[ 247.088895]  kthread+0xd0/0x100
[ 247.090108]  ? __pfx_kthread+0x10/0x10
[ 247.091410]  ret_from_fork+0x31/0x50
[ 247.092665]  ? __pfx_kthread+0x10/0x10
[ 247.093932]  ret_from_fork_asm+0x1a/0x30
[ 247.095251]  </TASK>
[ 247.096232] INFO: task xfsaild/dm-0:543 blocked for more than 122 seconds.
[ 247.099379]       Not tainted 6.11.0-rc5 #2
[ 247.101574] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 247.104908] task:xfsaild/dm-0 state:D stack:0 pid:543 tgid:543 ppid:2 flags:0x00004000
[ 247.108707] Call Trace:
[ 247.110363]  <TASK>
[ 247.111914]  __schedule+0x276/0x710
[ 247.113892]  schedule+0x27/0xa0
[ 247.115756]  schedule_timeout+0x14e/0x160
[ 247.117899]  __wait_for_common+0x8f/0x1d0
[ 247.119998]  ? __pfx_schedule_timeout+0x10/0x10
[ 247.122276]  __flush_workqueue+0x13e/0x3f0
[ 247.124457]  ? ttwu_queue_wakelist+0xd0/0xf0
[ 247.126684]  xlog_cil_push_now.isra.0+0x25/0x90 [xfs]
[ 247.129559]  xlog_cil_force_seq+0x79/0x240 [xfs]
[ 247.132238]  ? _raw_spin_unlock+0xe/0x30
[ 247.134300]  ? finish_task_switch.isra.0+0x97/0x2a0
[ 247.136675]  xfs_log_force+0x7a/0x230 [xfs]
[ 247.139251]  xfsaild_push+0x2d7/0x850 [xfs]
[ 247.141753]  ? __timer_delete_sync+0x2b/0x40
[ 247.143931]  ? schedule_timeout+0x99/0x160
[ 247.146023]  ? __pfx_process_timeout+0x10/0x10
[ 247.148238]  xfsaild+0xaf/0x140 [xfs]
[ 247.150566]  ? __pfx_xfsaild+0x10/0x10 [xfs]
[ 247.153062]  kthread+0xd0/0x100
[ 247.154785]  ? __pfx_kthread+0x10/0x10
[ 247.156715]  ret_from_fork+0x31/0x50
[ 247.158404]  ? __pfx_kthread+0x10/0x10
[ 247.160147]  ret_from_fork_asm+0x1a/0x30
[ 247.161914]  </TASK>
[ 247.163238] INFO: task in:imjournal:787 blocked for more than 123 seconds.
[ 247.165818]       Not tainted 6.11.0-rc5 #2
[ 247.167697] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 247.170408] task:in:imjournal state:D stack:0 pid:787 tgid:745 ppid:1 flags:0x00000002
[ 247.173404] Call Trace:
[ 247.174673]  <TASK>
[ 247.175844]  __schedule+0x276/0x710
[ 247.177441]  schedule+0x27/0xa0
[ 247.178888]  schedule_timeout+0x14e/0x160
[ 247.180562]  ? update_sd_lb_stats.constprop.0+0x66/0x270
[ 247.182469]  ___down_common+0x110/0x170
[ 247.183983]  __down_common+0x1e/0xc0
[ 247.185435]  down+0x47/0x60
[ 247.186684]  xfs_buf_lock+0x31/0xe0 [xfs]
[ 247.188553]  xfs_buf_find_lock+0x59/0x110 [xfs]
[ 247.190532]  xfs_buf_lookup.constprop.0+0xb9/0x180 [xfs]
[ 247.192666]  xfs_buf_get_map+0xea/0x5f0 [xfs]
[ 247.194510]  xfs_buf_read_map+0x58/0x2a0 [xfs]
[ 247.196315]  ? xfs_btree_read_buf_block+0xa2/0x120 [xfs]
[ 247.198282]  ? xfs_trans_add_item+0x37/0xb0 [xfs]
[ 247.200066]  xfs_trans_read_buf_map+0x119/0x300 [xfs]
[ 247.201813]  ? xfs_btree_read_buf_block+0xa2/0x120 [xfs]
[ 247.203618]  xfs_btree_read_buf_block+0xa2/0x120 [xfs]
[ 247.205404]  xfs_btree_lookup_get_block.part.0+0x81/0x1b0 [xfs]
[ 247.207331]  xfs_btree_lookup+0x102/0x550 [xfs]
[ 247.208969]  ? kmem_cache_alloc_noprof+0xdc/0x2b0
[ 247.210404]  xfs_dialloc_ag_update_inobt+0x47/0x170 [xfs]
[ 247.212187]  ? xfs_inobt_init_cursor+0x67/0xa0 [xfs]
[ 247.213808]  xfs_dialloc_ag+0x1ad/0x2e0 [xfs]
[ 247.215343]  xfs_dialloc+0x24b/0x3b0 [xfs]
[ 247.216798]  xfs_create+0x167/0x490 [xfs]
[ 247.218193]  ? xfs_dir2_format+0x4a/0x130 [xfs]
[ 247.219644]  xfs_generic_create+0x30d/0x360 [xfs]
[ 247.221129]  lookup_open.isra.0+0x4e1/0x600
[ 247.222312]  open_last_lookups+0x139/0x440
[ 247.223461]  path_openat+0x88/0x290
[ 247.224451]  do_filp_open+0xae/0x150
[ 247.225454]  do_sys_openat2+0x96/0xd0
[ 247.226472]  __x64_sys_openat+0x57/0xa0
[ 247.227515]  do_syscall_64+0x64/0x170
[ 247.228524]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 247.229778] RIP: 0033:0x7fbcacb3e8d4
[ 247.230763] RSP: 002b:00007fbcab7fea40 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[ 247.232427] RAX: ffffffffffffffda RBX: 00007fbc9c000c10 RCX: 00007fbcacb3e8d4
[ 247.234030] RDX: 0000000000000241 RSI: 00007fbcab7feb80 RDI: 00000000ffffff9c
[ 247.235626] RBP: 00007fbcab7feb80 R08: 0000000000000000 R09: 0000000000000001
[ 247.237174] R10: 00000000000001b6 R11: 0000000000000293 R12: 0000000000000241
[ 247.238682] R13: 00007fbc9c000c10 R14: 0000000000000001 R15: 00007fbc9c015c40
[ 247.240204]  </TASK>
[ 247.242323] INFO: task updatedb:769 blocked for more than 123 seconds.
[ 247.243697]       Not tainted 6.11.0-rc5 #2
[ 247.244694] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 247.246246] task:updatedb state:D stack:0 pid:769 tgid:769 ppid:756 flags:0x00000002
[ 247.247995] Call Trace:
[ 247.248689]  <TASK>
[ 247.249342]  __schedule+0x276/0x710
[ 247.250192]  ? vp_notify+0x16/0x20 [virtio_pci]
[ 247.251208]  schedule+0x27/0xa0
[ 247.252006]  schedule_timeout+0x14e/0x160
[ 247.252926]  ? __blk_flush_plug+0xf5/0x150
[ 247.253854]  __wait_for_common+0x8f/0x1d0
[ 247.254726]  ? __pfx_schedule_timeout+0x10/0x10
[ 247.255678]  xfs_buf_iowait+0x1c/0xc0 [xfs]
[ 247.256797]  __xfs_buf_submit+0x131/0x1e0 [xfs]
[ 247.257933]  xfs_buf_read_map+0x11e/0x2a0 [xfs]
[ 247.259044]  ? xfs_da_read_buf+0xee/0x170 [xfs]
[ 247.260178]  xfs_trans_read_buf_map+0x119/0x300 [xfs]
[ 247.261306]  ? xfs_da_read_buf+0xee/0x170 [xfs]
[ 247.262375]  xfs_da_read_buf+0xee/0x170 [xfs]
[ 247.263430]  xfs_dir3_block_read+0x3c/0xf0 [xfs]
[ 247.264506]  xfs_dir2_block_lookup_int+0x4c/0x1d0 [xfs]
[ 247.265658]  ? xfs_bmap_last_offset+0x6b/0x110 [xfs]
[ 247.266737]  xfs_dir2_block_lookup+0x3b/0x130 [xfs]
[ 247.267795]  xfs_dir_lookup_args+0x3e/0x90 [xfs]
[ 247.268810]  xfs_dir_lookup+0x139/0x160 [xfs]
[ 247.269798]  xfs_lookup+0x94/0x160 [xfs]
[ 247.270722]  ? avc_has_perm_noaudit+0x6b/0xf0
[ 247.271570]  xfs_vn_lookup+0x78/0xb0 [xfs]
[ 247.272504]  __lookup_slow+0x81/0x130
[ 247.273242]  walk_component+0xe5/0x160
[ 247.273969]  path_lookupat+0x6e/0x1c0
[ 247.274685]  filename_lookup+0xcd/0x1c0
[ 247.275428]  ? selinux_inode_getattr+0x9a/0xc0
[ 247.276249]  ? terminate_walk+0x21/0x100
[ 247.277002]  ? xfs_vn_getattr+0x66/0x340 [xfs]
[ 247.277968]  ? cp_new_stat+0x14f/0x180
[ 247.278682]  vfs_statx+0x75/0xe0
[ 247.279332]  vfs_fstatat+0x6c/0xb0
[ 247.279989]  __do_sys_newfstatat+0x26/0x60
[ 247.280722]  ? __seccomp_filter+0x3d8/0x5b0
[ 247.281473]  ? syscall_trace_enter+0xff/0x190
[ 247.282270]  do_syscall_64+0x64/0x170
[ 247.282958]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 247.283805] RIP: 0033:0x7f57f593e13e
[ 247.284479] RSP: 002b:00007ffccb8f2bc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[ 247.285541] RAX: ffffffffffffffda RBX: 0000564a28d7a9f0 RCX: 00007f57f593e13e
[ 247.286569] RDX: 00007ffccb8f2c30 RSI: 0000564a28d88fb9 RDI: 00000000ffffff9c
[ 247.287595] RBP: 0000564a28d88fb9 R08: 0000000000000000 R09: 00005649ecdb34e0
[ 247.288601] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000006
[ 247.289599] R13: 00007ffccb8f2e00 R14: 0000000000000006 R15: 0000000000000006

Note: it was verified that this issue does not reproduce if the host kernel is reverted to 6.10.0-rc7.
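For completeness, the setup can be sanity-checked before booting with standard tools (qemu-img and findmnt are assumed to be installed on the host; paths are taken from the reproduce steps above):

  # confirm the NFS share is mounted where the backing file lives
  findmnt /share/guest_imgs

  # confirm the overlay's backing chain points into the NFS mount
  qemu-img info --backing-chain centos9.qcow2

The second command should list centos9.qcow2 with backing file /share/guest_imgs/centos9.img (format raw); every guest read that misses the local overlay is then served over NFS.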
In most KVM usage scenarios, guest images are stored on an NFS server, so this issue causes KVM guests to boot with call traces whenever the guest image resides on a remote NFS server.
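For anyone reproducing this, per-interface traffic on the host can be watched during guest boot with dstat (assumed installed; replace eno0 with the interface that carries the NFS traffic), and nfsstat can show how many read RPCs the NFS client issued:

  # live view of loopback and NFS-facing interface throughput
  dstat -tn -N lo,eno0

  # snapshot client-side NFS RPC counters before and after guest boot
  nfsstat -c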
Hi NFS maintainers,

I am working with Xuelian.Guo to debug this issue. I found that a VM needs much more time to finish booting on a host running a 6.11-rcX kernel:

  VM bootup on a host with the 6.10 kernel:    15-30 s
  VM bootup on a host with a 6.11-rcX kernel:  5-8 min

I also noticed a lot of extra network traffic generated during VM bootup with the 6.11-rcX host kernel.

VM bootup on the host with the 6.10 kernel:

----system---- ---net/lo-----net/eno0-
     time     | recv  send:  recv  send
19-09 10:43:17|            :
19-09 10:43:18|   0     0  : 1044k   11k
19-09 10:43:19|   0     0  :  994k   12k
19-09 10:43:20|   0     0  :  560k   46k
19-09 10:43:21|   0     0  :  192B 4225B
19-09 10:43:22|   0     0  :  362B 3884B
19-09 10:43:23|   0     0  :  192B 3266B
19-09 10:43:24|   0     0  :  192B 4034B
19-09 10:43:25|   0     0  : 1121k   19k
19-09 10:43:26|   0     0  :  806k 4274B
19-09 10:43:27|   0     0  :   11M   60k
19-09 10:43:28|   0     0  :   40M  222k
19-09 10:43:29|   0     0  :   18M  616k
19-09 10:43:30|   0     0  : 7710B  182k
19-09 10:43:31|   0     0  : 1558k   56k
19-09 10:43:32|   0     0  : 2958k   66k
19-09 10:43:33|   0     0  : 3379k   71k
19-09 10:43:34|   0     0  : 2215k   35k
19-09 10:43:35|   0     0  : 9636k   72k
19-09 10:43:36|   0     0  :   17M  161k
19-09 10:43:37|   0     0  :   12M  106k
19-09 10:43:38|   0     0  : 1564k   56k
19-09 10:43:39|   0     0  :   10M  313k
19-09 10:43:40|   0     0  :   15M  114k
19-09 10:43:41|   0     0  :   11M  123k
19-09 10:43:42|   0     0  :   32M  330k
19-09 10:43:43|   0     0  :   37M  410k
19-09 10:43:44|   0     0  :   30M  380k
19-09 10:43:45|   0     0  :   12M  184k
19-09 10:43:46|   0     0  :   52M  365k
19-09 10:43:47|   0     0  :   37M  399k
19-09 10:43:48|   0     0  :   70M  457k
19-09 10:43:49|   0     0  :   34M  282k
19-09 10:43:50|   0     0  :   47M  357k
19-09 10:43:51|   0     0  :   22M  137k
19-09 10:43:52|   0     0  : 8174k   56k
19-09 10:43:53|   0     0  : 8999k   92k
19-09 10:43:54|   0     0  : 2844k   50k
19-09 10:43:55|   0     0  : 2073k   90k
----system---- ---net/lo-----net/eno0-
     time     | recv  send:  recv  send
19-09 10:43:56|   0     0  :  177k 1427B

VM bootup on the host with a 6.11-rcX kernel:

----system---- ---net/lo-----net/eno0-
     time     | recv  send:  recv  send
19-09 09:28:17|            :
19-09 09:28:18|   0     0  :  505k   95k
19-09 09:28:19|   0     0  :  192B 3682B
19-09 09:28:20|   0     0  : 1606B 3768B
19-09 09:28:21|   0     0  : 2080B 3848B
19-09 09:28:22|   0     0  : 3324B 3448B
19-09 09:28:23|   0     0  : 1526k   18k
19-09 09:28:24|   0     0  :  807k 3434B
19-09 09:28:25|   0     0  : 1078k 4678B
19-09 09:28:26|   0     0  :  807k 3294B
19-09 09:28:27|   0     0  : 1076k 5303B
19-09 09:28:28|   0     0  : 6449k   30k
19-09 09:28:29|   0     0  : 3753k   18k
19-09 09:28:30|   0     0  : 2151k 9234B
19-09 09:28:31|   0     0  :   29M  206k
19-09 09:28:32|   0     0  :   19M  101k
19-09 09:28:33|   0     0  : 5654k  889k
19-09 09:28:34|   0     0  :   80k  171k
19-09 09:28:35|   0     0  : 3313k   57k
19-09 09:28:36|   0     0  : 4548k   53k
19-09 09:28:37|   0     0  : 1616k   71k
19-09 09:28:38|   0     0  : 4235k   36k
19-09 09:28:39|   0     0  :   14M   93k
19-09 09:28:40|   0     0  :   20M  232k
19-09 09:28:41|   0     0  : 3985k  191k
19-09 09:28:42|   0     0  : 9599k  235k
19-09 09:28:43|   0     0  :   16M  190k
19-09 09:28:44|   0     0  :   14M  224k
19-09 09:28:45|   0     0  :   43M  585k
19-09 09:28:46|   0     0  :  116M 3122k
19-09 09:28:47|   0     0  :  117M 1844k
19-09 09:28:48|   0     0  :  117M 1228k
19-09 09:28:49|   0     0  :  117M  313k
19-09 09:28:50|   0     0  :  117M  317k
19-09 09:28:51|   0     0  :  117M  310k
19-09 09:28:52|   0     0  :  117M  310k
19-09 09:28:53|   0     0  :  117M  308k
19-09 09:28:54|   0     0  :  117M  318k
19-09 09:28:55|   0     0  :  117M  311k
----system---- ---net/lo-----net/eno0-
     time     | recv  send:  recv  send
19-09 09:28:56|   0     0  :  117M  321k
19-09 09:28:57|   0     0  :  117M  320k
19-09 09:28:58|   0     0  :  117M  321k
19-09 09:28:59|   0     0  :  117M  314k
19-09 09:29:00|   0     0  :  117M  313k
19-09 09:29:01|   0     0  :  117M  315k
19-09 09:29:02|   0     0  :  117M  313k
19-09 09:29:03|   0     0  :  117M  311k
19-09 09:29:04|   0     0  :  117M  320k
19-09 09:29:05|   0     0  :  117M  315k
19-09 09:29:06|   0     0  :  117M  315k
19-09 09:29:07|   0     0  :  117M  321k
19-09 09:29:08|   0     0  :  117M  324k
19-09 09:29:09|   0     0  :  117M  327k
19-09 09:29:10|   0     0  :  117M  328k
19-09 09:29:11|   0     0  :  117M  324k
19-09 09:29:12|   0     0  :  117M  324k
19-09 09:29:13|   0     0  :  117M  321k
19-09 09:29:14|   0     0  :  117M  324k
19-09 09:29:15|   0     0  :  117M  315k
19-09 09:29:16|   0     0  :  117M  318k
19-09 09:29:17|   0     0  :  117M  324k
19-09 09:29:18|   0     0  :  117M  323k
19-09 09:29:19|   0     0  :  117M  323k
19-09 09:29:20|   0     0  :  117M  321k
19-09 09:29:21|   0     0  :  117M  323k
19-09 09:29:22|   0     0  :  117M  317k
19-09 09:29:23|   0     0  :  117M  323k
19-09 09:29:24|   0     0  :  117M  323k
19-09 09:29:25|   0     0  :  117M  329k
19-09 09:29:26|   0     0  :  117M  319k
19-09 09:29:27|   0     0  :  117M  319k
19-09 09:29:28|   0     0  :  117M  320k
19-09 09:29:29|   0     0  :  117M  323k
19-09 09:29:30|   0     0  :  117M  315k
19-09 09:29:31|   0     0  :  117M  328k
19-09 09:29:32|   0     0  :  117M  330k
[...]
19-09 09:34:24|   0     0  :  117M  295k
19-09 09:34:25|   0     0  :  117M  280k
19-09 09:34:26|   0     0  :  117M  309k
19-09 09:34:27|   0     0  :  117M  332k
19-09 09:34:28|   0     0  :  117M  328k
19-09 09:34:29|   0     0  :  117M  327k
19-09 09:34:30|   0     0  :  117M  329k
19-09 09:34:31|   0     0  :  117M  304k
19-09 09:34:32|   0     0  :  117M  316k
19-09 09:34:33|   0     0  :  117M  299k
19-09 09:34:34|   0     0  :  117M  282k
19-09 09:34:35|   0     0  :  116M  287k
19-09 09:34:36|   0     0  :   99M  275k
19-09 09:34:37|   0     0  :   70M  199k
19-09 09:34:38|   0     0  :   79M  223k
19-09 09:34:39|   0     0  :   79M  221k
19-09 09:34:40|   0     0  :   80M  235k
19-09 09:34:41|   0     0  :   80M  223k
19-09 09:34:42|   0     0  :   80M  217k
19-09 09:34:43|   0     0  :   71M  633k
19-09 09:34:44|   0     0  :   77M  217k
19-09 09:34:45|   0     0  :   71M  201k
19-09 09:34:46|   0     0  :   67M  199k
----system---- ---net/lo-----net/eno0-
     time     | recv  send:  recv  send
19-09 09:34:47|   0     0  :   56M  492k
19-09 09:34:48|   0     0  :   63M  358k
19-09 09:34:49|   0     0  :   60M  302k
19-09 09:34:50|   0     0  :   72M  242k
19-09 09:34:51|   0     0  :   61M  448k
19-09 09:34:52|   0     0  :   60M  409k
19-09 09:34:53|   0     0  :   64M  210k
19-09 09:34:54|   0     0  :   75M  224k
19-09 09:34:55|   0     0  :   72M  268k
19-09 09:34:56|   0     0  :   29M  164k
19-09 09:34:57|   0     0  :   14M  123k
19-09 09:34:58|   0     0  :   21M  159k
19-09 09:34:59|   0     0  :   17M  121k
19-09 09:35:00|   0     0  : 4276k   43k
19-09 09:35:01|   0     0  : 3635k   45k
19-09 09:35:02|   0     0  : 4158k   44k
19-09 09:35:03|   0     0  : 3258k   34k
19-09 09:35:04|   0     0  : 4735k   49k
19-09 09:35:05|   0     0  : 5461k   48k
19-09 09:35:06|   0     0  : 3232k   35k
19-09 09:35:07|   0     0  : 3592k   47k
19-09 09:35:08|   0     0  : 6994k   63k
19-09 09:35:09|   0     0  :   13M   81k
19-09 09:35:10|   0     0  : 7027k   49k
19-09 09:35:11|   0     0  : 3619k   33k
19-09 09:35:12|   0     0  : 6174k   54k
19-09 09:35:13|   0     0  : 5848k   51k
19-09 09:35:14|   0     0  : 5758k   48k
19-09 09:35:15|   0     0  : 5561k   45k
19-09 09:35:16|   0     0  : 4929k   43k
19-09 09:35:17|   0     0  : 2289k   35k
19-09 09:35:18|   0     0  : 6416k   71k
19-09 09:35:19|   0     0  : 3861k   44k
19-09 09:35:20|   0     0  : 4425k   43k
19-09 09:35:21|   0     0  : 3406k   38k
19-09 09:35:22|   0     0  : 2691k   34k
19-09 09:35:23|   0     0  : 1491k   33k
19-09 09:35:24|   0     0  : 2038k   35k
19-09 09:35:25|   0     0  : 5195k   46k
----system---- ---net/lo-----net/eno0-
     time     | recv  send:  recv  send
19-09 09:35:26|   0     0  : 3397k   38k
19-09 09:35:27|   0     0  : 4983k   53k
19-09 09:35:28|   0     0  : 7727k   78k
19-09 09:35:29|   0     0  : 9750k  126k
19-09 09:35:30|   0     0  : 8484k   95k
19-09 09:35:31|   0     0  : 4493k  129k
19-09 09:35:32|   0     0  : 6160k   41k
19-09 09:35:33|   0     0  : 7808k   44k
19-09 09:35:34|   0     0  : 8776k   57k
19-09 09:35:35|   0     0  :   17M   58k
19-09 09:35:36|   0     0  : 5701k   33k
19-09 09:35:37|   0     0  : 6359k   38k
19-09 09:35:38|   0     0  : 7414k   46k
19-09 09:35:39|   0     0  : 8863k   55k
19-09 09:35:40|   0     0  : 6184k   30k
19-09 09:35:41|   0     0  : 6919k   34k
19-09 09:35:42|   0     0  : 7661k   45k
19-09 09:35:43|   0     0  : 9455k   47k
19-09 09:35:44|   0     0  : 8072k   46k
19-09 09:35:45|   0     0  :   32M   89k
19-09 09:35:46|   0     0  : 4737k   24k
19-09 09:35:47|   0     0  : 6952k   48k
19-09 09:35:48|   0     0  : 7279k   80k
19-09 09:35:49|   0     0  : 9034k   73k
19-09 09:35:50|   0     0  :   28M  106k
19-09 09:35:51|   0     0  :   19M   74k
19-09 09:35:52|   0     0  :   13M   91k
19-09 09:35:53|   0     0  :   11M   95k
19-09 09:35:54|   0     0  :   11M   81k
19-09 09:35:55|   0     0  :   11M   56k
19-09 09:35:56|   0     0  :   10M   78k
19-09 09:35:57|   0     0  :   16M   79k
19-09 09:35:58|   0     0  :   18M   80k
19-09 09:35:59|   0     0  :   11M   93k
19-09 09:36:00|   0     0  : 3574k   24k
19-09 09:36:01|   0     0  : 1802k   21k
19-09 09:36:02|   0     0  :  723k   15k
19-09 09:36:03|   0     0  : 1601k   22k
19-09 09:36:04|   0     0  : 1247k   18k
----system---- ---net/lo-----net/eno0-
     time     | recv  send:  recv  send
19-09 09:36:05|   0     0  : 1646k   22k
19-09 09:36:06|   0     0  : 1820k   27k^C

We can see that the full network bandwidth (117M) was consumed from 09:28:47 to 09:34:34. That is why xfs call traces show up in the guest's dmesg: some xfs I/Os could not complete in time, so the guest thought something was wrong in xfs, but nothing was; the guest eventually boots up without any error.

I guessed the extra network traffic came from NFS, so I tried reverting some patches and found that the issue is gone if I revert the patch below:

commit 49b29a573da83b65d5f4ecf2db6619bab7aa910c
Author: Christoph Hellwig <hch@lst.de>
Date:   Mon May 27 18:36:09 2024 +0200

    nfs: add support for large folios

    NFS already is void of folio size assumption, so just pass the chunk
    size to __filemap_get_folio and set the large folio address_space flag
    for all regular files.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Tested-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

I am not an expert on NFS, so I am not sure why large folio support causes this problem. Could you please take a look at it? Thanks.
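The read-amplification theory can also be tested without qemu by measuring how many bytes cross the wire for a single small cached read of the NFS-backed image (a sketch; eno0 and the image path are carried over from the setup above, and ordinary readahead inflates the number on any kernel, so compare a 6.10 host against a 6.11 host rather than reading the absolute value):

  # flush the page cache so the read has to go to the NFS server
  sync; echo 3 > /proc/sys/vm/drop_caches

  rx_before=$(cat /sys/class/net/eno0/statistics/rx_bytes)

  # one 4 KiB read at an arbitrary offset, through the page cache
  dd if=/share/guest_imgs/centos9.img of=/dev/null bs=4k count=1 skip=100000

  rx_after=$(cat /sys/class/net/eno0/statistics/rx_bytes)
  echo "bytes received for one 4 KiB read: $((rx_after - rx_before))"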
@Christoph Hellwig, could you check this issue? We double-checked that it is introduced by this patch:

commit 49b29a573da83b65d5f4ecf2db6619bab7aa910c
Author: Christoph Hellwig <hch@lst.de>
Date:   Mon May 27 18:36:09 2024 +0200

    nfs: add support for large folios
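For anyone who wants to re-verify, the check presumably amounts to the following on the host kernel tree (a sketch; the exact build and install steps depend on your configuration):

  # revert the suspected commit on top of the 6.11-rc host kernel
  git revert 49b29a573da83b65d5f4ecf2db6619bab7aa910c

  # rebuild, install, and reboot the host, then re-run the reproduce
  # steps from the first comment
  make -j"$(nproc)" && sudo make modules_install install

With the revert applied, guest boot time should return to the range reported above for the 6.10 host kernel (15-30 s).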
I think this thread on the kernel mailing list may be related: https://lore.kernel.org/all/49648605-d800-4859-be49-624bbe60519d@gmail.com/

I have been able to revert the patch mentioned above to avoid the same symptoms with VMs hosted on NFS shares on 6.11 and recent 6.12 kernels.