Created attachment 306014 [details]
dmesg

After the kernel upgrade from 6.7.6 to 6.8.1, all Xen guests are dumping kernel traces. Downgrading the kernel in the Xen guest makes the problem go away.

Xen host: 4.18.1
Xen guest type: PVH

An example trace:
```
[88847.284348] Call Trace:
[88847.284354]  <IRQ>
[88847.284361]  dump_stack_lvl+0x47/0x60
[88847.284378]  bad_page+0x71/0x100
[88847.284393]  free_unref_page_prepare+0x236/0x390
[88847.284405]  free_unref_page+0x34/0x180
[88847.284416]  __pskb_pull_tail+0x3ff/0x4a0
[88847.284432]  xennet_poll+0x909/0xa40 [xen_netfront 12c02fdcf84c692965d9cd6ca5a6ff0a530b4ce9]
[88847.284470]  __napi_poll+0x28/0x1b0
[88847.284483]  net_rx_action+0x2b5/0x370
[88847.284495]  ? handle_irq_desc+0x3e/0x60
[88847.284511]  __do_softirq+0xc9/0x2c8
[88847.284523]  __irq_exit_rcu+0xa3/0xc0
[88847.284536]  sysvec_xen_hvm_callback+0x72/0x90
[88847.284545]  </IRQ>
[88847.284549]  <TASK>
[88847.284552]  asm_sysvec_xen_hvm_callback+0x1a/0x20
[88847.284562] RIP: 0010:pv_native_safe_halt+0xf/0x20
[88847.284572] Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d e3 13 27 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
[88847.284579] RSP: 0018:ffffb2a1800c3e58 EFLAGS: 00000246
[88847.284587] RAX: 0000000000004000 RBX: ffff91358033b864 RCX: 000051404aebd79d
[88847.284594] RDX: ffff9136f9b00000 RSI: ffff91358033b800 RDI: 0000000000000001
[88847.284599] RBP: ffff91358033b864 R08: ffffffff9b94dca0 R09: 0000000000000001
[88847.284604] R10: 0000000000000018 R11: ffff9136f9b331a4 R12: ffffffff9b94dca0
[88847.284609] R13: ffffffff9b94dd20 R14: 0000000000000001 R15: 0000000000000000
[88847.284623]  acpi_safe_halt+0x15/0x30
[88847.284634]  acpi_idle_do_entry+0x2f/0x50
[88847.284644]  acpi_idle_enter+0x7f/0xd0
[88847.284655]  cpuidle_enter_state+0x81/0x440
[88847.284667]  cpuidle_enter+0x2d/0x40
[88847.284678]  do_idle+0x1d8/0x230
[88847.284688]  cpu_startup_entry+0x2a/0x30
[88847.284695]  start_secondary+0x11e/0x140
[88847.284705]  secondary_startup_64_no_verify+0x184/0x18b
[88847.284725]  </TASK>
```
Created attachment 306015 [details] systemd journal including boot
Possibly related is the patch "mm/page_pool: catch page_pool memory leaks":
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.8&id=dba1b8a7ab6853a84bf3afdbeac1c2f2370d3444

If so, it would suggest that the xen_netfront driver has an existing problem that was simply not visible before this check was added.
The cause has been fixed by a patch for the xen-netfront driver. It should land in the following Linux versions:

5.15.154
6.1.85
6.6.26
6.8.5
6.9-rc3

For me the problem should be solved, so I am closing this bug report.
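For anyone who wants to check a guest against the versions listed in this comment, here is a minimal Python sketch. The function name `has_netfront_fix` and the parsing regex are my own, not anything from the kernel tree. It only knows the release numbers named in this report, so it returns `None` for any other series (for example 6.7.x) instead of guessing. Distribution kernels may backport the fix under a different version number, so treat the result as a hint only.

```python
import platform
import re

# Minimum patch level per stable series, taken from the list in this report.
FIXED_IN_SERIES = {(5, 15): 154, (6, 1): 85, (6, 6): 26, (6, 8): 5}
FIXED_MAINLINE = (6, 9)   # fix is reported to land in 6.9-rc3
FIXED_MAINLINE_RC = 3


def has_netfront_fix(release):
    """Return True/False if the release can be judged, None if unknown."""
    m = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    series = (int(m.group(1)), int(m.group(2)))
    patch = int(m.group(3) or 0)
    rc = int(m.group(4)) if m.group(4) else None

    if series > FIXED_MAINLINE:
        return True
    if series == FIXED_MAINLINE:
        # A final 6.9 release (no -rc suffix) comes after rc3.
        return rc is None or rc >= FIXED_MAINLINE_RC
    if series in FIXED_IN_SERIES:
        return patch >= FIXED_IN_SERIES[series]
    return None  # series not mentioned in this report


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    print(platform.release(), has_netfront_fix(platform.release()))
```

Running it inside the guest prints the running kernel release and the verdict for it.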
Fixed by this patch. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/drivers/net/xen-netfront.c?id=c88b9b4cde17aec34fb9bfaf69f9f72a1c44f511