Bug 217572 - Initial blocked tasks causing deterioration over hours until (nearly) complete system lockup and data loss with PostgreSQL 13
Summary: Initial blocked tasks causing deterioration over hours until (nearly) complet...
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P3 high
Assignee: FileSystem/XFS Default Virtual Assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-19 08:29 UTC by Christian Theune
Modified: 2023-11-07 14:12 UTC (History)
6 users (show)

See Also:
Kernel Version: 6.1.57
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Christian Theune 2023-06-19 08:29:00 UTC
Last Friday we experienced the following hung task messages with PostgreSQL while performing our nightly backup using pg_dump. Normally this takes at most a few minutes with IO being stressed. This time it caused high SYS CPU time, went on for almost 20 minutes and caused the PostgreSQL dump to fail with inconsistent data.

Around 3:50 AM we got this:

[330289.821046] INFO: task .postgres-wrapp:11884 blocked for more than 122 seconds.
[330289.821830]       Not tainted 6.1.31 #1-NixOS
[330289.822285] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[330289.823098] task:.postgres-wrapp state:D stack:0     pid:11884 ppid:11881  flags:0x00000002
[330289.823911] Call Trace:
[330289.824221]  <TASK>
[330289.824451]  __schedule+0x35d/0x1370
[330289.824858]  ? mntput_no_expire+0x4a/0x250
[330289.825307]  schedule+0x5d/0xe0
[330289.825630]  rwsem_down_write_slowpath+0x34e/0x730
[330289.826128]  xfs_ilock+0xeb/0xf0 [xfs]
[330289.826599]  xfs_file_buffered_write+0x119/0x300 [xfs]
[330289.827212]  ? selinux_file_permission+0x10b/0x150
[330289.827683]  vfs_write+0x244/0x400
[330289.828049]  __x64_sys_pwrite64+0x94/0xc0
[330289.828459]  do_syscall_64+0x3a/0x90
[330289.828801]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[330289.829302] RIP: 0033:0x7ff8de90e7f7
[330289.829681] RSP: 002b:00007fff52069b08 EFLAGS: 00000202 ORIG_RAX: 0000000000000012
[330289.830408] RAX: ffffffffffffffda RBX: 0000562bb434f510 RCX: 00007ff8de90e7f7
[330289.831073] RDX: 0000000000002000 RSI: 00007ff888b52e80 RDI: 000000000000003b
[330289.831762] RBP: 00000000000021b0 R08: 000000000a000010 R09: 0000000000000040
[330289.832440] R10: 0000000004ed8000 R11: 0000000000000202 R12: 0000000000002000
[330289.833130] R13: 0000000004ed8000 R14: 00007ff8de8176c8 R15: 0000562bb434af75
[330289.833803]  </TASK>
[330289.834064] INFO: task .postgres-wrapp:1245532 blocked for more than 122 seconds.
[330289.834771]       Not tainted 6.1.31 #1-NixOS
[330289.835209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[330289.835926] task:.postgres-wrapp state:D stack:0     pid:1245532 ppid:11881  flags:0x00000002
[330289.836752] Call Trace:
[330289.837010]  <TASK>
[330289.837258]  __schedule+0x35d/0x1370
[330289.837622]  ? page_add_file_rmap+0xba/0x2f0
[330289.838047]  ? do_set_pte+0x174/0x1c0
[330289.838420]  ? unix_stream_read_generic+0x223/0xa60
[330289.838887]  schedule+0x5d/0xe0
[330289.839265]  schedule_preempt_disabled+0x14/0x30
[330289.839758]  rwsem_down_read_slowpath+0x29e/0x4f0
[330289.840521]  down_read+0x47/0xb0
[330289.840853]  xfs_ilock+0x79/0xf0 [xfs]
[330289.841346]  xfs_file_buffered_read+0x44/0xd0 [xfs]
[330289.841945]  xfs_file_read_iter+0x6a/0xd0 [xfs]
[330289.842496]  vfs_read+0x23c/0x310
[330289.842845]  __x64_sys_pread64+0x94/0xc0
[330289.843303]  do_syscall_64+0x3a/0x90
[330289.843857]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[330289.844420] RIP: 0033:0x7ff8de90e747
[330289.844898] RSP: 002b:00007fff5206aae8 EFLAGS: 00000202 ORIG_RAX: 0000000000000011
[330289.845705] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff8de90e747
[330289.846488] RDX: 0000000000002000 RSI: 00007ff8896e0e80 RDI: 0000000000000010
[330289.847133] RBP: 0000000000000001 R08: 000000000a00000d R09: 0000000000000000
[330289.847627] R10: 0000000018fb8000 R11: 0000000000000202 R12: 00007ff8cdd91278
[330289.848113] R13: 0000562bb434af75 R14: 0000562bb434f510 R15: 0000562bb5e89e70
[330289.848631]  </TASK>


This keeps going on for a few minutes until:

[330584.618978] rcu: INFO: rcu_preempt self-detected stall on CPU
[330584.619413] rcu: 	1-....: (20999 ticks this GP) idle=1204/1/0x4000000000000000 softirq=53073547/53073547 fqs=5231
[330584.620052] 	(t=21000 jiffies g=142859597 q=21168 ncpus=3)
[330584.620409] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31 #1-NixOS
[330584.620880] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[330584.621588] RIP: 0010:xas_load+0xb/0x40
[330584.621844] Code: 04 48 8b 44 c2 08 c3 cc cc cc cc 48 8b 07 48 8b 40 08 c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 e8 3b ff ff ff 48 89 c2 83 e2 03 <48> 83 fa 02 75 08 48 3d 00 10 00 00 77 05 c3 cc cc cc cc 0f b6 4f
[330584.622996] RSP: 0018:ffffb427c3387bf8 EFLAGS: 00000202
[330584.623337] RAX: ffff988876d3c002 RBX: ffffb427c3387d70 RCX: 0000000000000002
[330584.623787] RDX: 0000000000000002 RSI: ffff98871f9d2920 RDI: ffffb427c3387c00
[330584.624239] RBP: 000000000000f161 R08: ffffb427c3387e68 R09: ffffda1444482000
[330584.624686] R10: 0000000000000001 R11: 0000000000000001 R12: 000000000000f161
[330584.625136] R13: ffff9887ad3c2700 R14: 000000000000f160 R15: ffffb427c3387e90
[330584.625591] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000) knlGS:0000000000000000
[330584.626097] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[330584.626467] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4: 00000000000006e0
[330584.626954] Call Trace:
[330584.627125]  <IRQ>
[330584.627281]  ? rcu_dump_cpu_stacks+0xc8/0x100
[330584.627567]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
[330584.627882]  ? sched_slice+0x87/0x140
[330584.628126]  ? timekeeping_update+0xdd/0x130
[330584.628414]  ? __cgroup_account_cputime_field+0x5b/0xa0
[330584.628751]  ? update_process_times+0x77/0xb0
[330584.629036]  ? update_wall_time+0xc/0x20
[330584.629300]  ? tick_sched_handle+0x34/0x50
[330584.629564]  ? tick_sched_timer+0x6f/0x80
[330584.629823]  ? tick_sched_do_timer+0xa0/0xa0
[330584.630103]  ? __hrtimer_run_queues+0x112/0x2b0
[330584.630404]  ? hrtimer_interrupt+0xfe/0x220
[330584.630678]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
[330584.631016]  ? sysvec_apic_timer_interrupt+0x99/0xc0
[330584.631339]  </IRQ>
[330584.631485]  <TASK>
[330584.631631]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[330584.631974]  ? xas_load+0xb/0x40
[330584.632195]  ? xas_load+0x30/0x40
[330584.632415]  filemap_get_read_batch+0x16e/0x250
[330584.632710]  filemap_get_pages+0xa9/0x630
[330584.632972]  ? memcg_check_events+0xda/0x210
[330584.633259]  ? free_unref_page_commit+0x7c/0x170
[330584.633560]  ? _raw_spin_unlock_irqrestore+0x23/0x40
[330584.633880]  ? free_unref_page+0x1ac/0x220
[330584.634146]  filemap_read+0xd2/0x340
[330584.634389]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[330584.634778]  xfs_file_read_iter+0x6a/0xd0 [xfs]
[330584.635123]  vfs_read+0x23c/0x310
[330584.635354]  __x64_sys_pread64+0x94/0xc0
[330584.635609]  do_syscall_64+0x3a/0x90
[330584.635846]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[330584.636174] RIP: 0033:0x7ff8de90e747
[330584.636437] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66 90 f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
[330584.637582] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX: 0000000000000011
[330584.638056] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff8de90e747
[330584.638506] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI: 0000000000000230
[330584.638954] RBP: 0000000000000001 R08: 000000000a00000d R09: 0000000000000000
[330584.639411] R10: 000000000f160000 R11: 0000000000000202 R12: 00007ff8cdd92688
[330584.639862] R13: 0000562bb434af75 R14: 0000562bb434f510 R15: 0000562bb5fe04d0
[330584.640313]  </TASK>
[330584.721995] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-.... } 21015 jiffies s: 7297 root: 0x2/.
[330584.722930] rcu: blocking rcu_node structures (internal RCU debug):
[330584.723428] Sending NMI from CPU 0 to CPUs 1:
[330584.723763] NMI backtrace for cpu 1
[330584.723769] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31 #1-NixOS
[330584.723772] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[330584.723775] RIP: 0010:xas_load+0x29/0x40
[330584.723783] Code: 00 e8 3b ff ff ff 48 89 c2 83 e2 03 48 83 fa 02 75 08 48 3d 00 10 00 00 77 05 c3 cc cc cc cc 0f b6 4f 10 48 8d 70 fe 38 48 fe <72> ee e8 20 fe ff ff 80 3e 00 75 d0 c3 cc cc cc cc 66 0f 1f 44 00
[330584.723785] RSP: 0018:ffffb427c3387bf8 EFLAGS: 00000246
[330584.723787] RAX: ffff98871f8dbdaa RBX: ffffb427c3387d70 RCX: 0000000000000000
[330584.723788] RDX: 0000000000000002 RSI: ffff98871f8dbda8 RDI: ffffb427c3387c00
[330584.723789] RBP: 000000000000f161 R08: ffffb427c3387e68 R09: ffffda1444482000
[330584.723790] R10: 0000000000000001 R11: 0000000000000001 R12: 000000000000f161
[330584.723791] R13: ffff9887ad3c2700 R14: 000000000000f160 R15: ffffb427c3387e90
[330584.723793] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000) knlGS:0000000000000000
[330584.723794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[330584.723795] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4: 00000000000006e0
[330584.723798] Call Trace:
[330584.723801]  <NMI>
[330584.723805]  ? nmi_cpu_backtrace.cold+0x1b/0x76
[330584.723810]  ? nmi_cpu_backtrace_handler+0xd/0x20
[330584.723814]  ? nmi_handle+0x5d/0x120
[330584.723817]  ? xas_load+0x29/0x40
[330584.723818]  ? default_do_nmi+0x69/0x170
[330584.723821]  ? exc_nmi+0x13c/0x170
[330584.723823]  ? end_repeat_nmi+0x16/0x67
[330584.723828]  ? xas_load+0x29/0x40
[330584.723830]  ? xas_load+0x29/0x40
[330584.723832]  ? xas_load+0x29/0x40
[330584.723834]  </NMI>
[330584.723834]  <TASK>
[330584.723835]  filemap_get_read_batch+0x16e/0x250
[330584.723840]  filemap_get_pages+0xa9/0x630
[330584.723842]  ? memcg_check_events+0xda/0x210
[330584.723845]  ? free_unref_page_commit+0x7c/0x170
[330584.723849]  ? _raw_spin_unlock_irqrestore+0x23/0x40
[330584.723851]  ? free_unref_page+0x1ac/0x220
[330584.723852]  filemap_read+0xd2/0x340
[330584.723857]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[330584.723935]  xfs_file_read_iter+0x6a/0xd0 [xfs]
[330584.723990]  vfs_read+0x23c/0x310
[330584.723995]  __x64_sys_pread64+0x94/0xc0
[330584.723997]  do_syscall_64+0x3a/0x90
[330584.724000]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[330584.724003] RIP: 0033:0x7ff8de90e747
[330584.724019] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66 90 f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
[330584.724020] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX: 0000000000000011
[330584.724021] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff8de90e747
[330584.724022] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI: 0000000000000230
[330584.724023] RBP: 0000000000000001 R08: 000000000a00000d R09: 0000000000000000
[330584.724023] R10: 000000000f160000 R11: 0000000000000202 R12: 00007ff8cdd92688
[330584.724024] R13: 0000562bb434af75 R14: 0000562bb434f510 R15: 0000562bb5fe04d0
[330584.724026]  </TASK>


This keeps repeating until around 4:03 which ends in systemd-journal getting coredumped:

[331277.846966] rcu: INFO: rcu_preempt self-detected stall on CPU
[331277.847413] rcu: 	1-....: (713858 ticks this GP) idle=1204/1/0x4000000000000000 softirq=53073547/53073547 fqs=311996
[331277.848088] 	(t=714253 jiffies g=142859597 q=1821602 ncpus=3)
[331277.848462] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31 #1-NixOS
[331277.848941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[331277.849667] RIP: 0010:xas_descend+0x22/0x70
[331277.849952] Code: cc cc cc cc cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 <48> 83 f9 02 75 08 48 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc
[331277.851120] RSP: 0018:ffffb427c3387bf0 EFLAGS: 00000202
[331277.851468] RAX: ffff988876d3c002 RBX: ffffb427c3387d70 RCX: 0000000000000002
[331277.851928] RDX: 000000000000000f RSI: ffff98871f9d2920 RDI: ffffb427c3387c00
[331277.852386] RBP: 000000000000f161 R08: ffffb427c3387e68 R09: ffffda1444482000
[331277.852847] R10: 0000000000000001 R11: 0000000000000001 R12: 000000000000f161
[331277.853303] R13: ffff9887ad3c2700 R14: 000000000000f160 R15: ffffb427c3387e90
[331277.853766] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000) knlGS:0000000000000000
[331277.854278] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[331277.854653] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4: 00000000000006e0
[331277.855111] Call Trace:
[331277.855284]  <IRQ>
[331277.855433]  ? rcu_dump_cpu_stacks+0xc8/0x100
[331277.855730]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
[331277.856049]  ? sched_slice+0x87/0x140
[331277.856294]  ? timekeeping_update+0xdd/0x130
[331277.856587]  ? __cgroup_account_cputime_field+0x5b/0xa0
[331277.856930]  ? update_process_times+0x77/0xb0
[331277.857222]  ? update_wall_time+0xc/0x20
[331277.857482]  ? tick_sched_handle+0x34/0x50
[331277.857758]  ? tick_sched_timer+0x6f/0x80
[331277.858024]  ? tick_sched_do_timer+0xa0/0xa0
[331277.858306]  ? __hrtimer_run_queues+0x112/0x2b0
[331277.858613]  ? hrtimer_interrupt+0xfe/0x220
[331277.858891]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
[331277.859235]  ? sysvec_apic_timer_interrupt+0x99/0xc0
[331277.859558]  </IRQ>
[331277.859710]  <TASK>
[331277.859857]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[331277.860206]  ? xas_descend+0x22/0x70
[331277.860446]  xas_load+0x30/0x40
[331277.860665]  filemap_get_read_batch+0x16e/0x250
[331277.860967]  filemap_get_pages+0xa9/0x630
[331277.861233]  ? memcg_check_events+0xda/0x210
[331277.861517]  ? free_unref_page_commit+0x7c/0x170
[331277.861834]  ? _raw_spin_unlock_irqrestore+0x23/0x40
[331277.862158]  ? free_unref_page+0x1ac/0x220
[331277.862427]  filemap_read+0xd2/0x340
[331277.862677]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[331277.863072]  xfs_file_read_iter+0x6a/0xd0 [xfs]
[331277.863424]  vfs_read+0x23c/0x310
[331277.863657]  __x64_sys_pread64+0x94/0xc0
[331277.863917]  do_syscall_64+0x3a/0x90
[331277.864159]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[331277.864488] RIP: 0033:0x7ff8de90e747
[331277.864752] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66 90 f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
[331277.865917] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX: 0000000000000011
[331277.866395] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff8de90e747
[331277.866848] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI: 0000000000000230
[331277.867298] RBP: 0000000000000001 R08: 000000000a00000d R09: 0000000000000000
[331277.867750] R10: 000000000f160000 R11: 0000000000000202 R12: 00007ff8cdd92688
[331277.868199] R13: 0000562bb434af75 R14: 0000562bb434f510 R15: 0000562bb5fe04d0
[331277.868658]  </TASK>
Fri Jun 16 04:03:23 AM CEST 2023 -- SERIAL CONSOLE IS LIVE --
[331289.987247] systemd[1]: Starting Serial console liveness marker...
[331289.992645] systemd[1]: nscd.service: Deactivated successfully.
[331290.011838] systemd[1]: nscd.service: Consumed 3min 16.251s CPU time, received 12.0M IP traffic, sent 8.2M IP traffic.
[331290.020289] systemd[1]: serial-console-liveness.service: Deactivated successfully.
[331290.035818] systemd[1]: Finished Serial console liveness marker.
[331290.068776] systemd[1]: Started Logrotate Service.
[331290.069700] systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
[331290.070475] systemd[1]: systemd-journald.service: Killing process 528 (systemd-journal) with signal SIGABRT.
[331290.096246] systemd[1]: logrotate.service: Deactivated successfully.
[331290.285285] systemd-coredump[1267441]: Process 528 (systemd-journal) of user 0 dumped core.
[331290.286002] systemd-coredump[1267441]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.39df468c96764e8c8fffc53a0d0a47d1.528.1686881003000000.zst
[331290.287045] systemd-coredump[1267441]: Module libaudit.so.1 without build-id.
[331290.287574] systemd-coredump[1267441]: Module libnftnl.so.11 without build-id.
[331290.288111] systemd-coredump[1267441]: Module libmnl.so.0 without build-id.
[331290.288587] systemd-coredump[1267441]: Module libgpg-error.so.0 without build-id.
[331290.289076] systemd-coredump[1267441]: Module libattr.so.1 without build-id.
[331290.289541] systemd-coredump[1267441]: Module libzstd.so.1 without build-id.
[331290.289991] systemd-coredump[1267441]: Module liblzma.so.5 without build-id.
[331290.290446] systemd-coredump[1267441]: Module libseccomp.so.2 without build-id.
[331290.290940] systemd-coredump[1267441]: Module libpam.so.0 without build-id.
[331290.291419] systemd-coredump[1267441]: Module liblz4.so.1 without build-id.
[331290.291893] systemd-coredump[1267441]: Module libkmod.so.2 without build-id.
[331290.292370] systemd-coredump[1267441]: Module libip4tc.so.2 without build-id.
[331290.295165] systemd-coredump[1267441]: Module libgcrypt.so.20 without build-id.
[331290.295678] systemd-coredump[1267441]: Module libcrypt.so.2 without build-id.
[331290.296160] systemd-coredump[1267441]: Module libcap.so.2 without build-id.
[331290.296708] systemd-coredump[1267441]: Module libacl.so.1 without build-id.
[331290.297206] systemd-coredump[1267441]: Module libsystemd-shared-253.so without build-id.
[331290.297768] systemd-coredump[1267441]: Module systemd-journald without build-id.
[331290.298273] systemd-coredump[1267441]: Stack trace of thread 528:
[331290.298714] systemd-coredump[1267441]: #0  0x00007f3e96431de8 check_object_header (libsystemd-shared-253.so + 0x231de8)
[331290.299435] systemd-coredump[1267441]: #1  0x00007f3e964330d2 journal_file_move_to_object (libsystemd-shared-253.so + 0x2330d2)
[331290.300199] systemd-coredump[1267441]: #2  0x00007f3e96434a71 journal_file_find_data_object_with_hash (libsystemd-shared-253.so + 0x234a71)
[331290.302207] systemd-coredump[1267441]: #3  0x00007f3e96434d3b journal_file_append_data (libsystemd-shared-253.so + 0x234d3b)
[331290.302983] systemd-coredump[1267441]: #4  0x00007f3e96437243 journal_file_append_entry (libsystemd-shared-253.so + 0x237243)
[331290.303767] systemd-coredump[1267441]: #5  0x000056536a4a7cd5 server_dispatch_message_real (systemd-journald + 0x10cd5)
[331290.304504] systemd-coredump[1267441]: #6  0x000056536a4a134b dev_kmsg_record (systemd-journald + 0xa34b)
[331290.305195] systemd-coredump[1267441]: #7  0x000056536a4a182b server_read_dev_kmsg (systemd-journald + 0xa82b)
[331290.305922] systemd-coredump[1267441]: #8  0x00007f3e9645c140 source_dispatch (libsystemd-shared-253.so + 0x25c140)
[331290.306640] systemd-coredump[1267441]: #9  0x00007f3e9645c42d sd_event_dispatch (libsystemd-shared-253.so + 0x25c42d)
[331290.307350] systemd-coredump[1267441]: #10 0x00007f3e9645cb48 sd_event_run (libsystemd-shared-253.so + 0x25cb48)
[331290.308043] systemd-coredump[1267441]: #11 0x000056536a4a0568 main (systemd-journald + 0x9568)
[331290.308652] systemd-coredump[1267441]: #12 0x00007f3e9603dace __libc_start_call_main (libc.so.6 + 0x23ace)
[331290.309292] systemd-coredump[1267441]: #13 0x00007f3e9603db89 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x23b89)
[331290.310048] systemd-coredump[1267441]: #14 0x000056536a4a0835 _start (systemd-journald + 0x9835)
[331290.310692] systemd-coredump[1267441]: ELF object binary architecture: AMD x86-64


This go back to normal until around 20:03 (around 14 hours later):

[389370.873026] rcu: INFO: rcu_preempt self-detected stall on CPU
[389370.873481] rcu: 	0-....: (21000 ticks this GP) idle=91fc/1/0x4000000000000000 softirq=85252827/85252827 fqs=4704
[389370.874126] 	(t=21002 jiffies g=167843445 q=13889 ncpus=3)
[389370.874500] CPU: 0 PID: 2202919 Comm: .postgres-wrapp Not tainted 6.1.31 #1-NixOS
[389370.874979] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[389370.875710] RIP: 0010:xas_descend+0x26/0x70
[389370.875991] Code: cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 <75> 08 48 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc 48 c1 e8 02
[389370.877164] RSP: 0018:ffffb427c4917bf0 EFLAGS: 00000246
[389370.877512] RAX: ffff98871f8dbdaa RBX: ffffb427c4917d70 RCX: 0000000000000002
[389370.877967] RDX: 0000000000000005 RSI: ffff988876d3c000 RDI: ffffb427c4917c00
[389370.878426] RBP: 000000000000f177 R08: ffffb427c4917e68 R09: ffff988846485d38
[389370.878880] R10: ffffb427c4917e60 R11: ffff988846485d38 R12: 000000000000f177
[389370.879341] R13: ffff988827b4ae00 R14: 000000000000f176 R15: ffffb427c4917e90
[389370.879798] FS:  00007ff8de817800(0000) GS:ffff98887ac00000(0000) knlGS:0000000000000000
[389370.880308] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[389370.880683] CR2: 00007ff881c8c000 CR3: 000000010dfea000 CR4: 00000000000006f0
[389370.881139] Call Trace:
[389370.881309]  <IRQ>
[389370.881463]  ? rcu_dump_cpu_stacks+0xc8/0x100
[389370.881755]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
[389370.882077]  ? sched_slice+0x87/0x140
[389370.882325]  ? perf_event_task_tick+0x64/0x370
[389370.882629]  ? __cgroup_account_cputime_field+0x5b/0xa0
[389370.882968]  ? update_process_times+0x77/0xb0
[389370.883258]  ? tick_sched_handle+0x34/0x50
[389370.883538]  ? tick_sched_timer+0x6f/0x80
[389370.883802]  ? tick_sched_do_timer+0xa0/0xa0
[389370.884084]  ? __hrtimer_run_queues+0x112/0x2b0
[389370.884388]  ? hrtimer_interrupt+0xfe/0x220
[389370.884666]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
[389370.885006]  ? sysvec_apic_timer_interrupt+0x99/0xc0
[389370.885338]  </IRQ>
[389370.885488]  <TASK>
[389370.885636]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[389370.885983]  ? xas_descend+0x26/0x70
[389370.886224]  xas_load+0x30/0x40
[389370.886456]  filemap_get_read_batch+0x16e/0x250
[389370.886758]  filemap_get_pages+0xa9/0x630
[389370.887026]  ? atime_needs_update+0x104/0x180
[389370.887317]  ? touch_atime+0x46/0x1f0
[389370.887573]  filemap_read+0xd2/0x340
[389370.887817]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[389370.888210]  xfs_file_read_iter+0x6a/0xd0 [xfs]
[389370.888571]  vfs_read+0x23c/0x310
[389370.888797]  __x64_sys_pread64+0x94/0xc0
[389370.889058]  do_syscall_64+0x3a/0x90
[389370.889299]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[389370.889634] RIP: 0033:0x7ff8de90e747
[389370.889900] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66 90 f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
[389370.891063] RSP: 002b:00007fff5206a848 EFLAGS: 00000202 ORIG_RAX: 0000000000000011
[389370.891549] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff8de90e747
[389370.892003] RDX: 0000000000002000 RSI: 00007ff881c8ce80 RDI: 00000000000001c7
[389370.892466] RBP: 0000000000000001 R08: 000000000a00000d R09: 0000000000000000
[389370.892923] R10: 000000000f176000 R11: 0000000000000202 R12: 00007ff8cdd91278
[389370.893386] R13: 0000562bb434af75 R14: 0000562bb434f510 R15: 0000562bb5e55828
[389370.893843]  </TASK>


At this point we have to restart the VM externally because it didn't respond properly any longer.
Comment 1 Christian Theune 2023-06-20 15:10:49 UTC
Hi,

is there anything I can do to raise interest on this one? :)

> On 19. Jun 2023, at 10:29, bugzilla-daemon@kernel.org wrote:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=217572
> 
>            Bug ID: 217572
>           Summary: Initial blocked tasks causing deterioration over hours
>                    until (nearly) complete system lockup and data loss
>                    with PostgreSQL 13
>           Product: File System
>           Version: 2.5
>          Hardware: All
>                OS: Linux
>            Status: NEW
>          Severity: high
>          Priority: P3
>         Component: XFS
>          Assignee: filesystem_xfs@kernel-bugs.kernel.org
>          Reporter: ct@flyingcircus.io
>        Regression: No
> 
> Last Friday we experienced the following hung task messages with PostgreSQL
> while performing our nightly backup using pg_dump. Normally this takes at
> most
> a few minutes with IO being stressed. This time it caused high SYS CPU time,
> went on for almost 20 minutes and caused the PostgreSQL dump to fail with
> inconsistent data.
> 
> Around 3:50 AM we got this:
> 
> [330289.821046] INFO: task .postgres-wrapp:11884 blocked for more than 122
> seconds.
> [330289.821830]       Not tainted 6.1.31 #1-NixOS
> [330289.822285] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [330289.823098] task:.postgres-wrapp state:D stack:0     pid:11884 ppid:11881 
> flags:0x00000002
> [330289.823911] Call Trace:
> [330289.824221]  <TASK>
> [330289.824451]  __schedule+0x35d/0x1370
> [330289.824858]  ? mntput_no_expire+0x4a/0x250
> [330289.825307]  schedule+0x5d/0xe0
> [330289.825630]  rwsem_down_write_slowpath+0x34e/0x730
> [330289.826128]  xfs_ilock+0xeb/0xf0 [xfs]
> [330289.826599]  xfs_file_buffered_write+0x119/0x300 [xfs]
> [330289.827212]  ? selinux_file_permission+0x10b/0x150
> [330289.827683]  vfs_write+0x244/0x400
> [330289.828049]  __x64_sys_pwrite64+0x94/0xc0
> [330289.828459]  do_syscall_64+0x3a/0x90
> [330289.828801]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [330289.829302] RIP: 0033:0x7ff8de90e7f7
> [330289.829681] RSP: 002b:00007fff52069b08 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000012
> [330289.830408] RAX: ffffffffffffffda RBX: 0000562bb434f510 RCX:
> 00007ff8de90e7f7
> [330289.831073] RDX: 0000000000002000 RSI: 00007ff888b52e80 RDI:
> 000000000000003b
> [330289.831762] RBP: 00000000000021b0 R08: 000000000a000010 R09:
> 0000000000000040
> [330289.832440] R10: 0000000004ed8000 R11: 0000000000000202 R12:
> 0000000000002000
> [330289.833130] R13: 0000000004ed8000 R14: 00007ff8de8176c8 R15:
> 0000562bb434af75
> [330289.833803]  </TASK>
> [330289.834064] INFO: task .postgres-wrapp:1245532 blocked for more than 122
> seconds.
> [330289.834771]       Not tainted 6.1.31 #1-NixOS
> [330289.835209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [330289.835926] task:.postgres-wrapp state:D stack:0     pid:1245532
> ppid:11881
> flags:0x00000002
> [330289.836752] Call Trace:
> [330289.837010]  <TASK>
> [330289.837258]  __schedule+0x35d/0x1370
> [330289.837622]  ? page_add_file_rmap+0xba/0x2f0
> [330289.838047]  ? do_set_pte+0x174/0x1c0
> [330289.838420]  ? unix_stream_read_generic+0x223/0xa60
> [330289.838887]  schedule+0x5d/0xe0
> [330289.839265]  schedule_preempt_disabled+0x14/0x30
> [330289.839758]  rwsem_down_read_slowpath+0x29e/0x4f0
> [330289.840521]  down_read+0x47/0xb0
> [330289.840853]  xfs_ilock+0x79/0xf0 [xfs]
> [330289.841346]  xfs_file_buffered_read+0x44/0xd0 [xfs]
> [330289.841945]  xfs_file_read_iter+0x6a/0xd0 [xfs]
> [330289.842496]  vfs_read+0x23c/0x310
> [330289.842845]  __x64_sys_pread64+0x94/0xc0
> [330289.843303]  do_syscall_64+0x3a/0x90
> [330289.843857]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [330289.844420] RIP: 0033:0x7ff8de90e747
> [330289.844898] RSP: 002b:00007fff5206aae8 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000011
> [330289.845705] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007ff8de90e747
> [330289.846488] RDX: 0000000000002000 RSI: 00007ff8896e0e80 RDI:
> 0000000000000010
> [330289.847133] RBP: 0000000000000001 R08: 000000000a00000d R09:
> 0000000000000000
> [330289.847627] R10: 0000000018fb8000 R11: 0000000000000202 R12:
> 00007ff8cdd91278
> [330289.848113] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
> 0000562bb5e89e70
> [330289.848631]  </TASK>
> 
> 
> This keeps going on for a few minutes until:
> 
> [330584.618978] rcu: INFO: rcu_preempt self-detected stall on CPU
> [330584.619413] rcu:    1-....: (20999 ticks this GP)
> idle=1204/1/0x4000000000000000 softirq=53073547/53073547 fqs=5231
> [330584.620052]         (t=21000 jiffies g=142859597 q=21168 ncpus=3)
> [330584.620409] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31
> #1-NixOS
> [330584.620880] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [330584.621588] RIP: 0010:xas_load+0xb/0x40
> [330584.621844] Code: 04 48 8b 44 c2 08 c3 cc cc cc cc 48 8b 07 48 8b 40 08
> c3
> cc cc cc cc 0f 1f 84 00 00 00 00 00 e8 3b ff ff ff 48 89 c2 83 e2 03 <48> 83
> fa
> 02 75 08 48 3d 00 10 00 00 77 05 c3 cc cc cc cc 0f b6 4f
> [330584.622996] RSP: 0018:ffffb427c3387bf8 EFLAGS: 00000202
> [330584.623337] RAX: ffff988876d3c002 RBX: ffffb427c3387d70 RCX:
> 0000000000000002
> [330584.623787] RDX: 0000000000000002 RSI: ffff98871f9d2920 RDI:
> ffffb427c3387c00
> [330584.624239] RBP: 000000000000f161 R08: ffffb427c3387e68 R09:
> ffffda1444482000
> [330584.624686] R10: 0000000000000001 R11: 0000000000000001 R12:
> 000000000000f161
> [330584.625136] R13: ffff9887ad3c2700 R14: 000000000000f160 R15:
> ffffb427c3387e90
> [330584.625591] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000)
> knlGS:0000000000000000
> [330584.626097] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [330584.626467] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4:
> 00000000000006e0
> [330584.626954] Call Trace:
> [330584.627125]  <IRQ>
> [330584.627281]  ? rcu_dump_cpu_stacks+0xc8/0x100
> [330584.627567]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
> [330584.627882]  ? sched_slice+0x87/0x140
> [330584.628126]  ? timekeeping_update+0xdd/0x130
> [330584.628414]  ? __cgroup_account_cputime_field+0x5b/0xa0
> [330584.628751]  ? update_process_times+0x77/0xb0
> [330584.629036]  ? update_wall_time+0xc/0x20
> [330584.629300]  ? tick_sched_handle+0x34/0x50
> [330584.629564]  ? tick_sched_timer+0x6f/0x80
> [330584.629823]  ? tick_sched_do_timer+0xa0/0xa0
> [330584.630103]  ? __hrtimer_run_queues+0x112/0x2b0
> [330584.630404]  ? hrtimer_interrupt+0xfe/0x220
> [330584.630678]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
> [330584.631016]  ? sysvec_apic_timer_interrupt+0x99/0xc0
> [330584.631339]  </IRQ>
> [330584.631485]  <TASK>
> [330584.631631]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [330584.631974]  ? xas_load+0xb/0x40
> [330584.632195]  ? xas_load+0x30/0x40
> [330584.632415]  filemap_get_read_batch+0x16e/0x250
> [330584.632710]  filemap_get_pages+0xa9/0x630
> [330584.632972]  ? memcg_check_events+0xda/0x210
> [330584.633259]  ? free_unref_page_commit+0x7c/0x170
> [330584.633560]  ? _raw_spin_unlock_irqrestore+0x23/0x40
> [330584.633880]  ? free_unref_page+0x1ac/0x220
> [330584.634146]  filemap_read+0xd2/0x340
> [330584.634389]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
> [330584.634778]  xfs_file_read_iter+0x6a/0xd0 [xfs]
> [330584.635123]  vfs_read+0x23c/0x310
> [330584.635354]  __x64_sys_pread64+0x94/0xc0
> [330584.635609]  do_syscall_64+0x3a/0x90
> [330584.635846]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [330584.636174] RIP: 0033:0x7ff8de90e747
> [330584.636437] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
> 90
> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
> 00
> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
> [330584.637582] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000011
> [330584.638056] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007ff8de90e747
> [330584.638506] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI:
> 0000000000000230
> [330584.638954] RBP: 0000000000000001 R08: 000000000a00000d R09:
> 0000000000000000
> [330584.639411] R10: 000000000f160000 R11: 0000000000000202 R12:
> 00007ff8cdd92688
> [330584.639862] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
> 0000562bb5fe04d0
> [330584.640313]  </TASK>
> [330584.721995] rcu: INFO: rcu_preempt detected expedited stalls on
> CPUs/tasks:
> { 1-.... } 21015 jiffies s: 7297 root: 0x2/.
> [330584.722930] rcu: blocking rcu_node structures (internal RCU debug):
> [330584.723428] Sending NMI from CPU 0 to CPUs 1:
> [330584.723763] NMI backtrace for cpu 1
> [330584.723769] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31
> #1-NixOS
> [330584.723772] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [330584.723775] RIP: 0010:xas_load+0x29/0x40
> [330584.723783] Code: 00 e8 3b ff ff ff 48 89 c2 83 e2 03 48 83 fa 02 75 08
> 48
> 3d 00 10 00 00 77 05 c3 cc cc cc cc 0f b6 4f 10 48 8d 70 fe 38 48 fe <72> ee
> e8
> 20 fe ff ff 80 3e 00 75 d0 c3 cc cc cc cc 66 0f 1f 44 00
> [330584.723785] RSP: 0018:ffffb427c3387bf8 EFLAGS: 00000246
> [330584.723787] RAX: ffff98871f8dbdaa RBX: ffffb427c3387d70 RCX:
> 0000000000000000
> [330584.723788] RDX: 0000000000000002 RSI: ffff98871f8dbda8 RDI:
> ffffb427c3387c00
> [330584.723789] RBP: 000000000000f161 R08: ffffb427c3387e68 R09:
> ffffda1444482000
> [330584.723790] R10: 0000000000000001 R11: 0000000000000001 R12:
> 000000000000f161
> [330584.723791] R13: ffff9887ad3c2700 R14: 000000000000f160 R15:
> ffffb427c3387e90
> [330584.723793] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000)
> knlGS:0000000000000000
> [330584.723794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [330584.723795] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4:
> 00000000000006e0
> [330584.723798] Call Trace:
> [330584.723801]  <NMI>
> [330584.723805]  ? nmi_cpu_backtrace.cold+0x1b/0x76
> [330584.723810]  ? nmi_cpu_backtrace_handler+0xd/0x20
> [330584.723814]  ? nmi_handle+0x5d/0x120
> [330584.723817]  ? xas_load+0x29/0x40
> [330584.723818]  ? default_do_nmi+0x69/0x170
> [330584.723821]  ? exc_nmi+0x13c/0x170
> [330584.723823]  ? end_repeat_nmi+0x16/0x67
> [330584.723828]  ? xas_load+0x29/0x40
> [330584.723830]  ? xas_load+0x29/0x40
> [330584.723832]  ? xas_load+0x29/0x40
> [330584.723834]  </NMI>
> [330584.723834]  <TASK>
> [330584.723835]  filemap_get_read_batch+0x16e/0x250
> [330584.723840]  filemap_get_pages+0xa9/0x630
> [330584.723842]  ? memcg_check_events+0xda/0x210
> [330584.723845]  ? free_unref_page_commit+0x7c/0x170
> [330584.723849]  ? _raw_spin_unlock_irqrestore+0x23/0x40
> [330584.723851]  ? free_unref_page+0x1ac/0x220
> [330584.723852]  filemap_read+0xd2/0x340
> [330584.723857]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
> [330584.723935]  xfs_file_read_iter+0x6a/0xd0 [xfs]
> [330584.723990]  vfs_read+0x23c/0x310
> [330584.723995]  __x64_sys_pread64+0x94/0xc0
> [330584.723997]  do_syscall_64+0x3a/0x90
> [330584.724000]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [330584.724003] RIP: 0033:0x7ff8de90e747
> [330584.724019] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
> 90
> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
> 00
> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
> [330584.724020] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000011
> [330584.724021] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007ff8de90e747
> [330584.724022] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI:
> 0000000000000230
> [330584.724023] RBP: 0000000000000001 R08: 000000000a00000d R09:
> 0000000000000000
> [330584.724023] R10: 000000000f160000 R11: 0000000000000202 R12:
> 00007ff8cdd92688
> [330584.724024] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
> 0000562bb5fe04d0
> [330584.724026]  </TASK>
> 
> 
> This keeps repeating until around 4:03 which ends in systemd-journal getting
> coredumped:
> 
> [331277.846966] rcu: INFO: rcu_preempt self-detected stall on CPU
> [331277.847413] rcu:    1-....: (713858 ticks this GP)
> idle=1204/1/0x4000000000000000 softirq=53073547/53073547 fqs=311996
> [331277.848088]         (t=714253 jiffies g=142859597 q=1821602 ncpus=3)
> [331277.848462] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31
> #1-NixOS
> [331277.848941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [331277.849667] RIP: 0010:xas_descend+0x22/0x70
> [331277.849952] Code: cc cc cc cc cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea
> 83
> e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 <48> 83
> f9
> 02 75 08 48 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc
> [331277.851120] RSP: 0018:ffffb427c3387bf0 EFLAGS: 00000202
> [331277.851468] RAX: ffff988876d3c002 RBX: ffffb427c3387d70 RCX:
> 0000000000000002
> [331277.851928] RDX: 000000000000000f RSI: ffff98871f9d2920 RDI:
> ffffb427c3387c00
> [331277.852386] RBP: 000000000000f161 R08: ffffb427c3387e68 R09:
> ffffda1444482000
> [331277.852847] R10: 0000000000000001 R11: 0000000000000001 R12:
> 000000000000f161
> [331277.853303] R13: ffff9887ad3c2700 R14: 000000000000f160 R15:
> ffffb427c3387e90
> [331277.853766] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000)
> knlGS:0000000000000000
> [331277.854278] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [331277.854653] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4:
> 00000000000006e0
> [331277.855111] Call Trace:
> [331277.855284]  <IRQ>
> [331277.855433]  ? rcu_dump_cpu_stacks+0xc8/0x100
> [331277.855730]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
> [331277.856049]  ? sched_slice+0x87/0x140
> [331277.856294]  ? timekeeping_update+0xdd/0x130
> [331277.856587]  ? __cgroup_account_cputime_field+0x5b/0xa0
> [331277.856930]  ? update_process_times+0x77/0xb0
> [331277.857222]  ? update_wall_time+0xc/0x20
> [331277.857482]  ? tick_sched_handle+0x34/0x50
> [331277.857758]  ? tick_sched_timer+0x6f/0x80
> [331277.858024]  ? tick_sched_do_timer+0xa0/0xa0
> [331277.858306]  ? __hrtimer_run_queues+0x112/0x2b0
> [331277.858613]  ? hrtimer_interrupt+0xfe/0x220
> [331277.858891]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
> [331277.859235]  ? sysvec_apic_timer_interrupt+0x99/0xc0
> [331277.859558]  </IRQ>
> [331277.859710]  <TASK>
> [331277.859857]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [331277.860206]  ? xas_descend+0x22/0x70
> [331277.860446]  xas_load+0x30/0x40
> [331277.860665]  filemap_get_read_batch+0x16e/0x250
> [331277.860967]  filemap_get_pages+0xa9/0x630
> [331277.861233]  ? memcg_check_events+0xda/0x210
> [331277.861517]  ? free_unref_page_commit+0x7c/0x170
> [331277.861834]  ? _raw_spin_unlock_irqrestore+0x23/0x40
> [331277.862158]  ? free_unref_page+0x1ac/0x220
> [331277.862427]  filemap_read+0xd2/0x340
> [331277.862677]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
> [331277.863072]  xfs_file_read_iter+0x6a/0xd0 [xfs]
> [331277.863424]  vfs_read+0x23c/0x310
> [331277.863657]  __x64_sys_pread64+0x94/0xc0
> [331277.863917]  do_syscall_64+0x3a/0x90
> [331277.864159]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [331277.864488] RIP: 0033:0x7ff8de90e747
> [331277.864752] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
> 90
> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
> 00
> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
> [331277.865917] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000011
> [331277.866395] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007ff8de90e747
> [331277.866848] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI:
> 0000000000000230
> [331277.867298] RBP: 0000000000000001 R08: 000000000a00000d R09:
> 0000000000000000
> [331277.867750] R10: 000000000f160000 R11: 0000000000000202 R12:
> 00007ff8cdd92688
> [331277.868199] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
> 0000562bb5fe04d0
> [331277.868658]  </TASK>
> Fri Jun 16 04:03:23 AM CEST 2023 -- SERIAL CONSOLE IS LIVE --
> [331289.987247] systemd[1]: Starting Serial console liveness marker...
> [331289.992645] systemd[1]: nscd.service: Deactivated successfully.
> [331290.011838] systemd[1]: nscd.service: Consumed 3min 16.251s CPU time,
> received 12.0M IP traffic, sent 8.2M IP traffic.
> [331290.020289] systemd[1]: serial-console-liveness.service: Deactivated
> successfully.
> [331290.035818] systemd[1]: Finished Serial console liveness marker.
> [331290.068776] systemd[1]: Started Logrotate Service.
> [331290.069700] systemd[1]: systemd-journald.service: Watchdog timeout (limit
> 3min)!
> [331290.070475] systemd[1]: systemd-journald.service: Killing process 528
> (systemd-journal) with signal SIGABRT.
> [331290.096246] systemd[1]: logrotate.service: Deactivated successfully.
> [331290.285285] systemd-coredump[1267441]: Process 528 (systemd-journal) of
> user 0 dumped core.
> [331290.286002] systemd-coredump[1267441]: Coredump diverted to
>
> /var/lib/systemd/coredump/core.systemd-journal.0.39df468c96764e8c8fffc53a0d0a47d1.528.1686881003000000.zst
> [331290.287045] systemd-coredump[1267441]: Module libaudit.so.1 without
> build-id.
> [331290.287574] systemd-coredump[1267441]: Module libnftnl.so.11 without
> build-id.
> [331290.288111] systemd-coredump[1267441]: Module libmnl.so.0 without
> build-id.
> [331290.288587] systemd-coredump[1267441]: Module libgpg-error.so.0 without
> build-id.
> [331290.289076] systemd-coredump[1267441]: Module libattr.so.1 without
> build-id.
> [331290.289541] systemd-coredump[1267441]: Module libzstd.so.1 without
> build-id.
> [331290.289991] systemd-coredump[1267441]: Module liblzma.so.5 without
> build-id.
> [331290.290446] systemd-coredump[1267441]: Module libseccomp.so.2 without
> build-id.
> [331290.290940] systemd-coredump[1267441]: Module libpam.so.0 without
> build-id.
> [331290.291419] systemd-coredump[1267441]: Module liblz4.so.1 without
> build-id.
> [331290.291893] systemd-coredump[1267441]: Module libkmod.so.2 without
> build-id.
> [331290.292370] systemd-coredump[1267441]: Module libip4tc.so.2 without
> build-id.
> [331290.295165] systemd-coredump[1267441]: Module libgcrypt.so.20 without
> build-id.
> [331290.295678] systemd-coredump[1267441]: Module libcrypt.so.2 without
> build-id.
> [331290.296160] systemd-coredump[1267441]: Module libcap.so.2 without
> build-id.
> [331290.296708] systemd-coredump[1267441]: Module libacl.so.1 without
> build-id.
> [331290.297206] systemd-coredump[1267441]: Module libsystemd-shared-253.so
> without build-id.
> [331290.297768] systemd-coredump[1267441]: Module systemd-journald without
> build-id.
> [331290.298273] systemd-coredump[1267441]: Stack trace of thread 528:
> [331290.298714] systemd-coredump[1267441]: #0  0x00007f3e96431de8
> check_object_header (libsystemd-shared-253.so + 0x231de8)
> [331290.299435] systemd-coredump[1267441]: #1  0x00007f3e964330d2
> journal_file_move_to_object (libsystemd-shared-253.so + 0x2330d2)
> [331290.300199] systemd-coredump[1267441]: #2  0x00007f3e96434a71
> journal_file_find_data_object_with_hash (libsystemd-shared-253.so + 0x234a71)
> [331290.302207] systemd-coredump[1267441]: #3  0x00007f3e96434d3b
> journal_file_append_data (libsystemd-shared-253.so + 0x234d3b)
> [331290.302983] systemd-coredump[1267441]: #4  0x00007f3e96437243
> journal_file_append_entry (libsystemd-shared-253.so + 0x237243)
> [331290.303767] systemd-coredump[1267441]: #5  0x000056536a4a7cd5
> server_dispatch_message_real (systemd-journald + 0x10cd5)
> [331290.304504] systemd-coredump[1267441]: #6  0x000056536a4a134b
> dev_kmsg_record (systemd-journald + 0xa34b)
> [331290.305195] systemd-coredump[1267441]: #7  0x000056536a4a182b
> server_read_dev_kmsg (systemd-journald + 0xa82b)
> [331290.305922] systemd-coredump[1267441]: #8  0x00007f3e9645c140
> source_dispatch (libsystemd-shared-253.so + 0x25c140)
> [331290.306640] systemd-coredump[1267441]: #9  0x00007f3e9645c42d
> sd_event_dispatch (libsystemd-shared-253.so + 0x25c42d)
> [331290.307350] systemd-coredump[1267441]: #10 0x00007f3e9645cb48
> sd_event_run
> (libsystemd-shared-253.so + 0x25cb48)
> [331290.308043] systemd-coredump[1267441]: #11 0x000056536a4a0568 main
> (systemd-journald + 0x9568)
> [331290.308652] systemd-coredump[1267441]: #12 0x00007f3e9603dace
> __libc_start_call_main (libc.so.6 + 0x23ace)
> [331290.309292] systemd-coredump[1267441]: #13 0x00007f3e9603db89
> __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x23b89)
> [331290.310048] systemd-coredump[1267441]: #14 0x000056536a4a0835 _start
> (systemd-journald + 0x9835)
> [331290.310692] systemd-coredump[1267441]: ELF object binary architecture:
> AMD
> x86-64
> 
> 
> This go back to normal until around 20:03 (around 14 hours later):
> 
> [389370.873026] rcu: INFO: rcu_preempt self-detected stall on CPU
> [389370.873481] rcu:    0-....: (21000 ticks this GP)
> idle=91fc/1/0x4000000000000000 softirq=85252827/85252827 fqs=4704
> [389370.874126]         (t=21002 jiffies g=167843445 q=13889 ncpus=3)
> [389370.874500] CPU: 0 PID: 2202919 Comm: .postgres-wrapp Not tainted 6.1.31
> #1-NixOS
> [389370.874979] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [389370.875710] RIP: 0010:xas_descend+0x26/0x70
> [389370.875991] Code: cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89
> d0
> 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 <75> 08
> 48
> 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc 48 c1 e8 02
> [389370.877164] RSP: 0018:ffffb427c4917bf0 EFLAGS: 00000246
> [389370.877512] RAX: ffff98871f8dbdaa RBX: ffffb427c4917d70 RCX:
> 0000000000000002
> [389370.877967] RDX: 0000000000000005 RSI: ffff988876d3c000 RDI:
> ffffb427c4917c00
> [389370.878426] RBP: 000000000000f177 R08: ffffb427c4917e68 R09:
> ffff988846485d38
> [389370.878880] R10: ffffb427c4917e60 R11: ffff988846485d38 R12:
> 000000000000f177
> [389370.879341] R13: ffff988827b4ae00 R14: 000000000000f176 R15:
> ffffb427c4917e90
> [389370.879798] FS:  00007ff8de817800(0000) GS:ffff98887ac00000(0000)
> knlGS:0000000000000000
> [389370.880308] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [389370.880683] CR2: 00007ff881c8c000 CR3: 000000010dfea000 CR4:
> 00000000000006f0
> [389370.881139] Call Trace:
> [389370.881309]  <IRQ>
> [389370.881463]  ? rcu_dump_cpu_stacks+0xc8/0x100
> [389370.881755]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
> [389370.882077]  ? sched_slice+0x87/0x140
> [389370.882325]  ? perf_event_task_tick+0x64/0x370
> [389370.882629]  ? __cgroup_account_cputime_field+0x5b/0xa0
> [389370.882968]  ? update_process_times+0x77/0xb0
> [389370.883258]  ? tick_sched_handle+0x34/0x50
> [389370.883538]  ? tick_sched_timer+0x6f/0x80
> [389370.883802]  ? tick_sched_do_timer+0xa0/0xa0
> [389370.884084]  ? __hrtimer_run_queues+0x112/0x2b0
> [389370.884388]  ? hrtimer_interrupt+0xfe/0x220
> [389370.884666]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
> [389370.885006]  ? sysvec_apic_timer_interrupt+0x99/0xc0
> [389370.885338]  </IRQ>
> [389370.885488]  <TASK>
> [389370.885636]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [389370.885983]  ? xas_descend+0x26/0x70
> [389370.886224]  xas_load+0x30/0x40
> [389370.886456]  filemap_get_read_batch+0x16e/0x250
> [389370.886758]  filemap_get_pages+0xa9/0x630
> [389370.887026]  ? atime_needs_update+0x104/0x180
> [389370.887317]  ? touch_atime+0x46/0x1f0
> [389370.887573]  filemap_read+0xd2/0x340
> [389370.887817]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
> [389370.888210]  xfs_file_read_iter+0x6a/0xd0 [xfs]
> [389370.888571]  vfs_read+0x23c/0x310
> [389370.888797]  __x64_sys_pread64+0x94/0xc0
> [389370.889058]  do_syscall_64+0x3a/0x90
> [389370.889299]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [389370.889634] RIP: 0033:0x7ff8de90e747
> [389370.889900] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
> 90
> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
> 00
> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
> [389370.891063] RSP: 002b:00007fff5206a848 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000011
> [389370.891549] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007ff8de90e747
> [389370.892003] RDX: 0000000000002000 RSI: 00007ff881c8ce80 RDI:
> 00000000000001c7
> [389370.892466] RBP: 0000000000000001 R08: 000000000a00000d R09:
> 0000000000000000
> [389370.892923] R10: 000000000f176000 R11: 0000000000000202 R12:
> 00007ff8cdd91278
> [389370.893386] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
> 0000562bb5e55828
> [389370.893843]  </TASK>
> 
> 
> At this point we have to restart the VM externally because it didn't respond
> properly any longer.
> 
> -- 
> You may reply to this email to add a comment.
> 
> You are receiving this mail because:
> You are watching the assignee of the bug.


Liebe Grüße,
Christian Theune
Comment 2 Christian Theune 2023-06-20 15:13:16 UTC
Argh. Sorry. My brain is a bit fried today and I - for some reason - thought I did the bug reports last week and not yesterday. I didn’t want to appear pushy, so, sorry for the early follow-up.

*ducks*

> On 20. Jun 2023, at 17:10, Christian Theune <ct@flyingcircus.io> wrote:
> 
> Hi,
> 
> is there anything I can do to raise interest on this one? :)
> 
>> On 19. Jun 2023, at 10:29, bugzilla-daemon@kernel.org wrote:
>> 
>> https://bugzilla.kernel.org/show_bug.cgi?id=217572
>> 
>>           Bug ID: 217572
>>          Summary: Initial blocked tasks causing deterioration over hours
>>                   until (nearly) complete system lockup and data loss
>>                   with PostgreSQL 13
>>          Product: File System
>>          Version: 2.5
>>         Hardware: All
>>               OS: Linux
>>           Status: NEW
>>         Severity: high
>>         Priority: P3
>>        Component: XFS
>>         Assignee: filesystem_xfs@kernel-bugs.kernel.org
>>         Reporter: ct@flyingcircus.io
>>       Regression: No
>> 
>> Last Friday we experienced the following hung task messages with PostgreSQL
>> while performing our nightly backup using pg_dump. Normally this takes at
>> most
>> a few minutes with IO being stressed. This time it caused high SYS CPU time,
>> went on for almost 20 minutes and caused the PostgreSQL dump to fail with
>> inconsistent data.
>> 
>> Around 3:50 AM we got this:
>> 
>> [330289.821046] INFO: task .postgres-wrapp:11884 blocked for more than 122
>> seconds.
>> [330289.821830]       Not tainted 6.1.31 #1-NixOS
>> [330289.822285] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
>> this message.
>> [330289.823098] task:.postgres-wrapp state:D stack:0     pid:11884
>> ppid:11881 
>> flags:0x00000002
>> [330289.823911] Call Trace:
>> [330289.824221]  <TASK>
>> [330289.824451]  __schedule+0x35d/0x1370
>> [330289.824858]  ? mntput_no_expire+0x4a/0x250
>> [330289.825307]  schedule+0x5d/0xe0
>> [330289.825630]  rwsem_down_write_slowpath+0x34e/0x730
>> [330289.826128]  xfs_ilock+0xeb/0xf0 [xfs]
>> [330289.826599]  xfs_file_buffered_write+0x119/0x300 [xfs]
>> [330289.827212]  ? selinux_file_permission+0x10b/0x150
>> [330289.827683]  vfs_write+0x244/0x400
>> [330289.828049]  __x64_sys_pwrite64+0x94/0xc0
>> [330289.828459]  do_syscall_64+0x3a/0x90
>> [330289.828801]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> [330289.829302] RIP: 0033:0x7ff8de90e7f7
>> [330289.829681] RSP: 002b:00007fff52069b08 EFLAGS: 00000202 ORIG_RAX:
>> 0000000000000012
>> [330289.830408] RAX: ffffffffffffffda RBX: 0000562bb434f510 RCX:
>> 00007ff8de90e7f7
>> [330289.831073] RDX: 0000000000002000 RSI: 00007ff888b52e80 RDI:
>> 000000000000003b
>> [330289.831762] RBP: 00000000000021b0 R08: 000000000a000010 R09:
>> 0000000000000040
>> [330289.832440] R10: 0000000004ed8000 R11: 0000000000000202 R12:
>> 0000000000002000
>> [330289.833130] R13: 0000000004ed8000 R14: 00007ff8de8176c8 R15:
>> 0000562bb434af75
>> [330289.833803]  </TASK>
>> [330289.834064] INFO: task .postgres-wrapp:1245532 blocked for more than 122
>> seconds.
>> [330289.834771]       Not tainted 6.1.31 #1-NixOS
>> [330289.835209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
>> this message.
>> [330289.835926] task:.postgres-wrapp state:D stack:0     pid:1245532
>> ppid:11881
>> flags:0x00000002
>> [330289.836752] Call Trace:
>> [330289.837010]  <TASK>
>> [330289.837258]  __schedule+0x35d/0x1370
>> [330289.837622]  ? page_add_file_rmap+0xba/0x2f0
>> [330289.838047]  ? do_set_pte+0x174/0x1c0
>> [330289.838420]  ? unix_stream_read_generic+0x223/0xa60
>> [330289.838887]  schedule+0x5d/0xe0
>> [330289.839265]  schedule_preempt_disabled+0x14/0x30
>> [330289.839758]  rwsem_down_read_slowpath+0x29e/0x4f0
>> [330289.840521]  down_read+0x47/0xb0
>> [330289.840853]  xfs_ilock+0x79/0xf0 [xfs]
>> [330289.841346]  xfs_file_buffered_read+0x44/0xd0 [xfs]
>> [330289.841945]  xfs_file_read_iter+0x6a/0xd0 [xfs]
>> [330289.842496]  vfs_read+0x23c/0x310
>> [330289.842845]  __x64_sys_pread64+0x94/0xc0
>> [330289.843303]  do_syscall_64+0x3a/0x90
>> [330289.843857]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> [330289.844420] RIP: 0033:0x7ff8de90e747
>> [330289.844898] RSP: 002b:00007fff5206aae8 EFLAGS: 00000202 ORIG_RAX:
>> 0000000000000011
>> [330289.845705] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
>> 00007ff8de90e747
>> [330289.846488] RDX: 0000000000002000 RSI: 00007ff8896e0e80 RDI:
>> 0000000000000010
>> [330289.847133] RBP: 0000000000000001 R08: 000000000a00000d R09:
>> 0000000000000000
>> [330289.847627] R10: 0000000018fb8000 R11: 0000000000000202 R12:
>> 00007ff8cdd91278
>> [330289.848113] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
>> 0000562bb5e89e70
>> [330289.848631]  </TASK>
>> 
>> 
>> This keeps going on for a few minutes until:
>> 
>> [330584.618978] rcu: INFO: rcu_preempt self-detected stall on CPU
>> [330584.619413] rcu:    1-....: (20999 ticks this GP)
>> idle=1204/1/0x4000000000000000 softirq=53073547/53073547 fqs=5231
>> [330584.620052]         (t=21000 jiffies g=142859597 q=21168 ncpus=3)
>> [330584.620409] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31
>> #1-NixOS
>> [330584.620880] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
>> [330584.621588] RIP: 0010:xas_load+0xb/0x40
>> [330584.621844] Code: 04 48 8b 44 c2 08 c3 cc cc cc cc 48 8b 07 48 8b 40 08
>> c3
>> cc cc cc cc 0f 1f 84 00 00 00 00 00 e8 3b ff ff ff 48 89 c2 83 e2 03 <48> 83
>> fa
>> 02 75 08 48 3d 00 10 00 00 77 05 c3 cc cc cc cc 0f b6 4f
>> [330584.622996] RSP: 0018:ffffb427c3387bf8 EFLAGS: 00000202
>> [330584.623337] RAX: ffff988876d3c002 RBX: ffffb427c3387d70 RCX:
>> 0000000000000002
>> [330584.623787] RDX: 0000000000000002 RSI: ffff98871f9d2920 RDI:
>> ffffb427c3387c00
>> [330584.624239] RBP: 000000000000f161 R08: ffffb427c3387e68 R09:
>> ffffda1444482000
>> [330584.624686] R10: 0000000000000001 R11: 0000000000000001 R12:
>> 000000000000f161
>> [330584.625136] R13: ffff9887ad3c2700 R14: 000000000000f160 R15:
>> ffffb427c3387e90
>> [330584.625591] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000)
>> knlGS:0000000000000000
>> [330584.626097] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [330584.626467] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4:
>> 00000000000006e0
>> [330584.626954] Call Trace:
>> [330584.627125]  <IRQ>
>> [330584.627281]  ? rcu_dump_cpu_stacks+0xc8/0x100
>> [330584.627567]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
>> [330584.627882]  ? sched_slice+0x87/0x140
>> [330584.628126]  ? timekeeping_update+0xdd/0x130
>> [330584.628414]  ? __cgroup_account_cputime_field+0x5b/0xa0
>> [330584.628751]  ? update_process_times+0x77/0xb0
>> [330584.629036]  ? update_wall_time+0xc/0x20
>> [330584.629300]  ? tick_sched_handle+0x34/0x50
>> [330584.629564]  ? tick_sched_timer+0x6f/0x80
>> [330584.629823]  ? tick_sched_do_timer+0xa0/0xa0
>> [330584.630103]  ? __hrtimer_run_queues+0x112/0x2b0
>> [330584.630404]  ? hrtimer_interrupt+0xfe/0x220
>> [330584.630678]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
>> [330584.631016]  ? sysvec_apic_timer_interrupt+0x99/0xc0
>> [330584.631339]  </IRQ>
>> [330584.631485]  <TASK>
>> [330584.631631]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>> [330584.631974]  ? xas_load+0xb/0x40
>> [330584.632195]  ? xas_load+0x30/0x40
>> [330584.632415]  filemap_get_read_batch+0x16e/0x250
>> [330584.632710]  filemap_get_pages+0xa9/0x630
>> [330584.632972]  ? memcg_check_events+0xda/0x210
>> [330584.633259]  ? free_unref_page_commit+0x7c/0x170
>> [330584.633560]  ? _raw_spin_unlock_irqrestore+0x23/0x40
>> [330584.633880]  ? free_unref_page+0x1ac/0x220
>> [330584.634146]  filemap_read+0xd2/0x340
>> [330584.634389]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
>> [330584.634778]  xfs_file_read_iter+0x6a/0xd0 [xfs]
>> [330584.635123]  vfs_read+0x23c/0x310
>> [330584.635354]  __x64_sys_pread64+0x94/0xc0
>> [330584.635609]  do_syscall_64+0x3a/0x90
>> [330584.635846]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> [330584.636174] RIP: 0033:0x7ff8de90e747
>> [330584.636437] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
>> 90
>> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
>> 00
>> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
>> [330584.637582] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX:
>> 0000000000000011
>> [330584.638056] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
>> 00007ff8de90e747
>> [330584.638506] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI:
>> 0000000000000230
>> [330584.638954] RBP: 0000000000000001 R08: 000000000a00000d R09:
>> 0000000000000000
>> [330584.639411] R10: 000000000f160000 R11: 0000000000000202 R12:
>> 00007ff8cdd92688
>> [330584.639862] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
>> 0000562bb5fe04d0
>> [330584.640313]  </TASK>
>> [330584.721995] rcu: INFO: rcu_preempt detected expedited stalls on
>> CPUs/tasks:
>> { 1-.... } 21015 jiffies s: 7297 root: 0x2/.
>> [330584.722930] rcu: blocking rcu_node structures (internal RCU debug):
>> [330584.723428] Sending NMI from CPU 0 to CPUs 1:
>> [330584.723763] NMI backtrace for cpu 1
>> [330584.723769] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31
>> #1-NixOS
>> [330584.723772] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
>> [330584.723775] RIP: 0010:xas_load+0x29/0x40
>> [330584.723783] Code: 00 e8 3b ff ff ff 48 89 c2 83 e2 03 48 83 fa 02 75 08
>> 48
>> 3d 00 10 00 00 77 05 c3 cc cc cc cc 0f b6 4f 10 48 8d 70 fe 38 48 fe <72> ee
>> e8
>> 20 fe ff ff 80 3e 00 75 d0 c3 cc cc cc cc 66 0f 1f 44 00
>> [330584.723785] RSP: 0018:ffffb427c3387bf8 EFLAGS: 00000246
>> [330584.723787] RAX: ffff98871f8dbdaa RBX: ffffb427c3387d70 RCX:
>> 0000000000000000
>> [330584.723788] RDX: 0000000000000002 RSI: ffff98871f8dbda8 RDI:
>> ffffb427c3387c00
>> [330584.723789] RBP: 000000000000f161 R08: ffffb427c3387e68 R09:
>> ffffda1444482000
>> [330584.723790] R10: 0000000000000001 R11: 0000000000000001 R12:
>> 000000000000f161
>> [330584.723791] R13: ffff9887ad3c2700 R14: 000000000000f160 R15:
>> ffffb427c3387e90
>> [330584.723793] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000)
>> knlGS:0000000000000000
>> [330584.723794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [330584.723795] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4:
>> 00000000000006e0
>> [330584.723798] Call Trace:
>> [330584.723801]  <NMI>
>> [330584.723805]  ? nmi_cpu_backtrace.cold+0x1b/0x76
>> [330584.723810]  ? nmi_cpu_backtrace_handler+0xd/0x20
>> [330584.723814]  ? nmi_handle+0x5d/0x120
>> [330584.723817]  ? xas_load+0x29/0x40
>> [330584.723818]  ? default_do_nmi+0x69/0x170
>> [330584.723821]  ? exc_nmi+0x13c/0x170
>> [330584.723823]  ? end_repeat_nmi+0x16/0x67
>> [330584.723828]  ? xas_load+0x29/0x40
>> [330584.723830]  ? xas_load+0x29/0x40
>> [330584.723832]  ? xas_load+0x29/0x40
>> [330584.723834]  </NMI>
>> [330584.723834]  <TASK>
>> [330584.723835]  filemap_get_read_batch+0x16e/0x250
>> [330584.723840]  filemap_get_pages+0xa9/0x630
>> [330584.723842]  ? memcg_check_events+0xda/0x210
>> [330584.723845]  ? free_unref_page_commit+0x7c/0x170
>> [330584.723849]  ? _raw_spin_unlock_irqrestore+0x23/0x40
>> [330584.723851]  ? free_unref_page+0x1ac/0x220
>> [330584.723852]  filemap_read+0xd2/0x340
>> [330584.723857]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
>> [330584.723935]  xfs_file_read_iter+0x6a/0xd0 [xfs]
>> [330584.723990]  vfs_read+0x23c/0x310
>> [330584.723995]  __x64_sys_pread64+0x94/0xc0
>> [330584.723997]  do_syscall_64+0x3a/0x90
>> [330584.724000]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> [330584.724003] RIP: 0033:0x7ff8de90e747
>> [330584.724019] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
>> 90
>> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
>> 00
>> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
>> [330584.724020] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX:
>> 0000000000000011
>> [330584.724021] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
>> 00007ff8de90e747
>> [330584.724022] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI:
>> 0000000000000230
>> [330584.724023] RBP: 0000000000000001 R08: 000000000a00000d R09:
>> 0000000000000000
>> [330584.724023] R10: 000000000f160000 R11: 0000000000000202 R12:
>> 00007ff8cdd92688
>> [330584.724024] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
>> 0000562bb5fe04d0
>> [330584.724026]  </TASK>
>> 
>> 
>> This keeps repeating until around 4:03 which ends in systemd-journal getting
>> coredumped:
>> 
>> [331277.846966] rcu: INFO: rcu_preempt self-detected stall on CPU
>> [331277.847413] rcu:    1-....: (713858 ticks this GP)
>> idle=1204/1/0x4000000000000000 softirq=53073547/53073547 fqs=311996
>> [331277.848088]         (t=714253 jiffies g=142859597 q=1821602 ncpus=3)
>> [331277.848462] CPU: 1 PID: 1246481 Comm: .postgres-wrapp Not tainted 6.1.31
>> #1-NixOS
>> [331277.848941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
>> [331277.849667] RIP: 0010:xas_descend+0x22/0x70
>> [331277.849952] Code: cc cc cc cc cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea
>> 83
>> e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 <48> 83
>> f9
>> 02 75 08 48 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc
>> [331277.851120] RSP: 0018:ffffb427c3387bf0 EFLAGS: 00000202
>> [331277.851468] RAX: ffff988876d3c002 RBX: ffffb427c3387d70 RCX:
>> 0000000000000002
>> [331277.851928] RDX: 000000000000000f RSI: ffff98871f9d2920 RDI:
>> ffffb427c3387c00
>> [331277.852386] RBP: 000000000000f161 R08: ffffb427c3387e68 R09:
>> ffffda1444482000
>> [331277.852847] R10: 0000000000000001 R11: 0000000000000001 R12:
>> 000000000000f161
>> [331277.853303] R13: ffff9887ad3c2700 R14: 000000000000f160 R15:
>> ffffb427c3387e90
>> [331277.853766] FS:  00007ff8de817800(0000) GS:ffff98887ac80000(0000)
>> knlGS:0000000000000000
>> [331277.854278] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [331277.854653] CR2: 0000562bb434e2d8 CR3: 0000000104bd6000 CR4:
>> 00000000000006e0
>> [331277.855111] Call Trace:
>> [331277.855284]  <IRQ>
>> [331277.855433]  ? rcu_dump_cpu_stacks+0xc8/0x100
>> [331277.855730]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
>> [331277.856049]  ? sched_slice+0x87/0x140
>> [331277.856294]  ? timekeeping_update+0xdd/0x130
>> [331277.856587]  ? __cgroup_account_cputime_field+0x5b/0xa0
>> [331277.856930]  ? update_process_times+0x77/0xb0
>> [331277.857222]  ? update_wall_time+0xc/0x20
>> [331277.857482]  ? tick_sched_handle+0x34/0x50
>> [331277.857758]  ? tick_sched_timer+0x6f/0x80
>> [331277.858024]  ? tick_sched_do_timer+0xa0/0xa0
>> [331277.858306]  ? __hrtimer_run_queues+0x112/0x2b0
>> [331277.858613]  ? hrtimer_interrupt+0xfe/0x220
>> [331277.858891]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
>> [331277.859235]  ? sysvec_apic_timer_interrupt+0x99/0xc0
>> [331277.859558]  </IRQ>
>> [331277.859710]  <TASK>
>> [331277.859857]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>> [331277.860206]  ? xas_descend+0x22/0x70
>> [331277.860446]  xas_load+0x30/0x40
>> [331277.860665]  filemap_get_read_batch+0x16e/0x250
>> [331277.860967]  filemap_get_pages+0xa9/0x630
>> [331277.861233]  ? memcg_check_events+0xda/0x210
>> [331277.861517]  ? free_unref_page_commit+0x7c/0x170
>> [331277.861834]  ? _raw_spin_unlock_irqrestore+0x23/0x40
>> [331277.862158]  ? free_unref_page+0x1ac/0x220
>> [331277.862427]  filemap_read+0xd2/0x340
>> [331277.862677]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
>> [331277.863072]  xfs_file_read_iter+0x6a/0xd0 [xfs]
>> [331277.863424]  vfs_read+0x23c/0x310
>> [331277.863657]  __x64_sys_pread64+0x94/0xc0
>> [331277.863917]  do_syscall_64+0x3a/0x90
>> [331277.864159]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> [331277.864488] RIP: 0033:0x7ff8de90e747
>> [331277.864752] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
>> 90
>> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
>> 00
>> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
>> [331277.865917] RSP: 002b:00007fff5206aaa8 EFLAGS: 00000202 ORIG_RAX:
>> 0000000000000011
>> [331277.866395] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
>> 00007ff8de90e747
>> [331277.866848] RDX: 0000000000002000 RSI: 00007ff888b68e80 RDI:
>> 0000000000000230
>> [331277.867298] RBP: 0000000000000001 R08: 000000000a00000d R09:
>> 0000000000000000
>> [331277.867750] R10: 000000000f160000 R11: 0000000000000202 R12:
>> 00007ff8cdd92688
>> [331277.868199] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
>> 0000562bb5fe04d0
>> [331277.868658]  </TASK>
>> Fri Jun 16 04:03:23 AM CEST 2023 -- SERIAL CONSOLE IS LIVE --
>> [331289.987247] systemd[1]: Starting Serial console liveness marker...
>> [331289.992645] systemd[1]: nscd.service: Deactivated successfully.
>> [331290.011838] systemd[1]: nscd.service: Consumed 3min 16.251s CPU time,
>> received 12.0M IP traffic, sent 8.2M IP traffic.
>> [331290.020289] systemd[1]: serial-console-liveness.service: Deactivated
>> successfully.
>> [331290.035818] systemd[1]: Finished Serial console liveness marker.
>> [331290.068776] systemd[1]: Started Logrotate Service.
>> [331290.069700] systemd[1]: systemd-journald.service: Watchdog timeout
>> (limit
>> 3min)!
>> [331290.070475] systemd[1]: systemd-journald.service: Killing process 528
>> (systemd-journal) with signal SIGABRT.
>> [331290.096246] systemd[1]: logrotate.service: Deactivated successfully.
>> [331290.285285] systemd-coredump[1267441]: Process 528 (systemd-journal) of
>> user 0 dumped core.
>> [331290.286002] systemd-coredump[1267441]: Coredump diverted to
>>
>> /var/lib/systemd/coredump/core.systemd-journal.0.39df468c96764e8c8fffc53a0d0a47d1.528.1686881003000000.zst
>> [331290.287045] systemd-coredump[1267441]: Module libaudit.so.1 without
>> build-id.
>> [331290.287574] systemd-coredump[1267441]: Module libnftnl.so.11 without
>> build-id.
>> [331290.288111] systemd-coredump[1267441]: Module libmnl.so.0 without
>> build-id.
>> [331290.288587] systemd-coredump[1267441]: Module libgpg-error.so.0 without
>> build-id.
>> [331290.289076] systemd-coredump[1267441]: Module libattr.so.1 without
>> build-id.
>> [331290.289541] systemd-coredump[1267441]: Module libzstd.so.1 without
>> build-id.
>> [331290.289991] systemd-coredump[1267441]: Module liblzma.so.5 without
>> build-id.
>> [331290.290446] systemd-coredump[1267441]: Module libseccomp.so.2 without
>> build-id.
>> [331290.290940] systemd-coredump[1267441]: Module libpam.so.0 without
>> build-id.
>> [331290.291419] systemd-coredump[1267441]: Module liblz4.so.1 without
>> build-id.
>> [331290.291893] systemd-coredump[1267441]: Module libkmod.so.2 without
>> build-id.
>> [331290.292370] systemd-coredump[1267441]: Module libip4tc.so.2 without
>> build-id.
>> [331290.295165] systemd-coredump[1267441]: Module libgcrypt.so.20 without
>> build-id.
>> [331290.295678] systemd-coredump[1267441]: Module libcrypt.so.2 without
>> build-id.
>> [331290.296160] systemd-coredump[1267441]: Module libcap.so.2 without
>> build-id.
>> [331290.296708] systemd-coredump[1267441]: Module libacl.so.1 without
>> build-id.
>> [331290.297206] systemd-coredump[1267441]: Module libsystemd-shared-253.so
>> without build-id.
>> [331290.297768] systemd-coredump[1267441]: Module systemd-journald without
>> build-id.
>> [331290.298273] systemd-coredump[1267441]: Stack trace of thread 528:
>> [331290.298714] systemd-coredump[1267441]: #0  0x00007f3e96431de8
>> check_object_header (libsystemd-shared-253.so + 0x231de8)
>> [331290.299435] systemd-coredump[1267441]: #1  0x00007f3e964330d2
>> journal_file_move_to_object (libsystemd-shared-253.so + 0x2330d2)
>> [331290.300199] systemd-coredump[1267441]: #2  0x00007f3e96434a71
>> journal_file_find_data_object_with_hash (libsystemd-shared-253.so +
>> 0x234a71)
>> [331290.302207] systemd-coredump[1267441]: #3  0x00007f3e96434d3b
>> journal_file_append_data (libsystemd-shared-253.so + 0x234d3b)
>> [331290.302983] systemd-coredump[1267441]: #4  0x00007f3e96437243
>> journal_file_append_entry (libsystemd-shared-253.so + 0x237243)
>> [331290.303767] systemd-coredump[1267441]: #5  0x000056536a4a7cd5
>> server_dispatch_message_real (systemd-journald + 0x10cd5)
>> [331290.304504] systemd-coredump[1267441]: #6  0x000056536a4a134b
>> dev_kmsg_record (systemd-journald + 0xa34b)
>> [331290.305195] systemd-coredump[1267441]: #7  0x000056536a4a182b
>> server_read_dev_kmsg (systemd-journald + 0xa82b)
>> [331290.305922] systemd-coredump[1267441]: #8  0x00007f3e9645c140
>> source_dispatch (libsystemd-shared-253.so + 0x25c140)
>> [331290.306640] systemd-coredump[1267441]: #9  0x00007f3e9645c42d
>> sd_event_dispatch (libsystemd-shared-253.so + 0x25c42d)
>> [331290.307350] systemd-coredump[1267441]: #10 0x00007f3e9645cb48
>> sd_event_run
>> (libsystemd-shared-253.so + 0x25cb48)
>> [331290.308043] systemd-coredump[1267441]: #11 0x000056536a4a0568 main
>> (systemd-journald + 0x9568)
>> [331290.308652] systemd-coredump[1267441]: #12 0x00007f3e9603dace
>> __libc_start_call_main (libc.so.6 + 0x23ace)
>> [331290.309292] systemd-coredump[1267441]: #13 0x00007f3e9603db89
>> __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x23b89)
>> [331290.310048] systemd-coredump[1267441]: #14 0x000056536a4a0835 _start
>> (systemd-journald + 0x9835)
>> [331290.310692] systemd-coredump[1267441]: ELF object binary architecture:
>> AMD
>> x86-64
>> 
>> 
>> This go back to normal until around 20:03 (around 14 hours later):
>> 
>> [389370.873026] rcu: INFO: rcu_preempt self-detected stall on CPU
>> [389370.873481] rcu:    0-....: (21000 ticks this GP)
>> idle=91fc/1/0x4000000000000000 softirq=85252827/85252827 fqs=4704
>> [389370.874126]         (t=21002 jiffies g=167843445 q=13889 ncpus=3)
>> [389370.874500] CPU: 0 PID: 2202919 Comm: .postgres-wrapp Not tainted 6.1.31
>> #1-NixOS
>> [389370.874979] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
>> [389370.875710] RIP: 0010:xas_descend+0x26/0x70
>> [389370.875991] Code: cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89
>> d0
>> 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 <75> 08
>> 48
>> 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc 48 c1 e8 02
>> [389370.877164] RSP: 0018:ffffb427c4917bf0 EFLAGS: 00000246
>> [389370.877512] RAX: ffff98871f8dbdaa RBX: ffffb427c4917d70 RCX:
>> 0000000000000002
>> [389370.877967] RDX: 0000000000000005 RSI: ffff988876d3c000 RDI:
>> ffffb427c4917c00
>> [389370.878426] RBP: 000000000000f177 R08: ffffb427c4917e68 R09:
>> ffff988846485d38
>> [389370.878880] R10: ffffb427c4917e60 R11: ffff988846485d38 R12:
>> 000000000000f177
>> [389370.879341] R13: ffff988827b4ae00 R14: 000000000000f176 R15:
>> ffffb427c4917e90
>> [389370.879798] FS:  00007ff8de817800(0000) GS:ffff98887ac00000(0000)
>> knlGS:0000000000000000
>> [389370.880308] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [389370.880683] CR2: 00007ff881c8c000 CR3: 000000010dfea000 CR4:
>> 00000000000006f0
>> [389370.881139] Call Trace:
>> [389370.881309]  <IRQ>
>> [389370.881463]  ? rcu_dump_cpu_stacks+0xc8/0x100
>> [389370.881755]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
>> [389370.882077]  ? sched_slice+0x87/0x140
>> [389370.882325]  ? perf_event_task_tick+0x64/0x370
>> [389370.882629]  ? __cgroup_account_cputime_field+0x5b/0xa0
>> [389370.882968]  ? update_process_times+0x77/0xb0
>> [389370.883258]  ? tick_sched_handle+0x34/0x50
>> [389370.883538]  ? tick_sched_timer+0x6f/0x80
>> [389370.883802]  ? tick_sched_do_timer+0xa0/0xa0
>> [389370.884084]  ? __hrtimer_run_queues+0x112/0x2b0
>> [389370.884388]  ? hrtimer_interrupt+0xfe/0x220
>> [389370.884666]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
>> [389370.885006]  ? sysvec_apic_timer_interrupt+0x99/0xc0
>> [389370.885338]  </IRQ>
>> [389370.885488]  <TASK>
>> [389370.885636]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>> [389370.885983]  ? xas_descend+0x26/0x70
>> [389370.886224]  xas_load+0x30/0x40
>> [389370.886456]  filemap_get_read_batch+0x16e/0x250
>> [389370.886758]  filemap_get_pages+0xa9/0x630
>> [389370.887026]  ? atime_needs_update+0x104/0x180
>> [389370.887317]  ? touch_atime+0x46/0x1f0
>> [389370.887573]  filemap_read+0xd2/0x340
>> [389370.887817]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
>> [389370.888210]  xfs_file_read_iter+0x6a/0xd0 [xfs]
>> [389370.888571]  vfs_read+0x23c/0x310
>> [389370.888797]  __x64_sys_pread64+0x94/0xc0
>> [389370.889058]  do_syscall_64+0x3a/0x90
>> [389370.889299]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> [389370.889634] RIP: 0033:0x7ff8de90e747
>> [389370.889900] Code: 48 e8 9d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66
>> 90
>> f3 0f 1e fa 80 3d bd ac 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d
>> 00
>> f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
>> [389370.891063] RSP: 002b:00007fff5206a848 EFLAGS: 00000202 ORIG_RAX:
>> 0000000000000011
>> [389370.891549] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
>> 00007ff8de90e747
>> [389370.892003] RDX: 0000000000002000 RSI: 00007ff881c8ce80 RDI:
>> 00000000000001c7
>> [389370.892466] RBP: 0000000000000001 R08: 000000000a00000d R09:
>> 0000000000000000
>> [389370.892923] R10: 000000000f176000 R11: 0000000000000202 R12:
>> 00007ff8cdd91278
>> [389370.893386] R13: 0000562bb434af75 R14: 0000562bb434f510 R15:
>> 0000562bb5e55828
>> [389370.893843]  </TASK>
>> 
>> 
>> At this point we have to restart the VM externally because it didn't respond
>> properly any longer.
>> 
>> -- 
>> You may reply to this email to add a comment.
>> 
>> You are receiving this mail because:
>> You are watching the assignee of the bug.
> 
> 
> Liebe Grüße,
> Christian Theune
> 
> -- 
> Christian Theune · ct@flyingcircus.io · +49 345 219401 0
> Flying Circus Internet Operations GmbH · https://flyingcircus.io
> Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian
> Zagrodnick
> 

Liebe Grüße,
Christian Theune
Comment 3 Christian Kujau 2023-06-20 15:21:41 UTC
So, these "blocked for more than" warnings are somewhat documented in https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#hung-task-timeout-secs and these warnings can be disabled.

These "rcu_preempt self-detected stall on CPU" messages may not be related to XFS at all, see e.g. https://stackoverflow.com/a/35403677. Maybe the system is just too busy while doing file system operations, hence "xfs" showing up in these backtraces.

Does this happen when dumping smaller databases too? Did this happen with earlier kernels? With newer ones?
Comment 4 Christian Theune 2023-06-20 17:26:52 UTC
I've only seen this once, so no indication on older or newer kernels, yet, either with smaller or larger databases. I fortunately could repair the PostgreSQL database with a FULL VACUUM on the table and then the dump worked fine again.

Hanging in the past typically indicated a network storage issue, so I'm aware of the multiple causes the hung tasks can have, I still appreciate the link. :)

At the time of the hung tasks, I can see almost no IO (but also no IO pressure) and 60% (of 3 CPUs) are reported as using up SYSTEM time. 

Something that made me think XFS was that we ended up with inconsistent data within PostgreSQL which I haven't seen in a decade.

Nevertheless, it appears this might be a MM issue as I stumbled over this inquiry which also mentions a 6.1 kernel: https://www.spinics.net/lists/kernel/msg4783004.html

I'm trying to get in touch with Daniel to see whether his analysis went anywhere ...
Comment 5 Ivan Mironov 2023-07-03 14:10:31 UTC
It looks like I had similar issues on Fedora with 6.3.* kernels, XFS and RocksDB: https://bugzilla.redhat.com/show_bug.cgi?id=2213967
Comment 6 Christian Theune 2023-07-03 19:56:36 UTC
Daniel pointed me to this patch they're considering as a valid fix:
https://lore.kernel.org/linux-fsdevel/20221129001632.GX3600936@dread.disaster.area/
Comment 7 Dave Chinner 2023-07-03 22:30:49 UTC
On Mon, Jul 03, 2023 at 07:56:36PM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217572
> 
> --- Comment #6 from Christian Theune (ct@flyingcircus.io) ---
> Daniel pointed me to this patch they're considering as a valid fix:
>
> https://lore.kernel.org/linux-fsdevel/20221129001632.GX3600936@dread.disaster.area/

No, that has nothing to do with the problem you are seeing on 6.1.31
kernels. That was a fix for a regression introduced in 6.3-rc1, and
hence does not exist in 6.1.y kernels.

The problem you are tripping over appears to be a livelock in the
page cache iterator infrastructure, not an issue with the filesystem
itself. This has been seen occasionally (maybe once every couple of
months of testing across the entire dev community) during testing
since large folios were enabled in the page cache, but nobody has
been able to reproduce it reliably enough to be able to isolate the
root cause and fix it yet.

If you can reproduce it reliably and quickly, then putting together
a recipe that we can use to trigger it would be a great help.

-Dave.
Comment 8 Christian Theune 2023-07-04 04:22:39 UTC
Ah, great, thanks! Superficially it looked related, but I saw that the patch never made it to 6.1.
 
I have only seen it once yet, too. If I can make out anything to make reproduction more likely, I'll make sure to let you know.

As Daniel isn't on CC on this, I'll let him know as well.
Comment 9 Sam James 2023-07-05 22:07:17 UTC
Any chance this one is related to https://bugzilla.kernel.org/show_bug.cgi?id=216646? It sounds similar in terms of symptoms (the eventual lockup & stalled tasks) at least.
Comment 10 Christian Theune 2023-09-28 12:39:05 UTC
@Dave I've seen this a second time now today. It also ended up being a copy statement with postgresql and I'm going to take a consistent snapshot of the full disk right now.

If you can point me to anything I can do to extract data from the filesystem I can boot this into an isolated machine and poke it. I'd also be open to look at it together if you (or anyone) wants to poke around.
Comment 11 Christian Theune 2023-09-28 13:06:13 UTC
Ok, so the issue didn't directly appear again after a reboot, but the PostgreSQL needed vacuuming (and luckily didn't looks data again).

I can provoke some issues in the machine when booted, but they might be after-effects in the filesystem due to the suspected memory bug dave mentioned.
Comment 12 Dave Chinner 2023-09-28 22:44:51 UTC
On Thu, Sep 28, 2023 at 12:39:05PM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217572
> 
> --- Comment #10 from Christian Theune (ct@flyingcircus.io) ---
> @Dave I've seen this a second time now today. It also ended up being a copy
> statement with postgresql and I'm going to take a consistent snapshot of the
> full disk right now.

Have you tried seeing if commit cbc02854331e ("XArray: Do not return
sibling entries from xa_load()") fixes the problem? That has been
marked for stable, so I'd expect that it eventually makes it back to
6.1.y or whatever LTS kernel you are running now....

As for debug, this isn't an XFS problem - it's a page cache problem
when high-order folios are in use (which XFS uses) - so you need to
be looking at in-memory page cache state that is being tripped over
first....

-Dave.
Comment 13 Christian Theune 2023-09-29 04:54:42 UTC
Hi,

sorry for picking up on the XFS end again – that was a brain fart on my side ... ;)

Unfortunately, I still can't (reliably) trigger the issue yet, but I've seen it twice in 3 months now.

I'll check the commit you mentioned. According to the LKML this has only made it into the unreleased 6.6 series. I could backport it to 6.1, ask a maintainer or it might be on its way to 6.5 already?

I'm undecided what the best option for  trying the commit is because both machines that were affected are in production use with customers.
Comment 14 Christian Theune 2023-09-29 05:01:18 UTC
Forget my last question! I went through the changelogs more closely (my search engine didn't properly match the commit globally) and found that it made its way into 6.1.53.

Which means this should show up soon in our environment.
Comment 15 Thomas Walker 2023-10-05 14:31:49 UTC
We seem to be hitting this with quite some regularity, also on 6.1(.44).  Including after applying the proposed patch (cbc02854331e ("XArray: Do not return sibling entries from xa_load()")

Curious whether this has actually solved the problem for anyone else?
(This issue should also probably be changed from XFS, which we know now is just a victim, to something in the memory management subsystem to get proper visibility).
Comment 16 Ivan Mironov 2023-10-08 17:35:40 UTC
It looks like this is still happening with 6.5.5, which includes cbc02854331e ("XArray: Do not return sibling entries from xa_load()").

[    0.000000] Linux version 6.5.5-200.fc38.x86_64 (mockbuild@d4d01d62c9c942e59de1ef4aa94df5a2) (gcc (GCC) 13.2.1 20230728 (Red Hat 13.2.1-1), GNU ld version 2.39-9.fc38) #1 SMP PREEMPT_DYNAMIC Sun Sep 24 15:52:44 UTC 2023
[    0.000000] Command line: BOOT_IMAGE=(md/boot)/vmlinuz-6.5.5-200.fc38.x86_64 root=/dev/mapper/vg--bmsolv-root ro rd.md.uuid=216337a3:789c28b0:81fbad29:6f190e56 rd.lvm.lv=vg-bmsolv/root rd.md.uuid=252001b9:2095e731:f1dd5baa:8b672d56 clocksource=tsc tsc=reliable amd_pstate=active rhgb quiet
...
[146990.211120] watchdog: BUG: soft lockup - CPU#20 stuck for 26s! [solAcctHashVer:3040]
[146990.211140] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer irqbypass snd acpi_ipmi cdc_ether rapl wmi_bmof usbnet ipmi_si soundcore i2c_piix4 k10temp mii ipmi_devintf ipmi_msghandler joydev fuse loop xfs raid1 igb dca ast i2c_algo_bit crct10dif_pclmul nvme crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[146990.211199] CPU: 20 PID: 3040 Comm: solAcctHashVer Not tainted 6.5.5-200.fc38.x86_64 #1
[146990.211201] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[146990.211203] RIP: 0010:xas_load+0x11/0x50
[146990.211208] Code: 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 55 53 48 89 fb e8 22 ff ff ff 48 89 c2 <83> e2 03 48 83 fa 02 75 08 48 3d 00 10 00 00 77 07 5b 5d e9 77 77
[146990.211210] RSP: 0018:ffffa80983a7bbb0 EFLAGS: 00000297
[146990.211211] RAX: ffff95d60e099ff2 RBX: ffffa80983a7bbc8 RCX: 000000000000000c
[146990.211213] RDX: ffff95d60e099ff2 RSI: ffff95d4c5326da0 RDI: ffffa80983a7bbc8
[146990.211214] RBP: 0000000000013e21 R08: 0000000000000000 R09: 0000000000002000
[146990.211215] R10: 0000000000000000 R11: ffff95d6a03d6538 R12: 0000000000013e21
[146990.211216] R13: ffff95e11e2d1400 R14: 0000000000013e20 R15: ffffa80983a7be60
[146990.211217] FS:  00007fd0d7fff6c0(0000) GS:ffff95f47ef00000(0000) knlGS:0000000000000000
[146990.211218] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[146990.211219] CR2: 00007fc0fbb12000 CR3: 000000010e2f2000 CR4: 0000000000750ee0
[146990.211221] PKRU: 55555554
[146990.211222] Call Trace:
[146990.211223]  <IRQ>
[146990.211227]  ? watchdog_timer_fn+0x1b8/0x220
[146990.211230]  ? __pfx_watchdog_timer_fn+0x10/0x10
[146990.211232]  ? __hrtimer_run_queues+0x112/0x2b0
[146990.211236]  ? hrtimer_interrupt+0xf8/0x230
[146990.211239]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[146990.211242]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[146990.211245]  </IRQ>
[146990.211245]  <TASK>
[146990.211246]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[146990.211252]  ? xas_load+0x11/0x50
[146990.211254]  filemap_get_read_batch+0x16e/0x250
[146990.211258]  filemap_get_pages+0xa6/0x630
[146990.211263]  filemap_read+0xd9/0x350
[146990.211270]  xfs_file_buffered_read+0x52/0xd0 [xfs]
[146990.211353]  xfs_file_read_iter+0x77/0xe0 [xfs]
[146990.211423]  vfs_read+0x201/0x350
[146990.211429]  ksys_read+0x6f/0xf0
[146990.211433]  do_syscall_64+0x60/0x90
[146990.211437]  ? srso_alias_return_thunk+0x5/0x7f
[146990.211441]  ? exc_page_fault+0x7f/0x180
[146990.211444]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[146990.211448] RIP: 0033:0x7ffb383230ea
[146990.211465] Code: 55 48 89 e5 48 83 ec 20 48 89 55 e8 48 89 75 f0 89 7d f8 e8 e8 79 f8 ff 48 8b 55 e8 48 8b 75 f0 41 89 c0 8b 7d f8 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 2e 44 89 c7 48 89 45 f8 e8 42 7a f8 ff 48 8b
[146990.211467] RSP: 002b:00007fd0d7ff80b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[146990.211469] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffb383230ea
[146990.211471] RDX: 0000000000002000 RSI: 00007fd28b056000 RDI: 000000000000084c
[146990.211472] RBP: 00007fd0d7ff80d0 R08: 0000000000000000 R09: 00000000000000cd
[146990.211474] R10: 00000000d40f7fa1 R11: 0000000000000246 R12: 00007fd28b056000
[146990.211475] R13: 00007fd0d7ffc3e0 R14: 0000000000002000 R15: 0000000000002000
[146990.211480]  </TASK>
[147038.206219] watchdog: BUG: soft lockup - CPU#2 stuck for 27s! [solAcctHashVer:3040]
[147038.206244] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer irqbypass snd acpi_ipmi cdc_ether rapl wmi_bmof usbnet ipmi_si soundcore i2c_piix4 k10temp mii ipmi_devintf ipmi_msghandler joydev fuse loop xfs raid1 igb dca ast i2c_algo_bit crct10dif_pclmul nvme crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[147038.206306] CPU: 2 PID: 3040 Comm: solAcctHashVer Tainted: G             L     6.5.5-200.fc38.x86_64 #1
[147038.206309] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[147038.206310] RIP: 0010:xas_load+0x45/0x50
[147038.206315] Code: 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 48 8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d 00 00 <75> c7 eb d9 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
[147038.206317] RSP: 0018:ffffa80983a7bbe0 EFLAGS: 00000206
[147038.206318] RAX: ffff95f326a6f47a RBX: ffffa80983a7bc20 RCX: 0000000000000002
[147038.206320] RDX: 0000000000000013 RSI: ffff95d60e099ff0 RDI: ffffa80983a7bc20
[147038.206321] RBP: ffff95d60e099ff0 R08: ffffffffffffffc0 R09: 0000000000000000
[147038.206322] R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000013e20
[147038.206323] R13: ffffa80983a7bcb8 R14: ffff95d6a03d66b0 R15: 000000000000000d
[147038.206324] FS:  00007fd0d7fff6c0(0000) GS:ffff95f47ea80000(0000) knlGS:0000000000000000
[147038.206326] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[147038.206327] CR2: 00007fc5390d8980 CR3: 000000010e2f2000 CR4: 0000000000750ee0
[147038.206328] PKRU: 55555554
[147038.206329] Call Trace:
[147038.206331]  <IRQ>
[147038.206335]  ? watchdog_timer_fn+0x1b8/0x220
[147038.206338]  ? __pfx_watchdog_timer_fn+0x10/0x10
[147038.206340]  ? __hrtimer_run_queues+0x112/0x2b0
[147038.206344]  ? hrtimer_interrupt+0xf8/0x230
[147038.206347]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[147038.206350]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[147038.206353]  </IRQ>
[147038.206353]  <TASK>
[147038.206354]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[147038.206361]  ? xas_load+0x45/0x50
[147038.206363]  xas_find+0x162/0x1c0
[147038.206365]  find_lock_entries+0x84/0x270
[147038.206371]  truncate_inode_pages_range+0xd7/0x400
[147038.206379]  evict+0x1b0/0x1d0
[147038.206383]  do_unlinkat+0x177/0x320
[147038.206387]  __x64_sys_unlink+0x42/0x70
[147038.206390]  do_syscall_64+0x60/0x90
[147038.206393]  ? srso_alias_return_thunk+0x5/0x7f
[147038.206395]  ? ksys_write+0xd8/0xf0
[147038.206399]  ? srso_alias_return_thunk+0x5/0x7f
[147038.206401]  ? syscall_exit_to_user_mode+0x2b/0x40
[147038.206403]  ? srso_alias_return_thunk+0x5/0x7f
[147038.206405]  ? do_syscall_64+0x6c/0x90
[147038.206407]  ? srso_alias_return_thunk+0x5/0x7f
[147038.206409]  ? irqtime_account_irq+0x40/0xc0
[147038.206412]  ? srso_alias_return_thunk+0x5/0x7f
[147038.206414]  ? __irq_exit_rcu+0x4b/0xc0
[147038.206417]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[147038.206419] RIP: 0033:0x7ffb38324aab
[147038.206436] Code: f0 ff ff 73 01 c3 48 8b 0d 5a 13 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 29 13 0d 00 f7 d8
[147038.206438] RSP: 002b:00007fd0d7ffb8e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
[147038.206439] RAX: ffffffffffffffda RBX: 000000000000003f RCX: 00007ffb38324aab
[147038.206441] RDX: 000000000000003f RSI: 00007fd0d7ffb8f0 RDI: 00007fd0d7ffb8f0
[147038.206442] RBP: 00007fd0d7ffca40 R08: 0000000000000000 R09: 0000000000000030
[147038.206442] R10: fefefefefefefeff R11: 0000000000000246 R12: 00005642797b0f00
[147038.206443] R13: 00007fd0d7ffc3e0 R14: 000000000000084c R15: 0000000000000000
[147038.206447]  </TASK>
[147066.205694] watchdog: BUG: soft lockup - CPU#2 stuck for 53s! [solAcctHashVer:3040]
[147066.205714] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer irqbypass snd acpi_ipmi cdc_ether rapl wmi_bmof usbnet ipmi_si soundcore i2c_piix4 k10temp mii ipmi_devintf ipmi_msghandler joydev fuse loop xfs raid1 igb dca ast i2c_algo_bit crct10dif_pclmul nvme crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[147066.205773] CPU: 2 PID: 3040 Comm: solAcctHashVer Tainted: G             L     6.5.5-200.fc38.x86_64 #1
[147066.205775] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[147066.205776] RIP: 0010:xas_load+0x45/0x50
[147066.205782] Code: 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 48 8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d 00 00 <75> c7 eb d9 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
[147066.205783] RSP: 0018:ffffa80983a7bbe0 EFLAGS: 00000206
[147066.205785] RAX: ffff95d4c5326da2 RBX: ffffa80983a7bc20 RCX: 0000000000000002
[147066.205786] RDX: 0000000000000038 RSI: ffff95f326a6f478 RDI: ffffa80983a7bc20
[147066.205787] RBP: ffff95f326a6f478 R08: ffffffffffffffc0 R09: 0000000000000000
[147066.205788] R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000013e20
[147066.205789] R13: ffffa80983a7bcb8 R14: ffff95d6a03d66b0 R15: 000000000000000d
[147066.205790] FS:  00007fd0d7fff6c0(0000) GS:ffff95f47ea80000(0000) knlGS:0000000000000000
[147066.205792] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[147066.205793] CR2: 00007fc5390d8980 CR3: 000000010e2f2000 CR4: 0000000000750ee0
[147066.205794] PKRU: 55555554
[147066.205795] Call Trace:
[147066.205797]  <IRQ>
[147066.205801]  ? watchdog_timer_fn+0x1b8/0x220
[147066.205804]  ? __pfx_watchdog_timer_fn+0x10/0x10
[147066.205806]  ? __hrtimer_run_queues+0x112/0x2b0
[147066.205811]  ? hrtimer_interrupt+0xf8/0x230
[147066.205813]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[147066.205816]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[147066.205819]  </IRQ>
[147066.205820]  <TASK>
[147066.205821]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[147066.205827]  ? xas_load+0x45/0x50
[147066.205829]  xas_find+0x162/0x1c0
[147066.205832]  find_lock_entries+0x84/0x270
[147066.205837]  truncate_inode_pages_range+0xd7/0x400
[147066.205846]  evict+0x1b0/0x1d0
[147066.205849]  do_unlinkat+0x177/0x320
[147066.205854]  __x64_sys_unlink+0x42/0x70
[147066.205856]  do_syscall_64+0x60/0x90
[147066.205859]  ? srso_alias_return_thunk+0x5/0x7f
[147066.205862]  ? ksys_write+0xd8/0xf0
[147066.205866]  ? srso_alias_return_thunk+0x5/0x7f
[147066.205868]  ? syscall_exit_to_user_mode+0x2b/0x40
[147066.205870]  ? srso_alias_return_thunk+0x5/0x7f
[147066.205872]  ? do_syscall_64+0x6c/0x90
[147066.205874]  ? srso_alias_return_thunk+0x5/0x7f
[147066.205876]  ? irqtime_account_irq+0x40/0xc0
[147066.205879]  ? srso_alias_return_thunk+0x5/0x7f
[147066.205880]  ? __irq_exit_rcu+0x4b/0xc0
[147066.205884]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[147066.205887] RIP: 0033:0x7ffb38324aab
[147066.205903] Code: f0 ff ff 73 01 c3 48 8b 0d 5a 13 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 29 13 0d 00 f7 d8
[147066.205905] RSP: 002b:00007fd0d7ffb8e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
[147066.205907] RAX: ffffffffffffffda RBX: 000000000000003f RCX: 00007ffb38324aab
[147066.205908] RDX: 000000000000003f RSI: 00007fd0d7ffb8f0 RDI: 00007fd0d7ffb8f0
[147066.205909] RBP: 00007fd0d7ffca40 R08: 0000000000000000 R09: 0000000000000030
[147066.205910] R10: fefefefefefefeff R11: 0000000000000246 R12: 00005642797b0f00
[147066.205911] R13: 00007fd0d7ffc3e0 R14: 000000000000084c R15: 0000000000000000
[147066.205914]  </TASK>
[147070.806580] rcu: INFO: rcu_preempt self-detected stall on CPU
[147070.806586] rcu: 	2-....: (59999 ticks this GP) idle=35c4/1/0x4000000000000000 softirq=43126317/43126325 fqs=14456
[147070.806590] rcu: 	(t=60001 jiffies g=64450853 q=40369 ncpus=32)
[147070.806593] CPU: 2 PID: 3040 Comm: solAcctHashVer Tainted: G             L     6.5.5-200.fc38.x86_64 #1
[147070.806596] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[147070.806597] RIP: 0010:xas_start+0x1e/0xc0
[147070.806603] Code: 90 90 90 90 90 90 90 90 90 90 90 90 48 8b 57 18 48 89 d0 83 e0 03 74 5c 48 81 fa 05 c0 ff ff 76 06 48 83 f8 02 74 46 48 8b 07 <48> 8b 57 08 48 8b 40 08 48 89 c1 83 e1 03 48 83 f9 02 75 08 48 3d
[147070.806604] RSP: 0018:ffffa80983a7bbd8 EFLAGS: 00000213
[147070.806607] RAX: ffff95d6a03d66b8 RBX: ffffa80983a7bc20 RCX: 0000000000000000
[147070.806608] RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffffa80983a7bc20
[147070.806609] RBP: fffffffffffffffe R08: ffffffffffffffc0 R09: 0000000000000000
[147070.806610] R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000013e20
[147070.806611] R13: ffffa80983a7bcb8 R14: ffff95d6a03d66b0 R15: 000000000000000d
[147070.806612] FS:  00007fd0d7fff6c0(0000) GS:ffff95f47ea80000(0000) knlGS:0000000000000000
[147070.806614] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[147070.806615] CR2: 00007fc5390d8980 CR3: 000000010e2f2000 CR4: 0000000000750ee0
[147070.806617] PKRU: 55555554
[147070.806618] Call Trace:
[147070.806619]  <IRQ>
[147070.806623]  ? rcu_dump_cpu_stacks+0xc4/0x100
[147070.806628]  ? rcu_sched_clock_irq+0x4f2/0x1170
[147070.806631]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806634]  ? task_tick_fair+0x2fc/0x3f0
[147070.806637]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806639]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806641]  ? trigger_load_balance+0x73/0x390
[147070.806644]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806647]  ? update_process_times+0x74/0xb0
[147070.806650]  ? tick_sched_handle+0x21/0x60
[147070.806653]  ? tick_sched_timer+0x6f/0x90
[147070.806654]  ? __pfx_tick_sched_timer+0x10/0x10
[147070.806656]  ? __hrtimer_run_queues+0x112/0x2b0
[147070.806660]  ? hrtimer_interrupt+0xf8/0x230
[147070.806662]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[147070.806665]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[147070.806668]  </IRQ>
[147070.806669]  <TASK>
[147070.806670]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[147070.806675]  ? xas_start+0x1e/0xc0
[147070.806677]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806680]  xas_load+0xe/0x50
[147070.806682]  xas_find+0x162/0x1c0
[147070.806684]  find_lock_entries+0x84/0x270
[147070.806689]  truncate_inode_pages_range+0xd7/0x400
[147070.806699]  evict+0x1b0/0x1d0
[147070.806703]  do_unlinkat+0x177/0x320
[147070.806708]  __x64_sys_unlink+0x42/0x70
[147070.806711]  do_syscall_64+0x60/0x90
[147070.806714]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806716]  ? ksys_write+0xd8/0xf0
[147070.806719]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806721]  ? syscall_exit_to_user_mode+0x2b/0x40
[147070.806724]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806726]  ? do_syscall_64+0x6c/0x90
[147070.806728]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806730]  ? irqtime_account_irq+0x40/0xc0
[147070.806732]  ? srso_alias_return_thunk+0x5/0x7f
[147070.806734]  ? __irq_exit_rcu+0x4b/0xc0
[147070.806738]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[147070.806740] RIP: 0033:0x7ffb38324aab
[147070.806756] Code: f0 ff ff 73 01 c3 48 8b 0d 5a 13 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 29 13 0d 00 f7 d8
[147070.806758] RSP: 002b:00007fd0d7ffb8e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
[147070.806760] RAX: ffffffffffffffda RBX: 000000000000003f RCX: 00007ffb38324aab
[147070.806761] RDX: 000000000000003f RSI: 00007fd0d7ffb8f0 RDI: 00007fd0d7ffb8f0
[147070.806762] RBP: 00007fd0d7ffca40 R08: 0000000000000000 R09: 0000000000000030
[147070.806763] R10: fefefefefefefeff R11: 0000000000000246 R12: 00005642797b0f00
[147070.806764] R13: 00007fd0d7ffc3e0 R14: 000000000000084c R15: 0000000000000000
[147070.806768]  </TASK>
[147098.205093] watchdog: BUG: soft lockup - CPU#2 stuck for 82s! [solAcctHashVer:3040]
[147098.205113] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer irqbypass snd acpi_ipmi cdc_ether rapl wmi_bmof usbnet ipmi_si soundcore i2c_piix4 k10temp mii ipmi_devintf ipmi_msghandler joydev fuse loop xfs raid1 igb dca ast i2c_algo_bit crct10dif_pclmul nvme crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[147098.205173] CPU: 2 PID: 3040 Comm: solAcctHashVer Tainted: G             L     6.5.5-200.fc38.x86_64 #1
[147098.205176] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[147098.205177] RIP: 0010:xas_find+0xb3/0x1c0
[147098.205182] Code: b6 53 12 80 fa 40 0f 84 80 00 00 00 0f b6 c2 48 83 c0 04 48 8b 44 c6 08 48 89 c1 83 e1 03 48 83 f9 02 74 52 48 85 c0 74 11 5b <5d> 41 5c e9 c5 67 02 00 48 3d fd 00 00 00 77 ef 83 c2 01 4c 89 c0
[147098.205184] RSP: 0018:ffffa80983a7bc00 EFLAGS: 00000286
[147098.205186] RAX: ffffdf5ac1990000 RBX: fffffffffffffffe RCX: 0000000000000000
[147098.205187] RDX: 0000000000000020 RSI: ffff95d4c5326da0 RDI: ffffa80983a7bc20
[147098.205188] RBP: fffffffffffffffe R08: ffffffffffffffc0 R09: 0000000000000000
[147098.205189] R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000013e20
[147098.205191] R13: ffffa80983a7bcb8 R14: ffff95d6a03d66b0 R15: 000000000000000d
[147098.205192] FS:  00007fd0d7fff6c0(0000) GS:ffff95f47ea80000(0000) knlGS:0000000000000000
[147098.205193] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[147098.205194] CR2: 00007fc5390d8980 CR3: 000000010e2f2000 CR4: 0000000000750ee0
[147098.205196] PKRU: 55555554
[147098.205197] Call Trace:
[147098.205198]  <IRQ>
[147098.205202]  ? watchdog_timer_fn+0x1b8/0x220
[147098.205205]  ? __pfx_watchdog_timer_fn+0x10/0x10
[147098.205207]  ? __hrtimer_run_queues+0x112/0x2b0
[147098.205211]  ? hrtimer_interrupt+0xf8/0x230
[147098.205214]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[147098.205217]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[147098.205220]  </IRQ>
[147098.205220]  <TASK>
[147098.205221]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[147098.205227]  ? xas_find+0xb3/0x1c0
[147098.205230]  find_lock_entries+0x84/0x270
[147098.205235]  truncate_inode_pages_range+0xd7/0x400
[147098.205243]  evict+0x1b0/0x1d0
[147098.205248]  do_unlinkat+0x177/0x320
[147098.205252]  __x64_sys_unlink+0x42/0x70
[147098.205255]  do_syscall_64+0x60/0x90
[147098.205258]  ? srso_alias_return_thunk+0x5/0x7f
[147098.205261]  ? ksys_write+0xd8/0xf0
[147098.205264]  ? srso_alias_return_thunk+0x5/0x7f
[147098.205266]  ? syscall_exit_to_user_mode+0x2b/0x40
[147098.205268]  ? srso_alias_return_thunk+0x5/0x7f
[147098.205270]  ? do_syscall_64+0x6c/0x90
[147098.205272]  ? srso_alias_return_thunk+0x5/0x7f
[147098.205274]  ? irqtime_account_irq+0x40/0xc0
[147098.205276]  ? srso_alias_return_thunk+0x5/0x7f
[147098.205278]  ? __irq_exit_rcu+0x4b/0xc0
[147098.205282]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[147098.205284] RIP: 0033:0x7ffb38324aab
[147098.205302] Code: f0 ff ff 73 01 c3 48 8b 0d 5a 13 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 29 13 0d 00 f7 d8
[147098.205304] RSP: 002b:00007fd0d7ffb8e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
[147098.205306] RAX: ffffffffffffffda RBX: 000000000000003f RCX: 00007ffb38324aab
[147098.205307] RDX: 000000000000003f RSI: 00007fd0d7ffb8f0 RDI: 00007fd0d7ffb8f0
[147098.205308] RBP: 00007fd0d7ffca40 R08: 0000000000000000 R09: 0000000000000030
[147098.205309] R10: fefefefefefefeff R11: 0000000000000246 R12: 00005642797b0f00
[147098.205310] R13: 00007fd0d7ffc3e0 R14: 000000000000084c R15: 0000000000000000
[147098.205313]  </TASK>
[165201.884167] md: data-check of RAID array md126
[165207.917051] md: delaying data-check of md127 until md126 has finished (they share one or more physical units)
[166552.662760] md: md126: data-check done.
[166552.664316] md: data-check of RAID array md127
[166562.948199] md: md127: data-check done.
Comment 17 Ivan Mironov 2023-10-08 22:13:04 UTC
More of it:

[    0.000000] Linux version 6.5.5-200.fc38.x86_64 (mockbuild@d4d01d62c9c942e59de1ef4aa94df5a2) (gcc (GCC) 13.2.1 20230728 (Red Hat 13.2.1-1), GNU ld version 2.39-9.fc38) #1 SMP PREEMPT_DYNAMIC Sun Sep 24 15:52:44 UTC 2023
[    0.000000] Command line: BOOT_IMAGE=(md/boot)/vmlinuz-6.5.5-200.fc38.x86_64 root=/dev/mapper/vg--bmsolv-root ro rd.md.uuid=216337a3:789c28b0:81fbad29:6f190e56 rd.lvm.lv=vg-bmsolv/root rd.md.uuid=252001b9:2095e731:f1dd5baa:8b672d56 clocksource=tsc tsc=reliable amd_pstate=active rhgb quiet
...
[13024.332817] watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [rocksdb:low:6331]
[13024.332841] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif vfat fat intel_rapl_msr intel_rapl_common snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core kvm snd_hwdep snd_pcm snd_timer irqbypass acpi_ipmi rapl snd wmi_bmof cdc_ether ipmi_si usbnet soundcore ipmi_devintf mii k10temp i2c_piix4 ipmi_msghandler joydev fuse loop xfs raid1 igb ast dca i2c_algo_bit crct10dif_pclmul crc32_pclmul crc32c_intel nvme polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[13024.332904] CPU: 8 PID: 6331 Comm: rocksdb:low Not tainted 6.5.5-200.fc38.x86_64 #1
[13024.332906] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[13024.332908] RIP: 0010:xas_load+0x45/0x50
[13024.332914] Code: 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 48 8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d 00 00 <75> c7 eb d9 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
[13024.332917] RSP: 0018:ffff9c6541a1fbc0 EFLAGS: 00000206
[13024.332919] RAX: ffff8ee4773b8b6a RBX: ffff9c6541a1fbd8 RCX: 0000000000000002
[13024.332921] RDX: 0000000000000000 RSI: ffff8ed9bb019ff0 RDI: ffff9c6541a1fbd8
[13024.332923] RBP: ffff8ed9bb019ff0 R08: 0000000000000000 R09: 000000000000131c
[13024.332924] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12: 0000000000009010
[13024.332925] R13: ffff8eee64944900 R14: 000000000000900f R15: ffff9c6541a1fe70
[13024.332927] FS:  00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000) knlGS:0000000000000000
[13024.332929] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13024.332931] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4: 0000000000750ee0
[13024.332933] PKRU: 55555554
[13024.332934] Call Trace:
[13024.332937]  <IRQ>
[13024.332941]  ? watchdog_timer_fn+0x1b8/0x220
[13024.332946]  ? __pfx_watchdog_timer_fn+0x10/0x10
[13024.332949]  ? __hrtimer_run_queues+0x112/0x2b0
[13024.332954]  ? hrtimer_interrupt+0xf8/0x230
[13024.332957]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[13024.332961]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[13024.332965]  </IRQ>
[13024.332966]  <TASK>
[13024.332968]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[13024.332974]  ? xas_load+0x45/0x50
[13024.332976]  filemap_get_read_batch+0x16e/0x250
[13024.332981]  filemap_get_pages+0xa6/0x630
[13024.332984]  ? srso_alias_return_thunk+0x5/0x7f
[13024.332988]  ? srso_alias_return_thunk+0x5/0x7f
[13024.332990]  ? touch_atime+0x48/0x1b0
[13024.332994]  ? srso_alias_return_thunk+0x5/0x7f
[13024.332996]  ? filemap_read+0x329/0x350
[13024.332999]  filemap_read+0xd9/0x350
[13024.333005]  xfs_file_buffered_read+0x52/0xd0 [xfs]
[13024.333107]  xfs_file_read_iter+0x77/0xe0 [xfs]
[13024.333216]  vfs_read+0x201/0x350
[13024.333225]  __x64_sys_pread64+0x98/0xd0
[13024.333229]  do_syscall_64+0x60/0x90
[13024.333232]  ? srso_alias_return_thunk+0x5/0x7f
[13024.333236]  ? __irq_exit_rcu+0x4b/0xc0
[13024.333241]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[13024.333245] RIP: 0033:0x7f32aa721105
[13024.333264] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b
[13024.333266] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[13024.333269] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX: 00007f32aa721105
[13024.333270] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI: 00000000000005c9
[13024.333272] RBP: 00007f32871f90f0 R08: 0000000000000000 R09: 00007f32871f9268
[13024.333273] R10: 000000000900f060 R11: 0000000000000293 R12: 000000000900f060
[13024.333275] R13: 000000000000131c R14: 00007f3283e6fc00 R15: 00007f3299010580
[13024.333280]  </TASK>
[13052.332283] watchdog: BUG: soft lockup - CPU#8 stuck for 52s! [rocksdb:low:6331]
[13052.332303] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif vfat fat intel_rapl_msr intel_rapl_common snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core kvm snd_hwdep snd_pcm snd_timer irqbypass acpi_ipmi rapl snd wmi_bmof cdc_ether ipmi_si usbnet soundcore ipmi_devintf mii k10temp i2c_piix4 ipmi_msghandler joydev fuse loop xfs raid1 igb ast dca i2c_algo_bit crct10dif_pclmul crc32_pclmul crc32c_intel nvme polyval_clmulni polyval_generic nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[13052.332364] CPU: 8 PID: 6331 Comm: rocksdb:low Tainted: G             L     6.5.5-200.fc38.x86_64 #1
[13052.332367] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[13052.332368] RIP: 0010:xas_descend+0x3/0x90
[13052.332374] Code: 00 48 8b 57 10 48 89 07 48 c1 e8 20 48 89 57 08 e9 c2 79 02 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f b6 0e <48> 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48
[13052.332375] RSP: 0018:ffff9c6541a1fbb8 EFLAGS: 00000206
[13052.332377] RAX: ffff8ed9e8c56da2 RBX: ffff9c6541a1fbd8 RCX: 000000000000000c
[13052.332378] RDX: 0000000000000002 RSI: ffff8ed9e8c56da0 RDI: ffff9c6541a1fbd8
[13052.332379] RBP: ffff8ed9e8c56da0 R08: 0000000000000000 R09: 000000000000131c
[13052.332380] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12: 0000000000009010
[13052.332381] R13: ffff8eee64944900 R14: 000000000000900f R15: ffff9c6541a1fe70
[13052.332383] FS:  00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000) knlGS:0000000000000000
[13052.332384] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13052.332385] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4: 0000000000750ee0
[13052.332387] PKRU: 55555554
[13052.332387] Call Trace:
[13052.332389]  <IRQ>
[13052.332391]  ? watchdog_timer_fn+0x1b8/0x220
[13052.332395]  ? __pfx_watchdog_timer_fn+0x10/0x10
[13052.332398]  ? __hrtimer_run_queues+0x112/0x2b0
[13052.332402]  ? hrtimer_interrupt+0xf8/0x230
[13052.332404]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[13052.332407]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[13052.332410]  </IRQ>
[13052.332411]  <TASK>
[13052.332412]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[13052.332418]  ? xas_descend+0x3/0x90
[13052.332420]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332423]  xas_load+0x41/0x50
[13052.332426]  filemap_get_read_batch+0x16e/0x250
[13052.332431]  filemap_get_pages+0xa6/0x630
[13052.332433]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332436]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332438]  ? touch_atime+0x48/0x1b0
[13052.332441]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332443]  ? filemap_read+0x329/0x350
[13052.332445]  filemap_read+0xd9/0x350
[13052.332451]  xfs_file_buffered_read+0x52/0xd0 [xfs]
[13052.332548]  xfs_file_read_iter+0x77/0xe0 [xfs]
[13052.332633]  vfs_read+0x201/0x350
[13052.332641]  __x64_sys_pread64+0x98/0xd0
[13052.332643]  do_syscall_64+0x60/0x90
[13052.332646]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332649]  ? __irq_exit_rcu+0x4b/0xc0
[13052.332652]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[13052.332655] RIP: 0033:0x7f32aa721105
[13052.332671] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b
[13052.332673] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[13052.332675] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX: 00007f32aa721105
[13052.332676] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI: 00000000000005c9
[13052.332677] RBP: 00007f32871f90f0 R08: 0000000000000000 R09: 00007f32871f9268
[13052.332678] R10: 000000000900f060 R11: 0000000000000293 R12: 000000000900f060
[13052.332679] R13: 000000000000131c R14: 00007f3283e6fc00 R15: 00007f3299010580
[13052.332683]  </TASK>
[13059.632827] rcu: INFO: rcu_preempt self-detected stall on CPU
[13059.632832] rcu: 	8-....: (60000 ticks this GP) idle=5d8c/1/0x4000000000000000 softirq=3001365/3001365 fqs=27154
[13059.632836] rcu: 	(t=60001 jiffies g=4535761 q=1170639 ncpus=32)
[13059.632838] CPU: 8 PID: 6331 Comm: rocksdb:low Tainted: G             L     6.5.5-200.fc38.x86_64 #1
[13059.632840] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4U, BIOS P1.20 05/19/2021
[13059.632841] RIP: 0010:xas_load+0x2d/0x50
[13059.632847] Code: fa 55 53 48 89 fb e8 22 ff ff ff 48 89 c2 83 e2 03 48 83 fa 02 75 08 48 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 <48> 8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d
[13059.632849] RSP: 0018:ffff9c6541a1fbc0 EFLAGS: 00000282
[13059.632851] RAX: ffff8ed9bb019ff2 RBX: ffff9c6541a1fbd8 RCX: 0000000000000000
[13059.632852] RDX: 0000000000000002 RSI: ffff8ed9e8c56da0 RDI: ffff9c6541a1fbd8
[13059.632853] RBP: ffff8ed9e8c56da0 R08: 0000000000000000 R09: 000000000000131c
[13059.632854] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12: 0000000000009010
[13059.632855] R13: ffff8eee64944900 R14: 000000000000900f R15: ffff9c6541a1fe70
[13059.632856] FS:  00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000) knlGS:0000000000000000
[13059.632858] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13059.632859] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4: 0000000000750ee0
[13059.632860] PKRU: 55555554
[13059.632861] Call Trace:
[13059.632863]  <IRQ>
[13059.632865]  ? rcu_dump_cpu_stacks+0xc4/0x100
[13059.632869]  ? rcu_sched_clock_irq+0x4f2/0x1170
[13059.632872]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632875]  ? task_tick_fair+0x2fc/0x3f0
[13059.632879]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632881]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632883]  ? trigger_load_balance+0x73/0x390
[13059.632885]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632888]  ? update_process_times+0x74/0xb0
[13059.632891]  ? tick_sched_handle+0x21/0x60
[13059.632894]  ? tick_sched_timer+0x6f/0x90
[13059.632896]  ? __pfx_tick_sched_timer+0x10/0x10
[13059.632897]  ? __hrtimer_run_queues+0x112/0x2b0
[13059.632901]  ? hrtimer_interrupt+0xf8/0x230
[13059.632903]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[13059.632906]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[13059.632909]  </IRQ>
[13059.632910]  <TASK>
[13059.632911]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[13059.632916]  ? xas_load+0x2d/0x50
[13059.632918]  filemap_get_read_batch+0x16e/0x250
[13059.632923]  filemap_get_pages+0xa6/0x630
[13059.632925]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632928]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632929]  ? touch_atime+0x48/0x1b0
[13059.632933]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632935]  ? filemap_read+0x329/0x350
[13059.632937]  filemap_read+0xd9/0x350
[13059.632944]  xfs_file_buffered_read+0x52/0xd0 [xfs]
[13059.633041]  xfs_file_read_iter+0x77/0xe0 [xfs]
[13059.633121]  vfs_read+0x201/0x350
[13059.633127]  __x64_sys_pread64+0x98/0xd0
[13059.633130]  do_syscall_64+0x60/0x90
[13059.633133]  ? srso_alias_return_thunk+0x5/0x7f
[13059.633135]  ? __irq_exit_rcu+0x4b/0xc0
[13059.633139]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[13059.633142] RIP: 0033:0x7f32aa721105
[13059.633158] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c 8b 55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b
[13059.633159] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[13059.633161] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX: 00007f32aa721105
[13059.633162] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI: 00000000000005c9
[13059.633163] RBP: 00007f32871f90f0 R08: 0000000000000000 R09: 00007f32871f9268
[13059.633164] R10: 000000000900f060 R11: 0000000000000293 R12: 000000000900f060
[13059.633165] R13: 000000000000131c R14: 00007f3283e6fc00 R15: 00007f3299010580
[13059.633169]  </TASK>
Comment 18 Christian Theune 2023-11-02 15:27:58 UTC
We've updated a while ago and our fleet is not seeing improved results. They've actually seemed to have gotten worse according to the number of alerts we've seen. 

We've had a multitude of crashes in the last weeks with the following statistics:

6.1.31 - 2 affected machines
6.1.35 - 1 affected machine
6.1.37 - 1 affected machine
6.1.51 - 5 affected machines
6.1.55 - 2 affected machines
6.1.57 - 2 affected machines

Here's the more detailed behaviour of one of the machines with 6.1.57.

$ uptime
 16:10:23  up 13 days 19:00,  1 user,  load average: 3.21, 1.24, 0.57

$ uname -a
Linux ts00 6.1.57 #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 10 20:00:46 UTC 2023 x86_64 GNU/Linux

And here' the stall:

[654042.623386] rcu: INFO: rcu_preempt self-detected stall on CPU
[654042.624109] rcu:    1-....: (21079 ticks this GP) idle=380c/1/0x4000000000000000 softirq=136208646/136208648 fqs=7552
[654042.625253]         (t=21000 jiffies g=210623333 q=40912 ncpus=2)
[654042.625871] CPU: 1 PID: 1230375 Comm: nix-build Not tainted 6.1.57 #1-NixOS
[654042.626650] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[654042.627898] RIP: 0010:xas_descend+0x22/0x90
[654042.628379] Code: cc cc cc cc cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 <48> 83 f9 02 75 08 48 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc
[654042.630402] RSP: 0018:ffffa213c4c07bf8 EFLAGS: 00000202
[654042.630993] RAX: ffff8f9da3bca492 RBX: ffffa213c4c07d78 RCX: 0000000000000002
[654042.631782] RDX: 0000000000000004 RSI: ffff8f9eb8700248 RDI: ffffa213c4c07c08
[654042.632570] RBP: 000000000000010f R08: ffffa213c4c07e70 R09: ffff8f9e54dc2138
[654042.633352] R10: ffffa213c4c07e68 R11: ffff8f9e54dc2138 R12: 000000000000010f
[654042.634140] R13: ffff8f9d44c7ad00 R14: 0000000000000100 R15: ffffa213c4c07e98
[654042.634934] FS:  00007faf9514ff80(0000) GS:ffff8f9ebad00000(0000) knlGS:0000000000000000
[654042.635823] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[654042.636468] CR2: 00007faf78168000 CR3: 00000000366d2000 CR4: 00000000000006e0
[654042.637264] Call Trace:
[654042.637560]  <IRQ>
[654042.637809]  ? rcu_dump_cpu_stacks+0xc8/0x100
[654042.638305]  ? rcu_sched_clock_irq.cold+0x15b/0x2fb
[654042.638862]  ? sched_slice+0x87/0x140
[654042.639281]  ? timekeeping_update+0xdd/0x130
[654042.639781]  ? __cgroup_account_cputime_field+0x5b/0xa0
[654042.640363]  ? update_process_times+0x77/0xb0
[654042.640862]  ? update_wall_time+0xc/0x20
[654042.641305]  ? tick_sched_handle+0x34/0x50
[654042.641773]  ? tick_sched_timer+0x6f/0x80
[654042.642224]  ? tick_sched_do_timer+0xa0/0xa0
[654042.642710]  ? __hrtimer_run_queues+0x112/0x2b0
[654042.643220]  ? hrtimer_interrupt+0xfe/0x220
[654042.643703]  ? __sysvec_apic_timer_interrupt+0x7f/0x170
[654042.644286]  ? sysvec_apic_timer_interrupt+0x99/0xc0
[654042.644849]  </IRQ>
[654042.645101]  <TASK>
[654042.645353]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[654042.645956]  ? xas_descend+0x22/0x90
[654042.646366]  xas_load+0x30/0x40
[654042.646738]  filemap_get_read_batch+0x16e/0x250
[654042.647253]  filemap_get_pages+0xa9/0x630
[654042.647714]  filemap_read+0xd2/0x340
[654042.648124]  ? __mod_memcg_lruvec_state+0x6e/0xd0
[654042.648670]  xfs_file_buffered_read+0x4f/0xd0 [xfs]
[654042.649307]  xfs_file_read_iter+0x6a/0xd0 [xfs]
[654042.649887]  vfs_read+0x23c/0x310
[654042.650276]  ksys_read+0x6b/0xf0
[654042.650658]  do_syscall_64+0x3a/0x90
[654042.651071]  entry_SYSCALL_64_after_hwframe+0x64/0xce
[654042.651650] RIP: 0033:0x7faf968ee78c
[654042.652085] Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 a9 bb f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 48 89 44 24 08 e8 ff bb f8 ff 48
[654042.654113] RSP: 002b:00007fff8d7e72e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[654042.654954] RAX: ffffffffffffffda RBX: 00005572a3d2c5f0 RCX: 00007faf968ee78c
[654042.655745] RDX: 0000000000010000 RSI: 00005572a3d2c5f0 RDI: 000000000000000c
[654042.656540] RBP: 00007fff8d7e7380 R08: 0000000000000000 R09: 0000000000000000
[654042.657327] R10: 0000000000000022 R11: 0000000000000246 R12: 000000000000000c
[654042.658119] R13: 00007faf96dfe6a8 R14: 0000000000000001 R15: 0000000000000001
[654042.658916]  </TASK>

In previous situations this self-detected stall only happened after other errors occured before them, afaict this is now happening "standalone" without those other errors, maybe this is new info?
Comment 19 Christian Theune 2023-11-02 15:29:36 UTC
As Dave pointed out that this is more of a MM issue, I'm moving this to MM but I'm not sure what the correct component would be, so I placed it in "other". Maybe this helps getting it to the right eyeballs. :)
Comment 20 Christian Theune 2023-11-02 16:23:02 UTC
A collague noticed that the assignee is still set to the XFS default virtual assignee, but that's something I can't change ...
Comment 21 Dave Chinner 2023-11-02 20:59:09 UTC
On Thu, Nov 02, 2023 at 03:27:58PM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217572
> 
> --- Comment #18 from Christian Theune (ct@flyingcircus.io) ---
> We've updated a while ago and our fleet is not seeing improved results.
> They've
> actually seemed to have gotten worse according to the number of alerts we've
> seen. 

This is still an unreproducable, unfixed bug in upstream kernels.
There is no known reproducer, so actually triggering it and hence
performing RCA is extremely difficult at this point in time. We don't
really even know what workload triggers it.

> We've had a multitude of crashes in the last weeks with the following
> statistics:
> 
> 6.1.31 - 2 affected machines
> 6.1.35 - 1 affected machine
> 6.1.37 - 1 affected machine
> 6.1.51 - 5 affected machines
> 6.1.55 - 2 affected machines
> 6.1.57 - 2 affected machines

Do these machines have ECC memory?

> Here's the more detailed behaviour of one of the machines with 6.1.57.
> 
> $ uptime
>  16:10:23  up 13 days 19:00,  1 user,  load average: 3.21, 1.24, 0.57

Yeah, that's the problem - such a rare, one off issue that we don't
really even know where to begin looking. :(

Given you seem to have a workload that occasionally triggers it,
could you try to craft a reproducer workload that does stuff similar
to your production workload and see if you can find out something
that makes this easier to trigger?

> $ uname -a
> Linux ts00 6.1.57 #1-NixOS SMP PREEMPT_DYNAMIC Tue Oct 10 20:00:46 UTC 2023
> x86_64 GNU/Linux
> 
> And here' the stall:
....
> [654042.645101]  <TASK>
> [654042.645353]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [654042.645956]  ? xas_descend+0x22/0x90
> [654042.646366]  xas_load+0x30/0x40
> [654042.646738]  filemap_get_read_batch+0x16e/0x250
> [654042.647253]  filemap_get_pages+0xa9/0x630
> [654042.647714]  filemap_read+0xd2/0x340
> [654042.648124]  ? __mod_memcg_lruvec_state+0x6e/0xd0
> [654042.648670]  xfs_file_buffered_read+0x4f/0xd0 [xfs]

This implies you are using memcg to constrain memory footprint of
the applications? Are these workloads running in memcgs that
experience random memcg OOM conditions? Or maybe the failure
correlates with global OOM conditions triggering memcg reclaim?

Cheers,

Dave.
Comment 22 Christian Theune 2023-11-03 12:52:34 UTC
(In reply to Dave Chinner from comment #21)
> 
> This is still an unreproducable, unfixed bug in upstream kernels.
> There is no known reproducer, so actually triggering it and hence
> performing RCA is extremely difficult at this point in time. We don't
> really even know what workload triggers it.

It seems IO-pressure related and we've seen it multiple times with various PostgreSQL activities.

I've planned time for next week to analyze this further and trying to help establishing a reproducer.

> > We've had a multitude of crashes in the last weeks with the following
> > statistics:
> > 
> > 6.1.31 - 2 affected machines
> > 6.1.35 - 1 affected machine
> > 6.1.37 - 1 affected machine
> > 6.1.51 - 5 affected machines
> > 6.1.55 - 2 affected machines
> > 6.1.57 - 2 affected machines
> 
> Do these machines have ECC memory?

The physical hosts do. The affected systems are all Qemu/KVM virtual machines, though.

> > Here's the more detailed behaviour of one of the machines with 6.1.57.
> > 
> > $ uptime
> >  16:10:23  up 13 days 19:00,  1 user,  load average: 3.21, 1.24, 0.57
> 
> Yeah, that's the problem - such a rare, one off issue that we don't
> really even know where to begin looking. :(
> 
> Given you seem to have a workload that occasionally triggers it,
> could you try to craft a reproducer workload that does stuff similar
> to your production workload and see if you can find out something
> that makes this easier to trigger?

Yup. I'm prioritizing this for the next weeks.


> This implies you are using memcg to constrain memory footprint of
> the applications? Are these workloads running in memcgs that
> experience random memcg OOM conditions? Or maybe the failure
> correlates with global OOM conditions triggering memcg reclaim?

I'll have to read up on what memcg is and whether we're doing anything with it on purpose. At the moment I think this is just whatever we're getting from our baseline environment with kernel or distro defaults. 

How do I notice a memcg OOM? I've always tried to correlate all kernel log messages and haven't seen any other tracebacks than the ones I posted.

Global (so I guess a "regular") OOM wasn't involved in any case so far.

I can try digging deeper into system VM statistics. We're running telegraf/prometheus and have a relatively exhaustive number of system variables we're monitoring on all systems. Anything specific I could look for?

> 
> Cheers,
> 
> Dave.
Comment 23 Christian Theune 2023-11-07 10:11:45 UTC
I'm trying to come up with a reproducer and reviewed our machines. Currently all the affected machines are running one of our database configurations (MySQL or PostgreSQL) and one difference is that they are running with increased readahead (128KiB default on our fleet is being increased to 1 MiB when running MySQL or PostgreSQL).

I also reviewed the RedHat bug and all the linked discussions. The RedHat bug smells like there might be a reproducer as the kernel dumps show uptimes of less than 30 minutes. Also, there's another database system involved as I've seen RocksDB (and compactions) mentioned there.

Using the PostgreSQL benchmark did not trigger anything for me in less than 1 hour even under memory pressure. I'm going to check out the rocksdb benchmark now.
Comment 24 KN 2023-11-07 10:25:59 UTC
Long time lurker here offering a potential workaround.


We experienced near identical kernel issues as mentioned here with a completely different setup. We saw the issue on our OKD cluster (4.12 and 4.13) running on Fedora CoreOS (37 and 38). We had ~70 nodes with a specific workload profile, and of these, anywhere between 1 and 5 would run into this issue each night on our production cluster. These nodes were very IO intensive (druid middlemanager/ingest nodes) but not database related. The persistent volumes that were contributing the majority of the disk IO were configured as xfs. We tried for weeks to reproduce this error but could not.

Whilst we have to accept this is a kernel bug and not an xfs bug, we *resolved* our issues by switching from xfs to ext4. Haven't had a single instance of this error since we migrated our persistent volumes away from xfs. 

3 weeks and counting and not a single failure.
Comment 25 Christian Theune 2023-11-07 14:12:23 UTC
Unfortunately building the rocksdb benchmark utilities is weirdly complicated (on NixOS) and I failed. We tried a number more stress tests with MySQL and PostgreSQL workloads today while stressing IO and swap but could not find a reproducer.

As we've seen slight data corruption issues due to this previously we're going to downgrade to the 5.15 series in our environment for now. 

I'd love to be of more help. If anyone ever can recommend any tests that we could perform, then please feel free to reach out. We're happy to help, but I've exhausted our capabilities for now. :/

Note You need to log in before you can comment on or make changes to this bug.