Bug 215562

Summary: BUG: unable to handle page fault in cache_reap
Product: Memory Management Reporter: Patrick Schaaf (kernelorg)
Component: Slab Allocator    Assignee: Andrew Morton (akpm)
Status: NEW
Severity: normal CC: regressions
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.10.93 Subsystem:
Regression: No Bisected commit-id:

Description Patrick Schaaf 2022-02-02 15:33:40 UTC
We've been running self-built 5.10.x kernels on DL380 hosts for quite a while, also inside the VMs there.

With, I think, 5.10.90 about three weeks ago, we experienced a lockup while unmounting a large, dirty filesystem on the host side; unfortunately we did not capture a backtrace back then.

Today something similar happened again, on a machine running 5.10.93 both on the host and inside its 10 VMs.

The problem showed up shortly (minutes) after shutting down one of the VMs (a few hundred GB of memory/dataset; the VM shutdown had already completed; direct I/O), followed by some LVM volume renames and a quick ext4 mount on the host, then an umount (8 GB volume, probably only a few hundred megabytes to write). Monitoring actually suggests that disk writes were already done about a minute before the onset.

What we then experienced was the following BUG, followed by one CPU after another saying goodbye with soft-lockup messages over the course of a few minutes; meanwhile there was no more pinging the box, logging in on the console, etc. We hard power-cycled it and it recovered fully.

Here's the BUG that was logged; if it would be useful for someone to see the follow-up soft-lockup messages, tell me and I'll add them.

Feb 02 15:22:27 kvm3j kernel: BUG: unable to handle page fault for address: ffffebde00000008
Feb 02 15:22:27 kvm3j kernel: #PF: supervisor read access in kernel mode
Feb 02 15:22:27 kvm3j kernel: #PF: error_code(0x0000) - not-present page
Feb 02 15:22:27 kvm3j kernel: Oops: 0000 [#1] SMP PTI
Feb 02 15:22:27 kvm3j kernel: CPU: 7 PID: 39833 Comm: kworker/7:0 Tainted: G          I       5.10.93-kvm #1
Feb 02 15:22:27 kvm3j kernel: Hardware name: HP ProLiant DL380p Gen8, BIOS P70 12/20/2013
Feb 02 15:22:27 kvm3j kernel: Workqueue: events cache_reap
Feb 02 15:22:27 kvm3j kernel: RIP: 0010:free_block.constprop.0+0xc0/0x1f0
Feb 02 15:22:27 kvm3j kernel: Code: 4c 8b 16 4c 89 d0 48 01 e8 0f 82 32 01 00 00 4c 89 f2 48 bb 00 00 00 00 00 ea ff ff 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 01 d8 <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 >
Feb 02 15:22:27 kvm3j kernel: RSP: 0018:ffffc9000252bdc8 EFLAGS: 00010086
Feb 02 15:22:27 kvm3j kernel: RAX: ffffebde00000000 RBX: ffffea0000000000 RCX: ffff888889141b00
Feb 02 15:22:27 kvm3j kernel: RDX: 0000777f80000000 RSI: ffff893d3edf3400 RDI: ffff8881000403c0
Feb 02 15:22:27 kvm3j kernel: RBP: 0000000080000000 R08: ffff888100041300 R09: 0000000000000003
Feb 02 15:22:27 kvm3j kernel: R10: 0000000000000000 R11: ffff888100041308 R12: dead000000000122
Feb 02 15:22:27 kvm3j kernel: R13: dead000000000100 R14: 0000777f80000000 R15: ffff893ed8780d60
Feb 02 15:22:27 kvm3j kernel: FS:  0000000000000000(0000) GS:ffff893d3edc0000(0000) knlGS:0000000000000000
Feb 02 15:22:27 kvm3j kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 02 15:22:27 kvm3j kernel: CR2: ffffebde00000008 CR3: 000000048c4aa002 CR4: 00000000001726e0
Feb 02 15:22:27 kvm3j kernel: Call Trace:
Feb 02 15:22:27 kvm3j kernel:  drain_array_locked.constprop.0+0x2e/0x80
Feb 02 15:22:27 kvm3j kernel:  drain_array.constprop.0+0x54/0x70
Feb 02 15:22:27 kvm3j kernel:  cache_reap+0x6c/0x100
Feb 02 15:22:27 kvm3j kernel:  process_one_work+0x1cf/0x360
Feb 02 15:22:27 kvm3j kernel:  worker_thread+0x45/0x3a0
Feb 02 15:22:27 kvm3j kernel:  ? process_one_work+0x360/0x360
Feb 02 15:22:27 kvm3j kernel:  kthread+0x116/0x130
Feb 02 15:22:27 kvm3j kernel:  ? kthread_create_worker_on_cpu+0x40/0x40
Feb 02 15:22:27 kvm3j kernel:  ret_from_fork+0x22/0x30
Feb 02 15:22:27 kvm3j kernel: Modules linked in: hpilo
Feb 02 15:22:27 kvm3j kernel: CR2: ffffebde00000008
Feb 02 15:22:27 kvm3j kernel: ---[ end trace ded3153d86a92898 ]---
Feb 02 15:22:27 kvm3j kernel: RIP: 0010:free_block.constprop.0+0xc0/0x1f0
Feb 02 15:22:27 kvm3j kernel: Code: 4c 8b 16 4c 89 d0 48 01 e8 0f 82 32 01 00 00 4c 89 f2 48 bb 00 00 00 00 00 ea ff ff 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 01 d8 <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 >
Feb 02 15:22:27 kvm3j kernel: RSP: 0018:ffffc9000252bdc8 EFLAGS: 00010086
Feb 02 15:22:27 kvm3j kernel: RAX: ffffebde00000000 RBX: ffffea0000000000 RCX: ffff888889141b00
Feb 02 15:22:27 kvm3j kernel: RDX: 0000777f80000000 RSI: ffff893d3edf3400 RDI: ffff8881000403c0
Feb 02 15:22:27 kvm3j kernel: RBP: 0000000080000000 R08: ffff888100041300 R09: 0000000000000003
Feb 02 15:22:27 kvm3j kernel: R10: 0000000000000000 R11: ffff888100041308 R12: dead000000000122
Feb 02 15:22:27 kvm3j kernel: R13: dead000000000100 R14: 0000777f80000000 R15: ffff893ed8780d60
Feb 02 15:22:27 kvm3j kernel: FS:  0000000000000000(0000) GS:ffff893d3edc0000(0000) knlGS:0000000000000000
Feb 02 15:22:27 kvm3j kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 02 15:22:27 kvm3j kernel: CR2: ffffebde00000008 CR3: 000000048c4aa002 CR4: 00000000001726e0
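For what it's worth, the faulting address falls in the vmemmap region (the virtual array of struct page). A small sketch decoding it, assuming the default x86-64 layout (VMEMMAP_START = 0xffffea0000000000, 64-byte struct page, 4 KiB pages) — these are assumptions on my part, not confirmed by the log:

```python
# Decode the faulting vmemmap address from the oops above.
# Assumes default x86-64 VMEMMAP_START and sizeof(struct page) == 64.
VMEMMAP_START = 0xFFFFEA0000000000
STRUCT_PAGE_SIZE = 64

fault_addr = 0xFFFFEBDE00000008            # CR2 from the oops
page_addr = fault_addr & ~0x3F             # struct page base (matches RAX)
pfn = (page_addr - VMEMMAP_START) // STRUCT_PAGE_SIZE
phys = pfn << 12                           # 4 KiB pages

print(hex(pfn))                            # implied page frame number
print(hex(phys))                           # implied physical address
print(phys / 2**40, "TiB")                 # -> 119.5 TiB
```

The implied physical address (~119 TiB) is far beyond any plausible installed RAM, which suggests free_block() followed a corrupted object pointer and dereferenced field offset 8 of the struct page for a bogus pfn.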
Comment 1 Patrick Schaaf 2022-02-02 15:54:44 UTC
Upon checking the previous event we experienced, I actually found precisely the same place faulting, with a 5.10.90 kernel.

We had been running earlier 5.10.x kernels (.80 before that) for a long while, so this is probably something new.

The event on January 13th was on a DL385 Gen10 Plus with a 64-core AMD processor; the one today was on a DL380 Gen8 with a 12-core Intel processor. That pretty much rules out hardware.

I will attach both of the captured kernel message logs.
Comment 2 Patrick Schaaf 2022-02-02 16:59:17 UTC
And it happened a third time. Again a 5.10.93 kernel, and similar work on our side.

To complete the "what we did" picture:
 * WHILE a set of VMs, including one using hundreds of GB of memory, was running (fine)
 * we created a fresh ext4 filesystem on the host, mounted it, and filled it via rsync to the tune of another few hundred GB
 * I/O from that settled (nr_dirty on the host down to nothing)
 * a larger VM this time round was shut down intentionally and actually stopped; qemu had been terminated for a few minutes
 * we unmounted the large new ext4 filesystem on the host
 * which also completed (from the userlevel/shell view) but was soon followed by the BUG + hang
Comment 3 Patrick Schaaf 2022-03-14 08:16:02 UTC
I meanwhile switched to 5.15, and also from SLAB to SLUB, and didn't experience these symptoms anymore.
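(For context: in 5.10, the SLAB/SLUB choice is a build-time option under "General setup" -> "Choose SLAB allocator" in Kconfig. A sketch of the .config change such a switch involves — a fragment only, not the actual config used here:)

```
# .config fragment: build with SLUB instead of SLAB
# CONFIG_SLAB is not set
CONFIG_SLUB=y
```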
Comment 4 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-03-21 09:29:55 UTC
Sorry, I lost track of this for some time.

(In reply to Patrick Schaaf from comment #3)
> I meanwhile switched to 5.15, and also from SLAB to SLUB, and didn't
> experience these symptoms anymore.

I just wanted to get this rolling again, but in that case I guess it might not be worth it, as without a bisection it will be hard to find anyone willing to look into this. Is that okay with you?
Comment 5 Patrick Schaaf 2022-03-21 09:50:40 UTC
Fine with me, Thorsten, and thanks for the attempt at pushing this on.

I'm totally happy with having switched to 5.15, my workloads love it.