Bug 215562
| Summary: | BUG: unable to handle page fault in cache_reap | | |
|---|---|---|---|
| Product: | Memory Management | Reporter: | Patrick Schaaf (kernelorg) |
| Component: | Slab Allocator | Assignee: | Andrew Morton (akpm) |
| Status: | NEW --- | | |
| Severity: | normal | CC: | regressions |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 5.10.93 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Patrick Schaaf
2022-02-02 15:33:40 UTC
Upon checking the previous event we experienced, I actually found precisely the same place faulting, with a 5.10.90 kernel. We've been running earlier 5.10.X kernels, .80 before that, for a long while, so that is probably something new. The event on January 13th was a DL385 Gen10plus with AMD processor, 64 cores, the one today was a DL380 Gen8 with Intel processor, 12 cores. Pretty much ruling out hardware.

I will attach both of the captured kernel message logs.

And it happened a third time. Again a 5.10.93 kernel and similar work we did. To complete the "what we did" picture

* WHILE a set of VMs, including one taking hundreds of GB of memory, ran (fine)
* we create a fresh ext4 filesystem on the host, mount it, fill it with rsync to the tune of another few hundred GB
* I/O from that settled (nr_dirty on the host down to nothing)
* larger VM this time round, shut down intentionally, actually stopped, qemu terminated for a few minutes
* umounting the large new ext4 outside
* which also completes (from userlevel / shell view) but is then soon followed by the BUG + hang

I meanwhile switched to 5.15, and also from SLAB to SLUB, and didn't experience these symptoms anymore.

Sorry, I for some time lost track of this.

(In reply to Patrick Schaaf from comment #3)
> I meanwhile switched to 5.15, and also from SLAB to SLUB, and didn't
> experience these symptoms anymore.

I just wanted to get this rolling again, but I guess in that case it might be not worth it, as without a bisection I guess it will be hard to find anyone looking into this. Is that okay for you?

Fine with me Thorsten and thanks for the attempt at pushing this on. I'm totally happy with having switched to 5.15, my workloads love it.