Bug 218684
| Summary: | CPU soft lockups in KVM VMs on kernel 6.x after switching hypervisor from C8S to C9S | | |
|---|---|---|---|
| Product: | Virtualization | Reporter: | Frantisek Sumsal (frantisek) |
| Component: | kvm | Assignee: | virtualization_kvm |
| Status: | RESOLVED PATCH_ALREADY_AVAILABLE | | |
| Severity: | normal | CC: | imammedo, seanjc |
| Priority: | P3 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Frantisek Sumsal
2024-04-05 20:33:15 UTC
I'm currently in the middle of moving some of our hypervisors for upstream systemd CI from CentOS Stream 8 to CentOS Stream 9 (as the former will go EOL soon), and I started hitting soft lockups on the guest machines (Arch Linux, both with the "stock" kernel and a mainline one).

The hypervisors are AWS EC2 C5n Metal instances [0] running CentOS Stream, which then run Arch Linux (KVM) VMs (using libvirt via Vagrant) - cpuinfo from one of the guests is at [1].

The "production" hypervisors currently run CentOS Stream 8 (kernel 4.18.0-548.el8.x86_64) and everything is fine. However, after trying to upgrade a couple of them to CentOS Stream 9 (kernel 5.14.0-432.el9.x86_64), the guests started exhibiting frequent soft lockups when running just the systemd unit test suite.

```
...
[   75.796414] kernel: RIP: 0010:pv_native_safe_halt+0xf/0x20
[   75.796421] kernel: Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d 23 db 24 00 fb f4 <c3>
...
[   75.796447] kernel: Call Trace:
[   75.796450] kernel:  <IRQ>
[   75.800549] kernel:  ? watchdog_timer_fn+0x1dd/0x260
[   75.800553] kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
[   75.800556] kernel:  ? __hrtimer_run_queues+0x10f/0x2a0
[   75.800560] kernel:  ? hrtimer_interrupt+0xfa/0x230
[   75.800563] kernel:  ? __sysvec_apic_timer_interrupt+0x55/0x150
[   75.800567] kernel:  ? sysvec_apic_timer_interrupt+0x6c/0x90
[   75.800569] kernel:  </IRQ>
[   75.800569] kernel:  <TASK>
[   75.800571] kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[   75.800590] kernel:  ? pv_native_safe_halt+0xf/0x20
[   75.800593] kernel:  default_idle+0x9/0x20
[   75.800596] kernel:  default_idle_call+0x30/0x100
[   75.800598] kernel:  do_idle+0x1cb/0x210
[   75.800603] kernel:  cpu_startup_entry+0x29/0x30
[   75.800606] kernel:  start_secondary+0x11c/0x140
[   75.800610] kernel:  common_startup_64+0x13e/0x141
[   75.800616] kernel:  </TASK>
```

Comment 1
Sean Christopherson

On Fri, Apr 05, 2024, bugzilla-daemon@kernel.org wrote:
> The "production" hypervisors currently run CentOS Stream 8 (kernel
> 4.18.0-548.el8.x86_64) and everything is fine. However, after trying to
> upgrade a couple of them to CentOS Stream 9 (kernel 5.14.0-432.el9.x86_64),
> the guests started exhibiting frequent soft lockups when running just the
> systemd unit test suite.
<...snip...>

Hmm, the vCPU is stuck in the idle HLT loop, which suggests that the vCPU isn't waking up when it should. But it does obviously get the hrtimer interrupt, so it's not completely hosed.

Are you able to test custom kernels? If so, bisecting the host kernel is likely the easiest way to figure out what's going on. It might not be the _fastest_, but it should be straightforward and shouldn't require much KVM expertise, i.e. it won't require lengthy back-and-forth discussions if no one immediately spots a bug.

And before bisecting, it'd be worth seeing if an upstream host kernel has the same problem, e.g. if upstream works, it might be easier/faster to bisect to a fix than to bisect to a bug.
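[Editor's note: for readers unfamiliar with the workflow Sean suggests, a host-kernel bisect for the *fix* (terms inverted, since newer kernels work and older ones don't) could look roughly like the sketch below. The v6.7/v6.9-rc1 endpoints are assumptions inferred from the kernel versions mentioned later in this thread, not commands taken from the report.]

```sh
# Minimal sketch of bisecting the host kernel for the fix; endpoints are
# illustrative. Run on the hypervisor (or a build box with matching config).
git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
git bisect start --term-new=fixed --term-old=broken
git bisect fixed v6.9-rc1    # assumed good: mainline showed no soft lockups
git bisect broken v6.7       # assumed bad: a 6.7-era host kernel still locked up

# For each candidate git checks out: build, install, reboot the host,
# then re-run the guest test suite that triggers the lockups.
make olddefconfig && make -j"$(nproc)"
sudo make modules_install install

# After rebooting and testing, record the result and repeat:
git bisect fixed    # or: git bisect broken
```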
Comment 2
Frantisek Sumsal

(In reply to Sean Christopherson from comment #1)
> Are you able to test custom kernels? If so, bisecting the host kernel is
> likely the easiest way to figure out what's going on.
<...snip...>

I did some tests over the weekend, and after installing the latest-ish mainline kernel on the host (6.9.0-0.rc1.316.vanilla.fc40.x86_64 - ignore the fc40 part, I was just lazy and used [0] for a quick test) the soft lockups disappear completely. I really should've tried this before filing an issue - I had tried just 6.7.1-0.hs1.hsx.el9.x86_64 (from [1]) and that didn't help, so I mistakenly assumed that it wasn't the host kernel that was at fault. Also, with the mainline kernel on the host, I can now use the "stock" Arch Linux kernel on the guest as well without any soft lockups.

Given that the mainline kernel works as expected, I'll go ahead and move this issue to the RHEL downstream (and bisect the kernel to find the fix). Thanks a lot for nudging me in the right direction!

[0] https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories
[1] https://sig.centos.org/hyperscale/

Comment 3
Sean Christopherson

Given that 6.7 is still broken, my money is on commit d02c357e5bfa ("KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing"). Ugh, which I neglected to mark for stable.

Note, if that's indeed what's to blame, there's also a bug in the kernel's preemption model logic that is contributing to the problems:
https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com

Comment 4
Frantisek Sumsal

(In reply to Sean Christopherson from comment #3)
> Given that 6.7 is still broken, my money is on commit d02c357e5bfa ("KVM:
> x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing").
<...snip...>

You're absolutely right. I took the C9S kernel RPM (5.14.0-436), slapped d02c357e5bfa on top of it, and after running the same tests as in the previous cases all the soft lockups seem to be magically gone. Thanks a lot!

I'll run a couple more tests, and if they pass I'll go ahead and sum this up in a RHEL report, so the necessary patches make it to C9S/RHEL9.

Comment 5

Thanks for reporting it upstream. As for C9S, it should be fixed in kernel-5.14.0-444.el9:
https://issues.redhat.com/browse/RHEL-17714
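[Editor's note: the check-and-backport step described in comment 4 can be sketched as below. Only the commit ID d02c357e5bfa comes from this thread; the tree path and surrounding steps are illustrative assumptions.]

```sh
# Minimal sketch, assuming a git checkout of the kernel sources being built
# (e.g. the C9S kernel tree); the path "linux" is a placeholder.
cd linux

# Does this tree already contain the suspected fix?
if git merge-base --is-ancestor d02c357e5bfa HEAD; then
    echo "d02c357e5bfa already present"
else
    # Backport it, as the reporter did on top of the C9S 5.14.0-436 sources,
    # then rebuild the kernel package and re-run the failing test suite.
    git cherry-pick -x d02c357e5bfa
fi
```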