Bug 209155
| Summary: | KVM Linux guest with more than 1 CPU panics after commit 404d5d7bff0d419fe11c7eaebca9ec8f25258f95 on old CPU (Phenom x4) | | |
|---|---|---|---|
| Product: | Virtualization | Reporter: | Paul K. (kronenpj) |
| Component: | kvm | Assignee: | virtualization_kvm |
| Status: | NEW | | |
| Severity: | normal | CC: | seanjc, wanpeng.li |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 5.8.0 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |

Attachments:
- Screen capture from VM.
- Dmesg from host on working kernel.
- lscpu from host
- lscpu from guest
- Output from /sys/module/kvm_amd/parameters/
- Attempt to fix the immediate bug of guest rip going into the weeds

Created attachment 292343 [details]
Dmesg from host on working kernel.

Hardware:
Gigabyte GA-MA780G-UD3H
AMD Phenom(tm) 9600 Quad-Core Processor (family: 0x10, model: 0x2, stepping: 0x2)
8GB RAM

OS: Fedora 32
kernel-5.7.17-200.fc32.x86_64 - Works
kernel-5.8.5-200.fc32.x86_64 - Causes VM crash.

Virtualization Packages:
ipxe-roms-qemu-20190125-4.git36a4c85f.fc32.noarch
libvirt-daemon-6.1.0-4.fc32.x86_64
libvirt-daemon-config-network-6.1.0-4.fc32.x86_64
libvirt-daemon-kvm-6.1.0-4.fc32.x86_64
libvirt-glib-3.0.0-2.fc32.x86_64
libvirt-libs-6.1.0-4.fc32.x86_64
qemu-kvm-4.2.1-1.fc32.x86_64
qemu-kvm-core-4.2.1-1.fc32.x86_64
qemu-system-x86-4.2.1-1.fc32.x86_64
qemu-system-x86-core-4.2.1-1.fc32.x86_64

Please let me know if there's any additional information that could be helpful.

$ git bisect log
    # bad: [bcf876870b95592b52519ed4aafcf9d95999bc9c] Linux 5.8
    git bisect start 'v5.8'
    # good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7
    git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
    # bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
    git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714
    # bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
    git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714
    # bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
    git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714
    # bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
    git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714
    # bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
    git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714
    # bad: [2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63] Merge branch 'uaccess.comedi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
    git bisect bad 2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63
    # good: [cfa3b8068b09f25037146bfd5eed041b78878bee] Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
    git bisect good cfa3b8068b09f25037146bfd5eed041b78878bee
    # good: [c41219fda6e04255c44d37fd2c0d898c1c46abf1] Merge tag 'drm-intel-next-fixes-2020-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
    git bisect good c41219fda6e04255c44d37fd2c0d898c1c46abf1
    # good: [f3cdc8ae116e27d84e1f33c7a2995960cebb73ac] Merge tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
    git bisect good f3cdc8ae116e27d84e1f33c7a2995960cebb73ac
    # good: [f1e455352b6f503532eb3637d0a6d991895e7856] Merge tag 'kgdb-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
    git bisect good f1e455352b6f503532eb3637d0a6d991895e7856
    # bad: [cb953129bfe5c0f2da835a0469930873fb7e71df] kvm: add halt-polling cpu usage stats
    git bisect bad cb953129bfe5c0f2da835a0469930873fb7e71df
    # good: [3754afe7cf7cc3693a9c9ff795e9bd97175ca639] tools/kvm_stat: Add command line switch '-L' to log to file
    git bisect good 3754afe7cf7cc3693a9c9ff795e9bd97175ca639
    # good: [c4e115f08c08cb9f3b70247b42323e40b9afd1fd] kvm/eventfd: remove unneeded conversion to bool
    git bisect good c4e115f08c08cb9f3b70247b42323e40b9afd1fd
    # good: [5b494aea13fe9ec67365510c0d75835428cbb303] KVM: No need to retry for hva_to_pfn_remapped()
    git bisect good 5b494aea13fe9ec67365510c0d75835428cbb303
    # bad: [379a3c8ee44440d5afa505230ed8cb5b0d0e314b] KVM: VMX: Optimize posted-interrupt delivery for timer fastpath
    git bisect bad 379a3c8ee44440d5afa505230ed8cb5b0d0e314b
    # good: [9e826feb8f114964cbdce026340b6cb9bde68a18] KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*
    git bisect good 9e826feb8f114964cbdce026340b6cb9bde68a18
    # good: [2c4c41325540cf3abb12aef142c0e550f6afeffc] KVM: x86: Print symbolic names of VMX VM-Exit flags in traces
    git bisect good 2c4c41325540cf3abb12aef142c0e550f6afeffc
    # bad: [404d5d7bff0d419fe11c7eaebca9ec8f25258f95] KVM: X86: Introduce more exit_fastpath_completion enum values
    git bisect bad 404d5d7bff0d419fe11c7eaebca9ec8f25258f95
    # good: [5a9f54435a488f8a1153efd36cccee3e7e0fc28b] KVM: X86: Introduce kvm_vcpu_exit_request() helper
    git bisect good 5a9f54435a488f8a1153efd36cccee3e7e0fc28b
    # first bad commit: [404d5d7bff0d419fe11c7eaebca9ec8f25258f95] KVM: X86: Introduce more exit_fastpath_completion enum values

Could you dump the lscpu results in both the guest and the host? In addition, could you dump the result from grep . /sys/module/kvm_amd/parameters/* ?

Created attachment 292421 [details]
lscpu from host
Created attachment 292423 [details]
lscpu from guest
Created attachment 292425 [details]
Output from /sys/module/kvm_amd/parameters/

From code inspection, I'm 99% confident the immediate bug is that svm->next_rip is reset in svm_vcpu_run() only after calling svm_exit_handlers_fastpath(), which will cause SVM's skip_emulated_instruction() to write a stale RIP. I don't have AMD hardware to confirm, but this should be reproducible on modern CPUs by loading kvm_amd with nrips=0.

That issue is easy enough to resolve, e.g. simply hoist "svm->next_rip = 0;" up above the fastpath handling. But, there are additional complications with advancing rip in the fastpath as svm_complete_interrupts() consumes rip, e.g. for NMI unmasking logic and event reinjection. Odds are that NMI unmasking will never "fail" as it would require the new rip to match the last IRET rip, which would be very bizarre. Similarly, event reinjection should also be a non-issue in practice as the WRMSR fastpath shouldn't be reachable if KVM was injecting an event.

All that being said, IMO, the safest play would be to first yank out the call to handle_fastpath_set_msr_irqoff() in svm_exit_handlers_fastpath() to ensure a clean base and to provide a safe backport patch, then move svm_complete_interrupts() into svm_vcpu_run(), and finally move the call to svm_exit_handlers_fastpath() down a ways and reenable handle_fastpath_set_msr_irqoff(). Aside from resolving weirdness with rip and fastpath, it would also align VMX and SVM with respect to completing interrupts.

I'm happy to apply a patch. Sadly, that explanation isn't enough guidance for me to create one myself.

Created attachment 292439 [details]
Attempt to fix the immediate bug of guest rip going into the weeds
Attached patch should fix the immediate bug, assuming my guess is correct.
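
The ordering problem described in the analysis above can be illustrated with a small user-space model. This is only a sketch and not kernel code: `struct toy_vcpu`, `toy_skip_instruction()`, `toy_run_exit()`, `toy_demo()` and `TOY_INSN_LEN` are hypothetical names invented for the illustration, and the model assumes that a non-zero cached next_rip is a leftover from handling an earlier exit (as it would be when the hardware supplies no fresh value, e.g. with nrips=0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for per-vCPU state; NOT the kernel's struct vcpu_svm. */
struct toy_vcpu {
    uint64_t rip;      /* guest instruction pointer */
    uint64_t next_rip; /* cached "RIP after this instruction", 0 = unknown */
};

/* Assumed length of the skipped instruction (WRMSR encodes as 2 bytes). */
#define TOY_INSN_LEN 2

/* Models the skip logic: trust the cached next_rip if set, else "decode". */
static void toy_skip_instruction(struct toy_vcpu *v)
{
    if (v->next_rip)
        v->rip = v->next_rip;   /* a stale value sends the guest into the weeds */
    else
        v->rip += TOY_INSN_LEN; /* stands in for decoding the instruction length */
}

/*
 * Models one VM-exit whose fastpath skips an instruction.
 * reset_before_fastpath == false: reset happens after the fastpath (reported ordering).
 * reset_before_fastpath == true:  reset is hoisted above the fastpath (suggested ordering).
 */
static uint64_t toy_run_exit(struct toy_vcpu *v, bool reset_before_fastpath)
{
    if (reset_before_fastpath)
        v->next_rip = 0;
    toy_skip_instruction(v);
    if (!reset_before_fastpath)
        v->next_rip = 0; /* too late: the stale value was already consumed */
    return v->rip;
}

/* Runs both orderings from the same starting state. */
static void toy_demo(void)
{
    /* next_rip still holds a value left over from an earlier exit. */
    struct toy_vcpu buggy = { .rip = 0x2000, .next_rip = 0x1002 };
    struct toy_vcpu fixed = buggy;

    assert(toy_run_exit(&buggy, false) == 0x1002); /* jumps back to the stale RIP */
    assert(toy_run_exit(&fixed, true) == 0x2002);  /* advances past the instruction */
}
```

Calling `toy_demo()` from a `main()` exercises both orderings. The model deliberately ignores the interaction with svm_complete_interrupts() that the analysis also raises.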

(In reply to Sean Christopherson from comment #8)
> From code inspection, I'm 99% confident the immediate bug is that
> svm->next_rip is reset in svm_vcpu_run() only after calling
> svm_exit_handlers_fastpath(), which will cause SVM's
> skip_emulated_instruction() to write a stale RIP. I don't have AMD hardware
> to confirm, but this should be reproducible on modern CPUs by loading
> kvm_amd with nrips=0.
>
> That issue is easy enough to resolve, e.g. simply hoist "svm->next_rip = 0;"
> up above the fastpath handling. But, there are additional complications
> with advancing rip in the fastpath as svm_complete_interrupts() consumes
> rip, e.g. for NMI unmasking logic and event reinjection. Odds are that NMI
> unmasking will never "fail" as it would require the new rip to match the
> last IRET rip, which would be very bizarre. Similarly, event reinjection
> should also be a non-issue in practice as the WRMSR fastpath shouldn't be
> reachable if KVM was injecting an event.
>
> All that being said, IMO, the safest play would be to first yank out the call
> to handle_fastpath_set_msr_irqoff() in svm_exit_handlers_fastpath() to
> ensure a clean base and to provide a safe backport patch, then move
> svm_complete_interrupts() into svm_vcpu_run(), and finally move the call to
> svm_exit_handlers_fastpath() down a ways and reenable
> handle_fastpath_set_msr_irqoff(). Aside from resolving weirdness with rip
> and fastpath, it would also align VMX and SVM with respect to completing
> interrupts.

Hi Sean, thanks for your analysis, I will send out patches to fix it. :)

Verified fix works in both the bisected revision and v5.9-rc3. I have not tried to apply the three patches sent to the mailing list. Should I?

(In reply to Paul K. from comment #12)
> Verified fix works in both the bisected revision and v5.9-rc3. I have not
> tried to apply the three patches sent to the mailing list. Should I?

Please give them a try; we tested these three patches on AMD ROME with nrips=0.

I'm having a bit of a problem applying the patches cleanly. Both v5.9-rc3 and v5.9-rc4 give the same result.

Patch 1/3 goes fine:

    $ patch -p1 < /net/phenom/export/home2/users/kronenpj/tmp/patch1-3.txt
    patching file arch/x86/kvm/svm/svm.c

Patch 2/3 fails on hunk #3:

    $ patch -p1 < /net/phenom/export/home2/users/kronenpj/tmp/patch2-3.txt
    patching file arch/x86/kvm/svm/svm.c
    Hunk #1 succeeded at 3349 (offset 2 lines).
    Hunk #2 succeeded at 3504 (offset 4 lines).
    Hunk #3 FAILED at 3533.
    1 out of 3 hunks FAILED -- saving rejects to file arch/x86/kvm/svm/svm.c.rej

    $ cat svm.c.rej
    --- arch/x86/kvm/svm/svm.c
    +++ arch/x86/kvm/svm/svm.c
    @@ -3533,6 +3537,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
      svm_handle_mce(svm);
      svm_complete_interrupts(svm);
    + exit_fastpath = svm_exit_handlers_fastpath(vcpu);
      vmcb_mark_all_clean(svm->vmcb);
      return exit_fastpath;

Adding that line manually and continuing with the third patch:

    $ patch -p1 < /net/phenom/export/home2/users/kronenpj/tmp/patch3-3.txt
    patching file arch/x86/kvm/svm/svm.c
    Hunk #2 succeeded at 3536 with fuzz 2 (offset 8 lines).

The patch against v5.9-rc4+ works as expected.

Created attachment 292341 [details] Screen capture from VM. Verified bisecting generic kernel builds and Fedora config-5.8.4-200.fc32.x86_64 config. Used libvirt and qemu-system-x86-4.2.1-1.fc32.x86_64. VM Panic log attached.