Bug 209155 - KVM Linux guest with more than 1 CPU panics after commit 404d5d7bff0d419fe11c7eaebca9ec8f25258f95 on old CPU (Phenom x4)
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-04 22:35 UTC by Paul K.
Modified: 2020-09-10 12:12 UTC
CC List: 2 users

See Also:
Kernel Version: 5.8.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Screen capture from VM. (20.57 KB, text/plain)
2020-09-04 22:35 UTC, Paul K.
Dmesg from host on working kernel. (85.43 KB, text/plain)
2020-09-04 22:38 UTC, Paul K.
lscpu from host (1.91 KB, text/plain)
2020-09-08 11:25 UTC, Paul K.
lscpu from guest (1.03 KB, text/plain)
2020-09-08 11:25 UTC, Paul K.
Output from /sys/module/kvm_amd/parameters/ (640 bytes, text/plain)
2020-09-08 11:26 UTC, Paul K.
Attempt to fix the immediate bug of guest rip going into the weeds (1.05 KB, patch)
2020-09-08 22:58 UTC, Sean Christopherson

Description Paul K. 2020-09-04 22:35:50 UTC
Created attachment 292341 [details]
Screen capture from VM.

Verified by bisecting both generic kernel builds and builds using the Fedora config-5.8.4-200.fc32.x86_64 config. Used libvirt and qemu-system-x86-4.2.1-1.fc32.x86_64.

VM panic log attached.
Comment 1 Paul K. 2020-09-04 22:38:00 UTC
Created attachment 292343 [details]
Dmesg from host on working kernel.
Comment 2 Paul K. 2020-09-04 22:45:34 UTC
Hardware:
Gigabyte GA-MA780G-UD3H
AMD Phenom(tm) 9600 Quad-Core Processor (family: 0x10, model: 0x2, stepping: 0x2)
8GB RAM

OS:
Fedora 32
kernel-5.7.17-200.fc32.x86_64 - Works
kernel-5.8.5-200.fc32.x86_64 - Causes VM crash.

Virtualization Packages:
ipxe-roms-qemu-20190125-4.git36a4c85f.fc32.noarch
libvirt-daemon-6.1.0-4.fc32.x86_64
libvirt-daemon-config-network-6.1.0-4.fc32.x86_64
libvirt-daemon-kvm-6.1.0-4.fc32.x86_64
libvirt-glib-3.0.0-2.fc32.x86_64
libvirt-libs-6.1.0-4.fc32.x86_64
qemu-kvm-4.2.1-1.fc32.x86_64
qemu-kvm-core-4.2.1-1.fc32.x86_64
qemu-system-x86-4.2.1-1.fc32.x86_64
qemu-system-x86-core-4.2.1-1.fc32.x86_64

Please let me know if there's any additional information that could be helpful.
Comment 3 Paul K. 2020-09-04 22:47:11 UTC
$ git bisect log
# bad: [bcf876870b95592b52519ed4aafcf9d95999bc9c] Linux 5.8
git bisect start 'v5.8'
# good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7
git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
# bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714
# bad: [2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63] Merge branch 'uaccess.comedi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect bad 2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63
# good: [cfa3b8068b09f25037146bfd5eed041b78878bee] Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good cfa3b8068b09f25037146bfd5eed041b78878bee
# good: [c41219fda6e04255c44d37fd2c0d898c1c46abf1] Merge tag 'drm-intel-next-fixes-2020-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
git bisect good c41219fda6e04255c44d37fd2c0d898c1c46abf1
# good: [f3cdc8ae116e27d84e1f33c7a2995960cebb73ac] Merge tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect good f3cdc8ae116e27d84e1f33c7a2995960cebb73ac
# good: [f1e455352b6f503532eb3637d0a6d991895e7856] Merge tag 'kgdb-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
git bisect good f1e455352b6f503532eb3637d0a6d991895e7856
# bad: [cb953129bfe5c0f2da835a0469930873fb7e71df] kvm: add halt-polling cpu usage stats
git bisect bad cb953129bfe5c0f2da835a0469930873fb7e71df
# good: [3754afe7cf7cc3693a9c9ff795e9bd97175ca639] tools/kvm_stat: Add command line switch '-L' to log to file
git bisect good 3754afe7cf7cc3693a9c9ff795e9bd97175ca639
# good: [c4e115f08c08cb9f3b70247b42323e40b9afd1fd] kvm/eventfd: remove unneeded conversion to bool
git bisect good c4e115f08c08cb9f3b70247b42323e40b9afd1fd
# good: [5b494aea13fe9ec67365510c0d75835428cbb303] KVM: No need to retry for hva_to_pfn_remapped()
git bisect good 5b494aea13fe9ec67365510c0d75835428cbb303
# bad: [379a3c8ee44440d5afa505230ed8cb5b0d0e314b] KVM: VMX: Optimize posted-interrupt delivery for timer fastpath
git bisect bad 379a3c8ee44440d5afa505230ed8cb5b0d0e314b
# good: [9e826feb8f114964cbdce026340b6cb9bde68a18] KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*
git bisect good 9e826feb8f114964cbdce026340b6cb9bde68a18
# good: [2c4c41325540cf3abb12aef142c0e550f6afeffc] KVM: x86: Print symbolic names of VMX VM-Exit flags in traces
git bisect good 2c4c41325540cf3abb12aef142c0e550f6afeffc
# bad: [404d5d7bff0d419fe11c7eaebca9ec8f25258f95] KVM: X86: Introduce more exit_fastpath_completion enum values
git bisect bad 404d5d7bff0d419fe11c7eaebca9ec8f25258f95
# good: [5a9f54435a488f8a1153efd36cccee3e7e0fc28b] KVM: X86: Introduce kvm_vcpu_exit_request() helper
git bisect good 5a9f54435a488f8a1153efd36cccee3e7e0fc28b
# first bad commit: [404d5d7bff0d419fe11c7eaebca9ec8f25258f95] KVM: X86: Introduce more exit_fastpath_completion enum values
Comment 4 Wanpeng Li 2020-09-08 00:31:43 UTC
Could you dump the lscpu results from both the guest and the host? In addition, could you dump the output of grep . /sys/module/kvm_amd/parameters/* ?
Comment 5 Paul K. 2020-09-08 11:25:18 UTC
Created attachment 292421 [details]
lscpu from host
Comment 6 Paul K. 2020-09-08 11:25:53 UTC
Created attachment 292423 [details]
lscpu from guest
Comment 7 Paul K. 2020-09-08 11:26:38 UTC
Created attachment 292425 [details]
Output from /sys/module/kvm_amd/parameters/
Comment 8 Sean Christopherson 2020-09-08 17:08:14 UTC
From code inspection, I'm 99% confident the immediate bug is that svm->next_rip is reset in svm_vcpu_run() only after calling svm_exit_handlers_fastpath(), which will cause SVM's skip_emulated_instruction() to write a stale RIP.  I don't have AMD hardware to confirm, but this should be reproducible on modern CPUs by loading kvm_amd with nrips=0.

That issue is easy enough to resolve, e.g. simply hoist "svm->next_rip = 0;" up above the fastpath handling.  But, there are additional complications with advancing rip in the fastpath as svm_complete_interrupts() consumes rip, e.g. for NMI unmasking logic and event reinjection.  Odds are that NMI unmasking will never "fail" as it would require the new rip to match the last IRET rip, which would be very bizarre.  Similarly, event reinjection should also be a non-issue in practice as the WRMSR fastpath shouldn't be reachable if KVM was injecting an event.
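
In rough pseudocode, the suspected ordering and the hoist fix look like this (a simplified sketch of svm_vcpu_run(), not the exact 5.8 source):

static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
{
	struct vcpu_svm *svm = to_svm(vcpu);
	fastpath_t exit_fastpath;

	/* ... VMRUN, GIF handling, register sync ... */

	/*
	 * BUG: the WRMSR fastpath can call skip_emulated_instruction(),
	 * which consumes svm->next_rip, but at this point next_rip still
	 * holds a stale value from a previous exit because it is only
	 * zeroed further down, so the guest rip goes into the weeds.
	 */
	exit_fastpath = svm_exit_handlers_fastpath(vcpu);

	/* FIX: hoist this reset above the fastpath call. */
	svm->next_rip = 0;

	/* ... svm_complete_interrupts() etc. ... */

	return exit_fastpath;
}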

All that being said, IMO, the safest play would be to first yank out the call to handle_fastpath_set_msr_irqoff() in svm_exit_handlers_fastpath() to ensure a clean base and to provide a safe backport patch, then move svm_complete_interrupts() into svm_vcpu_run(), and finally move the call to svm_exit_handlers_fastpath() down a ways and reenable handle_fastpath_set_msr_irqoff().  Aside from resolving weirdness with rip and fastpath, it would also align VMX and SVM with respect to completing interrupts.
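
For reference, after all three steps the tail of svm_vcpu_run() would look roughly like this (ordering only; a sketch consistent with the hunk quoted in comment 14 below):

	/*
	 * Complete interrupts before running the fastpath so the fastpath
	 * sees up-to-date rip/event state, matching what VMX does.
	 */
	svm_complete_interrupts(svm);
	exit_fastpath = svm_exit_handlers_fastpath(vcpu);

	vmcb_mark_all_clean(svm->vmcb);
	return exit_fastpath;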
Comment 9 Paul K. 2020-09-08 22:47:02 UTC
I'm happy to apply a patch. Sadly, that explanation isn't enough guidance for me to create one myself.
Comment 10 Sean Christopherson 2020-09-08 22:58:50 UTC
Created attachment 292439 [details]
Attempt to fix the immediate bug of guest rip going into the weeds

Attached patch should fix the immediate bug, assuming my guess is correct.
Comment 11 Wanpeng Li 2020-09-09 02:35:19 UTC
(In reply to Sean Christopherson from comment #8)
> From code inspection, I'm 99% confident the immediate bug is that
> svm->next_rip is reset in svm_vcpu_run() only after calling
> svm_exit_handlers_fastpath(), which will cause SVM's
> skip_emulated_instruction() to write a stale RIP.  I don't have AMD hardware
> to confirm, but this should be reproducible on modern CPUs by loading
> kvm_amd with nrips=0.
> 
> That issue is easy enough to resolve, e.g. simply hoist "svm->next_rip = 0;"
> up above the fastpath handling.  But, there are additional complications
> with advancing rip in the fastpath as svm_complete_interrupts() consumes
> rip, e.g. for NMI unmasking logic and event reinjection.  Odds are that NMI
> unmasking will never "fail" as it would require the new rip to match the
> last IRET rip, which would be very bizarre.  Similarly, event reinjection
> should also be a non-issue in practice as the WRMSR fastpath shouldn't be
> reachable if KVM was injecting an event.
> 
> All that being said, IMO, the safest play would be to first yank out the call
> to handle_fastpath_set_msr_irqoff() in svm_exit_handlers_fastpath() to
> ensure a clean base and to provide a safe backport patch, then move
> svm_complete_interrupts() into svm_vcpu_run(), and finally move the call to
> svm_exit_handlers_fastpath() down a ways and reenable
> handle_fastpath_set_msr_irqoff().  Aside from resolving weirdness with rip
> and fastpath, it would also align VMX and SVM with respect to completing
> interrupts.

Hi Sean, thanks for your analysis; I will send out patches to fix it. :)
Comment 12 Paul K. 2020-09-09 16:20:55 UTC
Verified fix works in both the bisected revision and v5.9-rc3. I have not tried to apply the three patches sent to the mailing list. Should I?
Comment 13 Wanpeng Li 2020-09-10 00:14:05 UTC
(In reply to Paul K. from comment #12)
> Verified fix works in both the bisected revision and v5.9-rc3. I have not
> tried to apply the three patches sent to the mailing list. Should I?

Please give them a try; we tested these three patches on AMD Rome with nrips=0.
Comment 14 Paul K. 2020-09-10 12:12:58 UTC
I'm having a bit of a problem applying the patches cleanly. Both v5.9-rc3 and v5.9-rc4 give the same result:

Patch 1/3 goes fine:
$ patch -p1 < /net/phenom/export/home2/users/kronenpj/tmp/patch1-3.txt 
patching file arch/x86/kvm/svm/svm.c

Patch 2/3 fails on hunk #3:
$ patch -p1 < /net/phenom/export/home2/users/kronenpj/tmp/patch2-3.txt 
patching file arch/x86/kvm/svm/svm.c
Hunk #1 succeeded at 3349 (offset 2 lines).
Hunk #2 succeeded at 3504 (offset 4 lines).
Hunk #3 FAILED at 3533.
1 out of 3 hunks FAILED -- saving rejects to file arch/x86/kvm/svm/svm.c.rej

$ cat svm.c.rej 
--- arch/x86/kvm/svm/svm.c
+++ arch/x86/kvm/svm/svm.c
@@ -3533,6 +3537,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 		svm_handle_mce(svm);
 
 	svm_complete_interrupts(svm);
+	exit_fastpath = svm_exit_handlers_fastpath(vcpu);
 
 	vmcb_mark_all_clean(svm->vmcb);
 	return exit_fastpath;

After adding that line manually, I continued with the third patch:
$ patch -p1 < /net/phenom/export/home2/users/kronenpj/tmp/patch3-3.txt 
patching file arch/x86/kvm/svm/svm.c
Hunk #2 succeeded at 3536 with fuzz 2 (offset 8 lines).

The patch against v5.9-rc4+ works as expected.
