Bug 204209

Summary: kernel 5.2.1: "floating point exception" in qemu with kvm enabled
Product: Virtualization Reporter: Antonio (antdev66)
Component: kvm Assignee: virtualization_kvm
Status: RESOLVED CODE_FIX    
Severity: high CC: bonob, john.ettedgui+kernel, mail
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 5.2.1 Subsystem:
Regression: No Bisected commit-id:

Description Antonio 2019-07-17 14:27:01 UTC
After updating the kernel from 5.1.17 to 5.2.1, when I use the g++ compiler in a qemu guest with kvm enabled, the compiler often aborts while compiling my sources with this error: "exception in floating point".

Moreover, it is not possible to update the BSD system, because the update process (which uses the g++ compiler) exits with the same error.

The same thing happens with some system utilities, like "pkg update".

But if I boot the system with the previous 5.1.17 kernel and launch qemu, as indicated above, everything works fine and I don't detect errors.

Command line is:

/usr/bin/qemu-system-x86_64 -k it -machine accel=kvm -m 4096 -no-fd-bootchk -show-cursor -drive file="vmdragon.img",if=ide,media=disk -boot once=c,menu=off -net none -rtc base=localtime -name "vmdragon" -smp 8 -vga std -device qemu-xhci,id=xhci

I can't test without kvm because the process is very slow, but I think the problem could be in some changes made to the kvm module.

Thanks,
Antonio
Comment 1 Thomas Lambertz 2019-07-18 12:28:42 UTC
I can confirm this issue. It has occurred since 5.2, when the FPU state changes were introduced. I sent an e-mail about this yesterday; it seems I should have included the kvm mailing list:

https://lkml.org/lkml/2019/7/17/758


- Thomas
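
The failure described above might be checked inside the guest without a full g++ build. The following is only a minimal sketch, assuming the corruption shows up as wrong results (or a SIGFPE) in ordinary user-space floating-point code; the function names fp_workload and count_fp_mismatches are hypothetical and not taken from the report or from the linked e-mail.

```c
#include <stdio.h>

/* Deterministic floating-point workload. `volatile` forces every
 * accumulation to be performed at run time, so the FPU/SSE registers
 * are really exercised on each call. */
double fp_workload(void)
{
    volatile double acc = 0.0;
    for (int i = 1; i <= 100000; i++)
        acc += (double)(i % 7 + 1) / (double)i;
    return acc;
}

/* Repeat the workload and count the runs whose result differs from the
 * first one. With intact FPU state every run gives the same value, so
 * the expected return value is 0. */
unsigned long count_fp_mismatches(unsigned long rounds)
{
    const double expected = fp_workload();
    unsigned long mismatches = 0;

    for (unsigned long r = 0; r < rounds; r++) {
        double got = fp_workload();
        if (got != expected) {
            mismatches++;
            fprintf(stderr, "round %lu: got %.17g, expected %.17g\n",
                    r, got, expected);
        }
    }
    return mismatches;
}
```

To use it, call count_fp_mismatches() from a main() with a large round count and treat a non-zero result as a failure. Running several copies in parallel in the guest, while the host is under load, should make preemption of the vCPU threads more likely, but whether this triggers the bug as reliably as the g++ build is an assumption.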
Comment 2 Antonio 2019-07-19 20:46:38 UTC
I tested the patch posted at https://lkml.org/lkml/2019/7/19/644, adjusting only the line offsets for kernel 5.2.1, and it seems to work.


--- a/arch/x86/kvm/x86.c    2019-07-19 20:17:35.358848175 +0200
+++ b/arch/x86/kvm/x86.c    2019-07-19 20:17:17.956692942 +0200
@@ -3264,6 +3264,12 @@

    kvm_x86_ops->vcpu_load(vcpu, cpu);

+
+   // fix floating point error kvm guest
+   if (test_thread_flag(TIF_NEED_FPU_LOAD))
+       switch_fpu_return();
+
+
    /* Apply any externally detected TSC adjustments (due to suspend) */
    if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
        adjust_tsc_offset_host(vcpu, vcpu->arch.tsc_offset_adjustment);
@@ -7955,9 +7961,11 @@
        wait_lapic_expire(vcpu);
    guest_enter_irqoff();

-   fpregs_assert_state_consistent();
-   if (test_thread_flag(TIF_NEED_FPU_LOAD))
-       switch_fpu_return();
+// fix floating point error kvm guest
+//
+//     fpregs_assert_state_consistent();
+//     if (test_thread_flag(TIF_NEED_FPU_LOAD))
+//         switch_fpu_return();

    if (unlikely(vcpu->arch.switch_db_regs)) {
        set_debugreg(0, 7);
Comment 3 Antonio 2019-07-28 12:03:53 UTC
Today I saw the following commit:

>Revert "kvm: x86: Use task structs fpu field for user"
>commit ec269475cba7bcdd1eb8fdf8e87f4c6c81a376fe upstream.
>
>This reverts commit 240c35a3783ab9b3a0afaba0dde7291295680a6b
>("kvm: x86: Use task structs fpu field for user", 2018-11-06).
>The commit is broken and causes QEMU's FPU state to be destroyed
>when KVM_RUN is preempted.
>
>Fixes: 240c35a3783a ("kvm: x86: Use task structs fpu field for user")

applied to the 5.2.4 kernel, and I thought it could be related to the reported bug. However, after recompiling the kernel without the patch from comment 2, the guest reports the "fpu exception" error again: I had to re-apply the patch and recompile the kernel for it to work.
Comment 4 Antonio 2019-07-31 11:15:29 UTC
Tested with Kernel 5.2.5: problem solved.

Thanks,
Antonio