Bug 206215

Summary: QEMU guest crash due to random 'general protection fault' since kernel 5.2.5 on i7-3517UE
Product: Virtualization Reporter: kernel
Component: kvm Assignee: virtualization_kvm
Status: RESOLVED CODE_FIX    
Severity: blocking    
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 5.5.0-0.rc6 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: relevant logs
0001-thread_info-Add-a-debug-hook-to-detect-FPU-changes-w.patch

Description kernel 2020-01-15 21:18:56 UTC
Created attachment 286831 [details]
relevant logs

Since kernel 5.2.5, any QEMU guest fails to start due to a "general protection fault":

[  188.533545] traps: gsd-wacom[1855] general protection fault ip:7fed39b5e7b0 sp:7fff3e349620 error:0 in libglib-2.0.so.0.6200.1[7fed39ae3000+83000]
[  192.002357] traps: gvfs-fuse-sub[1560] general protection fault ip:7f9cd88100b2 sp:7f9cd5db0bf0 error:0 in libglib-2.0.so.0.6200.1[7f9cd87de000+83000]

Please note that kernel 5.2.4 works fine.

Tested guests: Windows Server 2016/2019 and Fedora 31.

The attached logs show the dmesg output of the guests.

The attached host files contain a WARNING thrown upon the first guest start on the hypervisor:

[   49.533713] WARNING: CPU: 3 PID: 966 at arch/x86/kvm/x86.c:7963 kvm_arch_vcpu_ioctl_run+0x1927/0x1ce0 [kvm]
[   49.533714] Modules linked in: vhost_net vhost tap tun xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 iptable_filter iptable_security iptable_raw xt_state xt_conntrack xt_DSCP xt_multiport iptable_mangle xt_TCPMSS xt_tcpmss xt_policy xt_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel sunrpc kvm vfat fat mei_hdcp mei_wdt snd_hda_codec_hdmi iTCO_wdt irqbypass iTCO_vendor_support snd_hda_codec_realtek snd_hda_codec_generic crct10dif_pclmul crc32_pclmul ledtrig_audio snd_hda_intel ghash_clmulni_intel snd_hda_codec intel_cstate intel_uncore snd_hda_core snd_hwdep intel_rapl_perf snd_seq snd_seq_device snd_pcm i2c_i801 r8169 lpc_ich mei_me snd_timer snd mei e1000e soundcore pcc_cpufreq tcp_bbr sch_fq ip_tables xfs i915 libcrc32c i2c_algo_bit drm_kms_helper crc32c_intel drm
[   49.533760]  serio_raw video
[   49.533764] CPU: 3 PID: 966 Comm: CPU 0/KVM Not tainted 5.2.5-200.fc30.x86_64 #1
[   49.533765] Hardware name: CompuLab 0000000-00000/Intense-PC, BIOS IPC_2.2.400.5 X64 03/15/2018
[   49.533784] RIP: 0010:kvm_arch_vcpu_ioctl_run+0x1927/0x1ce0 [kvm]
[   49.533786] Code: 4c 89 e7 e8 1b 0b ff ff 4c 89 e7 e8 d3 8c fe ff 41 83 a4 24 e8 36 00 00 fb e9 bd ed ff ff f0 41 80 4c 24 31 10 e9 a5 ee ff ff <0f> 0b e9 74 ed ff ff 49 8b 84 24 c8 02 00 00 a9 00 00 01 00 0f 84
[   49.533787] RSP: 0018:ffffbe4e423ffd30 EFLAGS: 00010002
[   49.533789] RAX: 0000000000004b00 RBX: 0000000000000000 RCX: ffffa044ce958000
[   49.533790] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   49.533791] RBP: ffffbe4e423ffdd8 R08: 0000000000000000 R09: 00000000000003e8
[   49.533792] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa044d38f8000
[   49.533792] R13: 0000000000000000 R14: ffffbe4e41ccf7b8 R15: 0000000000000000
[   49.533794] FS:  00007f117953f700(0000) GS:ffffa044ee2c0000(0000) knlGS:0000000000000000
[   49.533795] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   49.533796] CR2: 0000000000000000 CR3: 000000040e8c2003 CR4: 00000000001626e0
[   49.533797] Call Trace:
[   49.533817]  kvm_vcpu_ioctl+0x215/0x5c0 [kvm]
[   49.533821]  ? __seccomp_filter+0x7b/0x640
[   49.533824]  ? __switch_to_asm+0x34/0x70
[   49.533826]  ? __switch_to_asm+0x34/0x70
[   49.533827]  ? apic_timer_interrupt+0xa/0x20
[   49.533831]  do_vfs_ioctl+0x405/0x660
[   49.533834]  ksys_ioctl+0x5e/0x90
[   49.533836]  __x64_sys_ioctl+0x16/0x20
[   49.533839]  do_syscall_64+0x5f/0x1a0
[   49.533842]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   49.533844] RIP: 0033:0x7f117d1fb34b
[   49.533845] Code: 0f 1e fa 48 8b 05 3d 9b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 9b 0c 00 f7 d8 64 89 01 48
[   49.533846] RSP: 002b:00007f117953e698 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   49.533848] RAX: ffffffffffffffda RBX: 0000564f2cb65ba0 RCX: 00007f117d1fb34b
[   49.533849] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
[   49.533850] RBP: 00007f1179f20000 R08: 0000564f2b7e5390 R09: 000000000000ffff
[   49.533851] R10: 0000564f2ca7a710 R11: 0000000000000246 R12: 0000000000000001
[   49.533852] R13: 00007f1179f21002 R14: 0000000000000000 R15: 0000564f2bc66e80
[   49.533854] ---[ end trace a562473b18c9b742 ]---
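
For readers decoding the trace above: the userspace registers at the end are consistent with QEMU issuing a KVM_RUN ioctl, and the <0f> 0b bytes marked in the kernel Code: line are the x86 ud2 opcode that WARN traps on. The sketch below is an illustration and not part of the original report. The sketch_* names are hypothetical; the constants (KVMIO = 0xAE, KVM_RUN = _IO(KVMIO, 0x80), ioctl = syscall 16 on x86-64) are assumed from the Linux UAPI headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Linux ioctl number layout on x86: dir (2 bits) | size (14) | type (8) | nr (8). */
static uint32_t sketch_ioc(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
    return (dir << 30) | (size << 16) | (type << 8) | nr;
}

/* KVM_RUN is _IO(KVMIO, 0x80) with KVMIO == 0xAE: no direction, no size. */
static uint32_t sketch_kvm_run_nr(void)
{
    return sketch_ioc(0, 0xAE, 0x80, 0);
}

/* True if the registers from the trace describe ioctl(fd, KVM_RUN, ...) on x86-64. */
static bool sketch_is_kvm_run(uint64_t orig_rax, uint64_t rsi)
{
    return orig_rax == 16 /* __NR_ioctl on x86-64 */ && rsi == sketch_kvm_run_nr();
}

/* True if the two bytes at p encode ud2 (0f 0b). */
static bool sketch_is_ud2(const uint8_t *p)
{
    return p[0] == 0x0f && p[1] == 0x0b;
}
```

With the values from the trace (ORIG_RAX = 0x10, RSI = 0xae80), sketch_is_kvm_run() returns true, which agrees with kvm_arch_vcpu_ioctl_run appearing in the kernel-side RIP.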

/proc/cpuinfo

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Core(TM) i7-3517UE CPU @ 1.70GHz
stepping        : 9
microcode       : 0x1f
cpu MHz         : 828.296
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds
bogomips        : 4389.89
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
Comment 1 Sean Christopherson 2020-01-15 21:52:58 UTC
Created attachment 286833 [details]
0001-thread_info-Add-a-debug-hook-to-detect-FPU-changes-w.patch

+cc Derek, who is hitting the same thing.

On Wed, Jan 15, 2020 at 09:18:56PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> Attached host files contain a WARNING thrown upon first guest start on the
> hypervisor:
> 
> [   49.533713] WARNING: CPU: 3 PID: 966 at arch/x86/kvm/x86.c:7963
> kvm_arch_vcpu_ioctl_run+0x1927/0x1ce0 [kvm]

Between the WARN, which is

  WARN_ON_ONCE(test_thread_flag(TIF_NEED_FPU_LOAD));

and the total diff of arch/x86/kvm for 5.2.4 -> 5.2.5, which is

--- 5.2.4/arch/x86/kvm/x86.c    2020-01-15 13:37:05.154445843 -0800
+++ 5.2.5/arch/x86/kvm/x86.c    2020-01-15 13:37:08.190438719 -0800
@@ -3264,6 +3264,10 @@

        kvm_x86_ops->vcpu_load(vcpu, cpu);

+       fpregs_assert_state_consistent();
+       if (test_thread_flag(TIF_NEED_FPU_LOAD))
+               switch_fpu_return();
+
        /* Apply any externally detected TSC adjustments (due to suspend) */
        if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
                adjust_tsc_offset_host(vcpu, vcpu->arch.tsc_offset_adjustment);
@@ -7955,9 +7959,8 @@
                wait_lapic_expire(vcpu);
        guest_enter_irqoff();

-       fpregs_assert_state_consistent();
-       if (test_thread_flag(TIF_NEED_FPU_LOAD))
-               switch_fpu_return();
+       /* The preempt notifier should have taken care of the FPU already.  */
+       WARN_ON_ONCE(test_thread_flag(TIF_NEED_FPU_LOAD));

        if (unlikely(vcpu->arch.switch_db_regs)) {
                set_debugreg(0, 7); 


that's a big smoking gun pointing at commit ca7e6b286333 ("KVM: X86: Fix
fpu state crash in kvm guest"), which is commit e751732486eb upstream.

1. Can you verify reverting ca7e6b286333 (or e751732486eb in upstream)
   solves the issue?

2. Assuming the answer is yes, on a buggy kernel, can you run with the
   attached patch to try to get debug info?

Comment 2 kernel 2020-01-15 22:15:27 UTC
(In reply to Sean Christopherson from comment #1)

Sean,
Thank you for the quick feedback.
In the zip file I attached, there are dmesg logs from the latest vanilla kernel, which shows the same behavior:
5.5.0-0.rc6.git1.1.vanilla.knurd.1.fc31.x86_64

If I'm rebuilding the kernel, I'd rather spend the time on the most recent one.

Can you confirm that the patch can be applied to kernel 5.5.0-rc6?
Comment 3 derek 2020-01-16 01:15:42 UTC
On 1/15/20 4:52 PM, Sean Christopherson wrote:
> +cc Derek, who is hitting the same thing.
>
> On Wed, Jan 15, 2020 at 09:18:56PM +0000, bugzilla-daemon@bugzilla.kernel.org
> wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=206215
> *snip*
> that's a big smoking gun pointing at commit ca7e6b286333 ("KVM: X86: Fix
> fpu state crash in kvm guest"), which is commit e751732486eb upstream.
>
> 1. Can you verify reverting ca7e6b286333 (or e751732486eb in upstream)
>     solves the issue?
>
> 2. Assuming the answer is yes, on a buggy kernel, can you run with the
>     attached patch to try get debug info?
I did these out of order since I had 5.3.11 built with the patch, ready to go 
for weeks now, waiting for an opportunity to test.

Win10 guest immediately BSOD'ed with:

WARNING: CPU: 2 PID: 9296 at include/linux/thread_info.h:55 
kernel_fpu_begin+0x6b/0xc0

Then I stashed the patch, reverted ca7e6b286333, recompiled, and rebooted.

The guest is now running stably on 5.3.11. I tested my CAD under the guest and did not
experience the crashes that had me stuck at 5.1.
Comment 4 kernel 2020-01-16 01:36:29 UTC
(In reply to derek from comment #3)
> On 1/15/20 4:52 PM, Sean Christopherson wrote:
> > +cc Derek, who is hitting the same thing.
> >
> > On Wed, Jan 15, 2020 at 09:18:56PM +0000,
> bugzilla-daemon@bugzilla.kernel.org
> > wrote:
> >> https://bugzilla.kernel.org/show_bug.cgi?id=206215
> > *snip*
> > that's a big smoking gun pointing at commit ca7e6b286333 ("KVM: X86: Fix
> > fpu state crash in kvm guest"), which is commit e751732486eb upstream.
> >
> > 1. Can you verify reverting ca7e6b286333 (or e751732486eb in upstream)
> >     solves the issue?
> >
> > 2. Assuming the answer is yes, on a buggy kernel, can you run with the
> >     attached patch to try get debug info?
> I did these out of order since I had 5.3.11 built with the patch, ready to
> go

Sean,
I'm not familiar with rebuilding the kernel or applying a patch, but I'm working on it right now so I can provide feedback.

> for weeks now, waiting for an opportunity to test.
> 
> Win10 guest immediately BSOD'ed with:
> 
> WARNING: CPU: 2 PID: 9296 at include/linux/thread_info.h:55 
> kernel_fpu_begin+0x6b/0xc0
> 
> Then stashed the patch, reverted ca7e6b286333, compile, reboot.
> 
> Guest is running stable now on 5.3.11. Did test my CAD under the guest, did
> not 
> experience the crashes that had me stuck at 5.1.

Derek,

Thanks for the update.

I'm still curious about the hypervisor CPU model you have.

On Windows I experienced different behavior, though.
The OS booted, but the SPICE/VNC screen froze randomly.
One of my Windows VMs ended up corrupted when an update it tried to install failed miserably.

Anyway, I will try to provide debug details from Sean's patch as soon as possible.

To be continued ...
Comment 5 Sean Christopherson 2020-01-16 15:38:56 UTC
On Wed, Jan 15, 2020 at 08:08:32PM -0500, Derek Yerger wrote:
> On 1/15/20 4:52 PM, Sean Christopherson wrote:
> >+cc Derek, who is hitting the same thing.
> >
> >On Wed, Jan 15, 2020 at 09:18:56PM +0000,
> bugzilla-daemon@bugzilla.kernel.org wrote:
> >>https://bugzilla.kernel.org/show_bug.cgi?id=206215
> >*snip*
> >that's a big smoking gun pointing at commit ca7e6b286333 ("KVM: X86: Fix
> >fpu state crash in kvm guest"), which is commit e751732486eb upstream.
> >
> >1. Can you verify reverting ca7e6b286333 (or e751732486eb in upstream)
> >    solves the issue?
> >
> >2. Assuming the answer is yes, on a buggy kernel, can you run with the
> >    attached patch to try get debug info?
> I did these out of order since I had 5.3.11 built with the patch, ready to
> go for weeks now, waiting for an opportunity to test.
> 
> Win10 guest immediately BSOD'ed with:
> 
> WARNING: CPU: 2 PID: 9296 at include/linux/thread_info.h:55
> kernel_fpu_begin+0x6b/0xc0

Can you provide the full stack trace of the WARN?  I'm hoping that will
provide a hint as to what's going wrong.

> Then stashed the patch, reverted ca7e6b286333, compile, reboot.
> 
> Guest is running stable now on 5.3.11. Did test my CAD under the guest, did
> not experience the crashes that had me stuck at 5.1.
Comment 6 Sean Christopherson 2020-01-16 18:08:49 UTC
On Thu, Jan 16, 2020 at 07:38:54AM -0800, Sean Christopherson wrote:
> On Wed, Jan 15, 2020 at 08:08:32PM -0500, Derek Yerger wrote:
> > On 1/15/20 4:52 PM, Sean Christopherson wrote:
> > >+cc Derek, who is hitting the same thing.
> > >
> > >On Wed, Jan 15, 2020 at 09:18:56PM +0000,
> bugzilla-daemon@bugzilla.kernel.org wrote:
> > >>https://bugzilla.kernel.org/show_bug.cgi?id=206215
> > >*snip*
> > >that's a big smoking gun pointing at commit ca7e6b286333 ("KVM: X86: Fix
> > >fpu state crash in kvm guest"), which is commit e751732486eb upstream.
> > >
> > >1. Can you verify reverting ca7e6b286333 (or e751732486eb in upstream)
> > >    solves the issue?
> > >
> > >2. Assuming the answer is yes, on a buggy kernel, can you run with the
> > >    attached patch to try get debug info?
> > I did these out of order since I had 5.3.11 built with the patch, ready to
> > go for weeks now, waiting for an opportunity to test.
> > 
> > Win10 guest immediately BSOD'ed with:
> > 
> > WARNING: CPU: 2 PID: 9296 at include/linux/thread_info.h:55
> > kernel_fpu_begin+0x6b/0xc0
> 
> Can you provide the full stack trace of the WARN?  I'm hoping that will
> provide a hint as to what's going wrong.

Aha!  I found at least two cases where TIF_NEED_FPU_LOAD could be set
without the vCPU being preempted.

The comment on fpregs_lock() states that softirq can set TIF_NEED_FPU_LOAD,
which would not be handled by the preempt notifier.
 
  /*
   * Use fpregs_lock() while editing CPU's FPU registers or fpu->state.
   * A context switch will (and softirq might) save CPU's FPU registers to
                           ^^^^^^^^^^^^^^^^^^^
   * fpu->state and set TIF_NEED_FPU_LOAD leaving CPU's FPU registers in
   * a random state.
   */
  static inline void fpregs_lock(void)

The other scenario is from a stack trace from commit f775b13eedee ("x86,kvm:
move qemu/guest FPU switching out to vcpu_run"), which clearly shows that
kernel_fpu_begin() can be invoked without KVM being preempted.

  __warn+0xcb/0xf0
  warn_slowpath_null+0x1d/0x20
  kernel_fpu_disable+0x3f/0x50
  __kernel_fpu_begin+0x49/0x100
  kernel_fpu_begin+0xe/0x10
  crc32c_pcl_intel_update+0x84/0xb0
  crypto_shash_update+0x3f/0x110
  crc32c+0x63/0x8a [libcrc32c]
  dm_bm_checksum+0x1b/0x20 [dm_persistent_data]
  node_prepare_for_write+0x44/0x70 [dm_persistent_data]
  dm_block_manager_write_callback+0x41/0x50 [dm_persistent_data]
  submit_io+0x170/0x1b0 [dm_bufio]
  __write_dirty_buffer+0x89/0x90 [dm_bufio]
  __make_buffer_clean+0x4f/0x80 [dm_bufio]
  __try_evict_buffer+0x42/0x60 [dm_bufio]
  dm_bufio_shrink_scan+0xc0/0x130 [dm_bufio]
  shrink_slab.part.40+0x1f5/0x420
  shrink_node+0x22c/0x320
  do_try_to_free_pages+0xf5/0x330
  try_to_free_pages+0xe9/0x190
  __alloc_pages_slowpath+0x40f/0xba0
  __alloc_pages_nodemask+0x209/0x260
  alloc_pages_vma+0x1f1/0x250
  do_huge_pmd_anonymous_page+0x123/0x660
  handle_mm_fault+0xfd3/0x1330
  __get_user_pages+0x113/0x640
  get_user_pages+0x4f/0x60
  __gfn_to_pfn_memslot+0x120/0x3f0 [kvm]
  try_async_pf+0x66/0x230 [kvm]
  tdp_page_fault+0x130/0x280 [kvm]
  kvm_mmu_page_fault+0x60/0x120 [kvm]
  handle_ept_violation+0x91/0x170 [kvm_intel]
  vmx_handle_exit+0x1ca/0x1400 [kvm_intel]
  
Either of the above explains why pre-e751732486eb code waited until IRQs
are disabled by vcpu_enter_guest() to do switch_fpu_return().

Properly fixing this solely within KVM is going to be somewhat painful.  The
most common case, vcpu_enter_guest(), which is being hit here, is easy
to handle by restoring the switch_fpu_return() that was removed by commit
e751732486eb.  The other obvious case I see is the emulator's access to guest
FPU state, which will effectively require reverting commit 6ab0b9feb82a
("x86,kvm: remove KVM emulator get_fpu / put_fpu") along with new
implementations of the hooks to handle TIF_NEED_FPU_LOAD.
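
A toy userspace model may make the ordering problem easier to see. This is a sketch under stated assumptions, not kernel code: every toy_* name is hypothetical, a single int stands in for the FPU registers, and a bool stands in for TIF_NEED_FPU_LOAD. It compares reloading only at vcpu load time (the 5.2.5 behavior) with reloading right before guest entry (the pre-e751732486eb behavior), when something such as a softirq uses the FPU in between without preempting the vCPU thread.

```c
#include <stdbool.h>

/* Toy model of one task's FPU bookkeeping; not kernel code. */
struct toy_task {
    bool need_fpu_load; /* stands in for TIF_NEED_FPU_LOAD */
    int  saved_fpu;     /* stands in for fpu->state in memory */
    int  cpu_fpu;       /* stands in for the live CPU FPU registers */
    int  warn_count;    /* times the WARN_ON_ONCE condition was observed */
};

/* A softirq (or kernel_fpu_begin() user) saves the registers, marks them
 * stale, and leaves unrelated data in the CPU registers. */
static void toy_kernel_uses_fpu(struct toy_task *t)
{
    if (!t->need_fpu_load) {
        t->saved_fpu = t->cpu_fpu;
        t->need_fpu_load = true;
    }
    t->cpu_fpu = -1; /* registers now hold someone else's data */
}

/* Stand-in for switch_fpu_return(): reload the registers if marked stale. */
static void toy_switch_fpu_return(struct toy_task *t)
{
    if (t->need_fpu_load) {
        t->cpu_fpu = t->saved_fpu;
        t->need_fpu_load = false;
    }
}

enum toy_policy { TOY_RELOAD_AT_LOAD_ONLY, TOY_RELOAD_AT_ENTRY };

/* Returns the FPU value that would be live at guest entry. */
static int toy_run_vcpu(struct toy_task *t, enum toy_policy p, bool softirq_between)
{
    /* Models kvm_arch_vcpu_load(): 5.2.5 reloads here and nowhere else. */
    if (p == TOY_RELOAD_AT_LOAD_ONLY)
        toy_switch_fpu_return(t);

    /* No preemption happens, so the load step above is not repeated. */
    if (softirq_between)
        toy_kernel_uses_fpu(t);

    /* Models vcpu_enter_guest() with IRQs disabled. */
    if (p == TOY_RELOAD_AT_ENTRY)
        toy_switch_fpu_return(t);
    else if (t->need_fpu_load)
        t->warn_count++;

    return t->cpu_fpu;
}
```

In this model, TOY_RELOAD_AT_LOAD_ONLY with softirq_between set enters the guest with clobbered registers and records one warning, while TOY_RELOAD_AT_ENTRY restores the saved value. The emulator path mentioned above is not modeled.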

> > Then stashed the patch, reverted ca7e6b286333, compile, reboot.
> > 
> > Guest is running stable now on 5.3.11. Did test my CAD under the guest, did
> > not experience the crashes that had me stuck at 5.1.
Comment 7 derek 2020-01-16 19:21:27 UTC
On 1/16/20 10:38 AM, Sean Christopherson wrote:
> On Wed, Jan 15, 2020 at 08:08:32PM -0500, Derek Yerger wrote:
>> On 1/15/20 4:52 PM, Sean Christopherson wrote:
>>> +cc Derek, who is hitting the same thing.
>>>
>>> On Wed, Jan 15, 2020 at 09:18:56PM +0000,
>>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=206215
>>> *snip*
>>> that's a big smoking gun pointing at commit ca7e6b286333 ("KVM: X86: Fix
>>> fpu state crash in kvm guest"), which is commit e751732486eb upstream.
>>>
>>> 1. Can you verify reverting ca7e6b286333 (or e751732486eb in upstream)
>>>     solves the issue?
>>>
>>> 2. Assuming the answer is yes, on a buggy kernel, can you run with the
>>>     attached patch to try get debug info?
>> I did these out of order since I had 5.3.11 built with the patch, ready to
>> go for weeks now, waiting for an opportunity to test.
>>
>> Win10 guest immediately BSOD'ed with:
>>
>> WARNING: CPU: 2 PID: 9296 at include/linux/thread_info.h:55
>> kernel_fpu_begin+0x6b/0xc0
> Can you provide the full stack trace of the WARN?  I'm hoping that will
> provide a hint as to what's going wrong.
WARNING: CPU: 2 PID: 9296 at include/linux/thread_info.h:55 kernel_fpu_begin+0x6b/0xc0
Modules linked in: vhost_net(E) vhost(E) macvtap(E) macvlan(E) tap(E) esp4(E) 
xt_CHECKSUM(E) xt_MASQUERADE(E) tun(E) bridge(E) stp(E) llc(E) ip6t_rpfilter(E) 
nf_log_ipv6(E) ip6t_REJECT(E) nf_reject_ipv6>
  mei_hdcp(E) kvm(E) intel_cstate(E) intel_uncore(E) intel_rapl_perf(E) 
eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) rfkill(E) snd_hda_codec_generic(E) 
pcspkr(E) wmi_bmof(E) ledtrig_audio(E) i2c_i801(E) snd>
CPU: 2 PID: 9296 Comm: CPU 1/KVM Tainted: P           OE     5.3.11+ #16
Hardware name: System manufacturer System Product Name/Z170-K, BIOS 3805 05/16/2018
RIP: 0010:kernel_fpu_begin+0x6b/0xc0
Code: f6 40 26 20 75 08 48 8b 10 80 e6 40 74 16 65 48 c7 05 b5 27 fe 70 00 00 00 
00 c3 65 8a 05 a5 27 fe 70 eb c4 80 78 0c 00 74 02 <0f> 0b 48 83 c0 01 f0 80 08 
40 65 48 8b 0c 25 c0 6b 01 00 0f 1f 44
RSP: 0018:ffffb42e0014c7f8 EFLAGS: 00010202
RAX: ffff98f1783a1ec0 RBX: 0000000000000038 RCX: 0000000000000048
RDX: 0000000000000020 RSI: ffff98f1d9a5cb00 RDI: ffff98f1d9a5cb00
RBP: ffffb42e0014caa0 R08: ffffb42e0014cab0 R09: ffffb42e0014c860
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000002ba
R13: ffffb42e0014c860 R14: ffff98f1d36882aa R15: ffff98f1d9a5caa8
FS:  00007f02faffd700(0000) GS:ffff98f286a80000(0000) knlGS:000000f0dd174000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000001f8249cd000 CR3: 000000043d3dc003 CR4: 00000000003626e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  <IRQ>
  gcmaes_crypt_by_sg.constprop.12+0x26e/0x660
  ? 0xffffffffc024547d
  ? __qdisc_run+0x83/0x510
  ? __dev_queue_xmit+0x45e/0x990
  ? ip_finish_output2+0x1a8/0x570
  ? fib4_rule_action+0x61/0x70
  ? fib4_rule_action+0x70/0x70
  ? fib_rules_lookup+0x13f/0x1c0
  ? helper_rfc4106_decrypt+0x82/0xa0
  ? crypto_aead_decrypt+0x40/0x70
  ? crypto_aead_decrypt+0x40/0x70
  ? crypto_aead_decrypt+0x40/0x70
  ? esp_output_tail+0x8f4/0xa5a [esp4]
  ? skb_ext_add+0xd3/0x170
  ? xfrm_input+0x7a6/0x12c0
  ? xfrm4_rcv_encap+0xae/0xd0
  ? xfrm4_transport_finish+0x200/0x200
  ? udp_queue_rcv_one_skb+0x1ba/0x460
  ? udp_unicast_rcv_skb.isra.63+0x72/0x90
  ? __udp4_lib_rcv+0x51b/0xb00
  ? ip_protocol_deliver_rcu+0xd2/0x1c0
  ? ip_local_deliver_finish+0x44/0x50
  ? ip_local_deliver+0xe0/0xf0
  ? ip_protocol_deliver_rcu+0x1c0/0x1c0
  ? ip_rcv+0xbc/0xd0
  ? ip_rcv_finish_core.isra.19+0x380/0x380
  ? __netif_receive_skb_one_core+0x7e/0x90
  ? netif_receive_skb_internal+0x3d/0xb0
  ? napi_gro_receive+0xed/0x150
  ? 0xffffffffc0243c77
  ? net_rx_action+0x149/0x3b0
  ? __do_softirq+0xe4/0x2f8
  ? handle_irq_event_percpu+0x6a/0x80
  ? irq_exit+0xe6/0xf0
  ? do_IRQ+0x7f/0xd0
  ? common_interrupt+0xf/0xf
  </IRQ>
  ? irq_entries_start+0x20/0x660
  ? vmx_get_interrupt_shadow+0x2f0/0x710 [kvm_intel]
  ? kvm_set_msr_common+0xfc7/0x2380 [kvm]
  ? recalibrate_cpu_khz+0x10/0x10
  ? ktime_get+0x3a/0xa0
  ? kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm]
  ? kvm_init+0x6bf/0xd00 [kvm]
  ? __seccomp_filter+0x7a/0x680
  ? do_vfs_ioctl+0xa4/0x630
  ? security_file_ioctl+0x32/0x50
  ? ksys_ioctl+0x60/0x90
  ? __x64_sys_ioctl+0x16/0x20
  ? do_syscall_64+0x5f/0x1a0
  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 9564a1ccad733a90 ]---
WARNING: CPU: 2 PID: 9296 at arch/x86/kvm/x86.c:8060 kvm_set_msr_common+0x2230/0x2380 [kvm]
Modules linked in: vhost_net(E) vhost(E) macvtap(E) macvlan(E) tap(E) esp4(E) 
xt_CHECKSUM(E) xt_MASQUERADE(E) tun(E) bridge(E) stp(E) llc(E) ip6t_rpfilter(E) 
nf_log_ipv6(E) ip6t_REJECT(E) nf_reject_ipv6>
  mei_hdcp(E) kvm(E) intel_cstate(E) intel_uncore(E) intel_rapl_perf(E) 
eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) rfkill(E) snd_hda_codec_generic(E) 
pcspkr(E) wmi_bmof(E) ledtrig_audio(E) i2c_i801(E) snd>
CPU: 2 PID: 9296 Comm: CPU 1/KVM Tainted: P        W  OE     5.3.11+ #16
Hardware name: System manufacturer System Product Name/Z170-K, BIOS 3805 05/16/2018
RIP: 0010:kvm_set_msr_common+0x2230/0x2380 [kvm]
Code: b0 26 00 00 e8 91 9f b5 ce 66 90 bf 06 00 00 00 48 8b b3 88 26 00 00 e8 7e 
9f b5 ce 66 90 83 a3 60 26 00 00 fb e9 e9 ec ff ff <0f> 0b e9 d2 ec ff ff f0 80 
4b 31 10 e9 32 ee ff ff 48 8b 83 98 02
RSP: 0018:ffffb42e03d17d30 EFLAGS: 00010002
RAX: 0000000080004b20 RBX: ffff98f1783abf40 RCX: ffff98f17757f000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffb42e03d17db0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff98f1783abf70
R13: 0000000000000000 R14: 0000000000000000 R15: ffff98f1d8bc6c00
FS:  00007f02faffd700(0000) GS:ffff98f286a80000(0000) knlGS:000000f0dd174000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000001f8249cd000 CR3: 000000043d3dc003 CR4: 00000000003626e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  ? recalibrate_cpu_khz+0x10/0x10
  ? ktime_get+0x3a/0xa0
  kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm]
  kvm_init+0x6bf/0xd00 [kvm]
  ? __seccomp_filter+0x7a/0x680
  do_vfs_ioctl+0xa4/0x630
  ? security_file_ioctl+0x32/0x50
  ksys_ioctl+0x60/0x90
  __x64_sys_ioctl+0x16/0x20
  do_syscall_64+0x5f/0x1a0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f0302457d4b
Code: 0f 1e fa 48 8b 05 3d b1 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 
66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 
48 8b 0d 0d b1 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007f02faffc6c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f03005fd001 RCX: 00007f0302457d4b
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000020
RBP: 0000000000000001 R08: 0000564efe32fa50 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 0000564efe3129c0
R13: 0000000000000000 R14: 00007f03005fc000 R15: 0000564f00123300
---[ end trace 9564a1ccad733a91 ]---
Comment 8 Sean Christopherson 2020-01-16 19:32:45 UTC
On Thu, Jan 16, 2020 at 02:21:25PM -0500, Derek Yerger wrote:
>  <IRQ>
>  gcmaes_crypt_by_sg.constprop.12+0x26e/0x660
>  ? 0xffffffffc024547d
>  ? __qdisc_run+0x83/0x510
>  ? __dev_queue_xmit+0x45e/0x990
>  ? ip_finish_output2+0x1a8/0x570
>  ? fib4_rule_action+0x61/0x70
>  ? fib4_rule_action+0x70/0x70
>  ? fib_rules_lookup+0x13f/0x1c0
>  ? helper_rfc4106_decrypt+0x82/0xa0
>  ? crypto_aead_decrypt+0x40/0x70
>  ? crypto_aead_decrypt+0x40/0x70
>  ? crypto_aead_decrypt+0x40/0x70
>  ? esp_output_tail+0x8f4/0xa5a [esp4]
>  ? skb_ext_add+0xd3/0x170
>  ? xfrm_input+0x7a6/0x12c0
>  ? xfrm4_rcv_encap+0xae/0xd0
>  ? xfrm4_transport_finish+0x200/0x200
>  ? udp_queue_rcv_one_skb+0x1ba/0x460
>  ? udp_unicast_rcv_skb.isra.63+0x72/0x90
>  ? __udp4_lib_rcv+0x51b/0xb00
>  ? ip_protocol_deliver_rcu+0xd2/0x1c0
>  ? ip_local_deliver_finish+0x44/0x50
>  ? ip_local_deliver+0xe0/0xf0
>  ? ip_protocol_deliver_rcu+0x1c0/0x1c0
>  ? ip_rcv+0xbc/0xd0
>  ? ip_rcv_finish_core.isra.19+0x380/0x380
>  ? __netif_receive_skb_one_core+0x7e/0x90
>  ? netif_receive_skb_internal+0x3d/0xb0
>  ? napi_gro_receive+0xed/0x150
>  ? 0xffffffffc0243c77
>  ? net_rx_action+0x149/0x3b0
>  ? __do_softirq+0xe4/0x2f8

Bingo!  Thanks Derek!

>  ? handle_irq_event_percpu+0x6a/0x80
>  ? irq_exit+0xe6/0xf0
>  ? do_IRQ+0x7f/0xd0
>  ? common_interrupt+0xf/0xf
>  </IRQ>
>  ? irq_entries_start+0x20/0x660
>  ? vmx_get_interrupt_shadow+0x2f0/0x710 [kvm_intel]
>  ? kvm_set_msr_common+0xfc7/0x2380 [kvm]
>  ? recalibrate_cpu_khz+0x10/0x10
>  ? ktime_get+0x3a/0xa0
>  ? kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm]
>  ? kvm_init+0x6bf/0xd00 [kvm]
>  ? __seccomp_filter+0x7a/0x680
>  ? do_vfs_ioctl+0xa4/0x630
>  ? security_file_ioctl+0x32/0x50
>  ? ksys_ioctl+0x60/0x90
>  ? __x64_sys_ioctl+0x16/0x20
>  ? do_syscall_64+0x5f/0x1a0
>  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace 9564a1ccad733a90 ]---
> WARNING: CPU: 2 PID: 9296 at arch/x86/kvm/x86.c:8060
> kvm_set_msr_common+0x2230/0x2380 [kvm]
> Modules linked in: vhost_net(E) vhost(E) macvtap(E) macvlan(E) tap(E)
> esp4(E) xt_CHECKSUM(E) xt_MASQUERADE(E) tun(E) bridge(E) stp(E) llc(E)
> ip6t_rpfilter(E) nf_log_ipv6(E) ip6t_REJECT(E) nf_reject_ipv6>
>  mei_hdcp(E) kvm(E) intel_cstate(E) intel_uncore(E) intel_rapl_perf(E)
> eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) rfkill(E) snd_hda_codec_generic(E)
> pcspkr(E) wmi_bmof(E) ledtrig_audio(E) i2c_i801(E) snd>
> CPU: 2 PID: 9296 Comm: CPU 1/KVM Tainted: P        W  OE     5.3.11+ #16
> Hardware name: System manufacturer System Product Name/Z170-K, BIOS 3805
> 05/16/2018
> RIP: 0010:kvm_set_msr_common+0x2230/0x2380 [kvm]
> Code: b0 26 00 00 e8 91 9f b5 ce 66 90 bf 06 00 00 00 48 8b b3 88 26 00 00
> e8 7e 9f b5 ce 66 90 83 a3 60 26 00 00 fb e9 e9 ec ff ff <0f> 0b e9 d2 ec ff
> ff f0 80 4b 31 10 e9 32 ee ff ff 48 8b 83 98 02
> RSP: 0018:ffffb42e03d17d30 EFLAGS: 00010002
> RAX: 0000000080004b20 RBX: ffff98f1783abf40 RCX: ffff98f17757f000
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
> RBP: ffffb42e03d17db0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff98f1783abf70
> R13: 0000000000000000 R14: 0000000000000000 R15: ffff98f1d8bc6c00
> FS:  00007f02faffd700(0000) GS:ffff98f286a80000(0000) knlGS:000000f0dd174000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000001f8249cd000 CR3: 000000043d3dc003 CR4: 00000000003626e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  ? recalibrate_cpu_khz+0x10/0x10
>  ? ktime_get+0x3a/0xa0
>  kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm]
>  kvm_init+0x6bf/0xd00 [kvm]
>  ? __seccomp_filter+0x7a/0x680
>  do_vfs_ioctl+0xa4/0x630
>  ? security_file_ioctl+0x32/0x50
>  ksys_ioctl+0x60/0x90
>  __x64_sys_ioctl+0x16/0x20
>  do_syscall_64+0x5f/0x1a0
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f0302457d4b
> Code: 0f 1e fa 48 8b 05 3d b1 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff
> ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff
> 73 01 c3 48 8b 0d 0d b1 0c 00 f7 d8 64 89 01 48
> RSP: 002b:00007f02faffc6c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00007f03005fd001 RCX: 00007f0302457d4b
> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000020
> RBP: 0000000000000001 R08: 0000564efe32fa50 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 0000564efe3129c0
> R13: 0000000000000000 R14: 00007f03005fc000 R15: 0000564f00123300
> ---[ end trace 9564a1ccad733a91 ]---
>
Comment 9 kernel 2020-01-17 22:43:51 UTC
Sean,

for the record ...

I did:

   git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
   git checkout f5ae2ea6347a308cfe91f53b53682ce635497d0d

   git revert e751732486eb3f159089a64d1901992b1357e7cc

Then I built and installed kernel: 5.5.0-rc6-revert-e751732486eb3f159089a64d1901992b1357e7cc+ #1 SMP Thu Jan 16 13:02:23 EST 2020 x86_64 x86_64 x86_64 GNU/Linux

Guest is stable; no more "general protection fault".

Here is the stack trace with your patch '0001-thread_info-Add-a-debug-hook-to-detect-FPU-changes-w.patch' applied:

[  122.323347] ------------[ cut here ]------------
[  122.323355] WARNING: CPU: 1 PID: 1132 at include/linux/thread_info.h:55 kernel_fpu_begin+0x6b/0xc0
[  122.323356] Modules linked in: vhost_net vhost tap tun xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc rfkill xt_TCPMSS xt_tcpmss xt_nat iptable_nat nf_nat xt_DSCP iptable_mangle iptable_raw iptable_security nf_log_ipv4 nf_log_common xt_policy xt_LOG xt_multiport ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek snd_hda_codec_generic coretemp ledtrig_audio kvm_intel snd_hda_intel sunrpc snd_intel_dspcfg kvm snd_hda_codec snd_hda_core irqbypass mei_wdt mei_hdcp snd_hwdep vfat crct10dif_pclmul snd_seq fat crc32_pclmul snd_seq_device ghash_clmulni_intel iTCO_wdt snd_pcm iTCO_vendor_support intel_cstate snd_timer intel_uncore snd pcspkr intel_rapl_perf mei_me soundcore i2c_i801 lpc_ich mei ata_generic pata_acpi tcp_bbr sch_fq ip_tables xfs libcrc32c i915
[  122.323384]  i2c_algo_bit drm_kms_helper crc32c_intel e1000e drm r8169 serio_raw video
[  122.323389] CPU: 1 PID: 1132 Comm: CPU 2/KVM Not tainted 5.5.0-rc6-thread_info-Add-a-debug-hook-to-detect-FPU-changes-w+ #1
[  122.323390] Hardware name: CompuLab 0000000-00000/Intense-PC, BIOS IPC_2.2.400.5 X64 03/15/2018
[  122.323392] RIP: 0010:kernel_fpu_begin+0x6b/0xc0
[  122.323394] Code: f6 40 26 20 75 08 48 8b 10 80 e6 40 74 16 65 48 c7 05 d5 2b fe 48 00 00 00 00 c3 65 8a 05 c5 2b fe 48 eb c4 80 78 0c 00 74 02 <0f> 0b 48 83 c0 01 f0 80 08 40 65 48 8b 0c 25 c0 8b 01 00 0f 1f 44
[  122.323395] RSP: 0018:ffffa69b80108308 EFLAGS: 00010202
[  122.323396] RAX: ffff8992513ecd00 RBX: 0000000000000088 RCX: ffffdd96904487c0
[  122.323397] RDX: 0000000000000000 RSI: ffff89925fa99b00 RDI: ffff89925fa99b00
[  122.323397] RBP: ffffa69b801085b0 R08: ffffa69b801085c0 R09: ffffa69b80108370
[  122.323398] R10: 0000000000000000 R11: 0000000000000bbe R12: 0000000000000bce
[  122.323399] R13: ffffa69b80108370 R14: 0000000000000000 R15: ffff89925121fbbe
[  122.323400] FS:  00007ff235b52700(0000) GS:ffff89927e240000(0000) knlGS:0000000000000000
[  122.323401] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  122.323401] CR2: 00000000ffffffff CR3: 000000040b55e002 CR4: 00000000001626e0
[  122.323402] Call Trace:
[  122.323404]  <IRQ>
[  122.323409]  gcmaes_crypt_by_sg.constprop.0+0x276/0x6c0
[  122.323415]  ? skb_clone_tx_timestamp+0x3c/0xa0
[  122.323419]  ? sch_direct_xmit+0x8b/0x310
[  122.323423]  ? esp4_err+0x120/0x120 [esp4]
[  122.323425]  ? helper_rfc4106_encrypt+0x7c/0xa0
[  122.323428]  ? crypto_aead_encrypt+0x3c/0x60
[  122.323429]  ? crypto_aead_encrypt+0x3c/0x60
[  122.323431]  ? seqiv_aead_encrypt+0x13a/0x1d0
[  122.323434]  ? fib4_rule_action+0x61/0x70
[  122.323436]  ? fib4_rule_action+0x70/0x70
[  122.323438]  ? fib_rules_lookup+0x143/0x1a0
[  122.323440]  ? __fib_lookup+0x6b/0xb0
[  122.323442]  ? ip_route_output_key_hash_rcu+0x562/0x890
[  122.323444]  ? ip_route_output_key_hash+0x5e/0x80
[  122.323445]  ? __xfrm4_dst_lookup.isra.0+0x88/0x90
[  122.323446]  ? xfrm4_dst_lookup+0x2f/0x50
[  122.323447]  ? rt_add_uncached_list+0x4b/0x80
[  122.323449]  ? xfrm4_fill_dst+0xae/0xf0
[  122.323450]  ? crypto_aead_encrypt+0x3c/0x60
[  122.323452]  ? esp_output_tail+0x1e5/0x580 [esp4]
[  122.323454]  ? esp_output+0x116/0x190 [esp4]
[  122.323457]  ? xfrm_output_resume+0x431/0x4f0
[  122.323464]  ? nf_confirm+0xcb/0xf0 [nf_conntrack]
[  122.323466]  ? __xfrm4_output+0x3f/0x70
[  122.323467]  ? xfrm4_output+0x3b/0xd0
[  122.323468]  ? xfrm4_udp_encap_rcv+0x190/0x190
[  122.323470]  ? ip_forward+0x36c/0x470
[  122.323472]  ? ip_defrag.cold+0x37/0x37
[  122.323473]  ? ip_rcv+0xbc/0xd0
[  122.323475]  ? ip_rcv_finish_core.isra.0+0x410/0x410
[  122.323476]  ? __netif_receive_skb_one_core+0x80/0x90
[  122.323478]  ? netif_receive_skb_internal+0x41/0xb0
[  122.323479]  ? nf_hook_slow+0x40/0xb0
[  122.323480]  ? netif_receive_skb+0x18/0xb0
[  122.323486]  ? br_pass_frame_up+0x133/0x150 [bridge]
[  122.323491]  ? br_port_flags_change+0x40/0x40 [bridge]
[  122.323495]  ? br_handle_frame_finish+0x16f/0x430 [bridge]
[  122.323497]  ? enqueue_entity+0x10e/0x650
[  122.323501]  ? br_handle_frame_finish+0x430/0x430 [bridge]
[  122.323505]  ? br_handle_frame+0x247/0x370 [bridge]
[  122.323506]  ? enqueue_task_fair+0x8c/0x4e0
[  122.323508]  ? update_group_capacity+0x25/0x1e0
[  122.323512]  ? br_handle_frame_finish+0x430/0x430 [bridge]
[  122.323513]  ? __netif_receive_skb_core+0x2db/0xf70
[  122.323515]  ? __netif_receive_skb_list_core+0x138/0x2e0
[  122.323517]  ? netif_receive_skb_list_internal+0x1cc/0x300
[  122.323518]  ? enqueue_task_fair+0x8c/0x4e0
[  122.323520]  ? kmem_cache_alloc+0x165/0x220
[  122.323521]  ? gro_normal_list.part.0+0x19/0x40
[  122.323522]  ? napi_complete_done+0x92/0x130
[  122.323526]  ? rtl8169_poll+0x5a9/0x640 [r8169]
[  122.323527]  ? net_rx_action+0x148/0x3c0
[  122.323530]  ? rtl8169_interrupt+0xfd/0x1e0 [r8169]
[  122.323532]  ? __do_softirq+0xee/0x2ff
[  122.323535]  ? irq_exit+0xe9/0xf0
[  122.323536]  ? do_IRQ+0x55/0xe0
[  122.323538]  ? common_interrupt+0xf/0xf
[  122.323539]  </IRQ>
[  122.323541]  ? irq_entries_start+0x30/0x660
[  122.323546]  ? handle_external_interrupt_irqoff+0x7a/0x100 [kvm_intel]
[  122.323568]  ? kvm_arch_vcpu_ioctl_run+0x995/0x1a60 [kvm]
[  122.323570]  ? futex_wake+0x90/0x170
[  122.323581]  ? kvm_vcpu_ioctl+0x218/0x5c0 [kvm]
[  122.323584]  ? __seccomp_filter+0x7b/0x670
[  122.323585]  ? signal_setup_done+0x82/0xa0
[  122.323586]  ? __fpu__restore_sig+0x436/0x500
[  122.323588]  ? do_vfs_ioctl+0x461/0x6d0
[  122.323590]  ? ksys_ioctl+0x5e/0x90
[  122.323591]  ? __x64_sys_ioctl+0x16/0x20
[  122.323593]  ? do_syscall_64+0x5b/0x1c0
[  122.323595]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  122.323597] ---[ end trace fffe8684d1d1c2f4 ]---
[  122.323626] ------------[ cut here ]------------
[  122.323644] WARNING: CPU: 1 PID: 1132 at arch/x86/kvm/x86.c:8206 kvm_arch_vcpu_ioctl_run+0x163e/0x1a60 [kvm]
[  122.323644] Modules linked in: vhost_net vhost tap tun xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc rfkill xt_TCPMSS xt_tcpmss xt_nat iptable_nat nf_nat xt_DSCP iptable_mangle iptable_raw iptable_security nf_log_ipv4 nf_log_common xt_policy xt_LOG xt_multiport ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek snd_hda_codec_generic coretemp ledtrig_audio kvm_intel snd_hda_intel sunrpc snd_intel_dspcfg kvm snd_hda_codec snd_hda_core irqbypass mei_wdt mei_hdcp snd_hwdep vfat crct10dif_pclmul snd_seq fat crc32_pclmul snd_seq_device ghash_clmulni_intel iTCO_wdt snd_pcm iTCO_vendor_support intel_cstate snd_timer intel_uncore snd pcspkr intel_rapl_perf mei_me soundcore i2c_i801 lpc_ich mei ata_generic pata_acpi tcp_bbr sch_fq ip_tables xfs libcrc32c i915
[  122.323664]  i2c_algo_bit drm_kms_helper crc32c_intel e1000e drm r8169 serio_raw video
[  122.323669] CPU: 1 PID: 1132 Comm: CPU 2/KVM Tainted: G        W         5.5.0-rc6-thread_info-Add-a-debug-hook-to-detect-FPU-changes-w+ #1
[  122.323669] Hardware name: CompuLab 0000000-00000/Intense-PC, BIOS IPC_2.2.400.5 X64 03/15/2018
[  122.323683] RIP: 0010:kvm_arch_vcpu_ioctl_run+0x163e/0x1a60 [kvm]
[  122.323685] Code: a8 f3 fe ff 41 f6 44 24 42 02 75 08 4c 89 e7 e8 e8 f3 fe ff 4c 89 e7 e8 20 7d fe ff 41 83 a4 24 60 26 00 00 fb e9 cb f2 ff ff <0f> 0b e9 8e f2 ff ff 31 db bf 07 00 00 00 48 89 de e8 cc 11 65 f6
[  122.323685] RSP: 0018:ffffa69b80fcbd40 EFLAGS: 00010002
[  122.323686] RAX: 0000000080004b00 RBX: 0000000000000000 RCX: ffff8992513ecd00
[  122.323687] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[  122.323688] RBP: ffffa69b80fcbde0 R08: 0000000000000001 R09: 0000000000000000
[  122.323688] R10: 0000000000000000 R11: 0000000000000000 R12: ffff899250a28000
[  122.323689] R13: 0000000000000000 R14: ffff899250a28038 R15: ffffa69b81042320
[  122.323690] FS:  00007ff235b52700(0000) GS:ffff89927e240000(0000) knlGS:0000000000000000
[  122.323691] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  122.323691] CR2: 00000000ffffffff CR3: 000000040b55e002 CR4: 00000000001626e0
[  122.323692] Call Trace:
[  122.323695]  ? futex_wake+0x90/0x170
[  122.323707]  kvm_vcpu_ioctl+0x218/0x5c0 [kvm]
[  122.323709]  ? __seccomp_filter+0x7b/0x670
[  122.323710]  ? signal_setup_done+0x82/0xa0
[  122.323711]  ? __fpu__restore_sig+0x436/0x500
[  122.323713]  do_vfs_ioctl+0x461/0x6d0
[  122.323715]  ksys_ioctl+0x5e/0x90
[  122.323716]  __x64_sys_ioctl+0x16/0x20
[  122.323718]  do_syscall_64+0x5b/0x1c0
[  122.323720]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  122.323722] RIP: 0033:0x7ff2399bf34b
[  122.323723] Code: 0f 1e fa 48 8b 05 3d 9b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 9b 0c 00 f7 d8 64 89 01 48
[  122.323724] RSP: 002b:00007ff235b51698 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  122.323725] RAX: ffffffffffffffda RBX: 000056321fab4f50 RCX: 00007ff2399bf34b
[  122.323726] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001b
[  122.323726] RBP: 00007ff23534f000 R08: 000056321d5ac390 R09: 000056321da50d40
[  122.323727] R10: 000056321f960760 R11: 0000000000000246 R12: 000056321fad7770
[  122.323727] R13: 000056321fab4f50 R14: 00007ffdb09a9730 R15: 000056321da2de80
[  122.323729] ---[ end trace fffe8684d1d1c2f5 ]---