Bug 205171 - kernel panic during windows 10pro start
Summary: kernel panic during windows 10pro start
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-12 16:20 UTC by Ivan
Modified: 2019-10-15 16:07 UTC

See Also:
Kernel Version: 4.19.74 and higher
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments

Description Ivan 2019-10-12 16:20:56 UTC
Works fine on 4.19.73. On 4.19.78-2-lts the host kernel oopses while a Windows 10 Pro guest is starting:

[ 5829.948945] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 5829.948951] PGD 0 P4D 0 
[ 5829.948954] Oops: 0002 [#1] SMP NOPTI
[ 5829.948957] CPU: 3 PID: 1699 Comm: CPU 0/KVM Tainted: G           OE     4.19.78-2-lts #1
[ 5829.948958] Hardware name: Micro-Star International Co., Ltd. GE62 6QF/MS-16J4, BIOS E16J4IMS.117 01/18/2018
[ 5829.948989] RIP: 0010:kvm_write_guest_virt_system+0x1e/0x40 [kvm]
[ 5829.948991] Code: 5d 41 5c 41 5d 41 5e e9 40 e0 ff ff 0f 1f 44 00 00 49 89 fa 4d 89 c1 48 89 f7 48 89 d6 41 c6 82 29 56 00 00 01 89 ca 4c 89 d1 <49> c7 00 00 00 00 00 49 c7 40 08 00 00 00 00 49 c7 40 10 00 00 00
[ 5829.948992] RSP: 0018:ffffb80142743cc0 EFLAGS: 00010202
[ 5829.948993] RAX: 0000000000000400 RBX: 0000000000000003 RCX: ffff916d7c9e8000
[ 5829.948993] RDX: 0000000000000008 RSI: ffffb80142743cd0 RDI: 0000010000003d68
[ 5829.948994] RBP: ffff916d7c9e8000 R08: 0000000000000000 R09: 0000000000000000
[ 5829.948995] R10: ffff916d7c9e8000 R11: 0000000000000000 R12: 0000000000e1c908
[ 5829.948995] R13: 0000000000000000 R14: ffff916d6b57e228 R15: 0000000000000000
[ 5829.948997] FS:  00007fb3a23ff700(0000) GS:ffff916daeac0000(0000) knlGS:0000000000000000
[ 5829.948997] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5829.948998] CR2: 0000000000000000 CR3: 0000000442444003 CR4: 00000000003626e0
[ 5829.948999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5829.948999] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5829.949000] Call Trace:
[ 5829.949006]  handle_vmread+0x28f/0x300 [kvm_intel]
[ 5829.949009]  ? handle_vmwrite+0x269/0x4d0 [kvm_intel]
[ 5829.949019]  kvm_arch_vcpu_ioctl_run+0xbb2/0x1b20 [kvm]
[ 5829.949026]  kvm_vcpu_ioctl+0x24b/0x5d0 [kvm]
[ 5829.949029]  ? __seccomp_filter+0x42/0x480
[ 5829.949032]  do_vfs_ioctl+0x40e/0x670
[ 5829.949034]  ksys_ioctl+0x5e/0x90
[ 5829.949035]  __x64_sys_ioctl+0x16/0x20
[ 5829.949038]  do_syscall_64+0x4e/0x100
[ 5829.949041]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 5829.949044] RIP: 0033:0x7fb3a58f625b
[ 5829.949045] Code: 0f 1e fa 48 8b 05 25 9c 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f5 9b 0c 00 f7 d8 64 89 01 48
[ 5829.949046] RSP: 002b:00007fb3a23fcee8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 5829.949047] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fb3a58f625b
[ 5829.949048] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000016
[ 5829.949048] RBP: 00007fb3a3194140 R08: 000055bfdd046910 R09: 000000003b9aca00
[ 5829.949049] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ 5829.949050] R13: 00007fb3a7059004 R14: 0000000000000608 R15: 0000000000000000
[ 5829.949051] Modules linked in: vhost_net vhost tap rfcomm dm_crypt dm_mod uas usb_storage ccm devlink fuse xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat nf_nat_ipv6 ebtable_filter ebtables ip6table_filter ip6_tables tun cmac iptable_mangle xt_CHECKSUM algif_hash iptable_nat algif_skcipher ipt_MASQUERADE nf_nat_ipv4 af_alg nf_nat bnep nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_tcpudp bridge stp llc iptable_filter nls_iso8859_1 nls_cp437 vfat fat snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl rtsx_usb_sdmmc rtsx_usb_ms mmc_core memstick x86_pkg_temp_thermal arc4 i915 intel_powerclamp kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc iwlmvm aesni_intel kvmgt vfio_mdev mdev aes_x86_64 vfio_iommu_type1 crypto_simd vfio cryptd
[ 5829.949084]  glue_helper intel_cstate kvm mac80211 intel_uncore intel_rapl_perf snd_hda_intel irqbypass i2c_algo_bit drm_kms_helper uvcvideo snd_hda_codec videobuf2_vmalloc videobuf2_memops drm snd_hda_core videobuf2_v4l2 iwlwifi videobuf2_common snd_hwdep snd_pcm videodev btusb btrtl btbcm btintel snd_timer bluetooth mousedev psmouse joydev intel_gtt media rtsx_usb ecdh_generic cfg80211 input_leds snd agpgart alx i2c_i801 mei_me soundcore syscopyarea sysfillrect rfkill sysimgblt mei fb_sys_fops mdio battery ac evdev mac_hid vboxnetflt(OE) vboxnetadp(OE) vboxpci(OE) vboxdrv(OE) usbip_host usbip_core coretemp msr sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid sr_mod sd_mod cdrom serio_raw atkbd ahci libps2 libahci libata xhci_pci scsi_mod crc32c_intel
[ 5829.949122]  xhci_hcd i8042 serio
[ 5829.949125] CR2: 0000000000000000
[ 5829.949127] ---[ end trace 61e91d3bdaf90c11 ]---
[ 5829.949158] RIP: 0010:kvm_write_guest_virt_system+0x1e/0x40 [kvm]
[ 5829.949160] Code: 5d 41 5c 41 5d 41 5e e9 40 e0 ff ff 0f 1f 44 00 00 49 89 fa 4d 89 c1 48 89 f7 48 89 d6 41 c6 82 29 56 00 00 01 89 ca 4c 89 d1 <49> c7 00 00 00 00 00 49 c7 40 08 00 00 00 00 49 c7 40 10 00 00 00
[ 5829.949161] RSP: 0018:ffffb80142743cc0 EFLAGS: 00010202
[ 5829.949162] RAX: 0000000000000400 RBX: 0000000000000003 RCX: ffff916d7c9e8000
[ 5829.949163] RDX: 0000000000000008 RSI: ffffb80142743cd0 RDI: 0000010000003d68
[ 5829.949164] RBP: ffff916d7c9e8000 R08: 0000000000000000 R09: 0000000000000000
[ 5829.949164] R10: ffff916d7c9e8000 R11: 0000000000000000 R12: 0000000000e1c908
[ 5829.949165] R13: 0000000000000000 R14: ffff916d6b57e228 R15: 0000000000000000
[ 5829.949166] FS:  00007fb3a23ff700(0000) GS:ffff916daeac0000(0000) knlGS:0000000000000000
[ 5829.949167] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5829.949167] CR2: 0000000000000000 CR3: 0000000442444003 CR4: 00000000003626e0
[ 5829.949168] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5829.949169] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
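
For readers unfamiliar with the "Oops: 0002" line in the trace above: the number is the x86 page-fault error code in hexadecimal. A minimal sketch of decoding its three low bits follows; the macro, struct, and function names are local to this example and are not the kernel's own identifiers.

```c
#include <assert.h>

/* Low bits of the x86 page-fault error code (names are local to this sketch). */
#define PF_BIT_PRESENT 0x1u /* 0: page not present, 1: protection violation */
#define PF_BIT_WRITE   0x2u /* 0: read access,      1: write access         */
#define PF_BIT_USER    0x4u /* 0: kernel mode,      1: user mode            */

struct pf_info {
    int present; /* fault hit a present page (protection violation) */
    int write;   /* faulting access was a write */
    int user;    /* fault happened in user mode */
};

/* Split a page-fault error code into its three low flag bits. */
static struct pf_info decode_pf_error(unsigned int code)
{
    struct pf_info info;

    info.present = (code & PF_BIT_PRESENT) != 0;
    info.write   = (code & PF_BIT_WRITE) != 0;
    info.user    = (code & PF_BIT_USER) != 0;
    return info;
}

/* Usage: the trace above reports "Oops: 0002". */
static void check_reported_oops(void)
{
    struct pf_info info = decode_pf_error(0x0002);

    /* A kernel-mode write to a page that is not present. */
    assert(info.write && !info.user && !info.present);
}
```

Read this way, 0002 is a kernel-mode write to a non-present page, which matches the reported NULL pointer dereference at address 0 (CR2 is 0000000000000000).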
Comment 1 Ivan 2019-10-12 16:21:50 UTC
ArchLinux
Linux 4.19.78-2-lts #1 SMP Wed Oct 9 16:25:33 CEST 2019 x86_64 GNU/Linux
qemu 4.1.0
libvirt 5.6.0
Comment 2 vkuznets 2019-10-14 09:08:29 UTC
bugzilla-daemon@bugzilla.kernel.org writes:

> https://bugzilla.kernel.org/show_bug.cgi?id=205171
>
>             Bug ID: 205171
>            Summary: kernel panic during windows 10pro start
>            Product: Virtualization
>            Version: unspecified
>     Kernel Version: 4.19.74 and higher
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: kvm
>           Assignee: virtualization_kvm@kernel-bugs.osdl.org
>           Reporter: dront78@gmail.com
>         Regression: No
>
> works fine on 4.19.73
>
> [ 5829.948945] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000000
> [ 5829.948951] PGD 0 P4D 0 
> [ 5829.948954] Oops: 0002 [#1] SMP NOPTI
> [ 5829.948957] CPU: 3 PID: 1699 Comm: CPU 0/KVM Tainted: G           OE    
> 4.19.78-2-lts #1
> [ 5829.948958] Hardware name: Micro-Star International Co., Ltd. GE62
> 6QF/MS-16J4, BIOS E16J4IMS.117 01/18/2018
> [ 5829.948989] RIP: 0010:kvm_write_guest_virt_system+0x1e/0x40 [kvm]

It seems the 4.19 stable backport is broken. Upstream commit f7eea636c3d50
has:

@@ -4588,7 +4589,8 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
                                vmx_instruction_info, true, len, &gva))
                        return 1;
                /* _system ok, nested_vmx_check_permission has verified cpl=0 */
-               kvm_write_guest_virt_system(vcpu, gva, &field_value, len, NULL);
+               if (kvm_write_guest_virt_system(vcpu, gva, &field_value, len, &e))
+                       kvm_inject_page_fault(vcpu, &e);
        }

and its 4.19 counterpart (73c31bd92039):

@@ -8798,8 +8799,10 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
                                vmx_instruction_info, true, &gva))
                        return 1;
                /* _system ok, nested_vmx_check_permission has verified cpl=0 */
-               kvm_write_guest_virt_system(vcpu, gva, &field_value,
-                                           (is_long_mode(vcpu) ? 8 : 4), NULL);
+               if (kvm_write_guest_virt_system(vcpu, gva, &field_value,
+                                               (is_long_mode(vcpu) ? 8 : 4),
+                                               NULL))
+                       kvm_inject_page_fault(vcpu, &e);
        }
 
(Note the last argument to kvm_write_guest_virt_system(): it's NULL
instead of &e.)

And v4.19.74 has 6e60900cfa3e (541ab2aeb282 upstream):

@@ -5016,6 +5016,13 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
        /* kvm_write_guest_virt_system can pull in tons of pages. */
        vcpu->arch.l1tf_flush_l1d = true;
 
+       /*
+        * FIXME: this should call handle_emulation_failure if X86EMUL_IO_NEEDED
+        * is returned, but our callers are not ready for that and they blindly
+        * call kvm_inject_page_fault.  Ensure that they at least do not leak
+        * uninitialized kernel stack memory into cr2 and error code.
+        */
+       memset(exception, 0, sizeof(*exception));
        return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
                                           PFERR_WRITE_MASK, exception);
 }

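This all results in memset(NULL): handle_vmread() in 4.19 passes NULL as
the exception pointer, and kvm_write_guest_virt_system() now zeroes that
pointer unconditionally. (Also, 6e60900cfa3e should come *after*
f7eea636c3d50 and not before, but oh well.)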
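
The interaction can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct fault_info, write_guest_model() and handle_vmread_model() are hypothetical stand-ins invented for this example, not the kernel's types or functions, and the fault is simulated by a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's exception record. */
struct fault_info {
    unsigned long address;
    int valid;
};

/*
 * Model of the callee after the memset() change: it clears *exception
 * unconditionally, so a NULL pointer is dereferenced before any other
 * work happens. simulate_fault is an illustrative knob, not a kernel
 * parameter. Returns nonzero when the (simulated) guest write fails.
 */
static int write_guest_model(unsigned long gva, int simulate_fault,
                             struct fault_info *exception)
{
    memset(exception, 0, sizeof(*exception)); /* crashes if exception == NULL */
    if (simulate_fault) {
        exception->address = gva;
        exception->valid = 1;
        return 1;
    }
    return 0;
}

/*
 * Model of the corrected caller: it passes the address of a local record
 * (&e) rather than NULL, then forwards that record on failure. Passing
 * NULL here instead would reproduce the crash described above.
 */
static int handle_vmread_model(unsigned long gva, int simulate_fault,
                               struct fault_info *injected)
{
    struct fault_info e;

    assert(injected != NULL);
    if (write_guest_model(gva, simulate_fault, &e)) {
        *injected = e; /* stands in for injecting the page fault */
        return 1;
    }
    return 0;
}
```

In this model the callee's contract changed from "exception may be NULL" to "exception must be valid", and the caller that still passed NULL is what broke.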

The following will likely fix the problem (untested):

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e83f4f6bfdac..d3a900a4fa0e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8801,7 +8801,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
                /* _system ok, nested_vmx_check_permission has verified cpl=0 */
                if (kvm_write_guest_virt_system(vcpu, gva, &field_value,
                                                (is_long_mode(vcpu) ? 8 : 4),
-                                               NULL))
+                                               &e))
                        kvm_inject_page_fault(vcpu, &e);
        }

I can send a patch to stable@ if needed.
Comment 3 gregkh 2019-10-14 09:41:21 UTC
On Mon, Oct 14, 2019 at 11:08:24AM +0200, Vitaly Kuznetsov wrote:
> [...]
> I can send a patch to stable@ if needed.

A patch was already sent, and is included in the 4.19.79 and 4.14.149
kernel releases, and will be part of the next 4.9.y and 4.4.y kernel
releases that happen later this week.

thanks,

greg k-h
Comment 4 Patrick Schönthaler 2019-10-15 13:53:54 UTC
Possible duplicate of https://bugzilla.kernel.org/show_bug.cgi?id=205173
Comment 5 Ivan 2019-10-15 16:07:37 UTC
I can confirm the issue is gone after upgrading to the latest kernel in ArchLinux.
Linux 4.19.79-2-lts #1 SMP Fri Oct 11 20:04:02 UTC 2019 x86_64 GNU/Linux
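
The version boundaries reported in this thread for the 4.19.y series (fine on 4.19.73, broken from 4.19.74, fixed in 4.19.79) can be sketched as a simple check on a `uname -r` style string. The function name is hypothetical, and other stable series are deliberately not covered because the thread gives only their fix versions, not where the breakage started.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Classify a kernel release string against the range reported in this bug.
 * Returns 1 if it is a 4.19.y release in the affected range (74..78),
 * 0 if it is a 4.19.y release outside that range,
 * -1 if it cannot be parsed or is not a 4.19.y release.
 */
static int is_affected_4_19(const char *release)
{
    int major, minor, patch;

    /* sscanf stops at the first non-digit after the patch level ("-2-lts"). */
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) != 3)
        return -1;
    if (major != 4 || minor != 19)
        return -1;
    return patch >= 74 && patch <= 78;
}

/* Usage: the two release strings that appear in this report. */
static void check_reported_releases(void)
{
    assert(is_affected_4_19("4.19.78-2-lts") == 1); /* crashing kernel */
    assert(is_affected_4_19("4.19.79-2-lts") == 0); /* confirmed fixed */
}
```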
