Bug 120971 - KVM: entry failed, hardware error 0x80000021 on Intel Host
Summary: KVM: entry failed, hardware error 0x80000021 on Intel Host
Status: RESOLVED CODE_FIX
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: Intel Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-06-24 18:33 UTC by Joe Knockenhauer
Modified: 2016-06-28 20:07 UTC
CC List: 2 users

See Also:
Kernel Version: 4.7.0-rc4-mainline
Subsystem:
Regression: No
Bisected commit-id:


Description Joe Knockenhauer 2016-06-24 18:33:59 UTC
hardware is as follows:

Intel i5 6600K
32Gig DDR4 Memory 
GTX 980ti
2 SSD
2 HDD

Using qemu-kvm and OVMF, the following error occurs erratically when booted into a Windows 10 guest:

KVM: entry failed, hardware error 0x80000021

If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.

RAX=ffffffffffd148c0 RBX=0000000000000000 RCX=0000000000000086 RDX=0000000000000000
RSI=ffffda819a1e1180 RDI=ffff910d6ee037b0 RBP=0000000000000000 RSP=ffffda819a1f7758
R8 =00000000ffffffff R9 =0000000000000000 R10=00000000ffffffff R11=0000000000000000
R12=00000000a71b10d9 R13=0000000000000046 R14=0000000000000000 R15=0000000000000000
RIP=fffff8013c22615f RFL=00000282 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 00000000 00409300 DPL=0 DS   [-WA]
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
FS =0053 00000000b300c000 0000fc00 0040f300 DPL=3 DS   [-WA]
GS =002b ffffda819a1e1000 ffffffff 00c0f300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 ffffffff 00000000
TR =0040 ffffda819a1e7b40 00000067 00008b00 DPL=0 TSS64-busy
GDT=     ffffda819a1eec00 0000006f
IDT=     ffffda819a1eec70 00000fff
CR0=80050033 CR2=0000000000000030 CR3=00000000001ab000 CR4=001506f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=00 00 00 00 00 48 83 ec 28 e8 47 b3 ff ff 48 83 c4 28 fb f4 <c3> cc cc cc cc cc cc 66 66 0f 1f 84 00 00 00 00 00 0f 20 d0 0f 22 d0 c3 cc cc cc cc cc cc
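
For anyone decoding the error code: per the Intel SDM, bits 15:0 of a VMX exit reason hold the basic exit reason and bit 31 flags a VM-entry failure. That makes 0x80000021 basic reason 33 (invalid guest state) with the entry-failure bit set. Below is a minimal Python sketch of that decoding. The helper name and the small lookup table are illustrative assumptions, not part of QEMU or KVM.

```python
# Sketch: split a VMX exit-reason value into its documented fields.
# decode_exit_reason and BASIC_REASONS are hypothetical names for illustration.

# Only the VM-entry failure reasons are listed; this is not the full SDM table.
BASIC_REASONS = {
    33: "VM-entry failure due to invalid guest state",
    34: "VM-entry failure due to MSR loading",
    41: "VM-entry failure due to machine-check event",
}


def decode_exit_reason(value):
    """Return the basic exit reason and the VM-entry failure flag."""
    basic = value & 0xFFFF                   # bits 15:0: basic exit reason
    entry_failure = bool(value & (1 << 31))  # bit 31: VM-entry failure
    return {
        "basic": basic,
        "entry_failure": entry_failure,
        "description": BASIC_REASONS.get(basic, "not in this small table"),
    }


if __name__ == "__main__":
    print(decode_exit_reason(0x80000021))
```

The decoding only says that the processor rejected the guest state at VM entry. It does not say which field was rejected.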

The error occurs whether or not PCI devices are passed through to the guest, and in every conceivable processor configuration that works for Windows guests.

The error occurs approximately 10 minutes into using the machine (give or take a few minutes).

Occasionally the error occurs on boot of the guest. 

This has also happened on previous kernels. 

Currently I have the kvm_intel module loaded with emulate_invalid_guest_state=0, although the error occurs just the same if the parameter is set to 1.

Please feel free to contact me for additional information.
Comment 1 Paolo Bonzini 2016-06-25 07:22:02 UTC
Patch here: http://thread.gmane.org/gmane.linux.kernel/2232266
Comment 2 Joe Knockenhauer 2016-06-25 14:53:03 UTC
Thanks!
Comment 3 Joe Knockenhauer 2016-06-28 15:56:30 UTC
Patch does not fix the issue. Patched 4.7-rc5 and 4.6.2. 

Issue still exists.
Comment 4 Paolo Bonzini 2016-06-28 16:21:53 UTC
Please look for the VMCS dump in /var/log/messages or dmesg and paste it here. Thanks!
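
One possible way to pull the dump out of a long kernel log is sketched below in Python. The start and end marker strings are assumptions taken from the dump format pasted in this report, and other kernel versions may print different lines. The function name is hypothetical.

```python
# Sketch: extract VMCS dump blocks from kernel log text.
# Marker strings are assumptions based on the dump shown in this report.

START_MARKER = "*** Guest State ***"
END_MARKER = "Virtual processor ID"


def extract_vmcs_dumps(log_text):
    """Return a list of dumps, each one a newline-joined block of log lines."""
    dumps = []
    current = None  # lines of the dump being collected, or None if outside one
    for line in log_text.splitlines():
        if START_MARKER in line:
            current = [line]          # a new dump begins here
        elif current is not None:
            current.append(line)
            if END_MARKER in line:    # last line of the dump
                dumps.append("\n".join(current))
                current = None
    return dumps


sample = """[ 10.0] unrelated line
[ 20.0] *** Guest State ***
[ 20.1] CR3 = 0x000000013f310000
[ 20.2] Virtual processor ID = 0x0002
[ 30.0] another unrelated line"""

for dump in extract_vmcs_dumps(sample):
    print(dump)
```

To use it on a real system, pass in the saved output of dmesg or the contents of /var/log/messages instead of the sample string.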
Comment 5 Joe Knockenhauer 2016-06-28 16:48:22 UTC
I believe this is what you are looking for?

[ 2087.637650] *** Guest State ***
[ 2087.637654] CR0: actual=0x0000000080050033, shadow=0x0000000080050033, gh_mask=fffffffffffffff7
[ 2087.637655] CR4: actual=0x00000000001526f8, shadow=0x00000000001506f8, gh_mask=fffffffffffff871
[ 2087.637656] CR3 = 0x000000013f310000
[ 2087.637656] PDPTR0 = 0x0000000000000000  PDPTR1 = 0x0000000000000000
[ 2087.637657] PDPTR2 = 0x0000000000000000  PDPTR3 = 0x0000000000000000
[ 2087.637658] RSP = 0xffffb080565f64d0  RIP = 0xfffff80354891f35
[ 2087.637659] RFLAGS=0x00000283         DR7 = 0x0000000000000400
[ 2087.637660] Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
[ 2087.637661] CS:   sel=0x0010, attr=0x0209b, limit=0x00000000, base=0x0000000000000000
[ 2087.637662] DS:   sel=0x002b, attr=0x0c0f3, limit=0xffffffff, base=0x0000000000000000
[ 2087.637663] SS:   sel=0x0018, attr=0x04093, limit=0x00000000, base=0x0000000000000000
[ 2087.637664] ES:   sel=0x002b, attr=0x0c0f3, limit=0xffffffff, base=0x0000000000000000
[ 2087.637665] FS:   sel=0x0053, attr=0x040f3, limit=0x00007c00, base=0x00000000003e0000
[ 2087.637666] GS:   sel=0x002b, attr=0x0c0f3, limit=0xffffffff, base=0xffffb08059180000
[ 2087.637667] GDTR:                           limit=0x0000006f, base=0xffffb0805918dbc0
[ 2087.637668] LDTR: sel=0x0000, attr=0x1c000, limit=0xffffffff, base=0x0000000000000000
[ 2087.637669] IDTR:                           limit=0x00000fff, base=0xffffb0805918dc30
[ 2087.637670] TR:   sel=0x0040, attr=0x0008b, limit=0x00000067, base=0xffffb08059186b40
[ 2087.637671] EFER =     0x0000000000000801  PAT = 0x0007010600070106
[ 2087.637672] DebugCtl = 0x0000000000000000  DebugExceptions = 0x0000000000000000
[ 2087.637672] PerfGlobCtl = 0x00000007000000ff
[ 2087.637673] BndCfgS = 0x0000000000000000
[ 2087.637674] Interruptibility = 00000000  ActivityState = 00000000
[ 2087.637674] *** Host State ***
[ 2087.637675] RIP = 0xffffffffa07bb11c  RSP = 0xffff880439e5fcd8
[ 2087.637676] CS=0010 SS=0018 DS=0000 ES=0000 FS=0000 GS=0000 TR=0040
[ 2087.637677] FSBase=00007fc48d1ff700 GSBase=ffff880885d80000 TRBase=ffff880885d94240
[ 2087.637678] GDTBase=ffff880885d89000 IDTBase=ffffffffff57c000
[ 2087.637679] CR0=0000000080050033 CR3=0000000565798000 CR4=00000000003426e0
[ 2087.637680] Sysenter RSP=0000000000000000 CS:RIP=0010:ffffffff815d7390
[ 2087.637681] EFER = 0x0000000000000d01  PAT = 0x0407010600070106
[ 2087.637681] PerfGlobCtl = 0x00000002000000fe
[ 2087.637682] *** Control State ***
[ 2087.637683] PinBased=0000003f CPUBased=b6a06dfa SecondaryExec=001214eb
[ 2087.637684] EntryControls=000173ff ExitControls=008fffff
[ 2087.637685] ExceptionBitmap=00060042 PFECmask=00000000 PFECmatch=00000000
[ 2087.637686] VMEntry: intr_info=000000d2 errcode=00000000 ilen=00000000
[ 2087.637686] VMExit: intr_info=00000000 errcode=00000000 ilen=00000001
[ 2087.637687]         reason=80000021 qualification=0000000000000000
[ 2087.637688] IDTVectoring: info=00000000 errcode=00000000
[ 2087.637689] TSC Offset = 0xfffe0e32cd5acec2
[ 2087.637689] TPR Threshold = 0x00
[ 2087.637690] EPT pointer = 0x0000000803dc105e
[ 2087.637691] PLE Gap=00000080 Window=00010000
[ 2087.637691] Virtual processor ID = 0x0002
Comment 6 Paolo Bonzini 2016-06-28 20:05:45 UTC
It's the same bug. The patch has not yet hit Linus's tree; you need to apply it manually.
Comment 7 Joe Knockenhauer 2016-06-28 20:07:23 UTC
Very strange, I applied it to the latest mainline and got the same issue. It is entirely possible I am doing something wrong. I apologize if I wasted your time, I'll try again on an LTS kernel and see if I can get it working. 

Sorry about that. Thanks!
