Bug 217574

Summary: kvm_intel loads only after suspend
Product: Virtualization
Reporter: drigohighlander (drigoslkx)
Component: kvm
Assignee: virtualization_kvm
Status: NEW
Severity: blocking
CC: chao.gao, seanjc
Priority: P3
Hardware: Intel
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description drigohighlander 2023-06-19 16:38:13 UTC
Hi, I reproduced this bug on Slackware with kernels 5.15.29, 5.15.97 and 6.1.
I also tested on Ubuntu 23.04, Fedora 38 and Linux Mint 21.01 and got the same results.

When I boot Linux, it shows this error:
kvm: CPU 0 feature inconsistency!
(Depending on the kernel version, it prints this message once for each CPU core.)

The kernel loads the kvm module but not kvm_intel.
(Without kvm_intel, kvm doesn't work.)
If I try to load it manually with modprobe kvm_intel, it shows:
modprobe: ERROR: could not insert 'kvm_intel': Input/output error

Then I took a break. When I came back, the system had suspended. I woke it up and ran modprobe kvm_intel again; this time it loaded, and kvm started working.

I tested this on all the other distros listed above and got the same results.
kvm only works after a suspend.

My specs
Motherboard Jingsha x99
CPU Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
Comment 1 Chao Gao 2023-06-20 08:49:28 UTC
Could you check if each MSR indexed from 0x480 to 0x492 is consistent across all CPUs?


To read an MSR (e.g., 0x480) on all CPUs, run
$ sudo rdmsr -a 0x480


Please do the check twice: once after bootup and once after a suspend/resume cycle.
Comment 2 Sean Christopherson 2023-06-22 20:42:45 UTC
Ya, as Chao alluded to, the problem appears to be that not all CPUs have the same VMX configuration, which is reflected in the MSRs Chao mentioned.  My best guess is that going through a suspend+resume cycle either wipes out a problematic ucode update, or applies a "good" ucode update to all CPUs, and thus resolves the inconsistent VMX configuration across CPUs.
Comment 3 drigohighlander 2023-06-23 16:22:47 UTC
Hi, here is the output after bootup:

bash-5.2# rdmsr -a 0x480
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012

after suspension:
bash-5.2# rdmsr -a 0x480
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
da040000000012
bash-5.2# 

lscpu:
bash-5.2# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  36
  On-line CPU(s) list:   0-35
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel
  Model name:            Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
    BIOS Model name:     Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz  CPU @ 2.3GHz
    BIOS CPU family:     179
    CPU family:          6
    Model:               63
    Thread(s) per core:  2
    Core(s) per socket:  18
    Socket(s):           1
    Stepping:            2
    CPU(s) scaling MHz:  64%
    CPU max MHz:         3800.0000
    CPU min MHz:         1200.0000
    BogoMIPS:            4589.05
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mc
                         a cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss 
                         ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arc
                         h_perfmon pebs bts rep_good nopl xtopology nonstop_tsc 
                         cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vm
                         x smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca ss
                         e4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdra
                         nd lahf_lm abm cpuid_fault epb invpcid_single pti intel
                         _ppin tpr_shadow vnmi flexpriority ept vpid ept_ad fsgs
                         base tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rt
                         m cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pl
                         n pts
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   576 KiB (18 instances)
  L1i:                   576 KiB (18 instances)
  L2:                    4.5 MiB (18 instances)
  L3:                    45 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-35
Vulnerabilities:         
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushe
                         s, SMT vulnerable
  Mds:                   Vulnerable: Clear CPU buffers attempted, no microcode; 
                         SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Vulnerable: Clear CPU buffers attempted, no microcode; 
                         SMT vulnerable
  Retbleed:              Not affected
  Spec store bypass:     Vulnerable
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer
                          sanitization
  Spectre v2:            Mitigation; Retpolines, STIBP disabled, RSB filling, PB
                         RSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
bash-5.2#
Comment 4 Sean Christopherson 2023-06-23 20:08:51 UTC
Can you check all MSRs in the range 0x480-0x491, i.e. all the known VMX MSRs, and just report back any divergences between CPUs?  The values for MSRs that are consistent across all CPUs aren't interesting at this time.  What we *suspect* is going on is that one or more CPUs has different values in one or more of the VMX MSRs.  Before we can debug further, we need to first confirm that that is indeed why KVM is refusing to load.
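
One way to run that check without reading 18 MSRs by hand is sketched below. It assumes rdmsr from msr-tools is installed, the msr kernel module is loaded, and the shell is running as root. MSR_READER and report_divergent_msrs are hypothetical names introduced only for this sketch; MSR_READER exists so the loop can be tried with a stand-in command on a machine without MSR access.

```shell
#!/bin/sh
# Assumption: "rdmsr -a <msr>" prints one hex value per CPU (msr-tools).
# MSR_READER is a hypothetical override, not part of msr-tools.
MSR_READER="${MSR_READER:-rdmsr -a}"

# Print every MSR in [first, last] whose value is not identical on all CPUs.
# MSRs that are consistent, or that cannot be read, produce no output.
report_divergent_msrs() {
    msr=$(($1))
    last=$(($2))
    while [ "$msr" -le "$last" ]; do
        addr=$(printf '0x%x' "$msr")
        # $MSR_READER is left unquoted on purpose so "rdmsr -a" splits
        # into a command and its option.
        if values=$($MSR_READER "$addr" 2>/dev/null); then
            distinct=$(printf '%s\n' "$values" | sort -u)
            count=$(printf '%s\n' "$distinct" | grep -c .)
            if [ "$count" -gt 1 ]; then
                echo "MSR $addr differs across CPUs:"
                printf '%s\n' "$distinct" | sed 's/^/  /'
            fi
        fi
        msr=$((msr + 1))
    done
}
```

Running `report_divergent_msrs 0x480 0x491` once after bootup and once after resume would list only the MSRs worth reporting. Unlike a plain `uniq`, `sort -u` collapses identical values even when they are not adjacent in the output.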
Comment 5 drigohighlander 2023-06-24 18:58:48 UTC
Hi folks, sorry for my misunderstanding.
I ran the command for all MSRs in the range 0x480-0x491 after bootup and after a suspend and I didn't find any inconsistencies.

How can I share the output files?

Thank you
Comment 6 Chao Gao 2023-06-26 02:56:22 UTC
I found this: https://ubuntuforums.org/archive/index.php/t-2344602.html. It was reported on the same CPU, the E5-2696 v3.

Could you run the commands below after bootup and after suspension, and share the output?

# rdmsr -a 0x48b |uniq
# cat /proc/cpuinfo |grep microcode |uniq
Comment 7 drigohighlander 2023-06-28 13:26:13 UTC
Hi, here are the outputs:

After bootup:
bash-5.2# rdmsr -a 0x48b |uniq
2007fff00000000
7fff00000000
2007fff00000000
7fff00000000

bash-5.2# cat /proc/cpuinfo |grep microcode |uniq
microcode	: 0x39

After suspend:
bash-5.2# rdmsr -a 0x48b |uniq
7fff00000000

bash-5.2# cat /proc/cpuinfo |grep microcode |uniq
microcode	: 0x39
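
The four lines printed after bootup contain only two distinct values: `uniq` merges adjacent duplicates only, so alternating values each show up more than once. A minimal sketch for getting one line per distinct value, with a count of how many CPUs report it, follows. count_values is a hypothetical helper name, and the input is assumed to be one hex value per line, as `rdmsr -a` prints it.

```shell
#!/bin/sh
# Read one value per line on stdin and print "<value>: <n> CPU(s)"
# for each distinct value, sorted by value.
count_values() {
    sort | uniq -c | awk '{ print $2 ": " $1 " CPU(s)" }'
}
```

Usage would be `rdmsr -a 0x48b | count_values`, which shows at a glance how many CPUs hold each value.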
Comment 8 Chao Gao 2023-06-28 13:53:41 UTC
Yes. MSR 0x48b is inconsistent.
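
To see exactly which bit differs between the two values from comment 7, they can be XORed. The sketch below uses plain shell arithmetic; msr_diff and set_bits are hypothetical helper names, and both expect hex strings without a 0x prefix, which is the format rdmsr prints.

```shell
#!/bin/sh
# Print, in hex, the bits that differ between two MSR values.
msr_diff() {
    printf '%x\n' $(( 0x$1 ^ 0x$2 ))
}

# Print the index of every set bit in a 64-bit hex value
# (bit 0 is the least significant bit).
set_bits() {
    value=$(( 0x$1 ))
    bit=0
    while [ "$bit" -lt 64 ]; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            echo "$bit"
        fi
        bit=$(( bit + 1 ))
    done
}
```

`msr_diff 2007fff00000000 7fff00000000` prints 200000000000000, and `set_bits 200000000000000` prints 57. So the CPUs disagree on a single bit, bit 57 of the MSR, which is bit 25 of its upper 32-bit half. In the Intel SDM, MSR 0x48b is IA32_VMX_PROCBASED_CTLS2, and the upper half reports which secondary processor-based controls may be set to 1.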

Your microcode is too old; according to [1], the latest one for your system (Family-Model-Stepping: 06-3f-02) is 0x49. Could you update the microcode and retry?

[1]: https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/blob/main/releasenote.md
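
After updating, a quick way to confirm that every CPU picked up the new revision is sketched below. list_old_microcode is a hypothetical helper name; the sketch assumes the "microcode : 0x.." line format that /proc/cpuinfo shows in comment 7, and takes the target revision (0x49 here, per the release notes above) as an argument.

```shell
#!/bin/sh
# Print each distinct microcode revision in the given cpuinfo file that is
# lower than the wanted revision. No output means every CPU is up to date.
list_old_microcode() {
    for rev in $(awk '/^microcode/ { print $NF }' "$1" | sort -u); do
        if [ $(($rev)) -lt $(($2)) ]; then
            echo "$rev"
        fi
    done
}
```

Usage would be `list_old_microcode /proc/cpuinfo 0x49`, run both after bootup and after a suspend/resume cycle.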