Bug 215969

Summary: Guest deploying TAA mitigation on (not affected) ICX host
Product: Virtualization
Component: kvm
Reporter: Pawan Gupta (pawan.kumar.gupta)
Assignee: virtualization_kvm
Status: NEW
Severity: high
Priority: P1
CC: bonzini, pawan.kumar.gupta, seanjc, tony.luck
Hardware: Intel
OS: Linux
Kernel Version: v5.18-rc3
Subsystem:
Regression: No
Bisected commit-id:

Description Pawan Gupta 2022-05-12 05:07:14 UTC
On hardware that enumerates TAA_NO (i.e. not affected by TSX Async Abort (TAA)), a certain guest/host configuration can result in the guest enumerating the TAA vulnerability and unnecessarily deploying the MD_CLEAR (CPU buffer clear) mitigation.

Icelake Server has TAA_NO and supports MSR TSX_CTRL. By default, Linux disables the TSX feature, clearing CPUID.RTM at host boot.

Currently, KVM hides TAA_NO from guests when the host has CPUID.RTM=0. Because KVM also exports MSR TSX_CTRL to guests, a guest booted with the "tsx=on" cmdline parameter enables the TSX feature, setting X86_FEATURE_RTM.

With X86_FEATURE_RTM=1 and TAA_NO=0, taa_select_mitigation() in the guest deploys the Clear CPU buffers mitigation.
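
To make the decision chain above concrete, here is a rough Python model of how the guest ends up mitigating. It is only a sketch, not the kernel code: guest_taa_state() and its parameters are hypothetical names, and the model ignores "mitigations=off", the "tsx_async_abort=" parameter and the SMT suffix in the sysfs string.

```python
def guest_taa_state(rtm: bool, taa_no: bool, tsx_ctrl_msr: bool,
                    md_clear: bool = True) -> str:
    """Simplified model of the guest's TAA decision (hypothetical helper).

    rtm          -- guest sees X86_FEATURE_RTM
    taa_no       -- guest sees TAA_NO in IA32_ARCH_CAPABILITIES
    tsx_ctrl_msr -- guest sees MSR TSX_CTRL enumerated
    md_clear     -- guest sees the MD_CLEAR CPUID bit
    """
    # The TAA bug is only set when TAA_NO is absent and TSX is either
    # usable (RTM) or controllable (TSX_CTRL present).
    affected = (not taa_no) and (rtm or tsx_ctrl_msr)
    if not affected:
        return "Not affected"
    # Affected, but TSX is off: nothing to clear.
    if not rtm:
        return "Mitigation: TSX disabled"
    # Affected and TSX is on: fall back to clearing CPU buffers.
    if md_clear:
        return "Mitigation: Clear CPU buffers"
    return "Vulnerable: Clear CPU buffers attempted, no microcode"


# Guest on the ICX host today: TAA_NO hidden, TSX_CTRL exported, tsx=on.
print(guest_taa_state(rtm=True, taa_no=False, tsx_ctrl_msr=True))
# Same guest if TAA_NO were exported.
print(guest_taa_state(rtm=True, taa_no=True, tsx_ctrl_msr=True))
```

In this model the first call corresponds to the configuration reported here, and the second to the guest's view if KVM passed TAA_NO through.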

A probable fix is to export TAA_NO to guests. Alternatively, KVM can choose not to export MSR TSX_CTRL.

Guests can't use MSR TSX_CTRL to enable TSX anyway, but I think it was exported to guests to support some migration scenarios:

  https://lore.kernel.org/lkml/20210129101912.1857809-1-pbonzini@redhat.com/

---
Setup info:

ICX HOST configuration:

Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           106
Model name:                      Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz
Stepping:                        6

Vulnerability Mds:               Not affected
Vulnerability Tsx async abort:   Not affected

//TSX feature flag not present on host
$ grep rtm /proc/cpuinfo
$


GUEST info:

Launch kvm/qemu guest with "-cpu host" and guest kernel parameter "tsx=on"

"rtm" shows up in /proc/cpuinfo

# rdmsr -a 0x122
0
0
0
0
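
For reference, MSR 0x122 is IA32_TSX_CTRL. A small sketch for decoding the value printed by rdmsr, assuming the usual bit layout (bit 0 = RTM_DISABLE, bit 1 = TSX_CPUID_CLEAR); decode_tsx_ctrl() is a hypothetical helper, not an existing tool:

```python
TSX_CTRL_RTM_DISABLE = 1 << 0  # bit 0: RTM transactions are forced to abort
TSX_CTRL_CPUID_CLEAR = 1 << 1  # bit 1: RTM/HLE bits are hidden in CPUID


def decode_tsx_ctrl(value: int) -> dict:
    """Decode an IA32_TSX_CTRL (MSR 0x122) value into its two control bits."""
    return {
        "rtm_disable": bool(value & TSX_CTRL_RTM_DISABLE),
        "cpuid_clear": bool(value & TSX_CTRL_CPUID_CLEAR),
    }


# rdmsr prints hex without a prefix, so parse the text as base 16.
print(decode_tsx_ctrl(int("0", 16)))
```

A value of 0 on every vCPU means neither bit is set from the guest's point of view, which is consistent with "rtm" showing up in the guest's /proc/cpuinfo.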

// Guest sysfs shows mitigation being deployed.
[root@vm-fedora-35 ~]# grep . /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
Mitigation: Clear CPU buffers; SMT Host state unknown
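
If it helps with reproducing, the two checks above (the "rtm" flag and the sysfs state) can be collected in one go with a short script. This is just a convenience sketch; has_cpu_flag() is a hypothetical helper, and the script only reads the standard /proc/cpuinfo and sysfs paths already shown in this report.

```python
from pathlib import Path

TAA_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/tsx_async_abort")
CPUINFO = Path("/proc/cpuinfo")


def has_cpu_flag(cpuinfo_text: str, flag: str) -> bool:
    """Return True if `flag` is listed on the first 'flags' line of cpuinfo."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return flag in value.split()
    return False


if __name__ == "__main__":
    # Run on both host and guest and compare the two reports.
    if CPUINFO.exists():
        print("rtm flag present:", has_cpu_flag(CPUINFO.read_text(), "rtm"))
    if TAA_SYSFS.exists():
        print("tsx_async_abort:", TAA_SYSFS.read_text().strip())
```

On the setup described here, the host is expected to report no "rtm" flag and "Not affected", while the guest reports "rtm" and the Clear CPU buffers mitigation.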

Thanks,
Pawan