Bug 216102

Summary: kernel BUG at arch/x86/kernel/traps.c:252 caused by CFI (Intel CET) and KVM
Product: Platform Specific/Hardware
Reporter: Laurent Bonnaud (L.Bonnaud)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: NEW
Severity: normal
CC: basjetimmer, bp, darose, jason.nader, johannes.penssel, kernel, mail, rawatdeepakg, saxophonebritish, simon, tiwarigeeta027, viktor.a.voronin
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.18.3
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Full dmesg output, dmesg output

Description Laurent Bonnaud 2022-06-09 12:18:58 UTC
Hi,

I am running Qemu on an Ubuntu 22.04 system and kernel 5.18.3.

The CPU is an 11th Gen Intel(R) Core(TM) i7-1165G7 and therefore has CFI (Intel CET), and support for this has been enabled in 5.18.x kernels.

The error I am seeing is:

[  206.052479] traps: Missing ENDBR: cmpl_eax_edx+0x0/0x10 [kvm]
[  206.052516] ------------[ cut here ]------------
[  206.052517] kernel BUG at arch/x86/kernel/traps.c:252!
[  206.052521] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  206.052523] CPU: 1 PID: 7725 Comm: qemu-system-x86 Tainted: G           OE     5.18.3-051803-generic #202206090934
[  206.052525] Hardware name: TUXEDO TUXEDO InfinityBook S 15 Gen6/NS50_70MU, BIOS 1.07.15RTR2 04/11/2022
[  206.052525] RIP: 0010:exc_control_protection+0xd7/0xe0
[  206.052529] Code: 94 24 80 00 00 00 be f9 00 00 00 48 c7 c7 60 a5 9b a7 e8 3c 57 2e ff e9 70 ff ff ff 48 c7 c7 a2 a5 9b a7 e8 51 b2 f5 ff 0f 0b <0f> 0b 0f 1f 80 00 00 00 00 66 0f 1f 00 55 48 89 e5 41 55 41 54 49
[  206.052530] RSP: 0018:ffffbe13c4157b70 EFLAGS: 00010002
[  206.052532] RAX: 0000000000000031 RBX: 0000000000000001 RCX: 0000000000000000
[  206.052533] RDX: 0000000000000000 RSI: ffff9cf6514a15a0 RDI: ffff9cf6514a15a0
[  206.052534] RBP: ffffbe13c4157b88 R08: 0000000000000031 R09: 694d203a73706172
[  206.052535] R10: 4520676e69737369 R11: 4d203a7370617274 R12: ffffbe13c4157b98
[  206.052536] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  206.052537] FS:  00007f1336f3a640(0000) GS:ffff9cf651480000(0000) knlGS:0000000000000000
[  206.052538] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  206.052539] CR2: 0000000000000000 CR3: 000000030f0f8006 CR4: 0000000000f72ee0
[  206.052540] PKRU: 55555554
[  206.052541] Call Trace:
[  206.052542]  <TASK>
[  206.052543]  asm_exc_control_protection+0x22/0x30
[  206.052545] RIP: 0010:cmpl_eax_edx+0x0/0x10 [kvm]
[  206.052566] Code: d0 c3 cc 0f 1f 80 00 00 00 00 f3 0f 1e fa 38 d0 c3 cc 0f 1f 84 00 00 00 00 00 66 0f 1f 00 66 39 d0 c3 cc 0f 1f 80 00 00 00 00 <66> 0f 1f 00 39 d0 c3 cc 0f 1f 84 00 00 00 00 00 66 0f 1f 00 48 39
[  206.052567] RSP: 0018:ffffbe13c4157c48 EFLAGS: 00010287
[  206.052568] RAX: 00000000ffffffff RBX: 0000000000000001 RCX: 0000000000000000
[  206.052569] RDX: 0000000054464269 RSI: ffffffffc0d75c10 RDI: 0000000000000285
[  206.052570] RBP: ffffbe13c4157c50 R08: ffff9cf06db00000 R09: 0000000000000002
[  206.052571] R10: ffff9cf06db00000 R11: 0000000000000000 R12: ffff9cf06db00000
[  206.052572] R13: ffffffffc0dbd020 R14: 0000000000000001 R15: 0000000000000000
[  206.052573]  ? cmpw_ax_dx+0x10/0x10 [kvm]
[  206.052594]  ? fastop+0x5d/0xa0 [kvm]
[  206.052614]  x86_emulate_insn+0x7b6/0xe90 [kvm]
[  206.052633]  x86_emulate_instruction+0x4ce/0x830 [kvm]
[  206.052653]  ? kvm_arch_vcpu_load+0x80/0x230 [kvm]
[  206.052672]  complete_emulated_mmio+0x295/0x2f0 [kvm]
[  206.052690]  kvm_arch_vcpu_ioctl_run+0x2d7/0x5b0 [kvm]
[  206.052708]  kvm_vcpu_ioctl+0x29c/0x6f0 [kvm]
[  206.052722]  ? __seccomp_filter+0x4a/0x4b0
[  206.052726]  ? __fget_light+0xa7/0x130
[  206.052728]  __x64_sys_ioctl+0x93/0xd0
[  206.052730]  do_syscall_64+0x5d/0x90
[  206.052733]  ? kvm_on_user_return+0x88/0xe0 [kvm]
[  206.052751]  ? fire_user_return_notifiers+0x46/0x70
[  206.052753]  ? exit_to_user_mode_prepare+0x37/0xb0
[  206.052756]  ? syscall_exit_to_user_mode+0x2a/0x50
[  206.052758]  ? do_syscall_64+0x6d/0x90
[  206.052759]  ? do_syscall_64+0x6d/0x90
[  206.052760]  ? do_syscall_64+0x6d/0x90
[  206.052761]  ? do_syscall_64+0x6d/0x90
[  206.052762]  ? asm_sysvec_reschedule_ipi+0xe/0x20
[  206.052763]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  206.052765] RIP: 0033:0x7f1340f1aaff
[  206.052766] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[  206.052767] RSP: 002b:00007f1336f39460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  206.052768] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f1340f1aaff
[  206.052769] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000010
[  206.052770] RBP: 000055a0f6217d40 R08: 000055a0f3df40f0 R09: 00000000000000ff
[  206.052771] R10: 000055a0f68ffdc0 R11: 0000000000000246 R12: 0000000000000000
[  206.052772] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000002
[  206.052773]  </TASK>
[  206.052774] Modules linked in: rfcomm nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter nvme_fabrics xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc cmac algif_hash algif_skcipher af_alg bnep overlay snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel pmt_telemetry pmt_class soundwire_generic_allocation intel_rapl_msr soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_codec_hdmi snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core snd_hda_codec_realtek snd_compress snd_hda_codec_generic ac97_bus ledtrig_audio snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp snd_seq_midi_event coretemp intel_cstate iwlmvm binfmt_misc
[  206.052800]  input_leds mac80211 joydev snd_rawmidi libarc4 serio_raw cmdlinepart btusb snd_seq spi_nor iwlwifi wmi_bmof btrtl btbcm uvcvideo videobuf2_vmalloc efi_pstore nls_iso8859_1 mtd ee1004 btintel btmtk videobuf2_memops iwlmei snd_seq_device snd_timer videobuf2_v4l2 snd cfg80211 bluetooth videobuf2_common soundcore ecdh_generic mei hid_multitouch videodev ecc mc processor_thermal_device_pci_legacy processor_thermal_device processor_thermal_rfim intel_vsec ucsi_acpi processor_thermal_mbox processor_thermal_rapl typec_ucsi intel_rapl_common intel_soc_dts_iosf typec igen6_edac int3403_thermal int340x_thermal_zone mac_hid clevo_acpi(OE) tuxedo_io(OE) int3400_thermal intel_hid tuxedo_keyboard(OE) acpi_pad acpi_thermal_rel sparse_keymap sch_fq_codel kvm_intel kvm nf_tables ipmi_devintf ipmi_msghandler nfnetlink msr drivetemp ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c i915 drm_buddy i2c_algo_bit ttm drm_dp_helper cec rc_core drm_kms_helper
[  206.052833]  syscopyarea sysfillrect usbhid crct10dif_pclmul hid_generic rtsx_pci_sdmmc crc32_pclmul sysimgblt fb_sys_fops nvme i2c_i801 spi_intel_pci ghash_clmulni_intel aesni_intel psmouse drm i2c_smbus spi_intel rtsx_pci crypto_simd cryptd r8169 thunderbolt nvme_core realtek intel_lpss_pci intel_lpss xhci_pci idma64 xhci_pci_renesas wmi i2c_hid_acpi i2c_hid hid video pinctrl_tigerlake
[  206.052847] ---[ end trace 0000000000000000 ]---
[  206.249710] RIP: 0010:exc_control_protection+0xd7/0xe0
[  206.249713] Code: 94 24 80 00 00 00 be f9 00 00 00 48 c7 c7 60 a5 9b a7 e8 3c 57 2e ff e9 70 ff ff ff 48 c7 c7 a2 a5 9b a7 e8 51 b2 f5 ff 0f 0b <0f> 0b 0f 1f 80 00 00 00 00 66 0f 1f 00 55 48 89 e5 41 55 41 54 49
[  206.249714] RSP: 0018:ffffbe13c4157b70 EFLAGS: 00010002
[  206.249715] RAX: 0000000000000031 RBX: 0000000000000001 RCX: 0000000000000000
[  206.249716] RDX: 0000000000000000 RSI: ffff9cf6514a15a0 RDI: ffff9cf6514a15a0
[  206.249717] RBP: ffffbe13c4157b88 R08: 0000000000000031 R09: 694d203a73706172
[  206.249718] R10: 4520676e69737369 R11: 4d203a7370617274 R12: ffffbe13c4157b98
[  206.249718] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  206.249719] FS:  00007f1336f3a640(0000) GS:ffff9cf651480000(0000) knlGS:0000000000000000
[  206.249720] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  206.249721] CR2: 0000000000000000 CR3: 000000030f0f8006 CR4: 0000000000f72ee0
[  206.249722] PKRU: 55555554
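
For readers following the trace: ENDBR64 is encoded as the bytes f3 0f 1e fa, while the "Code:" line for cmpl_eax_edx shows 66 0f 1f 00 (a 4-byte NOP) at the faulting entry point. A minimal sketch of that check in C follows. The byte arrays are copied by hand from the dump above, and the helper name starts_with_endbr64 and the array names are hypothetical, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ENDBR64 is encoded as the four bytes f3 0f 1e fa. */
static const uint8_t ENDBR64[4] = { 0xf3, 0x0f, 0x1e, 0xfa };

/* Hypothetical helper: returns 1 if the code at `code` begins with ENDBR64. */
static int starts_with_endbr64(const uint8_t *code, size_t len)
{
	assert(code != NULL);
	return len >= sizeof(ENDBR64) &&
	       memcmp(code, ENDBR64, sizeof(ENDBR64)) == 0;
}

/* Bytes starting at the <66> marker (the faulting RIP, cmpl_eax_edx+0x0). */
static const uint8_t cmpl_entry[] = {
	0x66, 0x0f, 0x1f, 0x00, 0x39, 0xd0, 0xc3
};

/* An earlier sequence from the same "Code:" line, for comparison. */
static const uint8_t earlier_entry[] = {
	0xf3, 0x0f, 0x1e, 0xfa, 0x38, 0xd0, 0xc3
};
```

With these inputs, starts_with_endbr64() returns 0 for cmpl_entry and 1 for earlier_entry, which is consistent with the "Missing ENDBR" message for cmpl_eax_edx.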
Comment 1 Laurent Bonnaud 2022-06-09 12:21:10 UTC
Created attachment 301135 [details]
Full dmesg output
Comment 2 Victor 2022-07-05 11:01:55 UTC
Created attachment 301336 [details]
dmesg output

Same issue here. I can't start the virtual machine since updating to 5.18.x. Linux Mint 20.3, 12th Gen Intel(R) Core(TM) i9-12900K. I have also attached a full dmesg output.
Comment 3 Johannes Penßel 2022-07-29 17:39:40 UTC
This affects me as well on Gentoo with 5.19-rc8 / Core i5-1135G7. GNOME Boxes (both the native and the Flatpak version) freezes the entire system as soon as I boot up Windows 10/11 in a VM. The Gentoo installation medium seems to work fine, though. Adding 'ibt=off' to the kernel command line works around this issue.
Comment 4 basjetimmer 2023-01-07 13:19:42 UTC
I'm still seeing this on an Intel 13600K running 6.1.3. Seeing as it affects more and more CPUs and 'ibt' is about to be enabled by default for 6.2, I expect more and more people to run into this.

A related Arch bug report points to the following commit as the cause: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6649fa876da4c505548b8e8945a6fc48e62e427c
Comment 6 Borislav Petkov 2023-06-15 09:53:35 UTC
Can folks pls try to reproduce this with latest Linus master, 6.4-rc6 currently?

Also pls upload kernel .config.

Thx.
Comment 7 Laurent Bonnaud 2023-06-15 11:34:39 UTC
My system is currently running kernel 6.3.7, and I am only seeing warning messages in the kernel logs:

[494097.166700] x86/split lock detection: #AC: qemu-system-x86/537034 took a split_lock trap at address: 0xfffff8006ca1e643
[498654.325731] x86/split lock detection: #AC: qemu-system-x86/554439 took a split_lock trap at address: 0xfffff80556a1e643
[499673.397842] x86/split lock detection: #AC: qemu-system-x86/556825 took a split_lock trap at address: 0x3ff2624d
[499957.351816] x86/split lock detection: #AC: qemu-system-x86/557489 took a split_lock trap at address: 0x3ff2624d
Comment 8 Borislav Petkov 2023-06-15 12:38:55 UTC
That's basically saying that you have locks in your kernel or a module which are split between two cachelines. This should not happen with a kernel build done with the usual toolchains used by distros.

Sounds like you're using some out-of-tree module which got built by some weird compiler.
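
To illustrate what "split between two cachelines" means: a split lock is a locked (atomic) access whose operand straddles a cache-line boundary. The sketch below only does the address arithmetic and performs no locked access itself. It assumes 64-byte cache lines, and the helper name spans_cache_lines is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: returns 1 if an access of `size` bytes starting at
 * `addr` touches two different cache lines, 0 if it fits in one.
 */
static int spans_cache_lines(uintptr_t addr, size_t size)
{
	assert(size > 0 && size <= CACHE_LINE_SIZE);

	uintptr_t first = addr / CACHE_LINE_SIZE;
	uintptr_t last = (addr + size - 1) / CACHE_LINE_SIZE;

	return first != last;
}
```

For example, an 8-byte operand at offset 60 covers bytes 60..67 and therefore crosses from the first line into the second, while the same operand at offset 56 or 64 stays within one line.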
Comment 9 Laurent Bonnaud 2023-06-15 17:25:08 UTC
> That's basically saying that you have locks in your kernel or a module

I understand from the error message that the problem is in a userspace process (qemu-system-x86) and not in kernel code.
Comment 10 Thomas Gleixner 2023-06-15 20:44:49 UTC
On Thu, Jun 15 2023 at 17:25, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216102
>
> --- Comment #9 from Laurent Bonnaud (L.Bonnaud@laposte.net) ---
>
>> That's basically saying that you have locks in your kernel or a module
>
> I understand from the error message that the problem is in a userspace process
> (qemu-system-x86) and not in kernel code.

Well it's not necessarily the qemu process itself. Guest split lock
detection is ending up in the same error path. And that's likely the
guest because:

 [494097.166700] x86/split lock detection: #AC: qemu-system-x86/537034 took a
 split_lock trap at address: 0xfffff8006ca1e643

which is clearly a kernel address, but this can't be on the host because
the host would end up with a different error message and die.

Laurent, which kernel is running in your guest? Something ancient?

Thanks,

        tglx
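
The "clearly a kernel address" reasoning above can be sketched in C as a check for a canonical upper-half address. This assumes 48-bit virtual addresses (4-level paging), where upper-half addresses have bits 63..47 all set. The function names are hypothetical, and the two sample values are taken from the log lines in comment 7.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses. An address is in the canonical
 * upper half (where kernels are conventionally mapped) when bits 63..47
 * are all ones.
 */
static int is_upper_half(uint64_t addr)
{
	return (addr >> 47) == 0x1ffffULL;
}

/* Exercise the check on two addresses from the split_lock warnings. */
static void check_logged_addresses(void)
{
	assert(is_upper_half(0xfffff8006ca1e643ULL));  /* upper half */
	assert(!is_upper_half(0x3ff2624dULL));         /* lower half */
}
```

The check says nothing about whether an address belongs to the host or the guest. That distinction is what the rest of the comment above addresses.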
Comment 11 Laurent Bonnaud 2023-06-16 06:34:45 UTC
> Laurent, which kernel is running in your guest?

The guest OS is Windows 10.
Comment 12 Thomas Gleixner 2023-06-16 07:48:14 UTC
On Fri, Jun 16 2023 at 06:34, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216102
>
> --- Comment #11 from Laurent Bonnaud (L.Bonnaud@laposte.net) ---
>
>> Laurent, which kernel is running in your guest?
>
> The guest OS is Windows 10.

Ok. Unfortunately the dmesg output does not differentiate between host
user space and guest originated split lock access.

The below patch for the host kernel makes it more obvious where this
originates from. I'm going to polish that up and post it on LKML too.

The problem itself is mostly harmless. However, unprivileged split lock
access can be used for a DoS attack on a machine, because such an access
has to fully lock the bus, which causes a tremendous slowdown for the
whole system if, e.g., done in a loop.

Thanks,

        tglx

---
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 1c4639588ff9..f3b88a87efd9 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1343,14 +1343,14 @@ static int splitlock_cpu_offline(unsigned int cpu)
 	return 0;
 }
 
-static void split_lock_warn(unsigned long ip)
+static void split_lock_warn(unsigned long ip, bool guest)
 {
 	struct delayed_work *work;
 	int cpu;
 
 	if (!current->reported_split_lock)
-		pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
-				    current->comm, current->pid, ip);
+		pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx guest: %d\n",
+				    current->comm, current->pid, ip, guest);
 	current->reported_split_lock = 1;
 
 	if (sysctl_sld_mitigate) {
@@ -1382,7 +1382,7 @@ static void split_lock_warn(unsigned long ip)
 bool handle_guest_split_lock(unsigned long ip)
 {
 	if (sld_state == sld_warn) {
-		split_lock_warn(ip);
+		split_lock_warn(ip, true);
 		return true;
 	}
 
@@ -1425,7 +1425,7 @@ bool handle_user_split_lock(struct pt_regs *regs, long error_code)
 {
 	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
 		return false;
-	split_lock_warn(regs->ip);
+	split_lock_warn(regs->ip, false);
 	return true;
 }
Comment 13 Laurent Bonnaud 2023-06-16 16:05:32 UTC
> The below patch for the host kernel makes it more obvious where this
> originates from.

Thanks for the patch!

> I'm going to polish that up and post it on LKML too.

I am looking forward to testing it in a released kernel...
Comment 14 Laurent Bonnaud 2023-06-16 16:07:42 UTC
BTW, I did not state it explicitly, but the original issue (kernel BUG caused by CFI) is fixed in recent kernels.

Thanks to whoever fixed it!