Bug 211467 - Regression affecting 32->64 bit SYSENTER on AMD
Summary: Regression affecting 32->64 bit SYSENTER on AMD
Status: RESOLVED IMPLEMENTED
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-29 20:50 UTC by jonny5532
Modified: 2021-03-15 19:24 UTC (History)
1 user (show)

See Also:
Kernel Version: all since 5.8-rc1
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Simple patch which appears to fix it (706 bytes, patch)
2021-01-29 21:47 UTC, jonny5532
Details | Diff
Patch with whitespace fix and SOB (555 bytes, patch)
2021-02-01 10:31 UTC, jonny5532
Details | Diff

Description jonny5532 2021-01-29 20:50:12 UTC
For obscure legacy reasons I am running MacOS 10.6.8 in a qemu VM, and since commit fede8076aab4c2280c673492f8f7a2e87712e8b4 the guest crashes 30 seconds or so after boot.

It seems that the crash is triggered by starting a i386 program within the x86_64 XNU kernel, which makes syscalls using the SYSENTER instruction. The emulate.c change within the problematic commit truncates the intended EIP (0xffffff80002e3ad0 corresponding to the _hi64_sysenter symbol in the guest's mach_kernel) down to 32 bits, and the guest then crashes - commenting out the truncation fixes the problem.

This doesn't seem be a problem on Intel VT since there SYSENTER isn't trapped by KVM, but it fails on an AMD Ryzen 5600X. It is also presumably not a problem with Linux guests, since the EIP used for Linux SYSENTER syscalls fits within 32 bits (I think).

I don't know what problem the truncation is intended to fix, but I assume it isn't meant to interfere with SYSENTER. I've looked through all the documentation I can find and am not certain whether SYSENTER is specified to copy the full 64 bits from IA32_SYSENTER_EIP when used in 32 bit mode on a CPU with 64 bit support, but I assume it is if the 32->64bit XNU SYSENTER syscall is intended to work.

Also, ctxt->mode is still set to X86EMUL_MODE_PROT32 at the point that the truncation is done, despite em_sysenter having updated CS to enter long mode. Perhaps em_sysenter should use assign_eip_far to set ctxt->_eip instead, to ensure that the mode is updated? (if that is indeed correct behaviour?). The truncation would then not take place and the problem would not occur.
Comment 1 jonny5532 2021-01-29 21:47:03 UTC
Created attachment 294993 [details]
Simple patch which appears to fix it

This patch stops my VM crashing without removing the truncation.

It is based on the assumption that a SYSENTER which changes the CS flags to long mode (if applicable) should also change the ctxt->mode to PROT64. I don't know if that is correct (or safe) though.
Comment 2 Sean Christopherson 2021-01-29 23:38:47 UTC
LOL, this is awesome.  Presumably you're also exposing an Intel CPU model to the guest, otherwise KVM would inject a #UD on SYSENTER instead of emulating.

Anyways, hilarity aside, your patch looks correct as Intel CPUs unconditionally transition to 64-bit mode on SYSENTER from compatibility mode.  Want to send a formal patch?  If you're not set up to send a patch, I'd be happy to write a changelog if you want to provide a Signed-off-by.
Comment 3 jonny5532 2021-02-01 10:31:57 UTC
Created attachment 295027 [details]
Patch with whitespace fix and SOB

Thanks for confirming, here's the same patch again without the missing newline and with my SOB tacked on the bottom. I am indeed not set up for sending patches so would appreciate it if you could do that bit.

I had a look through emulate.c to see if I could foresee any unintended side effects - it seems that SYSEXIT already handles the potential transition back from 64->32 bits. 

I do note that some of the transitions to PROT64 are wrapped in "#ifdef CONFIG_X86_64 ... #endif", for example in em_syscall. However that looks pretty dubious in itself - if em_syscall gets run with the guest in long mode, on a 32 bit kernel, then it'll just fail to update the EIP but carry on regardless. Probably an unlikely scenario however (KVM on AMD has only ever worked on 64 bit procs as I understand).
Comment 4 Sean Christopherson 2021-02-01 21:16:37 UTC
Running a 64-bit guest with a 32-bit host kernel is firmly unsupported, emulator issues are but one of many things that would completely break.
Comment 5 jonny5532 2021-03-15 19:24:28 UTC
I am now running with 5.11 (which contains the fix) and it seems to be working fine now.

Thanks all!

Note You need to log in before you can comment on or make changes to this bug.