Bug 215524 - Unable to set $xmm2-9 registers but able to set $xmm0-1,10-15 with ptrace syscall
Summary: Unable to set $xmm2-9 registers but able to set $xmm0-1,10-15 with ptrace sys...
Status: RESOLVED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64 (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-01-24 01:52 UTC by Luís Ferreira
Modified: 2022-02-21 00:52 UTC (History)
6 users (show)

See Also:
Kernel Version: 5.1x
Tree: Mainline
Regression: Yes


Attachments

Description Luís Ferreira 2022-01-24 01:52:05 UTC
Writing some XMM registers on a process with `ptrace` syscall with `PTRACE_SETREGSET` request and using `NT_FPREGSET` is broken. Although, when using `NT_X86_XSTATE` works just fine. Also, getting the registers with `PTRACE_GETREGSET` is fine too.

This can be considered a blocker for old hardware that lacks AVX extensions and simply fail to call ptrace with `NT_X86_XSTATE`. This particularly failed the testsuite of LLDB on my server.

I've tested this on Linux 5.16.2, Linux LTS 5.15.16 and Linux LTS 5.10.93. Only the last one (5.10.93) succeeded with `NT_FPREGSET`. I tested this in a clean installation of Arch Linux with the official packages and backported version of Linux LTS, available in AUR repository.

Note: I didn't test `NT_FPREGSET` on recent hardware (hardware with AVX).

From my investigation on the issue and the kernel source code inspection, I believe that this was introduced here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/kernel/fpu/regset.c?id=6164331d15f7d912fb9369245368e9564ea49813

I'm going to leave a reference to some issues/patches on GDB and LLDB debuggers to either reproduce the issue and see some logs I created with strace:
- https://reviews.llvm.org/D117928
- https://sourceware.org/bugzilla/show_bug.cgi?id=28803
Comment 1 Borislav Petkov 2022-01-24 10:30:56 UTC
Just to confirm I've got the right reproducer:

when I run

$ gdb /bin/bash -batch -x gdbcmds

with gdbcmds containing:

---
r -c 'kill -SIGILL $$'
p $xmm7.uint128
echo Setting xmm7 to 35322350018591\n
set $xmm7.uint128 = 35322350018591
p $xmm7.uint128
---

I get on the core2duo VM:

Program received signal SIGILL, Illegal instruction.
0x00007ffff79d8a97 in kill () from /lib/x86_64-linux-gnu/libc.so.6
$1 = 0x20202020202020202020202020202020
Setting xmm7 to 35322350018591
$2 = 0

while on a newer one, it would properly set the register:

Program received signal SIGILL, Illegal instruction.
0x00007ffff79d8a97 in kill () from /lib/x86_64-linux-gnu/libc.so.6
$1 = 281474976710655
Setting xmm7 to 35322350018591
$2 = 35322350018591

Right?
Comment 2 Luís Ferreira 2022-01-24 17:37:47 UTC
Right, that is a reproduce of my issue.

The difference in the hardware is due to the fact that GDB/LLDB does different `ptrace` calls. Probably the same call fails on both hardware, but I can't confirm that. The difference is that in old hardware can't call `ptrace` with `NT_X86_XSTATE`. I don't have a development environment yet setup, but what are your thoughts about the  introduced `memset` on xmm8-15 ? Is that correct offsets and what is the rationale behind it? I think that is the problematic code.
Comment 3 Borislav Petkov 2022-01-24 18:46:11 UTC
> but what are your thoughts about the  introduced `memset` on xmm8-15

Looks like the offset is wrong. Does that fix it?

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 437d7c930c0b..001a4f552509 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -118,7 +118,7 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
 
        /* Clear xmm8..15 */
        BUILD_BUG_ON(sizeof(fpu->__fpstate.regs.fxsave.xmm_space) != 16 * 16);
-       memset(&fpu->fpstate->regs.fxsave.xmm_space[8], 0, 8 * 16);
+       memset(&fpu->fpstate->regs.fxsave.xmm_space[8 * 4], 0, 8 * 16);
 
        /* Mark FP and SSE as in use when XSAVE is enabled */
        if (use_xsave())

as to the rationale, amluto needs to explain himself what he was thinking. I guess this was meant to be for something 32-bit but it isn't the case with this reproducer...
Comment 4 Borislav Petkov 2022-01-24 19:49:04 UTC
In any case, this particular reproducer does set regs 4 times - dunno why gdb does this, maybe there's a reason:

[   26.067862] arch_ptrace: request: 0xd
[   26.068181] copy_regset_from_user: setno: 0

that's PTRACE_SETREGS with REGSET_GENERAL. That happens two more times and the 4th and last one does:

[   26.118536] arch_ptrace: request: 0xf
[   26.118728] copy_regset_from_user: setno: 1

and that's PTRACE_SETFPREGS with REGSET_FP. And that last one would lead to running xfpregs_set().
Comment 5 Luís Ferreira 2022-01-25 04:43:29 UTC
> Looks like the offset is wrong. Does that fix it?

No. Although the offsets now match with what is documented, xmm8-xmm15 range doesn't work. It is worth noting that NT_FPREGSET is not i686 specific (at least ptrace documentation doesn't enforce that), so on x86_64 the xmm8-xmm15 registers are available too.

I made a patch removing that call. I recompiled the mainline kernel and tested it on my machine and it seems to work fine. I can't reference this on patchwork but added your email as CC.

> In any case, this particular reproducer does set regs 4 times - dunno why gdb
> does this, maybe there's a reason:

Well I'm not into GDB flow, but as far as I know, LLDB only does one single call  on writing according to strace. Anyway, how is that relevant? Is there anything I'm missing? One thing I know about LLDB possibly making multiple ptrace calls to read registers instead of writing is the fact that have a fallback system to discover if the system supports XSTATE or FXSTATE. Probably happening on GDB too? To explain a bit more, they try to call ptrace with NT_X86_XSTATE, if it fails, they fallback to another ptrace call but with NT_FPREGSET. I think this is because NT_X86_XSTATE is specific for XSTATE instead of FXSTATE (that also includes YMM registers for AVX, hence the hardware specific part).
Comment 6 Luís Ferreira 2022-01-25 04:46:37 UTC
My bad on the following:

FXSTATE -> FXSAVE
XSTATE -> XSAVE
Comment 8 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-02-08 12:18:50 UTC
I'm tracking this issue and it seems nothing happened for a while. What's the status here? Does the patch need testing?
Comment 9 Borislav Petkov 2022-02-08 12:23:42 UTC
Yes it does. I'll try to find some time to whip up my reproducer again and run it unless Luis beats me to it.

Thx.
Comment 10 Luís Ferreira 2022-02-09 04:13:37 UTC
Due to lack of time, I haven't touched this. I plan to test this, but I don't have a good enough environment to test different kernel versions efficiently, so I have to either set up one or do a full build inside my libvirt VM. I only tested my previous patch on x86_64, not on i686, so Andy's patch might require extra steps to reproduce it properly. I will try to take a look into this tomorrow and give some feedback.
Comment 11 Borislav Petkov 2022-02-18 10:55:40 UTC
https://git.kernel.org/tip/44cad52cc14ae10062f142ec16ede489bccf4469

Will go to Linus soon.
Comment 12 Luís Ferreira 2022-02-21 00:52:26 UTC
Thanks for the effort! I just tested on my environment to reproduce my issue and can confirm it is fixed :)

Note You need to log in before you can comment on or make changes to this bug.