Bug 211151
Summary: | WARNING: CPU: 1 PID: 101841 at arch/powerpc/mm/fault.c:229 __do_page_fault+0x6b4/0xcd0 | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Zorro Lang (zlang) |
Component: | PPC-64 | Assignee: | Jens Axboe (axboe) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | axboe, christophe.leroy, michael |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | v5.11-rc3+ | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Zorro Lang
2021-01-12 09:57:25 UTC
Just noticed this one. Seems to be because the powerpc fault code does this: if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, is_write))) return SIGSEGV; where is_user is based on user_mod(regs) only. This doesn't work for io_uring, where the helper thread can assume the user app identity and could perform this fault just fine. Not sure what the best fix is, would have to ask the powerpc folks. Something basic like the below might make it work, but this is really an arch question. diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 8961b44f350c..2176f9ffebcc 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -395,7 +395,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, struct mm_struct *mm = current->mm; unsigned int flags = FAULT_FLAG_DEFAULT; int is_exec = TRAP(regs) == 0x400; - int is_user = user_mode(regs); + int is_user = user_mode(regs) || current->flags & PF_IO_WORKER; int is_write = page_fault_is_write(error_code); vm_fault_t fault, major = 0; bool kprobe_fault = kprobe_page_fault(regs, 11); (In reply to Jens Axboe from comment #1) > Just noticed this one. Seems to be because the powerpc fault code does > this: > > if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, > is_write))) > return SIGSEGV; > > where is_user is based on user_mod(regs) only. This doesn't work for > io_uring, where the helper thread can assume the user app identity and > could perform this fault just fine. > > Not sure what the best fix is, would have to ask the powerpc folks. > Something basic like the below might make it work, but this is really an > arch question. > > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c > index 8961b44f350c..2176f9ffebcc 100644 > --- a/arch/powerpc/mm/fault.c > +++ b/arch/powerpc/mm/fault.c > @@ -395,7 +395,7 @@ static int __do_page_fault(struct pt_regs *regs, > unsigned long address, > struct mm_struct *mm = current->mm; > unsigned int flags = FAULT_FLAG_DEFAULT; > int is_exec = TRAP(regs) == 0x400; > - int is_user = user_mode(regs); > + int is_user = user_mode(regs) || current->flags & PF_IO_WORKER; > int is_write = page_fault_is_write(error_code); > vm_fault_t fault, major = 0; > bool kprobe_fault = kprobe_page_fault(regs, 11); Thanks Jens, I just changed this bug Component to "PPC-64". Feel free to cc powerpc folks if you know anyone about that, or you might like to send your patch to powerpc maillist to ask them directly. (I can help to ask too, if you would not like to take that time.) I actually think the patch should be even simpler, something ala: diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 8961b44f350c..5a4d6af04c99 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -417,9 +417,12 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, /* * The kernel should never take an execute fault nor should it * take a page fault to a kernel address or a page fault to a user - * address outside of dedicated places + * address outside of dedicated places. Use mm to decide if this is + * valid or not, it's perfectly legitimate to have a kernel thread + * take a fault as long as it's performed kthread_use_mm() prior. An + * example of that would be io_uring helper threads. */ - if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, is_write))) + if (unlikely(!mm && bad_kernel_fault(regs, error_code, address, is_write))) return SIGSEGV; /* Feel free to send out a feeler with the patch, would not mind someone else driving this and I have no access to test any of this. Seems like you do, perhaps you can test the patch as well? (In reply to Jens Axboe from comment #4) > Feel free to send out a feeler with the patch, would not mind someone else > driving this and I have no access to test any of this. Seems like you do, > perhaps you can test the patch as well? Yes, I have test env. which can 100% reproduce this issue. Let me check if the latest mainline linux fixed this bug, last time I tested this bug on 5.11-rc4, it's rc5 now. If the bug is still there, I'll send this patch with your name to powerpc list to ask. (In reply to Jens Axboe from comment #1) > Just noticed this one. Seems to be because the powerpc fault code does > this: > > if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, > is_write))) > return SIGSEGV; > > where is_user is based on user_mod(regs) only. This doesn't work for > io_uring, where the helper thread can assume the user app identity and > could perform this fault just fine. > > Not sure what the best fix is, would have to ask the powerpc folks. > Something basic like the below might make it work, but this is really an > arch question. > I don't understand why testing is_user would be an issue. KUAP purpose it to block any unallowed access from kernel to user memory (Equivalent to SMAP on x86). So it really must be based on MSR_PR bit, that is what is_user provides. If the kernel access is legitimate, kernel should have opened userspace access then you shouldn't get this "Bug: Read fault blocked by KUAP!". As far as I understand, the fault occurs in iov_iter_fault_in_readable() which calls fault_in_pages_readable() And fault_in_pages_readable() uses __get_user() so it is a legitimate access and you really should get a KUAP fault. So the problem is somewhere else, I think you proposed patch just hides the problem, it doesn't fix it. Can you provide your vmlinux binary together with your .config ? I can't reproduce here on a Power9 bare metal machine. What hardware are you on? And you're running tests/generic/013 from xfstests from kernel.org? What defconfig? (In reply to Michael Ellerman from comment #7) > I can't reproduce here on a Power9 bare metal machine. > > What hardware are you on? > And you're running tests/generic/013 from xfstests from kernel.org? > What defconfig? Yes, clone it from: # git clone git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git And I use local.config file as below: FSTYP=xfs TEST_DEV=/dev/sda5 TEST_DIR=/mnt/xfstests/test SCRATCH_DEV=/dev/sda3 SCRATCH_MNT=/mnt/xfstests/scratch LOGWRITES_DEV=/dev/sda6 MKFS_OPTIONS="-m crc=1,finobt=1,reflink=1,rmapbt=1,bigtime=1,inobtcount=1" TEST_FS_MOUNT_OPTS="" I tested on a P9 Virtual machine which base on a P9 bare metal machine. System: Host Hypervisor PowerVM Vendor IBM Model 9009-42A Memory 8192 MB NUMA Nodes 1 CPU: Vendor IBM Model Name POWER9 (architected), altivec supported Hyper True Arch(s) ppc64le Disks: Model Size Logical sector size Physical sector size VDASD 107.37 GB / 100.00 GiB 512 bytes 512 bytes Devices: Description Type Bus Driver Virtual I/O device (vty) generic vio hvc_console Virtual I/O device (v-scsi) generic vio ibmvscsi I don't know if you enabled IO_URING? CONFIG_IO_URING=y CONFIG_IO_WQ=y To reproduce this bug, io_uring is necessary. So kernel must support it, and xfstests must be built with liburing-devel to make xfstests/ltp/fsstress to do io_uring test, then generic/013 or generic/051 or other cases which use fsstress will trigger this bug. Thanks, Zorro Fixed in 8c511eff1827 ("powerpc/kuap: Allow kernel thread to access userspace after kthread_use_mm") |