Bug 214913
Summary: | [xfstests generic/051] BUG: Kernel NULL pointer dereference on read at 0x00000108 NIP [c0000000000372e4] tm_cgpr_active+0x14/0x40 | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Zorro Lang (zlang) |
Component: | PPC-64 | Assignee: | platform_ppc-64 |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | hramrach, michael |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | mainline linux v5.15 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | .config file |
Description
Zorro Lang
2021-11-02 09:27:48 UTC
Created attachment 299403
.config file
Thanks for the report, I agree this looks like a powerpc bug not an XFS bug.

I won't have time to look at this until next week probably, unless someone beats me to it.

What CPU is this?

Does it go away if you boot with ppc_tm=off?

(In reply to Michal Suchanek from comment #3)
> What CPU is this?
>
> Does it go away if you boot with ppc_tm=off

(In reply to Michael Ellerman from comment #2)
> Thanks for the report, I agree this looks like a powerpc bug not an XFS bug.
>
> I won't have time to look at this until next week probably, unless someone
> beats me to it.

Thanks for your reply. (Un)fortunately, because Linux keeps being updated, I can't reproduce this panic on the latest mainline linux master branch now. The HEAD commit is 7ddb58cb0eca.

From 8bb7eca972ad (v5.15) to 7ddb58cb0eca (v5.15+) there are many changes, so I can't be sure which commit fixed this bug, or merely hid it. Do you know whether a known issue like this has been fixed?

Thanks,
Zorro

Sorry I don't have any idea which commit could have fixed this.

The process that crashed was "fsstress", do you know if it uses io_uring?

FYI, still hit this issue on linux 6.1.0-rc8+. And it's nearly 100% reproducible.

[ 1581.047788] run fstests generic/051 at 2022-12-10 11:28:27
[ 1582.574596] XFS (sda3): Mounting V5 Filesystem
[ 1582.638653] XFS (sda3): Ending clean mount
[ 1582.646329] XFS (sda3): User initiated shutdown received.
[ 1582.646397] XFS (sda3): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x68/0x160 [xfs] (fs/xfs/xfs_fsops.c:483). Shutting down filesystem.
[ 1582.646506] XFS (sda3): Please unmount the filesystem and rectify the problem(s)
[ 1582.692102] XFS (sda3): Unmounting Filesystem
[ 1584.011651] XFS (sda3): Mounting V5 Filesystem
[ 1584.123764] XFS (sda3): Ending clean mount
[ 1605.168286] restraintd[3598]: *** Current Time: Sat Dec 10 11:28:52 2022 Localwatchdog at: Mon Dec 12 11:03:52 2022
[ 1614.846132] XFS (sda3): Unmounting Filesystem
[ 1615.569693] XFS (sda3): Mounting V5 Filesystem
[ 1615.725272] XFS (sda3): Ending clean mount
[ 1650.793064] XFS (sda3): User initiated shutdown received.
[ 1650.793108] XFS (sda3): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xf8/0x160 [xfs] (fs/xfs/xfs_fsops.c:486). Shutting down filesystem.
[ 1650.793200] XFS (sda3): Please unmount the filesystem and rectify the problem(s)
[ 1650.801605] Kernel attempted to read user page (108) - exploit attempt? (uid: 0)
[ 1650.801625] BUG: Kernel NULL pointer dereference on read at 0x00000108
[ 1650.801638] Faulting instruction address: 0xc000000000036154
[ 1650.801652] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1650.801660] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[ 1650.801671] Modules linked in: dm_flakey dm_mod bonding tls rfkill sunrpc pseries_rng drm fuse drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto
[ 1650.801727] CPU: 0 PID: 382724 Comm: fsstress Kdump: loaded Not tainted 6.1.0-rc8+ #1
[ 1650.801739] Hardware name: IBM,8375-42A POWER9 (raw) 0x4e0202 0xf000005 of:IBM,FW940.02 (VL940_041) hv:phyp pSeries
[ 1650.801743] Kernel attempted to read user page (108) - exploit attempt? (uid: 0)
[ 1650.801748] NIP: c000000000036154 LR: c0000000006f67b4 CTR: c000000000036140
[ 1650.801755] BUG: Kernel NULL pointer dereference on read at 0x00000108
[ 1650.801759] REGS: c00000004eb7b480 TRAP: 0300 Not tainted (6.1.0-rc8+)
[ 1650.801764] Faulting instruction address: 0xc000000000036154
[ 1650.801769] MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 88004400 XER: 00000000
[ 1650.801809] CFAR: c00000000000c9d4 DAR: 0000000000000108 DSISR: 40000000 IRQMASK: 0
[ 1650.801809] GPR00: c0000000006f67b4 c00000004eb7b720 c0000000016c0600 0000000000000000
[ 1650.801809] GPR04: c000000001690ef8 0000000000000000 0000000000000000 c00000004b72a900
[ 1650.801809] GPR08: c000000001506ee8 0000000000000000 0000000000000009 0000000000000000
[ 1650.801809] GPR12: c000000000036140 c0000000051e0000 0000000000000000 00007fff96f879b0
[ 1650.801809] GPR16: 00007fff970941d0 ffffffffffffffff 0000000000000005 c00000004484a400
[ 1650.801809] GPR20: c00000004484aeb8 0000000000040100 0000000000000001 c000000001489d58
[ 1650.801809] GPR24: 00000000ffffffff c00000004eb7b8b0 0000000000000004 c0000000011531e8
[ 1650.801809] GPR28: 0000000000000108 c00000004be38400 0000000000000004 c000000001690ef8
[ 1650.801927] NIP [c000000000036154] tm_cgpr_active+0x14/0x40
[ 1650.801939] LR [c0000000006f67b4] fill_thread_core_info+0x1d4/0x290
[ 1650.801951] Call Trace:
[ 1650.801955] [c00000004eb7b720] [c0000000006f673c] fill_thread_core_info+0x15c/0x290 (unreliable)
[ 1650.801971] [c00000004eb7b7a0] [c0000000006f6fd4] fill_note_info+0x1f4/0x390
[ 1650.801984] [c00000004eb7b810] [c0000000006f71fc] elf_core_dump+0x8c/0x580
[ 1650.801997] [c00000004eb7ba00] [c0000000006fcc10] do_coredump+0x330/0xca0
[ 1650.802012] [c00000004eb7bbd0] [c000000000174f94] get_signal+0x7f4/0x8f0
[ 1650.802024] [c00000004eb7bcb0] [c000000000020d2c] do_signal+0x7c/0x330
[ 1650.802036] [c00000004eb7bd50] [c000000000022010] do_notify_resume+0xb0/0x140
[ 1650.802049] [c00000004eb7bd80] [c000000000030550] interrupt_exit_user_prepare_main+0x1d0/0x290
[ 1650.802062] [c00000004eb7bde0] [c0000000000306f4] syscall_exit_prepare+0xe4/0x1f0
[ 1650.802074] [c00000004eb7be10] [c00000000000bffc] system_call_vectored_common+0xfc/0x280
[ 1650.802089] --- interrupt: 3000 at 0x7fff96de315c
[ 1650.802099] NIP: 00007fff96de315c LR: 0000000000000000 CTR: 0000000000000000
[ 1650.802107] REGS: c00000004eb7be80 TRAP: 3000 Not tainted (6.1.0-rc8+)
[ 1650.802115] MSR: 800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE> CR: 42004404 XER: 00000000
[ 1650.802141] IRQMASK: 0
[ 1650.802141] GPR00: 00000000000000fa 00007fffc54a96a0 00007fff96f87200 0000000000000000
[ 1650.802141] GPR04: 000000000005d704 0000000000000006 0000000000000000 0000000000000000
[ 1650.802141] GPR08: 00007fff96f81f68 0000000000000000 0000000000000000 0000000000000000
[ 1650.802141] GPR12: 0000000000000000 00007fff9709b1c0 0000000000000000 00007fff96f879b0
[ 1650.802141] GPR16: 00007fff970941d0 ffffffffffffffff 0000000010030bec 00000000100152e8
[ 1650.802141] GPR20: 0000000000000000 0000000000000000 00007fffc54bdfee 0000000000000001
[ 1650.802141] GPR24: 0000000010009800 00000000100131a8 8f5c28f5c28f5c29 028f5c28f5c28f5c
[ 1650.802141] GPR28: 0000000000000006 ffffffffffffffff 00007fff97093980 000000000005d704
[ 1650.802249] NIP [00007fff96de315c] 0x7fff96de315c
[ 1650.802258] LR [0000000000000000] 0x0
[ 1650.802266] --- interrupt: 3000
[ 1650.802272] Instruction dump:
[ 1650.802279] 4bfe87d5 60000000 e8010040 38210030 ebe1fff8 7c0803a6 4e800020 7c0802a6
[ 1650.802305] 60000000 60000000 e9232aa0 38600000 <e9290108> 7929e844 79291f43 41820008
[ 1650.802330] ---[ end trace 0000000000000000 ]---
[ 1650.813469]
[ 1650.813475] Oops: Kernel access of bad area, sig: 11 [#2]
[ 1650.813480] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[ 1650.813488] Modules linked in: dm_flakey dm_mod bonding tls rfkill sunrpc pseries_rng drm fuse drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto
[ 1650.813524] CPU: 4 PID: 382723 Comm: fsstress Kdump: loaded Tainted: G D 6.1.0-rc8+ #1
[ 1650.813532] Hardware name: IBM,8375-42A POWER9 (raw) 0x4e0202 0xf000005 of:IBM,FW940.02 (VL940_041) hv:phyp pSeries
[ 1650.813537] NIP: c000000000036154 LR: c0000000006f67b4 CTR: c000000000036140
[ 1650.813541] REGS: c00000004eb4b480 TRAP: 0300 Tainted: G D (6.1.0-rc8+)
[ 1650.813546] MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 88004400 XER: 20040000
[ 1650.813562] CFAR: c00000000000c9d4 DAR: 0000000000000108 DSISR: 40000000 IRQMASK: 0
[ 1650.813562] GPR00: c0000000006f67b4 c00000004eb4b720 c0000000016c0600 0000000000000000
[ 1650.813562] GPR04: c000000001690ef8 0000000000000000 0000000000000000 c0000000437e4800
[ 1650.813562] GPR08: c000000001506ee8 0000000000000000 0000000000000009 0000000000000000
[ 1650.813562] GPR12: c000000000036140 c00000000ffcc480 0000000000000000 00007fff96f879b0
[ 1650.813562] GPR16: 00007fff970941d0 ffffffffffffffff 0000000000000005 c000000044810e00
[ 1650.813562] GPR20: c0000000448118b8 0000000000040100 0000000000000001 c000000001489d58
[ 1650.813562] GPR24: 00000000ffffffff c00000004eb4b8b0 0000000000000004 c0000000011531e8
[ 1650.813562] GPR28: 0000000000000108 c00000003235f000 0000000000000004 c000000001690ef8
[ 1650.813619] NIP [c000000000036154] tm_cgpr_active+0x14/0x40
[ 1650.813625] LR [c0000000006f67b4] fill_thread_core_info+0x1d4/0x290
[ 1650.813632] Call Trace:
[ 1650.813634] [c00000004eb4b720] [c0000000006f673c] fill_thread_core_info+0x15c/0x290 (unreliable)
[ 1650.813643] [c00000004eb4b7a0] [c0000000006f6fd4] fill_note_info+0x1f4/0x390
[ 1650.813650] [c00000004eb4b810] [c0000000006f71fc] elf_core_dump+0x8c/0x580
[ 1650.813657] [c00000004eb4ba00] [c0000000006fcc10] do_coredump+0x330/0xca0
[ 1650.813662] [c00000004eb4bbd0] [c000000000174f94] get_signal+0x7f4/0x8f0
[ 1650.813668] [c00000004eb4bcb0] [c000000000020d2c] do_signal+0x7c/0x330
[ 1650.813674] [c00000004eb4bd50] [c000000000022010] do_notify_resume+0xb0/0x140
[ 1650.813681] [c00000004eb4bd80] [c000000000030550] interrupt_exit_user_prepare_main+0x1d0/0x290
[ 1650.813687] [c00000004eb4bde0] [c0000000000306f4] syscall_exit_prepare+0xe4/0x1f0
[ 1650.813693] [c00000004eb4be10] [c00000000000bffc] system_call_vectored_common+0xfc/0x280
[ 1650.813700] --- interrupt: 3000 at 0x7fff96de315c
[ 1650.813705] NIP: 00007fff96de315c LR: 0000000000000000 CTR: 0000000000000000
[ 1650.813709] REGS: c00000004eb4be80 TRAP: 3000 Tainted: G D (6.1.0-rc8+)
[ 1650.813713] MSR: 800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE> CR: 42004404 XER: 00000000
[ 1650.813725] IRQMASK: 0
[ 1650.813725] GPR00: 00000000000000fa 00007fffc54a9b90 00007fff96f87200 0000000000000000
[ 1650.813725] GPR04: 000000000005d703 0000000000000006 0000000000000000 0000000000000000
[ 1650.813725] GPR08: 00007fff96f81f68 0000000000000000 0000000000000000 0000000000000000
[ 1650.813725] GPR12: 0000000000000000 00007fff9709b1c0 0000000000000000 00007fff96f879b0
[ 1650.813725] GPR16: 00007fff970941d0 ffffffffffffffff 0000000010030bec 00000000100152e8
[ 1650.813725] GPR20: 0000000000000000 0000000000000000 00007fffc54bdfee 0000000000000001
[ 1650.813725] GPR24: 0000000010010460 00000000100131a8 8f5c28f5c28f5c29 028f5c28f5c28f5c
[ 1650.813725] GPR28: 0000000000000006 0000000000000005 00007fff97093980 000000000005d703
[ 1650.813778] NIP [00007fff96de315c] 0x7fff96de315c
[ 1650.813782] LR [0000000000000000] 0x0
[ 1650.813785] --- interrupt: 3000
[ 1650.813788] Instruction dump:
[ 1650.813791] 4bfe87d5 60000000 e8010040 38210030 ebe1fff8 7c0803a6 4e800020 7c0802a6
[ 1650.813801] 60000000 60000000 e9232aa0 38600000 <e9290108> 7929e844 79291f43 41820008
[ 1650.813811] ---[ end trace 0000000000000000 ]---

(In reply to Michael Ellerman from comment #5)
> Sorry I don't have any idea which commit could have fixed this.
>
> The process that crashed was "fsstress", do you know if it uses io_uring?

Yes, fsstress has io_uring read/write operations.
And from the kernel .config file (as attachment), CONFIG_IO_URING=y.

On Sun Dec 11, 2022 at 11:19 PM AEST, wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=214913
>
> --- Comment #7 from Zorro Lang (zlang@redhat.com) ---
> (In reply to Michael Ellerman from comment #5)
> > Sorry I don't have any idea which commit could have fixed this.
> >
> > The process that crashed was "fsstress", do you know if it uses io_uring?
>
> Yes, fsstress has io_uring read/write operations. And from the kernel .config
> file(as attachment), the CONFIG_IO_URING=y

The task being dumped seems like it's lost its task->thread.regs. The NULL pointer is here:

int tm_cgpr_active(struct task_struct *target, const struct user_regset *regset)
{
	if (!cpu_has_feature(CPU_FTR_TM))
		return -ENODEV;

	if (!MSR_TM_ACTIVE(target->thread.regs->msr))
		return 0;

	return regset->n;
}

On that regs->msr deref. r9 contains the regs pointer.

The "Kernel attempted to read user page - exploit attempt?" message is, I think, a red herring; it's coming up because of the NULL deref (I thought we fixed that).

Anyway I'm not sure how we could lose regs, all user threads should have them set to non-NULL. It doesn't look like we can collect threads for dumping before we have called copy_thread(), which is where they get thread.regs set. AFAIK it's not supposed to change after that.

Would you be able to try this patch? Hopefully it catches the problem thread on the exit side, and gives a clue why regs is NULL.

Thanks,
Nick
---
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 6a11025e5850..ece63b3d2304 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1898,9 +1898,21 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
 	/*
 	 * Now fill in each thread's information.
 	 */
-	for (t = info->thread; t != NULL; t = t->next)
+	for (t = info->thread; t != NULL; t = t->next) {
+		if (!t->task) {
+			WARN_ON(1);
+			printk("core info lost task\n");
+			continue;
+		}
+		if (!t->task->thread.regs) {
+			WARN_ON(1);
+			printk("lost regs pid:%d (current->pid:%d)\n", t->task->pid, current->pid);
+			continue;
+		}
+
+
 		if (!fill_thread_core_info(t, view, cprm->siginfo->si_signo, info))
 			return 0;
+	}
 
 	/*
 	 * Fill in the two process-wide notes.
diff --git a/kernel/exit.c b/kernel/exit.c
index 35e0a31a0315..6820fe333081 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -366,6 +366,8 @@ static void coredump_task_exit(struct task_struct *tsk)
 	if (core_state) {
 		struct core_thread self;
 
+		WARN_ON(!current->thread.regs);
+
 		self.task = current;
 		if (self.task->flags & PF_SIGNALED)
 			self.next = xchg(&core_state->dumper.next, &self);

I assume it's an io_uring IO worker.

They're created via create_io_worker() -> create_io_thread().

They pass a non-NULL `args->fn` to copy_process() -> copy_thread(), so we end up in the "kernel thread" branch of the if, which sets p->thread.regs = NULL.

On Mon Dec 12, 2022 at 3:57 PM AEST, wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=214913
>
> --- Comment #9 from Michael Ellerman (michael@ellerman.id.au) ---
> I assume it's an io_uring IO worker.
>
> They're created via create_io_worker() -> create_io_thread().
>
> They pass a non-NULL `args->fn` to copy_process() -> copy_thread(), so we end
> up in the "kernel thread" branch of the if, which sets p->thread.regs = NULL.

Hmm, you might be right. These things are created with the memory and thread / signal context shared with the userspace process. Still doesn't seem like they should be involved in core dumping though, pt_regs would have no meaning even if we did set something there.

How best to catch these and filter them out of the core dump? Check for PF_IO_WORKER in the coredump gathering?
Thanks,
Nick

On 12/12/2022 at 04:52, Nicholas Piggin wrote:
> On Sun Dec 11, 2022 at 11:19 PM AEST, wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=214913
>>
>> --- Comment #7 from Zorro Lang (zlang@redhat.com) ---
>> (In reply to Michael Ellerman from comment #5)
>>> Sorry I don't have any idea which commit could have fixed this.
>>>
>>> The process that crashed was "fsstress", do you know if it uses io_uring?
>>
>> Yes, fsstress has io_uring read/write operations. And from the kernel
>> .config
>> file(as attachment), the CONFIG_IO_URING=y
>
> The task being dumped seems like it's lost its task->thread.regs. The
> NULL pointer is here:
>
> int tm_cgpr_active(struct task_struct *target, const struct user_regset
> *regset)
> {
> 	if (!cpu_has_feature(CPU_FTR_TM))
> 		return -ENODEV;
>
> 	if (!MSR_TM_ACTIVE(target->thread.regs->msr))
> 		return 0;
>
> 	return regset->n;
> }
>
> On that regs->msr deref. r9 contains the regs pointer.
>
> The kernel attempt to read user page - exploit attempt? message is
> I think a red herring it's coming up because of the NULL deref I
> think (I thought we fixed that).
>

No, we didn't fix that; my patch was rejected, see https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8b865b93d25c15c8e6d41e71c368bfc28da4489d.1606816701.git.christophe.leroy@csgroup.eu/

The reason for the rejection was:

The first page can be mapped if mmap_min_addr is 0. Blocking all faults to the first page would potentially break any program that does that.

Also if there is something mapped at 0 it's a good chance it is an exploit attempt :)

Christophe

I believe this was fixed by the series merged as:

https://git.kernel.org/powerpc/c/89fb39134ae3b1e1f207af44a037721d92b32f70

Which was merged into v6.4.
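
For readers following the analysis above ("On that regs->msr deref. r9 contains the regs pointer"), here is a rough userspace sketch of why a NULL thread.regs shows up as a fault at 0x108. It is not kernel code. `struct mock_pt_regs`, `struct ds_form`, `decode_ds()` and `effective_address()` are hypothetical names made up for this illustration, and the mock layout only assumes that the powerpc64 register frame starts with 32 GPRs followed by nip and msr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mock of the leading fields of the 64-bit powerpc register frame
 * (assumption: 32 GPRs, then nip, then msr, each 8 bytes wide).
 * This is an illustration, not the kernel's struct pt_regs.
 */
struct mock_pt_regs {
	uint64_t gpr[32];
	uint64_t nip;
	uint64_t msr;
};

/* With that layout, msr sits 0x108 bytes from the start of the frame. */
static_assert(offsetof(struct mock_pt_regs, msr) == 0x108,
	      "msr offset should match the faulting address");

/* Fields of a DS-form instruction such as "ld RT,DS(RA)". */
struct ds_form {
	unsigned int opcode;	/* primary opcode, top 6 bits (58 for ld) */
	unsigned int rt;	/* destination register */
	unsigned int ra;	/* base register */
	int disp;		/* byte displacement, low 2 bits cleared */
	unsigned int xo;	/* extended opcode, low 2 bits (0 for ld) */
};

/* Split a 32-bit instruction word into its DS-form fields. */
static struct ds_form decode_ds(uint32_t insn)
{
	struct ds_form d;

	d.opcode = insn >> 26;
	d.rt = (insn >> 21) & 0x1f;
	d.ra = (insn >> 16) & 0x1f;
	d.disp = (int16_t)(insn & 0xfffc);	/* sign-extend 16 bits */
	d.xo = insn & 0x3;
	return d;
}

/* Effective address of the load, given the value held in the base register. */
static uint64_t effective_address(struct ds_form d, uint64_t base)
{
	return base + (uint64_t)(int64_t)d.disp;
}
```

Feeding in the two words from the instruction dump, `e9232aa0` and the marked `<e9290108>`, they decode as `ld r9,0x2aa0(r3)` followed by `ld r9,0x108(r9)`. The first load presumably fetches target->thread.regs (0x2aa0 would then be the offset of that field in task_struct for this particular build, which is an assumption), and the second reads regs->msr. GPR09 is 0 in both register dumps, so the effective address is 0 + 0x108, which matches the reported DAR of 0x108.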