Passing a read-only, incorrectly aligned address into getcpu() causes a kernel panic. I originally found this issue when testing stress-ng using stress-ng --sysbadaddr 1. I've since managed to make a short reproducer that panics the kernel on every invocation of the program.

I can reproduce this on mainline kernels (in Debian): tested and reproduced on kernels 6.6.15, 6.9.7 and 6.10.6, so it's been around a while and it's still reproducible on recent kernels. This only occurs on PA-RISC (hppa) kernels, and I have only tested in a QEMU VM since I don't have access to real H/W.

cking@hppa:~$ cat crash.c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

void main(void)
{
	char *addr;

	addr = mmap(NULL, 4096, PROT_READ, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
	if (addr != MAP_FAILED)
		getcpu((int *)addr, (int *)(1 + addr));
}

cking@hppa:~$ gcc crash.c -o crash
cking@hppa:~$ ./crash
[ 361.158650] Backtrace:
[ 361.159621] [<10413c78>] handle_unaligned+0x590/0x710
[ 361.159621] [<10409354>] handle_interruption+0x1dc/0x7b8
[ 361.159621] [<104545d8>] sys_getcpu+0x30/0x74
[ 361.159621]
[ 361.159621]
[ 361.159621] Page fault: bad address: Code=26 (Data memory access rights trap) at addr f9000000
[ 361.159621] CPU: 2 PID: 749 Comm: crash Not tainted 6.6.15-parisc #1 Debian 6.6.15-2
[ 361.159621] Hardware name: 9000/778/B160L
[ 361.159621]
[ 361.159621] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[ 361.159621] PSW: 00000000000001000000000000001111 Not tainted
[ 361.159621] r00-03 0004000f 00000000 10413c78 142903c0
[ 361.159621] r04-07 14290080 12a08000 fc000000 f9000001
[ 361.159621] r08-11 00000000 0f3dd280 f9099c20 f9096e58
[ 361.159621] r12-15 00011008 0119c228 00000000 00000001
[ 361.159621] r16-19 14290080 00138428 011b4e00 ff000000
[ 361.159621] r20-23 00000000 00000000 00000000 00000011
[ 361.159621] r24-27 00000000 00000000 14290080 110dd848
[ 361.159621] r28-31 f9000000 00000000 14290400 000003c3
[ 361.159621] sr00-03 000003c3 000003c3 00000000 000003c3
[ 361.159621] sr04-07 00000000 00000000 00000000 00000000
[ 361.159621]
[ 361.159621] IASQ: 00000000 00000000 IAOQ: 104135ac 104135b0
[ 361.170517] IIR: 0f945280 ISR: 000003c3 IOR: f9000000
[ 361.170517] CPU: 2 CR30: 12a08000 CR31: 00000000
[ 361.170517] ORIG_R28: 12a08000
[ 361.170517] IAOQ[0]: emulate_stw+0x5c/0x94
[ 361.170517] IAOQ[1]: emulate_stw+0x60/0x94
[ 361.170517] RP(r2): handle_unaligned+0x590/0x710
[ 361.170517] Backtrace:
[ 361.170517] [<10413c78>] handle_unaligned+0x590/0x710
[ 361.170517] [<10409354>] handle_interruption+0x1dc/0x7b8
[ 361.170517] [<104545d8>] sys_getcpu+0x30/0x74
[ 361.170517]
[ 361.170517] Kernel panic - not syncing: Page fault: bad address
[ 361.170517] ---[ end Kernel panic - not syncing: Page fault: bad address ]---
uname -a
Linux hppa 6.10.6-parisc #1 SMP Debian 6.10.6-1 (2024-08-19) parisc GNU/Linux

cking@hppa:~$ ./crash
[ 991.661268] handle_unaligned: 190 callbacks suppressed
[ 991.661901] Kernel: unaligned access to 0xf8c00001 in sys_getcpu+0x30/0x6c (iir 0xf3cd280)
[ 991.677270] Backtrace:
[ 991.679232] [<10413c60>] handle_unaligned+0x598/0x758
[ 991.679232] [<10409854>] handle_interruption+0x1dc/0x7b8
[ 991.679232] [<10454dec>] sys_getcpu+0x30/0x6c
[ 991.679232]
[ 991.679232]
[ 991.679232] Page fault: bad address: Code=26 (Data memory access rights trap) at addr f8c00000
[ 991.679232] CPU: 2 PID: 725 Comm: crash Not tainted 6.10.6-parisc #1 Debian 6.10.6-1
[ 991.679232] Hardware name: 9000/778/B160L
[ 991.679232]
[ 991.679232] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[ 991.679232] PSW: 00000000000001101111111100001111 Not tainted
[ 991.679232] r00-03 0006ff0f 00000000 10413c60 12de83c0
[ 991.679232] r04-07 12de8080 17220940 fc000000 f8c00001
[ 991.679232] r08-11 00000000 0f3cd280 f8fd5c20 f8fd2e58
[ 991.679232] r12-15 00011008 00bdd7d8 00000000 00000001
[ 991.679232] r16-19 12de8080 00138428 00bd5730 ff000000
[ 991.679232] r20-23 00000000 00000000 00000000 12de86d8
[ 991.679232] r24-27 00000000 00000000 12de8080 11148ae8
[ 991.679232] r28-31 00000000 000003c0 12de8400 f8c00000
[ 991.679232] sr00-03 00000000 000003c0 00000000 000003c0
[ 991.679232] sr04-07 00000000 00000000 00000000 00000000
[ 991.679232]
[ 991.679232] IASQ: 00000000 00000000 IAOQ: 10413598 1041359c
[ 991.679232] IIR: 0ff45280 ISR: 000003c0 IOR: f8c00000
[ 991.679232] CPU: 2 CR30: 17220940 CR31: 00000000
[ 991.679232] ORIG_R28: 00000000
[ 991.679232] IAOQ[0]: emulate_stw+0x5c/0x90
[ 991.679232] IAOQ[1]: emulate_stw+0x60/0x90
[ 991.679232] RP(r2): handle_unaligned+0x598/0x758
[ 991.679232] Backtrace:
[ 991.679232] [<10413c60>] handle_unaligned+0x598/0x758
[ 991.679232] [<10409854>] handle_interruption+0x1dc/0x7b8
[ 991.679232] [<10454dec>] sys_getcpu+0x30/0x6c
[ 991.679232]
[ 991.679232] Kernel panic - not syncing: Page fault: bad address
This *is* a bug in qemu. When running on a physical box, strace shows that the kernel behaves correctly:

mmap2(NULL, 4096, PROT_READ, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0xf9000000
getcpu(0xf9000000, 0xf9000001, NULL) = -1 EFAULT (Bad address)
exit_group(-1) = ?
+++ exited with 255 +++

On an AMD64 box I get a segfault (which seems strange?):

mmap(NULL, 4096, PROT_READ, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7f2b62c73000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f2b62c73000} ---
+++ killed by SIGSEGV +++
Segmentation fault

qemu-user works OK. Will try qemu-system soon.
arm64 and riscv return EFAULT too, whereas x86 segfaults in my tests:

Linux debian-11-all-h3-cc-h5 6.10.6-arm64 #1 SMP Debian 6.10.6-1 (2024-08-19) aarch64 GNU/Linux:

mmap(NULL, 4096, PROT_READ, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0xffffac174000
getcpu(0xffffac174000, 0xffffac174001, NULL) = -1 EFAULT (Bad address)

Linux starfive 5.15.0-starfive #1 SMP Fri Nov 11 06:58:52 EST 2022 riscv64 GNU/Linux:

mmap(NULL, 4096, PROT_READ, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x3fb20df000
getcpu(0x3fb20df000, 0x3fb20df001, NULL) = -1 EFAULT (Bad address)

Linux t480 6.1.0-25-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.106-3 (2024-08-26) x86_64 GNU/Linux:

mmap(NULL, 4096, PROT_READ, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7ff780a24000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7ff780a24000} ---
+++ killed by SIGSEGV +++

So x86-64 does behave differently with the access.
I'd assume the segfault on x86 is because of the vDSO implementation of getcpu() on x86, which executes vdso_read_cpunode() from arch/x86/include/asm/segment.h:

static inline void vdso_read_cpunode(unsigned *cpu, unsigned *node)
{
	...
	if (cpu)
		*cpu = (p & VDSO_CPUNODE_MASK);
	if (node)
		*node = (p >> VDSO_CPUNODE_BITS);
}
Initial patch to fix qemu emulation for parisc posted: https://lore.kernel.org/linux-parisc/Zvyx1kM4JljbzxQW@p100/T/#u
Richard Henderson posted another series of patches: https://lists.nongnu.org/archive/html/qemu-devel/2024-10/msg00919.html
This is now fixed in git head of qemu and can be closed: https://gitlab.com/qemu-project/qemu/-/commit/99746de61262fd5cf80eacfdb513e8d40e9107e8
Thank you for your work on this issue. Much appreciated.