Bug 215720 - brk() regression on AArch64 on static-pie binary -- issue with ASLR and a guard page?
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
Depends on:
Reported: 2022-03-22 02:24 UTC by Victor Stinner
Modified: 2022-04-27 15:13 UTC

See Also:
Kernel Version: 5.17.0
Regression: No
Bisected commit-id:

empty.c reproducer (86 bytes, text/plain)
2022-03-22 02:24 UTC, Victor Stinner

Description Victor Stinner 2022-03-22 02:24:57 UTC
Created attachment 300597 [details]
empty.c reproducer

I found a brk() syscall regression of Linux kernel 5.17 on AArch64.

A git bisect identified the change "fs/binfmt_elf: use PT_LOAD p_align values for static PIE" (commit 9630f0d60fec5fbcaa4435a66f75df1dc9704b66), which is related to bz#215275.

Program to reproduce the bug, empty.c (attached to the issue):
_Thread_local int var1 = 0;
int main() {
    volatile int x = 1;
    var1 = x;
    return 0;
}

Build the program as a static PIE program:

    gcc -std=c11 -static-pie -g empty.c -o empty -O2

The program fails randomly: it takes between 100 and 6000 runs to reproduce the crash.

Short shell loop to reproduce the crash:
$ i=0; while true; do ./empty; rc=$?; i=$(($i + 1)); echo "$i: $(date): $rc"; if [ $rc -ne 0 ]; then break; fi; done
159: Tue Mar 22 01:54:22 CET 2022: 0
160: Tue Mar 22 01:54:22 CET 2022: 0
Segmentation fault (core dumped)
161: Tue Mar 22 01:54:22 CET 2022: 139

Disabling ASLR (write 0 to /proc/sys/kernel/randomize_va_space) works
around the bug.

Instead of the "empty.c" program, the command "ldconfig -V > /dev/null" can be used to reproduce the bug, since ldconfig is a standard static-pie program.

strace when the program works:
brk(NULL)                               = 0xaaaac3961000
brk(0xaaaac3961b78)                     = 0xaaaac3961b78

strace when the bug occurs:
brk(NULL)                               = 0xaaaabf3c3000
brk(0xaaaabf3c3b78)                     = 0xaaaabf3c3000
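
In the failing trace the kernel returns the old break unchanged, which is how the raw brk() syscall signals a refusal. The sketch below probes this from user space without strace. It assumes Linux and the syscall() wrapper from glibc; raw_brk() and brk_can_grow() are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue the raw brk syscall. raw_brk(0) corresponds to brk(NULL) in strace
 * and returns the current program break. */
static uintptr_t raw_brk(uintptr_t addr)
{
    return (uintptr_t)syscall(SYS_brk, addr);
}

/* Return 1 if the kernel accepts a break `bytes` above the current one,
 * 0 if it refuses (the returned break differs from the requested one).
 * On success the break is moved back so libc's cached value stays valid. */
static int brk_can_grow(size_t bytes)
{
    uintptr_t old = raw_brk(0);
    uintptr_t want = old + bytes;
    uintptr_t got = raw_brk(want);
    int ok = (got == want);

    if (ok)
        raw_brk(old);
    return ok;
}
```

On an affected kernel, brk_can_grow(2936) would be expected to return 0 in the runs where the bug triggers.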

The following test of the brk() syscall fails when the bug occurs:
	/* Check against existing mmap mappings. */
	next = find_vma(mm, oldbrk);
	if (next && newbrk + PAGE_SIZE > vm_start_gap(next))
		goto out;
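
To make the arithmetic of this check easier to follow, here is a simplified user-space model of it. This is only a sketch: it assumes 4 KiB pages, the struct and all model_* names are hypothetical, and the address of the neighbouring mapping is not known from this report and has to be supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE UINT64_C(4096) /* assumption: 4 KiB pages */

/* Hypothetical, reduced stand-in for the kernel's vm_area_struct. */
struct model_vma {
    uint64_t vm_start;
    bool grows_down; /* models the VM_GROWSDOWN flag */
};

/* Round an address up to the next page boundary, like PAGE_ALIGN(). */
static uint64_t model_page_align(uint64_t addr)
{
    return (addr + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/* Model of vm_start_gap(): a grows-down mapping reserves a guard gap
 * below its start; other mappings start exactly at vm_start. */
static uint64_t model_vm_start_gap(const struct model_vma *vma,
                                   uint64_t guard_gap)
{
    uint64_t start = vma->vm_start;

    if (vma->grows_down) {
        start -= guard_gap;
        if (start > vma->vm_start) /* wrapped around */
            start = 0;
    }
    return start;
}

/* True when the quoted check would reject the requested break. */
static bool model_brk_rejected(uint64_t requested_brk,
                               const struct model_vma *next,
                               uint64_t guard_gap)
{
    uint64_t newbrk = model_page_align(requested_brk);

    return next != NULL &&
           newbrk + MODEL_PAGE_SIZE > model_vm_start_gap(next, guard_gap);
}
```

With the failing request from the strace output (0xaaaabf3c3b78), the model rejects the call if an ordinary mapping starts at 0xaaaabf3c4000 and accepts it if the next mapping starts at 0xaaaabf3c5000 or later.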

Note: When the bug occurs, the program crashes with SIGSEGV: the glibc __libc_setup_tls() function calls sbrk(2936) to allocate TLS variables, but it does not handle the memory allocation failure.
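
For illustration, a caller can detect this failure because sbrk() returns (void *)-1 when the break cannot be moved. The wrapper below is a hypothetical sketch, not glibc's code, and sbrk_checked() is an invented name.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Extend the heap by `bytes` and return the start of the new region,
 * or NULL if the break could not be moved. sbrk() reports failure by
 * returning (void *)-1 and setting errno to ENOMEM. */
static void *sbrk_checked(size_t bytes)
{
    void *p = sbrk((intptr_t)bytes);

    if (p == (void *)-1)
        return NULL;
    return p;
}
```

A caller that checks for NULL can report the allocation failure explicitly instead of writing through an invalid pointer.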

Note: I originally discovered this kernel regression while investigating Python buildbot failures on our Fedora Rawhide AArch64 machine.

* Fedora downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=2066147
* Python issue: https://bugs.python.org/issue47078
Comment 1 Victor Stinner 2022-03-22 02:41:00 UTC
See also the binutils issue: "p_align in ELF program headers should not exceed section alignment"

See also this old (kernel 4.18) fixed x86-64 kernel bug: "kernel: brk can grow the heap into the area reserved for the stack"
Comment 2 Florian Weimer 2022-04-27 15:13:54 UTC
Apparently the revert made it into v5.18-rc3:

commit 354e923df042a11d1ab8ca06b3ebfab3a018a4ec
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Thu Apr 14 19:13:55 2022 -0700

    revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders"

    Commit 925346c129da11 ("fs/binfmt_elf: fix PT_LOAD p_align values for
    loaders") was an attempt to fix regressions due to 9630f0d60fec5f
    ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE").

commit aeb7923733d100b86c6bc68e7ae32913b0cec9d8
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Thu Apr 14 19:13:58 2022 -0700

    revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"

It was Cc:ed to <stable@vger.kernel.org>, so hopefully it will make it into a 5.17.z kernel, too.
