Bug 218258

Summary: SELinux mprotect EACCES/execheap for memory segment directly adjacent to heap
Product: Memory Management Reporter: Ilija Tovilo (ilija.tovilo)
Component: OtherAssignee: Andrew Morton (akpm)
Status: RESOLVED CODE_FIX    
Severity: high CC: agurenko, bagasdotme, bugs+kernel, cgzones, kparal, paul
Priority: P2    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:

Description Ilija Tovilo 2023-12-12 13:18:26 UTC
Hi! We're running into an issue with SELinux where mprotect will result in a EACCES due to the execheap policy since Kernel 6.6. This happens when the mmap-ed segment lies directly adjacent to the heap. I think this is caused by the following patch:

https://github.com/torvalds/linux/commit/68df1baf158fddc07b6f0333e4c81fe1ccecd6ff

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index d06e350fedee5f..ee8575540a8efc 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3762,13 +3762,10 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 	if (default_noexec &&
 	    (prot & PROT_EXEC) && !(vma->vm_flags & VM_EXEC)) {
 		int rc = 0;
-		if (vma->vm_start >= vma->vm_mm->start_brk &&
-		    vma->vm_end <= vma->vm_mm->brk) {
+		if (vma_is_initial_heap(vma)) {
 			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECHEAP, NULL);
-		} else if (!vma->vm_file &&
-			   ((vma->vm_start <= vma->vm_mm->start_stack &&
-			     vma->vm_end >= vma->vm_mm->start_stack) ||
+		} else if (!vma->vm_file && (vma_is_initial_stack(vma) ||
 			    vma_is_stack_for_current(vma))) {
 			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECSTACK, NULL);

Before this patch, selinux_file_mprotect would check whether the original start_brk and brk values lied within the vma segment. However, this does not match the vma_is_initial_heap implementation.

static inline bool vma_is_initial_heap(const struct vm_area_struct *vma)
{
       return vma->vm_start <= vma->vm_mm->brk &&
		vma->vm_end >= vma->vm_mm->start_brk;
}

This function checks whether vma overlaps with start_brk/brk. However, since the comparison includes equality this will also yield true if the segment lies directly before or after the heap. It's possible that equality is handling cases where start_brk == brk and start_brk == vm_start, but I'm not sure. In that case, the following patch might work.

static inline bool vma_is_initial_heap(const struct vm_area_struct *vma)
{
       return (vma->vm_start < vma->vm_mm->brk && vma->vm_end > vma->vm_mm->start_brk)
		|| vma->vm_start == vma->vm_mm->start_brk;
}

Thank you!
Comment 1 Ilija Tovilo 2023-12-12 14:10:00 UTC
Here's a small reproducer, which may be of help:

int main(void)
{
	const size_t size = 2 * 1024 * 1024;
	void *brk = sbrk(0);
	void *mem = mmap(brk, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_FIXED, -1, 0);
	if (mem == MAP_FAILED) {
		fprintf(stderr, "Address %p wasn't free, try again.\n", brk);
		exit(1);
	}
	if (mprotect(mem, size, PROT_READ|PROT_EXEC) == -1) {
		fprintf(stderr, "mprotect() failed [%d] %s\n", errno, strerror(errno));
		exit(1);
	}
	munmap(mem, size);
}

Results in:

mprotect() failed [13] Permission denied

Adding some padding between the heap using `void *brk = sbrk(0) + size;` solves the problem.
Comment 2 Ilija Tovilo 2023-12-12 15:12:11 UTC
For completion, the inverse also fails, as expected.

#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

static void *find_start_brk(void)
{
	uintptr_t start, end;
	FILE *f;
	char buffer[4096];
	f = fopen("/proc/self/maps", "r");
	while (fgets(buffer, 4096, f) && sscanf(buffer, "%lx-%lx", &start, &end) == 2) {
		if (strstr(buffer, "[heap]") != NULL) {
			return (void *)start;
		}
	}
	fclose(f);

	return 0;
}

int main(void)
{
	const size_t size = 2 * 1024 * 1024;
	void *start_brk = find_start_brk();
	void *mem = mmap(start_brk - size, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_FIXED, -1, 0);
	if (mem == MAP_FAILED) {
		fprintf(stderr, "Address %p wasn't free, try again.\n", brk);
		exit(1);
	}
	if (mprotect(mem, size, PROT_READ|PROT_EXEC) == -1) {
		fprintf(stderr, "mprotect() failed [%d] %s\n", errno, strerror(errno));
		exit(1);
	}
	munmap(mem, size);
}
Comment 3 Bagas Sanjaya 2023-12-14 11:53:23 UTC
(In reply to Ilija Tovilo from comment #0)
> Hi! We're running into an issue with SELinux where mprotect will result in a
> EACCES due to the execheap policy since Kernel 6.6. This happens when the
> mmap-ed segment lies directly adjacent to the heap. I think this is caused
> by the following patch:
> 
> https://github.com/torvalds/linux/commit/
> 68df1baf158fddc07b6f0333e4c81fe1ccecd6ff
> 

Hi Ilija,

mm-stable tree should have the fix for this bug [1]. Please test.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-hotfixes-stable&id=d3bb89ea9c13e5a98d2b7a0ba8e50a77893132cb
Comment 4 Kamil Paral 2024-08-01 12:18:47 UTC
Hello Ilija and Bagas. The execheap issue with SELinux doesn't seem to be fully resolved, unfortunately. In Fedora 40, we see a lot of execheap SELinux denials in chromium/electron apps, happening randomly. Chromium and SELinux devs claim that everything seems to be OK on their end. This issue is not present in kernel 6.5, it only happens with kernel 6.6 and later (6.10 is the last one I tested, still affected). The patch from comment 3 is included in Fedora kernel, so either it doesn't fix the issue (fully), or there's still a similar problem somewhere else in the code.

Here's Fedora downstream bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2254434

Could you please look into it? Thanks a lot!
Comment 5 Artem S. Tashkinov 2024-08-03 16:26:12 UTC
Christian Göttsche,

Could you take a look please? Tens of thousands of people are affected.
Comment 6 Artem S. Tashkinov 2024-08-03 16:27:22 UTC
CC'ing Paul Moore for good measure.
Comment 7 Kamil Paral 2024-08-05 07:06:53 UTC
A minimal reproducer is now available here:
https://bugzilla.redhat.com/show_bug.cgi?id=2254434#c106
Comment 8 Artem S. Tashkinov 2024-08-05 12:24:48 UTC
reisner.marc wrote:

It looks like those getrandom(2) calls are done in order to randomize the address space, and the value gotten from getrandom(2) is masked and then passed to mmap.

https://github.com/v8/v8/blob/b85492a46dfbb53a31ae98a77819291e635e2203/src/base/platform/platform-posix.cc#L315

I was able to create a small reproducer C program:

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
#include <linux/prctl.h>
#include <sys/prctl.h>
#include <sys/random.h>

int main(void)
{
    uintptr_t raw_addr;
    getrandom(&raw_addr, sizeof(raw_addr), 0);
    raw_addr &= 0x3FFFFFFFF000;
    void* pointer = mmap((void *)raw_addr, 536870912, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, pointer, 536870912, "v8");
    if (mprotect(pointer, 536870912, PROT_READ | PROT_WRITE | PROT_EXEC) == -1)
    {
        printf("%s", strerror(errno));
        return 1;
    }
    return 0;
}


If I run it in a loop like this, it eventually exits with "Permission denied", and I get the avc.

$ while true; do ./a.out || break; done
Permission denied
Comment 9 Paul Moore 2024-08-05 23:27:16 UTC
A couple of quick comments:

* Please post this to the upstream SELinux mailing list on vger.kernel.org; the mailing list is where upstream development, debugging, discussion, etc. happens.  I apologize, but I don't currently have the time to track the kernel.org bugzilla.

* The reproducer is a good first step, but it would be nice if on failure it would dump the memory mappings (see /proc/self/maps for an example) and the address which caused the access denial.  We need to understand where the offending memory region is located in relation to the heap (this is still an execheap denial, yes?).

* Taking a quick look at the kernel code, nothing is jumping out at me in selinux_file_mprotect() or the vma_is_initial_heap() helper.  This will need more eyes, more investigation, or some combination of the two.
Comment 10 Kamil Paral 2024-08-08 10:27:37 UTC
Just to make this bugzilla up-to-date, an improved reproducer is now available at:
https://bugzilla.redhat.com/show_bug.cgi?id=2254434#c138

It is now believed to be a memory management bug in kernel.

The problem has now been reported on a kernel mailing list here:
https://lore.kernel.org/all/ZrPmoLKJEf1wiFmM@marcreisner.com/
Comment 11 Artem S. Tashkinov 2024-08-22 10:27:34 UTC
Fixed by https://lore.kernel.org/all/30fc5b38165e4eda57d640eca76b7df1@paul-moore.com/

The fix will be available in 6.11 and it's been backported to Fedora's 6.10.6.
Comment 12 Artem S. Tashkinov 2024-08-24 17:38:02 UTC
The fix will be included in 6.10.7 as well.