Bug 217624 - [REGRESSION] Memory corruption in multithreaded user space program while calling fork
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: AMD Linux
Importance: P3 high
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-07-02 11:35 UTC by Jacob Young
Modified: 2023-09-01 17:58 UTC
CC List: 8 users

See Also:
Kernel Version: 6.4.0
Subsystem:
Regression: Yes
Bisected commit-id: c7f8f31c00d187a2c71a241c7f2bd6aa102a4e6f



Description Jacob Young 2023-07-02 11:35:20 UTC
After upgrading to kernel version 6.4.0 from 6.3.9, I noticed frequent but random crashes in a user space program.  After a lot of reduction, I have come up with the following reproducer program:

$ uname -a
Linux jacob 6.4.1-gentoo #1 SMP PREEMPT_DYNAMIC Sat Jul  1 19:02:42 EDT 2023 x86_64 AMD Ryzen 9 7950X3D 16-Core Processor AuthenticAMD GNU/Linux
$ cat repro.c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/wait.h>
#include <unistd.h>

void *threadSafeAlloc(size_t n) {
    /* Lock-free bump allocator: carve n bytes out of a static buffer. */
    static size_t end_index = 0;
    static char buffer[1 << 25];
    size_t start_index = __atomic_load_n(&end_index, __ATOMIC_SEQ_CST);
    while (1) {
        if (start_index + n > sizeof(buffer)) _exit(1); /* buffer exhausted */
        if (__atomic_compare_exchange_n(&end_index, &start_index, start_index + n, 1, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) return buffer + start_index;
    }
}

int thread(void *arg) {
    size_t i;
    size_t n = 1 << 7;
    char *items;
    (void)arg;
    while (1) {
        items = threadSafeAlloc(n);
        for (i = 0; i != n; i += 1) items[i] = '@';
        for (i = 0; i != n; i += 1) if (items[i] != '@') _exit(2);
    }
}

int main(void) {
    static size_t stacks[2][1 << 9];
    size_t i;
    /* Spawn two threads sharing this address space, then fork/reap forever. */
    for (i = 0; i != 2; i += 1) clone(&thread, &stacks[i] + 1, CLONE_THREAD | CLONE_VM | CLONE_SIGHAND, NULL);
    while (1) {
        if (fork() == 0) _exit(0);
        (void)wait(NULL);
    }
}
$ cc repro.c
$ ./a.out
$ echo $?
2

After tuning the various parameters for my computer, exit code 2, which indicates that memory corruption was detected, occurs approximately 99% of the time.  Exit code 1, which occurs approximately 1% of the time, means the program ran out of statically allocated memory before reproducing the issue; increasing the buffer size further only yields diminishing returns.  There is also roughly a 0.1% chance that it segfaults, due to memory corruption outside the statically allocated buffer.
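Those rates can be estimated by tallying exit codes over repeated runs, for example with a small shell helper along these lines (a sketch, not the exact script used here):

```shell
# Run a command repeatedly and tally its exit codes.
# Usage: tally_exits <runs> <command> [args...]
tally_exits() {
    runs=$1; shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@"
        echo "$?"
        i=$((i + 1))
    done | sort -n | uniq -c
}

# Hypothetical usage: tally_exits 100 ./a.out
```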

With this reproducer in hand, I was able to perform the following bisection:

git bisect start
# status: waiting for both good and bad commits
# bad: [6995e2de6891c724bfeb2db33d7b87775f913ad1] Linux 6.4
git bisect bad 6995e2de6891c724bfeb2db33d7b87775f913ad1
# status: waiting for good commit(s), bad commit known
# good: [457391b0380335d5e9a5babdec90ac53928b23b4] Linux 6.3
git bisect good 457391b0380335d5e9a5babdec90ac53928b23b4
# good: [d42b1c47570eb2ed818dc3fe94b2678124af109d] Merge tag 'devicetree-for-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect good d42b1c47570eb2ed818dc3fe94b2678124af109d
# bad: [58390c8ce1bddb6c623f62e7ed36383e7fa5c02f] Merge tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
git bisect bad 58390c8ce1bddb6c623f62e7ed36383e7fa5c02f
# good: [888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0] Merge tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
git bisect good 888d3c9f7f3ae44101a3fd76528d3dd6f96e9fd0
# bad: [86e98ed15b3e34460d1b3095bd119b6fac11841c] Merge tag 'cgroup-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
git bisect bad 86e98ed15b3e34460d1b3095bd119b6fac11841c
# bad: [7fa8a8ee9400fe8ec188426e40e481717bc5e924] Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect bad 7fa8a8ee9400fe8ec188426e40e481717bc5e924
# bad: [0120dd6e4e202e19a0e011e486fb2da40a5ea279] zram: make zram_bio_discard more self-contained
git bisect bad 0120dd6e4e202e19a0e011e486fb2da40a5ea279
# good: [fce0b4213edb960859dcc65ea414c8efb11948e1] mm/page_alloc: add helper for checking if check_pages_enabled
git bisect good fce0b4213edb960859dcc65ea414c8efb11948e1
# bad: [59f876fb9d68a4d8c20305d7a7a0daf4ee9478a8] mm: avoid passing 0 to __ffs()
git bisect bad 59f876fb9d68a4d8c20305d7a7a0daf4ee9478a8
# good: [0050d7f5ee532f92e8ab1efcec6547bfac527973] afs: split afs_pagecache_valid() out of afs_validate()
git bisect good 0050d7f5ee532f92e8ab1efcec6547bfac527973
# good: [2ac0af1b66e3b66307f53b1cc446514308ec466d] mm: fall back to mmap_lock if vma->anon_vma is not yet set
git bisect good 2ac0af1b66e3b66307f53b1cc446514308ec466d
# skip: [0d2ebf9c3f7822e7ba3e4792ea3b6b19aa2da34a] mm/mmap: free vm_area_struct without call_rcu in exit_mmap
git bisect skip 0d2ebf9c3f7822e7ba3e4792ea3b6b19aa2da34a
# skip: [70d4cbc80c88251de0a5b3e8df3275901f1fa99a] powerc/mm: try VMA lock-based page fault handling first
git bisect skip 70d4cbc80c88251de0a5b3e8df3275901f1fa99a
# good: [444eeb17437a0ef526c606e9141a415d3b7dfddd] mm: prevent userfaults to be handled under per-vma lock
git bisect good 444eeb17437a0ef526c606e9141a415d3b7dfddd
# bad: [e06f47a16573decc57498f2d02f9af3bb3e84cf2] s390/mm: try VMA lock-based page fault handling first
git bisect bad e06f47a16573decc57498f2d02f9af3bb3e84cf2
# skip: [0bff0aaea03e2a3ed6bfa302155cca8a432a1829] x86/mm: try VMA lock-based page fault handling first
git bisect skip 0bff0aaea03e2a3ed6bfa302155cca8a432a1829
# skip: [cd7f176aea5f5929a09a91c661a26912cc995d1b] arm64/mm: try VMA lock-based page fault handling first
git bisect skip cd7f176aea5f5929a09a91c661a26912cc995d1b
# good: [52f238653e452e0fda61e880f263a173d219acd1] mm: introduce per-VMA lock statistics
git bisect good 52f238653e452e0fda61e880f263a173d219acd1
# bad: [c7f8f31c00d187a2c71a241c7f2bd6aa102a4e6f] mm: separate vma->lock from vm_area_struct
git bisect bad c7f8f31c00d187a2c71a241c7f2bd6aa102a4e6f
# only skipped commits left to test
# possible first bad commit: [c7f8f31c00d187a2c71a241c7f2bd6aa102a4e6f] mm: separate vma->lock from vm_area_struct
# possible first bad commit: [0d2ebf9c3f7822e7ba3e4792ea3b6b19aa2da34a] mm/mmap: free vm_area_struct without call_rcu in exit_mmap
# possible first bad commit: [70d4cbc80c88251de0a5b3e8df3275901f1fa99a] powerc/mm: try VMA lock-based page fault handling first
# possible first bad commit: [cd7f176aea5f5929a09a91c661a26912cc995d1b] arm64/mm: try VMA lock-based page fault handling first
# possible first bad commit: [0bff0aaea03e2a3ed6bfa302155cca8a432a1829] x86/mm: try VMA lock-based page fault handling first

I do not usually see any kernel log output while running the program, just occasional logs about user space segfaults.
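For reference, a bisection like the one above could in principle be driven by `git bisect run`. The wrapper below is hypothetical (the bisection was done by hand, and this glosses over building and booting each candidate kernel); it maps the reproducer's exit codes onto the conventions `git bisect run` expects:

```shell
# Map the reproducer's exit codes to git bisect run conventions:
# return 0 = good, 1 = bad, 125 = cannot test this commit (skip).
bisect_step() {
    "$@"
    case $? in
        2) return 1 ;;    # corruption detected: this kernel is bad
        1) return 0 ;;    # allocator exhausted without corruption: good
        *) return 125 ;;  # segfault or other outcome: inconclusive, skip
    esac
}

# Hypothetical usage, after booting the candidate kernel:
#   bisect_step ./a.out
```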
Comment 1 Sam James 2023-07-02 12:17:46 UTC
Could you report this on the mailing list and CC the commit author?
Comment 2 Sam James 2023-07-02 12:18:13 UTC
(In reply to Sam James from comment #1)
> Could you report this on the mailing list and CC the commit author?

(see https://www.kernel.org/doc/html/v6.4/admin-guide/reporting-issues.html)
Comment 4 Michal Suchánek 2023-07-03 07:53:38 UTC
Might be related to https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/ - might be worth trying to revert the bad commit identified there.
Comment 5 Jacob Young 2023-07-03 08:33:49 UTC
(In reply to Michal Suchánek from comment #4)
> Might be related to
> https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
> - might be worth trying to revert the bad commit identified there.

I can confirm that v6.4 with 0bff0aaea03e2a3ed6bfa302155cca8a432a1829 reverted no longer causes any memory corruption with either my reproducer or the original program.
Comment 6 Holger Hoffstätte 2023-07-03 19:00:16 UTC
Temporary workaround fix posted at:
https://lore.kernel.org/all/20230703182150.2193578-1-surenb@google.com/
Comment 7 Holger Hoffstätte 2023-07-04 14:28:47 UTC
(In reply to Holger Hoffstätte from comment #6)
> Temporary workaround fix posted at:
> https://lore.kernel.org/all/20230703182150.2193578-1-surenb@google.com/

<sigh> ...and of course it doesn't work as expected:
https://lore.kernel.org/all/c2cc745a-22f0-90df-59b0-2abd961cd829@redhat.com/
Comment 8 Sam James 2023-07-07 02:45:17 UTC
The discussion continues at https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/.
Comment 9 Sam James 2023-07-11 06:15:38 UTC
Everything I've seen suggests 6.4.3 should be fine.
Comment 10 Jiri Slaby 2023-07-11 06:21:48 UTC
(In reply to Michal Suchánek from comment #4)
> Might be related to
> https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/

(In reply to Sam James from comment #9)
> Everything I've seen suggests 6.4.3 should be fine.

Yes, go builds fine now too.
Comment 11 Jacob Young 2023-07-12 00:49:17 UTC
(In reply to Jiri Slaby from comment #10)
> (In reply to Michal Suchánek from comment #4)
> > Might be related to
> >
> https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
> 
> (In reply to Sam James from comment #9)
> > Everything I've seen suggests 6.4.3 should be fine.
> 
> Yes, go builds fine now too.

The build system I originally encountered this issue with also works again with CONFIG_PER_VMA_LOCK=y on 6.4.3.
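For anyone checking whether their own kernel builds with per-VMA locking, the option can be read out of the kernel config. A small helper (hypothetical; the config path varies by distro, and /proc/config.gz is another option):

```shell
# Return success if the given kernel config file enables per-VMA locks.
has_per_vma_lock() {
    grep -q '^CONFIG_PER_VMA_LOCK=y' "$1"
}

# Hypothetical usage:
#   has_per_vma_lock /boot/config-"$(uname -r)" && echo "per-VMA locks enabled"
```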
Comment 12 Holger Hoffstätte 2023-09-01 17:58:04 UTC
I think this can be closed now.
