Bug 215389 - pagealloc: memory corruption with VMAP_STACK=y set and burdening the memory subsystem via "stress -c 2 --vm 2 --vm-bytes 896M"
Status: ASSIGNED
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: PPC-32
Hardware: PPC-32 Linux
Importance: P1 normal
Assignee: platform_ppc-32
Reported: 2021-12-22 17:52 UTC by Erhard F.
Modified: 2023-10-26 23:46 UTC

See Also:
Kernel Version: 5.15.10
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg (5.15.10, PowerMac G4 DP) (46.82 KB, text/plain)
2021-12-22 17:52 UTC, Erhard F.
kernel .config (5.15.10, PowerMac G4 DP) (107.04 KB, text/plain)
2021-12-22 18:00 UTC, Erhard F.
bisect.log (3.57 KB, text/plain)
2022-01-25 22:23 UTC, Erhard F.
dmesg (5.10-rc2 with ADB_PMU disabled, PowerMac G4 DP) (5.23 KB, text/plain)
2022-01-30 20:12 UTC, Erhard F.
dmesg (5.18-rc3, PowerMac G4 DP) (53.33 KB, text/plain)
2022-04-18 13:54 UTC, Erhard F.
kernel .config (5.18-rc3, PowerMac G4 DP) (108.83 KB, text/plain)
2022-04-18 13:54 UTC, Erhard F.
dmesg (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, PowerMac G4 DP) (39.61 KB, text/plain)
2022-05-11 07:43 UTC, Erhard F.
kernel .config (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, PowerMac G4 DP) (108.91 KB, text/plain)
2022-05-11 07:44 UTC, Erhard F.
dmesg (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, outline KASAN, PowerMac G4 DP) (52.13 KB, text/plain)
2022-05-16 18:51 UTC, Erhard F.
kernel .config (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, outline KASAN, PowerMac G4 DP) (108.89 KB, text/plain)
2022-05-16 18:52 UTC, Erhard F.
dmesg (5.19-rc4, PowerMac G4 DP) (38.55 KB, text/plain)
2022-06-28 23:01 UTC, Erhard F.
kernel .config (5.19-rc4, PowerMac G4 DP) (109.57 KB, text/plain)
2022-06-28 23:02 UTC, Erhard F.
dmesg (5.19-rc5, outline KASAN, PowerMac G4 DP) (36.45 KB, text/plain)
2022-07-05 16:02 UTC, Erhard F.
dmesg (6.0-rc2, outline KASAN, PowerMac G4 DP) (85.43 KB, text/plain)
2022-08-23 21:45 UTC, Erhard F.
kernel .config (6.0-rc2, PowerMac G4 DP) (110.15 KB, text/plain)
2022-08-23 21:56 UTC, Erhard F.
dmesg (6.3.3, KCSAN, PowerMac G4 DP) (243.22 KB, text/plain)
2023-05-23 19:55 UTC, Erhard F.
kernel .config (6.3.3, PowerMac G4 DP) (109.64 KB, text/plain)
2023-05-23 19:56 UTC, Erhard F.
dmesg (5.5-rc5, PowerMac G4 DP) (38.85 KB, text/plain)
2023-10-26 23:40 UTC, Erhard F.
attachment-30247-0.html (158 bytes, text/html)
2023-10-26 23:41 UTC, Christophe Leroy
kernel .config (5.5-rc5, PowerMac G4 DP) (86.70 KB, text/plain)
2023-10-26 23:46 UTC, Erhard F.

Description Erhard F. 2021-12-22 17:52:57 UTC
Created attachment 300113 [details]
dmesg (5.15.10, PowerMac G4 DP)

This happens when running the glibc-2.33 testsuite on my G4 DP.

[...]
[ 5503.973022] pagealloc: memory corruption
[ 5503.973226] fffdfff0: 00 00 00 00                                      ....
[ 5503.973469] CPU: 0 PID: 15826 Comm: ld.so.1 Tainted: G        W         5.15.10-gentoo-PowerMacG4 #3
[ 5503.973791] Call Trace:
[ 5503.973849] [f61edc20] [c03e8644] dump_stack_lvl+0x60/0x80 (unreliable)
[ 5503.974096] [f61edc40] [c016ece8] __kernel_unpoison_pages+0x13c/0x174
[ 5503.974320] [f61edc90] [c015aa64] post_alloc_hook+0x60/0xb4
[ 5503.974511] [f61edcb0] [c015aadc] prep_new_page+0x24/0x5c
[ 5503.974687] [f61edcd0] [c015be14] get_page_from_freelist+0x26c/0x548
[ 5503.974898] [f61edd50] [c015c5d8] __alloc_pages+0xc8/0x7a4
[ 5503.975080] [f61eddf0] [c0146470] alloc_zeroed_user_highpage_movable.constprop.0+0x18/0x48
[ 5503.975358] [f61ede10] [c01467a8] wp_page_copy+0x58/0x4a4
[ 5503.975534] [f61ede80] [c0149df4] handle_mm_fault+0x72c/0x864
[ 5503.975725] [f61edf00] [c001a9dc] do_page_fault+0x578/0x6c8
[ 5503.975919] [f61edf30] [c000424c] DataAccess_virt+0xd4/0xe4
[ 5503.976102] --- interrupt: 300 at 0x6ffc5eb0
[ 5503.976228] NIP:  6ffc5eb0 LR: 6ffc5e84 CTR: c0335cb0
[ 5503.976383] REGS: f61edf40 TRAP: 0300   Tainted: G        W          (5.15.10-gentoo-PowerMacG4)
[ 5503.976684] MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 840022c8  XER: 20000000
[ 5503.976929] DAR: a78032e4 DSISR: 0a000000 
               GPR00: 6ffc60bc af9a9650 a7a15550 0064c9ac 00896b60 00000009 bcecbe5c 001282d4 
               GPR08: 00899280 a78032e4 a7809068 f61edf30 240022c2 6ffece34 008a1a90 00000001 
               GPR16: 00000000 0064c9ac 0064c9e8 0064c980 008a1830 0064b8f4 0000000f 00000009 
               GPR24: 00896b60 bcecbe5c 000002c6 a7828774 a76db010 000083a7 6fff4cdc 0064c9ac 
[ 5504.008476] NIP [6ffc5eb0] 0x6ffc5eb0
[ 5504.018630] LR [6ffc5e84] 0x6ffc5e84
[ 5504.028738] --- interrupt: 300
[ 5504.038956] page:ef4c8e34 refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x31065
[ 5504.049340] flags: 0x80000000(zone=2)
[ 5504.059763] raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
[ 5504.070297] raw: 00000000
[ 5504.080511] page dumped because: pagealloc: corrupted page details

The machine stays usable afterwards. It happened a second time after a reboot, again while building glibc-2.33 and running its testsuite:

[...]
[ 2946.948834] pagealloc: memory corruption
[ 2946.949078] fffcfff0: 00 00 00 00                                      ....
[ 2946.949419] CPU: 1 PID: 31318 Comm: ld.so.1 Tainted: G        W         5.15.10-gentoo-PowerMacG4 #3
[ 2946.949753] Call Trace:
[ 2946.949814] [f5c21b00] [c03e8644] dump_stack_lvl+0x60/0x80 (unreliable)
[ 2946.950054] [f5c21b20] [c016ece8] __kernel_unpoison_pages+0x13c/0x174
[ 2946.950281] [f5c21b70] [c015aa64] post_alloc_hook+0x60/0xb4
[ 2946.950476] [f5c21b90] [c015aadc] prep_new_page+0x24/0x5c
[ 2946.950651] [f5c21bb0] [c015be14] get_page_from_freelist+0x26c/0x548
[ 2946.950865] [f5c21c30] [c015c5d8] __alloc_pages+0xc8/0x7a4
[ 2946.951053] [f5c21cd0] [c011f6d4] pagecache_get_page+0x184/0x1fc
[ 2946.951259] [f5c21d30] [c029fd34] prepare_pages+0x80/0x14c
[ 2946.951442] [f5c21d80] [c02a28dc] btrfs_buffered_write+0x2b8/0x54c
[ 2946.951653] [f5c21e20] [c02a4700] btrfs_file_write_iter+0x340/0x368
[ 2946.951876] [f5c21e70] [c01892fc] vfs_write+0x18c/0x1dc
[ 2946.952057] [f5c21ef0] [c0189484] ksys_write+0x74/0xb8
[ 2946.952231] [f5c21f30] [c0015098] ret_from_syscall+0x0/0x28
[ 2946.952420] --- interrupt: c00 at 0x6fecc128
[ 2946.952547] NIP:  6fecc128 LR: 6fecc100 CTR: 00000001
[ 2946.952704] REGS: f5c21f40 TRAP: 0c00   Tainted: G        W          (5.15.10-gentoo-PowerMacG4)
[ 2946.953008] MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022448  XER: 00000000
[ 2946.953267] 
               GPR00: 00000004 afad5d90 a7b83550 00000009 afad5e9c 00002000 00000000 6fecbfe8 
               GPR08: 0000d032 402c551a 402c5409 f5c21f30 84022448 6ffeee28 007889b0 afad8070 
               GPR16: afad7fa0 afad8008 00000000 00000000 00008000 00000008 00976000 001c5bcc 
               GPR24: 00000000 afad5e9c 00002000 00000009 afad7e9c 00000000 6ffbaff4 afad5e9c 
[ 2946.975430] NIP [6fecc128] 0x6fecc128
[ 2946.985730] LR [6fecc100] 0x6fecc100
[ 2946.995992] --- interrupt: c00
[ 2947.006198] page:ef4c8e34 refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x31065
[ 2947.016579] flags: 0x80000000(zone=2)
[ 2947.026946] raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
[ 2947.037712] raw: 00000000
[ 2947.048178] page dumped because: pagealloc: corrupted page details
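
For readers decoding the two reports above: the "pagealloc: memory corruption" line comes from the page poisoning check (page_poison=1, see comment 11), which fills freed pages with a poison byte and verifies the pattern when a page is handed out again. The sketch below is a simplified Python model of that check, not the kernel code. The function names are made up for illustration, and the poison value 0xaa and the 4K page size are assumptions based on the usual kernel defaults. It shows how four zeroed bytes at page offset 0xff0, the offset visible in both dumps, would be classified.

```python
PAGE_SIZE = 4096      # assumed 4K pages
PAGE_POISON = 0xAA    # assumed poison byte written into freed pages


def poison_page():
    """Return a freshly 'freed' page filled with the poison pattern."""
    return bytearray([PAGE_POISON]) * PAGE_SIZE


def check_poison(page):
    """Return None if the pattern is intact, else (kind, offset, bad_bytes)."""
    bad = [i for i, b in enumerate(page) if b != PAGE_POISON]
    if not bad:
        return None
    start, end = bad[0], bad[-1]
    # One byte differing in exactly one bit is reported as a bit flip,
    # anything larger as memory corruption.
    if start == end and bin(page[start] ^ PAGE_POISON).count("1") == 1:
        kind = "single bit error"
    else:
        kind = "memory corruption"
    return kind, start, bytes(page[start:end + 1])


page = poison_page()
page[0xFF0:0xFF4] = b"\x00" * 4   # something wrote 4 zero bytes after the free
print(check_poison(page))
```

In this model the dumped range runs from the first to the last byte that no longer matches the pattern, which is why the reports show exactly four bytes.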
Comment 1 Erhard F. 2021-12-22 18:00:19 UTC
Created attachment 300115 [details]
kernel .config (5.15.10, PowerMac G4 DP)
Comment 2 Christophe Leroy 2021-12-24 13:35:07 UTC
Probably hard to track.

Any chance you could bisect the issue?
Comment 3 Erhard F. 2021-12-26 08:38:33 UTC
Bisecting will take some time. I'll report back as soon as I have any findings.
Comment 4 Erhard F. 2022-01-08 01:07:12 UTC
I was able to easily reproduce this on 5.15.13, however not on 5.16-rc8.

But on 5.16-rc8 I got this the 3rd time I ran the glibc testsuite:

[...]
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [kworker/u4:7:32566]
Modules linked in: auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc ghash_generic gf128mul gcm ccm algif_aead des_generic libdes ctr cbc ecb algif_skcipher aes_generic libaes cmac sha512_generic sha1_generic sha1_powerpc md5 md5_ppc md4 b43legacy mac80211 libarc4 snd_aoa_codec_tas snd_aoa_fabric_layout snd_aoa cfg80211 rfkill evdev mac_hid therm_windtunnel firewire_ohci firewire_core crc_itu_t sr_mod cdrom snd_aoa_i2sbus snd_aoa_soundbus snd_pcm snd_timer snd ohci_pci soundcore radeon ohci_hcd ehci_pci ehci_hcd hwmon i2c_algo_bit drm_ttm_helper ttm ssb drm_kms_helper pcmcia pcmcia_core usbcore 8250_pci syscopyarea sysfillrect sysimgblt usb_common 8250 8250_base serial_mctrl_gpio fb_sys_fops pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs
CPU: 1 PID: 32566 Comm: kworker/u4:7 Not tainted 5.16.0-rc8-PowerMacG4 #1
Workqueue: zswap1 compact_page_work
NIP:  c0078730 LR: c0078724 CTR: 00000000
REGS: f698dd40 TRAP: 0900   Not tainted  (5.16.0-rc8-PowerMacG4)
MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 44008242  XER: 20000000

GPR00: c01856c8 f698de00 ca20b540 00000001 d4c73ffc 00000000 de0bd0bc aaaaaaaa 
GPR08: aaaaaaaa 00000000 ffffffff 00000004 84002242 00000000 c00553fc 00000001 
GPR16: 00000002 d4c73fc0 c0980000 002ec02c 00000040 d4c7300c d4c7302e c19c4bc0 
GPR24: c19c4bc0 c0185d74 ef0d0040 d4c73008 d4c74a4c 0000007f de0bd000 d4c74a54 
NIP [c0078730] arch_write_lock+0x28/0x3c
LR [c0078724] arch_write_lock+0x1c/0x3c
Call Trace:
[f698de00] [c0185d74] release_z3fold_page_locked+0x0/0x44 (unreliable)
[f698de20] [c01856c8] do_compact_page+0x334/0x508
[f698de80] [c004f354] process_one_work+0x1d4/0x288
[f698dec0] [c004f814] worker_thread+0x1b8/0x260
[f698df00] [c0055514] kthread+0x118/0x11c
[f698df30] [c0016268] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
39610020 4bfa7668 9421ffe0 7c0802a6 90010024 93e1001c 7c7f1b78 7fe3fb78 
4bffff0d 2c030000 41a20014 813f0000 <2c090000> 4182ffe8 4bfffff4 39610020 
Kernel panic - not syncing: softlockup: hung tasks
CPU: 1 PID: 32566 Comm: kworker/u4:7 Tainted: G             L    5.16.0-rc8-PowerMacG4 #1
Workqueue: zswap1 compact_page_work
Call Trace:
[f698dbb0] [c03e7f04] dump_stack_lvl+0x60/0x80 (unreliable)
[f698dbd0] [c0037734] panic+0x128/0x30c
[f698dc30] [c00c6334] watchdog_nmi_enable+0x0/0x10
[f698dc70] [c0097fc8] __hrtimer_run_queues+0xf0/0x154
[f698dcb0] [c0098b7c] hrtimer_interrupt+0xf8/0x25c
[f698dcf0] [c000d70c] timer_interrupt+0x20c/0x294
[f698dd30] [c0004a50] Decrementer_virt+0x100/0x104
--- interrupt: 900 at arch_write_lock+0x28/0x3c
NIP:  c0078730 LR: c0078724 CTR: 00000000
REGS: f698dd40 TRAP: 0900   Tainted: G             L     (5.16.0-rc8-PowerMacG4)
MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 44008242  XER: 20000000

GPR00: c01856c8 f698de00 ca20b540 00000001 d4c73ffc 00000000 de0bd0bc aaaaaaaa 
GPR08: aaaaaaaa 00000000 ffffffff 00000004 84002242 00000000 c00553fc 00000001 
GPR16: 00000002 d4c73fc0 c0980000 002ec02c 00000040 d4c7300c d4c7302e c19c4bc0 
GPR24: c19c4bc0 c0185d74 ef0d0040 d4c73008 d4c74a4c 0000007f de0bd000 d4c74a54 
NIP [c0078730] arch_write_lock+0x28/0x3c
LR [c0078724] arch_write_lock+0x1c/0x3c
--- interrupt: 900
[f698de00] [c0185d74] release_z3fold_page_locked+0x0/0x44 (unreliable)
[f698de20] [c01856c8] do_compact_page+0x334/0x508
[f698de80] [c004f354] process_one_work+0x1d4/0x288
[f698dec0] [c004f814] worker_thread+0x1b8/0x260
[f698df00] [c0055514] kthread+0x118/0x11c
[f698df30] [c0016268] ret_from_kernel_thread+0x5c/0x64
Rebooting in 40 seconds..


This is interesting because in bug #213837 my not-yet-finished bisect also hints that z3fold may be the problem...

Next I'll check whether the issue is reproducible on 5.15.x when I use zbud or zsmalloc for zswap instead of z3fold.
Comment 5 Erhard F. 2022-01-08 18:14:29 UTC
OK, with zswap lzo/zbud I also get this memory corruption on 5.15.13. So most probably it's not lzo/z3fold but something else. I'll start a bisect then...
Comment 6 Erhard F. 2022-01-25 22:23:59 UTC
Created attachment 300318 [details]
bisect.log

Ok, finally got it. Interesting find:

 # git bisect bad
db972a3787d12b1ce9ba7a31ec376d8a79e04c47 is the first bad commit
commit db972a3787d12b1ce9ba7a31ec376d8a79e04c47
Author: Christophe Leroy <christophe.leroy@csgroup.eu>
Date:   Tue Dec 8 05:24:19 2020 +0000

    powerpc/powermac: Fix low_sleep_handler with CONFIG_VMAP_STACK
    
    low_sleep_handler() can't restore the context from standard
    stack because the stack can hardly be accessed with MMU OFF.
    
    Store everything in a global storage area instead of storing
    a pointer to the stack in that global storage area.
    
    To avoid a complete churn of the function, still use r1 as
    the pointer to the storage area during restore.
    
    Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK")
    Reported-by: Giuseppe Sacco <giuseppe@sguazz.it>
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Tested-by: Giuseppe Sacco <giuseppe@sguazz.it>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/e3e0d8042a3ba75cb4a9546c19c408b5b5b28994.1607404931.git.christophe.leroy@csgroup.eu

 arch/powerpc/platforms/Kconfig.cputype  |   2 +-
 arch/powerpc/platforms/powermac/sleep.S | 132 ++++++++++++++------------------
 2 files changed, 60 insertions(+), 74 deletions(-)
Comment 7 Christophe Leroy 2022-01-26 06:41:19 UTC
Interesting ... Though confusing.
Comment 8 Christophe Leroy 2022-01-26 07:55:55 UTC
Looking closer, in fact that might be a false positive.

The huge difference with that bad commit is that:
- Before the commit, the kernel is built _without_ CONFIG_VMAP_STACK
- After the commit, the kernel is built _with_ CONFIG_VMAP_STACK

Would you be able to perform the following tests:
- Disable VMAP_STACK and see if the problem still occurs.
- Disable ADB_PMU and see if the problem still occurs.

With the version which precedes the bad commit, can you disable ADB_PMU and enable VMAP_STACK and see what happens?
Comment 9 Erhard F. 2022-01-30 20:12:17 UTC
Created attachment 300354 [details]
dmesg (5.10-rc2 with ADB_PMU disabled, PowerMac G4 DP)

Took a little time but I double checked the results (one time using distcc '-j8 -l2', one time native '-j3') to be sure:

ADB_PMU disabled, VMAP_STACK disabled  ...  "neverending build"
ADB_PMU enabled,  VMAP_STACK disabled  ...  works ok
ADB_PMU disabled, VMAP_STACK enabled   ...  "neverending build"
ADB_PMU enabled,  VMAP_STACK enabled   ...  memory corruption

Version used was git db972a3787d12b1ce9ba7a31ec376d8a79e04c47, which is the one before a last 'git bisect bad' ends the git bisect.

The "neverending builds" happen when I run this kernel with ADB_PMU disabled. The G4 runs for several hours building (?) without reaching the glibc test stage. With ADB_PMU enabled I get a pass or memory corruption much earlier.

Also, without ADB_PMU I get a kernel panic when rebooting or shutting down the G4. The G4 does not reboot or power off in this case, so I need to switch it off manually.
Comment 10 Christophe Leroy 2022-01-31 07:41:19 UTC
Thanks for the tests.

I'm not surprised that the system doesn't poweroff or reboot without ADB_PMU because the PMU manages power.

The "neverending build" may be because the PMU also manages the RTC, and without it you get inconsistent time?

Anyway, it looks like there is indeed something linked to VMAP_STACK.

I'm wondering whether you could be running out of vmalloc space. I initially thought you were using KASAN, but it seems not according to your .config.

Could you try reducing CONFIG_LOWMEM_SIZE to 0x28000000, for instance, and see if the memory corruption still happens?

To do this you'll need CONFIG_ADVANCED_OPTIONS and CONFIG_LOWMEM_SIZE_BOOL.
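
To make the suggested value concrete, here is a small arithmetic sketch of what reducing CONFIG_LOWMEM_SIZE does to the kernel address space. It assumes a kernel base (PAGE_OFFSET) of 0xC0000000, which matches the c0xxxxxx addresses in the traces, and a default CONFIG_LOWMEM_SIZE of 0x30000000; neither value is stated in this thread. The "remaining" figure is only an upper bound on vmalloc space, since other mappings share that range.

```python
MIB = 1 << 20
KERNEL_BASE = 0xC0000000       # assumed PAGE_OFFSET on this 32-bit kernel
ADDRESS_SPACE_END = 1 << 32    # end of the 32-bit virtual address space


def kernel_space_left(lowmem_size):
    """Bytes of kernel virtual space not covered by the linear lowmem mapping."""
    return ADDRESS_SPACE_END - (KERNEL_BASE + lowmem_size)


for label, size in (("default (assumed)", 0x30000000), ("reduced", 0x28000000)):
    print(f"{label}: lowmem {size // MIB} MiB, "
          f"remaining kernel space {kernel_space_left(size) // MIB} MiB")
```

Under these assumptions the reduced setting trades 128 MiB of directly mapped lowmem for 128 MiB of additional room for vmalloc and similar mappings.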
Comment 11 Erhard F. 2022-01-31 23:33:13 UTC
(In reply to Christophe Leroy from comment #10)
> I'm wondering whether you could be running out of vmalloc space. I initially
> thought you were using KASAN, but it seems not according to your .config.
Correct, I was not using KASAN. I use it only for testing -rc kernels or when I am particularly wary. This memory corruption I noticed during regular usage. Seems running the kernel with slub_debug=FZP page_poison=1 is a good thing. ;)

> Could you try reducing CONFIG_LOWMEM_SIZE to 0x28000000 for instance and see
> if the memory corruption still happens ?
Thanks, that did the trick! With CONFIG_LOWMEM_SIZE=0x28000000 the memory corruption is gone on VMAP_STACK enabled kernels. Tested it additionally on current 5.16.4 where this works too.
Comment 12 Erhard F. 2022-04-18 13:54:10 UTC
Created attachment 300774 [details]
dmesg (5.18-rc3, PowerMac G4 DP)

Another try, running the glibc-2.34 testsuite on kernel 5.18-rc3. Looks like it's still a problem.

[...]
pagealloc: memory corruption
fffdfff0: 00 00 00 00                                      ....
CPU: 0 PID: 21222 Comm: install Not tainted 5.18.0-rc3-PMacG4 #5
Call Trace:
[f8085a70] [c06e8820] dump_stack_lvl+0x80/0xc0 (unreliable)
[f8085a90] [c02c0b2c] __kernel_unpoison_pages+0x1c0/0x204
[f8085ae0] [c02a4cb0] get_page_from_freelist+0xcb4/0xeb0
[f8085ba0] [c02a5754] __alloc_pages+0x184/0x11b4
[f8085c70] [c0230d50] __filemap_get_folio+0x224/0x598
[f8085cf0] [c0240ebc] pagecache_get_page+0x20/0x88
[f8085d10] [c04e2600] prepare_pages+0xf8/0x358
[f8085d60] [c04e4e54] btrfs_buffered_write+0x334/0x850
[f8085e20] [c04ea598] btrfs_do_write_iter+0x3a8/0x768
[f8085e80] [c02ee25c] vfs_write+0x364/0x488
[f8085f00] [c02ee52c] ksys_write+0x78/0x128
[f8085f30] [c001e1a8] ret_from_syscall+0x0/0x2c
--- interrupt: c00 at 0x5c5d08
NIP:  005c5d08 LR: 005c5ce0 CTR: c0289c9c
REGS: f8085f40 TRAP: 0c00   Not tainted  (5.18.0-rc3-PMacG4)
MSR:  0000f932 <EE,PR,FP,ME,IR,DR,RI>  CR: 28022464  XER: 20000000

GPR00: 00000004 af820720 a7ced760 00000006 a77aa000 00020000 00000000 00000000 
GPR08: 00120000 a77a9000 00000008 403d77ca 403d7497 0077fff4 00000000 00020000 
GPR16: 00000000 af8208c8 00020000 00000000 00000000 af821d2b a77aa000 00020000 
GPR24: 00000000 00000006 7ff00000 00000006 a77aa000 00020000 006c7ff4 00020000 
NIP [005c5d08] 0x5c5d08
LR [005c5ce0] 0x5c5ce0
--- interrupt: c00
page:ef4c4ec4 refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x31069
flags: 0x80000000(zone=2)
raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
raw: 00000000
page dumped because: pagealloc: corrupted page detail
Comment 13 Erhard F. 2022-04-18 13:54:51 UTC
Created attachment 300775 [details]
kernel .config (5.18-rc3, PowerMac G4 DP)
Comment 14 Christophe Leroy 2022-05-02 09:43:00 UTC
Do you mean it still happens with the default values, or does it also happen with the reduced CONFIG_LOWMEM_SIZE?
Comment 15 Erhard F. 2022-05-02 11:59:34 UTC
It definitely still happens with the default values. I can test with the reduced CONFIG_LOWMEM_SIZE next week and report back.
Comment 16 Erhard F. 2022-05-11 07:43:25 UTC
Created attachment 300929 [details]
dmesg (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, PowerMac G4 DP)

(In reply to Christophe Leroy from comment #14)
> Do you mean it still happens with the default values, or it also happens
> with the reduced CONFIG_LOWMEM_SIZE ?
Turns out the memory corruption also happens with the reduced CONFIG_LOWMEM_SIZE=0x28000000.

Tested again on v5.18-rc6, both with CONFIG_LOWMEM_SIZE=0x28000000 and without.
Comment 17 Erhard F. 2022-05-11 07:44:00 UTC
Created attachment 300930 [details]
kernel .config (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, PowerMac G4 DP)
Comment 18 Erhard F. 2022-05-11 14:08:06 UTC
OK, here is another problem that occurred while building via distcc on the G4, still with LOWMEM_SIZE=0x28000000 (kernel v5.17.6).

[...]
Oops: Kernel stack overflow, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc ghash_generic gf128mul gcm ccm algif_aead des_generic libdes ctr cbc ecb algif_skcipher aes_generic libaes cmac sha512_generic sha1_generic sha1_powerpc md5 md5_ppc md4 hid_generic b43legacy usbhid mac80211 hid libarc4 cfg80211 snd_aoa_codec_tas rfkill snd_aoa_fabric_layout snd_aoa evdev mac_hid therm_windtunnel firewire_ohci firewire_core crc_itu_t sr_mod cdrom ohci_pci 8250_pci radeon snd_aoa_i2sbus ohci_hcd snd_aoa_soundbus ssb snd_pcm ehci_pci snd_timer pcmcia snd soundcore pcmcia_core hwmon 8250 ehci_hcd i2c_algo_bit 8250_base drm_ttm_helper serial_mctrl_gpio ttm drm_kms_helper usbcore syscopyarea sysfillrect sysimgblt usb_common fb_sys_fops pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs
CPU: 0 PID: 24122 Comm: sh Not tainted 5.17.6-gentoo-PMacG4 #1
NIP:  c0018614 LR: 00000000 CTR: c103cbe0
REGS: e7fe9f50 TRAP: 0000   Not tainted  (5.17.6-gentoo-PMacG4)
MSR:  00001030 <ME,IR,DR>  CR: 00000001  XER: c000e234

GPR00: a78bfe90 80002288 00000000 d6a5e1a0 e991de60 0068c6c4 a7a3ff98 c1099000 
GPR08: 00000000 e991dec0 d6a5e1a0 80002288 005900d0 0068fff4 00000000 00000007 
GPR16: 00000029 00000007 00bc44b0 a7ddafe8 a78bfe90 fffff000 00000000 00000000 
GPR24: 005900d0 0068c6c4 c0dcc7a0 c1402b48 caa899c0 c103cbe0 c4ce9400 c08fe234 
NIP [c0018614] interrupt_return+0x17c/0x190
LR [00000000] 0x0
Call Trace:
Instruction dump:
40860018 7ccff120 80c10028 80010010 80210014 4c000064 7ccff120 7d3043a6 
392100c0 80c10028 80010010 80210014 <91210000> 7d3042a6 4c000064 7c000828 
---[ end trace 0000000000000000 ]---


@Christophe: Would it be helpful for these issues to try a KASAN build?
Comment 19 Christophe Leroy 2022-05-12 14:44:15 UTC
Yes, KASAN can provide some additional input.
Maybe start with CONFIG_KFENCE, it is lighter than KASAN.


For the above problem, maybe CONFIG_DEBUG_STACKOVERFLOW can help.
Comment 20 Erhard F. 2022-05-12 21:43:02 UTC
DEBUG_STACKOVERFLOW and KFENCE have already been enabled in the builds I did here (see the attached kernel .config).

However, if I enable (inline) KASAN the kernel won't boot at all. I get dropped to the OpenFirmware console with:

[...]
Finalizing device tree... using OF tree (promptr=ff847240)

Invalid memory access at %SRR0: 40000000 %SRR1: 00000000
Comment 21 Michael Ellerman 2022-05-13 02:31:34 UTC
Increasing the stack size (CONFIG_THREAD_SHIFT) might avoid the stack overflows and allow you to debug the original issue in isolation.
Comment 22 Erhard F. 2022-05-16 18:51:04 UTC
Created attachment 300977 [details]
dmesg (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, outline KASAN, PowerMac G4 DP)

I increased THREAD_SHIFT to 14 and used outline KASAN, still with CONFIG_LOWMEM_SIZE=0x28000000. The memory corruption output looks slightly different (but not much):

[...]
pagealloc: memory corruption
f5fcfff0: 00 00 00 00                                      ....
CPU: 1 PID: 29742 Comm: ld.so.1 Not tainted 5.18.0-rc6-PMacG4 #7
Call Trace:
[eea3ba90] [c09890d4] dump_stack_lvl+0x80/0xc0 (unreliable)
[eea3bab0] [c03cce40] __kernel_unpoison_pages+0x208/0x250
[eea3bb00] [c03a2e48] post_alloc_hook+0x108/0x144
[eea3bb30] [c03a66e0] get_page_from_freelist+0x9d4/0x12dc
[eea3bc70] [c03a7ad0] __alloc_pages+0x23c/0x1570
[eea3bde0] [c0379c8c] handle_mm_fault+0x610/0x1240
[eea3bed0] [c002e2d4] ___do_page_fault+0x19c/0x850
[eea3bf10] [c002ebbc] do_page_fault+0x28/0x5c
[eea3bf30] [c000433c] DataAccess_virt+0x124/0x17c
--- interrupt: 300 at 0x6fe0338c
NIP:  6fe0338c LR: 6fe032c4 CTR: 6fe033e0
REGS: eea3bf40 TRAP: 0300   Not tainted  (5.18.0-rc6-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 48002262  XER: 20000000
DAR: 046a5000 DSISR: 42000000 
GPR00: 6ffbcb94 afc45940 a7c95560 046a4fe4 8a000000 000127e0 03e59a8b 00000003 
GPR08: 046a5004 046a5000 04621cfc 6fe03170 6fe032c4 6ffece34 00000000 6ffef34d 
GPR16: 02dea020 04416750 00000003 01f8cbec 02de9fa0 01f8c660 00000000 00000000 
GPR24: afc45aa0 6ffef37c afc45a18 04678c7c 0007630c 04678c7c 6ff76ff4 045f5990 
NIP [6fe0338c] 0x6fe0338c
LR [6fe032c4] 0x6fe032c4
--- interrupt: 300
page:e739d6ec refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x290a3
flags: 0x80000000(zone=2)
raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
raw: 00000000
page dumped because: pagealloc: corrupted page details
[...]

With THREAD_SHIFT=14 the stack issue does not show up.

A kernel with inline KASAN and otherwise the same setup won't boot, showing me this at the OpenFirmware prompt:

[...]
Finalizing device tree... using OF tree (promptr=ff847240)

Invalid memory access at %SRR0: 40000000 %SRR1: 00000000
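
As a side calculation on the page dumps in this thread, the sketch below converts two of the reported pfn values to physical addresses and compares them with the lowmem boundary of the kernel that produced them. It assumes 4K pages (PAGE_SIZE=4K appears in the Oops header in comment 18) and, for the 5.15.10 report, a default CONFIG_LOWMEM_SIZE of 0x30000000, which is not stated in the thread. It only illustrates that both pages lie above the respective boundary, which would be consistent with the zone=2 flag in the dumps if zone 2 is HighMem on this configuration; it does not claim a cause.

```python
PAGE_SHIFT = 12   # assumed 4K pages
MIB = 1 << 20

# (label, pfn from the page dump, lowmem size of that kernel)
REPORTS = [
    ("5.15.10, assumed default lowmem", 0x31065, 0x30000000),
    ("5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000", 0x290A3, 0x28000000),
]


def phys_addr(pfn):
    """Physical address of the first byte of page frame `pfn`."""
    return pfn << PAGE_SHIFT


for label, pfn, lowmem in REPORTS:
    pa = phys_addr(pfn)
    above = (pa - lowmem) / MIB
    print(f"{label}: phys {pa:#010x}, {above:.2f} MiB above the lowmem boundary")
```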
Comment 23 Erhard F. 2022-05-16 18:52:51 UTC
Created attachment 300978 [details]
kernel .config (5.18-rc6, CONFIG_LOWMEM_SIZE=0x28000000, outline KASAN, PowerMac G4 DP)
Comment 24 Christophe Leroy 2022-05-17 08:25:22 UTC
It seems that with inline KASAN your kernel is far too big compared to what we support for the time being:

c2468000 T __end_rodata
c2800000 T __init_begin
c2800000 T _sinittext

c2801644 T prom_init

The init text section lies beyond the 32 MB boundary, which means that prom_init and other functions are no longer called directly but via a trampoline.

c000000c <__start>:
c000000c:       2c 05 00 00     cmpwi   r5,0
c0000010:       41 82 00 1c     beq     c000002c <__start+0x20>
c0000014:       42 9f 00 05     bcl     20,4*cr7+so,c0000018 <__start+0xc>
c0000018:       7d 08 02 a6     mflr    r8
c000001c:       3d 08 00 00     addis   r8,r8,0
c0000020:       39 08 ff e8     addi    r8,r8,-24
c0000024:       48 00 38 e5     bl      c0003908 <setup_disp_bat+0x30>
...
c0003908:       3d 80 c2 80     lis     r12,-15744
c000390c:       39 8c 16 44     addi    r12,r12,5700
c0003910:       7d 89 03 a6     mtctr   r12
c0003914:       4e 80 04 20     bctr


And it cannot work because at that time the kernel is not yet relocated to its final location.

There was the same problem with PPC64 and it was fixed by 24d33ac5b8ff ("powerpc/64s: Make prom_init require RELOCATABLE").

Don't know if a similar approach could work.
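
The numbers in the listing above can be checked with a short calculation. The sketch below is illustrative only: the helper name is made up, and it relies on the general PowerPC facts that a relative `bl` encodes a signed 26-bit byte displacement (about 32 MiB of reach) and that `lis` loads a 16-bit immediate into the upper half of a register. It shows that the trampoline's lis/addi pair builds the absolute address of prom_init, and that prom_init is too far from the call site in __start for a direct `bl`.

```python
MASK32 = 0xFFFFFFFF


def lis_addi(hi, lo):
    """Value built by `lis rX,hi` followed by `addi rX,rX,lo` on a 32-bit CPU."""
    return ((hi << 16) + lo) & MASK32


# A relative `bl` carries a 24-bit word offset, i.e. a signed 26-bit byte
# displacement, so it reaches at most 32 MiB in either direction.
BL_REACH = 1 << 25

CALL_SITE = 0xC0000024   # address of the `bl` in __start, from the listing
PROM_INIT = 0xC2801644   # address of prom_init, from the symbol list

trampoline_target = lis_addi(-15744, 5700)
distance = PROM_INIT - CALL_SITE
print(hex(trampoline_target), hex(distance), distance > BL_REACH)
```

Because the trampoline loads an absolute link-time address, it can only work once the kernel actually runs at that address, which matches the explanation that the call fails before relocation.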
Comment 25 Christophe Leroy 2022-05-17 08:31:25 UTC
The Kernel stack overflow looks odd.

Value of R1 is wrong and LR is NULL. Don't know how we ended up here, but probably not by a real stack overflow.
Comment 26 Christophe Leroy 2022-05-17 08:35:50 UTC
Note that THREAD_SHIFT is set to 14 when using KASAN:

config THREAD_SHIFT
	int "Thread shift" if EXPERT
	range 13 15
	default "15" if PPC_256K_PAGES
	default "14" if PPC64
	default "14" if KASAN
	default "13"
	help
	  Used to define the stack size. The default is almost always what you
	  want. Only change this if you know what you are doing.
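
For reference, the kernel stack size follows from this option as 1 << THREAD_SHIFT bytes. The short sketch below, with a made-up helper name, lists the sizes for the allowed range of 13 to 15.

```python
def thread_size(thread_shift):
    """Kernel stack size in bytes for a given THREAD_SHIFT."""
    return 1 << thread_shift


for shift in (13, 14, 15):
    print(f"THREAD_SHIFT={shift}: {thread_size(shift) // 1024} KiB stack")
```

So moving from the 32-bit default of 13 to 14, as done in comment 22, doubles the stack from 8 KiB to 16 KiB.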
Comment 27 Erhard F. 2022-05-28 11:49:57 UTC
I opened a new bug for the stack issue which contains a bit more data. Hopefully the output is of some help (see bug #216041).
Comment 28 Erhard F. 2022-06-28 23:01:54 UTC
Created attachment 301302 [details]
dmesg (5.19-rc4, PowerMac G4 DP)

Re-tried on v5.19-rc4 (without additional patches) + KFENCE.

My findings so far:
1. Memory corruption still persists.
2. Even without KASAN I need THREAD_SHIFT=14 or else I get the stack overflow from bug #216041.
3. Memory corruption also happens with CONFIG_LOWMEM_SIZE=0x28000000.
4. But the "neverending build" behaviour mentioned in comment #9 is gone (be it with default .config or CONFIG_LOWMEM_SIZE=0x28000000).

[...]
pagealloc: memory corruption
fffdfff0: 00 00 00 00                                      ....
CPU: 0 PID: 29136 Comm: localedef Not tainted 5.19.0-rc4-PMacG4 #3
Call Trace:
[f39b3c20] [c05eb9c0] dump_stack_lvl+0x60/0x90 (unreliable)
[f39b3c40] [c0232fb0] __kernel_unpoison_pages+0x1a8/0x1ec
[f39b3c90] [c02170dc] get_page_from_freelist+0xc20/0xe70
[f39b3d50] [c0217bdc] __alloc_pages+0x18c/0xe80
[f39b3e10] [c01f46b4] wp_page_copy+0x214/0xa1c
[f39b3e80] [c01fa0b8] handle_mm_fault+0x720/0xd64
[f39b3f00] [c00215dc] do_page_fault+0x1d4/0x830
[f39b3f30] [c000433c] DataAccess_virt+0x124/0x17c
--- interrupt: 300 at 0x669410
NIP:  00669410 LR: 006693e4 CTR: 00000000
REGS: f39b3f40 TRAP: 0300   Not tainted  (5.19.0-rc4-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 84002462  XER: 20000000
DAR: a7a3cce8 DSISR: 0a000000 
GPR00: 0066961c afd34060 a7bd3000 01a069bc 01b76d60 00000009 a4e0c05a 0005ccd8 
GPR08: 01b76140 a7a3cce8 a7a43e44 400a713a 44002862 0068fe34 01b8d730 00000001 
GPR16: 00000000 01a069bc 01a069f8 01a06990 01b8d170 01a06894 0000000f 00000009 
GPR24: 01b76d60 a4e0c05a 0000018d a7ad9f00 a79e0010 000041cb 00697cdc 01a069bc 
NIP [00669410] 0x669410
LR [006693e4] 0x6693e4
--- interrupt: 300
page:ef4bd80c refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x310ab
flags: 0x80000000(zone=2)
raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
raw: 00000000
page dumped because: pagealloc: corrupted page details
Comment 29 Erhard F. 2022-06-28 23:02:58 UTC
Created attachment 301303 [details]
kernel .config (5.19-rc4, PowerMac G4 DP)
Comment 30 Michael Ellerman 2022-06-29 05:13:09 UTC
It's a bit of a stab in the dark, but can you try turning preempt off?

ie. CONFIG_PREEMPT_NONE=y
Comment 31 Erhard F. 2022-06-29 10:25:52 UTC
(In reply to Michael Ellerman from comment #30)
> It's a bit of a stab in the dark, but can you try turning preempt off?
> 
> ie. CONFIG_PREEMPT_NONE=y
Just tested that. Backtrace looks a little different but not much.

[..]
pagealloc: memory corruption
fffdfff0: 00 00 00 00                                      ....
CPU: 0 PID: 29086 Comm: localedef Not tainted 5.19.0-rc4-PMacG4 #2
Call Trace:
[f397bc90] [c05eb280] dump_stack_lvl+0x60/0x90 (unreliable)
[f397bcb0] [c0233128] __kernel_unpoison_pages+0x1a8/0x1ec
[f397bd00] [c02172ec] get_page_from_freelist+0xc20/0xe70
[f397bdc0] [c0217de0] __alloc_pages+0x180/0xe98
[f397be80] [c01fa164] handle_mm_fault+0x450/0xd64
[f397bf00] [c00215d8] do_page_fault+0x1d0/0x82c
[f397bf30] [c000433c] DataAccess_virt+0x124/0x17c
--- interrupt: 300 at 0x83f1b8
NIP:  0083f1b8 LR: 0083e25c CTR: 00000000
REGS: f397bf40 TRAP: 0300   Not tainted  (5.19.0-rc4-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 88224462  XER: 00000000
DAR: 01232b3c DSISR: 42000000 
GPR00: 00840220 af9416c0 a7ca4000 01231b50 00000fe0 00000005 01232b38 00000000 
GPR08: 00000ff1 01231b48 0000f4c9 008422b0 01067408 00a2fe34 00000070 01231b50 
GPR16: 00000000 00000000 00000000 00000007 0000003f 009ba23c 01067010 009ba79c 
GPR24: 00000062 009bdac8 000000fe 009ba79c 00000fe0 009ba764 009b9ff4 00000ff0 
NIP [0083f1b8] 0x83f1b8
LR [0083e25c] 0x83e25c
--- interrupt: 300
page:ef4bd80c refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x310ab
flags: 0x80000000(zone=2)
raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
raw: 00000000
page dumped because: pagealloc: corrupted page details


The interesting thing is that the memory corruption always seems to happen in the last stage of installing, after building is done, while copying the binaries from the build directory to the target directory:

[...]
if test -r /var/tmp/portage/sys-libs/glibc-2.34-r13/image//usr/include/gnu/stubs-32.h && cmp -s /var/tmp/portage/sys-libs/glibc-2.34-r13/work/build-ppc-powerpc-unknown-linux-gnu-nptl/stubs.h /var/tmp/portage/sys-libs/glibc-2.34-r13/image//usr/include/gnu/stubs-32.h; \
then echo 'stubs.h unchanged'; \
else /usr/lib/portage/python3.10/ebuild-helpers/xattr/install -c -m 644 /var/tmp/portage/sys-libs/glibc-2.34-r13/work/build-ppc-powerpc-unknown-linux-gnu-nptl/stubs.h /var/tmp/portage/sys-libs/glibc-2.34-r13/image//usr/include/gnu/stubs-32.h; fi
rm -f /var/tmp/portage/sys-libs/glibc-2.34-r13/work/build-ppc-powerpc-unknown-linux-gnu-nptl/stubs.h
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.34-r13/work/glibc-2.34'
>>> Completed installing sys-libs/glibc-2.34-r13 into
>>> /var/tmp/portage/sys-libs/glibc-2.34-r13/image

 * Final size of build directory: 635640 KiB (620.7 MiB)
 * Final size of installed tree:  109892 KiB (107.3 MiB)

making executable: /usr/lib/libc.so
compressme           : 44.96%   (  3.80 KiB =>   1.71 KiB, compressme.zst)     
[...]
/var/tmp/portage/sys-libs/glibc-2.34-r13/image/usr/share/doc/glibc-2.34-r13/NEWS : 33.98%   (   315 KiB =>    107 KiB, /var/tmp/portage/sys-libs/glibc-2.34-r13/image/usr/share/doc/glibc-2.34-r13/NEWS.zst) 
strip: powerpc-unknown-linux-gnu-strip --strip-unneeded -N __gentoo_check_ldflags__ -R .comment -R .GCC.command.line -R .note.gnu.gold-version
   /usr/lib/crt1.o
   /usr/lib/Mcrt1.o
   /usr/lib/gcrt1.o
   /usr/lib/Scrt1.o
[...]
   /lib/ld.so.1
   /usr/lib/audit/sotruss-lib.so
   /usr/bin/pldd
installsources: rsyncing source files
rsync: [sender] link_stat "/var/tmp/portage/sys-libs/glibc-2.34-r13/work/glibc-2.34/iconv/charmap-kw.gperf" failed: No such file or directory (2)
rsync: [sender] link_stat "/var/tmp/portage/sys-libs/glibc-2.34-r13/work/glibc-2.34/locale/charmap-kw.gperf" failed: No such file or directory (2)
rsync: [sender] link_stat "/var/tmp/portage/sys-libs/glibc-2.34-r13/work/glibc-2.34/locale/locfile-kw.gperf" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1326) [sender=3.2.4]

>>> Installing (1 of 1) sys-libs/glibc-2.34-r13::gentoo
 * Defaulting /etc/host.conf:multi to on
 * Last-minute run tests with ./ld.so.1 in /lib ...
[...]
Comment 32 Erhard F. 2022-06-30 13:41:19 UTC
(In reply to Michael Ellerman from comment #30)
> It's a bit of a stab in the dark, but can you try turning preempt off?
> 
> ie. CONFIG_PREEMPT_NONE=y
Looks like your intuition was not bad at all. ;) CONFIG_PREEMPT_NONE=y had no effect, but when I disable SMP entirely ('# CONFIG_SMP is not set') I get no memory corruption and no stack overflow issues either.

Also, no special treatment via Advanced Options or setting THREAD_SHIFT manually was necessary. The G4 does just fine, albeit on only 1 of its 2 CPUs with SMP disabled.

For testing I ran 6 of these glibc testsuite builds in a row without any issues. With SMP enabled I get memory corruption or a stack overflow on the 1st build almost all of the time.
Comment 33 Erhard F. 2022-07-05 16:02:25 UTC
Created attachment 301337 [details]
dmesg (5.19-rc5, outline KASAN, PowerMac G4 DP)

Re-tested on 5.19-rc5 + https://patchwork.ozlabs.org/project/linuxppc-dev/patch/2ee707512b8b212b079b877f4ceb525a1606a3fb.1656655567.git.christophe.leroy@csgroup.eu/

I can run the kernel with outline KASAN and the default THREAD_SHIFT, without needing any advanced options. I also no longer get the stack issue (bug #216041).

However, as long as CONFIG_SMP=y (CONFIG_NR_CPUS=2) is set I still get the memory corruption:

[...]
pagealloc: memory corruption
f5fcfff0: 00 00 00 00                                      ....
CPU: 1 PID: 27635 Comm: estrip Not tainted 5.19.0-rc5-PMacG4+ #1
Call Trace:
[f380b9b0] [c0829ebc] dump_stack_lvl+0x60/0x90 (unreliable)
[f380b9d0] [c0307528] __kernel_unpoison_pages+0x1d8/0x220
[f380ba20] [c02dd3bc] post_alloc_hook+0x108/0x144
[f380ba50] [c02e0a70] get_page_from_freelist+0x9e0/0x1278
[f380bb90] [c02e1e04] __alloc_pages+0x250/0x1078
[f380bcf0] [c02af098] wp_page_copy+0x128/0xdb8
[f380bde0] [c02b6fdc] handle_mm_fault+0x954/0x1138
[f380bed0] [c0029938] ___do_page_fault+0x250/0x84c
[f380bf10] [c002a168] do_page_fault+0x28/0x5c
[f380bf30] [c000433c] DataAccess_virt+0x124/0x17c
--- interrupt: 300 at 0x65b734
NIP:  0065b734 LR: 0065b708 CTR: 00354600
REGS: f380bf40 TRAP: 0300   Not tainted  (5.19.0-rc5-PMacG4+)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 82222420  XER: 00000000
DAR: 026fcea0 DSISR: 0a000000 
GPR00: 00000000 afbd5250 a7b0c560 026bb5f0 0269deac 026bb628 696e6f64 026fcea0 
GPR08: 00000000 00000000 00000000 00354600 42222420 0071fff4 026af620 0072243c 
GPR16: 00723b50 007223a4 026b1770 026ec8a0 007222e4 0269de70 02700920 00000001 
GPR24: 00721e9c 00721eb8 0072082c 00000000 afbd52ec 00000000 0072608c 00000000 
NIP [0065b734] 0x65b734
LR [0065b708] 0x65b708
--- interrupt: 300
page:ef4bd6ec refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x310a3
flags: 0x80000000(zone=2)
raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
raw: 00000000
page dumped because: pagealloc: corrupted page details
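
For readers unfamiliar with the "pagealloc: memory corruption" report: with page poisoning enabled, freed pages are filled with a poison byte, and the check on allocation dumps the span between the first and last byte that no longer matches. The following is a minimal user-space sketch of that idea, not the kernel code. It assumes the poison byte is 0xaa (PAGE_POISON in include/linux/poison.h); the function names and the page base address 0xf5fcf000 (the dumped address rounded down to a 4K page) are chosen for illustration only.

```python
PAGE_POISON = 0xAA  # assumption: poison byte as defined in include/linux/poison.h


def find_corruption(page: bytes, poison: int = PAGE_POISON):
    """Return (first, last) offsets of non-poison bytes, or None if intact."""
    bad = [i for i, b in enumerate(page) if b != poison]
    if not bad:
        return None
    return bad[0], bad[-1]


def report(page: bytes, base: int) -> str:
    """Format a report resembling the two-line dump in the log above."""
    span = find_corruption(page)
    if span is None:
        return "page intact"
    first, last = span
    hexbytes = " ".join(f"{b:02x}" for b in page[first:last + 1])
    return f"pagealloc: memory corruption\n{base + first:08x}: {hexbytes}"


# Model of the 5.19-rc5 report: four zero bytes at offset 0xff0 of a poisoned page.
page = bytearray([PAGE_POISON] * 4096)
page[0xFF0:0xFF4] = bytes(4)
print(report(bytes(page), 0xF5FCF000))
```

Read this way, the dump line "f5fcfff0: 00 00 00 00" would mean that four bytes of a free page were overwritten with zeros between the page being freed and being handed out again.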
Comment 34 Erhard F. 2022-08-23 21:45:13 UTC
Created attachment 301639 [details]
dmesg (6.0-rc2, outline KASAN, PowerMac G4 DP)

Getting a more interesting backtrace with v6.0.0-rc2 + outline KASAN:

[...]
BUG: KASAN: slab-out-of-bounds in handle_mm_fault+0x27c/0x10f4
Read of size 4 at addr c32edd48 by task cc1plus/1230

CPU: 1 PID: 1230 Comm: cc1plus Tainted: G                T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2bd40] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2bd60] [c032b8d8] print_report+0x30c/0x688
[f4d2bdb0] [c032befc] kasan_report+0xe4/0x214
[f4d2be00] [c02ce4d8] handle_mm_fault+0x27c/0x10f4
[f4d2bed0] [c002cc98] ___do_page_fault+0x25c/0x8d0
[f4d2bf10] [c002d560] do_page_fault+0x28/0x6c
[f4d2bf30] [c000433c] DataAccess_virt+0x124/0x17c
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G                T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300

Allocated by task 1:
 __kasan_slab_alloc+0xd0/0x134
 kmem_cache_alloc+0x21c/0x66c
 __kernfs_new_node+0xe8/0x354
 kernfs_new_node+0x84/0xfc
 __kernfs_create_file+0x50/0x204
 sysfs_add_file_mode_ns+0xf4/0x1f0
 internal_create_group+0x1f0/0x620
 btrfs_init_sysfs+0x264/0x350
 init_btrfs_fs+0x24/0x280
 do_one_initcall+0xc0/0x34c
 kernel_init_freeable+0x2c0/0x400
 kernel_init+0x28/0x178
 ret_from_kernel_thread+0x5c/0x64

The buggy address belongs to the object at c32edd50
 which belongs to the cache kernfs_node_cache of size 88
The buggy address is located 8 bytes to the left of
 88-byte region [c32edd50, c32edda8)

The buggy address belongs to the physical page:
page:eee4a954 refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x32ed
flags: 0x200(slab|zone=0)
raw: 00000200 00000100 00000122 c1852520 00000000 001e003c ffffffff 00000001
raw: 00000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 c32edc00: 00 00 fc fc fc fc fc fc 00 00 00 00 00 00 00 00
 c32edc80: 00 00 00 fc fc fc fc fc fc 00 00 00 00 00 00 00
>c32edd00: 00 00 00 00 fc fc fc fc fc fc 00 00 00 00 00 00
                                      ^
 c32edd80: 00 00 00 00 00 fc fc fc fc fc fc 00 00 00 00 00
 c32ede00: 00 00 00 00 00 00 fc fc fc fc fc fc 00 00 00 00
==================================================================
Disabling lock debugging due to kernel taint
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
[...]
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
get_swap_device: Bad swap file entry 64cccccc
_swap_info_get: Bad swap file entry 64cccccc
BUG: Bad page map in process cc1plus  pte:cccccccc pmd:032ed000
addr:9a352000 vm_flags:00100073 anon_vma:c5933ee8 mapping:00000000 index:9a352
file:(null) fault:0x0 mmap:0x0 read_folio:0x0
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2b9b0] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2b9d0] [c02c5bc4] print_bad_pte+0x2e8/0x364
[f4d2ba60] [c02c9c3c] unmap_page_range+0x964/0xb78
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
_swap_info_get: Bad swap file entry 64cccccc
BUG: Bad page map in process cc1plus  pte:cccccccc pmd:032ed000
addr:9a353000 vm_flags:00100073 anon_vma:c5933ee8 mapping:00000000 index:9a353
file:(null) fault:0x0 mmap:0x0 read_folio:0x0
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2b9b0] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2b9d0] [c02c5bc4] print_bad_pte+0x2e8/0x364
[f4d2ba60] [c02c9c3c] unmap_page_range+0x964/0xb78
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
BUG: Bad page map in process cc1plus  pte:00000001 pmd:032ed000
page:eedd8000 refcount:1 mapcount:-1 mapping:00000000 index:0x0 pfn:0x0
flags: 0x1000(reserved|zone=0)
raw: 00001000 eedd8004 eedd8004 00000000 00000000 00000000 fffffffe 00000001
raw: 00000000
page dumped because: bad pte
addr:9a354000 vm_flags:00100073 anon_vma:c5933ee8 mapping:00000000 index:9a354
file:(null) fault:0x0 mmap:0x0 read_folio:0x0
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2b9b0] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2b9d0] [c02c5bc4] print_bad_pte+0x2e8/0x364
[f4d2ba60] [c02c9974] unmap_page_range+0x69c/0xb78
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
_swap_info_get: Bad swap file entry 14c32d92
BUG: Bad page map in process cc1plus  pte:c32d9228 pmd:032ed000
addr:9a356000 vm_flags:00100073 anon_vma:c5933ee8 mapping:00000000 index:9a356
file:(null) fault:0x0 mmap:0x0 read_folio:0x0
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2b9b0] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2b9d0] [c02c5bc4] print_bad_pte+0x2e8/0x364
[f4d2ba60] [c02c9c3c] unmap_page_range+0x964/0xb78
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
_swap_info_get: Bad swap file entry 60c0e0fa
BUG: Bad page map in process cc1plus  pte:c0e0fac0 pmd:032ed000
addr:9a357000 vm_flags:00100073 anon_vma:c5933ee8 mapping:00000000 index:9a357
file:(null) fault:0x0 mmap:0x0 read_folio:0x0
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2b9b0] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2b9d0] [c02c5bc4] print_bad_pte+0x2e8/0x364
[f4d2ba60] [c02c9c3c] unmap_page_range+0x964/0xb78
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
_swap_info_get: Bad swap file entry 50c32ed0
BUG: Bad page map in process cc1plus  pte:c32ed0a0 pmd:032ed000
addr:9a358000 vm_flags:00100073 anon_vma:c5933ee8 mapping:00000000 index:9a358
file:(null) fault:0x0 mmap:0x0 read_folio:0x0
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2b9b0] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2b9d0] [c02c5bc4] print_bad_pte+0x2e8/0x364
[f4d2ba60] [c02c9c3c] unmap_page_range+0x964/0xb78
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
_swap_info_get: Bad swap file entry 4c2a45bd
BUG: Bad page map in process cc1plus  pte:2a45bd98 pmd:032ed000
addr:9a35c000 vm_flags:00100073 anon_vma:c5933ee8 mapping:00000000 index:9a35c
file:(null) fault:0x0 mmap:0x0 read_folio:0x0
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
Call Trace:
[f4d2b9b0] [c0864cc4] dump_stack_lvl+0x60/0xa4 (unreliable)
[f4d2b9d0] [c02c5bc4] print_bad_pte+0x2e8/0x364
[f4d2ba60] [c02c9c3c] unmap_page_range+0x964/0xb78
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
BUG: Unable to handle kernel data access on read at 0x09fcbaf8
Faulting instruction address: 0xc02c99a8
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc hid_generic usbhid hid b43legacy mac80211 snd_aoa_codec_tas libarc4 snd_aoa_fabric_layout snd_aoa cfg80211 rfkill evdev mac_hid firewire_ohci therm_windtunnel firewire_core sr_mod cdrom crc_itu_t snd_aoa_i2sbus snd_aoa_soundbus snd_pcm snd_timer snd 8250_pci soundcore ssb pcmcia pcmcia_core 8250 8250_base serial_mctrl_gpio ohci_pci radeon hwmon ohci_hcd ehci_pci i2c_algo_bit drm_ttm_helper ttm drm_display_helper ehci_hcd drm_kms_helper syscopyarea sysfillrect usbcore sysimgblt fb_sys_fops usb_common fuse drm drm_panel_orientation_quirks configfs
CPU: 0 PID: 1230 Comm: cc1plus Tainted: G    B   W       T  6.0.0-rc2-PMacG4 #5
NIP:  c02c99a8 LR: c02c99a8 CTR: 00000000
REGS: f4d2b9a0 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 24d88838  XER: 20000000
DAR: 09fcbaf8 DSISR: 40000000 
GPR00: 00000000 f4d2ba60 c3ff0020 00000000 00000000 00000000 00000000 00000000 
GPR08: 00000000 00000000 00000000 00000000 00000000 11c3d4e0 f4d2bad0 00000000 
GPR16: c7d11008 fe9a5752 c0de15e0 f4d2bb50 f4d2bab0 c3626ac8 00000000 c16ed525 
GPR24: 9a400000 fffffffd 00000000 f4d2bc10 09fcbaf4 09fcbaf4 9a35f000 c32edd78 
NIP [c02c99a8] unmap_page_range+0x6d0/0xb78
LR [c02c99a8] unmap_page_range+0x6d0/0xb78
Call Trace:
[f4d2ba60] [c02c99a8] unmap_page_range+0x6d0/0xb78 (unreliable)
[f4d2bb20] [c02ca590] unmap_vmas+0x168/0x2d4
[f4d2bbd0] [c02d8af0] exit_mmap+0x11c/0x2dc
[f4d2bca0] [c005e8f4] mmput+0xa0/0x254
[f4d2bcd0] [c006e1b4] do_exit+0x430/0xe08
[f4d2bd50] [c006ed88] do_group_exit+0x68/0x11c
[f4d2bd80] [c0086818] get_signal+0xbfc/0xc50
[f4d2be30] [c000edf8] do_notify_resume+0xf0/0x540
[f4d2bf10] [c0019cfc] interrupt_exit_user_prepare_main+0x7c/0xd0
[f4d2bf30] [c00234ac] interrupt_return+0x14/0x190
--- interrupt: 300 at 0xfa9c0c0
NIP:  0fa9c0c0 LR: 1066b838 CTR: 0fa9bea4
REGS: f4d2bf40 TRAP: 0300   Tainted: G    B   W       T   (6.0.0-rc2-PMacG4)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 24022828  XER: 20000000
DAR: 9a352014 DSISR: 42000000 
GPR00: 1066b828 af869c10 a7dd1ba0 9a352000 00000000 00000018 9a352018 00000000 
GPR08: 11c30000 0fb89a88 099aec30 0fa9bea4 88022444 11c3d4e0 00000001 af869e78 
GPR16: 9afeeed0 9afedd60 10cef83c 115fdfd0 af869e80 11603030 9afeeed0 00000002 
GPR24: 9afeeed0 9b611f60 115fdfd0 a0c82c30 9afeeed0 00000005 0000006e 9a352000 
NIP [0fa9c0c0] 0xfa9c0c0
LR [1066b838] 0x1066b838
--- interrupt: 300
Instruction dump:
7ecfb378 82410014 82c10018 4bffff04 3d40c170 578901be 83aa5580 1d290024 
7fbd4a14 387d0004 7fbceb78 48063245 <813d0004> 712a0001 40820304 7f83e378 
---[ end trace 0000000000000000 ]---

Fixing recursive fault but reboot is needed!


I deleted about 120,000 lines of "get_swap_device: Bad swap file entry 64cccccc" from the kernel dmesg to make it more compact. The swap partition is 8192 MiB, at /dev/sdb6.
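
The KASAN address and the later "Bad page map" entries in the log above appear to line up arithmetically. The following is a minimal sketch of that arithmetic, not a diagnosis. It assumes 32-bit PTEs and 1024 PTEs per page-table page (neither is stated in the log) and takes PAGE_OFFSET as 0xc0000000, which is consistent with the slab page at pfn 0x32ed holding the object at c32edd50. The helper name pte_slot is made up for illustration.

```python
PAGE_SHIFT = 12            # "PAGE_SIZE=4K" in the oops banner
PTRS_PER_PTE = 1024        # assumption: 10-bit PTE index
PTE_SIZE = 4               # assumption: 32-bit PTEs (pte values print as 8 hex digits)
PAGE_OFFSET = 0xC0000000   # assumption: linear-map base, consistent with the log

PMD = 0x032ED000           # "pmd:032ed000" in the Bad page map lines
SLAB_PFN = 0x32ED          # "pfn:0x32ed" of the slab page in the KASAN report
OBJ_START, OBJ_END = 0xC32EDD50, 0xC32EDDA8  # 88-byte kernfs_node_cache object


def pte_slot(addr: int) -> int:
    """Kernel virtual address of the PTE slot mapping user address addr."""
    index = (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)
    return PAGE_OFFSET + (PMD & ~0xFFF) + index * PTE_SIZE


# The page-table page named by the pmd is the same physical page as the slab page.
assert SLAB_PFN << PAGE_SHIFT == PMD & ~0xFFF

# User addresses taken from the "addr:" fields of the Bad page map reports.
for addr in (0x9A352000, 0x9A353000, 0x9A354000, 0x9A356000,
             0x9A357000, 0x9A358000, 0x9A35C000):
    slot = pte_slot(addr)
    where = "inside object" if OBJ_START <= slot < OBJ_END else "left of object"
    print(f"user {addr:08x} -> pte slot {slot:08x} ({where})")
```

Under these assumptions the slot for 9a352000 is c32edd48, the address KASAN flagged, and the slots for 9a354000 onward fall inside the kernfs_node_cache object. That would be consistent with one physical page being used as a page table and as a slab page at the same time, but the log alone does not prove it.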
Comment 35 Erhard F. 2022-08-23 21:56:28 UTC
Created attachment 301640 [details]
kernel .config (6.0-rc2, PowerMac G4 DP)
Comment 36 Christophe Leroy 2023-05-19 18:48:13 UTC
It would be nice to give it another try with KCSAN enabled.

To get KCSAN on powerpc/32, apply the following series: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=354731
Comment 37 Erhard F. 2023-05-23 19:55:46 UTC
Created attachment 304308 [details]
dmesg (6.3.3, KCSAN, PowerMac G4 DP)

Thanks for taking another look at this, Christophe!

Applied the patches on top of 6.3.3 and these are my findings so far:

1. KCSAN works fine on my G4 and passes self tests.
2. It does not generate any additional output when I hit the "pagealloc: memory corruption".
3. When setting CONFIG_KCSAN_WEAK_MEMORY=y my G4 won't finish booting. Early boot works and the screen shows some dmesg output, but booting gets stuck there and never reaches the console. I also don't get any netconsole output with CONFIG_KCSAN_WEAK_MEMORY=y.
4. As soon as I set CONFIG_KCSAN_EARLY_ENABLE=y dmesg shows plenty of data races!

netconsole output and kernel .config attached.

To provoke the memory corruption, 'stress' is a good tool. stress -m2 --vm-bytes 915M provokes the corruption easily, and --vm-bytes 915M is small enough not to trigger the OOM killer on my G4 DP with its 2 CPUs and 2 GiB RAM.
Comment 38 Erhard F. 2023-05-23 19:56:37 UTC
Created attachment 304309 [details]
kernel .config (6.3.3, PowerMac G4 DP)
Comment 39 Erhard F. 2023-05-23 21:17:56 UTC
No change with 6.4-rc4, only additional data "page_type: 0xffffffff()" is shown:

[...]
pagealloc: memory corruption
06fe3258: 00 00 00 00                                      ....
CPU: 1 PID: 397 Comm: stress Tainted: G        W       T  6.4.0-rc3-PMacG4-dirty #1
Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
Call Trace:
[f2a35c70] [c0eea17c] dump_stack_lvl+0x60/0xa4 (unreliable)
[f2a35c90] [c0eea1d8] dump_stack+0x18/0x30
[f2a35ca0] [c0360f90] __kernel_unpoison_pages+0x234/0x288
[f2a35ce0] [c033fdf4] get_page_from_freelist+0xd90/0x10d8
[f2a35d90] [c0340978] __alloc_pages+0x138/0xdd8
[f2a35e40] [c0315b80] handle_mm_fault+0xab8/0x15e0
[f2a35ed0] [c003a3d4] ___do_page_fault+0x320/0x8c4
[f2a35f10] [c003abe0] do_page_fault+0x28/0x80
[f2a35f30] [c000433c] DataAccess_virt+0x124/0x17c
--- interrupt: 300 at 0xaf30d8
NIP:  00af30d8 LR: 00af30b4 CTR: 00000000
REGS: f2a35f40 TRAP: 0300   Tainted: G        W       T   (6.4.0-rc3-PMacG4-dirty)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 20882464  XER: 00000000
DAR: 8f7a3010 DSISR: 42000000 
GPR00: 00af30b4 af9c9cb0 a7cd2740 6e97d010 39300000 20224462 00000000 00a10264 
GPR08: 20e27000 20e26000 00000000 4062ceda 20882462 00b0fff4 00000000 00000000 
GPR16: 00000000 00000002 00000000 0000005a 40802462 80002462 40002462 00b100a4 
GPR24: ffffffff ffffffff 39300000 00000000 00000000 6e97d010 00b17d64 00001000 
NIP [00af30d8] 0xaf30d8
LR [00af30b4] 0xaf30b4
--- interrupt: 300
page:e314e657 refcount:1 mapcount:0 mapping:00000000 index:0x1 pfn:0x31065
flags: 0x80000000(zone=2)
page_type: 0xffffffff()
raw: 80000000 00000100 00000122 00000000 00000001 00000000 ffffffff 00000001
raw: 00000000
page dumped because: pagealloc: corrupted page details
Comment 40 Erhard F. 2023-10-26 23:40:59 UTC
Created attachment 305297 [details]
dmesg (5.5-rc5, PowerMac G4 DP)

Re-visiting this bug as it's reproducible on v6.6-rc7.

This time I tried the other way round. CONFIG_VMAP_STACK was added for ppc with commit cd08f109e26231b279bcc0388428afcac6408ec6 (at about kernel v5.5-rc5 time). So I did a git checkout cd08f109e26231b279bcc0388428afcac6408ec6 and started from there with a further reduced kernel .config.

I added two additional patches to get the G4 to boot with VMAP_STACK enabled: 4119622 "powerpc/32s: Fix kasan_early_hash_table() for CONFIG_VMAP_STACK" and 232ca1e "powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK".

Then I burdened the memory subsystem with "stress -c 2 --vm 2 --vm-bytes 896M" as before and hit the issue in less than 20 sec. Not hitting the issue means my G4 runs "stress -c 2 --vm 2 --vm-bytes 896M" for about half an hour without side effects.

So it looks like the issue has been there from the start, when CONFIG_VMAP_STACK was added for ppc (see dmesg).

I don't hit the issue when:
   1. nr_cpus=1 is set + VMAP_STACK enabled
   2. VMAP_STACK disabled

Setting LOWMEM_SIZE to 0x28000000 does not seem to have an effect on it.
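
The pass/fail criterion used above (a hit within seconds versus about half an hour of clean running) can be sketched as a small helper. This is only an illustration under assumptions: the 1800 s timeout mirrors the "about half an hour" figure, the signature strings are copied from the logs in this report, and the function names are made up. It also assumes 'stress' and 'dmesg' are installed, that dmesg is readable by the calling user, and that the machine was freshly booted so older hits do not count.

```python
import subprocess

# Message prefixes copied from the dmesg excerpts in this report.
SIGNATURES = (
    "pagealloc: memory corruption",
    "BUG: KASAN:",
    "BUG: Bad page map",
)


def stress_cmd(cpu=2, vm=2, vm_bytes="896M", timeout_s=1800):
    """Build the stress command line from this report, plus a timeout."""
    return ["stress", "-c", str(cpu), "--vm", str(vm),
            "--vm-bytes", vm_bytes, "--timeout", f"{timeout_s}s"]


def first_hit(dmesg_text):
    """Return the first log line containing a known signature, or None."""
    for line in dmesg_text.splitlines():
        if any(sig in line for sig in SIGNATURES):
            return line
    return None


def run_once():
    """Run stress to completion, then scan dmesg (assumes a fresh boot)."""
    subprocess.run(stress_cmd(), check=False)
    log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    hit = first_hit(log)
    return "FAIL: " + hit if hit else "PASS: no corruption signature in dmesg"
```

Calling run_once() on kernels booted with and without nr_cpus=1, or with VMAP_STACK enabled and disabled, would give one PASS/FAIL line per configuration.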


This bug really plays hard to get... I'll do further KCSAN checks on recent kernels and open separate issues if KCSAN digs up something useful.
Comment 41 Christophe Leroy 2023-10-26 23:41:11 UTC
Created attachment 305298 [details]
attachment-30247-0.html

I'm out of office until 06 Nov.
Comment 42 Erhard F. 2023-10-26 23:46:07 UTC
Created attachment 305299 [details]
kernel .config (5.5-rc5, PowerMac G4 DP)
