Bug 209181 - kernel BUG at arch/powerpc/mm/pgtable.c:304!
Summary: kernel BUG at arch/powerpc/mm/pgtable.c:304!
Status: RESOLVED DUPLICATE of bug 209029
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-07 05:44 UTC by Zorro Lang
Modified: 2020-09-07 06:18 UTC

See Also:
Kernel Version: Linux 5.9-rc4
Subsystem:
Regression: No
Bisected commit-id:


Description Zorro Lang 2020-09-07 05:44:15 UTC
Description of problem:
The latest upstream mainline kernel always panics at boot on our ppc64le machines (P9), as shown below:

[    1.406462] Loading compiled-in X.509 certificates 
[    1.436966] Loaded X.509 cert 'Build time autogenerated kernel key: 834a47793f474746e698c2f3a32aa53ffded35db' 
[    1.437154] zswap: loaded using pool lzo/zbud 
[    1.437509] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers 
[    1.437571] ------------[ cut here ]------------ 
[    1.437584] WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/pgtable.c:185 set_pte_at+0xd8/0x1c0 
[    1.437589] Modules linked in: 
[    1.437596] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4 #1 
[    1.437602] NIP:  c00000000009bb28 LR: c00000000152e1c0 CTR: 0000000000000000 
[    1.437608] REGS: c0000001fb6eb7b0 TRAP: 0700   Not tainted  (5.9.0-rc4) 
[    1.437613] MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24002824  XER: 0000000a 
[    1.437624] CFAR: c00000000009ba74 IRQMASK: 0  
[    1.437624] GPR00: c00000000152e1c0 c0000001fb6eba40 c000000002138900 c0000000120f4100  
[    1.437624] GPR04: 000b701718150000 c0000000122400a8 05014e0100000080 0000000000000000  
[    1.437624] GPR08: 0000000000000080 07000000000000c0 05000000000000c0 0000000000000001  
[    1.437624] GPR12: 0000000000002000 c000000004050000 0000000000000000 c000000001569d38  
[    1.437624] GPR16: c000000012210000 f0ffffffffffffff c0000001fb52f8a0 c000000002231cb8  
[    1.437624] GPR20: c0000000010d8de0 c000000001000000 c00000001220b2e8 c0000000122f8000  
[    1.437624] GPR24: 0000000000000100 000000000000014e c0000000122f8028 8000000000000105  
[    1.437624] GPR28: 000b701718150000 c0000000122118c0 c0000000120f4100 c0000000122400a8  
[    1.437668] NIP [c00000000009bb28] set_pte_at+0xd8/0x1c0 
[    1.437674] LR [c00000000152e1c0] debug_vm_pgtable+0x8f4/0x1e14 
[    1.437679] Call Trace: 
[    1.437685] [c0000001fb6eba40] [c000000001082f48] _raw_spin_lock+0x88/0x100 (unreliable) 
[    1.437693] [c0000001fb6eba80] [c00000000152dfd4] debug_vm_pgtable+0x708/0x1e14 
[    1.437700] [c0000001fb6ebb90] [c00000000001208c] do_one_initcall+0xbc/0x5f0 
[    1.437707] [c0000001fb6ebc80] [c0000000014e4d04] kernel_init_freeable+0x4bc/0x58c 
[    1.437714] [c0000001fb6ebdb0] [c000000000012de8] kernel_init+0x2c/0x164 
[    1.437721] [c0000001fb6ebe20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c 
[    1.437726] Instruction dump: 
[    1.437731] 41820068 e8010050 ebc10030 7c0803a6 4bffff8c 4bffff88 3d200700 792907c6  
[    1.437741] 612900c0 7d4a4838 2faa00c0 419eff54 <0fe00000> 4bffff4c 3fe0bfef 63ffffff  
[    1.437751] irq event stamp: 275292 
[    1.437757] hardirqs last  enabled at (275291): [<c0000000004ef4d0>] inc_zone_page_state+0xa0/0xd0 
[    1.437764] hardirqs last disabled at (275292): [<c0000000000096fc>] program_check_common_virt+0x2bc/0x310 
[    1.437771] softirqs last  enabled at (273036): [<c000000000f97044>] inet6_register_protosw+0x154/0x2a0 
[    1.437778] softirqs last disabled at (273034): [<c000000000f96f34>] inet6_register_protosw+0x44/0x2a0 
[    1.437784] ---[ end trace 39aeb34808a575d2 ]--- 
[    1.437790] ------------[ cut here ]------------ 
[    1.437795] kernel BUG at arch/powerpc/mm/pgtable.c:304! 
[    1.437801] Oops: Exception in kernel mode, sig: 5 [#1] 
[    1.437805] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries 
[    1.437807] Modules linked in: 
[    1.437811] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.9.0-rc4 #1 
[    1.437815] NIP:  c00000000009c1a8 LR: c0000000005f9de0 CTR: 0000000000000000 
[    1.437819] REGS: c0000001fb6eb720 TRAP: 0700   Tainted: G        W          (5.9.0-rc4) 
[    1.437822] MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24002828  XER: 0000000a 
[    1.437829] CFAR: c00000000009c148 IRQMASK: 0  
[    1.437829] GPR00: c0000000005f9de0 c0000001fb6eb9b0 c000000002138900 c0000000120f4100  
[    1.437829] GPR04: 000b701718150000 c0000000122400a8 00000000122f8000 0000000000802f12  
[    1.437829] GPR08: 0000000000000000 0000000000000001 0000000000000028 0000000000000001  
[    1.437829] GPR12: 0000000000002000 c000000004050000 0000000000000000 c000000001569d38  
[    1.437829] GPR16: c000000012210000 f0ffffffffffffff c0000001fb52f8a0 c000000002231cb8  
[    1.437829] GPR20: c0000000010d8de0 c000000001000000 c00000001220b2e8 c0000000122f8000  
[    1.437829] GPR24: 0000000000000100 0000000000000008 c000000002231ca8 000000000000000a  
[    1.437829] GPR28: c000000002231cb8 c000000002231cb0 000b701718150000 000000000002dc05  
[    1.437860] NIP [c00000000009c1a8] assert_pte_locked+0x218/0x360 
[    1.437864] LR [c0000000005f9de0] pte_update+0xc0/0x180 
[    1.437867] Call Trace: 
[    1.437870] [c0000001fb6eb9b0] [0000000000000100] 0x100 (unreliable) 
[    1.437875] [c0000001fb6eba20] [c0000000005f9de0] pte_update+0xc0/0x180 
[    1.437879] [c0000001fb6eba80] [c00000000152e1e0] debug_vm_pgtable+0x914/0x1e14 
[    1.437884] [c0000001fb6ebb90] [c00000000001208c] do_one_initcall+0xbc/0x5f0 
[    1.437888] [c0000001fb6ebc80] [c0000000014e4d04] kernel_init_freeable+0x4bc/0x58c 
[    1.437893] [c0000001fb6ebdb0] [c000000000012de8] kernel_init+0x2c/0x164 
[    1.437897] [c0000001fb6ebe20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c 
[    1.437900] Instruction dump: 
[    1.437903] 7c0803a6 60000000 39400001 7fdffc36 7d4ad830 394affff 7d4a07b4 7d4af838  
[    1.437909] 794a1f24 7d09502a 7d090074 7929d182 <0b090000> 79090022 550ac03e ebfc0000  
[    1.437916] ---[ end trace 39aeb34808a575d3 ]--
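
For triage, the two fault sites in the log above (the WARNING in set_pte_at and the subsequent kernel BUG) can be pulled out mechanically. The following is a minimal sketch, not part of the original report: the function name `find_fault_sites` and the regular expression are assumptions made for illustration, and the sample input is just the two relevant lines copied from the log.

```python
import re

# Hypothetical helper: matches "WARNING: ... at <file>:<line>" and
# "kernel BUG at <file>:<line>!" as they appear in the log above.
FAULT_RE = re.compile(
    r"(?P<kind>WARNING|kernel BUG)\b.*?\bat (?P<file>[\w./-]+):(?P<line>\d+)"
)


def find_fault_sites(log_text):
    """Return (kind, file, line) tuples in order of appearance."""
    sites = []
    for raw in log_text.splitlines():
        match = FAULT_RE.search(raw)
        if match:
            sites.append(
                (match.group("kind"), match.group("file"), int(match.group("line")))
            )
    return sites


# Two lines copied from the boot log in this report.
sample = """\
[    1.437584] WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/pgtable.c:185 set_pte_at+0xd8/0x1c0
[    1.437795] kernel BUG at arch/powerpc/mm/pgtable.c:304!
"""

sites = find_fault_sites(sample)
for kind, path, line in sites:
    print(kind, path, line)
```

Both hits point into arch/powerpc/mm/pgtable.c and both are reached from debug_vm_pgtable, which is what ties this report to the page table helper self-test.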

How reproducible:
100% on our ppc64le machines

Steps to Reproduce:
1. git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
2. build and install the kernel (I'll update the .config file later)
3. boot the kernel
<the kernel panics here>

Additional info:
The HEAD of my test kernel is:
commit f4d51dffc6c01a9e94650d95ce0104964f8ae822
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Sep 6 17:11:40 2020 -0700

    Linux 5.9-rc4
Comment 1 Christophe Leroy 2020-09-07 05:52:04 UTC
See https://bugzilla.kernel.org/show_bug.cgi?id=209029

Patch at https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200902040122.136414-1-aneesh.kumar@linux.ibm.com/ to deactivate CONFIG_DEBUG_VM_PGTABLE on powerpc until the issue is fixed.
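
Until such a patch is in the tree being tested, one way to confirm whether a given build is affected is to check whether its .config enables CONFIG_DEBUG_VM_PGTABLE. The sketch below is illustrative only: `config_option_state` is a hypothetical helper, and the sample text is made up (the reporter's actual .config is not attached to this bug). It relies only on the standard .config conventions of `OPTION=value` and `# OPTION is not set`.

```python
def config_option_state(config_text, option):
    """Return the option's value (e.g. 'y' or 'm'), or None if unset or absent."""
    prefix = option + "="
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
        if line == "# " + option + " is not set":
            return None
    return None


# Made-up sample; not the reporter's real configuration.
sample_config = """\
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_VM_PGTABLE=y
# CONFIG_DEBUG_VM_RB is not set
"""

state = config_option_state(sample_config, "CONFIG_DEBUG_VM_PGTABLE")
if state == "y":
    print("CONFIG_DEBUG_VM_PGTABLE is enabled; the boot-time self-test will run")
else:
    print("CONFIG_DEBUG_VM_PGTABLE is not enabled")
```

With the real file, the text would be read from the build directory's .config instead of the inline sample.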
Comment 2 Zorro Lang 2020-09-07 06:18:02 UTC
(In reply to Christophe Leroy from comment #1)
> See https://bugzilla.kernel.org/show_bug.cgi?id=209029
> 
> Patch at
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200902040122.136414-1-aneesh.kumar@linux.ibm.com/
> to deactivate CONFIG_DEBUG_VM_PGTABLE on powerpc until the issue is fixed.

Thanks for this info
Comment 3 Zorro Lang 2020-09-07 06:18:54 UTC

*** This bug has been marked as a duplicate of bug 209029 ***
