Bug 215381

Summary: BUG: Unable to handle kernel data access on read at 0x6600cc00000004
Product: Platform Specific/Hardware
Reporter: Erhard F. (erhard_f)
Component: PPC-64
Assignee: platform_ppc-64
Status: RESOLVED OBSOLETE
Severity: normal
Priority: P1
Hardware: PPC-64
OS: Linux
Kernel Version: 5.15.10
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg (kernel 5.15.10, Talos II)
             kernel .config (kernel 5.15.10, Talos II)

Description Erhard F. 2021-12-21 18:50:57 UTC
Created attachment 300105
dmesg (kernel 5.15.10, Talos II)

This did not happen during boot but shortly afterwards, while compiling some packages over ssh.

[...]
BUG: Unable to handle kernel data access on read at 0x6600cc00000004
Faulting instruction address: 0xc0000000001398c4
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=192 NUMA PowerNV
Modules linked in: auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc rfkill evdev ecb xts snd_hda_codec_hdmi radeon xhci_pci snd_hda_intel ctr snd_intel_dspcfg xhci_hcd snd_hda_codec snd_hwdep ofpart cbc snd_hda_core drm_ttm_helper ttm snd_pcm powernv_flash i2c_algo_bit aes_generic ibmpowernv libaes usbcore drm_kms_helper mtd at24 vmx_crypto snd_timer gf128mul opal_prd hwmon regmap_i2c snd sysimgblt syscopyarea sysfillrect fb_sys_fops usb_common soundcore lz4 lz4_compress lz4_decompress zram zsmalloc powernv_cpufreq drm fuse drm_panel_orientation_quirks backlight configfs
CPU: 22 PID: 55708 Comm: clang Not tainted 5.15.10-gentoo-TalosII #1
NIP:  c0000000001398c4 LR: c000000000a9b6c0 CTR: 0000000000000000
REGS: c000200009dcef10 TRAP: 0380   Not tainted  (5.15.10-gentoo-TalosII)
MSR:  9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 28228244  XER: 000000ae
CFAR: c000000000a9b6bc IRQMASK: 0 
GPR00: c000000000315d88 c000200009dcf1b0 c000000001256d00 006600cc00000000 
GPR04: c00000000d966370 00003fff8dc00000 0000000000000000 0000000000000000 
GPR08: 0000000000000009 ffffffffdead4ead c00c000000000000 0000000000000000 
GPR12: 0000000088228244 c0002007ff7f6500 c000200009dcf710 c00020000595e660 
GPR16: c0000000012a5bc0 00003fff8dc3e000 c000200005436080 00003fff8dc5e000 
GPR20: 0000400000000000 0000000000000000 c0002000097de4a0 006600cc00000000 
GPR24: c000000001260985 0000000000000000 c0000000012a5b28 0000000000000000 
GPR28: c000200009dcf710 c0000000012a5bc0 c00000000d966370 006600cc00000000 
NIP [c0000000001398c4] .do_raw_spin_lock+0x14/0x1d0
LR [c000000000a9b6c0] ._raw_spin_lock+0x10/0x30
Call Trace:
[c000200009dcf1b0] [00000000000000c8] 0xc8 (unreliable)
[c000200009dcf230] [c00000000029d9d4] .finish_fault+0x3e4/0x4f0
[c000200009dcf2a0] [c000000000315d88] .__split_huge_pmd+0xe8/0x1190
[c000200009dcf430] [c00000000029a5bc] .unmap_page_range+0x43c/0xfe0
[c000200009dcf5c0] [c00000000029b618] .unmap_vmas+0xd8/0x200
[c000200009dcf6a0] [c0000000002a8324] .unmap_region+0xc4/0x160
[c000200009dcf7c0] [c0000000002ab5fc] .__do_munmap+0x1fc/0x5f0
[c000200009dcf880] [c0000000002aba70] .__vm_munmap+0x80/0x110
[c000200009dcf940] [c0000000003ef160] .elf_map+0xa0/0x120
[c000200009dcf9d0] [c0000000003f1168] .load_elf_binary+0xbf8/0x1fa0
[c000200009dcfb40] [c00000000034ecc8] .bprm_execve+0x2a8/0x700
[c000200009dcfc10] [c00000000034fcc8] .do_execveat_common.isra.0+0x188/0x230
[c000200009dcfcd0] [c000000000350dfc] .__se_sys_execve+0x3c/0x50
[c000200009dcfd40] [c00000000002de48] .system_call_exception+0x1c8/0x530
[c000200009dcfe10] [c00000000000c068] system_call_vectored_common+0xe8/0x278
--- interrupt: 3000 at 0x3fffbd817c0c
NIP:  00003fffbd817c0c LR: 0000000000000000 CTR: 0000000000000000
REGS: c000200009dcfe80 TRAP: 3000   Not tainted  (5.15.10-gentoo-TalosII)
MSR:  900000000000f032 <SF,HV,EE,PR,FP,ME,IR,DR,RI>  CR: 42220442  XER: 00000000
IRQMASK: 0 
GPR00: 000000000000000b 00003fffbd12dd20 00003fffbd935300 000000002cbe3700 
GPR04: 000000002cc377e0 000000002cbe2cc0 0000000000000008 0000000000000001 
GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 00003fffbda03810 0000000000000000 0000000000000020 
GPR16: 00000000100514a0 000000000000005c 0000000000000000 00000000100348d4 
GPR20: 0000000000000000 00003fffbd125000 00003fffea78ba68 0000000000000000 
GPR24: 00003fffbd12de10 0000000000000000 00003fffea78b638 00003fffea78ba18 
GPR28: 000000002cc182c0 00003fffea78ba68 0000000000000001 0000000000100000 
NIP [00003fffbd817c0c] 0x3fffbd817c0c
LR [0000000000000000] 0x0
--- interrupt: 3000
Instruction dump:
f9430010 792907c6 6529ffff 6129ffff f9230004 4e800020 60000000 fbe1fff8 
f821ff81 3d20dead 7c7f1b78 61294ead <81430004> 7c0a4800 408200d4 e95f0010 
---[ end trace 063d70c8fce39c11 ]---
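
For anyone reading the trace, the reported fault address can be cross-checked against the register dump. The word marked <81430004> in the instruction dump is a D-form load, and decoding it gives the base register and displacement. The sketch below is only an illustration: the helper name decode_dform and the small mnemonic table are made up for this example, and only primary opcode 32 (lwz) is handled.

```python
# Decode a 32-bit PowerPC D-form instruction word:
# bits 0-5 opcode, 6-10 RT, 11-15 RA, 16-31 signed displacement.
# decode_dform is a hypothetical helper written for this note only.
def decode_dform(word):
    opcode = (word >> 26) & 0x3F
    rt = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    disp = word & 0xFFFF
    if disp & 0x8000:          # sign-extend the 16-bit displacement
        disp -= 0x10000
    return opcode, rt, ra, disp

# Only the opcode needed here; primary opcode 32 is lwz (load word and zero).
MNEMONICS = {32: "lwz"}

# Values copied from the oops above.
FAULTING_WORD = 0x81430004            # the word shown as <81430004>
GPRS = {3: 0x006600CC00000000}        # GPR03 from the register dump

opcode, rt, ra, disp = decode_dform(FAULTING_WORD)
mnemonic = MNEMONICS.get(opcode, "unknown")
effective_address = (GPRS[ra] + disp) & 0xFFFFFFFFFFFFFFFF

print(f"{mnemonic} r{rt},{disp}(r{ra})")   # lwz r10,4(r3)
print(hex(effective_address))              # 0x6600cc00000004
```

The result matches the address in the BUG line, so the load at .do_raw_spin_lock+0x14 dereferenced r3 + 4 with r3 = 0x006600cc00000000, which does not look like a kernel pointer (the other kernel addresses in the dump start with 0xc0...). The neighbouring words 3d20dead and 61294ead build the constant 0xdead4ead, which is also what GPR09 holds, so this appears to be the spinlock debug magic check reading from a bogus lock pointer. Where that pointer came from cannot be told from the trace alone.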

# inxi -bZ
System:    Host: T1000 Kernel: 5.15.10-TalosII ppc64 bits: 64 Console: tty 2 Distro: Gentoo Base System release 2.7 
Machine:   Type: PowerPC Device System: T2P9D01 REV 1.01 details: PowerNV T2P9D01 REV 1.01 rev: 2.2 (pvr 004e 1202) 
CPU:       Info: 32-Core POWER9 altivec supported [MCP] speed: 2154 MHz min/max: 2154/3800 MHz 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Turks XT [Radeon HD 6670/7670] driver: radeon v: kernel 
           Device-2: ASPEED Graphics Family driver: N/A 
           Display: server: X.org 1.20.14 driver: radeon tty: 211x57 
           Message: Advanced graphics data unavailable in console for root. 
Network:   Device-1: Broadcom and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe driver: tg3 
           Device-2: Broadcom and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe driver: tg3 
Drives:    Local Storage: total: 447.13 GiB used: 18.79 GiB (4.2%) 
Info:      Processes: 370 Uptime: 1h 10m Memory: 62.75 GiB used: 1.74 GiB (2.8%) Init: systemd Shell: Bash inxi: 3.1.06 

# lspci 
0000:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0000:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Turks XT [Radeon HD 6670/7670]
0000:01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Turks HDMI Audio [Radeon HD 6500/6600 / 6700M Series]
0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0001:01:00.0 Non-Volatile memory controller: Phison Electronics Corporation Device 5008 (rev 01)
0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0004:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0004:01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0004:01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0005:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0005:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
0005:02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
0030:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0031:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0032:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)

Comment 1 Erhard F. 2021-12-21 18:52:02 UTC
Created attachment 300107
kernel .config (kernel 5.15.10, Talos II)

Comment 2 Erhard F. 2022-08-19 10:19:03 UTC
I have not seen this for quite a few stable kernel releases now.

Closing this for now. I will reopen it if I hit the issue again.