Bug 204789
Summary: | Boot failure with more than 256G of memory on Power9 with 4K pages & Hash MMU | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Cameron Berkenpas (cam) |
Component: | PPC-64 | Assignee: | platform_ppc-64 |
Status: | RESOLVED CODE_FIX | ||
Severity: | high | CC: | michael, samuel |
Priority: | P1 | ||
Hardware: | PPC-64 | ||
OS: | Linux | ||
Kernel Version: | 5.2.x | Subsystem: | |
Regression: | Yes | Bisected commit-id: |
Description
Cameron Berkenpas
2019-09-08 00:04:26 UTC
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sun, 08 Sep 2019 00:04:26 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=204789
>
> Bug ID: 204789
> Summary: Boot failure with more than 256G of memory
> Product: Memory Management
> Version: 2.5
> Kernel Version: 5.2.x
> Hardware: PPC-64
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Other
> Assignee: akpm@linux-foundation.org
> Reporter: cam@neo-zeon.de
> Regression: No

"Yes" :)

> Kernel series 5.2.x will not boot on my Talos II workstation with dual POWER9
> 18 core processors and 512G of physical memory with disable_radix=yes and 4k
> pages.
>
> 5.3-rc6 did not work either.
>
> 5.1 and earlier boot fine.

Thanks. It's probably best to report this on the powerpc list, cc'ed here.

> I can get the system to boot IF I leave the Radix MMU enabled or if I boot a
> kernel with 64k pages. I haven't yet tested enabling the Radix MMU with 64k
> pages at the same time, but I suspect this would work. This is a system I
> cannot take down TOO frequently.
>
> The system will also boot with the Radix MMU disabled and 4k pages with 256G
> or less memory. Setting mem on the kernel CLI to 256G or less results in a
> successful boot. With mem=257G or higher, no Radix MMU, and 4k pages, the
> kernel will not boot.
>
> Petitboot comes up, but the system fails VERY early in boot in the serial
> console with:
> SIGTERM received, booting...
> [   23.838858] kexec_core: Starting new kernel
>
> Early printk is enabled, and it never progresses any further.
>
> 5.1 boots just fine with the Radix MMU disabled and 4k pages.
>
> Unfortunately, I currently need 4k pages for bcache to work, and Radix MMU
> disabled in order for FreeBSD 12.x to work under KVM so I'm sticking with
> 5.1.21 for now.
>
> I have been unable to reproduce this issue in KVM.
>
> Here are my PCIe peripherals:
> 1. Microsemi/Adaptec HBA 1100-4i SAS controller
> 2. Megaraid 9316-16i SAS RAID controller.
>
> I've only tried little endian as this is a little endian install.

Hello,

Regression set to "yes". Not sure how I missed that. :)

Will report future PPC issues that I come across to this list as well.

Thanks!

-Cameron

On 9/11/19 7:31 AM, Andrew Morton wrote:
>> Regression: No
> "Yes" :)
>
> Thanks. It's probably best to report this on the powerpc list, cc'ed here.

Andrew Morton <akpm@linux-foundation.org> writes:

>> Kernel series 5.2.x will not boot on my Talos II workstation with dual
>> POWER9 18 core processors and 512G of physical memory with
>> disable_radix=yes and 4k pages.

Will you be able to bisect this? I tried 4K PAGESIZE on P8 with upstream
kernel and I can't recreate the issue.

[root@ltc ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:            495           0         494           0           0         493
Swap:             0           0           0
[root@ltc ~]# getconf PAGESIZE
4096
[root@ltc ~]# grep Hash /proc/cpuinfo
MMU : Hash

I will see if I can get a P9 system with largemem

-aneesh

Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> writes:

> I will see if I can get a P9 system with largemem

I was able to recreate this on a system that has memory above the 16TB
address. I guess your P9 system memory layout is also like that.

Can you try this patch? It doesn't really fix the issue, as in map the
full 512GB of memory. But it does prevent the kernel crash.

commit ebd05100344765fc3c030f0c257c2f9236fcd1ec
Author: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Date:   Fri Sep 13 19:26:25 2019 +0530

    powerpc/book3s64/hash/4k: 4k supports only 16TB linear mapping

    With commit: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the
    same 0xc range"), we now split the 64TB address range into 4 contexts each of
    16TB. That implies we can do only 16TB linear mapping. Make sure we don't
    add physical memory above 16TB if that is present in the system.

    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index bb3deb76c951..86cce8189240 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -35,12 +35,16 @@ extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
  * memory requirements with large number of sections.
  * 51 bits is the max physical real address on POWER9
  */
-#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_SPARSEMEM_EXTREME) && \
-	defined(CONFIG_PPC_64K_PAGES)
+
+#if defined(CONFIG_PPC_64K_PAGES)
+#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_SPARSEMEM_EXTREME)
 #define MAX_PHYSMEM_BITS 51
 #else
 #define MAX_PHYSMEM_BITS 46
 #endif
+#else /* CONFIG_PPC_64K_PAGES */
+#define MAX_PHYSMEM_BITS 44
+#endif
 
 /* 64-bit classic hash table MMU */
 #include <asm/book3s/64/mmu-hash.h>

Yep, the box comes up now, but with 256G memory as expected.

I'll get back to you on when I'll be able to bisect.

Thanks!

On 9/13/19 7:21 AM, Aneesh Kumar K.V wrote:
> Can you try this patch? It doesn't really fix the issue, as in map the
> full 512GB of memory. But it does prevent the kernel crash.

On 9/13/19 8:35 PM, Cameron Berkenpas wrote:
> Yep, the box comes up now, but with 256G memory as expected.
>
> I'll get back to you on when I'll be able to bisect.
>
> Thanks!
I am sure this is due to
commit: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in
the same 0xc range"),
We reduced the linear map range for 4K page size to 16TB there.
-aneesh
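
The sizes discussed here follow directly from the bit widths in the patch. As a rough check, the C sketch below converts the MAX_PHYSMEM_BITS values (44, 46, 51) into sizes and reproduces the "64TB split into 4 contexts of 16TB" figure from the commit message. The helper names (`range_bytes`, `bytes_to_tib`, `per_context_bytes`) are made up for illustration and are not kernel functions.

```c
#include <stdint.h>

/* Illustrative helpers only; these names do not exist in the kernel. */

/* Number of bytes covered by an address range `bits` wide (bits < 64). */
static inline uint64_t range_bytes(unsigned int bits)
{
    return UINT64_C(1) << bits;
}

/* Whole TiB contained in a byte count (1 TiB = 2^40 bytes). */
static inline uint64_t bytes_to_tib(uint64_t bytes)
{
    return bytes >> 40;
}

/* Size of each slice when a `bits`-wide range is split evenly into `contexts` parts. */
static inline uint64_t per_context_bytes(unsigned int bits, unsigned int contexts)
{
    return range_bytes(bits) / contexts;
}
```

With these helpers, `bytes_to_tib(range_bytes(44))` is 16 and `bytes_to_tib(per_context_bytes(46, 4))` is also 16, which matches the 16TB linear mapping limit described above; `bytes_to_tib(range_bytes(51))` is 2048, i.e. 2 PiB for the 51-bit POWER9 maximum.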
Running the kernel I built at 0034d395f89d, the problem is still there.

However, running the kernel I built at the previous commit, a35a3c6f6065, the system boots.

This being due to 0034d395f89d is confirmed.

Thanks!

On 9/13/19 9:13 AM, Aneesh Kumar K.V wrote:
> I am sure this is due to
>
> commit: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions
> in the same 0xc range"),
>
> We reduced the linear map range for 4K page size to 16TB there.

On 9/13/19 10:58 PM, Cameron Berkenpas wrote:
> This being due to 0034d395f89d is confirmed.

https://lore.kernel.org/linuxppc-dev/20190917145702.9214-1-aneesh.kumar@linux.ibm.com

This series should help you.

-aneesh

Hello,

Unfortunately, this patch set has made things quite a bit worse for me. Appending mem=256G doesn't fix it either. In all cases, the system at least gets past early boot and then I will probably get a panic and eventual reboot, or occasionally it just locks up entirely.

Here's my very first attempt at booting the kernel where I didn't even get a panic:
https://pastebin.com/a3TVZcVB

Here's another attempt where I get a panic:
https://pastebin.com/QsJjyC2v

Finally here's an attempt with mem=256G:
https://pastebin.com/swgLYie9

I don't know that these results are substantially different from each other, but perhaps there's something helpful.

Sometimes (but not in any of the above), the host gets to the point that systemd starts up, but ultimately it seems I got the same stacktrace.

At one point, I ended up with a CPU guarded out, but it was simple to recover.

-Cameron

On 9/17/19 8:15 PM, Aneesh Kumar K.V wrote:
> https://lore.kernel.org/linuxppc-dev/20190917145702.9214-1-aneesh.kumar@linux.ibm.com
>
> This series should help you.

Can you boot a good kernel and do:

$ sudo grep RAM /proc/iomem

And paste the output. Just to confirm what your memory layout is.

What arrangement of DIMMs do you have? It's possible you could work around the bug by changing that, depending on how many DIMMs and slots you have.

grep RAM /proc/iomem
00000000-3fffffffff : System RAM

The system has 16 DIMM slots, all are populated.

Unfortunately, I will not have physical access to the box in the foreseeable future.

Aneesh appears to be correct in that this issue started with 0034d395f89d.

On 10/1/19 3:04 AM, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=204789
>
> --- Comment #10 from Michael Ellerman (michael@ellerman.id.au) ---
> Can you boot a good kernel and do:
>
> $ sudo grep RAM /proc/iomem

I am also experiencing this issue on a Talos II, however with much less RAM. Right now I have 16 GB attached to each CPU:

# grep RAM /proc/iomem
00000000-3ffffffff : System RAM
200000000000-2003ffffffff : System RAM

Without the patchset linked above, I also have a failure to boot with 5.2 and later kernels.
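
The two-range layout above can be checked against the 16TB limit mechanically. The second range starts at 0x200000000000, which is 2^45 bytes (32 TiB), so it lies above the 2^44 (16 TiB) cap that MAX_PHYSMEM_BITS 44 imposes even though the machine has only 32 GB in total. Below is a minimal C sketch of that check, assuming the "start-end : System RAM" line format shown in the grep output; `parse_system_ram` and `exceeds_limit` are hypothetical helper names, not part of any kernel or libc API.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/iomem line of the form "start-end : System RAM".
 * Returns true and fills *start and *end (inclusive bounds) on success.
 */
static inline bool parse_system_ram(const char *line, uint64_t *start, uint64_t *end)
{
    if (strstr(line, "System RAM") == NULL)
        return false;
    return sscanf(line, "%" SCNx64 "-%" SCNx64, start, end) == 2;
}

/* True if a range ending at `end` reaches address 2^bits or beyond (bits < 64). */
static inline bool exceeds_limit(uint64_t end, unsigned int bits)
{
    return end >= (UINT64_C(1) << bits);
}
```

Feeding each line of `grep RAM /proc/iomem` through `parse_system_ram` and then calling `exceeds_limit(end, 44)` flags only the 200000000000-2003ffffffff range, which is consistent with the RAM on the second node being dropped when only the first patch is applied.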
(/proc/cmdline) console=hvc0 disable_radix ignore_loglevel no_console_suspend

With the first patch from the patchset linked above, the RAM attached to the second node is ignored, as expected, but the system boots and otherwise runs fine.

With the full patchset linked above, I get panics on boot, as mentioned in comment 9:

[    5.286513] Oops: Machine check, sig: 7 [#1]
[    5.286536] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=256 NUMA PowerNV
[    5.286545] Modules linked in: soundcore
[    5.286554] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G M 5.3.4-00012-g8fc24abb8c31 #1
[    5.286569] NIP: 0000000000000000 LR: 7265677368657265 CTR: 0000000000000000
[    5.286590] REGS: c0000003ffb66fb0 TRAP: c00000000120dd00 Tainted: G M (5.3.4-00012-g8fc24abb8c31)
[    5.286602] MSR: 0000000000000000 <> CR: c000000000036f04 XER: 00000000
[    5.286611] CFAR: 0000000000000000 IRQMASK: c0000003ffb67370
[    5.286611] GPR00: 0000000000000000 c0003d0000083d28 ffffffffffffffff 0000000006000000
[    5.286611] GPR04: 0500000002010101 00c75e1bc4c00a58 ffffffffffffffff c000000000036530
[    5.286611] GPR08: c0003d0000083d28 c0000003ffb67510 0000000000000000 c0000003ffb670e0
[    5.286611] GPR12: c0000003ffb67040 8804422200000000 c0000000000804ec c00000000120dd00
[    5.286611] GPR16: 0000000000000000 c0000003ffb673fc c0000003ffb67070 0000000000000000
[    5.286611] GPR20: c0000000000367f4 c00000000120dd00 0000000000000000 c0000003ffb670e0
[    5.286611] GPR24: c0000003ffb67370 0000000000000000 c000000000008380 0000000000000000
[    5.286611] GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    5.286777] NIP [0000000000000000] 0x0
[    5.286792] LR [7265677368657265] 0x7265677368657265
[    5.286809] Call Trace:
[    5.286823] Instruction dump:
[    5.286838] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[    5.286858] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 60000000 60000000 60000000 60000000
[    5.286889] ---[ end trace 60912b64b73c973e ]---
[    5.819189]
[    5.819203] Oops: Machine check, sig: 7 [#2]
[    5.819205] Disabling lock debugging due to kernel taint
[    5.819223] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=256 NUMA PowerNV
[    5.819233] Modules linked in: snd_hda_intel(+) snd_hda_codec snd_hwdep snd_hda_core snd_pcm tg3(+) snd_timer snd libphy ttm soundcore
[    5.819264] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G M D 5.3.4-00012-g8fc24abb8c31 #1
[    5.819286] NIP: 0000000000000000 LR: 7265677368657265 CTR: 0000000000000000
[    5.819315] REGS: c0000003ffb7efb0 TRAP: c00000000120dd00 Tainted: G M D (5.3.4-00012-g8fc24abb8c31)
[    5.819328] MSR: 0000000000000000 <> CR: c000000000036f04 XER: 00000000
[    5.819347] CFAR: 0000000000000000 IRQMASK: c0000003ffb7f370
[    5.819347] GPR00: 0000000000000000 c0003d0000063d28 ffffffffffffffff 0000000006000000
[    5.819347] GPR04: 0500000002010101 000ed5d8325a5873 ffffffffffffffff c000000000036530
[    5.819347] GPR08: c0003d0000063d28 c0000003ffb7f510 0000000000000000 c0000003ffb7f0e0
[    5.819347] GPR12: c0000003ffb7f040 8804424200000000 c0000000000804ec c00000000120dd00
[    5.819347] GPR16: 0000000000000000 c0000003ffb7f3fc c0000003ffb7f070 0000000000000000
[    5.819347] GPR20: c0000000000367f4 c00000000120dd00 0000000000000000 c0000003ffb7f0e0
[    5.819347] GPR24: c0000003ffb7f370 0000000000000000 c000000000008380 0000000000000000
[    5.819347] GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    5.819537] NIP [0000000000000000] 0x0
[    5.819554] LR [7265677368657265] 0x7265677368657265
[    5.819562] Call Trace:
[    5.819567] Instruction dump:
[    5.819573] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[    5.819603] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 60000000 60000000 60000000 60000000
[    5.819648] ---[ end trace 60912b64b73c973f ]---
[    6.311806]
[    6.311820] Oops: Machine check, sig: 7 [#3]
[    6.311829] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=256 NUMA PowerNV
[    6.311839] Modules linked in: snd_hda_intel(+) snd_hda_codec snd_hwdep snd_hda_core snd_pcm tg3(+) snd_timer snd libphy ttm soundcore
[    6.311869] CPU: 0 PID: 734 Comm: udevd Tainted: G M D 5.3.4-00012-g8fc24abb8c31 #1
[    6.311882] NIP: 0000000000000000 LR: 7265677368657265 CTR: 0000000000000000
[    6.311903] REGS: c0000003ffbc6fb0 TRAP: c00000000120dd00 Tainted: G M D (5.3.4-00012-g8fc24abb8c31)
[    6.311917] MSR: 0000000000000000 <> CR: c000000000036f04 XER: 00000000
[    6.311937] CFAR: 0000000000000000 IRQMASK: c0000003ffbc7370
[    6.311937] GPR00: 0000000000000000 c0003d0000003d28 ffffffffffffffff 0000000006000000
[    6.311937] GPR04: 0500000002010101 00b492f4c8c2175c ffffffffffffffff c000000000036530
[    6.311937] GPR08: c0003d0000003d28 c0000003ffbc7510 0000000000000000 c0000003ffbc70e0
[    6.311937] GPR12: c0000003ffbc7040 8024428200000000 c0000000000804ec c00000000120dd00
[    6.311937] GPR16: 0000000000000000 c0000003ffbc73fc c0000003ffbc7070 0000000000000000
[    6.311937] GPR20: c0000000000367f4 c00000000120dd00 0000000000000000 c0000003ffbc70e0
[    6.311937] GPR24: c0000003ffbc7370 0000000000000000 c000000000008380 0000000000000000
[    6.311937] GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    6.312109] NIP [0000000000000000] 0x0
[    6.312126] LR [7265677368657265] 0x7265677368657265
[    6.312143] Call Trace:
[    6.312148] Instruction dump:
[    6.312155] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[    6.312190] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 60000000 60000000 60000000 60000000
[    6.312226] ---[ end trace 60912b64b73c9740 ]---
[    6.819242] Kernel panic - not syncing: Fatal

I have easy physical access to this machine, so I'd be able to try out patches if needed.

This was resolved some time back by Aneesh and the patches made it into mainline a long time ago. Marking resolved.

The fix is:

7746406baa3b ("powerpc/book3s64/hash/4k: Support large linear mapping range with 4K")
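
A side note on the machine-check oopses quoted earlier in this report: every dump shows LR as 0x7265677368657265, and those eight bytes are printable ASCII. The C sketch below decodes them; `u64_to_ascii` is a hypothetical helper written for this note. The result is the string "regshere", which is the value 64-bit powerpc kernels use as a marker next to saved register frames on the stack (STACK_FRAME_REGS_MARKER). Seeing it in the LR slot suggests the printed register values were read from a misplaced frame and may not reflect real CPU state; that reading is an interpretation and is not stated anywhere in the thread.

```c
#include <stdint.h>

/*
 * Render the eight bytes of `value`, most significant byte first, as a
 * NUL-terminated string. Non-printable bytes are shown as '.'.
 * Hypothetical helper for illustration; not a kernel function.
 */
static inline void u64_to_ascii(uint64_t value, char out[9])
{
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)(value >> (56 - 8 * i));
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[8] = '\0';
}
```

Calling `u64_to_ascii(UINT64_C(0x7265677368657265), buf)` with a 9-byte buffer leaves "regshere" in `buf`.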