Bug 213073

Summary: kernel panic early in boot on xeon x5690 dual core
Product: Platform Specific/Hardware
Reporter: robert.shteynfeld
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: RESOLVED CODE_FIX
Severity: high
CC: alfalco, bp, dedekind1, mike.rapoport, rppt
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.11
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg after booting 5.10.21 kernel
dmesg booting 5.10.21 with mminit_loglevel=4 ignore_loglevel
reoder-memmap-init.patch
reoder-memmap-init-v2.patch
debug-memmap-init.patch
building kernel from stable tree
dmesg after booting 5.10.21 kernel with extra debugging
reoder-memmap-init-v3.patch
5.12.7 boot with CONFIG_DEBUG_VM=n build
reorder-memmap-init-v4.patch
5.12.7 dmesg after patch 4
bdx-CoD-on-v5.13-reverted-dmesg.txt
bdx-CoD-off-v5.13-reverted-dmesg.txt
bdx-CoD-off-v5.13-vanilla-dmesg.txt
config-5.13.0.txt

Description robert.shteynfeld 2021-05-15 02:20:18 UTC
Kernel panics early in boot with "Attempted to kill the idle task".  It started with kernel 5.11.x; 5.10.x boots OK.  Another report on the Red Hat Bugzilla involves the same dual X5690 CPU setup as mine.

Video of the kernel panic: https://drive.google.com/file/d/1pfDAnu3DVG-V-HWVGDpWiNONZO7gZjGP/view?usp=sharing

Log from the other report on bugzilla.redhat.com (https://bugzilla.redhat.com/show_bug.cgi?id=1945809):

=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2021.04.01 19:29:36 =~=~=~=~=~=~=~=~=~=~=~=
Probing EDD (edd=off to disable)... ok
[    0.000000] Linux version 5.11.11-300.fc34.x86_64 (mockbuild@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 11.0.1 20210324 (Red Hat 11.0.1-0), GNU ld version 2.35.1-41.fc34) #1 SMP Tue Mar 30 16:37:11 UTC 2021
[    0.000000] Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.11.11-300.fc34.x86_64 root=/dev/mapper/fedora_33-root ro rd.lvm.lv=fedora_33/root rd.shell rd.debug rd.udev.debug console=ttyS0,115200n8 earlyprintk=serial,ttyS0,115200n8
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009bfff] usable
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dbdf9bff] usable
[    0.000000] BIOS-e820: [mem 0x00000000dbdf9c00-0x00000000dbe4bbff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000dbe4bc00-0x00000000dbe4dbff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000dbe4dc00-0x00000000dbffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fcffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000fed003ff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000feefffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ffb00000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000001223ffffff] usable
[    0.000000] printk: bootconsole [earlyser0] enabled
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.5 present.
[    0.000000] DMI: Dell Inc. Precision WorkStation T5500  /0CRH6C, BIOS A18 10/15/2018
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3458.005 MHz processor
[    0.001676] last_pfn = 0x1224000 max_arch_pfn = 0x400000000
[    0.008608] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
Memory KASLR using RDTSC...
[    0.018567] last_pfn = 0xdbdf9 max_arch_pfn = 0x400000000
[    0.043235] Using GB pages for direct mapping
[    0.047749] RAMDISK: [mem 0x343be000-0x361d6fff]
[    0.052186] ACPI: Early table checksum verification disabled
[    0.057807] ACPI: RSDP 0x00000000000FEC30 000024 (v02 DELL  )
[    0.063516] ACPI: XSDT 0x00000000000FCB10 00007C (v01 DELL   B10K     00000015 ASL  00000061)
[    0.072000] ACPI: FACP 0x00000000000FCC08 0000F4 (v03 DELL   B10K     00000015 ASL  00000061)
[    0.080481] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/64 (20201113/tbfadt-564)
[    0.090238] ACPI: DSDT 0x00000000FFEA1DE4 0055B9 (v01 DELL   dt_ex    00001000 INTL 20050624)
[    0.098658] ACPI: FACS 0x00000000DBDF9C00 000040
[    0.103239] ACPI: FACS 0x00000000DBDF9C00 000040
[    0.107920] ACPI: SSDT 0x00000000FFEA74AE 00009C (v01 DELL   st_ex    00001000 INTL 20050624)
[    0.116308] ACPI: APIC 0x00000000000FCCFC 00026A (v01 DELL   B10K     00000015 ASL  00000061)
[    0.124789] ACPI: BOOT 0x00000000000FCF66 000028 (v01 DELL   B10K     00000015 ASL  00000061)
[    0.133271] ACPI: ASF! 0x00000000000FCF8E 000096 (v32 DELL   B10K     00000015 ASL  00000061)
[    0.141752] ACPI: MCFG 0x00000000000FD024 00003C (v01 DELL   B10K     00000015 ASL  00000061)
[    0.150233] ACPI: HPET 0x00000000000FD060 000038 (v01 DELL   B10K     00000015 ASL  00000061)
[    0.158714] ACPI: TCPA 0x00000000000FD2BC 000032 (v01 DELL   B10K     00000015 ASL  00000061)
[    0.167195] ACPI: DMAR 0x00000000000FD2EE 000100 (v01 DELL   B10K     00000015 ASL  00000061)
[    0.175677] ACPI: [    0.551749] Zone ranges:
[    0.554110]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.560253]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.566398]   Normal   [mem 0x0000000100000000-0x0000001223ffffff]
[    0.572543]   Device   empty
[    0.575399] Movable zone start for each node
[    0.579641] Early memory node ranges
[    0.583188]   node   1: [mem 0x0000000000001000-0x000000000009bfff]
[    0.589418]   node   1: [mem 0x0000000000100000-0x00000000dbdf8fff]
[    0.595650]   node   1: [mem 0x0000000100000000-0x0000000c23ffffff]
[    0.601888]   node   0: [mem 0x0000000c24000000-0x0000001223ffffff]
[    0.608117] Initmem setup node 0 [mem 0x0000000c24000000-0x0000001223ffffff]
[    0.886736]   Normal zone: 12615680 pages in unavailable ranges
[    0.892473] Initmem setup node 1 [mem 0x0000000000001000-0x0000000c23ffffff]
[    0.899997]   DMA zone: 28772 pages in unavailable ranges
[    0.916560]   DMA32 zone: 16903 pages in unavailable ranges
[    1.071634]   Normal zone: 16384 pages in unavailable ranges
[    1.077341] ACPI: PM-Timer IO Port: 0x808
[    1.081177] ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1])
[    1.087145] IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
[    1.093979] IOAPIC[1]: apic_id 9, version 32, address 0xfec80000, GSI 24-47
[    1.100900] IOAPIC[2]: apic_id 10, version 32, address 0xfec88000, GSI 48-71
[    1.107907] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    1.114224] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    1.120804] Using ACPI (MADT) for SMP configuration information
[    1.126686] ACPI: HPET id: 0x8086a301 base: 0xfed00000
[    1.131798] smpboot: Allowing 64 CPUs, 40 hotplug CPUs
[    1.136913] PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
[    1.144427] PM: hibernation: Registered nosave memory: [mem 0x0009c000-0x000effff]
[    1.151957] PM: hibernation: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[    1.159486] PM: hibernation: Registered nosave memory: [mem 0xdbdf9000-0xdbdf9fff]
[    1.167015] PM: hibernation: Registered nosave memory: [mem 0xdbdfa000-0xdbe4afff]
[    1.174543] PM: hibernation: Registered nosave memory: [mem 0xdbe4b000-0xdbe4bfff]
[    1.182073] PM: hibernation: Registered nosave memory: [mem 0xdbe4c000-0xdbe4cfff]
[    1.189602] PM: hibernation: Registered nosave memory: [mem 0xdbe4d000-0xdbe4dfff]
[    1.197132] PM: hibernation: Registered nosave memory: [mem 0xdbe4e000-0xdbffffff]
[    1.204662] PM: hibernation: Registered nosave memory: [mem 0xdc000000-0xf7ffffff]
[    1.212191] PM: hibernation: Registered nosave memory: [mem 0xf8000000-0xfcffffff]
[    1.219720] PM: hibernation: Registered nosave memory: [mem 0xfd000000-0xfdffffff]
[    1.227249] PM: hibernation: Registered nosave memory: [mem 0xfe000000-0xfecfffff]
[    1.234778] PM: hibernation: Registered nosave memory: [mem 0xfed00000-0xfedfffff]
[    1.242308] PM: hibernation: Registered nosave memory: [mem 0xfee00000-0xfeefffff]
[    1.249837] PM: hibernation: Registered nosave memory: [mem 0xfef00000-0xffafffff]
[    1.257366] PM: hibernation: Registered nosave memory: [mem 0xffb00000-0xffffffff]
[    1.264897] [mem 0xdc000000-0xf7ffffff] available for PCI devices
[    1.270955] Booting paravirtualized kernel on bare hardware
[    1.276495] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    1.291454] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:64 nr_cpu_ids:64 nr_node_ids:2
[    1.302715] percpu: Embedded 54 pages/cpu s184320 r8192 d28672 u262144
[    1.309145] Built 2 zonelists, mobility grouping on.  Total pages: 18578823
[    1.315979] Policy zone: Normal
[    1.319095] Kernel command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.11.11-300.fc34.x86_64 root=/dev/mapper/fedora_33-root ro rd.lvm.lv=fedora_33/root rd.shell rd.debug rd.udev.debug console=ttyS0,115200n8 earlyprintk=serial,ttyS0,115200n8
[    1.340186] printk: log_buf_len individual max cpu contribution: 4096 bytes
[    1.346961] printk: log_buf_len total cpu_extra contributions: 258048 bytes
[    1.353885] printk: log_buf_len min size: 262144 bytes
[    1.359413] printk: log_buf_len: 524288 bytes
[    1.363582] printk: early log buf free: 251352(95%)
[    1.369018] mem auto-init: stack:off, heap alloc:off, heap free:off
[    1.577283] page 0xc24000 outside node 1 zone Normal [ 0x100000 - 0xc24000 ]
[    1.584143] page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xc24000
[    1.593490] flags: 0x57ffffc0000000()
[    1.597126] raw: 0057ffffc0000000 fffff8f3f0900008 fffff8f3f0900008 0000000000000000
[    1.604827] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[    1.612528] page dumped because: VM_BUG_ON_PAGE(bad_range(zone, page))
[    1.619031] ------------[ cut here ]------------
[    1.623606] kernel BUG at mm/page_alloc.c:1018!
[    1.628112] invalid opcode: 0000 [#1] SMP PTI
[    1.632434] CPU: 0 PID: 0 Comm: swapper Not tainted 5.11.11-300.fc34.x86_64 #1
[    1.639617] Hardware name: Dell Inc. Precision WorkStation T5500  /0CRH6C, BIOS A18 10/15/2018
[    1.648186] RIP: 0010:__free_one_page+0x4f6/0x5c0
[    1.652859] Code: 0a 00 00 00 4c 89 ef 4c 89 5c 24 08 e8 93 69 fc ff 4c 8b 5c 24 08 e9 b6 fb ff ff 48 c7 c6 98 52 3c 90 4c 89 f7 e8 aa 76 fd ff <0f> 0b 66 66 66 66 90 e9 6d ff ff ff 83 fd 09 0f 8e 7b fd ff ff e8
[    1.671552] RSP: 0000:ffffffff90a03da0 EFLAGS: 00010082
[    1.676744] RAX: 0000000000000000 RBX: 0000000000000009 RCX: ffff9da08b5fffa8
[    1.683841] RDX: c0000000ffffbfff RSI: 0000000000000000 RDI: 0000000000000046
[    1.690937] RBP: 000000000000000a R08: 0000000000000000 R09: ffffffff90a03aa0
[    1.698034] R10: ffffffff90a03a98 R11: ffff9da0a3f49328 R12: 0000000000c24000
[    1.705130] R13: ffff9d9aa3fd5d00 R14: fffff8f3f0900000 R15: 00000000000003ff
[    1.712228] FS:  0000000000000000(0000) GS:ffff9da08b600000(0000) knlGS:0000000000000000
[    1.720277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.725988] CR2: ffff9d9704a01000 CR3: 0000000883a10001 CR4: 00000000000206b0
[    1.733084] Call Trace:
[    1.735511]  free_one_page+0x5f/0xe0
[    1.739056]  __free_pages_ok+0x1b4/0x540
[    1.742950]  memblock_free_all+0x11c/0x175
[    1.747019]  mem_init+0x19/0x1e4
[    1.750220]  start_kernel+0x564/0x832
[    1.753856]  ? load_ucode_intel_bsp+0x11/0x2d
[    1.758183]  secondary_startup_64_no_verify+0xc2/0xcb
[    1.763202] Modules linked in:
[    1.766250] random: get_random_bytes called from oops_exit+0x35/0x60 with crng_init=0
[    1.766257] ---[ end trace 6bbd9b534dbde42d ]---
[    1.778606] RIP: 0010:__free_one_page+0x4f6/0x5c0
[    1.783280] Code: 0a 00 00 00 4c 89 ef 4c 89 5c 24 08 e8 93 69 fc ff 4c 8b 5c 24 08 e9 b6 fb ff ff 48 c7 c6 98 52 3c 90 4c 89 f7 e8 aa 76 fd ff <0f> 0b 66 66 66 66 90 e9 6d ff ff ff 83 fd 09 0f 8e 7b fd ff ff e8
[    1.801973] RSP: 0000:ffffffff90a03da0 EFLAGS: 00010082
[    1.807166] RAX: 0000000000000000 RBX: 0000000000000009 RCX: ffff9da08b5fffa8
[    1.814262] RDX: c0000000ffffbfff RSI: 0000000000000000 RDI: 0000000000000046
[    1.821359] RBP: 000000000000000a R08: 0000000000000000 R09: ffffffff90a03aa0
[    1.828455] R10: ffffffff90a03a98 R11: ffff9da0a3f49328 R12: 0000000000c24000
[    1.835552] R13: ffff9d9aa3fd5d00 R14: fffff8f3f0900000 R15: 00000000000003ff
[    1.842648] FS:  0000000000000000(0000) GS:ffff9da08b600000(0000) knlGS:0000000000000000
[    1.850697] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.856408] CR2: ffff9d9704a01000 CR3: 0000000883a10001 CR4: 00000000000206b0
[    1.863507] Kernel panic - not syncing: Attempted to kill the idle task!
[    1.870202] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
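
The "page 0xc24000 outside node 1 zone Normal [ 0x100000 - 0xc24000 ]" line in the log above can be read as a range check on a page frame number. A minimal sketch follows, assuming the printed span is half-open (start inclusive, end exclusive) and 4 KiB pages; pfn_outside_span is a hypothetical helper written for illustration, not the kernel's own function.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as on x86-64


def pfn_outside_span(pfn: int, start_pfn: int, end_pfn: int) -> bool:
    """Return True if pfn lies outside the half-open span [start_pfn, end_pfn)."""
    return pfn < start_pfn or pfn >= end_pfn


# Values copied from the log line:
# "page 0xc24000 outside node 1 zone Normal [ 0x100000 - 0xc24000 ]"
pfn, zone_start, zone_end = 0xC24000, 0x100000, 0xC24000

outside = pfn_outside_span(pfn, zone_start, zone_end)
phys_addr = pfn << PAGE_SHIFT  # physical address of the offending page

print(outside, hex(phys_addr))
```

Under these assumptions the page sits exactly at the end of node 1's Normal zone, and its physical address is 0xc24000000, which is where the "Early memory node ranges" output places the start of node 0.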
Comment 1 Borislav Petkov 2021-05-15 07:46:50 UTC
Can you trigger with 5.12 too?
Comment 2 robert.shteynfeld 2021-05-15 13:04:30 UTC
Created attachment 296765 [details]
attachment-9158-0.html

I have not tried it myself, but the other user with the exact same problem on
the exact same hardware did try, and it also hangs on boot (from
https://bugzilla.redhat.com/show_bug.cgi?id=1945809):

"so i update to to the rawhide kernel

kernel-5.12.0-0.rc5.180.fc35.x86_64.rpm
and the error still exists, I haven't hooked up the laptop to the
serial to log the output but it still hangs and kernel panics."


Comment 3 Borislav Petkov 2021-05-15 13:30:58 UTC
Mike, can you have a look pls?

This looks memblock-something related.

Thx.
Comment 4 robert.shteynfeld 2021-05-15 13:33:09 UTC
Created attachment 296767 [details]
dmesg after booting 5.10.21 kernel
Comment 5 Mike Rapoport 2021-05-15 14:48:05 UTC
Can you add "mminit_loglevel=4 ignore_loglevel" to the kernel command line and post the log please?
Comment 6 robert.shteynfeld 2021-05-15 15:24:03 UTC
I don't have a way to get the console output into a file, but here's the video in case that helps:

https://drive.google.com/file/d/1pxIa8e9lP49_ayrPPkqb1EtPVp_U2LnD/view?usp=sharing
Comment 7 robert.shteynfeld 2021-05-15 17:18:24 UTC
I had an early kernel boot problem a few years back that was fixed with a patch; it was due to the unusual memory ranges on the dual X5690 system.  Here's the e-mail chunk from the previous problem:


Michal Hocko <mhocko@kernel.org>
Fri, Jan 25, 2019, 11:39 AM
to Linus, Mikhail, Linux, Gerald, Mikhail, Dave, Alexander, Andrew, Pavel, Steven, Daniel, Bob, me

On Fri 25-01-19 11:16:30, robert shteynfeld wrote:
> Attached is the dmesg from patched kernel.

Your Node1 physical memory range precedes Node0 which is quite unusual
but it shouldn't be a huge problem on its own. But memory ranges are
not aligned to the memory section

[    0.286954] Early memory node ranges
[    0.286955]   node   1: [mem 0x0000000000001000-0x0000000000090fff]
[    0.286955]   node   1: [mem 0x0000000000100000-0x00000000dbdf8fff]
[    0.286956]   node   1: [mem 0x0000000100000000-0x0000001423ffffff]
[    0.286956]   node   0: [mem 0x0000001424000000-0x0000002023ffffff]

As you can see the last pfn for the node1 is inside the section and
Node0 starts right after. This is quite unusual as well. If for no other
reasons then the memmap of those struct pages will be remote for one or
the other. Actually I am not even sure we can handle that properly
because we do expect 1:1 mapping between sections and nodes.

Now it also makes some sense why 2830bf6f05fb ("mm, memory_hotplug:
initialize struct pages for the full memory section") made any
difference. We simply write over a potentially initialized struct page
and blow up on that. I strongly suspect that the commit just uncovered
a pre-existing problem. Let me think what we can do about that.
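
The section alignment point in the quoted e-mail can be checked with simple arithmetic. A minimal sketch follows, assuming the x86-64 sparsemem section size of 128 MiB (SECTION_SIZE_BITS = 27); offset_in_section is a hypothetical helper. The two addresses are the node 1 / node 0 boundaries from the 2019 log quoted above and from the 5.11.11 log in the description.

```python
SECTION_SIZE_BITS = 27  # assumption: 128 MiB memory sections on x86-64
SECTION_SIZE = 1 << SECTION_SIZE_BITS


def offset_in_section(addr: int) -> int:
    """Return how far a physical address lies past the start of its section."""
    return addr & (SECTION_SIZE - 1)


boundary_2019 = 0x1424000000  # start of node 0 in the 2019 log
boundary_2021 = 0xC24000000   # start of node 0 in the 5.11.11 log

for addr in (boundary_2019, boundary_2021):
    off = offset_in_section(addr)
    status = "aligned" if off == 0 else f"{off >> 20} MiB into a section"
    print(hex(addr), status)
```

With that section size, both boundaries fall 64 MiB into a section, so one section holds memory from both nodes in either layout.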
Comment 8 robert.shteynfeld 2021-05-15 19:11:41 UTC
As per redhat thread, disabling NUMA avoids the problem.
Comment 9 Mike Rapoport 2021-05-15 19:24:18 UTC
Can you run 5.10.y with "mminit_loglevel=4 ignore_loglevel" please?
Comment 10 robert.shteynfeld 2021-05-15 20:04:43 UTC
Created attachment 296789 [details]
dmesg booting 5.10.21 with mminit_loglevel=4 ignore_loglevel

Booted 5.10.21 with mminit_loglevel=4 ignore_loglevel as requested.
Comment 11 Mike Rapoport 2021-05-17 15:28:50 UTC
Created attachment 296813 [details]
reoder-memmap-init.patch

I believe the regression is caused by the refactoring of the memory map initialization for holes in the memory layout [commit 0740a50b9baa ("mm/page_alloc.c: refactor initialization of struct page for holes in memory layout")].

It presumed that the nodes are sorted in ascending order by their start_pfn and this assumption does not hold on your system.

The attached patch enforces that order before starting memory map initialization and I think it should fix your issue.
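
To illustrate the ordering assumption, the node ranges from the 5.11.11 boot log in the description can be checked as below. This is only a sketch in Python, not the kernel code and not the attached patch; node_start_pfns and ascending_by_node_id are hypothetical names, and 4 KiB pages are assumed.

```python
import re

# "Early memory node ranges" lines copied from the 5.11.11 boot log.
LOG = """\
[    0.583188]   node   1: [mem 0x0000000000001000-0x000000000009bfff]
[    0.589418]   node   1: [mem 0x0000000000100000-0x00000000dbdf8fff]
[    0.595650]   node   1: [mem 0x0000000100000000-0x0000000c23ffffff]
[    0.601888]   node   0: [mem 0x0000000c24000000-0x0000001223ffffff]
"""

PAGE_SHIFT = 12  # assumption: 4 KiB pages
RANGE_RE = re.compile(r"node\s+(\d+): \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\]")


def node_start_pfns(log: str) -> dict:
    """Map each node ID to the lowest start pfn among its ranges."""
    starts = {}
    for nid_text, start_text, _end_text in RANGE_RE.findall(log):
        nid = int(nid_text)
        pfn = int(start_text, 16) >> PAGE_SHIFT
        starts[nid] = min(starts.get(nid, pfn), pfn)
    return starts


def ascending_by_node_id(starts: dict) -> bool:
    """True if walking nodes by ID also visits start pfns in ascending order."""
    pfns = [starts[nid] for nid in sorted(starts)]
    return pfns == sorted(pfns)


starts = node_start_pfns(LOG)
print({nid: hex(pfn) for nid, pfn in sorted(starts.items())})
print(ascending_by_node_id(starts))
```

On this layout node 0 starts at pfn 0xc24000 while node 1 starts at pfn 0x1, so walking the nodes in ID order visits the higher range first and the check reports that the order is not ascending.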
Comment 12 robert.shteynfeld 2021-05-17 17:54:44 UTC
Thanks for the quick patch!
Comment 13 Mike Rapoport 2021-05-21 13:37:18 UTC
Created attachment 296913 [details]
reoder-memmap-init-v2.patch

Can you please test this version of the patch as well?
It is more efficient than the previous one, especially for large systems.
Comment 14 robert.shteynfeld 2021-05-22 01:18:36 UTC
Sorry, I never got to test the first patch.  I tried the second one but it doesn't seem to work.  I'll try the first on Mon.
Comment 15 robert.shteynfeld 2021-05-24 16:52:10 UTC
Tested the first patch, but I am still getting the kernel panic.  One thing to note: building branch origin/f33 of the Fedora kernel produces a 5.12.5 kernel.  Here's the video in case it helps:
https://drive.google.com/file/d/1sKwKeKSktIXXbGNzFXCG-7ttXRUhE9Al/view?usp=sharing
Comment 16 Mike Rapoport 2021-05-27 18:08:03 UTC
Created attachment 297009 [details]
debug-memmap-init.patch

This patch adds some debugging prints to v5.10.21.
Can you apply this to v5.10.21 and post the boot log?
Comment 17 robert.shteynfeld 2021-05-28 03:02:51 UTC
Created attachment 297013 [details]
attachment-19828-0.html

I tried checking out the 5.10.21 commit and running "fedpkg local" to
build, but it refuses to do that with a detached HEAD git status.  So not
exactly sure how I can build an old release.  Will try to figure it out.

Comment 18 Mike Rapoport 2021-05-28 11:05:10 UTC
Created attachment 297019 [details]
building kernel from stable tree

I think that for this test you may build a kernel without proper Fedora packaging.
Something like the attached script should work.
Comment 19 robert.shteynfeld 2021-05-28 23:26:33 UTC
I built the kernel rpm, but it's 1G and my /boot is a 500MB partition.  Is there a reason why the rpm is so large?
Comment 20 Mike Rapoport 2021-05-30 09:57:29 UTC
Hmm, probably it's because of the debug info. I think if you disable it in "Kernel Hacking" -> "Compile-time checks and compiler options" -> "Compile the kernel with debug info" in kernel configuration (e.g. make menuconfig) the size of the rpm will be much smaller.
Comment 21 robert.shteynfeld 2021-05-30 15:41:45 UTC
Created attachment 297047 [details]
dmesg after booting 5.10.21 kernel with extra debugging
Comment 22 Mike Rapoport 2021-05-30 18:13:01 UTC
Created attachment 297057 [details]
reoder-memmap-init-v3.patch

The prints actually seem to be quite ok.
I've updated the second patch to include the same prints, and I've added ~1sec delay inside the initialization of the memory map, so that a video could be more helpful.

I think that it's also possible to disable CONFIG_DEBUG_VM and CONFIG_DEBUG_VM_PGFLAGS in "Kernel hacking" -> "Memory Debugging" to have the system boot to get the logs.
Comment 23 robert.shteynfeld 2021-06-04 01:33:01 UTC
Created attachment 297143 [details]
5.12.7 boot with CONFIG_DEBUG_VM=n build

Was not able to apply the latest patch to any of the versions I've tried, but was able to build with CONFIG_DEBUG_VM off and save the dmesg output from 5.12.7 (does not panic as you suggested).
Comment 24 Mike Rapoport 2021-06-04 15:01:41 UTC
Created attachment 297155 [details]
reorder-memmap-init-v4.patch

Hmm, something went wrong with reorder-memmap-init-v3.patch, I diffed the wrong versions.

Can you please try v4? It should apply on v5.12.7 and it should boot with CONFIG_DEBUG_VM=n.
Even if it doesn't, it has delays so that a video will be helpful.
Comment 25 Andre Lucas Falco 2021-06-16 20:03:07 UTC
Hi, I have the same behavior reported here, on a Dell Precision T7500 with two Xeon X5675 if it has NUMA enabled in the BIOS.

How could I help testing the v4 patch?
Comment 26 Mike Rapoport 2021-06-16 20:45:58 UTC
The patch applies to v5.12.7, if you could apply it, build the kernel and send the dmesg output that would be great.
Comment 27 robert.shteynfeld 2021-06-17 01:52:52 UTC
Created attachment 297419 [details]
5.12.7 dmesg after patch 4
Comment 28 Mike Rapoport 2021-06-20 10:05:42 UTC
The prints seem to be correct and the system booted Ok, didn't it?
Was CONFIG_DEBUG_VM enabled or disabled?
Comment 29 robert.shteynfeld 2021-06-20 17:19:45 UTC
I thought that CONFIG_DEBUG_VM=y was set from previous build, but it looks like it wasn't set (I'm looking at the .config file?). The system booted ok.
Comment 30 robert.shteynfeld 2021-06-20 17:26:10 UTC
From .config

# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VM_PGTABLE is not set
Comment 32 Mike Rapoport 2021-06-21 06:33:57 UTC
Can you please verify that with CONFIG_DEBUG_VM=y, CONFIG_DEBUG_VM_PGTABLE=y and CONFIG_PAGE_POISONING=y the system boots fine?
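
One way to double-check these options before booting is to scan the .config of the build. A minimal sketch follows, assuming the usual kconfig output format ("CONFIG_FOO=y" for a built-in option, "# CONFIG_FOO is not set" for a disabled one); config_enabled is a hypothetical helper and SAMPLE only mirrors the two lines posted in comment 30.

```python
def config_enabled(config_text: str, option: str) -> bool:
    """Return True if the option appears as '<option>=y' on a line of its own."""
    wanted_line = f"{option}=y"
    return any(line.strip() == wanted_line for line in config_text.splitlines())


# Sample text mirroring the .config lines from comment 30.
SAMPLE = """\
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VM_PGTABLE is not set
"""

WANTED = ["CONFIG_DEBUG_VM", "CONFIG_DEBUG_VM_PGTABLE", "CONFIG_PAGE_POISONING"]

missing = [opt for opt in WANTED if not config_enabled(SAMPLE, opt)]
print(missing)
```

For a real build tree, SAMPLE would be replaced by the contents of the .config file. Matching the whole line keeps CONFIG_DEBUG_VM from being confused with CONFIG_DEBUG_VM_PGTABLE.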
Comment 33 robert.shteynfeld 2021-06-23 02:15:51 UTC
Yes it boots.
Comment 34 Artem Bityutskiy 2021-07-12 12:05:29 UTC
Mike Rapoport referred me to this ticket. Here is what I got.

1. System: Dell R630, 2-Socket Broadwell Xeon
2. What happens: kernel v5.11 boots, kernels v5.12 and v5.13 do not boot.
3. Bisecting result: reverting this commit fixes the problem:

0740a50b9baa (refs/bisect/bad) mm/page_alloc.c: refactor initialization of struct page for holes in memory layout

4. I do not see anything on serial, even with early console. Tried both earlyprintk and earlycon.

I used PXE boot for this system with the pxelinux bootloader. I did not try booting from disk.

5. The system has the Cluster On Die (CoD) feature enabled. When I disable the CoD feature in the BIOS menu, the system boots just fine.

In other words, to make the system boot, I need to do one of:
a. Revert the patch. Then both CoD on and off boot.
b. Keep the patch, but disable CoD.

6. Attached dmesg for the following 3 configurations.

a. bdx-CoD-on-v5.13-reverted-dmesg.txt  - CoD enabled, kernel is v5.13 + 0740a50b9baa reverted. It reverts cleanly.
b. bdx-CoD-off-v5.13-reverted-dmesg.txt - CoD disabled, kernel is v5.13 + 0740a50b9baa reverted.
c. bdx-CoD-off-v5.13-vanilla-dmesg.txt - CoD disabled, kernel is vanilla v5.13

7. Attached the kernel config file: config-5.13.0.txt
Comment 35 Artem Bityutskiy 2021-07-12 12:08:26 UTC
Created attachment 297797 [details]
bdx-CoD-on-v5.13-reverted-dmesg.txt
Comment 36 Artem Bityutskiy 2021-07-12 12:09:47 UTC
Created attachment 297799 [details]
bdx-CoD-off-v5.13-reverted-dmesg.txt
Comment 37 Artem Bityutskiy 2021-07-12 12:10:13 UTC
Created attachment 297801 [details]
bdx-CoD-off-v5.13-vanilla-dmesg.txt
Comment 38 Artem Bityutskiy 2021-07-12 12:10:44 UTC
Created attachment 297803 [details]
config-5.13.0.txt
Comment 39 Mike Rapoport 2021-07-12 12:39:09 UTC
The issue that was originally reported in this bug is fixed by commit 
122e093c1734 ("mm/page_alloc: fix memory map initialization for descending nodes") that went into v5.14-rc1.

Can you please check what happens in your case with 5.14-rc1?
Comment 40 Artem Bityutskiy 2021-07-12 12:56:37 UTC
Patch 122e093c1734 fixes the issue for me.

How tested:
$ git reset --hard v5.13
$ git cherry-pick 122e093c1734

Build, it boots fine. Without 122e093c1734 it does not boot. Thanks!
Comment 41 Artem Bityutskiy 2021-07-12 12:57:48 UTC
Consider closing the ticket as it is fixed by commit 122e093c1734. The commit has 'Cc: stable@kernel.org', so it eventually should propagate to older stable trees.
Comment 42 Borislav Petkov 2022-01-06 23:39:33 UTC
Ok, closing.