Bug 219910

Summary: Dell Precision T5600 (dual-socket Xeon E5-2680) fails to boot since kernel 6.6.0 (regression from 6.5.9)
Product: Platform Specific/Hardware
Component: x86-64
Reporter: Actionless (actionless.loveless)
Assignee: platform_x86_64 (platform_x86_64)
Status: NEW
Severity: blocking
Priority: P3
CC: actionless.loveless, ardb
Hardware: Intel
OS: Linux
See Also: https://bugzilla.kernel.org/show_bug.cgi?id=218173
Kernel Version: 6.6.0
Subsystem:
Regression: Yes
Bisected commit-id: a1b87d54f4e45ff5e0d081fb1d9db3bf1a8fb39a

Description Actionless 2025-03-22 17:53:49 UTC
Dell Precision T5600 (dual-socket Xeon E5-2680) fails to boot starting with kernel 6.6.0 and all versions thereafter (including latest 6.13.7). The system shuts down shortly after GRUB with no visible output on screen, even with earlyprintk and debug flags.

This behavior is **not resolved by the x2apic patch in bug #218173**, and is confirmed to be a **separate regression** introduced in the 6.6.0 kernel merge.

---

✅ **Known good:**
- 6.5.9 (mainline and zen)
- 6.6.72-lts

❌ **Known bad:**
- 6.6.0
- 6.6.1
- 6.7.0
- 6.12.19-lts
- 6.13.7

All were tested with minimal and stripped kernel configs:
- `CONFIG_X86_X2APIC=n`
- `CONFIG_NUMA=n`
- `CONFIG_ACPI_HOTPLUG_CPU=n`
- `CONFIG_EFI_RUNTIME_MAP=n`
- etc.
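
A minimal sketch for double-checking that a given build really had these options switched off. It assumes the usual `.config` convention where a disabled option is written as `# CONFIG_FOO is not set`; the helper names and the sample text are hypothetical and not taken from the actual test configs.

```python
import re

# Options listed above that were switched off in the test builds.
EXPECTED_OFF = [
    "CONFIG_X86_X2APIC",
    "CONFIG_NUMA",
    "CONFIG_ACPI_HOTPLUG_CPU",
    "CONFIG_EFI_RUNTIME_MAP",
]

def parse_kconfig(text):
    """Return {option: value} for a kernel .config; disabled options map to 'n'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Kconfig writes a disabled option as '# CONFIG_FOO is not set'.
        m = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if m:
            values[m.group(1)] = "n"
            continue
        m = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if m:
            values[m.group(1)] = m.group(2)
    return values

def still_enabled(text, options=EXPECTED_OFF):
    """List the options from `options` that are not disabled in the config text."""
    values = parse_kconfig(text)
    # An option that is absent from .config is treated as off.
    return [opt for opt in options if values.get(opt, "n") != "n"]

# Hypothetical excerpt, for illustration only.
sample = """\
# CONFIG_X86_X2APIC is not set
CONFIG_NUMA=y
# CONFIG_ACPI_HOTPLUG_CPU is not set
CONFIG_EFI_STUB=y
"""
print(still_enabled(sample))
```

For a real build, the text would come from the `.config` in the build tree, e.g. `still_enabled(open(".config").read())`.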

Also tried several additional bootloader options:

`nokaslr intel_iommu=off noapic nolapic acpi=strict
processor.max_cstate=1 idle=halt earlyprintk=vga debug ignore_loglevel loglevel=8`
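
Because the failing kernels print nothing, one way to rule out a bootloader typo is to boot a working kernel with the same entry and compare `/proc/cmdline` against the intended options. The sketch below does that comparison; the `active` string is a made-up example and the helper names are hypothetical.

```python
import shlex

# Options from the list above, as they would appear on the kernel command line.
WANTED = (
    "nokaslr intel_iommu=off noapic nolapic acpi=strict "
    "processor.max_cstate=1 idle=halt earlyprintk=vga "
    "debug ignore_loglevel loglevel=8"
)

def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params

def missing_options(active, wanted=WANTED):
    """Return wanted tokens that are absent from, or differ in, the active command line."""
    have = parse_cmdline(active)
    want = parse_cmdline(wanted)
    return [
        name if value is None else f"{name}={value}"
        for name, value in want.items()
        if name not in have or have[name] != value
    ]

# Hypothetical contents of /proc/cmdline from a kernel that does boot.
active = "BOOT_IMAGE=/vmlinuz-linux root=/dev/sda2 rw nokaslr debug loglevel=7"
print(missing_options(active))
```

An empty list means every intended option reached the kernel with the intended value.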

---

🖥 **System:**
- Dell Precision T5600
- 2× Intel Xeon E5-2680 (Sandy Bridge-EP)
- BIOS A18
- GeForce GTX 1080

---

🧪 **Symptoms:**
- Black screen and system powers off silently after ~1 minute post-GRUB
- No logs, no framebuffer, earlyprintk doesn't show anything
- Issue reproduced with `linux`, `linux-lts`, and `linux-zen` kernels
- Config from working 6.6.72 LTS kernel reused on 6.13.7; still fails

---

🧠 **Extra Notes:**
I've previously encountered a compatibility issue with this motherboard misreporting its EFI version (seen in Tesla/NVIDIA kernel bug contexts). It's possible this could be related to the EFI memory map or early boot services being mishandled by the kernel.

Thanks in advance!
Comment 1 Actionless 2025-03-22 18:33:17 UTC
I'll also start bisecting, but because a build takes 30-40 minutes on that machine, I can build and test only 1-2 versions per evening. It may therefore take several days or weeks before I can report the exact commit the bisection lands on.
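
A rough estimate of how long that takes: `git bisect` halves the candidate range with each test, so it needs about log2(N) build-and-boot rounds for N commits. The figure of 15,000 commits is only an assumed order of magnitude for one kernel release cycle; the real number can be read with `git rev-list --count <good>..<bad>`.

```python
import math

def bisect_steps(commit_count):
    """Approximate number of build-and-test rounds git bisect needs for a range."""
    return math.ceil(math.log2(commit_count))

def evenings_needed(commit_count, builds_per_evening):
    """Evenings required when only a few kernels can be built and tested per evening."""
    return math.ceil(bisect_steps(commit_count) / builds_per_evening)

# ASSUMPTION: roughly 15,000 commits in the bisected range (illustrative only).
commits = 15_000
print(bisect_steps(commits))        # rounds of build + boot test
print(evenings_needed(commits, 1))  # at one build per evening
print(evenings_needed(commits, 2))  # at two builds per evening
```

Under that assumption the bisection needs about 14 rounds, i.e. one to two weeks at 1-2 builds per evening unless the build itself gets faster.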
Comment 2 Actionless 2025-03-23 12:41:45 UTC
I minimized the config and used ccache to speed things up.

🧠 Bisection Complete: Root Cause Identified

I've bisected the issue to the exact first bad commit:

commit a1b87d54f4e45ff5e0d081fb1d9db3bf1a8fb39a
Author: Ard Biesheuvel
Title: x86/efistub: Avoid legacy decompressor when doing EFI boot
🔥 This commit causes a silent failure during EFI boot (black screen → power-off within 1 minute) on:



🧬 BIOS version: A19 (not A18 as I previously thought; confirmed)

EFI on this machine appears to falsely advertise capabilities that are required for the new decompressor path used after this commit.


💡 Conclusion:
The legacy decompressor path was required on this platform, and skipping it unconditionally leads to total boot failure.

Let me know if any logs or debug builds would help further; I'm happy to test and assist.