Bug 219910 - Dell Precision T5600 (dual-socket Xeon E5-2680) fails to boot since kernel 6.6.0 (regression from 6.5.9)
Status: NEW
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: Intel Linux
Importance: P3 blocking
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-03-22 17:53 UTC by Actionless
Modified: 2025-03-23 12:44 UTC
CC List: 2 users

See Also:
Kernel Version: 6.6.0
Subsystem:
Regression: Yes
Bisected commit-id: a1b87d54f4e45ff5e0d081fb1d9db3bf1a8fb39a


Attachments

Description Actionless 2025-03-22 17:53:49 UTC
Dell Precision T5600 (dual-socket Xeon E5-2680) fails to boot starting with kernel 6.6.0, and in every later version tested (including the latest, 6.13.7) except 6.6.72-lts. The system powers off shortly after GRUB with no visible output on screen, even with earlyprintk and debug flags.

This behavior is **not resolved by the x2apic patch in bug #218173**, and is confirmed to be a **separate regression** introduced in the 6.6.0 kernel merge.

---

✅ **Known good:**
- 6.5.9 (mainline and zen)
- 6.6.72-lts

❌ **Known bad:**
- 6.6.0
- 6.6.1
- 6.7.0
- 6.12.19-lts
- 6.13.7

All were tested with minimal and stripped kernel configs:
- `CONFIG_X86_X2APIC=n`
- `CONFIG_NUMA=n`
- `CONFIG_ACPI_HOTPLUG_CPU=n`
- `CONFIG_EFI_RUNTIME_MAP=n`
- etc.
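
For anyone trying to reproduce this, here is a minimal sketch for double-checking that the options above really ended up disabled in a given `.config`. It only covers the four options named in the list (the "etc." is not expanded), and the helper names `parse_config` and `still_enabled` are made up for illustration. Note that Kconfig normally writes a disabled option as `# CONFIG_FOO is not set` rather than `CONFIG_FOO=n`, so the sketch accepts both forms.

```python
import re

# Options the report lists as disabled; "etc." in the report is not expanded here.
EXPECTED_OFF = [
    "CONFIG_X86_X2APIC",
    "CONFIG_NUMA",
    "CONFIG_ACPI_HOTPLUG_CPU",
    "CONFIG_EFI_RUNTIME_MAP",
]

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def still_enabled(text, names=EXPECTED_OFF):
    """Return the names that are built in (y) or modular (m) in the config.

    Options missing from the file are treated as disabled.
    """
    options = parse_config(text)
    return [name for name in names if options.get(name, "n") in ("y", "m")]
```

Typical use would be `still_enabled(open(".config").read())` in the kernel build directory; an empty list means all four options are off.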

Also tried several additional kernel command-line parameters:

`nokaslr intel_iommu=off noapic nolapic acpi=strict
processor.max_cstate=1 idle=halt earlyprintk=vga debug ignore_loglevel loglevel=8`

---

🖥 **System:**
- Dell Precision T5600
- 2× Intel Xeon E5-2680 (Sandy Bridge-EP)
- BIOS A18
- GeForce GTX 1080

---

🧪 **Symptoms:**
- Black screen and system powers off silently after ~1 minute post-GRUB
- No logs, no framebuffer, earlyprintk doesn't show anything
- Issue reproduced with `linux`, `linux-lts`, and `linux-zen` kernels
- Config from working 6.6.72 LTS kernel reused on 6.13.7 — still fails

---

🧠 **Extra Notes:**
I’ve previously encountered a compatibility issue with this motherboard misreporting its EFI version (seen in Tesla/NVIDIA kernel bug contexts). It’s possible this could be related to the EFI memory map or early boot services being mishandled by the kernel.

Thanks in advance!
Comment 1 Actionless 2025-03-22 18:33:17 UTC
I'll also start bisecting, but because a build takes 30-40 minutes on that machine, I can build and test only 1-2 versions per evening. It will therefore be several days or weeks before I can report the exact commit the bisection lands on.
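
As a rough sanity check on that estimate: a bisection over N commits needs roughly log2(N) build-and-boot cycles. The sketch below assumes about 15,000 commits between the good and bad tags; that figure is an assumption for illustration and not a number taken from this report, and the function names are hypothetical.

```python
import math

# Assumption for illustration only: approximate size of the commit range.
ASSUMED_COMMITS = 15000


def bisect_steps(commit_count):
    """Approximate number of builds a bisection needs for commit_count commits."""
    return math.ceil(math.log2(commit_count))


def evenings_needed(commit_count, builds_per_evening):
    """Evenings required at a fixed number of build-and-boot cycles per evening."""
    return math.ceil(bisect_steps(commit_count) / builds_per_evening)


if __name__ == "__main__":
    for per_evening in (1, 2):
        print(per_evening, evenings_needed(ASSUMED_COMMITS, per_evening))
```

Under that assumption the range needs about 14 builds, so 7 to 14 evenings at 1-2 builds per evening. Shortening each build (smaller config, compiler cache) does not change the step count, only how many steps fit into one evening.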
Comment 2 Actionless 2025-03-23 12:41:45 UTC
I minimized the config and used ccache to speed things up.

🧠 Bisection Complete — Root Cause Identified

I’ve bisected the issue to the exact first bad commit:

commit a1b87d54f4e45ff5e0d081fb1d9db3bf1a8fb39a
Author: Ard Biesheuvel
Title: x86/efistub: Avoid legacy decompressor when doing EFI boot
🔥 This commit causes a silent failure during EFI boot (black screen → power-off within 1 minute) on the Dell Precision T5600 described above.

🧬 BIOS version: A19 (not A18 as I previously thought — confirmed)

EFI on this machine appears to falsely advertise capabilities that are required for the new decompressor path used after this commit.


💡 Conclusion:
The legacy decompressor path is required on this platform, and skipping it unconditionally leads to total boot failure.
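
Since the bisected commit only affects the EFI boot path, it may be worth confirming from a known-good kernel that the machine really is booted via EFI and not in legacy/CSM mode. A minimal sketch, assuming the standard sysfs layout (`/sys/firmware/efi` exists only on an EFI boot, and `fw_platform_size` holds the firmware bitness); the function name `efi_boot_info` is hypothetical:

```python
from pathlib import Path


def efi_boot_info(sysfs_root="/sys"):
    """Report whether the running kernel was booted via EFI.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    efi_dir = Path(sysfs_root) / "firmware" / "efi"
    if not efi_dir.is_dir():
        # No EFI directory in sysfs: legacy BIOS/CSM boot.
        return {"efi_boot": False, "platform_size": None}

    size = None
    size_file = efi_dir / "fw_platform_size"
    if size_file.is_file():
        # Contains 32 or 64 depending on the firmware bitness.
        size = int(size_file.read_text().strip())
    return {"efi_boot": True, "platform_size": size}


if __name__ == "__main__":
    print(efi_boot_info())
```

This has to be run under a kernel that still boots (for example 6.5.9), since the failing kernels never reach userspace.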

Let me know if any logs or debug builds would help further — happy to test and assist.
