Bug 217637
| Summary: | unable to boot when monitor is attached | | |
|---|---|---|---|
| Product: | Linux | Reporter: | primalmotion (primalmotion) |
| Component: | Kernel | Assignee: | Virtual assignee for kernel bugs (linux-kernel) |
| Status: | NEW --- | | |
| Severity: | normal | CC: | bagasdotme, jonathon.hall, pmenzel+bugzilla.kernel.org |
| Priority: | P3 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | photo of the trace; another picture of the issue; dmesg | | |
Created attachment 304555 [details]
another picture of the issue

(In reply to primalmotion from comment #0)
> Created attachment 304554 [details]
> photo of the trace
>
> In the latest 6.3 and 6.4, it is impossible for me to boot my laptop if my
> DELL U2720Q monitor is plugged in (USB-C). I have to unplug it, then boot.
> As soon as the first second of boot went through, I can plug in my monitor
> and there is no issue afterward. There is no issue waking up after suspend.
> Only when it boots.
>
> See the attached pictures of the trace. The trace itself seems random (at
> least to me :)). I tried several things, like removing any attached USB
> devices from the monitor built-in USB-hub, but that does not change
> anything. (there is a keyboard and trackpad attached).

Do you have this issue on v6.1? Can you attach dmesg instead?

No the issue does not happen with 6.1. I'm not sure how to get the dmesg since it panics immediately after loading the kernel

(In reply to primalmotion from comment #3)
> No the issue does not happen with 6.1. I'm not sure how to get the dmesg
> since it panics immediately after loading the kernel

Then can you please bisect between v6.1 and v6.3? Can you also attach lspci and lsusb?

> Then can you please bisect between v6.1 and v6.3?

The bisect operation is gonna take a long time, I'm not sure when I'll have the time to do so. I'll keep you posted

> Can you also attach lspci and lsusb?

lspci:
00:00.0 Host bridge: Intel Corporation Device 9b51
00:02.0 VGA compatible controller: Intel Corporation Comet Lake UHD Graphics (rev 04)
00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Comet Lake Thermal Subsytem
00:14.0 USB controller: Intel Corporation Comet Lake PCH-LP USB 3.1 xHCI Host Controller
00:14.2 RAM memory: Intel Corporation Comet Lake PCH-LP Shared SRAM
00:15.0 Serial bus controller: Intel Corporation Serial IO I2C Host Controller
00:1c.0 PCI bridge: Intel Corporation Device 02be (rev f0)
00:1c.7 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #8 (rev f0)
00:1d.0 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #13 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Comet Lake PCH-LP LPC Premium Controller/eSPI Controller
00:1f.3 Audio device: Intel Corporation Comet Lake PCH-LP cAVS
00:1f.4 SMBus: Intel Corporation Comet Lake PCH-LP SMBus Host Controller
00:1f.5 Serial bus controller: Intel Corporation Comet Lake SPI (flash) Controller
01:00.0 Network controller: Qualcomm Atheros AR9462 Wireless Network Adapter (rev 01)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO

lsusb:
Bus 002 Device 002: ID 05e3:0749 Genesys Logic, Inc. SD Card Reader and Writer
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 081: ID 0451:82ff Texas Instruments, Inc.
Bus 001 Device 079: ID 445a:1424 DZTECH DZ65RGBV3
Bus 001 Device 083: ID 0763:410e M-Audio AIR 192 14
Bus 001 Device 082: ID 05e3:0608 Genesys Logic, Inc. Hub
Bus 001 Device 080: ID 05ac:0265 Apple, Inc. Magic Trackpad 2
Bus 001 Device 078: ID 05e3:0608 Genesys Logic, Inc. Hub
Bus 001 Device 077: ID 0451:8442 Texas Instruments, Inc.
Bus 001 Device 071: ID 04ca:300d Lite-On Technology Corp. Atheros AR3012 Bluetooth
Bus 001 Device 031: ID 20a0:42b2 Clay Logic Nitrokey 3
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

You seem to use Arch Linux, which might provide a way to easily test mainline releases and the release candidates, so you can save building the Linux kernel.

Sorry it took so long, but I've finally started working on the bisect. I will try to find the first packaged version in Arch that breaks, then bisect from there. I'll report back when I'm done

So... I went back as far as 6.0.10 and the panic is happening on every single version I tested now... I'm a bit at loss..

Thank you for running the tests. I lost track, can you please list the working and non-working Linux kernels? Lastly, please attach the output of `dmesg` of a successful boot, where you plug in the monitor later.

I'm sorry if I was unclear, what I meant is I could not find any working kernel version. I tried most of the versions back to 6.0.10. They all crash the same now. Something else must have changed that triggers this crash in the kernel but what? I tried to downgrade linux-firmware to a version from April (I'm sure everything was working fine back then) but that did not help.

Created attachment 304933 [details]
dmesg
This is the dmesg of a successful boot

After talking to PureBoot's maintainer, it seems PureBoot may be the culprit. They had several reports from other users who were unable to boot any device while it was attached to a 4K display. The maintainer tried plugging their laptop into a 4K TV and encountered the crash. I have downgraded PureBoot and I will try again tonight when I have access to my offending monitor, then report back

Ok so downgrading to the previous version of PureBoot fixes the issue with the latest kernel version on Arch (6.4.11). I'm not sure if this kernel panic is valid, even if caused by PureBoot, so I'll let you decide if you want to close the issue or not. If you need more information from me, feel free to ask. Thank you!

I’d say, if (system) firmware incorrectly configures the hardware all bets are off. It’d be nice, if the PureBoot folks could chime in and share their analysis. If they won’t, I’d close this issue for now.

Hi all, I'm the PureBoot developer at Purism. This looks like a bug in PureBoot/Heads that originated from https://github.com/osresearch/heads/pull/1378. I've validated a fix in PureBoot, it'll go out in PureBoot 28 (this week if there are no troubles in testing), and I'll PR to upstream after that. I think this can be closed here, any change would only be to defend against buggy firmware.

It appears that with a 4K display, the framebuffer memory isn't being properly indicated as reserved one way or another. When booting with a 4K display, garbage briefly appears, then is overwritten by the framebuffer console. I believe this indicates Linux is allocating memory within the framebuffer for general use, then the framebuffer console overwrites and corrupts it. memtest86+ also shows this. The test patterns show up on the framebuffer when it reaches the right place in memory, the framebuffer display overwrites part of it, and memtest86+ correctly reports this as a failure.

I'm not sure why this only occurs with 4K (memtest86+ passes with 1080p), but I don't plan on doing a deeper dive to find out since we are moving to a different graphics initialization method (https://github.com/osresearch/heads/pull/1403, same is being done for Librems in PB 28), and I've now validated this method with 4K framebuffer. Thank you all for testing and investigating this, and let me know if there is any other info I can provide.
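
For readers following the framebuffer explanation above, here is a minimal Python sketch of the kind of check it implies: does the framebuffer's address range intersect any memory region that the firmware reported as usable RAM? Everything concrete in it is an assumption made for illustration and does not come from this report: the `Region` type, the `FB_BASE` address, the 16 MiB reserved window in `MEMORY_MAP`, and the 4 bytes per pixel. The map is deliberately built so that a 4K framebuffer spills into usable RAM while a 1080p one does not; the thread does not establish that this is what actually happened on the affected hardware.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """One entry of a firmware-provided memory map (hypothetical layout)."""
    start: int  # first byte of the region
    end: int    # one past the last byte of the region
    kind: str   # "usable" or "reserved"


def framebuffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of a linear framebuffer, assuming no row padding."""
    return width * height * bytes_per_pixel


def usable_overlaps(regions: List[Region], fb_start: int, fb_size: int) -> List[Region]:
    """Return the usable regions that intersect [fb_start, fb_start + fb_size)."""
    fb_end = fb_start + fb_size
    return [
        r for r in regions
        if r.kind == "usable" and r.start < fb_end and fb_start < r.end
    ]


# Hypothetical values, chosen only so the example has something to show.
FB_BASE = 0x8000_0000
MEMORY_MAP = [
    Region(0x0010_0000, 0x8000_0000, "usable"),
    Region(0x8000_0000, 0x8100_0000, "reserved"),  # 16 MiB set aside
    Region(0x8100_0000, 0xC000_0000, "usable"),
]

if __name__ == "__main__":
    for name, (w, h) in {"1080p": (1920, 1080), "4K": (3840, 2160)}.items():
        size = framebuffer_bytes(w, h)
        clashes = usable_overlaps(MEMORY_MAP, FB_BASE, size)
        status = "overlaps usable RAM" if clashes else "fits in reserved range"
        print(f"{name}: {size} bytes, {status}")
```

If any usable region overlaps the framebuffer, the kernel is free to hand that memory out for general use, which matches the corruption described in the comment above. On a real system the regions would come from the firmware memory map that the kernel prints early in boot, not from a hard-coded list.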

Created attachment 304554 [details]
photo of the trace

In the latest 6.3 and 6.4, it is impossible for me to boot my laptop if my DELL U2720Q monitor is plugged in (USB-C). I have to unplug it, then boot. As soon as the first second of boot went through, I can plug in my monitor and there is no issue afterward. There is no issue waking up after suspend. Only when it boots.

See the attached pictures of the trace. The trace itself seems random (at least to me :)). I tried several things, like removing any attached USB devices from the monitor built-in USB-hub, but that does not change anything. (there is a keyboard and trackpad attached).