Bug 80321 - Lenovo W530: Freeze on boot with CSM enabled
Summary: Lenovo W530: Freeze on boot with CSM enabled
Status: RESOLVED INVALID
Alias: None
Product: EFI
Classification: Unclassified
Component: Boot
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: EFI Virtual User
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-16 00:21 UTC by Gigadoc2
Modified: 2014-07-26 13:12 UTC
CC List: 3 users

See Also:
Kernel Version: >3.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments
lspci without CSM (26.72 KB, text/plain)
2014-07-16 00:21 UTC, Gigadoc2
dmidecode without CSM (17.42 KB, text/plain)
2014-07-16 00:22 UTC, Gigadoc2
dmidecode with CSM enabled + noapic boot option (17.42 KB, text/plain)
2014-07-16 18:34 UTC, Gigadoc2
Udevd debug output after system freeze (427.46 KB, image/jpeg)
2014-07-16 21:01 UTC, Gigadoc2
another udevd debug output after freeze (485.34 KB, image/jpeg)
2014-07-16 21:02 UTC, Gigadoc2

Description Gigadoc2 2014-07-16 00:21:32 UTC
Created attachment 143171
lspci without CSM

Booting a Linux distribution on the ThinkPad W530 via UEFI with the Compatibility Support Module (CSM) turned off works fine. Booting via UEFI with the CSM on, or booting in BIOS mode (which requires the CSM to be on), freezes the machine.

Judging from the systemd messages printed up to that point, the freeze occurs right after udev is started.

I have tried different kernels on Arch Linux, as well as a Fedora 20 live image.

Unfortunately, the freeze always happens before logs are written to disk.

Steps to reproduce:
1. Enable CSM on a Thinkpad W530
2. Boot anything with a kernel newer than 3.13
3. Start udev hardware detection

The system does boot with noacpi or noapic as kernel options, though this heavily limits its functionality.
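
When comparing boots across the firmware modes above, it can help to record how the running system was started. The following is a minimal Python sketch, assuming a Linux system with sysfs mounted. The function names are hypothetical, and acpi=off is included on the assumption that it is the documented spelling of what the report calls noacpi. Note that the presence of /sys/firmware/efi only distinguishes a UEFI boot from a BIOS boot; it does not reveal whether the CSM is enabled.

```python
import os

# Options named in this report, plus "acpi=off" (assumption: the documented
# spelling of what the report calls "noacpi").
WORKAROUND_PARAMS = {"noapic", "nolapic", "noacpi", "acpi=off"}


def boot_mode(efi_dir="/sys/firmware/efi"):
    """Return "UEFI" if the kernel exposes the EFI sysfs directory, else "BIOS"."""
    return "UEFI" if os.path.isdir(efi_dir) else "BIOS"


def workaround_params(cmdline):
    """Return the APIC/ACPI workaround options found in a kernel command line."""
    return sorted(tok for tok in cmdline.split() if tok in WORKAROUND_PARAMS)
```

On the affected machine, passing the contents of /proc/cmdline to workaround_params() lists whichever of these options were given at boot, which makes it easier to label attachments such as dmidecode or lspci output.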
Comment 1 Gigadoc2 2014-07-16 00:22:21 UTC
Created attachment 143181
dmidecode without CSM
Comment 2 Gigadoc2 2014-07-16 18:34:49 UTC
Created attachment 143251
dmidecode with CSM enabled + noapic boot option
Comment 3 Gigadoc2 2014-07-16 18:37:47 UTC
Comment on attachment 143251
dmidecode with CSM enabled + noapic boot option

I just noticed that the output of dmidecode does not change when the CSM is enabled.
Comment 4 Gigadoc2 2014-07-16 21:01:41 UTC
Created attachment 143281
Udevd debug output after system freeze

After further testing, I found out two other things:
- nolapic gets the system to boot as well
- The problem only happens when Optimus is set to "Discrete Graphics", i.e. when the Nvidia card is set as the primary graphics card. However, it doesn't matter whether the proprietary Nvidia driver is installed or not.

Also, I got the kernel version wrong: it happens from 3.10 onwards. I also managed to capture at least the udev debug output at the time the system freezes.
Comment 5 Gigadoc2 2014-07-16 21:02:30 UTC
Created attachment 143291
another udevd debug output after freeze
Comment 6 Matt Fleming 2014-07-21 14:59:55 UTC
Is there a reason you're not happy to just continue booting the machine with UEFI and CSM disabled?

The CSMs can be pretty buggy and I am certainly not surprised that all 3 modes (BIOS, CSM on, CSM off) fail to work the same.
Comment 7 Gigadoc2 2014-07-21 15:25:52 UTC
I can keep this setup, but others may not be able to if they need BIOS functionality for dual-booting with other operating systems.

Also, I used to boot in BIOS mode around kernel version 3.8, so this looks like a kernel regression. I'll do some more testing and report back with the first kernel version that fails to boot.
Comment 8 Matt Fleming 2014-07-21 15:27:53 UTC
Thanks. If you can narrow it down to the single commit that introduced this regression, that would be immensely helpful.
Comment 9 Gigadoc2 2014-07-24 02:16:01 UTC
Embarrassingly, while bisecting, I noticed that kernels before 3.8 also occasionally fail to boot, just less often than newer versions. I went back to v3.2.0, and even then the error occurred after a few cold boots, and I know that later kernel versions did work in the past.
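
An intermittent freeze like this makes a single successful boot weak evidence that a kernel is good. As a rough sketch only, assuming each cold boot freezes independently with a fixed probability (the function names and any probability values passed in are hypothetical, not measurements from this machine), the number of clean boots needed before trusting a "good" verdict can be estimated like this:

```python
import math


def false_good_probability(p_fail, boots):
    """Chance that a kernel which freezes with probability p_fail per boot
    survives `boots` independent cold boots and is wrongly marked good."""
    return (1.0 - p_fail) ** boots


def boots_needed(p_fail, max_false_good):
    """Smallest number of clean cold boots for which the false-good chance
    is at or below max_false_good."""
    if not 0.0 < p_fail < 1.0 or not 0.0 < max_false_good < 1.0:
        raise ValueError("probabilities must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1.0 - p_fail))
```

The lower the per-boot failure rate, the more cold boots each bisection step needs, which is consistent with older kernels appearing to work when they only fail less often.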

It seems this is not a kernel regression after all but depends on something else; I might just have broken my hardware or firmware without noticing.

I will still try to reproduce my "working" system, but until I have managed to do so and have more precise data about what is causing this, I won't open another bug.

Sorry for the confusion.
Comment 10 sven 2014-07-26 13:12:39 UTC
I have similar problems on a T420, where the kernel is started via EFISTUB.

As soon as I set the display option to discrete, it fails to boot most of the time: it either hangs while “setting up console”, has problems accessing the hard drive, or stops with a systemd rfkill error. So I would assume that it may be some kind of ACPI problem. I did get it booted to X a couple of times, though, with both the Nvidia blob and nouveau.

When I set the display option to Optimus, the system boots fine every time, but I can only use the Intel GPU. Using the Nvidia one via PRIME crashes the Intel driver. See also "https://bugs.freedesktop.org/show_bug.cgi?id=80769". Independently of whether PRIME is used, the kernel also shows the following errors:

* ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042f conflicts with OpRegion 0x0000000000000400-0x000000000000047f (\_SB_.PCI0.LPC_.PMIO) (20140214/utaddress-258)

* ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054f conflicts with OpRegion 0x0000000000000500-0x000000000000057f (\_SB_.PCI0.LPC_.LPIO) (20140214/utaddress-258)

* ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053f conflicts with OpRegion 0x0000000000000500-0x000000000000057f (\_SB_.PCI0.LPC_.LPIO) (20140214/utaddress-258)

* ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052f conflicts with OpRegion 0x0000000000000500-0x000000000000057f (\_SB_.PCI0.LPC_.LPIO) (20140214/utaddress-258)

* ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140214/nsarguments-95)

* ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140214/nsarguments-95).
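
In each of the four SystemIO warnings above, the reported I/O range lies entirely inside the named OpRegion. A small Python sketch for pulling the ranges out of such lines and computing the overlap follows. The regular expression is written against the message format quoted here and is an assumption about other kernel versions; the function name is hypothetical.

```python
import re

# Matches the "SystemIO range A-B conflicts with OpRegion C-D (name)" form
# of the ACPI warnings quoted above.
CONFLICT_RE = re.compile(
    r"SystemIO range 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+) conflicts with "
    r"OpRegion 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+) \(([^)]+)\)"
)


def parse_conflict(line):
    """Return the I/O range, OpRegion range, OpRegion name and their overlap,
    or None if the line is not a SystemIO conflict warning."""
    m = CONFLICT_RE.search(line)
    if m is None:
        return None
    io_start, io_end, op_start, op_end = (int(g, 16) for g in m.groups()[:4])
    return {
        "io": (io_start, io_end),
        "opregion": (op_start, op_end),
        "name": m.group(5),
        # Both ranges are inclusive, so the overlap is max(starts)..min(ends).
        "overlap": (max(io_start, op_start), min(io_end, op_end)),
    }
```

Lines that do not match, such as the _DSM argument type warnings, return None and can be handled separately.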
