Bug 216109

Summary: Steam Deck fails to boot when E820 entries clipped out of _CRS
Product: Drivers Reporter: Bjorn Helgaas (bjorn)
Component: PCI Assignee: drivers_pci (drivers_pci)
Status: RESOLVED CODE_FIX    
Severity: normal CC: gpiccoli, jongman.heo, jwrdegoede
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: v5.19 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: v5.18 dmesg (working)
v5.19-rc1 dmesg with 4c5e242d3e93 reverted (working)
v5.19-rc1 iomem with 4c5e242d3e93 reverted
v5.19-rc1 lspci with 4c5e242d3e93 reverted
Quirk
[PATCH] x86/PCI: Revert: "Clip only host bridge windows for E820 regions"
dmesg (5.19-rc1) showing the failure
v5.19-rc2 dmesg from Intel MID platform with 4c5e242d3e93 reverted (working)
v5.19-rc2 dmesg from VMWare Linux guest with fix (working)
v5.19-rc1 dmesg failure in VMWare Fusion VM
v5.19-rc1 dmesg VMWare Fusion VM workaround with pci=no_e820

Description Bjorn Helgaas 2022-06-09 22:27:16 UTC
Guilherme G. Piccoli reported that v5.18 boots fine on Steam Deck, but v5.19-rc1 does not.  He bisected it to 4c5e242d3e93 ("x86/PCI: Clip only host bridge windows for E820 regions") [1].

A quirk similar to [2] that disables E820 clipping makes v5.19-rc1 work again.

It has not yet been explained why v5.18 (which always does E820 clipping by default) works, while v5.19-rc1 (which also does E820 clipping on this platform) does not.

[1] https://git.kernel.org/linus/4c5e242d3e93
[2] https://git.kernel.org/linus/d341838d776a
Comment 1 Bjorn Helgaas 2022-06-09 22:27:46 UTC
Created attachment 301140 [details]
v5.18 dmesg (working)
Comment 2 Bjorn Helgaas 2022-06-09 22:29:24 UTC
Created attachment 301141 [details]
v5.19-rc1 dmesg with 4c5e242d3e93 reverted (working)

Unable to collect dmesg with vanilla v5.19-rc1 because rootfs, wifi, and the mmc slot do not work.
Comment 3 Bjorn Helgaas 2022-06-09 22:30:45 UTC
Created attachment 301142 [details]
v5.19-rc1 iomem with 4c5e242d3e93 reverted
Comment 4 Bjorn Helgaas 2022-06-09 22:32:44 UTC
Created attachment 301143 [details]
v5.19-rc1 lspci with 4c5e242d3e93 reverted
Comment 5 Guilherme G. Piccoli 2022-06-09 22:40:02 UTC
Thanks a lot for the report Bjorn! I'll attach here the quirk that makes the device work, but I'll defer posting this in the MLs for now, until we fully understand the issue - we still have some time before v5.19 is released. Cheers!
Comment 6 Guilherme G. Piccoli 2022-06-09 22:42:15 UTC
Created attachment 301144 [details]
Quirk
Comment 7 Hans de Goede 2022-06-10 10:39:04 UTC
Looking at the logs I think that the issue likely is that there is an e820 reservation in the middle of the main 32bit mem window for the root-bridge:

[    0.000000] BIOS-e820: [mem 0x00000000a0000000-0x00000000a00fffff] reserved

is in the middle (rather than along the edges) of:

[    0.249449] pci_bus 0000:00: root bus resource [mem 0x80000000-0xf7ffffff window]

So after clipping I expect this window to now be:

[    0.249449] pci_bus 0000:00: root bus resource [mem 0xa0100000-0xf7ffffff window]

Which is likely causing issues for and/or moving around the window for:

[    0.266583] pci 0000:00:08.1: PCI bridge to [bus 04]
[    0.266588] pci 0000:00:08.1:   bridge window [io  0x1000-0x1fff]
[    0.266591] pci 0000:00:08.1:   bridge window [mem 0x80000000-0x803fffff]
[    0.266596] pci 0000:00:08.1:   bridge window [mem 0xf8e0000000-0xf8f01fffff 64bit pref]

Which now falls outside the main window. In the past, when clipping was only done from arch_remove_reservations(), this would only impact new memory window allocations, and only if they did not fit in the [0x80000000-0x9fffffff] hole before the "BIOS-e820: [mem 0x00000000a0000000-0x00000000a00fffff]" reservation.

Whereas now I believe we will clip off everything before 0xa0100000, making the above mem window for the 08.1 bridge no longer valid.

So, assuming that the Steam Deck is unique in having this reservation in the middle of the main 32 bit mem window, I guess that adding the quirk, which will remove the clipping, is probably the best fix.

But if we find more models with this we may need to revert: https://git.kernel.org/linus/4c5e242d3e93 moving the new pci_use_e820 check to arch_remove_reservations()
(so not a straight forward revert).

There are several gotchas related to introducing build-time errors with certain Kconfig options when doing this move. I have these fixed in an older version of my patches which did have the pci_use_e820 check in arch_remove_reservations().

So let me know if we want to go with the revert, then I can prepare a patch partially based on my old patches, which should avoid the build issues.
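
The clipping expected above can be modelled in a few lines. This is a minimal sketch, not the kernel code: the function name `clip_window` is hypothetical, and the rule that only the larger remaining side of the window is kept is an assumption, chosen because it reproduces the window predicted in this comment.

```python
def clip_window(window, reserved):
    """Clip an inclusive (start, end) window against an inclusive reserved range.

    Assumption: when the reserved range overlaps the window, only the larger
    of the two remaining pieces (below or above the reservation) is kept.
    """
    w_start, w_end = window
    r_start, r_end = reserved

    if w_end < r_start or w_start > r_end:
        return window  # no overlap, nothing to clip

    # Bytes of the window left below and above the reservation
    low = r_start - w_start if w_start < r_start else 0
    high = w_end - r_end if w_end > r_end else 0

    if low > high:
        return (w_start, r_start - 1)
    return (r_end + 1, w_end)


main_window = (0x80000000, 0xF7FFFFFF)  # root bus resource quoted above
e820_entry = (0xA0000000, 0xA00FFFFF)   # BIOS-e820 reserved entry quoted above

start, end = clip_window(main_window, e820_entry)
print(f"[mem {start:#010x}-{end:#010x} window]")
```

With these inputs the sketch yields 0xa0100000-0xf7ffffff, so the 0x80000000-0x9fffffff range that holds the 08.1 bridge window is no longer part of the root bus resource.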
Comment 8 Hans de Goede 2022-06-10 10:53:47 UTC
Note other e820 / pci root bridge overlaps on the steamdeck are:

[    0.249454] pci_bus 0000:00: root bus resource [mem 0xfed45000-0xfed814ff window]

vs

[    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed81fff] reserved

Where the end of the window goes into the reserved area.

###

And also:

[    0.249457] pci_bus 0000:00: root bus resource [mem 0xfed81900-0xfed81fff window]

vs

[    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed81fff] reserved

where the window is fully covered by the reservation. AFAICT with the earlier clipping done in 4c5e242d3e93 this means that this window will be entirely dropped now.

###

And also:

[    0.249460] pci_bus 0000:00: root bus resource [mem 0xfedc0000-0xfedc0fff window]
and:
[    0.249462] pci_bus 0000:00: root bus resource [mem 0xfedc6000-0xfedc6fff window]

vs

[    0.000000] BIOS-e820: [mem 0x00000000fedc0000-0x00000000feddffff] reserved

So these 2 windows will also be entirely dropped from the root-bridge resource list now.

Bjorn, does this mean that the PCI code will now also reconfigure the bridge to no longer forward these? That may be another cause of the not booting issue.
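
The different kinds of overlap listed in this comment can be told apart mechanically. The following is only an illustrative sketch: the helper name `classify` and its category labels are made up for this example, and the address pairs are copied from the log lines quoted in this bug.

```python
def classify(window, reserved):
    """Describe how an inclusive reserved range overlaps an inclusive window."""
    w_start, w_end = window
    r_start, r_end = reserved
    if w_end < r_start or w_start > r_end:
        return "no overlap"
    if r_start <= w_start and w_end <= r_end:
        return "window fully covered"
    if w_start < r_start and r_end < w_end:
        return "reservation in the middle"
    return "reservation overlaps one edge"


# (root bus window, overlapping BIOS-e820 reserved entry)
pairs = [
    ((0x80000000, 0xF7FFFFFF), (0xA0000000, 0xA00FFFFF)),
    ((0xFED45000, 0xFED814FF), (0xFED80000, 0xFED81FFF)),
    ((0xFED81900, 0xFED81FFF), (0xFED80000, 0xFED81FFF)),
    ((0xFEDC0000, 0xFEDC0FFF), (0xFEDC0000, 0xFEDDFFFF)),
    ((0xFEDC6000, 0xFEDC6FFF), (0xFEDC0000, 0xFEDDFFFF)),
]

for window, reserved in pairs:
    print(f"{window[0]:#010x}-{window[1]:#010x}: {classify(window, reserved)}")
```

Only the fully covered windows are candidates for being dropped entirely; the other two cases leave a smaller, but non-empty, window.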
Comment 9 Hans de Goede 2022-06-10 11:02:19 UTC
Also taking the other complete overlaps into account, I am wondering about the consequences of completely dropping these from the bridge's resource list.

I'm leaning towards just reverting 4c5e242d3e93 and moving the pci_use_e820 check to arch_remove_reservations() as part of this non straight-forward revert.

WRT the build issues, this would require the arch/x86/include/asm/pci_x86.h + arch/x86/kernel/resource.c bits from:

https://lore.kernel.org/linux-pci/20211014183943.27717-2-hdegoede@redhat.com/

As well as the follow up build-fixes from:

https://lore.kernel.org/linux-pci/20211020102102.86577-1-hdegoede@redhat.com/

(no need to re-invent that wheel again)
Comment 10 Hans de Goede 2022-06-10 17:42:44 UTC
Created attachment 301150 [details]
[PATCH] x86/PCI: Revert: "Clip only host bridge windows for E820 regions"

Ok, here is what a revert of 4c5e242d3e93 would look like.

Guilherme G. Piccoli, if you can give this a test that would be great.
Comment 11 Guilherme G. Piccoli 2022-06-10 21:22:28 UTC
Created attachment 301154 [details]
dmesg (5.19-rc1) showing the failure

I was able to collect a dmesg for kernel 5.19-rc1 showing the issue in the Steam Deck - used pstore/ramoops for that, inducing an artificial panic.

Note that the file might not open in all editors; I could open it normally in vim.
Cheers!
Comment 12 Guilherme G. Piccoli 2022-06-10 21:43:01 UTC
(In reply to Hans de Goede from comment #10)
> Created attachment 301150 [details]
> [PATCH] x86/PCI: Revert: "Clip only host bridge windows for E820 regions"
> 
> Ok, here is what a revert of 4c5e242d3e93 would look like.
> 
> Guilherme G. Piccoli, if you can give this a test that would be great.


Hans, thanks for the details in the issue and for the revert patch.
I've tested, and it works fine - device booted normally with your patch on top of 5.19-rc1.
When submitting, feel free to add:

Reported-and-Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>


Also, if possible, please CC me on the mailing lists; I'm pretty interested in this issue.
Cheers!
Comment 13 Hans de Goede 2022-06-12 14:24:51 UTC
Thank you for capturing the log of the non-booting kernel, that was very useful. Things are indeed as I expected. From the non-boot log:

acpi PNP0A08:00: clipped [mem 0x000a0000-0x000bffff window] to [mem 0x000c0000-0x000bffff window] for e820 entry [mem 0x0009f000-0x000bffff]
acpi PNP0A08:00: clipped [mem 0x80000000-0xf7ffffff window] to [mem 0xa0100000-0xf7ffffff window] for e820 entry [mem 0xa0000000-0xa00fffff]
acpi PNP0A08:00: clipped [mem 0xfed45000-0xfed814ff window] to [mem 0xfed45000-0xfed7ffff window] for e820 entry [mem 0xfed80000-0xfed81fff]
acpi PNP0A08:00: clipped [mem 0xfed81900-0xfed81fff window] to [mem 0xfed82000-0xfed81fff window] for e820 entry [mem 0xfed80000-0xfed81fff]
acpi PNP0A08:00: clipped [mem 0xfedc0000-0xfedc0fff window] to [mem 0xfede0000-0xfedc0fff window] for e820 entry [mem 0xfedc0000-0xfeddffff]
acpi PNP0A08:00: clipped [mem 0xfedc6000-0xfedc6fff window] to [mem 0xfede0000-0xfedc6fff window] for e820 entry [mem 0xfedc0000-0xfeddffff]
acpi PNP0A08:00: ignoring host bridge window [mem 0x000c0000-0x000bffff window] (conflicts with PCI mem [mem 0x00000000-0xfffffffffff])
acpi PNP0A08:00: ignoring host bridge window [mem 0xfed82000-0xfed81fff window] (conflicts with PCI mem [mem 0x00000000-0xfffffffffff])
acpi PNP0A08:00: ignoring host bridge window [mem 0xfede0000-0xfedc0fff window] (conflicts with PCI mem [mem 0x00000000-0xfffffffffff])
acpi PNP0A08:00: ignoring host bridge window [mem 0xfede0000-0xfedc6fff window] (conflicts with PCI mem [mem 0x00000000-0xfffffffffff])

Which includes the predicted issues of the main window getting much smaller as well as some other windows getting completely removed.
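
The windows that end up removed can be picked out of such a log automatically: after clipping, their start address is above their end address. Below is a rough sketch of that check; the regular expression and the helper name `empty_after_clip` are assumptions made for this example and only target the "clipped" line format quoted above.

```python
import re

# Matches: clipped [mem A-B window] to [mem C-D window]
CLIPPED = re.compile(
    r"clipped \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+) window\] "
    r"to \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+) window\]"
)


def empty_after_clip(log_text):
    """Return the original windows whose clipped form has start > end."""
    dropped = []
    for match in CLIPPED.finditer(log_text):
        old_start, old_end, new_start, new_end = (int(v, 16) for v in match.groups())
        if new_start > new_end:
            dropped.append((old_start, old_end))
    return dropped


# Three of the lines from the failing boot, copied from the log above
log = """\
acpi PNP0A08:00: clipped [mem 0x000a0000-0x000bffff window] to [mem 0x000c0000-0x000bffff window] for e820 entry [mem 0x0009f000-0x000bffff]
acpi PNP0A08:00: clipped [mem 0x80000000-0xf7ffffff window] to [mem 0xa0100000-0xf7ffffff window] for e820 entry [mem 0xa0000000-0xa00fffff]
acpi PNP0A08:00: clipped [mem 0xfed81900-0xfed81fff window] to [mem 0xfed82000-0xfed81fff window] for e820 entry [mem 0xfed80000-0xfed81fff]
"""

for start, end in empty_after_clip(log):
    print(f"dropped: [mem {start:#010x}-{end:#010x} window]")
```

For the three sample lines this flags the 0x000a0000-0x000bffff and 0xfed81900-0xfed81fff windows, both of which also show up in the "ignoring host bridge window" lines, while the main window is shrunk but kept.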

Notice especially the log line showing the main window getting much smaller:

acpi PNP0A08:00: clipped [mem 0x80000000-0xf7ffffff window] to [mem 0xa0100000-0xf7ffffff window]

Which seems to cause issues later on when claiming resources for PCI devices already set up by the BIOS:

pci 0000:00:01.2: can't claim BAR 14 [mem 0x80600000-0x806fffff]: no compatible bridge window
pci 0000:00:01.3: can't claim BAR 14 [mem 0x80500000-0x805fffff]: no compatible bridge window
pci 0000:00:01.4: can't claim BAR 14 [mem 0x80400000-0x804fffff]: no compatible bridge window
pci 0000:00:08.1: can't claim BAR 14 [mem 0x80000000-0x803fffff]: no compatible bridge window
pci 0000:01:00.0: can't claim BAR 0 [mem 0x80600000-0x80603fff 64bit]: no compatible bridge window
pci 0000:02:00.0: can't claim BAR 0 [mem 0x80501000-0x80501fff]: no compatible bridge window
pci 0000:02:00.0: can't claim BAR 1 [mem 0x80500000-0x805007ff]: no compatible bridge window
pci 0000:04:00.0: can't claim BAR 5 [mem 0x80300000-0x8037ffff]: no compatible bridge window
pci 0000:04:00.3: can't claim BAR 0 [mem 0x80000000-0x800fffff 64bit]: no compatible bridge window
pci 0000:04:00.4: can't claim BAR 0 [mem 0x80100000-0x801fffff 64bit]: no compatible bridge window
pci 0000:03:00.0: can't claim BAR 2 [mem 0x80400000-0x8040ffff 64bit]: no compatible bridge window
pci 0000:04:00.1: can't claim BAR 0 [mem 0x803c0000-0x803c3fff]: no compatible bridge window
pci 0000:04:00.2: can't claim BAR 2 [mem 0x80200000-0x802fffff]: no compatible bridge window
pci 0000:04:00.2: can't claim BAR 5 [mem 0x803c4000-0x803c5fff]: no compatible bridge window
pci 0000:04:00.5: can't claim BAR 0 [mem 0x80380000-0x803bffff]: no compatible bridge window
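
A quick containment check shows why these claims fail. This is a simplified sketch under stated assumptions: the helper `contains` is hypothetical, and it compares each range against the root window only, whereas a device BAR is really claimed against the window of its parent bridge. That is still enough to see why the bridge windows (reported as BAR 14 here) cannot be claimed; the device BARs behind them presumably fail as a consequence.

```python
def contains(window, resource):
    """True if the inclusive resource range lies entirely inside the window."""
    return window[0] <= resource[0] and resource[1] <= window[1]


original_window = (0x80000000, 0xF7FFFFFF)  # before clipping
clipped_window = (0xA0100000, 0xF7FFFFFF)   # after clipping

# A few of the BIOS-assigned ranges from the "can't claim" lines above
bars = {
    "0000:00:01.2 BAR 14": (0x80600000, 0x806FFFFF),
    "0000:00:08.1 BAR 14": (0x80000000, 0x803FFFFF),
    "0000:04:00.3 BAR 0": (0x80000000, 0x800FFFFF),
}

for name, bar in bars.items():
    print(name,
          "fits original:", contains(original_window, bar),
          "fits clipped:", contains(clipped_window, bar))
```

Every sampled range fits the original window and none fits the clipped one, which matches the "no compatible bridge window" messages.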

Later on Linux does try to assign new resources to these, e.g. :

pci 0000:00:01.2: BAR 14: assigned [mem 0xa0100000-0xa01fffff]
pci 0000:00:01.3: BAR 14: assigned [mem 0xa0200000-0xa02fffff]
pci 0000:00:01.4: BAR 14: assigned [mem 0xa0300000-0xa03fffff]
pci 0000:00:08.1: BAR 14: assigned [mem 0xa0400000-0xa07fffff]

But I guess that changing these BARs for an already online PCI device might be confusing some things. Or this could break ACPI methods which do direct IO to these devices, assuming the BARs are unchanged from how they were set up by the BIOS.

It is still unclear to me what exactly here is causing the boot to not complete, but it is clear that commit 4c5e242d3e93 has a bunch of undesirable side-effects.

So I'm going to go ahead and submit my revert of it to Bjorn and then we will see from there.

> When submitting, feel free to add:
>
> Reported-and-Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>

Ack, will do.

Also Bjorn, note this has already been successfully build-tested by the lkp infra as part of my pdx86/review-hans branch:

"""
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git review-hans
branch HEAD: 8c99ffaa7c4b0b81e7bf18933d530dddc9ff0f85  x86/PCI: Revert: "Clip only host bridge windows for E820 regions"

elapsed time: 898m

configs tested: 91
configs skipped: 3

The following configs have been built successfully.
"""

So I believe this won't introduce any build issues, even with configs with CONFIG_PCI unset.
Comment 14 Bjorn Helgaas 2022-06-14 17:29:31 UTC
Created attachment 301176 [details]
v5.19-rc2 dmesg from Intel MID platform with 4c5e242d3e93 reverted (working)

Andy Shevchenko <andy.shevchenko@gmail.com> reported a similar failure on Intel MID platforms.  This is a dmesg log with 4c5e242d3e93 reverted for further debugging.

Andy's report: https://lore.kernel.org/r/20220613201641.67640-1-andriy.shevchenko@linux.intel.com
Comment 15 Jongman Heo 2022-06-15 00:27:33 UTC
Created attachment 301178 [details]
v5.19-rc2 dmesg from VMWare Linux guest with fix (working)


I have the same issue. It happens with my VMWare Linux guest.

Bisected to the same commit, 4c5e242d3e93.


VMWare Workstation 15 Pro (15.5.7)

Host : Windows 10 
Guest : Fedora 36 + vanilla kernel

The attached dmesg log was obtained with:
  commit 018ab4fa ("netfs: fix up netfs_inode_init() docbook comment"), which is the latest Linus git, + patch (from Hans)

I couldn't get a log when the boot fails (I just tried removing 'rhgb quiet' from the cmdline).
Comment 16 Bjorn Helgaas 2022-06-17 17:35:59 UTC
Created attachment 301194 [details]
v5.19-rc1 dmesg failure in VMWare Fusion VM

Reported by Benjamin Coddington:
https://lore.kernel.org/r/1B1241E1-6C7E-42CF-9690-1F47E9F3A6B2@redhat.com
Comment 17 Bjorn Helgaas 2022-06-17 17:37:20 UTC
Created attachment 301195 [details]
v5.19-rc1 dmesg VMWare Fusion VM workaround with pci=no_e820

Again from Benjamin Coddington
Comment 18 Bjorn Helgaas 2022-06-17 21:14:46 UTC
Hans' revert is upstream: https://git.kernel.org/linus/a2b36ffbf5b6
and should appear in v5.19-rc3.