Bug 216217 - PCI e820 got screwed up, some systems completely fail to boot.
Summary: PCI e820 got screwed up, some systems completely fail to boot.
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: AMD Linux
Importance: P1 high
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-07-07 21:10 UTC by System Error
Modified: 2022-07-21 01:34 UTC
2 users

See Also:
Kernel Version: 5.19-rc4
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description System Error 2022-07-07 21:10:55 UTC
In the last few mainline kernels, the handling of e820 reservations vs PCI appears to have broken badly on one of my systems.

Kernels up to 5.16 boot just fine on this system. However, around approx. kernel 5.17 something changed in how PCI handles e820. Unfortunately it broke this system to the point that it FAILS TO BOOT with recent kernels.

It looks like PCI fails to reserve resources, which in turn causes numerous devices to fail to initialize, from the GPU to networking, so the system ends up without graphics and gets stuck in the middle of boot.

Details:
Investigation points to a commit around a2b36ffbf5b6ec301e61249c8b09e610bc80772f, which suggests there were changes in how PCI vs e820 is handled and that a boolean boot parameter was added to control the PCI e820 behavior. Unfortunately, with its default state, kernels COMPLETELY FAIL TO BOOT on this system, which is not an acceptable state of things.

I had to add "pci=no_e820" to the kernel command line as hinted; this restores sane PCI behavior and the system boots properly.
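For the record, my understanding of how such a toggle usually works (a rough sketch with illustrative names, not the exact mainline code) is that the pci= option string is tokenized and "use_e820"/"no_e820" simply flip a boolean that the x86 resource code consults later:

#include <linux/string.h>
#include <linux/types.h>

/* Rough sketch only: the option names are those mentioned above; the
 * variable and function names are illustrative, not the exact mainline
 * code. */
bool pci_use_e820 = true;	/* long-standing default: honor E820 reservations */

static void parse_pci_e820_token(const char *str)
{
	if (!strcmp(str, "use_e820"))
		pci_use_e820 = true;
	else if (!strcmp(str, "no_e820"))
		pci_use_e820 = false;
}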

The boot log produced with this override suggests I should report this condition so future kernels can handle it better.

Just in case... 
[    0.000000] DMI: Gigabyte Technology Co., Ltd. GA-990FXA-UD5/GA-990FXA-UD5, BIOS F5 08/29/2011

...

[    2.534896] ACPI: Interpreter enabled
[    2.534919] ACPI: PM: (supports S0 S3 S4 S5)
[    2.534921] ACPI: Using IOAPIC for interrupt routing
[    2.534960] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    2.534961] PCI: Ignoring E820 reservations for host bridge windows
[    2.534962] PCI: Please notify linux-pci@vger.kernel.org so future kernels can do this automatically
[    2.535123] ACPI: Enabled 9 GPEs in block 00 to 1F
[    2.539860] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    2.539867] acpi PNP0A03:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
[    2.540155] PCI host bridge to bus 0000:00

I could collect other details if necessary.
Comment 1 Hans de Goede 2022-07-08 08:53:52 UTC
This log line:

"PCI: Ignoring E820 reservations for host bridge windows"

Suggests that you are using a Fedora kernel which had a backported patch modifying the PCI bridge windows vs E820 reservations behavior.

This patch was reverted upstream and has also been dropped from recent Fedora kernels.

Can you please give the latest 5.18 Fedora kernels a try? These are already in the stable updates repo for Fedora. So on F35/F36 just a "dnf update 'kernel*'" should suffice to get a 5.18 kernel.

If you are still on Fedora 34, you can grab a 5.18 kernel here:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1997527

Here are some quick instructions for installing a kernel directly from koji (Fedora's build system):
https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt

If this does NOT help please attach full dmesg output, both from a boot of the latest 5.16 kernel where you did not need to pass pci=no_e820 as well as from the latest problematic kernel with pci=no_e820 on the kernel commandline.

###

I am a bit surprised you need to use pci=no_e820 though. pci=use_e820 has been the default behavior of Linux for 10 years now and the 5.18 Fedora kernels will also behave as if pci=use_e820 is passed, except that the whole patch adding the boolean to enable/disable this has been dropped...

Anyway, let's see what the 5.18 kernels do and go from there.

###

Note the 5.19 kernel will once again change the behavior to the behavior which is causing trouble for you, but only on systems with a BIOS year >= 2023 and I see your BIOS is from 2011 so 5.19 should work fine on your system.

If you want to give 5.19 a try already you can grab it here:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1996456
Comment 2 Hans de Goede 2022-07-08 08:56:18 UTC
> Note the 5.19 kernel will once again change the behavior to the behavior
> which is causing trouble for you, but only on systems with a BIOS year >=
> 2023 and I see your BIOS is from 2011 so 5.19 should work fine on your
> system.
>
> If you want to give 5.19 a try already you can grab it here:
> https://koji.fedoraproject.org/koji/buildinfo?buildID=1996456

I just realized this is not accurate: 5.19 will default to pci=no_e820 for systems with a BIOS year >= 2023, which is actually the behavior that seems to help you (which, as said, is weird).
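(For the curious, the idea behind the date-based default, as an illustrative sketch only and not the exact upstream code, is to key the behavior off the DMI BIOS release year, so older firmware like yours keeps the old E820-clipping behavior:)

#include <linux/dmi.h>
#include <linux/types.h>

/* Illustrative sketch only: keep clipping host bridge windows against
 * E820 reservations on pre-2023 firmware, skip it on newer firmware.
 * The function name is made up for illustration. */
static bool should_clip_against_e820(void)
{
	return dmi_get_bios_year() < 2023;
}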
Comment 3 System Error 2022-07-09 14:57:43 UTC
1) I do NOT use Fedora kernels. Actually, I don't use Fedora at all. So thanks for the tips, but unfortunately they don't help me much.

2) The goal of this bug report is to ensure the MAINLINE KERNEL behaves properly on my system and on similar systems, on ALL distros picking up these mainline kernels.

3) The mentioned line only appears if I manually force the "pci=no_e820" option via the boot loader (I use GRUB). I have to do it manually because otherwise the default kernel behavior in the last few mainline versions isn't good enough: the system completely fails to boot, failing to initialize numerous PCI devices. Most notably, the lack of GPU and networking means it isn't very useful. Kernels up to 5.16 were booting just fine, so the breaking change was introduced after that point.

4) This looks like a mainline kernel PCI regression: mainline kernels 5.16 and earlier were just fine.

5) I need pci=no_e820 on that system. Using pci=use_e820 leads to an unbootable system, just as if I had not specified anything at all.

6) This bug is reported against mainline at about 5.19-rc4. The exact git commit at the time of the build was a175eca0f3d747599f1fdfac04cc9195b71ec996. Somewhere around 5.17 or 5.18 something in the PCI vs e820 handling apparently changed and broke things to the extent that recent mainline kernels fail to boot this system. Whatever you said about reverting is only relevant if it happened after commit a175eca0f3d747599f1fdfac04cc9195b71ec996 (that's about 5.19-rc4, which is fairly close to the "current" mainline state of 5.19-to-be).

Theoretically, I could try to bisect mainline down to the exact commit that causes the problem, but that would take sizable time and the problem is mostly pinned down already: "some change past 5.16 mainline regarding e820 vs PCI". So if you can save me a full-fledged bisect session I'd be very grateful.

7) Any Fedora specifics don't apply to me. I can't make use of internal Fedora things either, and operate on mainline kernels "as is". Of course I can apply a custom patch on top of mainline here and now, but it shouldn't be assumed I have access to Fedora infrastructure, their tools, etc. Sorry, but I prefer other distro(s); here I'm only concerned with a mainline kernel SNAFU. Technically I build my own kernels and have been doing it for a very long time without major issues.

8) I think my BIOS date is more or less correct, using the actual current time, give or take UTC vs timezone-based storage. So it should show 2022 and roughly the current time.
Comment 4 Hans de Goede 2022-07-09 15:13:42 UTC
> 1) I do NOT use Fedora kernels. Actually, I don't use Fedora at all. So
> thanks for the tips, but unfortunately they don't help me much.

If you are not using Fedora-built kernels, where do the kernels you are using come from? Ah, I see below that you are building your own kernels, correct?

As you say 5.16 works and the relevant code for 5.16 indeed has none of the recent PCI-e820 related changes:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/arch/x86/kernel/resource.c?h=v5.16
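For reference, what that file does is clip E820-reserved ranges out of the host bridge windows before PCI resources are assigned, so anything the firmware marked in E820 can no longer be used to place BARs. A minimal sketch of the idea (illustrative, not the literal source):

#include <linux/ioport.h>
#include <asm/e820/api.h>

/*
 * Illustrative sketch of the clipping idea, not the literal
 * arch/x86/kernel/resource.c source: walk the E820 table and trim the
 * host bridge window so it no longer overlaps any E820 entry.
 * resource_clip() stands in for the static helper in that file which
 * shrinks the window around [start, end].
 */
static void clip_window_against_e820(struct resource *window)
{
	int i;

	for (i = 0; i < e820_table->nr_entries; i++) {
		struct e820_entry *entry = &e820_table->entries[i];

		resource_clip(window, entry->addr,
			      entry->addr + entry->size - 1);
	}
}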


5.17 did have some changes, but they were reverted. Were you perhaps seeing issues with 5.17-rc# builds? See:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/arch/x86/kernel/resource.c?h=v5.17

Also note that the patch merged for 5.17:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f7b4236f2040d19df1ddaf30047128b41e78de7

Does not have the "PCI: Ignoring E820 reservations for host bridge windows" message nor this is repond to/use pci=use_e820 / pci=no_e820 on the kernel commandline. Those were part of patches posted on the list. But not of anything merged before 5.19. So it looks like your builds are picking up some patches from somewhere which were never part of the mainline kernel?


5.18 did not have any changes related to this:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/arch/x86/kernel/resource.c?h=v5.18


And the current master does have some changes but those should be a no-op:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/arch/x86/kernel/resource.c

Note there was one troublesome change in there, but that has been reverted in 5.19-rc3 and you say you are still seeing this with 5.19-rc4 ...


> 8) I think my BIOS date is more or less correct, using the actual current time, give or take UTC vs timezone-based storage. So it should show 2022 and roughly the current time.

Sorry, I was unclear. I meant the date your BIOS was released, as given by "cat /sys/class/dmi/id/bios_date".
Comment 5 Hans de Goede 2022-07-09 15:14:56 UTC
Ugh, typos:

> Also note that the patch merged for 5.17:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f7b4236f2040d19df1ddaf30047128b41e78de7

> Does not have the "PCI: Ignoring E820 reservations for host bridge windows"
> message nor this is repond to/use pci=use_e820 / pci=no_e820 on the kernel
> commandline. Those were part of patches posted on the list. But not of
> anything merged before 5.19. So it looks like your builds are picking up some
> patches from somewhere which were never part of the mainline kernel?

Should read:

Also note that the patch merged for 5.17:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f7b4236f2040d19df1ddaf30047128b41e78de7

Does not have the "PCI: Ignoring E820 reservations for host bridge windows" message, nor does it respond to/use pci=use_e820 / pci=no_e820 on the kernel commandline. Those were part of patches posted on the list, but not of anything merged before 5.19. So it looks like your builds are picking up some patches from somewhere which were never part of the mainline kernel?
Comment 6 Hans de Goede 2022-07-09 15:19:28 UTC
Hmm, are you perhaps trying to say that you are seeing issues since 5.17 and that starting with 5.19-rc# you can now work around those by adding "pci=no_e820" to the kernel commandline?

What happens if you boot a vanilla mainline 5.17 with "pci=no_e820" added to the kernel commandline?

Also please provide logs as mentioned in comment 1:

"If this does NOT help please attach full dmesg output, both from a boot of the latest 5.16 kernel where you did not need to pass pci=no_e820 as well as from the latest problematic kernel with pci=no_e820 on the kernel commandline."
Comment 7 Hans de Goede 2022-07-09 15:29:12 UTC
So, reading your comments again: in your initial comment 0 you said:

"around approx kernel 5.17 something changed in regard how PCI handles e820"

And then in your latest comment 3 you said:

"This bug reported against Mainline at about 5.19-rc4"

"Somewhere about 5.17 or .18 something in regard of PCI vs e820 got apparently changed"

So upon reading this again I think that what you are trying to say is:

1. 5.16 works
2. You upgraded to 5.19-rc4 and then things stopped working; and passing pci=no_e820 fixes things?

And you never actually tried 5.17 / 5.18, but concluded from 5.19-rc4 being broken that things broke somewhere in the 5.17 - 5.19-rc4 timeframe?

###

As you can see, your "around approx. kernel 5.17" language is really confusing (for me at least). Can you please provide a full list of exactly which kernel versions you have tried, and for each version:

1. If it works normally
2. If it works with "pci=no_e820" on the kernel commandline
3. Full dmesg output directly after boot (if possible to gather this)
Comment 8 System Error 2022-07-09 22:52:32 UTC
Wow. Now it gets even more interesting. After building a couple of older kernels (5.17/5.18 releases), booting back and forth and trying to reproduce, I realized I can't reproduce any of this at all! Known-problematic scenarios suddenly started working, which is of course a bug reporter's worst nightmare.

Actually, all kernels now boot just fine, even the one I used to file this bug report. Both pci=use_e820 and pci=no_e820 suddenly work, which wasn't the case before: use_e820 was failing and no_e820 was booting OK. Now, somehow, BOTH options are okay, making the issue non-reproducible.

My wild guess at what could have happened to make all of this possible:
1) The kernel apparently defaults to use_e820 on this system.
2) I guess at some point the system firmware went haywire and started handing out garbage PCI resources (an erased flash area, maybe?). This happened to coincide with the mentioned changes. That would at least explain why pci=no_e820 worked around my issue: it got rid of the apparent garbage values, so PCI worked just fine.
3) I guess at some point while poking at the problem I did something that provoked the firmware into a hardware rescan or similar, which eventually fixed the BAR reservations etc. to the point where use_e820 started working just fine.

So in the end my wild guess is that it was a system firmware bug/glitch and I've been bothering the wrong people. What a shame.

When I had this problem, I had to use a serial cable to even get a boot log at all: due to the PCI issues the storage devices weren't in shape to write persistent logs, and together with the failing GPU init that meant I couldn't get a log in any friendly manner, so I resorted to a serial connection to capture the early boot logs. That's where I got the idea that PCI fails to come up.

At that point the e820-based PCI resource allocation reported numerous errors like these:
[    3.791783][    T1] pci 0000:00:12.0: can't claim BAR 0 [mem 0xfdffe000-0xfdffefff]: no compatible bridge window
...
[    3.805843][    T1] pci_bus 0000:00: Some PCI device resources are unassigned, try booting with pci=realloc

Overall, the numerous PCI resource reservation failures caused networking, GPU and storage to fail to initialise properly. Forcing pci=no_e820 got rid of the problem, BUT now even pci=use_e820 works just fine without spitting out the mentioned errors anymore. My best guess is the firmware re-allocated resources and rebuilt its tables, and the problem went away altogether.
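(For anyone wondering about the "no compatible bridge window" message: as far as I understand it, claiming a BAR means finding an upstream bridge/host bridge window that fully contains it, and if the windows were clipped the lookup simply fails. A rough sketch of that idea, illustrative rather than the exact drivers/pci code:)

#include <linux/errno.h>
#include <linux/ioport.h>
#include <linux/pci.h>

/*
 * Rough sketch of the claim path, illustrative and not the exact
 * drivers/pci code: a BAR can only be claimed if some upstream window
 * contains it.  When E820 clipping has shrunk the windows, no parent is
 * found and a "no compatible bridge window" message is the result.
 */
static int claim_bar(struct pci_dev *dev, int bar)
{
	struct resource *res = &dev->resource[bar];
	struct resource *parent = pci_find_parent_resource(dev, res);

	if (!parent) {
		pci_info(dev, "can't claim BAR %d %pR: no compatible bridge window\n",
			 bar, res);
		return -EINVAL;
	}

	return request_resource(parent, res);
}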

So, long story short, the best explanation I can think of is that I hit a firmware bug or system-level upset where some table in flash got corrupted but has now apparently self-recovered, making the whole thing non-reproducible, as even pci=use_e820 doesn't cause these errors anymore.

So, sorry for taking your time. I'll mark this as "RESOLVED - UNREPRODUCIBLE", assuming I've been fooled by a system firmware bug/glitch rather than a real kernel bug. It's just an unlucky, and very confusing, coincidence.
Comment 9 Hans de Goede 2022-07-10 13:19:20 UTC
> 1) The kernel apparently defaults to use_e820 on this system.

Yes that has been the default behavior for approx. 10 years or so now.

> 2) I guess at some point the system firmware went haywire and started
> handing out garbage PCI resources (an erased flash area, maybe?). This
> happened to coincide with the mentioned changes. That would at least
> explain why pci=no_e820 worked around my issue: it got rid of the
> apparent garbage values, so PCI worked just fine.

My guess is that you were seeing these issues with 5.19-rc1 or 5.19-rc2, which did have a change in PCI behavior that is known to cause issues like these:

[    3.791783][    T1] pci 0000:00:12.0: can't claim BAR 0 [mem 0xfdffe000-0xfdffefff]: no compatible bridge window
...
[    3.805843][    T1] pci_bus 0000:00: Some PCI device resources are unassigned, try booting with pci=realloc

This change has been reverted starting with 5.19-rc3, so I think the most likely explanation is that you were seeing these issues with 5.19-rc1 or -rc2. If you really want to know, you could try building those and see if the problem reproduces there.

Anyway, it is good to hear that everything is now working for you as it should.
Comment 10 Bjorn Helgaas 2022-07-11 21:42:12 UTC
Thanks for all your investigation.  If you could attach a complete dmesg log, we should be able to work out what happened and make sure everything is working as it should.
Comment 11 System Error 2022-07-21 01:31:43 UTC
Yes, I was curious. So I built 5.19-rc1 and I can confirm it DOES cause the PCI fallout.

But I'm still puzzled why I'm here at all, since I have at least some idea of how to report kernel bugs: I tested on more or less current mainline (about 5.19-rc4 at the time of filing the bug) and it also failed to init PCI by default. But something apparently changed and now it boots just fine.

I suspect I caused a PCI resource reallocation shuffle: to even get an idea why the system doesn't start up, with the GPU out and storage failing, I had to connect a serial cable to see any kernel progress at all, and of course I don't have RS232 readily hooked to the PC's back panel these days. So I had to access the motherboard's RS232 pin header directly (so yes, I know what a "serial wire" is). To make it easier to reach the pin header at the bottom of the board I had to unplug a card, and eventually booted without it. Once I had an idea what was going on I put the card back into its slot, and so on. I guess at that point the firmware could have rebuilt its PCI resource allocations, so some kernels actually started to behave, making the problem non-reproducible. But -rc1 still misbehaves.

So I got the impression it was a long-standing bug nobody was fixing, at which point I had little option but to investigate myself, even if it meant assembling a custom serial cable. But the whole story proved to be way more complicated than that, involving certain reverts, and something beyond that which I believe was a firmware glitch, since -rc4 and beyond now boot just fine by default.

To at least slightly make up for wasting your time, I've also re-tested this on 5.19-rc7 on the mentioned system and I'm pleased to confirm the upcoming 5.19 is going to be fine here as well. Either way, thanks to everyone involved; at this point all systems are operational and I'll be able to enjoy the new kernel.
Comment 12 System Error 2022-07-21 01:34:58 UTC
I'll close this bug and tentatively set "PATCH_ALREADY_AVAILABLE" due to the -rc1 vs -rc4 snafu on my side. I'm not really sure it's the technically correct reason, but it's the most appropriate option I can think of at this point.
