Created attachment 307307 [details] dmesg command output

The kernel log includes the error messages below when the kernel tries to assign the ROM BAR of a device on the Intel Birch Stream platform.
-->>
[ 15.058116] pci 0000:3b:00.0: ROM [mem size 0x00100000 pref]: can't assign; no space
[ 15.058121] pci 0000:3b:00.0: ROM [mem size 0x00100000 pref]: failed to assign
<<--
Created attachment 307308 [details] lspci command output
I checked the host bridge above device 3b:00.0. It reports its memory space as "pci_bus 0000:3a: root bus resource [mem 0xa2800000-0xaabbffff window]", so there is clearly enough space for the device behind it. The device topology is listed below.
-->
[0000:3a]-00.0 Intel Corporation Device 09a2
 +-00.1 Intel Corporation Device 09a4
 +-00.2 Intel Corporation Device 09a3
 +-00.4 Intel Corporation Device 0b23
 -02.0-[3b]----00.0 Broadcom / LSI MegaRAID 12GSAS/PCIe Secure SAS39xx
[0000:14]-00.0 Intel Corporation Device 09a2
 +-00.1 Intel Corporation Device 09a4
 +-00.2 Intel Corporation Device 09a3
 +-00.4 Intel Corporation Device 0b23
 +02.0[15]--
 +04.0[16]--
 -06.0-[17]--+-00.0 Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
 -00.1 Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
<--
Digging deeper, I compared how the kernel enumerates bridges 14:06.0 and 3a:02.0. Bridge 14:06.0 can offer enough memory space for its devices (17:00.0 and 17:00.1) because the OS assigns its window, and the OS does so because the bridge has no default value in its Memory Base/Memory Limit registers (offsets 20h/22h). Bridge 3a:02.0, on the other hand, does have default values there. Its windows are listed below: the bridge was programmed with only a single 1M memory window, and that 1M covers only the normal BAR, not the ROM BAR.
-->
[ 14.638597] pci 0000:3a:02.0: PCI bridge to [bus 3b]
[ 14.638601] pci 0000:3a:02.0: bridge window [io 0x6000-0x6fff]
[ 14.638605] pci 0000:3a:02.0: bridge window [mem 0xa2800000-0xa28fffff]
[ 14.638613] pci 0000:3a:02.0: bridge window [mem 0x24ffffd00000-0x24ffffefffff 64bit pref]
<--
Bridge 14:06.0, by contrast, could have its bridge window assigned or extended, as the message below shows.
-->
[ 15.057942] pci 0000:14:06.0: bridge window [mem 0x93000000-0x930fffff]: assigned
<--
I asked the UEFI team either to program bridge 3a:02.0 with a 2M memory window or, as with bridge 14:06.0, to leave the window unprogrammed and let the OS assign it. After evaluating this, they concluded that the OS should enlarge the 32-bit bridge memory window of 3a:02.0. Their reply is quoted below.
-->
Expansion ROM BARs can be from 2K to 16M, so UEFI reserving an extra 1M doesn't cover all sizes.

17:00.0/17:00.1 only have 256K oprom BARs, while 1M is the granularity of P2P bridge MMIO windows. So they are able to fit within the minimum MMIO reserved for the largest oprom BAR below the root.

14:06.0 didn't have any 32-bit window opened by UEFI, as 17:00.0/17:00.1 only had 64-bit BARs. So the OS programmed a 32-bit window to map the oprom BARs, even though they are not enabled.

3b:00.0 has a 1M oprom BAR and the OS is free to increase 3a:02.0's window by 1M if necessary. As 3b:00.0 also had a 32-bit BAR, 3a:02.0 already had a 32-bit window open, but the OS can enlarge it. Its parent root bridge (host bridge) does have 1M reserved for the largest oprom BAR below it.

The root cause is that the OS is trying to allocate resources to oprom BARs, which are not even enabled. As 32-bit MMIO space below 4GB is very limited, UEFI can't waste it on allocating all oprom BARs when they can't be used at the same time as the regular BARs.

From the PCIe spec: When Expansion ROM Enable is Set, the decoder is used for accesses to the Expansion ROM and device independent software must not access the Function through any other Base Address Registers or entries in the Enhanced Allocation Capability.
<--
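To make the sizing argument in the UEFI reply concrete, here is a minimal sketch of how large a bridge memory window has to be for a given set of BARs. It is only an illustration, not the kernel's or the firmware's algorithm: it assumes every resource is a naturally aligned power of two, packed largest-first so that no padding is needed, and the function name `window_size` is made up for this example.

```python
KIB = 1 << 10
MIB = 1 << 20  # granularity of a PCI-to-PCI bridge memory window


def window_size(resource_sizes):
    """Return the smallest 1 MiB-granular window that holds all resources.

    Simplifying assumption: sizes are powers of two and the resources are
    packed largest-first, so the total is just the sum rounded up to 1 MiB.
    """
    total = sum(resource_sizes)
    if total == 0:
        return 0  # nothing to map, the window can stay closed
    return (total + MIB - 1) // MIB * MIB


if __name__ == "__main__":
    # Two 256K ROM BARs (17:00.0 and 17:00.1) fit in the 1M minimum window.
    print(window_size([256 * KIB, 256 * KIB]) // MIB, "MiB")
    # A 1M regular BAR alone (what UEFI programmed for 3a:02.0).
    print(window_size([1 * MIB]) // MIB, "MiB")
    # The same 1M BAR plus the 1M ROM BAR of 3b:00.0.
    print(window_size([1 * MIB, 1 * MIB]) // MIB, "MiB")
```

Under these assumptions the first two cases need 1 MiB and the last one needs 2 MiB, which is why the 1M window on 3a:02.0 has no room left for the ROM BAR.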
I hardcoded the Linux kernel to change the Memory Limit register of bridge 3a:02.0 from the original 0xa280 to 0xa290, which enlarges the bridge MMIO window from its default 1M to 2M. With that change the ROM BAR gets the 1M of memory space it wants, and the "failed to assign" message is gone. So I wonder: could the OS automatically enlarge the MMIO window of bridge 3a:02.0? After all, host bridge 0000:3a has reserved much more than 2M, and much of the memory space under the host bridge is actually unused. Below is a snippet of my own debug output, just FYI.
-->>
[ 8.235462] pci 0000:3a:02.0: [8086:0db0] type 01 class 0x060400 PCIe Root Port
[ 8.235478] pci 0000:3a:02.0: BAR 0 [mem 0x24fffff00000-0x24fffff0ffff 64bit]
[ 8.235486] pci 0000:3a:02.0: PCI bridge to [bus 3b]
[ 8.235490] pci 0000:3a:02.0: bridge window [io 0x6000-0x6fff]
[ 8.235494] pci 0000:3a:02.0: 1-------pciconfig base:a2800000, limit:a2800000
[ 8.235497] pci 0000:3a:02.0: -bridge window [mem 0xa2800000-0xa29fffff], region.start:a2800000, end:a29fffff, reg>flags:200
[ 8.235501] pci 0000:3a:02.0: bridge window [mem 0xa2800000-0xa29fffff]
[ 8.235509] pci 0000:3a:02.0: bridge window [mem 0x24ffffd00000-0x24ffffefffff 64bit pref]
...
[ 8.648632] pci 0000:3b:00.0: ROM [mem 0xa2900000-0xa29fffff pref]: assigned
[ 8.648635] pci 0000:3a:02.0: PCI bridge to [bus 3b]
[ 8.648638] pci 0000:3a:02.0: bridge window [io 0x6000-0x6fff]
[ 8.648642] pci 0000:3a:02.0: start------pci_setup_bridge_mmio, reg: [mem 0xa2800000-0xa29fffff], res>flags:200
[ 8.648645] pci 0000:3a:02.0: read-------pci_setup_bridge_mmio, region.start:a2800000, end:a29fffff, composed-l:a290a280
[ 8.648648] pci 0000:3a:02.0: bridge window [mem 0xa2800000-0xa29fffff]
[ 8.648652] pci 0000:3a:02.0: bridge window [mem 0x24ffffd00000-0x24ffffefffff 64bit pref]
[ 8.648658] pci_bus 0000:3a: resource 4 [io 0x6000-0x7fff window]
[ 8.648660] pci_bus 0000:3a: resource 5 [mem 0xa2800000-0xaabbffff window]
[ 8.648662] pci_bus 0000:3a: resource 6 [mem 0x240000000000-0x24ffffffffff window]
[ 8.648664] pci_bus 0000:3b: resource 0 [io 0x6000-0x6fff]
[ 8.648666] pci_bus 0000:3b: resource 1 [mem 0xa2800000-0xa29fffff]
[ 8.648668] pci_bus 0000:3b: resource 2 [mem 0x24ffffd00000-0x24ffffefffff 64bit pref]
<<--
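For readers who want to check the register arithmetic, the sketch below decodes the 16-bit Memory Base/Memory Limit registers (offsets 20h/22h) into the address range shown in the log. It relies on the standard type 1 header layout, where bits 15:4 of each register carry address bits 31:20 and the low 20 bits of the limit are implied to be all ones. The helper names `decode_mem_window` and `split_dword` are invented for this example and are not kernel functions.

```python
def decode_mem_window(base_reg, limit_reg):
    """Decode the Memory Base/Limit registers into an inclusive (start, end)."""
    # Bits 15:4 hold address bits 31:20; the low four bits are masked off.
    start = (base_reg & 0xFFF0) << 16
    # The low 20 bits of the limit address are implicitly all ones.
    end = ((limit_reg & 0xFFF0) << 16) | 0xFFFFF
    return start, end


def split_dword(composed):
    """Split a 32-bit read at offset 20h into (base_reg, limit_reg)."""
    return composed & 0xFFFF, (composed >> 16) & 0xFFFF


if __name__ == "__main__":
    for limit in (0xA280, 0xA290):
        start, end = decode_mem_window(0xA280, limit)
        size_mib = (end - start + 1) >> 20
        print(f"limit={limit:#06x}: [mem {start:#010x}-{end:#010x}] {size_mib} MiB")
```

With base 0xa280, a limit of 0xa280 decodes to the original 1M window and a limit of 0xa290 to the 2M window. The `composed-l:a290a280` value in the debug output splits into exactly those two registers.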
Furthermore, the kernel log also contains other ROM-related errors, shown below. Should the kernel handle these better when a device has been left with an invalid 32-bit BAR base?
-->
[ 15.057766] pci 0000:17:00.0: ROM [mem 0xfffc0000-0xffffffff pref]: can't claim; no compatible bridge window
[ 15.057771] pci 0000:17:00.1: ROM [mem 0xfffc0000-0xffffffff pref]: can't claim; no compatible bridge window
[ 15.057776] pci 0000:3b:00.0: ROM [mem 0xfff00000-0xffffffff pref]: can't claim; no compatible bridge window
<--
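The "can't claim" messages can be understood as a simple containment check: the address left in the ROM BAR lies at the very top of the 32-bit space, outside every bridge window. The sketch below illustrates only that range comparison. The kernel's real claim logic also compares resource flags (for example prefetchable versus non-prefetchable), which is not modeled here, and the helper names are hypothetical.

```python
def parse_range(text):
    """Parse '0xfff00000-0xffffffff' into an inclusive (start, end) tuple."""
    start, end = (int(part, 16) for part in text.split("-"))
    return start, end


def range_size(resource):
    """Size in bytes of an inclusive (start, end) range."""
    return resource[1] - resource[0] + 1


def fits_in_window(resource, window):
    """True if the resource range lies entirely inside the window range."""
    return window[0] <= resource[0] and resource[1] <= window[1]


if __name__ == "__main__":
    window_1m = parse_range("0xa2800000-0xa28fffff")   # 3a:02.0 as set by UEFI
    window_2m = parse_range("0xa2800000-0xa29fffff")   # 3a:02.0 after enlarging
    stale_rom = parse_range("0xfff00000-0xffffffff")   # 3b:00.0 ROM, "can't claim"
    moved_rom = parse_range("0xa2900000-0xa29fffff")   # 3b:00.0 ROM, "assigned"

    print(hex(range_size(stale_rom)))            # size requested by the ROM BAR
    print(fits_in_window(stale_rom, window_1m))  # stale address is outside
    print(fits_in_window(moved_rom, window_2m))  # reassigned address is inside
```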
Hello, is there anyone who can respond to this issue, or am I in the wrong place?
Can you try with pci=realloc on the kernel command line? It instructs the kernel to redo the bridge window sizing automatically (for all bridges).

If that doesn't help, I've done some work in this area which improves Expansion ROM handling wrt. bridge window sizing (again, try with pci=realloc, as otherwise it likely won't do much): https://lore.kernel.org/linux-pci/20241216175632.4175-1-ilpo.jarvinen@linux.intel.com/T/#t

The general problem with defaulting automatically to a greedy reservation strategy, however, is that it's in fact possible to run out of resources at a later stage, and rewinding all the preceding greedy decisions at that point is not easy/practical.
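When testing this, it can be worth confirming that the option actually reached the kernel. The sketch below scans a command-line string for `pci=realloc`. It assumes the usual comma-separated form of `pci=` options and treats `realloc` and `realloc=on` as enabling and `realloc=off` as disabling, with later options overriding earlier ones. The function name is made up for this example.

```python
def realloc_requested(cmdline):
    """Return True if the kernel command line asks for pci=realloc."""
    enabled = False
    for token in cmdline.split():
        if not token.startswith("pci="):
            continue
        # pci= takes a comma-separated list, e.g. pci=nocrs,realloc
        for opt in token[len("pci="):].split(","):
            if opt in ("realloc", "realloc=on"):
                enabled = True
            elif opt == "realloc=off":
                enabled = False
    return enabled
```

On a running Linux system the string to pass in would typically be the contents of /proc/cmdline.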
OK, thanks for your reply. I am going to try your patches, and also your advice to add the kernel parameter pci=realloc.
Created attachment 307613 [details] dmesg with "pci=realloc" appended to the kernel command line

I tested with the RHEL10 kernel, Linux 6.12.0-47.el10.x86_64, with "pci=realloc" appended to the kernel command line. The error is still there.
-->>
[ 2.372740] pci 0000:3b:00.0: ROM [mem size 0x00100000 pref]: can't assign; no space
[ 2.372743] pci 0000:3b:00.0: ROM [mem size 0x00100000 pref]: failed to assign
<<--
If I apply your improvement patches, which patch should I apply? Is it "[PATCH 25/25] PCI: Rework optional resource handling"?
And when will the patches be merged into the upstream Linux kernel? Currently, I don't see them in v6.14-rc2.
Hi,

Unfortunately, you cannot apply just one of the patches. That series has lots of internal dependencies, so you should try with the entire series. But yes, 25/25 is the main change I expect to help your case when pci=realloc is used.

To get to a point where the problem could even be fixed, I had to do lots of supporting work first, and those efforts were further hampered by old low-quality code, which I also ended up cleaning up a lot; that is why there are so many patches in total. A few of the patches might not strictly be necessary but shouldn't be harmful either (most of them actually shouldn't make any functional changes at all).

Those patches have been submitted already (in December 2024), but it usually takes lots of time to get non-trivial PCI subsystem changes accepted. Perhaps if you can test the patches and confirm they also help in your case, it might help in getting them applied faster :-).
Created attachment 307620 [details] dmesg with your patch set applied

After applying the patch set, the bridge can enlarge its 32-bit MMIO range and offers enough space for the devices under it. However, the complaint messages "can't assign; no space" and "failed to assign" are still printed, which might confuse users.
-->>
[ 2.332258] pci_bus 0000:3a: max bus depth: 1 pci_try_num: 2
[ 2.332262] pci 0000:3b:00.0: ROM [mem size 0x00100000 pref]: can't assign; no space
[ 2.332264] pci 0000:3b:00.0: ROM [mem size 0x00100000 pref]: failed to assign
[ 2.332266] pci 0000:3a:02.0: PCI bridge to [bus 3b]
[ 2.332267] pci 0000:3a:02.0: bridge window [io 0x6000-0x6fff]
[ 2.332271] pci 0000:3a:02.0: bridge window [mem 0xa2800000-0xa28fffff]
[ 2.332273] pci 0000:3a:02.0: bridge window [mem 0x24ffffd00000-0x24ffffefffff 64bit pref]
[ 2.332277] PCI: No. 2 try to assign unassigned res
[ 2.332278] release child resource [mem 0xa2800000-0xa28fffff pref]
[ 2.332279] pci 0000:3a:02.0: resource 14 [mem 0xa2800000-0xa28fffff] released
[ 2.332280] pci 0000:3a:02.0: PCI bridge to [bus 3b]
[ 2.332283] pci 0000:3a:02.0: bridge window [mem 0x00100000-0x001fffff] to [bus 3b] add_size 100000 add_align 100000
[ 2.332285] pci 0000:3a:02.0: bridge window [mem 0xa2800000-0xa29fffff]: assigned
[ 2.332287] pci 0000:3b:00.0: BAR 4 [mem 0xa2800000-0xa28fffff pref]: assigned
[ 2.332291] pci 0000:3b:00.0: ROM [mem 0xa2900000-0xa29fffff pref]: assigned
[ 2.332292] pci 0000:3a:02.0: PCI bridge to [bus 3b]
[ 2.332293] pci 0000:3a:02.0: bridge window [io 0x6000-0x6fff]
[ 2.332296] pci 0000:3a:02.0: bridge window [mem 0xa2800000-0xa29fffff]
[ 2.332298] pci 0000:3a:02.0: bridge window [mem 0x24ffffd00000-0x24ffffefffff 64bit pref]
[ 2.332302] pci_bus 0000:3a: resource 4 [io 0x6000-0x7fff window]
[ 2.332303] pci_bus 0000:3a: resource 5 [mem 0xa2800000-0xaabbffff window]
[ 2.332304] pci_bus 0000:3a: resource 6 [mem 0x240000000000-0x24ffffffffff window]
[ 2.332305] pci_bus 0000:3b: resource 0 [io 0x6000-0x6fff]
[ 2.332306] pci_bus 0000:3b: resource 1 [mem 0xa2800000-0xa29fffff]
[ 2.332307] pci_bus 0000:3b: resource 2 [mem 0x24ffffd00000-0x24ffffefffff 64bit pref]
<<--
In addition, the following messages persist. Should we simply disregard them, or do they present an opportunity for improvement?
-->>
[ 2.332007] pci 0000:17:00.0: ROM [mem 0xfff80000-0xffffffff pref]: can't claim; no compatible bridge window
[ 2.332009] pci 0000:17:00.1: ROM [mem 0xfff80000-0xffffffff pref]: can't claim; no compatible bridge window
[ 2.332011] pci 0000:17:00.2: ROM [mem 0xfff80000-0xffffffff pref]: can't claim; no compatible bridge window
[ 2.332014] pci 0000:17:00.3: ROM [mem 0xfff80000-0xffffffff pref]: can't claim; no compatible bridge window
[ 2.332017] pci 0000:3b:00.0: ROM [mem 0xfff00000-0xffffffff pref]: can't claim; no compatible bridge window
[ 2.332020] pci 0000:5f:00.0: ROM [mem 0xffff0000-0xffffffff pref]: can't claim; no compatible bridge window
[ 2.332022] pci 0000:70:00.0: ROM [mem 0xffff0000-0xffffffff pref]: can't claim; no compatible bridge window
<<--
Great, thanks a lot for testing. If you feel confident enough, could you give your Tested-by tag for the series?

PCI resource allocation is done in steps, and those errors originate from the time when the bridge window configuration has not yet been adapted to fit all the downstream resources. The failed resource assignments are retried by the later steps. If you do:

grep ROM dmesg_apply_patchset.log | sort -k 4,4 -k2,2

...you'll find that all of those were later assigned successfully.

I'm considering adding trace events to resource fitting and assignment so it could be better tracked with the normal tracing tools, and possibly printing less by default. So I agree there's some room for improvement there, and hopefully we'll eventually get there.
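The grep/sort pipeline above groups the ROM messages by device and orders them by timestamp. As a rough Python equivalent, the sketch below does the same grouping and additionally reports the last ROM status seen per device. The regular expression is an assumption based on the message format in the logs attached to this bug, and the function names are made up for this example.

```python
import re

# Matches lines like:
# [ 2.332291] pci 0000:3b:00.0: ROM [mem 0xa2900000-0xa29fffff pref]: assigned
ROM_LINE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+pci\s+(?P<dev>[0-9a-f:.]+):\s+"
    r"ROM\s+\[(?P<res>[^\]]*)\]:\s+(?P<status>.*)"
)


def rom_events(dmesg_text):
    """Return (device, time, resource, status) tuples sorted by device, then time."""
    events = []
    for line in dmesg_text.splitlines():
        m = ROM_LINE.search(line)
        if m:
            events.append((m["dev"], float(m["time"]), m["res"], m["status"].strip()))
    return sorted(events, key=lambda e: (e[0], e[1]))


def final_status(dmesg_text):
    """Map each device to the status of its last ROM message."""
    latest = {}
    for dev, _time, _res, status in rom_events(dmesg_text):
        latest[dev] = status  # later events overwrite earlier ones
    return latest
```

Feeding it the full dmesg text and checking that every device ends in "assigned" gives the same answer as reading the sorted grep output by eye.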
OK, I tried to add "Tested-by" to all 24 patches and updated the cover letter with a note about my test environment and results. I'll send them to your email separately first. Would you please help review them? I don't know whether the content I added is correct. Once everything is ready, I'll send it upstream. Many thanks!
Hi,

Ah, my apologies for not explaining the process more. You wouldn't have needed to add them manually (and send the patches back to me). It's enough to reply to the cover letter of my series on the linux-pci mailing list and include your Tested-by on a separate line of the email if you want to give your tag to all of the patches (and mention that it covers patches 1-25). You can put the description in the reply as well. Then either the maintainer or I will pick the tag up from there, most likely me when I need to post v2.

If you're not subscribed to the linux-pci mailing list, you can get the mailbox file of the cover letter from the lore link I gave above, so you can directly reply to the email that appeared on the linux-pci list (click the "raw" link).

For some reason you refer to 24 patches even though there are 25 in the series, but I suppose that was just a typo?
Hi Ilpo,

Thanks for your guidance. I've sent the email out, including my test environment and test results. Unfortunately, I forgot to include the link to this bug there.
Created attachment 307649 [details] The dmesg of latest kernel 6.14.0-rc2 I'm attaching the dmesg log of the latest kernel version 6.14.0-rc2 for those who are interested.
Also, after I finished sending the reply from Outlook, I tested replying to your cover letter with the "git send-email" command. I only intended to send it to my other email address, but git send-email evidently sent it to everyone in that email thread. That's my fault, sorry about that!
Thanks a lot again for testing this! It seems Bjorn (the PCI subsystem maintainer) has now applied this series, so unless the patches cause some unforeseen breakage that cannot be easily resolved, they will likely get into 6.15 (and its release candidates once that time comes).

I've sent Bjorn a note so he can add the Closes tag pointing to this bugzilla entry and your Tested-by tag to the commits.
OK, got it! Appreciate it!