Bug 216428 - Thunderbolt 4 PCI Bridge Fails to Receive Proper PCI Resources
Summary: Thunderbolt 4 PCI Bridge Fails to Receive Proper PCI Resources
Status: NEEDINFO
Alias: None
Product: ACPI
Classification: Unclassified
Component: BIOS
Hardware: AMD Linux
Importance: P1 normal
Assignee: acpi_bios
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-08-31 01:08 UTC by Babak Rezai
Modified: 2024-08-20 20:43 UTC
CC List: 5 users

See Also:
Kernel Version: 5.19.5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg output without TB4 card (107.91 KB, text/plain)
2022-08-31 01:08 UTC, Babak Rezai
dmesg with TB4 card inserted, no kernel cmdline (119.41 KB, text/plain)
2022-08-31 01:09 UTC, Babak Rezai
dmesg output with TB4, and cmdline arguments (131.06 KB, text/plain)
2022-08-31 01:12 UTC, Babak Rezai
Increased hpmmio sizes, pci=hpmmiosize=4MB,hpmmioprefsize=64GB (219.04 KB, text/plain)
2022-09-01 02:14 UTC, Babak Rezai

Description Babak Rezai 2022-08-31 01:08:45 UTC
Created attachment 301701
dmesg output without TB4 card

Device Info:
Motherboard : MSI MEG-X570 Godlike (MS-7C34)
Processor : Ryzen 9 5950x
Graphics: Radeon VII ( Vega 20 )
Thunderbolt : Asus ThunderboltEx 4 (Maple Ridge 4C 2020)

kernel cmdline :
pcie_ports=native pci=assign-busses,hpbussize=0x33,realloc,hpmmiosize=128M,hpmmioprefsize=16G
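
For readers comparing the sizes used in this report, here is a minimal Python sketch that splits a `pci=` option string into its parts and converts the size suffixes to bytes. The helper names `parse_size` and `parse_pci_options` are made up for this illustration, and the parsing rules (binary K/M/G/T multipliers, a tolerated trailing `B`) are assumptions chosen to match the values quoted in this thread. It is not the kernel's own parser.

```python
import re

# Binary multipliers for the K/M/G/T suffixes (assumed, for illustration).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
_SIZE_RE = re.compile(r"(0x[0-9a-f]+|[0-9]+)([kmgt]?)b?", re.IGNORECASE)


def parse_size(text):
    """Convert a value such as '128M', '16G', '4MB' or '0x33' to an integer.

    Assumption: a trailing 'B' is accepted and ignored.
    """
    match = _SIZE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    digits, unit = match.groups()
    base = 16 if digits.lower().startswith("0x") else 10
    return int(digits, base) * _UNITS[unit.upper()]


def parse_pci_options(value):
    """Split the value of a 'pci=' parameter into a dict.

    Options without '=' (for example 'realloc') map to True.
    """
    options = {}
    for item in value.split(","):
        if not item:
            continue
        key, sep, val = item.partition("=")
        options[key] = val if sep else True
    return options
```

Passing the string from this report, `parse_pci_options("assign-busses,hpbussize=0x33,realloc,hpmmiosize=128M,hpmmioprefsize=16G")`, yields a dict whose `hpmmiosize` and `hpmmioprefsize` entries can then be fed to `parse_size`.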

Problem Description:

I am attempting to enable the TB4 card on my system and have discovered some issues that prevent me from initializing the card and plugging in an eGPU. However, I am unable to set kernel parameters that result in a successful boot.

With no kernel command-line parameters, the system boots successfully but there is no BAR address space to support plugging in an eGPU.
With the command-line parameters set, the boot fails and drops to an initramfs (rootfs) prompt. PCIe appears to be misconfigured: neither the SATA nor the NVMe controller shows up in dmesg.
Comment 1 Babak Rezai 2022-08-31 01:09:45 UTC
Created attachment 301702
dmesg with TB4 card inserted, no kernel cmdline
Comment 2 Babak Rezai 2022-08-31 01:12:09 UTC
Created attachment 301703
dmesg output with TB4, and cmdline arguments
Comment 3 Babak Rezai 2022-08-31 01:20:55 UTC
In case it helps, my PCIe sub-settings are configured in the BIOS as follows:
Re-size BAR support - Enabled
Above 4G memory/Crypto Currency mining - Enabled
PCI_E1 Gen switch - Auto
PCI_E2 Gen switch - Auto
PCI_E3 Gen switch - Auto
Chipset Gen switch - Auto
PCI_E1 Lanes Config - Auto
SR-IOV Support - Enabled
Comment 4 Mario Limonciello (AMD) 2022-08-31 19:49:24 UTC
Any chance you can contrast this to Windows on the same system?

> When I have no kernel command-line parameters, I have a successful boot but
> no BAR address space to support ePGU plugin.

I think that's the case with memory allocation today in Linux.  If you plug in your eGPU to the system before you boot up and don't set up any kernel options, does the BIOS do a better job at the memory allocation than Linux?
Comment 5 Alex Deucher 2022-08-31 20:18:00 UTC
I suspect there is not enough space reserved in the Thunderbolt PCI bridge window for hotplug devices. I believe Linux only reserves something like 256K for hotplug. Does adding `pci=hpmmiosize=4MB,hpmmioprefsize=64GB` to the kernel command line help?
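
One way to check how much space a bridge window actually received is to pull the `bridge window [mem ...]` lines out of a dmesg log and compute their sizes. The Python sketch below does that. The function name `bridge_windows`, the device address and the address ranges in `SAMPLE_DMESG` are hypothetical values made up for illustration and are not taken from the attachments in this bug.

```python
import re

# Matches lines such as:
#   pci 0000:03:00.0:   bridge window [mem 0xe0000000-0xefffffff 64bit pref]
_WINDOW_RE = re.compile(
    r"(?:pci|pcieport) (?P<dev>[0-9a-f:.]+):\s+bridge window "
    r"\[mem (?P<start>0x[0-9a-f]+)-(?P<end>0x[0-9a-f]+)(?P<flags>[^\]]*)\]"
)

# Hypothetical sample, NOT copied from this bug's attachments.
SAMPLE_DMESG = """\
[    0.500000] pci 0000:03:00.0:   bridge window [mem 0xfc900000-0xfc9fffff]
[    0.500001] pci 0000:03:00.0:   bridge window [mem 0xe0000000-0xefffffff 64bit pref]
"""


def bridge_windows(dmesg_text):
    """Return one dict per memory bridge window found in the log text."""
    windows = []
    for match in _WINDOW_RE.finditer(dmesg_text):
        start = int(match.group("start"), 16)
        end = int(match.group("end"), 16)
        windows.append({
            "device": match.group("dev"),
            "prefetchable": "pref" in match.group("flags").split(),
            "size": end - start + 1,  # the range is inclusive
        })
    return windows
```

Running `bridge_windows()` on the text of each attached dmesg log would show whether the non-prefetchable and prefetchable windows behind the Thunderbolt bridge grew after changing `hpmmiosize` and `hpmmioprefsize`.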
Comment 6 Babak Rezai 2022-09-01 02:11:52 UTC
That got it to boot, at least. The eGPU is another story, however: something appears to be hung and timing out.

[ 1260.072776] pci 0000:0b:00.0: Adding to iommu group 46
[ 1260.210153] AMD-Vi: Completion-Wait loop timed out
[ 1260.338703] AMD-Vi: Completion-Wait loop timed out
[ 1260.462120] AMD-Vi: Completion-Wait loop timed out
[ 1260.599125] AMD-Vi: Completion-Wait loop timed out
[ 1260.722275] AMD-Vi: Completion-Wait loop timed out
[ 1260.859516] AMD-Vi: Completion-Wait loop timed out
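
To quantify how often the timeout repeats, the excerpt above can be summarized with a short Python sketch. The function name `summarize_timeouts` is hypothetical. `EXCERPT` holds the six timeout lines quoted in this comment, and the same function could be run over the full attached log.

```python
import re

_TIMEOUT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] AMD-Vi: Completion-Wait loop timed out"
)

# The six timeout lines quoted in this comment.
EXCERPT = """\
[ 1260.210153] AMD-Vi: Completion-Wait loop timed out
[ 1260.338703] AMD-Vi: Completion-Wait loop timed out
[ 1260.462120] AMD-Vi: Completion-Wait loop timed out
[ 1260.599125] AMD-Vi: Completion-Wait loop timed out
[ 1260.722275] AMD-Vi: Completion-Wait loop timed out
[ 1260.859516] AMD-Vi: Completion-Wait loop timed out
"""


def summarize_timeouts(dmesg_text):
    """Count the timeout messages and report the mean gap between them."""
    stamps = [float(t) for t in _TIMEOUT_RE.findall(dmesg_text)]
    if not stamps:
        return {"count": 0, "first": None, "last": None, "mean_gap": None}
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return {
        "count": len(stamps),
        "first": stamps[0],
        "last": stamps[-1],
        "mean_gap": mean_gap,
    }
```

For the excerpt, the messages arrive roughly every 0.13 seconds, starting about 0.14 seconds after the device at 0000:0b:00.0 is added to its IOMMU group.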
Comment 7 Babak Rezai 2022-09-01 02:14:24 UTC
Created attachment 301707
Increased hpmmio sizes, pci=hpmmiosize=4MB,hpmmioprefsize=64GB
Comment 8 Babak Rezai 2022-09-01 02:15:26 UTC
(In reply to Babak Rezai from comment #7)
> Created attachment 301707 [details]
> Increased hpmmio sizes, pci=hpmmiosize=4MB,hpmmioprefsize=64GB

I have attached the dmesg output containing the snippet in comment 6.
Comment 10 Mario Limonciello (AMD) 2024-08-20 20:43:47 UTC
In kernel 6.9 we changed the policy for the USB4 CM so that it will reset the router at bootup.  This typically helps this type of problem.  Can you still reproduce this on the latest 6.10.y?
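
To check whether a given kernel release is new enough to include that change, the version must be compared numerically rather than as a string, since "6.10" sorts before "6.9" as text. A minimal Python sketch follows. The helper names `kernel_version` and `is_at_least` are hypothetical, and the 6.9 threshold comes only from the comment above.

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '6.10.1-arch1-1'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least(release, minimum=(6, 9)):
    """Return True if the release is at or above the given (major, minor)."""
    return kernel_version(release) >= minimum
```

The release string can come from `uname -r` or from Python's `platform.release()`. The kernel this bug was reported against, 5.19.5, falls below the threshold.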
