Bug 198171

Summary: [AMD][X399] Inconsistent PCIe lane linking count
Product: Drivers Reporter: Barry G (barry)
Component: PCI Assignee: drivers_pci (drivers_pci)
Status: NEW ---    
Severity: normal    
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 4.15-rc3 Subsystem:
Regression: No Bisected commit-id:
Attachments: Output of lspci on a boot where the device links at x1
Output of lspci on a boot where the device links at x4
Dmesg output from a x4 link boot
Dmesg output from a x1 link boot

Description Barry G 2017-12-15 22:24:36 UTC
I have an AMD Threadripper system with an MSI X399 gaming carbon pro motherboard and a 1900X CPU.  When it boots, one of my cards (an Intel X550 NIC) sometimes link trains at x1 and sometimes at x4.  I have tried this card in various other (Intel-based) systems and have not experienced this issue.

I am uncertain whether this is a BIOS issue, a PCIe driver issue, or something else.  I am running the latest motherboard BIOS revision (V16 as of this writing).

In general, cold boots seem to come up with an x1 width in LnkSta and warm reboots with an x4 width.  This is not absolute, though: I have observed the inverse in both cases.
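
For anyone trying to tally cold versus warm boots without saving full lspci dumps, here is a minimal Python sketch that reads the negotiated and maximum link width from sysfs. It assumes the kernel exposes the current_link_width and max_link_width attributes for the device, and that the NIC sits in PCI domain 0000 (so 0c:00.0 becomes 0000:0c:00.0). The function name read_link_widths and the sysfs_root parameter are illustrative only.

```python
from pathlib import Path


def read_link_widths(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return (current_width, max_width) for a PCI device such as '0000:0c:00.0'.

    Assumes the kernel provides the current_link_width and max_link_width
    sysfs attributes for this device; raises FileNotFoundError otherwise.
    """
    dev = Path(sysfs_root) / bdf
    current = int((dev / "current_link_width").read_text().strip())
    maximum = int((dev / "max_link_width").read_text().strip())
    return current, maximum
```

Calling read_link_widths("0000:0c:00.0") once per boot and logging the result next to the boot type would show how often the link comes up narrower than its maximum.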

I will attach complete outputs but here are the highlights:
$ diff x550.{good,bad}
31c31
<               LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
---
>               LnkSta: Speed 8GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

Note that LnkCap always reports x4.
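
The LnkCap versus LnkSta comparison can also be done mechanically over a whole lspci dump. The following is a rough Python sketch, not part of the attached data: it assumes the text comes from `lspci -vv` (run with enough privilege to show the PCIe capability block) and that each device record starts on an unindented line. The helper name find_downtrained is made up for illustration.

```python
import re

# Matches "LnkCap:" / "LnkSta:" lines and captures the link width.
# The colon right after the name keeps LnkCap2/LnkSta2 lines from matching.
_LINK_RE = re.compile(r"^\s+(LnkCap|LnkSta):.*?Width x(\d+)")


def find_downtrained(lspci_text):
    """Return [(device, cap_width, sta_width)] where LnkSta width < LnkCap width."""
    results = []
    device = None
    cap = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device record, e.g. "0c:00.0 Ethernet ..."
            device = line.split(" ", 1)[0]
            cap = None
            continue
        m = _LINK_RE.match(line)
        if not m or device is None:
            continue
        kind, width = m.group(1), int(m.group(2))
        if kind == "LnkCap":
            cap = width
        elif cap is not None and width < cap:
            results.append((device, cap, width))
    return results
```

Fed the lspci output from an x1 boot, this should list every function whose negotiated width is below what it advertises, which makes it easy to see whether the upstream bridge and both NIC functions are affected together.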

$ diff lspci.all.{good,bad}
[00:03.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453 (prog-if 00 [Normal decode])]
295c295
<               LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
---
>               LnkSta: Speed 8GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-

[0b:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)]
1414c1414
<               HeaderLog: 04000001 0000200f 0b070000 b4456d62
---
>               HeaderLog: 04000001 0000210f 0b070000 119631a9

[0c:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)]
1454c1454
<               LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
---
>               LnkSta: Speed 8GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

[0c:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10G X550T (rev 01)]
1529c1529
<               LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
---
>               LnkSta: Speed 8GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-


This system is new and the video card that is currently in it requires the AMD DC patch set that was accepted in the 4.15-rc1 cycle.  As such, I have no prior data for this configuration.  I am open to installing another video card and trying older kernel versions if it would help.
Comment 1 Barry G 2017-12-15 22:25:57 UTC
Created attachment 261193
Output of lspci on a boot where the device links at x1
Comment 2 Barry G 2017-12-15 22:26:58 UTC
Created attachment 261195
Output of lspci on a boot where the device links at x4
Comment 3 Barry G 2017-12-15 22:28:37 UTC
Created attachment 261197
Dmesg output from a x4 link boot
Comment 4 Barry G 2017-12-15 22:29:53 UTC
Created attachment 261199
Dmesg output from a x1 link boot