Bug 200527
Summary: | Configurations with mismatched PCIe MPS settings | ||
---|---|---|---|
Product: | Drivers | Reporter: | Myron Stowe (myron.stowe) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | okaya |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 4.18 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | Proposed patch for issue (will post to linux-pci mail list) |
Description
Myron Stowe
2018-07-17 18:48:11 UTC
Created attachment 277403 [details]
Proposed patch for issue (will post to linux-pci mail list)
Seems like we can extend on the concept that commit 27d868b5e6c implemented in these situations, limiting to specific sub-topologies - root ports with downstream endpoint devices. In such scenarios, we can obtain the largest MPSS value the endpoint supports and set both its root port and itself to that setting. I've attached a patch that does such and with it applied to a 4.18.0-rc5 kernel, the afore mentioned scenario ends up with both the root port and endpoint devices getting set to a matching MPS value of 256 - pciehp 0000:00:01.2:pcie004: Slot(162): Card present pciehp 0000:00:01.2:pcie004: Slot(162): Link Up pci 0000:02:00.0: [144d:a822] type 00 class 0x010802 pci 0000:02:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit] pci 0000:02:00.0: Upstream bridge's Max Payload Size set to 256 (was 512, max 512) pci 0000:02:00.0: Max Payload Size set to 256 (was 128, max 256) iommu: Adding device 0000:02:00.0 to group 0 pci 0000:02:00.0: BAR 0: assigned [mem 0xf7400000-0xf7403fff 64bit] pcieport 0000:00:01.2: PCI bridge to [bus 02] I contemplated adding more error checking to the patch - checking the return value from the "pcie_set_mps(bridge, mpss)" call and possibly restoring the bridge's initial MPS setting if "rc = pcie_set_mps(dev, p_mps);" fails - but I could not imagine any situation where such would occur and decided against such. If anyone can point out possible cases where errors could occur, or in topologies where things may fail with such an approach, please let me know. Issue is there could be active traffic flowing from the other endpoints behind a switch while you are changing the root port's MPS. That's considered dangerous. If you are planning to do something like this, you need to somehow stop all traffic. (In reply to Sinan Kaya from comment #3) > Issue is there could be active traffic flowing from the other endpoints > behind a switch while you are changing the root port's MPS. That's > considered dangerous. > > If you are planning to do something like this, you need to somehow stop all > traffic. Hay Sinan, I'm intending to limit the case to boot and hot-plug of an endpoint device where the topology is limited to just a root port and the endpoint device. As such, there should be no other traffic flowing through the root port in question. I understand, and agree with, your concern but I don't intend for that type of topology to come into play with what I'm proposing - what am I missing? Oh. I didn't realize you were looking for a reduced topology. I have no objections for such a use case, then. Sinan, No problem at all, appreciate your "heads up" warning. I'll post to the list and if you see anything suspicious then please reply there. From testing by Dongdong Liu - I found a bug after applied the patch. The topology is as below. The 82599 netcard with two functions connect to RP. +-[0000:80]-+-00.0-[81]--+-00.0 Device 8086:10fb | | \-00.1 Device 8086:10fb 1. lspci -s BDF -vvv to get the value of device's MPSS , MPS and MRRS. RP (80:00.0): MPSS=512 MPS=512 MRRS=512 EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512 PF1(81:00.1): MPSS=512 MPS=512 MRRS=512 2. Enable SRIOV. echo 1 > /sys/devices/pci0000\:80/0000\:80\:00.0/0000\:81\:00.0/sriov_numvfs RP(80:00.0): MPSS=512 MPS=128 MRRS=512 ^^^ EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512 ^^^ PF1(81:00.1): MPSS=512 MPS=512 MRRS=512 ^^^ VF0(81:10.0): MPSS=128 MPS=128 MRRS=128 ^^^ The 82599 netcard PF (MPSS 512) and VF's MPSS (MPSS 128) are different. Then RP (MPS 128) will report Malformed TLP when PF0/PF1 has memory write operation with MPS 512. The 82599 netcard could work ok without the patch. The values of MPSS, MPS, MRRS are as below without the patch. RP(80:00.0): MPSS=512 MPS=512 MRRS=512 ^^^ EP PF0(81:00.0): MPSS=512 MPS=512 MRRS=512 ^^^ PF1(81:00.1): MPSS=512 MPS=512 MRRS=512 ^^^ VF0(81:10.0): MPSS=128 MPS=128 MRRS=128 ^^^ commit 9f0e89359775ee21fe1ea732e34edb52aef5addf Date: Mon Aug 13 12:19:46 2018 -0600 PCI: Match Root Port's MPS to endpoint's MPSS as necessary commit 3dbe97efe8bf450b183d6dee2305cbc032e6b8a4 Date: Mon Aug 13 12:19:39 2018 -0600 PCI: Skip MPS logic for Virtual Functions (VFs) |