Bug 196979 - IDT switch causes an ACS source validation violation
Summary: IDT switch causes an ACS source validation violation
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: x86-64 Linux
Importance: P1 high
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-18 15:37 UTC by James Puthukattukaran
Modified: 2017-09-18 15:46 UTC

See Also:
Kernel Version: 4.12
Subsystem:
Regression: No
Bisected commit-id:


Attachments
v7 of the patch. (6.26 KB, patch)
2017-09-18 15:43 UTC, James Puthukattukaran

Description James Puthukattukaran 2017-09-18 15:37:23 UTC
The IDT switch (IDT 89H32H8G3-YC, errata #36) incorrectly flags an ACS
source validation violation on the completion of a config read request to
an endpoint device, even though the PCI Express spec states that
completions are never affected by ACS source validation (PCI Express Base
Specification 3.1, Section 6.12.1.1). Here is the errata text:

"Item #36 - Downstream port applies ACS Source Validation to Completions
Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that
completions are never affected by ACS Source Validation. However,
completions received by a downstream port of the PCIe switch from a
device that has not yet captured a PCIe bus number are incorrectly
dropped by ACS source validation by the switch downstream port.

Workaround: Issue a CfgWr1 to the downstream device before issuing the
first CfgRd1 to the device. This allows the downstream device to capture
its bus number; ACS source validation no longer stops completions from
being forwarded by the downstream port. It has been observed that
Microsoft Windows implements this workaround already; however, some
versions of Linux and other operating systems may not."

IDT's suggested workaround is to issue a configuration write to the
downstream device before issuing the first config read. This allows the
downstream device to capture its bus number, which avoids the ACS
violation on the completion. To make sure the device is ready for config
accesses, the patch keeps the existing behavior of retrying config reads
until one succeeds, and only then issues the config write specified by
the errata. Because those initial config reads would themselves hit the
erratum, ACS SV is disabled for the duration of this sequence.
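
The sequence described above can be sketched as a small user-space
simulation. This is not the attached patch and not kernel code: struct
sim_port, the sim_* helpers, and the way the erratum is modeled (any
completion is dropped while ACS SV is on and the device has not captured
its bus number) are assumptions made for illustration. PCI_ACS_SV (0x0001)
matches the bit defined in Linux's pci_regs.h; 0x0001 is the vendor ID
value that signals Configuration Request Retry Status (CRS), and 0xFFFF
stands in for a read whose completion never arrived.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_ACS_SV     0x0001  /* Source Validation enable bit (ACS control) */
#define CRS_VENDOR_ID  0x0001  /* vendor ID seen while the device reports CRS */
#define NO_COMPLETION  0xFFFF  /* vendor ID seen when no completion arrives */

/* Hypothetical model of one switch downstream port and the device below it. */
struct sim_port {
    uint16_t acs_ctrl;            /* ACS control register of the port */
    bool     dev_has_bus_number;  /* set once the device sees a config write */
    int      reads_until_ready;   /* config reads that still return CRS */
    uint16_t dev_vendor_id;       /* real vendor ID of the device */
    int      dropped_completions; /* completions dropped by the modeled erratum */
};

/* Config read of the vendor ID through the port. */
static uint16_t sim_cfg_read_vendor(struct sim_port *p)
{
    assert(p != NULL);
    bool ready = (p->reads_until_ready == 0);

    if (!ready)
        p->reads_until_ready--;

    /* Modeled erratum: SV enabled + no captured bus number => drop. */
    if ((p->acs_ctrl & PCI_ACS_SV) && !p->dev_has_bus_number) {
        p->dropped_completions++;
        return NO_COMPLETION;
    }
    return ready ? p->dev_vendor_id : CRS_VENDOR_ID;
}

/* Config write; in this model a ready device captures its bus number. */
static void sim_cfg_write(struct sim_port *p)
{
    assert(p != NULL);
    if (p->reads_until_ready == 0)
        p->dev_has_bus_number = true;
}

/* Workaround order: SV off, poll reads, one write, restore SV. */
static uint16_t sim_probe_with_workaround(struct sim_port *p, int max_tries)
{
    assert(p != NULL);
    uint16_t saved_ctrl = p->acs_ctrl;
    uint16_t vendor = NO_COMPLETION;

    p->acs_ctrl &= (uint16_t)~PCI_ACS_SV;       /* 1. disable ACS SV */

    for (int i = 0; i < max_tries; i++) {       /* 2. read until success */
        vendor = sim_cfg_read_vendor(p);
        if (vendor != CRS_VENDOR_ID && vendor != NO_COMPLETION)
            break;
    }

    if (vendor != CRS_VENDOR_ID && vendor != NO_COMPLETION)
        sim_cfg_write(p);                       /* 3. let device latch bus number */

    p->acs_ctrl = saved_ctrl;                   /* 4. restore ACS SV */
    return vendor;
}
```

In this model, a plain read with ACS SV enabled on a fresh device never
returns the vendor ID, while the workaround sequence returns it with no
dropped completions and leaves ACS SV in its original state.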
Comment 1 James Puthukattukaran 2017-09-18 15:43:49 UTC
Created attachment 258461
v7 of the patch.
Comment 2 James Puthukattukaran 2017-09-18 15:46:54 UTC
Bjorn had asked for an investigation into how Windows implements this workaround:
https://lkml.org/lkml/2017/7/13/810


"I suspect instead that Windows does something slightly different in
enumeration that happens to avoid the problem.  Maybe it always does a
config write before the first config read.  Maybe there's something
else, like always leaving ACS SV off while enumerating.  Can you trace
a Windows boot in a VM that contains an IDT switch and figure out what
they're doing?

This just doesn't feel right.  Presumably IDT tested the workaround,
and if the workaround required ACS twiddling, they would have
mentioned that in the errata."

-------------------


I finally found a Windows expert in house to run the experiment for me and determine empirically what it means that the ACS workaround specified in the IDT errata is "implemented" in Windows.

It turns out that Windows does not enable ACS source validation on switches in a bare-metal environment. It enables ACS source validation only in a virtualized environment (Hyper-V). Furthermore, Hyper-V expects all devices to be present at boot and not physically hot-plugged. Then, using PowerShell or Hyper-V Manager, you can assign a device to a guest (discrete device assignment in Windows lingo). It is when this occurs that the ACS feature is enabled on the switch. However, since the device was already present and "configured" at boot, its source ID has already been latched, so the actual IDT issue is never exposed.

I found additional information on Hyper-V and discrete device assignment here:
https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/plan/plan-for-deploying-devices-using-discrete-device-assignment
