Bug 201069 - Devices behind Intel PCI bridge unusable after S3 resume (prefetch issues)
Summary: Devices behind Intel PCI bridge unusable after S3 resume (prefetch issues)
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-10 03:41 UTC by Daniel Drake
Modified: 2019-06-04 11:40 UTC (History)
4 users (show)

See Also:
Kernel Version: 4.18
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Asus products with nvidia GPU affected by this issue (12.03 KB, text/plain)
2018-09-10 03:41 UTC, Daniel Drake
Details
Win10 qemu-traced PCI bridge config space accesses (40.39 KB, text/plain)
2018-09-10 04:14 UTC, Daniel Drake
Details
Peter Wu's Windows qemu trace (14.59 KB, text/plain)
2018-09-10 04:20 UTC, Daniel Drake
Details
case 2; egpu working after resume (used it before suspend) (85.49 KB, text/plain)
2018-10-01 14:32 UTC, Thomas Martitz
Details
case 1; egpu NOT working after resume (NOT using it before suspend) (104.85 KB, text/plain)
2018-10-01 14:35 UTC, Thomas Martitz
Details

Description Daniel Drake 2018-09-10 03:41:36 UTC
Created attachment 278393 [details]
Asus products with nvidia GPU affected by this issue

On 38+ Intel-based Asus products, the nvidia GPU becomes unusable after S3 suspend/resume. The affected products include multiple generations of nvidia GPUs and Intel SoCs. After resume, nouveau logs many errors such as:

    fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04 [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
    DRM: failed to idle channel 0 [DRM]

Similarly, the nvidia proprietary driver also fails after resume (black screen, 100% CPU usage in Xorg process). We shipped a sample to Nvidia for diagnosis, and their response indicated that it's a problem with the parent PCI bridge (on the Intel SoC), not the GPU.

Runtime suspend/resume works fine, only S3 suspend is affected. It's not a regression (we don't know of any working kernel version).

We found a workaround: on resume, rewrite the Intel PCI bridge 'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In the cases that I checked, this register has value 0 and we just have to rewrite that value.

It's very strange that rewriting the exact same register value makes a difference, but it definitely makes the issue go away. It's not just acting as some kind of memory barrier, because rewriting other bridge registers does not work around the issue. There's something magic in this particular register. We have confirmed this on all the affected models we have in-hands (X542UQ, UX533FD, X530UN, V272UN).

I'm attaching the details of the 38 Asus devices that we have worked on that we believe are affected.
Comment 1 Daniel Drake 2018-09-10 03:44:50 UTC
Testing one of the proposed patches that implements this workaround, Thomas Martitz confirmed that it fixes an equivalent problem where the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unusable after S3 resume.

https://bugs.freedesktop.org/show_bug.cgi?id=105760#c65
Comment 2 Daniel Drake 2018-09-10 03:53:49 UTC
Separately, the Asus X441UAR laptop had an issue where the r8169 ethernet became broken after suspend/resume, a recent kernel regression. This was worked around by turning off MSI-X in commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e").

However, reverting that commit, it can be seen that the PCI_PREF_BASE_UPPER32 workaround also solves this issue.

Furthermore we have also recently seen the equivalent r8169 S3 resume issue on an Aimfor-tech laptop with RTL6186evl/8111evl. We had not got around to investigating/quirking it yet, but testing now, that issue is also solved by applying the PCI_PREF_BASE_UPPER32 workaround.

Based on that I would guess that this issue also solves the equivalent issue that was reported at https://bugzilla.redhat.com/show_bug.cgi?id=1580079 and lead to commit 7c53a722459c ("r8169: don't use MSI-X on RTL8168g").
Comment 3 Daniel Drake 2018-09-10 04:14:34 UTC
Created attachment 278395 [details]
Win10 qemu-traced PCI bridge config space accesses

To check Windows 10 behaviour:

Create events.txt with contents:
pci_cfg_read
pci_cfg_write

Then run qemu with a ioh3420 PCI-to-PCI bridge added and tracing enabled:

$ qemu-system-x86_64 -enable-kvm -M q35,accel=kvm -m 2G -vga qxl -cpu host -hda testimg.img -device ioh3420,id=rp1,bus=pcie.0,addr=1c.0,port=1 -trace events=events.txt

Filtered the output to show only the bridge accesses, and separated boot, suspend and resume - results are attached.

Notably during resume, the prefetch-related registers get rewritten:
  pci_cfg_write ioh3420 28:0 @0x24 <- 0xfeb0fea0
  pci_cfg_write ioh3420 28:0 @0x28 <- 0x0
  pci_cfg_write ioh3420 28:0 @0x2c <- 0x0

This happened even though there was nothing behind the bridge. Windows failed to resume in this test (black screen) but the traced register writes seem indicative enough.
Comment 4 Daniel Drake 2018-09-10 04:20:32 UTC
Created attachment 278397 [details]
Peter Wu's Windows qemu trace

Peter Wu also traced Win10 PCI bridge behaviour using his setup at https://github.com/Lekensteyn/acpi-stuff/tree/master/d3test

I'm archiving his results here, which also show the 'Prefetchable Base Upper 32 Bits' register being rewritten on resume.

https://marc.info/?l=linux-pci&m=153616336225386&w=2
Comment 5 Daniel Drake 2018-09-10 04:23:10 UTC
Examining the 38 Asus/nvidia systems that we have worked on, plus the 2 with we have on-hand with r8169 resume issues, the affected PCI bridges are:

Intel Skylake PCIe Controller (x16) [8086:1901] (rev 05)
Intel Skylake PCIe Controller (x16) [8086:1901] (rev 07)
Intel Device [8086:31d8] (rev f3)
Intel Celeron N3350/Pentium N4200/Atom E3900 Series PCI Express Port B #1 [8086:5ad6] (rev fb)
Intel Celeron N3350/Pentium N4200/Atom E3900 Series PCI Express Port A #1 [8086:5ad8] (rev fb)
Intel Sunrise Point-LP PCI Express Root Port [8086:9d10] (rev f1)
Intel Sunrise Point-LP PCI Express Root Port #5 [8086:9d14] (rev f1)
Intel Device [8086:9dbc] (rev f0)
Comment 6 Alex Deucher 2018-09-10 14:50:47 UTC
FWIW, this seems to affect some systems with AMD dGPUs as well:
https://bugs.freedesktop.org/show_bug.cgi?id=105760

This may also be the root cause behind the runtime resume failures for which we already added quirks:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c#n573
On some platforms you can still use the old AMD specific ATPX ACPI methods rather than the newer standard _PR3.  When that is possible, the quirk enables them instead.  Perhaps the ATPX methods handle this properly while _PR3 assumes the OS will?
Comment 7 Peter Wu 2018-09-10 22:00:24 UTC
For future reference I am including some related discussions:

Rewriting Intel PCI bridge prefetch base address bits solves nvidia graphics issues
https://lists.freedesktop.org/archives/nouveau/2018-August/030954.html

[PATCH] PCI: add prefetch quirk to work around Asus/Nvidia suspend issues
https://www.spinics.net/lists/linux-pci/msg75686.html
(an Asus-specific quirk to read+write the prefetch upper 32-bit addr register)

[PATCH] PCI: Reprogram bridge prefetch registers on resume
https://www.spinics.net/lists/linux-pci/msg75974.html
(a generic PCI patch that unconditionally reprograms mmio addresses for bridges on system resume)


Based on the Windows traces, I suggested an alternative approach: change the PCI config space restore logic to unconditionally write values in a certain order (as opposed to the current read + write if changed logic):
https://lkml.kernel.org/r/20180908121422.GA10676@al

In the preceding reply, Bjorn Helgaas provides some historical context for the current restore logic and acknowledged that unconditionally writing all registers  (in reverse order as Linux currently does) might be ok.
Comment 8 Alex Deucher 2018-09-10 22:13:27 UTC
(In reply to Peter Wu from comment #7)
> 
> [PATCH] PCI: Reprogram bridge prefetch registers on resume
> https://www.spinics.net/lists/linux-pci/msg75974.html
> (a generic PCI patch that unconditionally reprograms mmio addresses for
> bridges on system resume)
> 
> 
> Based on the Windows traces, I suggested an alternative approach: change the
> PCI config space restore logic to unconditionally write values in a certain
> order (as opposed to the current read + write if changed logic):
> https://lkml.kernel.org/r/20180908121422.GA10676@al
> 
> In the preceding reply, Bjorn Helgaas provides some historical context for
> the current restore logic and acknowledged that unconditionally writing all
> registers  (in reverse order as Linux currently does) might be ok.

I think one of these approaches makes the most sense.  IIRC, we've seen the issue on non-Intel systems too, so it probably affects many (all?) PCI bridges.
Comment 9 Daniel Drake 2018-09-11 03:07:17 UTC
(In reply to Alex Deucher from comment #6)
> This may also be the root cause behind the runtime resume failures for which
> we already added quirks:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/
> drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c#n573

It's worth a try, although on the Asus/nvidia setups that I have tested, runtime suspend/resume is already working fine, this issue was only biting the S3 resume case.
Comment 10 Daniel Drake 2018-09-11 03:09:37 UTC
Another interesting historical reference: this very issue was discussed two years ago in bug #116851, where Marvell wifi cards were unusable after S3 resume.
Comment 11 Thomas Martitz 2018-10-01 14:30:30 UTC
I checked the latest iteration of the patch (v3) and I get mixed results.

1) If I don't use my eGPU before suspend it fails to resume after suspend
2) If I do use my eGPU before suspend it sucessfully resumes as well

Using means running glxgears with DRI_PRIME. In case 2, a glxgears after resuming from suspend works fine. In case 1 I get a blank window (and error messages in dmesg).

As requested on the mailing list, I'll attach dmesg logs with debug in pci_restore_config_dword() enabled, plus the output of lspci -vt

lspci -vt:
-[0000:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
           +-02.0  Intel Corporation UHD Graphics 620
           +-04.0  Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
           +-14.0  Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
           +-14.2  Intel Corporation Sunrise Point-LP Thermal subsystem
           +-15.0  Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0
           +-15.1  Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1
           +-16.0  Intel Corporation Sunrise Point-LP CSME HECI #1
           +-1c.0-[01]----00.0  Advanced Micro Devices, Inc. [AMD/ATI] Lexa XT [Radeon PRO WX 3100]
           +-1c.3-[02]----00.0  Intel Corporation Wireless 8265 / 8275
           +-1c.4-[03-3b]--
           +-1d.0-[3c]----00.0  Toshiba America Info Systems Device 0116
           +-1f.0  Intel Corporation Intel(R) 100 Series Chipset Family LPC Controller/eSPI Controller - 9D4E
           +-1f.2  Intel Corporation Sunrise Point-LP PMC
           +-1f.3  Intel Corporation Sunrise Point-LP HD Audio
           +-1f.4  Intel Corporation Sunrise Point-LP SMBus
           \-1f.6  Intel Corporation Ethernet Connection (4) I219-V
Comment 12 Thomas Martitz 2018-10-01 14:32:54 UTC
Created attachment 278867 [details]
case 2; egpu working after resume (used it before suspend)
Comment 13 Thomas Martitz 2018-10-01 14:35:58 UTC
Created attachment 278869 [details]
case 1; egpu NOT working after resume (NOT using it before suspend)
Comment 14 Daniel Drake 2018-10-02 08:12:46 UTC
The related logs are identical same in both cases, it is restoring all the regs for the bridge:

[   44.734676] pcieport 0000:00:1c.0: restoring config space at offset 0x3c (was 0x100, writing 0x1ff)
[   44.734683] pcieport 0000:00:1c.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
[   44.734686] pcieport 0000:00:1c.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
[   44.734694] pcieport 0000:00:1c.0: restoring config space at offset 0x24 (was 0x10001, writing 0xa0119001)
[   44.734698] pcieport 0000:00:1c.0: restoring config space at offset 0x20 (was 0x0, writing 0xba30ba30)
[   44.734702] pcieport 0000:00:1c.0: restoring config space at offset 0x1c (was 0x20000000, writing 0x20003030)
[   44.734710] pcieport 0000:00:1c.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)

I suspect this is not a new issue with the latest patch, just a case that you did not detect when testing the earlier patches.

It's probably worth opening another bug report for this, hopefully AMD developers could help investigate further.

Note You need to log in before you can comment on or make changes to this bug.