Bug 208725 - IOMMU timeouts on AMD Radeon Pro W5700
Summary: IOMMU timeouts on AMD Radeon Pro W5700
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-28 10:35 UTC by Kai-Heng Feng
Modified: 2020-12-08 01:19 UTC

See Also:
Kernel Version: mainline
Tree: Mainline
Regression: No


Attachments
dmesg (100.71 KB, text/plain)
2020-07-28 10:37 UTC, Kai-Heng Feng
lspci -vvnn (161.17 KB, text/plain)
2020-07-28 10:37 UTC, Kai-Heng Feng

Description Kai-Heng Feng 2020-07-28 10:35:07 UTC
An ATS quirk is required; otherwise the device doesn't work properly:
[    3.375841] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=63:00.0 address=0x42b5b01a0]
[    3.375845] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=63:00.0 address=0x42b5b01c0]

It's the same bug as [1] on a different model.

[1] https://gitlab.freedesktop.org/drm/amd/-/issues/1015
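For anyone checking whether their own log shows the same symptom, here is a minimal sketch that pulls the IOTLB_INV_TIMEOUT events out of dmesg text. The function name `find_iotlb_timeouts` is hypothetical, and the pattern is modelled only on the two lines quoted above; the optional PCI domain prefix (e.g. `0000:`) is an assumption about how other kernels might print the device, not something taken from this report.

```python
import re

# Pattern modelled on the dmesg lines quoted in this report.
# The optional "dddd:" PCI domain prefix is an assumption for other kernels.
EVENT_RE = re.compile(
    r"AMD-Vi: Event logged \[IOTLB_INV_TIMEOUT "
    r"device=(?P<device>(?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"address=(?P<address>0x[0-9a-fA-F]+)\]"
)


def find_iotlb_timeouts(dmesg_text):
    """Return a list of (device, address) tuples, one per timeout event.

    `device` is the PCI address string as printed in the log and
    `address` is the logged address converted to an int.
    """
    events = []
    for line in dmesg_text.splitlines():
        match = EVENT_RE.search(line)
        if match:
            events.append((match.group("device"), int(match.group("address"), 16)))
    return events
```

Feeding it the output of `dmesg` (or the attached dmesg file) gives one tuple per event, so the affected device can be compared against `lspci` output.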
Comment 1 Kai-Heng Feng 2020-07-28 10:37:26 UTC
Created attachment 290645
dmesg
Comment 2 Kai-Heng Feng 2020-07-28 10:37:48 UTC
Created attachment 290647
lspci -vvnn
Comment 3 nutodafozo 2020-08-26 18:44:54 UTC
I see the same on my RX 550 [1002:67ff], pci=noats also helps.
Could you also send another patch for rx550?
Comment 4 Kai-Heng Feng 2020-08-28 05:12:47 UTC
(In reply to nutodafozo from comment #3)
> I see the same on my RX 550 [1002:67ff], pci=noats also helps.
> Could you also send another patch for rx550?

Yea, please file a new bug and attach dmesg and `sudo lspci -vvnn`.
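When reporting a similar case, it helps to state whether the `pci=noats` workaround was active for a given boot. A rough sketch of that check follows, assuming the kernel command line is available as a string (on Linux it can be read from `/proc/cmdline`). The helper names `pci_options` and `ats_disabled` are hypothetical; the parsing relies on `pci=` taking a comma-separated option list.

```python
def pci_options(cmdline):
    """Collect the comma-separated options of every pci= parameter."""
    options = []
    for token in cmdline.split():
        if token.startswith("pci="):
            # A single pci= parameter may carry several options, e.g. pci=nomsi,noats
            options.extend(opt for opt in token[len("pci="):].split(",") if opt)
    return options


def ats_disabled(cmdline):
    """Return True if the command line asks the kernel not to use PCIe ATS."""
    return "noats" in pci_options(cmdline)


# Example usage on a running system:
#   with open("/proc/cmdline") as f:
#       print(ats_disabled(f.read()))
```

This only reports what was requested at boot; it does not tell whether a per-device ATS quirk was applied.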
Comment 5 Joris L. 2020-10-22 12:26:36 UTC
Also seen on an AMD Radeon VII (Vega 20):

 iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=30:00.0 address=0x7fb59fb70]

X370 chipset with a recent BIOS that includes AGESA 1.0.0.6.
Comment 6 Bjorn Helgaas 2020-12-07 20:56:46 UTC
Kai-Heng, when you mark things resolved/fixed/etc, can you please include some reference to the change that fixed it?  E.g., attach the patch here, include a URL to a git commit, etc?  Otherwise it's really hard to figure out exactly what the fix is and which kernels have it.
Comment 7 Kai-Heng Feng 2020-12-08 01:19:01 UTC
Sorry. It was fixed by commit 45beb31d3afb ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken")
