Bug 82761

Summary: DMAR:[fault reason 06] PTE Read access is not set
Product: Drivers
Reporter: Ansa89 (ansalonistefano)
Component: Network
Assignee: drivers_network (drivers_network)
Status: REOPENED
Severity: normal
CC: alan, alex.williamson, szg00000, v
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.16.2
Tree: Mainline
Regression: No

Description Ansa89 2014-08-19 12:18:21 UTC
When I boot with the "intel_iommu=on" parameter, I get these errors repeated over and over again in dmesg:
---
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
---


lspci -vvs 05:00.0
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8169 PCI Gigabit Ethernet Controller (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL8169/8110 Family PCI Gigabit Ethernet NIC
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64 (8000ns min, 16000ns max), Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 19
        Region 0: I/O ports at c200 [size=256]
        Region 1: Memory at f7862000 (32-bit, non-prefetchable) [size=256]
        Expansion ROM at f7840000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel driver in use: r8169


lspci -vt
-[0000:00]-+-00.0  Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
           +-01.0-[01]--+-00.0  NVIDIA Corporation Device 0fc6
           |            \-00.1  NVIDIA Corporation Device 0e1b
           +-02.0  Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller
           +-14.0  Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller
           +-16.0  Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1
           +-1a.0  Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2
           +-1b.0  Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller
           +-1c.0-[02]--
           +-1c.1-[03]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller
           +-1c.3-[04-05]----00.0-[05]--+-00.0  Realtek Semiconductor Co., Ltd. RTL8169 PCI Gigabit Ethernet Controller
           |                            +-01.0  Realtek Semiconductor Co., Ltd. RTL8169 PCI Gigabit Ethernet Controller
           |                            \-02.0  Realtek Semiconductor Co., Ltd. RTL8169 PCI Gigabit Ethernet Controller
           +-1d.0  Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1
           +-1f.0  Intel Corporation H77 Express Chipset LPC Controller
           +-1f.2  Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode]
           \-1f.3  Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller


dmesg | grep IOMMU
Intel-IOMMU: enabled
dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020e60262 ecap f0101a
dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap c9008020660262 ecap f0105a
IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
IOMMU 0 0xfed90000: using Queued invalidation
IOMMU 1 0xfed91000: using Queued invalidation
IOMMU: software identity mapping for device 0000:00:00.0
IOMMU: software identity mapping for device 0000:00:01.0
IOMMU: software identity mapping for device 0000:00:14.0
IOMMU: software identity mapping for device 0000:00:16.0
IOMMU: software identity mapping for device 0000:00:1a.0
IOMMU: software identity mapping for device 0000:00:1b.0
IOMMU: software identity mapping for device 0000:00:1c.0
IOMMU: software identity mapping for device 0000:00:1c.1
IOMMU: software identity mapping for device 0000:00:1c.3
IOMMU: software identity mapping for device 0000:00:1d.0
IOMMU: software identity mapping for device 0000:00:1f.0
IOMMU: software identity mapping for device 0000:00:1f.2
IOMMU: software identity mapping for device 0000:00:1f.3
IOMMU: software identity mapping for device 0000:01:00.0
IOMMU: software identity mapping for device 0000:01:00.1
IOMMU: software identity mapping for device 0000:03:00.0
IOMMU: Setting RMRR:
IOMMU: Setting identity map for device 0000:00:02.0 [0xcb800000 - 0xcf9fffff]
IOMMU: Setting identity map for device 0000:00:14.0 [0xc8d17000 - 0xc8d24fff]
IOMMU: Setting identity map for device 0000:00:1a.0 [0xc8d17000 - 0xc8d24fff]
IOMMU: Setting identity map for device 0000:00:1d.0 [0xc8d17000 - 0xc8d24fff]
IOMMU: Prepare 0-16MiB unity mapping for LPC
IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]


OS: Debian Wheezy
Kernel: 3.16.1 (compiled by hand)

If you need more information, just ask.
Comment 1 Alex Williamson 2014-08-19 12:58:24 UTC
Does it work on 3.17-rc1?  Are all of the 8169 NICs on bus 05 up and running?  Please provide lspci -vv info for 04:00.0.
Comment 2 Ansa89 2014-08-19 16:37:17 UTC
1) I would prefer to stay on a stable kernel if possible (which commits in 3.17-rc1 would be relevant to this bug?).

2) Yes, all of the 8169 NICs are up and running.

3) lspci -vvs 04:00.0
04:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=04, secondary=05, subordinate=05, sec-latency=32
        I/O behind bridge: 0000c000-0000cfff
        Memory behind bridge: f7800000-f78fffff
        Secondary status: 66MHz+ FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [78] Power Management version 3
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] Express (v1) PCI/PCI-X Bridge, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ BrConfRtry-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <2us, L1 <2us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
        Capabilities: [c0] Subsystem: Micro-Star International Co., Ltd. Device 7758
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
Comment 3 Alex Williamson 2014-08-19 16:51:22 UTC
(In reply to Ansa89 from comment #2)
> 1) I would prefer stay on stable kernel if it's possible (which commits of
> 3.17-rc1 would be relevant for this bug?).

579305f iommu/vt-d: Update to use PCI DMA aliases
e17f9ff iommu/vt-d: Use iommu_group_get_for_dev()
104a1c1 iommu/core: Create central IOMMU group lookup/creation interface
Comment 4 Ansa89 2014-08-19 17:28:43 UTC
I will try 3.17-rc1 (hoping it's stable enough for a home server).
Comment 5 Ansa89 2014-08-19 21:24:43 UTC
Tested with 3.17-rc1: the errors are still there, but the spam rate seems lower than with 3.16.1 (with 3.16.1 the errors repeat many times and the count grows fast; with 3.17-rc1 the same errors repeat less often and the count seems to grow more slowly).

After ~10 minutes:
dmesg | grep -i dmar
ACPI: DMAR 0x00000000C8EA83F0 0000B8 (v01 INTEL  SNB      00000001 INTL 00000001)
dmar: Host address width 36
dmar: DRHD base: 0x000000fed90000 flags: 0x0
dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020e60262 ecap f0101a
dmar: DRHD base: 0x000000fed91000 flags: 0x1
dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap c9008020660262 ecap f0105a
dmar: RMRR base: 0x000000c8d17000 end: 0x000000c8d24fff
dmar: RMRR base: 0x000000cb800000 end: 0x000000cf9fffff
DMAR: No ATSR found
[drm] DMAR active, disabling use of stolen memory
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000 
DMAR:[fault reason 06] PTE Read access is not set


So the bug does not appear to be fixed in 3.17-rc1.
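
To compare the fault rate between kernels more objectively than by eye, the repeated entries can be counted per device and address. This is a minimal sketch, assuming the log format pasted above; the name `count_dmar_faults` is hypothetical, and the regular expression has only been matched against the lines in this report:

```python
import re
from collections import Counter

# Matches lines such as:
#   dmar: DMAR:[DMA Read] Request device [05:00.0] fault addr ff3f4000
FAULT_RE = re.compile(
    r"DMAR:\[DMA (Read|Write)\] Request device "
    r"\[([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] fault addr ([0-9a-f]+)"
)


def count_dmar_faults(log_text):
    """Return a Counter keyed by (access type, device, fault address)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts
```

Feeding it the output of `dmesg` after a fixed uptime on each kernel would give comparable numbers, for example `count_dmar_faults(text).most_common()`.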
Comment 6 Alex Williamson 2014-08-19 21:55:11 UTC
Ok, then it's probably not a result of the PCIe-to-PCI bridge since 05:00.0 is the correct requester ID for all the devices behind the bridge.  Unfortunately that means that the problem may not be fixable.  We're only seeing reads to a single address, which may mean the NIC is using that read to synchronize transaction ordering, ex. using a DMA read to flush a DMA write from the device.  If the NIC driver has visibility of this address, then it could attempt to do a coherent mapping for the device(s) to avoid the fault.  If it doesn't, then these NICs may simply be incompatible with the IOMMU.
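
To illustrate the requester ID point: a conventional PCI device behind a PCIe-to-PCI bridge does not carry its own ID upstream. The bridge tags the transaction with its secondary bus number and devfn 0. A rough sketch of that arithmetic follows; the helper names are made up for illustration and are not kernel functions:

```python
def requester_id(bus, dev, fn):
    """Pack bus/device/function into the 16-bit PCI requester ID."""
    return (bus << 8) | (dev << 3) | fn


def bridge_alias(secondary_bus):
    """Requester ID a PCIe-to-PCI bridge uses for devices on its secondary bus."""
    return requester_id(secondary_bus, 0, 0)


def format_bdf(rid):
    """Render a 16-bit requester ID in lspci-style bus:dev.fn notation."""
    return f"{rid >> 8:02x}:{(rid >> 3) & 0x1f:02x}.{rid & 0x7}"


# Bridge 04:00.0 has secondary bus 05, so 05:00.0, 05:01.0 and 05:02.0
# all reach the IOMMU tagged with the same alias.
alias = bridge_alias(0x05)
```

Here `format_bdf(alias)` gives the same 05:00.0 that appears in the fault messages, which is why the faulting NIC cannot be told apart from the other two on that bus.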

Are these 3 separate NICs plugged into PCI slots on the motherboard or is this a single triple-port card with embedded PCIe-to-PCI bridge?

You might be able to run the IOMMU in passthrough mode with iommu=pt r8169.use_dac=1, but note the warning in modinfo "use_dac:Enable PCI DAC. Unsafe on 32 bit PCI slot."  Unfortunately if you don't enable use_dac, then intel_iommu will ignore the passthrough option for these devices.

Also note that this problem has nothing to do with Virtualization/KVM.  Drivers/Network or perhaps Drivers/PCI would be a more appropriate classification.
Comment 7 Alex Williamson 2014-08-19 22:18:54 UTC
I'm guessing this might be the motherboard here: MSI ZH77A-G43

Since you're apparently trying to use VT-d on this system for KVM and therefore presumably device assignment, I'll note that you will never be able to successfully assign the conventional PCI devices separately between guests or between host and guests.  The IOMMU does not have the granularity to create separate IOMMU domains per PCI slot in this topology.  Also, some (all?) Realtek NICs have some strange backdoors to PCI configuration space that make them poor targets for PCI device assignment:

http://git.qemu.org/?p=qemu.git;a=commit;h=4cb47d281a995cb49e4652cb26bafb3ab2d9bd28
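
The resulting grouping can be inspected from sysfs, where each directory under /sys/kernel/iommu_groups lists the devices that share a group. This is a minimal sketch, assuming that layout is present (it only appears when an IOMMU is enabled); `list_iommu_groups` is an illustrative name:

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted list of devices in it."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU disabled or sysfs not mounted
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups
```

Devices that show up in the same group cannot be split between host and guest, so the three conventional PCI NICs would be expected to appear together.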
Comment 8 Ansa89 2014-08-20 07:50:01 UTC
(In reply to Alex Williamson from comment #6)
> Are these 3 separate NICs plugged into PCI slots on the motherboard or is
> this a single triple-port card with embedded PCIe-to-PCI bridge?

They are 3 separate NICs plugged into 3 separate PCI slots.


> You might be able to run the IOMMU in passthrough mode with iommu=pt
> r8169.use_dac=1, but note the warning in modinfo "use_dac:Enable PCI DAC.
> Unsafe on 32 bit PCI slot."  Unfortunately if you don't enable use_dac, then
> intel_iommu will ignore the passthrough option for these devices.

I tried using "intel_iommu=pt", but it didn't work (it resulted in VT-d being disabled).
However, with "intel_iommu=on iommu=pt" the errors remain (probably because I didn't add "r8169.use_dac=1").
I'm on a 64-bit system, but I think that has nothing to do with the "32 bit PCI slot" warning.
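
For reference, the parameter combination being tested can be checked against the running kernel's command line. This small sketch only parses the string; the function names are illustrative, and it does not verify what the kernel actually did with the options:

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {parameter: value} (None for bare flags)."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def passthrough_requested(cmdline):
    """True if both intel_iommu=on and iommu=pt are present."""
    params = parse_cmdline(cmdline)
    return params.get("intel_iommu") == "on" and params.get("iommu") == "pt"
```

On a live system the input would be the contents of /proc/cmdline, for example `passthrough_requested(open("/proc/cmdline").read())`.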


> Also note that this problem has nothing to do with Virtualization/KVM. 
> Drivers/Network or perhaps Drivers/PCI would be a more appropriate
> classification.

I searched for an "IOMMU" section, but it doesn't exist.
I will probably change the classification to "Drivers/PCI".



(In reply to Alex Williamson from comment #7)
> I'm guessing this might be the motherboard here: MSI ZH77A-G43

Yes, that is my motherboard.


> Since you're apparently trying to use VT-d on this system for KVM and
> therefore presumably device assignment, I'll note that you will never be
> able to successfully assign the conventional PCI devices separately between
> guests or between host and guests.  The IOMMU does not have the granularity
> to create separate IOMMU domains per PCI slot in this topology.  Also, some
> (all?) Realtek NICs have some strange backdoors to PCI configuration space
> that make them poor targets for PCI device assignment:

Yes, I'm trying to do device assignment, but not with those NICs: I want to pass only the nVidia PCIe VGA card to the guest, while all NICs (and the integrated VGA card) remain available to the host.
It would be nice if there were a way to exclude these NICs from the IOMMU (or something like that).

SIDE NOTE: the qemu commit talks about RTL8168, but I have real RTL8169 devices (the only RTL8168 device is the integrated NIC, and for that device I'm using the r8168 driver from Realtek, compiled by hand).
Comment 9 Alan 2014-08-21 15:29:13 UTC
If you are using an out-of-tree driver, then please take the bug up with the supplier. If you can duplicate it with the in-tree driver, then please re-open the bug.
Comment 10 Ansa89 2014-08-22 10:02:50 UTC
The problem is related to the 05:00.0 device (Realtek Semiconductor Co., Ltd. RTL8169 PCI Gigabit Ethernet Controller), which actually uses the in-tree r8169 driver.

The out-of-tree r8168 driver is used by the 03:00.0 device (Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller), which has nothing to do with the issue.

For testing purposes I also tried using only the in-tree r8169 driver for all devices, but the problem persists.
Comment 11 Ansa89 2014-09-11 16:50:01 UTC
The problem also persists with Linux 3.16.2.