Starting with kernel v4.19-rc1 NFS is unusable on some systems with hardware making use of the r8169 driver. For example, while there is some additional latency in automounting an NFS share, simple operations such as an 'ls' on a kernel source directory can take upwards of 8 minutes to complete. This takes a fraction of a second on a working kernel in my environment. Building from an NFS share is out of the question. The issue has been bisected to the following commit: commit a99790bf5c7f3d68d8b01e015d3212a98ee7bd57 (HEAD, refs/bisect/bad) Author: Kai-Heng Feng <kai.heng.feng@canonical.com> Date: Thu Jun 21 16:30:39 2018 +0800 r8169: Reinstate ASPM Support On Intel platforms (Skylake and newer), ASPM support in r8169 is the last missing puzzle to let CPU's Package C-State reaches PC8. Without ASPM support, the CPU cannot reach beyond PC3. PC8 can save additional ~3W in comparison with PC3 on a Coffee Lake platform, Dell G3 3779. This is based on the work from Chunhao Lin <hau@realtek.com>. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> The system in use is a Haswell based Asus B85M-G with NIC: dmesg: r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded r8169 0000:04:00.0 eth0: RTL8168g/8111g, 74:d0:2b:90:84:2b, XID 4c000800, IRQ 33 r8169 0000:04:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko] lspci: 04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c) Subsystem: ASUSTeK Computer Inc. Device 8554 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 18 Region 0: I/O ports at d000 [size=256] Region 2: Memory at f7c00000 (64-bit, non-prefetchable) [size=4K] Region 4: Memory at f0000000 (64-bit, prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [70] Express (v2) Endpoint, MSI 01 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq- RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 4096 bytes DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+ LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s (ok), Width x1 (ok) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE# AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled AtomicOpsCtl: ReqEn- LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [b0] MSI-X: Enable+ Count=4 Masked- Vector table: BAR=4 offset=00000000 PBA: BAR=4 offset=00000800 Capabilities: [d0] Vital Product Data pcilib: sysfs_read_vpd: read failed: Input/output error Not readable Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP+ BadDLLP+ Rollover- Timeout+ AdvNonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Capabilities: [140 v1] Virtual Channel Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 Arb: Fixed- WRR32- WRR64- WRR128- Ctrl: ArbSelect=Fixed Status: InProgress- VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01 Status: NegoPending- InProgress- Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00 Capabilities: [170 v1] Latency Tolerance Reporting Max snoop latency: 71680ns Max no snoop latency: 71680ns Kernel driver in use: r8169 Kernel modules: r8169 Connected to root port: 00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d4) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin C routed to IRQ 29 Bus: primary=00, secondary=04, subordinate=04, sec-latency=0 I/O behind bridge: 0000d000-0000dfff [size=4K] Memory behind bridge: f7c00000-f7cfffff [size=1M] Prefetchable memory behind bridge: 00000000f0000000-00000000f00fffff [size=1M] Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR- BridgeCtl: Parity- SERR- NoISA- VGA- VGA16+ MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00 DevCap: MaxPayload 128 bytes, PhantFunc 0 ExtTag- RBE+ DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq- RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 128 bytes DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend- LnkCap: Port #3, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp- LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s (downgraded), Width x1 (ok) TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt- SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise- Slot #2, PowerLimit 10.000W; Interlock- NoCompl+ SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg- Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock- SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock- Changed: MRL- PresDet- LinkState- RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible- RootCap: CRSVisible- RootSta: PME ReqID 0000, PMEStatus- PMEPending- DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd- AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd- AtomicOpsCtl: ReqEn- EgressBlck- LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit- Address: fee002b8 Data: 0000 Capabilities: [90] Subsystem: ASUSTeK Computer Inc. Device 8534 Capabilities: [a0] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Kernel driver in use: pcieport Based on the behavioral regression, this hardware combination is not compatible with the ASPM features enabled in the above commit. A workaround is to boot with pcie_aspm=off or specify an ASPM policy such as powersafe or performance, but users should not need to do this simply to have a functional commodity device. It seems that laptop power savings are being prioritized over general functionality. Please resolve or revert. Thanks
Has any progress been made on this issue? I, for one, will not progress past the 4.18 kernel/driver due to the severity of the impact on the 8168 driver.
The issue should be gone since 5.3.
I have a similar network regression with r8169 driver on a 5.3 and also on a 5.8 kernel. Because in my case i totally loose the network connection and the workaround with disabling ASPM doesn't help, i have created a new bugreport https://bugzilla.kernel.org/show_bug.cgi?id=209081 .