Bug 217814
| Summary: | r8169 transmit queue 0 timed out (after upgrade from 5.x to 6.x) | | |
|---|---|---|---|
| Product: | Networking | Reporter: | Chris Heath (chris) |
| Component: | Other | Assignee: | Stephen Hemminger (stephen) |
| Status: | NEW --- | | |
| Severity: | high | CC: | bagasdotme, chris, hkallweit1, pablo.catalina, seanphaugh, vincas |
| Priority: | P3 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Chris Heath
2023-08-23 00:50:16 UTC
i believe i was on kernel 5.15 before the upgrade and before that on 5.4 (been running mint on this system for a number of years now and upgrading it via the mint upgrader when prompted)

also submitted to ubuntu launchpad: https://bugs.launchpad.net/ubuntu/+bug/2032706

See comment on the bugzilla start page: Downstream kernels aren't supported here. Please test with a (best self-compiled) mainline kernel.
Your lspci output misses relevant information, please use -vv as root.

[ 14.157624] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 14.157677] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
[ 14.162293] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off
[ 14.162428] r8169 0000:02:00.0 enp2s0: Link is Down
[ 17.671377] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full - flow control off

This looks strange and may indicate a physical issue. Please check RJ45 ports on both sides and cabling. Best also test with another link partner.

You can try the following to rule out ASPM-related issues:
Disable l1_1_aspm and if that doesn't help l1_aspm under /sys/class/net/enp2s0/device/link.

Best of course would be if you bisect the issue between last known good kernel and first troublesome.

I've already tried different cables, and bypassed my unmanaged switch using the cable that feeds it, so I'm pretty sure it's not in my infra (although the port itself could be going bad i guess?)

Here's the lspci -vv:

02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
    Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 16
    Region 0: I/O ports at 3000 [size=256]
    Region 2: Memory at cc904000 (64-bit, non-prefetchable) [size=4K]
    Region 4: Memory at cc900000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [70] Express (v2) Endpoint, MSI 01
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
        DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 4096 bytes
        DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s (ok), Width x1 (ok) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+ 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled, AtomicOpsCtl: ReqEn-
        LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
        Vector table: BAR=4 offset=00000000
        PBA: BAR=4 offset=00000800
    Capabilities: [100 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [140 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed- WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
            Status: NegoPending- InProgress-
    Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
    Capabilities: [170 v1] Latency Tolerance Reporting
        Max snoop latency: 3145728ns
        Max no snoop latency: 3145728ns
    Capabilities: [178 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns
        L1SubCtl2:
T_PwrOn=150us
Kernel driver in use: r8169
Kernel modules: r8169

OK, so ASPM L1 sub-states are disabled. Then:
- test with a mainline kernel
- disable ASPM L1 via sysfs
- bisect

The following from your log may also play a role:
[ 2809.974520] perf: interrupt took too long (2529 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 6003.062611] perf: interrupt took too long (3175 > 3161), lowering kernel.perf_event_max_sample_rate to 63000
[ 6781.127013] perf: interrupt took too long (3972 > 3968), lowering kernel.perf_event_max_sample_rate to 50250

(In reply to Heiner Kallweit from comment #5)
> OK, so ASPM L1 sub-states are disabled. Then:
> - test with a mainline kernel

To compile your own kernel (which is a requirement), see Documentation/admin-guide/quickly-build-trimmed-linux.rst in kernel sources (6.4 and later).

> - bisect
>

See Documentation/admin-guide/bug-bisect.rst.

ok so to disable l1_aspm?
sudo echo 0 > /sys/class/net/enp2s0/device/link/l1_aspm

also, a question re: building kernel and secure boot: linux mint supports secure boot and it is currently enabled... can I leave it enabled then when building my own kernel or should I disable it? (i'm thinking leave it enabled and disable in BIOS if I can't boot)

btw, thanks everyone for helping

i ended up adding pcie_aspm=off to my grub kernel parameters

looks promising so far... will have to wait a while to see if the nic crashes though

(In reply to Chris Heath from comment #7)
> ok so to disable l1_aspm?
> sudo echo 0 > /sys/class/net/enp2s0/device/link/l1_aspm
>

exactly

I'm still having this issue on Debian Sid with 6.6.13 kernel.

I believe last working was 6.1.27-1, I still have this kernel on hold.

Tried r8169 and r8168-dkms with same result.

pcie_aspm=off does not help, nor any other workaround tried, like disabling offloading and limiting to 100Mbps.

In my case it's Asus N551JM laptop with this ethernet card:
```
05:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
    Subsystem: ASUSTeK Computer Inc. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 19
    IOMMU group: 15
    Region 0: I/O ports at d000 [size=256]
    Region 2: Memory at f7814000 (64-bit, non-prefetchable) [size=4K]
    Region 4: Memory at f7810000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [70] Express (v2) Endpoint, MSI 01
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10W
        DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 4096 bytes
        DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1 TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+ 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled, AtomicOpsCtl: ReqEn-
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
        Vector table: BAR=4 offset=00000000
        PBA: BAR=4 offset=00000800
    Capabilities: [d0] Vital Product Data
        Not readable
    Capabilities: [100 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
    Capabilities: [170 v1] Latency Tolerance Reporting
        Max snoop latency: 71680ns
        Max no snoop latency: 71680ns
    Capabilities: [178 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns
        L1SubCtl2: T_PwrOn=10us
    Kernel driver in use: r8169
    Kernel modules: r8169
```

(In reply to Vincas Dargis from comment #10)
> I'm still having this issue on Debian Sid with 6.6.13 kernel.
>
> I believe last working was 6.1.27-1, I still have this kernel on hold.
>
> Tried r8169 and r8168-dkms with same result.
>
> pcie_aspm=off does not help, nor any other workaround tried, like disabling
> offloading and limiting to 100Mbps.
>

You can try disabling ASPM in BIOS or disabling ASPM L1 via sysfs.
And if it worked with an earlier kernel, please bisect.
However, if the vendor driver also fails, it may indicate a hw issue.

Hello, I'm also having this issue on Gentoo with 6.6.21:

Linux akita 6.6.21-gentoo-dist #1 SMP PREEMPT_DYNAMIC Tue Apr 2 22:51:19 CDT 2024 x86_64 AMD Ryzen 9 5900X 12-Core Processor AuthenticAMD GNU/Linux

---

05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
    Subsystem: Gigabyte Technology Co., Ltd RTL8125 2.5GbE Controller
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 24
    IOMMU group: 23
    Region 0: I/O ports at f000 [size=256]
    Region 2: Memory at fc600000 (64-bit, non-prefetchable) [size=64K]
    Region 4: Memory at fc610000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: r8169
    Kernel modules: r8169, r8125

---

[ 34.898835] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: Downshift occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
[ 34.898849] r8169 0000:05:00.0 enp5s0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx
[ 45.767203] ------------[ cut here ]------------
[ 45.767215] NETDEV WATCHDOG: enp5s0 (r8169): transmit queue 0 timed out 10057 ms
[ 45.767228] WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x232/0x240
[ 45.767237] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq nvidia_uvm(PO) nvidia_drm(PO) nvidia_modeset(PO) intel_rapl_msr intel_rapl_common 8021q garp mrp stp llc edac_mce_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mt7921e snd_usb_audio mt7921_common nvidia(PO) snd_hda_intel snd_usbmidi_lib mt792x_lib snd_intel_dspcfg snd_ump mt76_connac_lib snd_intel_sdw_acpi snd_rawmidi mt76 snd_hda_codec snd_seq_device btusb nls_iso8859_1 snd_hda_core mc btrtl mac80211 snd_hwdep kvm btintel snd_pcm vfat fat irqbypass btbcm libarc4 snd_timer r8125(O) btmtk razermouse(O) snd video gigabyte_wmi rapl wmi_bmof soundcore k10temp cfg80211 bluetooth pcspkr acpi_cpufreq i2c_piix4 rfkill r8169 joydev loop fuse nfnetlink crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 sp5100_tco ccp nvme nvme_core nvme_common wmi
[ 45.767411] CPU: 6 PID: 0 Comm: swapper/6 Tainted: P O 6.6.21-gentoo-dist #1
[ 45.767416] Hardware name: Gigabyte Technology Co., Ltd. X570S AORUS ELITE AX/X570S AORUS ELITE AX, BIOS F8 03/22/2024
[ 45.767419] RIP: 0010:dev_watchdog+0x232/0x240
[ 45.767424] Code: ff ff ff 48 89 df c6 05 9e a0 23 01 01 e8 c6 0b fa ff 45 89 f8 44 89 f1 48 89 de 48 89 c2 48 c7 c7 60 9f be 9c e8 de 35 24 ff <0f> 0b e9 2d ff ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90
[ 45.767429] RSP: 0018:ffffc900003c0e78 EFLAGS: 00010286
[ 45.767434] RAX: 0000000000000000 RBX: ffff8881060d4000 RCX: 0000000000000000
[ 45.767438] RDX: ffff88881ebae340 RSI: ffff88881eba1580 RDI: 00000000000400f6
[ 45.767441] RBP: ffff8881060d44c8 R08: 0000000000000000 R09: ffffc900003c0d00
[ 45.767444] R10: 0000000000000003 R11: ffffffff9cf46328 R12: ffff88810e284a00
[ 45.767447] R13: ffff8881060d441c R14: 0000000000000000 R15: 0000000000002749
[ 45.767451] FS: 0000000000000000(0000) GS:ffff88881eb80000(0000) knlGS:0000000000000000
[ 45.767455] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 45.767458] CR2: 0000559bb230db78 CR3: 000000010616c000 CR4: 0000000000f50ee0
[ 45.767462] PKRU: 55555554
[ 45.767465] Call Trace:
[ 45.767469] <IRQ>
[ 45.767472] ? dev_watchdog+0x232/0x240
[ 45.767477] ? __warn+0x81/0x130
[ 45.767485] ? dev_watchdog+0x232/0x240
[ 45.767490] ? report_bug+0x171/0x1a0
[ 45.767496] ? apic_mem_wait_icr_idle+0x14/0x20
[ 45.767503] ? handle_bug+0x3c/0x80
[ 45.767508] ? exc_invalid_op+0x17/0x70
[ 45.767512] ? asm_exc_invalid_op+0x1a/0x20
[ 45.767522] ? dev_watchdog+0x232/0x240
[ 45.767527] ? dev_watchdog+0x232/0x240
[ 45.767531] ? __pfx_dev_watchdog+0x10/0x10
[ 45.767536] call_timer_fn+0x27/0x130
[ 45.767543] ? __pfx_dev_watchdog+0x10/0x10
[ 45.767547] __run_timers+0x222/0x2c0
[ 45.767555] run_timer_softirq+0x1d/0x40
[ 45.767560] __do_softirq+0xd4/0x2c8
[ 45.767568] __irq_exit_rcu+0xa6/0xc0
[ 45.767574] sysvec_apic_timer_interrupt+0x72/0x90
[ 45.767579] </IRQ>
[ 45.767582] <TASK>
[ 45.767586] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 45.767590] RIP: 0010:cpuidle_enter_state+0xcc/0x440
[ 45.767595] Code: 4a 40 0b ff e8 c5 f3 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 83 47 0a ff 45 84 ff 0f 85 56 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[ 45.767599] RSP: 0018:ffffc900001b7e90 EFLAGS: 00000246
[ 45.767604] RAX: ffff88881ebb3e40 RBX: ffff888100ddc400 RCX: 000000000000001f
[ 45.767607] RDX: 0000000000000006 RSI: 00000000229837f7 RDI: 0000000000000000
[ 45.767610] RBP: 0000000000000002 R08: 0000000000000002 R09: 0000000000000401
[ 45.767613] R10: 0000000000000008 R11: ffff88881ebb27a4 R12: ffffffff9d077d40
[ 45.767616] R13: 0000000aa7eff1b9 R14: 0000000000000002 R15: 0000000000000000
[ 45.767625] cpuidle_enter+0x2d/0x40
[ 45.767631] do_idle+0x1d8/0x230
[ 45.767638] cpu_startup_entry+0x2a/0x30
[ 45.767642] start_secondary+0x11e/0x140
[ 45.767647] secondary_startup_64_no_verify+0x18f/0x19b
[ 45.767656] </TASK>
[ 45.767659] ---[ end trace 0000000000000000 ]---
[ 4316.702225] r8169 0000:05:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM

I've been losing my Internet connection every time my computer reboots. If it helps, I noticed that I can reliably get an Internet connection if I turn off my computer, unplug (power off) my unmanaged switch, then turn my computer on, and then plug the switch back in. I also noticed this happening in the dhcpcd logs:

Apr 02 19:15:13 akita systemd[1]: Started Lightweight DHCP client daemon.
Apr 02 19:15:14 akita dhcpcd[3282]: enp5s0: failed to request the lease
Apr 02 19:15:14 akita dhcpcd[3282]: enp5s0: soliciting a DHCP lease
Apr 02 19:15:18 akita dhcpcd[3282]: enp5s0: offered 192.168.1.243 from 192.168.1.254
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: failed to request the lease
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: soliciting a DHCP lease
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: offered 192.168.1.243 from 192.168.1.254
Apr 02 19:15:28 akita dhcpcd[3282]: enp5s0: failed to request the lease

I've tried the vendor driver (net-misc/r8125) and the latest available kernel (6.8.2 in testing, although I just noticed 6.8.3 is available) and neither seemed to help. Also tried setting pcie_aspm=off in my kernel options and that didn't seem to do anything.

I can try bisecting some time next week if it helps, although I'm not sure where to start since this driver has never worked for me. Prior to this I was using Debian 12 and having the same issue there.

> although I'm not sure where to start since this driver has never worked for me.
Please try between 6.1 and 6.2.
This Ethernet card worked for me up until 6.2.
My laptop is almost 10 years old; I've used it with Ubuntu, Debian stable and now Sid.
Only in Sid, with kernels around 6.2 and later, did I start to get issues.
Sorry, but I haven't found the energy yet to start bisecting myself.
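
For anyone picking up the bisect between 6.1 and 6.2 suggested above, starting the session could look roughly like the sketch below. It assumes a mainline git checkout that carries the `v6.1` and `v6.2` tags; `start_bisect` and the `~/linux` path are hypothetical names used only for illustration, and Documentation/admin-guide/bug-bisect.rst remains the authoritative guide.

```shell
# Start a bisect session in an existing kernel git checkout.
# $1: repository directory, $2: last known good ref, $3: first known bad ref.
# start_bisect is a hypothetical helper name, not part of git.
start_bisect() {
    git -C "$1" bisect start "$3" "$2"
}

# Example (assumed path and tags):
#   start_bisect ~/linux v6.1 v6.2
# Then build and boot the revision git checks out, test the NIC, and report:
#   git -C ~/linux bisect good    # no Tx timeout seen
#   git -C ~/linux bisect bad     # Tx timeout reproduced
# When git names the first bad commit, save the log and clean up:
#   git -C ~/linux bisect log
#   git -C ~/linux bisect reset
```

Since the timeout can take hours or days to show up, each "good" verdict needs a long enough soak test to be trustworthy.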
(In reply to Sean Haugh from comment #12)
> Hello, I'm also having this issue on Gentoo with 6.6.21:
> [...]
> [ 34.898835] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: Downshift
> occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
> [ 34.898849] r8169 0000:05:00.0 enp5s0: Link is Up - 100Mbps/Full
> (downshifted) - flow control rx/tx
> [...]
> I've been losing my Internet connection every time my computer reboots. If
> it helps, I noticed that I can reliably get an Internet connection if I turn
> off my computer, unplug (power off) my unmanaged switch, then turn my
> computer on, and then plug the switch back in. I also noticed this happening
> in the dhcpcd logs:
> [...]
> I've tried the vendor driver (net-misc/r8125) and the latest available
> kernel (6.8.2 in testing, although I just noticed 6.8.3 is available) and
> neither seemed to help. Also tried setting pcie_aspm=off in my kernel
> options and that didn't seem to do anything.
>

A downshift typically indicates a physical issue. Best re-test with different cable and different switch, or at least different switch port.
And ensure that firmware for the NIC is loaded. ethtool -i <if> will tell you.

I tried 6.8.3, 6.1.81, and 5.15.151 without any luck. The cable is unfortunately inside the wall, but I did try a different switch port; no luck there either. I can try moving my computer downstairs and connecting it directly to the ISP router with a better cable, but that'll have to wait for another day.

--
FYR:
driver: r8169
version: 6.6.21-gentoo-dist
firmware-version: rtl8125b-2_0.0.2 07/13/20
expansion-rom-version:
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

I'm using the upstream 6.8 kernel now (from git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git) and although I'm still getting downshifting, I'm not seeing the "transmit queue 0 timed out" message anymore.
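
The firmware check suggested above (`ethtool -i <if>`) can be scripted. The following is a minimal sketch that pulls the `firmware-version` field out of that output; it assumes an empty field means no firmware was loaded, and `firmware_version` is a hypothetical helper name rather than an ethtool feature.

```shell
# Print the firmware-version field from `ethtool -i` output read on stdin.
# Returns non-zero when the field is missing or empty.
# firmware_version is a hypothetical helper name.
firmware_version() {
    awk -F': ' '
        $1 == "firmware-version" { v = $2 }
        END { if (v == "") exit 1; print v }
    '
}

# Example (interface name taken from the report above):
#   ethtool -i enp5s0 | firmware_version || echo "no firmware version reported"
```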
I have a similar problem, detected since kernel 6.8, now using 6.9.1:

[ 1.342197] r8169 0000:02:00.0 eth0: RTL8168h/8111h, 84:47:09:0e:ba:2c, XID 541, IRQ 133
[ 1.342777] r8169 0000:02:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[ 1.363036] r8169 0000:03:00.0 eth1: RTL8168h/8111h, 84:47:09:0e:ba:2d, XID 541, IRQ 134
[ 1.363574] r8169 0000:03:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[ 1.833163] r8169 0000:02:00.0 enp2s0: renamed from eth0
[ 1.834953] r8169 0000:03:00.0 enp3s0: renamed from eth1
[ 10.932526] Generic FE-GE Realtek PHY r8169-0-200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-200:00, irq=MAC)
[ 11.138959] r8169 0000:02:00.0 enp2s0: Link is Down
[ 11.218894] r8169 0000:02:00.0 enp2s0: entered allmulticast mode
[ 11.219028] r8169 0000:02:00.0 enp2s0: entered promiscuous mode
[ 27.226344] Generic FE-GE Realtek PHY r8169-0-200:00: Downshift occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
[ 27.226478] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx
[108133.125832] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8587 ms
[108133.134207] r8169 0000:02:00.0 enp2s0: ASPM disabled on Tx timeout
[108133.158610] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108133.176558] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108143.152459] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 3: transmit queue 0 timed out 8987 ms
[108143.175917] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108143.197345] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108153.178824] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 9974 ms
[108163.205511] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8234 ms
[108163.224605] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108163.238986] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108173.018734] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8720 ms
[108173.041848] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108173.060345] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).

# ethtool -i enp2s0
driver: r8169
version: 6.9.1-i7
firmware-version: rtl8168h-2_0.0.2 02/26/15
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

I tried with different network wires and same problem.
It starts dumping the errors after some hours connected to the network. But not sure why, sometimes can be few days or only some minutes

I cannot disable aspm:
# find /sys/class/net/enp2s0/device/ | grep aspm || echo no aspm
no aspm

The network interface is dual, although I'm only using one of them.

A downshift always indicates a physical issue. It doesn't have to be the cable, the trouble could also be caused by e.g. the RJ45 port on either side.

After Debian Sid upgraded to 6.9.7-1, I no longer see this timeout (was still issue in 6.8.x)!

I have left computer running over night (with torrents and bitcoind and whatever running), and I see no any timeouts or crashes in journalctl -k, network still works fine!
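
To compare kernels over a long soak test, it may help to count the relevant messages instead of scanning the log by eye. Below is a minimal sketch that counts Tx watchdog timeouts and PHY downshift notices in a saved kernel log; the match strings are taken from the logs quoted in this thread, and `summarize_r8169_log` is a hypothetical helper name.

```shell
# Summarize r8169 trouble indicators in a saved kernel log.
# $1: path to a log file, e.g. produced by: journalctl -k > kernel.log
# summarize_r8169_log is a hypothetical helper name.
summarize_r8169_log() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    timeouts=$(grep -c 'transmit queue [0-9][0-9]* timed out' "$1" || true)
    downshifts=$(grep -c 'Downshift occurred' "$1" || true)
    echo "tx_timeouts=$timeouts downshifts=$downshifts"
}

# Example:
#   journalctl -k > kernel.log && summarize_r8169_log kernel.log
```

The timeout pattern matches both message layouts seen above, the older `NETDEV WATCHDOG: enp5s0 (r8169): ...` form and the newer per-device form.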
It seems I do still see this in logs:

```
Jul 07 17:18:09 kernel: r8169 0000:05:00.1 enp5s0f1: NETDEV WATCHDOG: CPU: 6: transmit queue 0 timed out 10484 ms
Jul 07 17:18:09 kernel: r8169 0000:05:00.1: can't disable ASPM; OS doesn't have ASPM control
Jul 07 17:18:09 kernel: r8169 0000:05:00.1 enp5s0f1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
```

But network is still usable. Still no "[ cut here ]" crashes printed.
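
The sysfs workaround discussed earlier in this thread (disable `l1_1_aspm`, then `l1_aspm`, under `/sys/class/net/<if>/device/link`) can be sketched as below. One caveat on the command quoted in the thread: when `sudo echo 0 > file` is run from a non-root shell, the redirection is performed by the calling shell rather than by root, so the write is typically refused; `echo 0 | sudo tee file` or a root shell avoids that. `aspm_disable` and the `SYSFS_NET` override are hypothetical names for illustration, and the function reports an error when the attribute does not exist, as one reporter found on their system.

```shell
# Write 0 to one ASPM link-state attribute of a network interface.
# $1: interface name, $2: attribute name (e.g. l1_1_aspm or l1_aspm).
# Run from a root shell. SYSFS_NET is a hypothetical override for dry runs;
# by default the real /sys/class/net tree is used.
aspm_disable() {
    attr="${SYSFS_NET:-/sys/class/net}/$1/device/link/$2"
    if [ ! -e "$attr" ]; then
        echo "no $2 attribute for $1" >&2
        return 1
    fi
    echo 0 > "$attr" && echo "$2: $(cat "$attr")"
}

# Example, from a root shell (interface name taken from the original report):
#   aspm_disable enp2s0 l1_1_aspm
#   aspm_disable enp2s0 l1_aspm
# One-off equivalent from a normal shell:
#   echo 0 | sudo tee /sys/class/net/enp2s0/device/link/l1_aspm
```

The setting does not persist across reboots, so it has to be reapplied before each test run.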