Bug 217814

Summary: r8169 transmit queue 0 timed out (after upgrade from 5.x to 6.x)
Product: Networking
Component: Other
Status: NEW
Severity: high
Priority: P3
Hardware: Intel
OS: Linux
Reporter: Chris Heath (chris)
Assignee: Stephen Hemminger (stephen)
CC: bagasdotme, chris, hkallweit1, pablo.catalina, seanphaugh, vincas
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Chris Heath 2023-08-23 00:50:16 UTC
After recently upgrading my system to kernel 6.2, I've been getting seemingly random network card crashes.

>lsb-release:
DISTRIB_ID=LinuxMint
DISTRIB_RELEASE=21.2
DISTRIB_CODENAME=victoria
DISTRIB_DESCRIPTION="Linux Mint 21.2 Victoria"

>uname -a:
Linux i5-ProDesk 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

>lspci:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
	Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: I/O ports at 3000 [size=256]
	Region 2: Memory at cc904000 (64-bit, non-prefetchable) [size=4K]
	Region 4: Memory at cc900000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: r8169
	Kernel modules: r8169

>dmesg:
[    9.210269] r8169 0000:02:00.0 eth0: RTL8168h/8111h, 40:b0:34:fb:e8:5a, XID 541, IRQ 133
[    9.210273] r8169 0000:02:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[    9.210507] usb 2-2: New USB device found, idVendor=1c04, idProduct=0013, bcdDevice=61.08
[    9.210511] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    9.210512] usb 2-2: Product: TR-004
[    9.210513] usb 2-2: Manufacturer: QNAP Systems, Inc.
[    9.210514] usb 2-2: SerialNumber: 51323043423037383430
[    9.242547] r8169 0000:02:00.0 enp2s0: renamed from eth0
[    9.292656] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
[    9.295636] usb-storage 2-2:1.0: USB Mass Storage device detected
[    9.295795] scsi host6: usb-storage 2-2:1.0
[    9.295851] usbcore: registered new interface driver usb-storage
[    9.305456] usbcore: registered new interface driver uas
[    9.379685] intel_tcc_cooling: Programmable TCC Offset detected
[    9.406737] intel_rapl_common: Found RAPL domain package
[    9.406740] intel_rapl_common: Found RAPL domain core
[    9.406742] intel_rapl_common: Found RAPL domain uncore
[    9.406744] intel_rapl_common: Found RAPL domain dram
[    9.426136] snd_hda_intel 0000:00:1f.3: enabling device (0100 -> 0102)
[    9.426440] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    9.513624] snd_hda_codec_conexant hdaudioC0D0: CX20632: BIOS auto-probing.
[    9.514513] snd_hda_codec_conexant hdaudioC0D0: autoconfig for CX20632: line_outs=1 (0x1c/0x0/0x0/0x0/0x0) type:line
[    9.514530] snd_hda_codec_conexant hdaudioC0D0:    speaker_outs=1 (0x1f/0x0/0x0/0x0/0x0)
[    9.514537] snd_hda_codec_conexant hdaudioC0D0:    hp_outs=1 (0x19/0x0/0x0/0x0/0x0)
[    9.514543] snd_hda_codec_conexant hdaudioC0D0:    mono: mono_out=0x0
[    9.514546] snd_hda_codec_conexant hdaudioC0D0:    inputs:
[    9.514550] snd_hda_codec_conexant hdaudioC0D0:      Mic=0x1a
[    9.514554] snd_hda_codec_conexant hdaudioC0D0:      Line=0x1d
[    9.571974] input: HDA Intel PCH Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input23
[    9.572082] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1f.3/sound/card0/input24
[    9.572161] input: HDA Intel PCH Line Out as /devices/pci0000:00/0000:00:1f.3/sound/card0/input25
[    9.572244] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1f.3/sound/card0/input26
[    9.572346] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input27
[    9.572433] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input28
[    9.572552] input: HDA Intel PCH HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input29
[   10.128945] EXT4-fs (nvme0n1p3): mounted filesystem fec84ea4-37d8-4a23-aa47-6026b141f1e0 with ordered data mode. Quota mode: none.
[   10.167438] audit: type=1400 audit(1692740819.357:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=490 comm="apparmor_parser"
[   10.167443] audit: type=1400 audit(1692740819.357:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=490 comm="apparmor_parser"
[   10.167646] audit: type=1400 audit(1692740819.357:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=489 comm="apparmor_parser"
[   10.171430] audit: type=1400 audit(1692740819.361:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=492 comm="apparmor_parser"
[   10.171434] audit: type=1400 audit(1692740819.361:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=492 comm="apparmor_parser"
[   10.171436] audit: type=1400 audit(1692740819.361:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=492 comm="apparmor_parser"
[   10.172672] audit: type=1400 audit(1692740819.361:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session" pid=488 comm="apparmor_parser"
[   10.172675] audit: type=1400 audit(1692740819.361:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session//chromium" pid=488 comm="apparmor_parser"
[   10.173611] audit: type=1400 audit(1692740819.361:12): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/redshift" pid=493 comm="apparmor_parser"
[   10.175074] audit: type=1400 audit(1692740819.365:13): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=491 comm="apparmor_parser"
[   10.218202] RPC: Registered named UNIX socket transport module.
[   10.218204] RPC: Registered udp transport module.
[   10.218205] RPC: Registered tcp transport module.
[   10.218206] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   10.322223] scsi 6:0:0:0: Direct-Access     QNAP     TR-004 DISK00    6108 PQ: 0 ANSI: 6
[   10.322510] sd 6:0:0:0: Attached scsi generic sg1 type 0
[   10.322737] sd 6:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
[   10.322864] sd 6:0:0:0: [sda] 46883930112 512-byte logical blocks: (24.0 TB/21.8 TiB)
[   10.322870] sd 6:0:0:0: [sda] 4096-byte physical blocks
[   10.323181] sd 6:0:0:0: [sda] Write Protect is off
[   10.323187] sd 6:0:0:0: [sda] Mode Sense: 47 00 00 08
[   10.323500] sd 6:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   10.366735] sd 6:0:0:0: [sda] Attached SCSI disk
[   10.570267] fbcon: Taking over console
[   10.596271] Console: switching to colour frame buffer device 170x48
[   10.706401] Generic FE-GE Realtek PHY r8169-0-200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-200:00, irq=MAC)
[   10.780320] Lockdown: Xorg: raw io port access is restricted; see man kernel_lockdown.7
[   10.901951] r8169 0000:02:00.0 enp2s0: Link is Down
[   11.040232] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[   13.279884] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[   13.291180] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
[   14.157624] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control rx/tx
[   14.157677] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
[   14.162293] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off
[   14.162428] r8169 0000:02:00.0 enp2s0: Link is Down
[   15.530004] logitech-hidpp-device 0003:046D:406E.0005: HID++ 4.5 device connected.
[   17.671377] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full - flow control off
[   18.257246] logitech-hidpp-device 0003:046D:406D.0004: HID++ 4.5 device connected.
[   22.827616] EXT4-fs (sda): mounted filesystem 62140d4d-9618-435f-8414-18384be15421 with ordered data mode. Quota mode: none.
[   67.980872] show_signal_msg: 18 callbacks suppressed
[   67.980874] GpuWatchdog[2955]: segfault at 0 ip 00007fd0eaf929a6 sp 00007fd0dfdfd370 error 6 in libcef.so[7fd0e6aef000+7770000] likely on CPU 1 (core 1, socket 0)
[   67.980884] Code: 89 de e8 0d ef 6e ff 80 7d cf 00 79 09 48 8b 7d b8 e8 4e 66 2c 03 41 8b 84 24 e0 00 00 00 89 45 b8 48 8d 7d b8 e8 5a d3 b5 fb <c7> 04 25 00 00 00 00 37 13 00 00 48 83 c4 38 5b 41 5c 41 5d 41 5e
[ 2809.974520] perf: interrupt took too long (2529 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 6003.062611] perf: interrupt took too long (3175 > 3161), lowering kernel.perf_event_max_sample_rate to 63000
[ 6781.127013] perf: interrupt took too long (3972 > 3968), lowering kernel.perf_event_max_sample_rate to 50250
[ 8278.008471] ------------[ cut here ]------------
[ 8278.008479] NETDEV WATCHDOG: enp2s0 (r8169): transmit queue 0 timed out
[ 8278.008510] WARNING: CPU: 1 PID: 3485 at net/sched/sch_generic.c:525 dev_watchdog+0x21f/0x230
[ 8278.008527] Modules linked in: xt_multiport xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc sunrpc binfmt_misc snd_soc_avs snd_hda_codec_hdmi snd_soc_hda_codec snd_hda_ext_core snd_hda_codec_conexant snd_soc_core snd_hda_codec_generic ledtrig_audio snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg intel_rapl_msr snd_intel_sdw_acpi intel_rapl_common snd_hda_codec intel_tcc_cooling snd_hda_core x86_pkg_temp_thermal snd_hwdep intel_powerclamp snd_pcm coretemp snd_seq_midi kvm_intel snd_seq_midi_event uas usb_storage snd_rawmidi mei_pxp mei_hdcp kvm snd_seq irqbypass snd_seq_device joydev hp_wmi rapl snd_timer r8169 nls_iso8859_1 input_leds intel_cstate sparse_keymap mei_me snd realtek serio_raw platform_profile ee1004 mei soundcore wmi_bmof mac_hid acpi_pad sch_fq_codel msr parport_pc ppdev lp pstore_blk ramoops parport pstore_zone
[ 8278.008682]  reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c dm_mirror dm_region_hash dm_log hid_logitech_hidpp i915 hid_logitech_dj drm_buddy i2c_algo_bit ttm drm_display_helper crct10dif_pclmul cec crc32_pclmul rc_core polyval_clmulni hid_generic polyval_generic drm_kms_helper ghash_clmulni_intel usbhid syscopyarea sha512_ssse3 hid sysfillrect aesni_intel nvme sysimgblt crypto_simd ahci i2c_i801 xhci_pci nvme_core drm cryptd psmouse xhci_pci_renesas i2c_smbus libahci nvme_common video wmi
[ 8278.008808] CPU: 1 PID: 3485 Comm: bitburner Not tainted 6.2.0-26-generic #26~22.04.1-Ubuntu
[ 8278.008816] Hardware name: HP HP ProDesk 400 G4 SFF /82A2, BIOS P08 Ver. 02.46 03/28/2023
[ 8278.008819] RIP: 0010:dev_watchdog+0x21f/0x230
[ 8278.008830] Code: 00 e9 31 ff ff ff 4c 89 e7 c6 05 f5 a9 78 01 01 e8 c6 ff f7 ff 44 89 f1 4c 89 e6 48 c7 c7 08 30 24 93 48 89 c2 e8 31 0b 2c ff <0f> 0b e9 22 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
[ 8278.008836] RSP: 0000:ffffa30488b17db0 EFLAGS: 00010246
[ 8278.008842] RAX: 0000000000000000 RBX: ffff924a5f31c4c8 RCX: 0000000000000000
[ 8278.008847] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 8278.008851] RBP: ffffa30488b17dd8 R08: 0000000000000000 R09: 0000000000000000
[ 8278.008855] R10: 0000000000000000 R11: 0000000000000000 R12: ffff924a5f31c000
[ 8278.008858] R13: ffff924a5f31c41c R14: 0000000000000000 R15: 0000000000000000
[ 8278.008862] FS:  00007fa39d427280(0000) GS:ffff924d62680000(0000) knlGS:0000000000000000
[ 8278.008868] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8278.008872] CR2: 000016980078c380 CR3: 0000000216c98002 CR4: 00000000003706e0
[ 8278.008877] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8278.008880] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8278.008884] Call Trace:
[ 8278.008888]  <TASK>
[ 8278.008895]  ? __pfx_dev_watchdog+0x10/0x10
[ 8278.008905]  call_timer_fn+0x29/0x160
[ 8278.008914]  ? __pfx_dev_watchdog+0x10/0x10
[ 8278.008921]  __run_timers.part.0+0x1fb/0x2b0
[ 8278.008929]  ? ktime_get+0x43/0xc0
[ 8278.008934]  ? __pfx_tick_sched_timer+0x10/0x10
[ 8278.008945]  ? lapic_next_deadline+0x2c/0x50
[ 8278.008951]  ? clockevents_program_event+0xb2/0x140
[ 8278.008959]  run_timer_softirq+0x2a/0x60
[ 8278.008966]  __do_softirq+0xda/0x330
[ 8278.008972]  ? hrtimer_interrupt+0x12b/0x250
[ 8278.008982]  __irq_exit_rcu+0xa2/0xd0
[ 8278.008988]  irq_exit_rcu+0xe/0x20
[ 8278.008994]  sysvec_apic_timer_interrupt+0x43/0xb0
[ 8278.009001]  asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 8278.009008] RIP: 0033:0x55be202d9e83
[ 8278.009014] Code: 00 00 00 0c 00 00 00 dd 00 00 00 ff ff ff ff 00 00 00 00 ff ff ff ff 0c 00 00 00 0c 00 00 00 0c 00 00 00 00 00 00 00 8b 59 d0 <49> 03 de f6 43 1b 01 74 05 e9 2f 30 bb 1f 55 48 89 e5 56 57 50 48
[ 8278.009019] RSP: 002b:00007ffd8e384078 EFLAGS: 00000206
[ 8278.009024] RAX: 0000000000000001 RBX: 0000000004c3e909 RCX: 000055be202d9e80
[ 8278.009028] RDX: 00003636000023e1 RSI: 00003636006a2ca1 RDI: 000036360069c629
[ 8278.009032] RBP: 00007ffd8e3840d0 R08: 000036360069c629 R09: 00003636001c4841
[ 8278.009035] R10: 00001698003f3209 R11: 0000000000000011 R12: 000036360000400d
[ 8278.009039] R13: 0000169800550000 R14: 0000363600000000 R15: 0000000000000006
[ 8278.009047]  </TASK>
[ 8278.009050] ---[ end trace 0000000000000000 ]---
[ 8278.038321] r8169 0000:02:00.0 enp2s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100).
[ 8278.039685] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8278.041049] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8278.042416] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8278.043805] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8278.045244] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8278.046613] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8278.070550] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 8278.093905] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 8278.117905] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 8388.884453] net_ratelimit: 9 callbacks suppressed
[ 8388.884460] r8169 0000:02:00.0 enp2s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100).
[ 8388.886144] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8388.887504] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8388.888934] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8388.890386] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8388.891839] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8388.893406] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
[ 8388.920620] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 8388.949110] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 8388.973078] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
Comment 1 Chris Heath 2023-08-23 00:54:02 UTC
I believe I was on kernel 5.15 before the upgrade, and on 5.4 before that.
(I've been running Mint on this system for a number of years and upgrading it via the Mint upgrader when prompted.)
Comment 2 Chris Heath 2023-08-23 01:04:32 UTC
Also submitted to Ubuntu Launchpad: https://bugs.launchpad.net/ubuntu/+bug/2032706
Comment 3 Heiner Kallweit 2023-08-23 05:25:29 UTC
See the comment on the Bugzilla start page: downstream kernels aren't supported here. Please test with a mainline kernel (ideally self-compiled).

Your lspci output is missing relevant information; please run it with -vv as root.

[   14.157624] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control rx/tx
[   14.157677] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
[   14.162293] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off
[   14.162428] r8169 0000:02:00.0 enp2s0: Link is Down
[   17.671377] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full - flow control off

This looks strange and may indicate a physical issue. Please check the RJ45 ports on both sides and the cabling. Ideally, also test with another link partner.
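
The up/down sequence quoted above is easier to judge over a longer stretch of log. A minimal sketch, assuming the kernel log is available as text (for example from `dmesg` or `journalctl -k`, which may need root on a locked-down system); the function name `link_events` is made up for illustration:

```sh
# Print only the r8169 link state changes from a kernel log read on stdin,
# so repeated renegotiation (e.g. 1Gbps -> Down -> 100Mbps) is easy to spot.
link_events() {
    grep -E 'r8169 .*: Link is (Up|Down)'
}

# Usage (not run here):
#   journalctl -k -b | link_events
```

A link that keeps coming up at a lower speed than the partner supports would be consistent with the cabling or port suspicion, though it does not prove it.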

You can try the following to rule out ASPM-related issues:
Disable l1_1_aspm and, if that doesn't help, l1_aspm under /sys/class/net/enp2s0/device/link.

Best of course would be to bisect the issue between the last known good kernel and the first troublesome one.
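
Before disabling anything, it may help to see which ASPM attributes the device exposes and what they are currently set to. A minimal sketch, assuming the attributes live in the `link` directory named above and have names ending in `aspm`; `show_aspm_states` is an illustrative name, not an existing tool:

```sh
# Print name=value for every *aspm attribute in the given link directory.
# $1: link directory, e.g. /sys/class/net/enp2s0/device/link
show_aspm_states() {
    for aspm_file in "$1"/*aspm; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$aspm_file" ] || continue
        printf '%s=%s\n' "${aspm_file##*/}" "$(cat "$aspm_file")"
    done
}

# Usage (not run here):
#   show_aspm_states /sys/class/net/enp2s0/device/link
```

If the directory or an attribute is missing, the loop prints nothing for it, which is itself useful to know before trying to write to it.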
Comment 4 Chris Heath 2023-08-23 22:31:38 UTC
I've already tried different cables and bypassed my unmanaged switch by using the cable that feeds it, so I'm pretty sure it's not in my infra (although the port itself could be going bad, I guess?).

Here's the lspci -vv:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
	Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: I/O ports at 3000 [size=256]
	Region 2: Memory at cc904000 (64-bit, non-prefetchable) [size=4K]
	Region 4: Memory at cc900000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Endpoint, MSI 01
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s (ok), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
		Vector table: BAR=4 offset=00000000
		PBA: BAR=4 offset=00000800
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [140 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
	Capabilities: [170 v1] Latency Tolerance Reporting
		Max snoop latency: 3145728ns
		Max no snoop latency: 3145728ns
	Capabilities: [178 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=150us
	Kernel driver in use: r8169
	Kernel modules: r8169
Comment 5 Heiner Kallweit 2023-08-24 05:29:40 UTC
OK, so ASPM L1 sub-states are disabled. Then:
- test with a mainline kernel
- disable ASPM L1 via sysfs
- bisect
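
The ASPM state referred to above is read off two places in the `lspci -vv` output: `LnkCtl` shows whether ASPM L1 is enabled on the link, and `L1SubCtl1` shows the L1 sub-states. A small sketch for pulling just those lines out of a dump; `aspm_summary` is a hypothetical helper name:

```sh
# Read `lspci -vv` output on stdin and print the two lines that describe
# the ASPM configuration: LnkCtl (L0s/L1) and L1SubCtl1 (L1.1/L1.2).
aspm_summary() {
    grep -E 'LnkCtl:|L1SubCtl1:' | sed 's/^[[:space:]]*//'
}

# Usage (not run here; lspci needs root to show capabilities):
#   sudo lspci -vv -s 02:00.0 | aspm_summary
```

On the dump in comment 4 this should print the `LnkCtl` line reporting `ASPM L1 Enabled` and the `L1SubCtl1` line with all four sub-states marked `-`.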

The following from your log may also play a role:

[ 2809.974520] perf: interrupt took too long (2529 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 6003.062611] perf: interrupt took too long (3175 > 3161), lowering kernel.perf_event_max_sample_rate to 63000
[ 6781.127013] perf: interrupt took too long (3972 > 3968), lowering kernel.perf_event_max_sample_rate to 50250
Comment 6 Bagas Sanjaya 2023-08-24 12:38:09 UTC
(In reply to Heiner Kallweit from comment #5)
> OK, so ASPM L1 sub-states are disabled. Then:
> - test with a mainline kernel
To compile your own kernel (which is a requirement), see
Documentation/admin-guide/quickly-build-trimmed-linux.rst in kernel sources
(6.4 and later).

> - bisect
> 

See Documentation/admin-guide/bug-bisect.rst.
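
As a rough illustration of how a bisect step could be judged for this particular bug, the sketch below checks a kernel log for the watchdog message from the report. The helper name `has_tx_timeout` is made up, and the tags `v5.15` and `v6.2` are assumptions based on the distribution kernels mentioned earlier; the real good and bad mainline versions have to be confirmed by testing first.

```sh
# Return 0 if the kernel log on stdin contains the r8169 transmit timeout.
has_tx_timeout() {
    grep -q 'NETDEV WATCHDOG: .*(r8169): transmit queue [0-9]* timed out'
}

# Manual bisect loop, shown as comments because every step needs a
# kernel build and a reboot:
#   git bisect start
#   git bisect bad v6.2      # assumed first troublesome mainline tag
#   git bisect good v5.15    # assumed last known good mainline tag
#   ... build, install, boot, use the network for a while, then:
#   if journalctl -k -b | has_tx_timeout; then git bisect bad; else git bisect good; fi
```

Because the timeout in the original report only appeared after more than two hours of uptime, a boot without the message is only weak evidence that a commit is good.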
Comment 7 Chris Heath 2023-08-24 23:58:23 UTC
ok so to disable l1_aspm? 
sudo echo 0 >  /sys/class/net/enp2s0/device/link/l1_aspm 

Also, a question about building a kernel with Secure Boot:
Linux Mint supports Secure Boot and it is currently enabled...
Can I leave it enabled when building my own kernel, or should I disable it?
(I'm thinking of leaving it enabled and disabling it in the BIOS if I can't boot.)

btw, thanks everyone for helping
Comment 8 Chris Heath 2023-08-25 00:24:14 UTC
I ended up adding pcie_aspm=off to my GRUB kernel parameters.
It looks promising so far... I will have to wait a while to see whether the NIC crashes again, though.
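
Whether the parameter actually reached the running kernel can be checked from `/proc/cmdline`. A minimal sketch; `has_kernel_param` is an illustrative name, and the optional second argument exists only so the check can be tried against a sample file:

```sh
# Return 0 if the exact parameter $1 appears on the kernel command line.
# $2 optionally names a file to read instead of /proc/cmdline.
has_kernel_param() {
    cmdline_file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$1"
}

# Usage (not run here):
#   has_kernel_param pcie_aspm=off && echo "pcie_aspm=off is active"
```

As far as the kernel parameter documentation describes it, `pcie_aspm=off` makes the kernel leave the ASPM configuration untouched rather than switch ASPM off in hardware, so a state enabled by firmware may remain enabled; the `LnkCtl` line of `lspci -vv` shows what is actually in effect.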
Comment 9 Heiner Kallweit 2023-08-25 09:12:26 UTC
(In reply to Chris Heath from comment #7)
> ok so to disable l1_aspm? 
> sudo echo 0 >  /sys/class/net/enp2s0/device/link/l1_aspm 
> 
exactly
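
One caveat when running the quoted command from a non-root shell: the `>` redirection is opened by the calling shell before `sudo` starts, so the write is likely to fail with "Permission denied" even though `echo` itself runs as root. From a root shell the command works as written. A common alternative is to let `tee` perform the write under `sudo`. A minimal sketch; `write_as_root` and the `SUDO` override are hypothetical names introduced here:

```sh
# Write value $2 to file $1, with the write performed by tee under sudo,
# because `sudo echo 0 > file` redirects in the unprivileged shell.
# SUDO is a hypothetical override: set SUDO="" to skip sudo (e.g. in a root shell).
write_as_root() {
    printf '%s\n' "$2" | ${SUDO-sudo} tee "$1" > /dev/null
}

# Usage (not run here):
#   write_as_root /sys/class/net/enp2s0/device/link/l1_aspm 0
```

Reading the attribute back with `cat` afterwards confirms whether the kernel accepted the new value.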
Comment 10 Vincas Dargis 2024-02-15 21:37:08 UTC
I'm still having this issue on Debian Sid with 6.6.13 kernel.

I believe last working was 6.1.27-1, I still have this kernel on hold.

Tried r8169 and r8168-dkms with same result.

pcie_aspm=off does not help, nor any other workaround tried, like disabling offloading and limiting to 100Mbps.
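
For anyone wanting to try the same two workarounds, they are usually applied with `ethtool`. This is a sketch under assumptions: the comment does not say which offloads were disabled, so TSO/GSO/GRO are a guess, and `DRY_RUN` and the function names are made up here so the commands can be printed without touching the NIC.

```sh
# Run a command, or just print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Disable common segmentation/receive offloads on interface $1 (assumed set).
disable_offloads() {
    run ethtool -K "$1" tso off gso off gro off
}

# Advertise only 100 Mbps full duplex on interface $1.
limit_to_100mbps() {
    run ethtool -s "$1" speed 100 duplex full autoneg on
}

# Usage (not run here; needs root and the real interface name):
#   DRY_RUN=1 disable_offloads enp2s0
```

Both settings are lost on reboot or driver reload, so they have to be reapplied before each test run.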

In my case it's an Asus N551JM laptop with this Ethernet card:

```
05:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
	Subsystem: ASUSTeK Computer Inc. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 19
	IOMMU group: 15
	Region 0: I/O ports at d000 [size=256]
	Region 2: Memory at f7814000 (64-bit, non-prefetchable) [size=4K]
	Region 4: Memory at f7810000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Endpoint, MSI 01
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10W
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
		Vector table: BAR=4 offset=00000000
		PBA: BAR=4 offset=00000800
	Capabilities: [d0] Vital Product Data
		Not readable
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
	Capabilities: [170 v1] Latency Tolerance Reporting
		Max snoop latency: 71680ns
		Max no snoop latency: 71680ns
	Capabilities: [178 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Kernel driver in use: r8169
	Kernel modules: r8169
```
Comment 11 Heiner Kallweit 2024-02-15 21:56:10 UTC
(In reply to Vincas Dargis from comment #10)
> I'm still having this issue on Debian Sid with 6.6.13 kernel.
> 
> I believe last working was 6.1.27-1, I still have this kernel on hold.
> 
> Tried r8169 and r8168-dkms with same result.
> 
> pcie_aspm=off does not help, nor any other workaround tried, like disabling
> offloading and limiting to 100Mbps.
> 
You can try disabling ASPM in BIOS or disabling ASPM L1 via sysfs.
And if it worked with an earlier kernel, please bisect.
However, if the vendor driver also fails, it may indicate a hw issue.
Comment 12 Sean Haugh 2024-04-03 19:10:59 UTC
Hello, I'm also having this issue on Gentoo with 6.6.21:

Linux akita 6.6.21-gentoo-dist #1 SMP PREEMPT_DYNAMIC Tue Apr  2 22:51:19 CDT 2024 x86_64 AMD Ryzen 9 5900X 12-Core Processor AuthenticAMD GNU/Linux

---

05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
	Subsystem: Gigabyte Technology Co., Ltd RTL8125 2.5GbE Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 24
	IOMMU group: 23
	Region 0: I/O ports at f000 [size=256]
	Region 2: Memory at fc600000 (64-bit, non-prefetchable) [size=64K]
	Region 4: Memory at fc610000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: r8169
	Kernel modules: r8169, r8125

---

[   34.898835] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: Downshift occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
[   34.898849] r8169 0000:05:00.0 enp5s0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx
[   45.767203] ------------[ cut here ]------------
[   45.767215] NETDEV WATCHDOG: enp5s0 (r8169): transmit queue 0 timed out 10057 ms
[   45.767228] WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x232/0x240
[   45.767237] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq nvidia_uvm(PO) nvidia_drm(PO) nvidia_modeset(PO) intel_rapl_msr intel_rapl_common 8021q garp mrp stp llc edac_mce_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mt7921e snd_usb_audio mt7921_common nvidia(PO) snd_hda_intel snd_usbmidi_lib mt792x_lib snd_intel_dspcfg snd_ump mt76_connac_lib snd_intel_sdw_acpi snd_rawmidi mt76 snd_hda_codec snd_seq_device btusb nls_iso8859_1 snd_hda_core mc btrtl mac80211 snd_hwdep kvm btintel snd_pcm vfat fat irqbypass btbcm libarc4 snd_timer r8125(O) btmtk razermouse(O) snd video gigabyte_wmi rapl wmi_bmof soundcore k10temp cfg80211 bluetooth pcspkr acpi_cpufreq i2c_piix4 rfkill r8169 joydev loop fuse nfnetlink crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 sp5100_tco ccp nvme nvme_core nvme_common wmi
[   45.767411] CPU: 6 PID: 0 Comm: swapper/6 Tainted: P           O       6.6.21-gentoo-dist #1
[   45.767416] Hardware name: Gigabyte Technology Co., Ltd. X570S AORUS ELITE AX/X570S AORUS ELITE AX, BIOS F8 03/22/2024
[   45.767419] RIP: 0010:dev_watchdog+0x232/0x240
[   45.767424] Code: ff ff ff 48 89 df c6 05 9e a0 23 01 01 e8 c6 0b fa ff 45 89 f8 44 89 f1 48 89 de 48 89 c2 48 c7 c7 60 9f be 9c e8 de 35 24 ff <0f> 0b e9 2d ff ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90
[   45.767429] RSP: 0018:ffffc900003c0e78 EFLAGS: 00010286
[   45.767434] RAX: 0000000000000000 RBX: ffff8881060d4000 RCX: 0000000000000000
[   45.767438] RDX: ffff88881ebae340 RSI: ffff88881eba1580 RDI: 00000000000400f6
[   45.767441] RBP: ffff8881060d44c8 R08: 0000000000000000 R09: ffffc900003c0d00
[   45.767444] R10: 0000000000000003 R11: ffffffff9cf46328 R12: ffff88810e284a00
[   45.767447] R13: ffff8881060d441c R14: 0000000000000000 R15: 0000000000002749
[   45.767451] FS:  0000000000000000(0000) GS:ffff88881eb80000(0000) knlGS:0000000000000000
[   45.767455] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   45.767458] CR2: 0000559bb230db78 CR3: 000000010616c000 CR4: 0000000000f50ee0
[   45.767462] PKRU: 55555554
[   45.767465] Call Trace:
[   45.767469]  <IRQ>
[   45.767472]  ? dev_watchdog+0x232/0x240
[   45.767477]  ? __warn+0x81/0x130
[   45.767485]  ? dev_watchdog+0x232/0x240
[   45.767490]  ? report_bug+0x171/0x1a0
[   45.767496]  ? apic_mem_wait_icr_idle+0x14/0x20
[   45.767503]  ? handle_bug+0x3c/0x80
[   45.767508]  ? exc_invalid_op+0x17/0x70
[   45.767512]  ? asm_exc_invalid_op+0x1a/0x20
[   45.767522]  ? dev_watchdog+0x232/0x240
[   45.767527]  ? dev_watchdog+0x232/0x240
[   45.767531]  ? __pfx_dev_watchdog+0x10/0x10
[   45.767536]  call_timer_fn+0x27/0x130
[   45.767543]  ? __pfx_dev_watchdog+0x10/0x10
[   45.767547]  __run_timers+0x222/0x2c0
[   45.767555]  run_timer_softirq+0x1d/0x40
[   45.767560]  __do_softirq+0xd4/0x2c8
[   45.767568]  __irq_exit_rcu+0xa6/0xc0
[   45.767574]  sysvec_apic_timer_interrupt+0x72/0x90
[   45.767579]  </IRQ>
[   45.767582]  <TASK>
[   45.767586]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[   45.767590] RIP: 0010:cpuidle_enter_state+0xcc/0x440
[   45.767595] Code: 4a 40 0b ff e8 c5 f3 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 83 47 0a ff 45 84 ff 0f 85 56 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[   45.767599] RSP: 0018:ffffc900001b7e90 EFLAGS: 00000246
[   45.767604] RAX: ffff88881ebb3e40 RBX: ffff888100ddc400 RCX: 000000000000001f
[   45.767607] RDX: 0000000000000006 RSI: 00000000229837f7 RDI: 0000000000000000
[   45.767610] RBP: 0000000000000002 R08: 0000000000000002 R09: 0000000000000401
[   45.767613] R10: 0000000000000008 R11: ffff88881ebb27a4 R12: ffffffff9d077d40
[   45.767616] R13: 0000000aa7eff1b9 R14: 0000000000000002 R15: 0000000000000000
[   45.767625]  cpuidle_enter+0x2d/0x40
[   45.767631]  do_idle+0x1d8/0x230
[   45.767638]  cpu_startup_entry+0x2a/0x30
[   45.767642]  start_secondary+0x11e/0x140
[   45.767647]  secondary_startup_64_no_verify+0x18f/0x19b
[   45.767656]  </TASK>
[   45.767659] ---[ end trace 0000000000000000 ]---
[ 4316.702225] r8169 0000:05:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM

I've been losing my Internet connection every time my computer reboots. If it helps, I noticed that I can reliably get an Internet connection if I turn off my computer, unplug (power off) my unmanaged switch, then turn my computer on, and then plug the switch back in. I also noticed this happening in the dhcpcd logs:

Apr 02 19:15:13 akita systemd[1]: Started Lightweight DHCP client daemon.
Apr 02 19:15:14 akita dhcpcd[3282]: enp5s0: failed to request the lease
Apr 02 19:15:14 akita dhcpcd[3282]: enp5s0: soliciting a DHCP lease
Apr 02 19:15:18 akita dhcpcd[3282]: enp5s0: offered 192.168.1.243 from 192.168.1.254
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: failed to request the lease
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: soliciting a DHCP lease
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: offered 192.168.1.243 from 192.168.1.254
Apr 02 19:15:28 akita dhcpcd[3282]: enp5s0: failed to request the lease

I've tried the vendor driver (net-misc/r8125) and the latest available kernel (6.8.2 in testing, although I just noticed 6.8.3 is available) and neither seemed to help. Also tried setting pcie_aspm=off in my kernel options and that didn't seem to do anything.

I can try bisecting some time next week if it helps, although I'm not sure where to start since this driver has never worked for me. Prior to this I was using Debian 12 and having the same issue there.
Comment 13 Vincas Dargis 2024-04-03 19:15:48 UTC
> although I'm not sure where to start since this driver has never worked for
> me.

Please try between 6.1 and 6.2.

This Ethernet card worked for me up until 6.2.

My laptop is almost 10 years old; I have used it with Ubuntu, Debian stable, and now Sid.

Only on Sid, with kernels around 6.2 and later, did I start to get issues.

Sorry, but I haven't found the energy to start bisecting myself yet.
Comment 14 Heiner Kallweit 2024-04-03 19:32:24 UTC
(In reply to Sean Haugh from comment #12)
> Hello, I'm also having this issue on Gentoo with 6.6.21:
> [...]
> [   34.898835] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: Downshift
> occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
> [   34.898849] r8169 0000:05:00.0 enp5s0: Link is Up - 100Mbps/Full
> (downshifted) - flow control rx/tx
> [...]
> I've been losing my Internet connection every time my computer reboots. If
> it helps, I noticed that I can reliably get an Internet connection if I turn
> off my computer, unplug (power off) my unmanaged switch, then turn my
> computer on, and then plug the switch back in. I also noticed this happening
> in the dhcpcd logs:
> [...]
> I've tried the vendor driver (net-misc/r8125) and the latest available
> kernel (6.8.2 in testing, although I just noticed 6.8.3 is available) and
> neither seemed to help. Also tried setting pcie_aspm=off in my kernel
> options and that didn't seem to do anything.
> 
A downshift typically indicates a physical issue. It is best to re-test with a different cable and a different switch, or at least a different switch port.
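
To check a saved kernel log for this condition, a minimal sketch follows. It assumes the message format quoted above ("Downshift occurred from negotiated speed ... to actual speed ..."); the function name and the regular expression are illustrative and not part of the driver or any tool.

```python
import re

# Pattern assumed from the downshift message quoted in this report.
DOWNSHIFT_RE = re.compile(
    r"Downshift occurred from negotiated speed (?P<negotiated>\S+) "
    r"to actual speed (?P<actual>[^\s,]+)"
)


def find_downshifts(log_text):
    """Return (negotiated, actual) speed pairs for every downshift message."""
    return [(m["negotiated"], m["actual"]) for m in DOWNSHIFT_RE.finditer(log_text)]


# Sample line copied from the log in comment #12.
SAMPLE = (
    "[   34.898835] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: Downshift "
    "occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!\n"
)

print(find_downshifts(SAMPLE))
```

An empty list after a cable or switch-port swap would suggest the link is no longer downshifting.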
Comment 15 Heiner Kallweit 2024-04-03 19:34:32 UTC
And ensure that firmware for the NIC is loaded.
ethtool -i <if> will tell you.
Comment 16 Sean Haugh 2024-04-03 21:15:32 UTC
I tried 6.8.3, 6.1.81, and 5.15.151 without any luck. The cable is unfortunately inside the wall, but I did try a different switch port; no luck there either.

I can try moving my computer downstairs and connecting it directly to the ISP router with a better cable, but that'll have to wait for another day.

--

FYR:

driver: r8169
version: 6.6.21-gentoo-dist
firmware-version: rtl8125b-2_0.0.2 07/13/20
expansion-rom-version: 
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
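
The firmware check can be scripted against output like the above. This is only a sketch: it assumes the "key: value" layout shown here, treats an empty firmware-version field as "no firmware reported" (an assumption, not documented ethtool behavior), and the helper names are made up for illustration.

```python
def parse_ethtool_info(text):
    """Parse 'ethtool -i' style 'key: value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI
        # address 0000:05:00.0 stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def firmware_reported(info):
    """True if a non-empty firmware-version field is present (assumed heuristic)."""
    return bool(info.get("firmware-version"))


# Sample copied from the output above.
SAMPLE = """driver: r8169
version: 6.6.21-gentoo-dist
firmware-version: rtl8125b-2_0.0.2 07/13/20
expansion-rom-version: 
bus-info: 0000:05:00.0
"""

info = parse_ethtool_info(SAMPLE)
print(info["firmware-version"], firmware_reported(info))
```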
Comment 17 Sean Haugh 2024-04-09 22:32:12 UTC
I'm using the upstream 6.8 kernel now (from git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git) and although I'm still getting downshifting, I'm not seeing the "transmit queue 0 timed out" message anymore.
Comment 18 Pablo Catalina 2024-05-31 07:22:52 UTC
I have a similar problem, detected since kernel 6.8, now using 6.9.1:
[    1.342197] r8169 0000:02:00.0 eth0: RTL8168h/8111h, 84:47:09:0e:ba:2c, XID 541, IRQ 133
[    1.342777] r8169 0000:02:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[    1.363036] r8169 0000:03:00.0 eth1: RTL8168h/8111h, 84:47:09:0e:ba:2d, XID 541, IRQ 134
[    1.363574] r8169 0000:03:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[    1.833163] r8169 0000:02:00.0 enp2s0: renamed from eth0
[    1.834953] r8169 0000:03:00.0 enp3s0: renamed from eth1
[   10.932526] Generic FE-GE Realtek PHY r8169-0-200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-200:00, irq=MAC)
[   11.138959] r8169 0000:02:00.0 enp2s0: Link is Down
[   11.218894] r8169 0000:02:00.0 enp2s0: entered allmulticast mode
[   11.219028] r8169 0000:02:00.0 enp2s0: entered promiscuous mode
[   27.226344] Generic FE-GE Realtek PHY r8169-0-200:00: Downshift occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
[   27.226478] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx

[108133.125832] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8587 ms
[108133.134207] r8169 0000:02:00.0 enp2s0: ASPM disabled on Tx timeout
[108133.158610] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108133.176558] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108143.152459] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 3: transmit queue 0 timed out 8987 ms
[108143.175917] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108143.197345] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108153.178824] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 9974 ms
[108163.205511] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8234 ms
[108163.224605] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108163.238986] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108173.018734] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8720 ms
[108173.041848] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108173.060345] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
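
For anyone comparing logs, the watchdog lines above can be summarized with a short script. This sketch assumes the message format shown in this comment ("NETDEV WATCHDOG: CPU: N: transmit queue N timed out N ms"); older kernels word the message differently and would not match. The names are illustrative only.

```python
import re

# Pattern assumed from the watchdog lines in the log above.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: CPU: (?P<cpu>\d+): "
    r"transmit queue (?P<queue>\d+) timed out (?P<ms>\d+) ms"
)


def watchdog_timeouts(log_text):
    """Return (cpu, queue, milliseconds) for each watchdog timeout line."""
    return [
        (int(m["cpu"]), int(m["queue"]), int(m["ms"]))
        for m in WATCHDOG_RE.finditer(log_text)
    ]


# Two lines copied from the log above.
SAMPLE = (
    "[108133.125832] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8587 ms\n"
    "[108133.134207] r8169 0000:02:00.0 enp2s0: ASPM disabled on Tx timeout\n"
    "[108143.152459] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 3: transmit queue 0 timed out 8987 ms\n"
)

print(watchdog_timeouts(SAMPLE))
```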



# ethtool -i enp2s0 
driver: r8169
version: 6.9.1-i7
firmware-version: rtl8168h-2_0.0.2 02/26/15
expansion-rom-version: 
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no


I tried different network cables and got the same problem. The errors start appearing after the interface has been connected to the network for a while, but I'm not sure what triggers them: sometimes it takes a few days, sometimes only a few minutes.

I cannot disable ASPM:
# find /sys/class/net/enp2s0/device/ | grep aspm || echo no aspm
no aspm
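
The same check can be done without find. The sketch below assumes that, when the kernel exposes per-link ASPM controls, they appear as files such as l1_aspm in a link/ subdirectory of the PCI device directory; whether that directory exists depends on the kernel configuration and on the OS having ASPM control. The function name is hypothetical.

```python
from pathlib import Path


def aspm_link_attrs(device_dir):
    """Return {attribute: value} for *aspm files under <device_dir>/link.

    Returns an empty dict when the link/ directory is absent, which is
    the "no aspm" case shown above.
    """
    link = Path(device_dir) / "link"
    if not link.is_dir():
        return {}
    return {
        p.name: p.read_text().strip()
        for p in sorted(link.iterdir())
        if p.is_file() and p.name.endswith("aspm")
    }


# Interface name taken from this comment; adjust for other systems.
print(aspm_link_attrs("/sys/class/net/enp2s0/device") or "no aspm")
```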

The machine has two network interfaces, although I'm only using one of them.
Comment 19 Heiner Kallweit 2024-05-31 08:51:28 UTC
A downshift always indicates a physical issue. It doesn't have to be the cable; the trouble could also be caused by, e.g., the RJ45 port on either side.
Comment 20 Vincas Dargis 2024-07-04 17:22:10 UTC
After Debian Sid upgraded to 6.9.7-1, I no longer see this timeout (it was still an issue in 6.8.x)!

I left the computer running overnight (with torrents, bitcoind, and other things running), and I see no timeouts or crashes in journalctl -k; the network still works fine!
Comment 21 Vincas Dargis 2024-07-07 14:22:22 UTC
It seems I do still see this in the logs:

```
Jul 07 17:18:09 kernel: r8169 0000:05:00.1 enp5s0f1: NETDEV WATCHDOG: CPU: 6: transmit queue 0 timed out 10484 ms
Jul 07 17:18:09 kernel: r8169 0000:05:00.1: can't disable ASPM; OS doesn't have ASPM control
Jul 07 17:18:09 kernel: r8169 0000:05:00.1 enp5s0f1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
```

But the network is still usable, and there are still no "[ cut here ]" crashes printed.