After upgrading my system to 6.2 recently I've been getting seemingly random network card crashes. >lsb-release: DISTRIB_ID=LinuxMint DISTRIB_RELEASE=21.2 DISTRIB_CODENAME=victoria DISTRIB_DESCRIPTION="Linux Mint 21.2 Victoria" >uname -a: Linux i5-ProDesk 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux >lspci: 02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15) Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 16 Region 0: I/O ports at 3000 [size=256] Region 2: Memory at cc904000 (64-bit, non-prefetchable) [size=4K] Region 4: Memory at cc900000 (64-bit, non-prefetchable) [size=16K] Capabilities: <access denied> Kernel driver in use: r8169 Kernel modules: r8169 >dmesg: [ 9.210269] r8169 0000:02:00.0 eth0: RTL8168h/8111h, 40:b0:34:fb:e8:5a, XID 541, IRQ 133 [ 9.210273] r8169 0000:02:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko] [ 9.210507] usb 2-2: New USB device found, idVendor=1c04, idProduct=0013, bcdDevice=61.08 [ 9.210511] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 9.210512] usb 2-2: Product: TR-004 [ 9.210513] usb 2-2: Manufacturer: QNAP Systems, Inc. 
[ 9.210514] usb 2-2: SerialNumber: 51323043423037383430 [ 9.242547] r8169 0000:02:00.0 enp2s0: renamed from eth0 [ 9.292656] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915]) [ 9.295636] usb-storage 2-2:1.0: USB Mass Storage device detected [ 9.295795] scsi host6: usb-storage 2-2:1.0 [ 9.295851] usbcore: registered new interface driver usb-storage [ 9.305456] usbcore: registered new interface driver uas [ 9.379685] intel_tcc_cooling: Programmable TCC Offset detected [ 9.406737] intel_rapl_common: Found RAPL domain package [ 9.406740] intel_rapl_common: Found RAPL domain core [ 9.406742] intel_rapl_common: Found RAPL domain uncore [ 9.406744] intel_rapl_common: Found RAPL domain dram [ 9.426136] snd_hda_intel 0000:00:1f.3: enabling device (0100 -> 0102) [ 9.426440] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915]) [ 9.513624] snd_hda_codec_conexant hdaudioC0D0: CX20632: BIOS auto-probing. 
[ 9.514513] snd_hda_codec_conexant hdaudioC0D0: autoconfig for CX20632: line_outs=1 (0x1c/0x0/0x0/0x0/0x0) type:line [ 9.514530] snd_hda_codec_conexant hdaudioC0D0: speaker_outs=1 (0x1f/0x0/0x0/0x0/0x0) [ 9.514537] snd_hda_codec_conexant hdaudioC0D0: hp_outs=1 (0x19/0x0/0x0/0x0/0x0) [ 9.514543] snd_hda_codec_conexant hdaudioC0D0: mono: mono_out=0x0 [ 9.514546] snd_hda_codec_conexant hdaudioC0D0: inputs: [ 9.514550] snd_hda_codec_conexant hdaudioC0D0: Mic=0x1a [ 9.514554] snd_hda_codec_conexant hdaudioC0D0: Line=0x1d [ 9.571974] input: HDA Intel PCH Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input23 [ 9.572082] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1f.3/sound/card0/input24 [ 9.572161] input: HDA Intel PCH Line Out as /devices/pci0000:00/0000:00:1f.3/sound/card0/input25 [ 9.572244] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1f.3/sound/card0/input26 [ 9.572346] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input27 [ 9.572433] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input28 [ 9.572552] input: HDA Intel PCH HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input29 [ 10.128945] EXT4-fs (nvme0n1p3): mounted filesystem fec84ea4-37d8-4a23-aa47-6026b141f1e0 with ordered data mode. Quota mode: none. 
[ 10.167438] audit: type=1400 audit(1692740819.357:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=490 comm="apparmor_parser" [ 10.167443] audit: type=1400 audit(1692740819.357:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=490 comm="apparmor_parser" [ 10.167646] audit: type=1400 audit(1692740819.357:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=489 comm="apparmor_parser" [ 10.171430] audit: type=1400 audit(1692740819.361:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=492 comm="apparmor_parser" [ 10.171434] audit: type=1400 audit(1692740819.361:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=492 comm="apparmor_parser" [ 10.171436] audit: type=1400 audit(1692740819.361:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=492 comm="apparmor_parser" [ 10.172672] audit: type=1400 audit(1692740819.361:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session" pid=488 comm="apparmor_parser" [ 10.172675] audit: type=1400 audit(1692740819.361:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lightdm/lightdm-guest-session//chromium" pid=488 comm="apparmor_parser" [ 10.173611] audit: type=1400 audit(1692740819.361:12): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/redshift" pid=493 comm="apparmor_parser" [ 10.175074] audit: type=1400 audit(1692740819.365:13): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=491 comm="apparmor_parser" [ 10.218202] RPC: Registered named UNIX socket transport module. [ 10.218204] RPC: Registered udp transport module. [ 10.218205] RPC: Registered tcp transport module. 
[ 10.218206] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 10.322223] scsi 6:0:0:0: Direct-Access QNAP TR-004 DISK00 6108 PQ: 0 ANSI: 6 [ 10.322510] sd 6:0:0:0: Attached scsi generic sg1 type 0 [ 10.322737] sd 6:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). [ 10.322864] sd 6:0:0:0: [sda] 46883930112 512-byte logical blocks: (24.0 TB/21.8 TiB) [ 10.322870] sd 6:0:0:0: [sda] 4096-byte physical blocks [ 10.323181] sd 6:0:0:0: [sda] Write Protect is off [ 10.323187] sd 6:0:0:0: [sda] Mode Sense: 47 00 00 08 [ 10.323500] sd 6:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 10.366735] sd 6:0:0:0: [sda] Attached SCSI disk [ 10.570267] fbcon: Taking over console [ 10.596271] Console: switching to colour frame buffer device 170x48 [ 10.706401] Generic FE-GE Realtek PHY r8169-0-200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-200:00, irq=MAC) [ 10.780320] Lockdown: Xorg: raw io port access is restricted; see man kernel_lockdown.7 [ 10.901951] r8169 0000:02:00.0 enp2s0: Link is Down [ 11.040232] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this. [ 13.279884] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 [ 13.291180] Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7 [ 14.157624] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control rx/tx [ 14.157677] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready [ 14.162293] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off [ 14.162428] r8169 0000:02:00.0 enp2s0: Link is Down [ 15.530004] logitech-hidpp-device 0003:046D:406E.0005: HID++ 4.5 device connected. [ 17.671377] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full - flow control off [ 18.257246] logitech-hidpp-device 0003:046D:406D.0004: HID++ 4.5 device connected. 
[ 22.827616] EXT4-fs (sda): mounted filesystem 62140d4d-9618-435f-8414-18384be15421 with ordered data mode. Quota mode: none. [ 67.980872] show_signal_msg: 18 callbacks suppressed [ 67.980874] GpuWatchdog[2955]: segfault at 0 ip 00007fd0eaf929a6 sp 00007fd0dfdfd370 error 6 in libcef.so[7fd0e6aef000+7770000] likely on CPU 1 (core 1, socket 0) [ 67.980884] Code: 89 de e8 0d ef 6e ff 80 7d cf 00 79 09 48 8b 7d b8 e8 4e 66 2c 03 41 8b 84 24 e0 00 00 00 89 45 b8 48 8d 7d b8 e8 5a d3 b5 fb <c7> 04 25 00 00 00 00 37 13 00 00 48 83 c4 38 5b 41 5c 41 5d 41 5e [ 2809.974520] perf: interrupt took too long (2529 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 [ 6003.062611] perf: interrupt took too long (3175 > 3161), lowering kernel.perf_event_max_sample_rate to 63000 [ 6781.127013] perf: interrupt took too long (3972 > 3968), lowering kernel.perf_event_max_sample_rate to 50250 [ 8278.008471] ------------[ cut here ]------------ [ 8278.008479] NETDEV WATCHDOG: enp2s0 (r8169): transmit queue 0 timed out [ 8278.008510] WARNING: CPU: 1 PID: 3485 at net/sched/sch_generic.c:525 dev_watchdog+0x21f/0x230 [ 8278.008527] Modules linked in: xt_multiport xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc sunrpc binfmt_misc snd_soc_avs snd_hda_codec_hdmi snd_soc_hda_codec snd_hda_ext_core snd_hda_codec_conexant snd_soc_core snd_hda_codec_generic ledtrig_audio snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg intel_rapl_msr snd_intel_sdw_acpi intel_rapl_common snd_hda_codec intel_tcc_cooling snd_hda_core x86_pkg_temp_thermal snd_hwdep intel_powerclamp snd_pcm coretemp snd_seq_midi kvm_intel snd_seq_midi_event uas usb_storage snd_rawmidi mei_pxp mei_hdcp kvm snd_seq irqbypass snd_seq_device joydev hp_wmi rapl snd_timer r8169 nls_iso8859_1 input_leds intel_cstate sparse_keymap mei_me snd realtek serio_raw platform_profile 
ee1004 mei soundcore wmi_bmof mac_hid acpi_pad sch_fq_codel msr parport_pc ppdev lp pstore_blk ramoops parport pstore_zone [ 8278.008682] reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c dm_mirror dm_region_hash dm_log hid_logitech_hidpp i915 hid_logitech_dj drm_buddy i2c_algo_bit ttm drm_display_helper crct10dif_pclmul cec crc32_pclmul rc_core polyval_clmulni hid_generic polyval_generic drm_kms_helper ghash_clmulni_intel usbhid syscopyarea sha512_ssse3 hid sysfillrect aesni_intel nvme sysimgblt crypto_simd ahci i2c_i801 xhci_pci nvme_core drm cryptd psmouse xhci_pci_renesas i2c_smbus libahci nvme_common video wmi [ 8278.008808] CPU: 1 PID: 3485 Comm: bitburner Not tainted 6.2.0-26-generic #26~22.04.1-Ubuntu [ 8278.008816] Hardware name: HP HP ProDesk 400 G4 SFF /82A2, BIOS P08 Ver. 02.46 03/28/2023 [ 8278.008819] RIP: 0010:dev_watchdog+0x21f/0x230 [ 8278.008830] Code: 00 e9 31 ff ff ff 4c 89 e7 c6 05 f5 a9 78 01 01 e8 c6 ff f7 ff 44 89 f1 4c 89 e6 48 c7 c7 08 30 24 93 48 89 c2 e8 31 0b 2c ff <0f> 0b e9 22 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 [ 8278.008836] RSP: 0000:ffffa30488b17db0 EFLAGS: 00010246 [ 8278.008842] RAX: 0000000000000000 RBX: ffff924a5f31c4c8 RCX: 0000000000000000 [ 8278.008847] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 8278.008851] RBP: ffffa30488b17dd8 R08: 0000000000000000 R09: 0000000000000000 [ 8278.008855] R10: 0000000000000000 R11: 0000000000000000 R12: ffff924a5f31c000 [ 8278.008858] R13: ffff924a5f31c41c R14: 0000000000000000 R15: 0000000000000000 [ 8278.008862] FS: 00007fa39d427280(0000) GS:ffff924d62680000(0000) knlGS:0000000000000000 [ 8278.008868] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8278.008872] CR2: 000016980078c380 CR3: 0000000216c98002 CR4: 00000000003706e0 [ 8278.008877] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8278.008880] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8278.008884] 
Call Trace: [ 8278.008888] <TASK> [ 8278.008895] ? __pfx_dev_watchdog+0x10/0x10 [ 8278.008905] call_timer_fn+0x29/0x160 [ 8278.008914] ? __pfx_dev_watchdog+0x10/0x10 [ 8278.008921] __run_timers.part.0+0x1fb/0x2b0 [ 8278.008929] ? ktime_get+0x43/0xc0 [ 8278.008934] ? __pfx_tick_sched_timer+0x10/0x10 [ 8278.008945] ? lapic_next_deadline+0x2c/0x50 [ 8278.008951] ? clockevents_program_event+0xb2/0x140 [ 8278.008959] run_timer_softirq+0x2a/0x60 [ 8278.008966] __do_softirq+0xda/0x330 [ 8278.008972] ? hrtimer_interrupt+0x12b/0x250 [ 8278.008982] __irq_exit_rcu+0xa2/0xd0 [ 8278.008988] irq_exit_rcu+0xe/0x20 [ 8278.008994] sysvec_apic_timer_interrupt+0x43/0xb0 [ 8278.009001] asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 8278.009008] RIP: 0033:0x55be202d9e83 [ 8278.009014] Code: 00 00 00 0c 00 00 00 dd 00 00 00 ff ff ff ff 00 00 00 00 ff ff ff ff 0c 00 00 00 0c 00 00 00 0c 00 00 00 00 00 00 00 8b 59 d0 <49> 03 de f6 43 1b 01 74 05 e9 2f 30 bb 1f 55 48 89 e5 56 57 50 48 [ 8278.009019] RSP: 002b:00007ffd8e384078 EFLAGS: 00000206 [ 8278.009024] RAX: 0000000000000001 RBX: 0000000004c3e909 RCX: 000055be202d9e80 [ 8278.009028] RDX: 00003636000023e1 RSI: 00003636006a2ca1 RDI: 000036360069c629 [ 8278.009032] RBP: 00007ffd8e3840d0 R08: 000036360069c629 R09: 00003636001c4841 [ 8278.009035] R10: 00001698003f3209 R11: 0000000000000011 R12: 000036360000400d [ 8278.009039] R13: 0000169800550000 R14: 0000363600000000 R15: 0000000000000006 [ 8278.009047] </TASK> [ 8278.009050] ---[ end trace 0000000000000000 ]--- [ 8278.038321] r8169 0000:02:00.0 enp2s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100). [ 8278.039685] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8278.041049] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8278.042416] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8278.043805] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). 
[ 8278.045244] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8278.046613] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8278.070550] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 8278.093905] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 8278.117905] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 8388.884453] net_ratelimit: 9 callbacks suppressed [ 8388.884460] r8169 0000:02:00.0 enp2s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100). [ 8388.886144] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8388.887504] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8388.888934] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8388.890386] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8388.891839] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8388.893406] r8169 0000:02:00.0 enp2s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10). [ 8388.920620] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 8388.949110] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100). [ 8388.973078] r8169 0000:02:00.0 enp2s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
I believe I was on kernel 5.15 before the upgrade, and on 5.4 before that. I've been running Mint on this system for a number of years and upgrading it via the Mint upgrader when prompted.
Also submitted to Ubuntu Launchpad: https://bugs.launchpad.net/ubuntu/+bug/2032706
See the comment on the Bugzilla start page: downstream kernels aren't supported here. Please test with a mainline kernel, preferably self-compiled.

Your lspci output is missing relevant information; please use -vv as root.

[ 14.157624] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 14.157677] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
[ 14.162293] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off
[ 14.162428] r8169 0000:02:00.0 enp2s0: Link is Down
[ 17.671377] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full - flow control off

This looks strange and may indicate a physical issue. Please check the RJ45 ports on both sides and the cabling. Ideally, also test with another link partner.

To rule out ASPM-related issues, you can try the following: disable l1_1_aspm and, if that doesn't help, l1_aspm under /sys/class/net/enp2s0/device/link.

Best of all would be to bisect the issue between the last known good kernel and the first troublesome one.
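Before changing anything, it can help to see which ASPM attributes the device exposes and what they are currently set to. The following is a minimal sketch, assuming the interface name enp2s0 from the log above. The helper name show_aspm_states is made up for illustration, and which *_aspm files exist depends on the kernel and the device.

```shell
#!/bin/sh
# Print every ASPM link-state attribute found in a directory together with
# its current value (1 = enabled, 0 = disabled).
# show_aspm_states is a hypothetical helper name, not an existing tool.
show_aspm_states() {
    dir=$1
    for f in "$dir"/*_aspm; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Usage on the affected machine:
#   show_aspm_states /sys/class/net/enp2s0/device/link
```

Reading these files does not need root; writing to them does.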
I've already tried different cables, and bypassed my unmanaged switch using the cable that feeds it, so I'm pretty sure it's not in my infra (although the port itself could be going bad i guess?) Here's the lspci -vv: 02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15) Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 16 Region 0: I/O ports at 3000 [size=256] Region 2: Memory at cc904000 (64-bit, non-prefetchable) [size=4K] Region 4: Memory at cc900000 (64-bit, non-prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [70] Express (v2) Endpoint, MSI 01 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 4096 bytes DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+ LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s (ok), Width x1 (ok) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+ 10BitTagComp- 10BitTagReq- OBFF 
Via message/WAKE#, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled, AtomicOpsCtl: ReqEn- LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS- LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [b0] MSI-X: Enable+ Count=4 Masked- Vector table: BAR=4 offset=00000000 PBA: BAR=4 offset=00000800 Capabilities: [100 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Capabilities: [140 v1] Virtual Channel Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 Arb: Fixed- WRR32- WRR64- WRR128- Ctrl: ArbSelect=Fixed Status: InProgress- VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff Status: NegoPending- InProgress- Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00 Capabilities: [170 v1] Latency Tolerance Reporting Max snoop latency: 3145728ns Max no snoop latency: 3145728ns 
Capabilities: [178 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=150us PortTPowerOnTime=150us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns L1SubCtl2: T_PwrOn=150us Kernel driver in use: r8169 Kernel modules: r8169
OK, so ASPM L1 sub-states are disabled. Then:
- test with a mainline kernel
- disable ASPM L1 via sysfs
- bisect

The following from your log may also play a role:

[ 2809.974520] perf: interrupt took too long (2529 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 6003.062611] perf: interrupt took too long (3175 > 3161), lowering kernel.perf_event_max_sample_rate to 63000
[ 6781.127013] perf: interrupt took too long (3972 > 3968), lowering kernel.perf_event_max_sample_rate to 50250
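The ASPM state referred to here can be read from two lines of the lspci -vv output: LnkCtl (whether ASPM L1 is enabled on the link) and L1SubCtl1 (the L1 sub-states). A small filter for just those lines, as a sketch; aspm_lines is a hypothetical helper name and the device address 02:00.0 is taken from the report above.

```shell
#!/bin/sh
# Keep only the lspci -vv lines that show the ASPM link control state and
# the L1 sub-state control bits. Reads lspci output on stdin.
# aspm_lines is a hypothetical helper name.
aspm_lines() {
    # 'LnkCtl:' with the colon, so that LnkCtl2 is not matched.
    grep -E 'LnkCtl:|L1SubCtl1:'
}

# Usage on the affected machine (root is needed to read the capabilities):
#   sudo lspci -vv -s 02:00.0 | aspm_lines
```

In the output posted above, LnkCtl reports "ASPM L1 Enabled" while every flag in L1SubCtl1 carries a minus sign, which matches the statement that only the sub-states are disabled.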
(In reply to Heiner Kallweit from comment #5)
> OK, so ASPM L1 sub-states are disabled. Then:
> - test with a mainline kernel

To compile your own kernel (which is a requirement), see Documentation/admin-guide/quickly-build-trimmed-linux.rst in the kernel sources (6.4 and later).

> - bisect
>

See Documentation/admin-guide/bug-bisect.rst.
OK, so to disable l1_aspm?

sudo echo 0 > /sys/class/net/enp2s0/device/link/l1_aspm

Also, a question about building a kernel and Secure Boot: Linux Mint supports Secure Boot and it is currently enabled. Can I leave it enabled when building my own kernel, or should I disable it? (I'm thinking of leaving it enabled and disabling it in the BIOS if I can't boot.)

By the way, thanks everyone for helping.
I ended up adding pcie_aspm=off to my GRUB kernel parameters. It looks promising so far, but I'll have to wait a while to see whether the NIC crashes again.
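For reference, on Ubuntu-based systems such as Mint this parameter usually goes into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by sudo update-grub and a reboot. Whether it actually reached the running kernel can be checked against /proc/cmdline. A minimal sketch; has_kernel_param is a hypothetical helper name.

```shell
#!/bin/sh
# has_kernel_param PARAM [FILE]
# Succeed if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline; the optional argument exists so that the
# check can be tried against a saved copy. Hypothetical helper name.
has_kernel_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then an exact fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Usage after rebooting:
#   has_kernel_param pcie_aspm=off && echo "ASPM disabled on the command line"
```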
(In reply to Chris Heath from comment #7)
> ok so to disable l1_aspm?
> sudo echo 0 > /sys/class/net/enp2s0/device/link/l1_aspm
>

exactly
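One caveat with the command as quoted: the > redirection is performed by the calling shell before sudo starts, so it only works from a shell that is already root and otherwise fails with "Permission denied". A common alternative is to pipe the value through sudo tee. The sketch below wraps that in a hypothetical helper, write_attr, with the privilege command passed in a SUDO variable (also an assumption of this example) so the function can be tried on an ordinary file first.

```shell
#!/bin/sh
# write_attr VALUE FILE
# Write VALUE to FILE. The write is done by tee, so running tee under sudo
# gives the write itself root privileges, unlike "sudo echo ... > FILE".
# write_attr and the SUDO variable are hypothetical names for this sketch.
write_attr() {
    printf '%s\n' "$1" | ${SUDO:-} tee "$2" > /dev/null
}

# Usage on the affected machine:
#   SUDO=sudo write_attr 0 /sys/class/net/enp2s0/device/link/l1_aspm
```

A value written to sysfs this way does not survive a reboot, so it has to be reapplied after each boot while testing.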
I'm still having this issue on Debian Sid with 6.6.13 kernel. I believe last working was 6.1.27-1, I still have this kernel on hold. Tried r8169 and r8168-dkms with same result. pcie_aspm=off does not help, nor any other workaround tried, like disabling offloading and limiting to 100Mbps. In my case it's Asus N551JM laptop with this ethernet card: ``` 05:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12) Subsystem: ASUSTeK Computer Inc. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 19 IOMMU group: 15 Region 0: I/O ports at d000 [size=256] Region 2: Memory at f7814000 (64-bit, non-prefetchable) [size=4K] Region 4: Memory at f7810000 (64-bit, non-prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [70] Express (v2) Endpoint, MSI 01 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10W DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+ RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 4096 bytes DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+ LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x1 TrErr- Train- 
SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+ 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled, AtomicOpsCtl: ReqEn- LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [b0] MSI-X: Enable+ Count=4 Masked- Vector table: BAR=4 offset=00000000 PBA: BAR=4 offset=00000800 Capabilities: [d0] Vital Product Data Not readable Capabilities: [100 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00 Capabilities: [170 v1] Latency Tolerance Reporting Max snoop latency: 71680ns Max no snoop latency: 71680ns Capabilities: [178 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=150us PortTPowerOnTime=150us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns L1SubCtl2: T_PwrOn=10us Kernel driver in use: r8169 Kernel modules: r8169 ```
(In reply to Vincas Dargis from comment #10)
> I'm still having this issue on Debian Sid with 6.6.13 kernel.
>
> I believe last working was 6.1.27-1, I still have this kernel on hold.
>
> Tried r8169 and r8168-dkms with same result.
>
> pcie_aspm=off does not help, nor any other workaround tried, like disabling
> offloading and limiting to 100Mbps.
>

You can try disabling ASPM in BIOS or disabling ASPM L1 via sysfs. And if it worked with an earlier kernel, please bisect. However, if the vendor driver also fails, it may indicate a hw issue.
Hello, I'm also having this issue on Gentoo with 6.6.21: Linux akita 6.6.21-gentoo-dist #1 SMP PREEMPT_DYNAMIC Tue Apr 2 22:51:19 CDT 2024 x86_64 AMD Ryzen 9 5900X 12-Core Processor AuthenticAMD GNU/Linux --- 05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05) Subsystem: Gigabyte Technology Co., Ltd RTL8125 2.5GbE Controller Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 24 IOMMU group: 23 Region 0: I/O ports at f000 [size=256] Region 2: Memory at fc600000 (64-bit, non-prefetchable) [size=64K] Region 4: Memory at fc610000 (64-bit, non-prefetchable) [size=16K] Capabilities: <access denied> Kernel driver in use: r8169 Kernel modules: r8169, r8125 --- [ 34.898835] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: Downshift occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling! 
[ 34.898849] r8169 0000:05:00.0 enp5s0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx [ 45.767203] ------------[ cut here ]------------ [ 45.767215] NETDEV WATCHDOG: enp5s0 (r8169): transmit queue 0 timed out 10057 ms [ 45.767228] WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x232/0x240 [ 45.767237] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq nvidia_uvm(PO) nvidia_drm(PO) nvidia_modeset(PO) intel_rapl_msr intel_rapl_common 8021q garp mrp stp llc edac_mce_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi mt7921e snd_usb_audio mt7921_common nvidia(PO) snd_hda_intel snd_usbmidi_lib mt792x_lib snd_intel_dspcfg snd_ump mt76_connac_lib snd_intel_sdw_acpi snd_rawmidi mt76 snd_hda_codec snd_seq_device btusb nls_iso8859_1 snd_hda_core mc btrtl mac80211 snd_hwdep kvm btintel snd_pcm vfat fat irqbypass btbcm libarc4 snd_timer r8125(O) btmtk razermouse(O) snd video gigabyte_wmi rapl wmi_bmof soundcore k10temp cfg80211 bluetooth pcspkr acpi_cpufreq i2c_piix4 rfkill r8169 joydev loop fuse nfnetlink crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 sp5100_tco ccp nvme nvme_core nvme_common wmi [ 45.767411] CPU: 6 PID: 0 Comm: swapper/6 Tainted: P O 6.6.21-gentoo-dist #1 [ 45.767416] Hardware name: Gigabyte Technology Co., Ltd. 
X570S AORUS ELITE AX/X570S AORUS ELITE AX, BIOS F8 03/22/2024 [ 45.767419] RIP: 0010:dev_watchdog+0x232/0x240 [ 45.767424] Code: ff ff ff 48 89 df c6 05 9e a0 23 01 01 e8 c6 0b fa ff 45 89 f8 44 89 f1 48 89 de 48 89 c2 48 c7 c7 60 9f be 9c e8 de 35 24 ff <0f> 0b e9 2d ff ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 [ 45.767429] RSP: 0018:ffffc900003c0e78 EFLAGS: 00010286 [ 45.767434] RAX: 0000000000000000 RBX: ffff8881060d4000 RCX: 0000000000000000 [ 45.767438] RDX: ffff88881ebae340 RSI: ffff88881eba1580 RDI: 00000000000400f6 [ 45.767441] RBP: ffff8881060d44c8 R08: 0000000000000000 R09: ffffc900003c0d00 [ 45.767444] R10: 0000000000000003 R11: ffffffff9cf46328 R12: ffff88810e284a00 [ 45.767447] R13: ffff8881060d441c R14: 0000000000000000 R15: 0000000000002749 [ 45.767451] FS: 0000000000000000(0000) GS:ffff88881eb80000(0000) knlGS:0000000000000000 [ 45.767455] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.767458] CR2: 0000559bb230db78 CR3: 000000010616c000 CR4: 0000000000f50ee0 [ 45.767462] PKRU: 55555554 [ 45.767465] Call Trace: [ 45.767469] <IRQ> [ 45.767472] ? dev_watchdog+0x232/0x240 [ 45.767477] ? __warn+0x81/0x130 [ 45.767485] ? dev_watchdog+0x232/0x240 [ 45.767490] ? report_bug+0x171/0x1a0 [ 45.767496] ? apic_mem_wait_icr_idle+0x14/0x20 [ 45.767503] ? handle_bug+0x3c/0x80 [ 45.767508] ? exc_invalid_op+0x17/0x70 [ 45.767512] ? asm_exc_invalid_op+0x1a/0x20 [ 45.767522] ? dev_watchdog+0x232/0x240 [ 45.767527] ? dev_watchdog+0x232/0x240 [ 45.767531] ? __pfx_dev_watchdog+0x10/0x10 [ 45.767536] call_timer_fn+0x27/0x130 [ 45.767543] ? 
__pfx_dev_watchdog+0x10/0x10 [ 45.767547] __run_timers+0x222/0x2c0 [ 45.767555] run_timer_softirq+0x1d/0x40 [ 45.767560] __do_softirq+0xd4/0x2c8 [ 45.767568] __irq_exit_rcu+0xa6/0xc0 [ 45.767574] sysvec_apic_timer_interrupt+0x72/0x90 [ 45.767579] </IRQ> [ 45.767582] <TASK> [ 45.767586] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 45.767590] RIP: 0010:cpuidle_enter_state+0xcc/0x440 [ 45.767595] Code: 4a 40 0b ff e8 c5 f3 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 83 47 0a ff 45 84 ff 0f 85 56 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d [ 45.767599] RSP: 0018:ffffc900001b7e90 EFLAGS: 00000246 [ 45.767604] RAX: ffff88881ebb3e40 RBX: ffff888100ddc400 RCX: 000000000000001f [ 45.767607] RDX: 0000000000000006 RSI: 00000000229837f7 RDI: 0000000000000000 [ 45.767610] RBP: 0000000000000002 R08: 0000000000000002 R09: 0000000000000401 [ 45.767613] R10: 0000000000000008 R11: ffff88881ebb27a4 R12: ffffffff9d077d40 [ 45.767616] R13: 0000000aa7eff1b9 R14: 0000000000000002 R15: 0000000000000000 [ 45.767625] cpuidle_enter+0x2d/0x40 [ 45.767631] do_idle+0x1d8/0x230 [ 45.767638] cpu_startup_entry+0x2a/0x30 [ 45.767642] start_secondary+0x11e/0x140 [ 45.767647] secondary_startup_64_no_verify+0x18f/0x19b [ 45.767656] </TASK> [ 45.767659] ---[ end trace 0000000000000000 ]--- [ 4316.702225] r8169 0000:05:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM I've been losing my Internet connection every time my computer reboots. If it helps, I noticed that I can reliably get an Internet connection if I turn off my computer, unplug (power off) my unmanaged switch, then turn my computer on, and then plug the switch back in. I also noticed this happening in the dhcpcd logs: Apr 02 19:15:13 akita systemd[1]: Started Lightweight DHCP client daemon. 
Apr 02 19:15:14 akita dhcpcd[3282]: enp5s0: failed to request the lease
Apr 02 19:15:14 akita dhcpcd[3282]: enp5s0: soliciting a DHCP lease
Apr 02 19:15:18 akita dhcpcd[3282]: enp5s0: offered 192.168.1.243 from 192.168.1.254
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: failed to request the lease
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: soliciting a DHCP lease
Apr 02 19:15:23 akita dhcpcd[3282]: enp5s0: offered 192.168.1.243 from 192.168.1.254
Apr 02 19:15:28 akita dhcpcd[3282]: enp5s0: failed to request the lease

I've tried the vendor driver (net-misc/r8125) and the latest available kernel (6.8.2 in testing, although I just noticed 6.8.3 is available) and neither seemed to help. Also tried setting pcie_aspm=off in my kernel options and that didn't seem to do anything.

I can try bisecting some time next week if it helps, although I'm not sure where to start since this driver has never worked for me. Prior to this I was using Debian 12 and having the same issue there.
> although I'm not sure where to start since this driver has never worked for
> me.

Please try bisecting between 6.1 and 6.2. This Ethernet card worked for me up until 6.2. My laptop is almost 10 years old; I have used it with Ubuntu, Debian stable, and now Sid. Only on Sid, with kernels of roughly 6.2 and later, did I start to get issues. Sorry, but I haven't found the energy to start bisecting myself yet.
(In reply to Sean Haugh from comment #12)
> Hello, I'm also having this issue on Gentoo with 6.6.21:
> [...]
> [ 34.898835] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: Downshift
> occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
> [ 34.898849] r8169 0000:05:00.0 enp5s0: Link is Up - 100Mbps/Full
> (downshifted) - flow control rx/tx
> [...]
> I've been losing my Internet connection every time my computer reboots. If
> it helps, I noticed that I can reliably get an Internet connection if I turn
> off my computer, unplug (power off) my unmanaged switch, then turn my
> computer on, and then plug the switch back in. I also noticed this happening
> in the dhcpcd logs:
> [...]
> I've tried the vendor driver (net-misc/r8125) and the latest available
> kernel (6.8.2 in testing, although I just noticed 6.8.3 is available) and
> neither seemed to help. Also tried setting pcie_aspm=off in my kernel
> options and that didn't seem to do anything.
>

A downshift typically indicates a physical issue. Best re-test with different cable and different switch, or at least different switch port.
And ensure that firmware for the NIC is loaded. ethtool -i <if> will tell you.
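For anyone who wants to script that check, here is a minimal Python sketch that parses the text printed by `ethtool -i <if>` and looks at the `firmware-version` field. The function names are made up for illustration, and treating an empty or `N/A` value as "no firmware loaded" is my assumption, not something the driver documents.

```python
def parse_ethtool_info(text):
    """Parse the output of `ethtool -i <if>` into a dict of key -> value."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI
        # address in bus-info ("0000:05:00.0") stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def firmware_loaded(text):
    """Return True if the firmware-version field is non-empty.

    Assumption: an empty or "N/A" firmware-version means that no
    firmware file was loaded for the NIC.
    """
    fw = parse_ethtool_info(text).get("firmware-version", "")
    return fw not in ("", "N/A")
```

To use it, pass in the captured output of the command, for example `subprocess.run(["ethtool", "-i", "enp5s0"], capture_output=True, text=True).stdout`.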
I tried 6.8.3, 6.1.81, and 5.15.151 without any luck. The cable is unfortunately inside the wall, but I did try a different switch port; no luck there either. I can try moving my computer downstairs and connecting it directly to the ISP router with a better cable, but that'll have to wait for another day.

--
FYR:
driver: r8169
version: 6.6.21-gentoo-dist
firmware-version: rtl8125b-2_0.0.2 07/13/20
expansion-rom-version:
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
I'm using the upstream 6.8 kernel now (from git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git) and although I'm still getting downshifting, I'm not seeing the "transmit queue 0 timed out" message anymore.
I have a similar problem, detected since kernel 6.8, now using 6.9.1:

[ 1.342197] r8169 0000:02:00.0 eth0: RTL8168h/8111h, 84:47:09:0e:ba:2c, XID 541, IRQ 133
[ 1.342777] r8169 0000:02:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[ 1.363036] r8169 0000:03:00.0 eth1: RTL8168h/8111h, 84:47:09:0e:ba:2d, XID 541, IRQ 134
[ 1.363574] r8169 0000:03:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[ 1.833163] r8169 0000:02:00.0 enp2s0: renamed from eth0
[ 1.834953] r8169 0000:03:00.0 enp3s0: renamed from eth1
[ 10.932526] Generic FE-GE Realtek PHY r8169-0-200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-200:00, irq=MAC)
[ 11.138959] r8169 0000:02:00.0 enp2s0: Link is Down
[ 11.218894] r8169 0000:02:00.0 enp2s0: entered allmulticast mode
[ 11.219028] r8169 0000:02:00.0 enp2s0: entered promiscuous mode
[ 27.226344] Generic FE-GE Realtek PHY r8169-0-200:00: Downshift occurred from negotiated speed 1Gbps to actual speed 100Mbps, check cabling!
[ 27.226478] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx
[108133.125832] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8587 ms
[108133.134207] r8169 0000:02:00.0 enp2s0: ASPM disabled on Tx timeout
[108133.158610] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108133.176558] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108143.152459] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 3: transmit queue 0 timed out 8987 ms
[108143.175917] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108143.197345] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108153.178824] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 9974 ms
[108163.205511] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8234 ms
[108163.224605] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108163.238986] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[108173.018734] r8169 0000:02:00.0 enp2s0: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 8720 ms
[108173.041848] r8169 0000:02:00.0 enp2s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[108173.060345] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).

# ethtool -i enp2s0
driver: r8169
version: 6.9.1-i7
firmware-version: rtl8168h-2_0.0.2 02/26/15
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

I tried different network cables and got the same problem. It starts dumping the errors after some hours connected to the network, but I'm not sure what triggers it: sometimes it takes a few days, sometimes only a few minutes.

I cannot disable ASPM:

# find /sys/class/net/enp2s0/device/ | grep aspm || echo no aspm
no aspm

The machine has two network interfaces, although I'm only using one of them.
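In case it helps others run the same ASPM check, here is a rough Python sketch that reads the per-device ASPM attributes from sysfs. It assumes they live in a `link/` subdirectory of the PCI device directory (files such as `l1_aspm`), which as far as I know is only exposed on newer kernels and only when the OS actually has ASPM control. The names `ASPM_ATTRS` and `read_aspm_state` are my own, not a kernel or library API.

```python
from pathlib import Path

# Attribute names assumed from the kernel's sysfs ABI for PCIe links.
ASPM_ATTRS = ("l0s_aspm", "l1_aspm", "l1_1_aspm", "l1_2_aspm")


def read_aspm_state(device_dir):
    """Return {attribute: value} for the ASPM files found under <device_dir>/link.

    An empty dict means no ASPM attributes are exposed, which matches
    the "no aspm" result of the find/grep check.
    """
    link = Path(device_dir) / "link"
    state = {}
    for name in ASPM_ATTRS:
        attr = link / name
        if attr.is_file():
            state[name] = attr.read_text().strip()
    return state
```

Calling `read_aspm_state("/sys/class/net/enp2s0/device")` on a machine where nothing is exposed would simply return an empty dict.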
A downshift always indicates a physical issue. It doesn't have to be the cable, the trouble could also be caused by e.g. the RJ45 port on either side.
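To see whether your own logs contain a downshift, a small Python sketch like the one below can scan saved `dmesg` output. The pattern is taken from the message wording quoted in this thread; the function name is only illustrative, and other kernel versions may word the message differently.

```python
import re

# Matches e.g. "Downshift occurred from negotiated speed 1Gbps to actual
# speed 100Mbps, check cabling!" as quoted in the logs in this thread.
DOWNSHIFT_RE = re.compile(
    r"Downshift occurred from negotiated speed ([\w.]+) to actual speed ([\w.]+)"
)


def find_downshifts(log_text):
    """Return a list of (negotiated_speed, actual_speed) pairs found in the log."""
    return DOWNSHIFT_RE.findall(log_text)
```

An empty list after several link-up events would suggest the link is coming up at the negotiated speed.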
After Debian Sid upgraded to 6.9.7-1, I no longer see this timeout (it was still an issue in 6.8.x)! I left the computer running overnight (with torrents, bitcoind, and whatever else running), and I see no timeouts or crashes in journalctl -k; the network still works fine!
It seems I do still see this in logs:
```
Jul 07 17:18:09 kernel: r8169 0000:05:00.1 enp5s0f1: NETDEV WATCHDOG: CPU: 6: transmit queue 0 timed out 10484 ms
Jul 07 17:18:09 kernel: r8169 0000:05:00.1: can't disable ASPM; OS doesn't have ASPM control
Jul 07 17:18:09 kernel: r8169 0000:05:00.1 enp5s0f1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
```
But network is still usable. Still no "[ cut here ]" crashes printed.
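If you want to track how often these timeouts occur between kernel versions, here is a minimal Python sketch that pulls the watchdog entries out of saved `journalctl -k` or `dmesg` text. It assumes the message format shown in this thread ("NETDEV WATCHDOG: CPU: N: transmit queue N timed out N ms"); older kernels print a different format that this pattern will not match. The function name is hypothetical.

```python
import re

# Format as seen in the logs quoted in this thread (6.8/6.9 kernels).
WATCHDOG_RE = re.compile(
    r"(\S+): NETDEV WATCHDOG: CPU: (\d+): transmit queue (\d+) timed out (\d+) ms"
)


def find_tx_timeouts(log_text):
    """Return a list of (interface, cpu, queue, timeout_ms) tuples."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in WATCHDOG_RE.finditer(log_text)
    ]
```

Counting the returned tuples per boot gives a simple number to compare before and after a kernel upgrade.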