Bug 209839

Summary: r8169 (RTL8125B): "rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100)" and connectivity loss, caused by small fragmented datagrams
Product: Drivers Reporter: WGH (wgh)
Component: NetworkAssignee: drivers_network (drivers_network)
Status: REOPENED ---    
Severity: normal CC: curtdept, haithcockce, hkallweit1, lubosd, mparka, priit, rm+bko, xplo.bn, ykonotopov
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.9.1 Subsystem:
Regression: No Bisected commit-id:
Attachments: pcap when the the hang occured
trace when the hang occured
distilled hang pcap
full dmesg
30 packets to cause crash
Test patch v1
Test patch v2
Dmesg output
Kernel 5.15 and 6.2 vs. r8169 and r8125
Test results - kernel 6.1.0
Test results - kernel 6.4.0-rc1

Description WGH 2020-10-24 10:15:05 UTC
This is a new RTL8125B chip that is supported only since Linux 5.9.

eth0: RTL8125B, [mac address redacted], XID 641, IRQ 42

I can trigger this problem semi-reliably by attempting to connect to a Windows 7 VM via RDP running on this machine. The VM in question uses a macvtap (I'll check with a bridge whether that's something important or not, and report back).

The general symptoms are that the connectivity is completely lost on the host machine, and it recovers itself after a minute a so.

The first time a got a generic stacktrace:

[94171.732285] ------------[ cut here ]------------
[94171.732301] NETDEV WATCHDOG: enp6s0 (r8169): transmit queue 0 timed out
[94171.732319] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x232/0x240
[94171.732321] Modules linked in: macvtap macvlan tap veth xt_MASQUERADE xt_CHECKSUM xt_comment bridge stp llc ip6table_raw ip6table_nat iptable_raw iptable_nat bpfilter fuse btrfs blake2b_generic xor zstd_compress lzo_compress raid6_pq sctp kvm_amd kvm amdgpu irqbypass mfd_core gpu_sched ttm ghash_clmulni_intel nct6775 hwmon_vid k10temp efivarfs
[94171.732345] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1-gentoo #6
[94171.732348] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B550 Extreme4, BIOS P1.20 08/13/2020
[94171.732353] RIP: 0010:dev_watchdog+0x232/0x240
[94171.732357] Code: 85 c0 75 e5 eb 9c 4c 89 ef c6 05 c5 e6 c9 00 01 e8 b3 1d fc ff 44 89 e1 48 89 c2 4c 89 ee 48 c7 c7 c8 12 a8 91 e8 1e 9d 7e ff <0f> 0b e9 7a ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48 c7 47
[94171.732360] RSP: 0018:ffff9e8d00003ea0 EFLAGS: 00010286
[94171.732363] RAX: 0000000000000000 RBX: ffff8c1aa331d600 RCX: 0000000000000000
[94171.732365] RDX: ffff8c1aaea278a0 RSI: ffff8c1aaea17820 RDI: 0000000000000300
[94171.732368] RBP: ffff8c1aa306a440 R08: ffff8c1aaea17820 R09: 00000000000006e3
[94171.732370] R10: ffffffff922cdd78 R11: ffff9e8d00003d48 R12: 0000000000000000
[94171.732372] R13: ffff8c1aa306a000 R14: ffff8c1aa306a440 R15: 0000000000000000
[94171.732374] FS:  0000000000000000(0000) GS:ffff8c1aaea00000(0000) knlGS:0000000000000000
[94171.732377] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[94171.732380] CR2: 00007f8d50009048 CR3: 0000001e9b98a000 CR4: 0000000000350ef0
[94171.732382] Call Trace:
[94171.732385]  <IRQ>
[94171.732390]  ? qdisc_put_unlocked+0x30/0x30
[94171.732395]  call_timer_fn+0x2d/0x130
[94171.732399]  run_timer_softirq+0x393/0x450
[94171.732403]  ? tick_sched_handle.isra.0+0x40/0x40
[94171.732405]  ? __hrtimer_run_queues+0xfd/0x260
[94171.732408]  ? ktime_get+0x4a/0xc0
[94171.732412]  __do_softirq+0xe1/0x2bf
[94171.732416]  asm_call_irq_on_stack+0x12/0x20
[94171.732418]  </IRQ>
[94171.732423]  do_softirq_own_stack+0x36/0x40
[94171.732427]  irq_exit_rcu+0x9a/0xa0
[94171.732432]  sysvec_apic_timer_interrupt+0x2e/0x80
[94171.732435]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[94171.732440] RIP: 0010:cpuidle_enter_state+0xd5/0x380
[94171.732443] Code: c4 0f 1f 44 00 00 31 ff e8 88 77 91 ff 80 7c 24 0f 00 74 12 9c 58 f6 c4 02 0f 85 8d 02 00 00 31 ff e8 7f c0 96 ff fb 45 85 f6 <0f> 88 20 01 00 00 49 63 c6 be 68 00 00 00 4c 2b 24 24 48 89 c2 48
[94171.732446] RSP: 0018:ffffffff91c03e58 EFLAGS: 00000202
[94171.732448] RAX: ffff8c1aaea2a5c0 RBX: ffff8c1aa5d6b000 RCX: 000000000000001f
[94171.732451] RDX: 0000000000000000 RSI: 0000000021bf5c7a RDI: 0000000000000000
[94171.732453] RBP: ffffffff91cdce00 R08: 000055a610a6578d R09: ffff8c1aa77c9000
[94171.732455] R10: ffff8c1aaea29584 R11: ffff8c1aaea29564 R12: 000055a610a6578d
[94171.732458] R13: ffffffff91cdcee8 R14: 0000000000000002 R15: ffff8c1aa5d6b000
[94171.732463]  ? cpuidle_enter_state+0xb8/0x380
[94171.732466]  cpuidle_enter+0x37/0x60
[94171.732470]  do_idle+0x1c9/0x240
[94171.732473]  cpu_startup_entry+0x19/0x20
[94171.732476]  start_kernel+0x50a/0x52c
[94171.732480]  secondary_startup_64+0xa4/0xb0
[94171.732484] ---[ end trace 896922ae98389a20 ]---

However, this stack trace was also followed by 
[94171.744374] r8169 0000:06:00.0 enp6s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[94171.752789] r8169 0000:06:00.0 enp6s0: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).

The next times when this was triggered, no stack trace was reported, but the these two messages invariably appeared.

I didn't this problem before installing a VM and trying to connect to it, even though I had relatively heavy network activity for a week (it's a new machine).

I'll continue debugging, and try to eliminate some extra variables. As Heiner Kallweit suggested in bug 203703, comment 27, I'll also try various 5.9-rc kernels, though bisecting might take some time.
Comment 1 WGH 2020-10-24 10:44:45 UTC
The problem occurs when both VM and host bridged through a bridge interface as well, so that likely rules macvtap problem out.
Comment 2 Heiner Kallweit 2020-10-27 20:19:49 UTC
On tx timeout a chip reset is done and connectivity should recover. However, depending on how confused the chip is, even this may not help 100% (indicated by the error messages that tx/rx fifo's aren't empty).

I wonder what could be special with RDP traffic. It should be normal TCP traffic. What may help to analyze the issue:
- trace when the tx timeout occurs. Enable trace events (under /sys/kernel/debug/tracing/events/net):
  - net_dev_start_xmit
  - net_dev_xmit
  - net_dev_xmit_timeout
- a capture of the network traffic when tx times out
Comment 3 WGH 2020-10-28 02:00:53 UTC
Created attachment 293271 [details]
pcap when the the hang occured
Comment 4 WGH 2020-10-28 02:01:18 UTC
Created attachment 293273 [details]
trace when the hang occured
Comment 5 WGH 2020-10-28 02:01:53 UTC
The hang didn't occur immediately, it appears it only happens almost as soon as the desktop is rendered for the first time, and rarely much later.

The network topology is as follows: br0 bridge interface with enp6s0 (ethernet adapter) and vnet0 (VM) enslaved. I captured the traffic on the enp6s0 interface.

I have edited the trace by adding the realtime column so it's easier to match trace with the pcap.

I have randomized MAC addresses in the pcap, and removed a stray IPv6 multicast packet that has my public address in it. I can send you an unedited dump to you privately, if you need it.

I didn't fiddle with ethtool settings, everything was left at their defaults (I know that GSO/TSO sometimes produce funny oversized packets in pcaps, though it doesn't seem to be the case this time).
Comment 6 Heiner Kallweit 2020-10-28 10:08:23 UTC
Thanks for the logs. Only thing that looks interesting at the first glance is the fragmented packets in the pcap log. Not sure however whether this could conflict with the following in the r8169 driver:

		/* The driver does not support incoming fragmented frames.
		 * They are seen as a symptom of over-mtu sized frames.
		 */
		if (unlikely(rtl8169_fragmented_frame(status))) {
			dev->stats.rx_dropped++;
			dev->stats.rx_length_errors++;
			goto release_descriptor;
		}
Comment 7 WGH 2020-10-29 11:21:59 UTC
Your intuition was right, it does appear it has something to do with fragmented packets.

I have distilled the test case to 3 packets you can simply send with tcpreplay to get the hang with 100% reliability (I wish every driver/hardware bug was like that!).

(As a side-note, I find it odd that Windows decided to use fragmented IP datagrams. To my knowledge, IP fragmentation is something that is usually avoided. It seems that the qemu device advertises TSO support, but instead of splitting it into TCP segments, oversized packets gets to the bridge device, and gets fragmented on Ethernet output. Which is likely a bug on its own, but still, IP fragments is not something that should hang the network card driver. Disabling Large Send Offload on the Windows side prevents this fragmentation)
Comment 8 WGH 2020-10-29 11:24:01 UTC
Created attachment 293293 [details]
distilled hang pcap

sudo tcpreplay -i enp6s0 distilled_hang1.pcap
Comment 9 Heiner Kallweit 2020-10-29 12:16:57 UTC
The tx timeout is triggered by the net core if the tx queue has been stopped for too long and there are pending packets. Can't say how this relates to the dropped incoming fragmented packets, IOW: who stops the tx queue.
The fragmented rx packets are dropped only in the driver, therefore I'd expect that the rx/tx hw should be fine. Well, except this chip version has a hw issue with fragmented rx packets.
Not sure whether an upper layer could have stopped the tx queue due to the missing (dropped) packets.
Comment 10 Heiner Kallweit 2020-10-29 12:18:31 UTC
What we could do fow now is adding a WARN_ONCE if we detect a fragmented incoming packet, knowing that this can cause trouble.
Comment 11 WGH 2020-10-29 12:26:31 UTC
I don't really have the expertise with this driver and hardware, but at the first glance that particular comment about fragmented frames seems to be concerned with receiving large Ethernet frames when each ring buffer entry is too small (smaller than actual MTU). Fragmentation on the IP level has nothing to do with it. Unless there's some weird fragmentation/segmentation offload interaction that causes trouble, of course.
Comment 12 WGH 2020-10-29 12:28:51 UTC
Oh, just to clarify, I'm running tcpreplay on the machine that has the RTL8125B network card, i.e. I'm sending, not receiving.
Comment 13 Heiner Kallweit 2020-11-02 10:12:47 UTC
To verify whether the mentioned fragmented packets rx path is hit:
If the issue occurs, do you see rx length errors here?
/sys/class/net/<if>/statistics/rx_length_errors
Comment 14 WGH 2020-11-02 10:54:07 UTC
It's zero. The only non-zero, in fact, are multicast, rx_bytes, rx_packets, tx_bytes and tx_packets.
Comment 15 WGH 2020-11-02 11:04:22 UTC
FWIW, tcpreplaying the pcap from other machine, and watching tcpdump on the affected one, has no ill effects. Packets are visible in tcpdump, but nothing else happens. No error counter increase as well.
Comment 16 Heiner Kallweit 2020-11-02 11:46:16 UTC
Ah, then we don't talk about the rx path. If you tcpreplay the "distilled hang" pcap, does the traffic look ok on the wire?
Comment 17 WGH 2020-11-02 11:52:20 UTC
> Ah, then we don't talk about the rx path.

Sorry for not making it more clear in the beginning.

> If you tcpreplay the "distilled hang" pcap, does the traffic look ok on the
> wire?

The other machine reliably sees only the first 2 packets (out of 3).
Comment 18 WGH 2020-11-02 11:56:55 UTC
I mean it's always 2 out of 3, not that the 3rd packet appears unrealiably.
Comment 19 Heiner Kallweit 2020-11-02 13:51:55 UTC
OK. Hmm, maybe a HW bug. Can you test whether the same issue occurs with the r8125 vendor driver?
There's no errata info available from Realtek, but I can check with a contact in Realtek whether they are aware of any such issue.
Comment 20 WGH 2020-11-02 15:25:34 UTC
It's exactly the same with the out-of-tree r8125, except it doesn't say anything in dmesg when the hang happens, and never recovers on its own.
Comment 21 Heiner Kallweit 2020-11-02 19:39:07 UTC
Realtek was able to reproduce the issue. RTL8125B may suffer from a similar issue with short packets like RTL8168evl.
Could you please check whether the following fixes the issue for you?

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 7e0947e29..375f451cc 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4053,7 +4053,13 @@ static int rtl8169_xmit_frags(struct rtl8169_private *tp, struct sk_buff *skb,
 
 static bool rtl_test_hw_pad_bug(struct rtl8169_private *tp, struct sk_buff *skb)
 {
-	return skb->len < ETH_ZLEN && tp->mac_version == RTL_GIGA_MAC_VER_34;
+	switch (tp->mac_version) {
+	case RTL_GIGA_MAC_VER_34:
+	case RTL_GIGA_MAC_VER_63:
+		return skb->len < ETH_ZLEN;
+	default:
+		return false;
+	}
 }
 
 static void rtl8169_tso_csum_v1(struct sk_buff *skb, u32 *opts)
-- 
2.29.2
Comment 22 WGH 2020-11-02 19:46:22 UTC
Seems to help! No hangs, and the other machine receives all 3 packets fine as well.
Comment 23 Heiner Kallweit 2020-11-02 19:54:28 UTC
Great! Then I'll check with Realtek whether other chip versions may be affected too before submitting the change.

This way bug analysis is fun: Realtek responds within 2 hrs, and you test within 10 mins.
Comment 24 WGH 2020-11-02 19:56:44 UTC
My own e-mail to realtek that I sent earlier today, given the lack of positive delivery status notification, seems to be devnulled, though :)
Comment 25 Heiner Kallweit 2020-11-10 07:16:35 UTC
Fixed with 2aaf09a0e784 ("r8169: work around short packet hw bug on RTL8125").
Comment 26 WGH 2020-11-26 15:53:32 UTC
The fix was released in Linux 5.9.7.
Comment 27 Priit O. 2020-12-17 11:11:30 UTC
Was this fix removed in 5.10.0 RC5?
I still get this error regularly when torrenting. Have not tried 5.9 because ... ryzen 5950x and it works better with 5.10.

[1274670.281163] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[1274670.289901] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[1274680.304429] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[1274680.313171] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[1275010.329033] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[1275010.337773] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[1275020.355642] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[1275020.364378] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[1275375.340078] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[1275375.348817] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[1275390.273322] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[1275390.282058] r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).

$ lspci | grep RTL
25:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 04)
Comment 28 Heiner Kallweit 2020-12-17 11:18:54 UTC
Not sure whether the fix made it to 5.10-rc5 or to a later rc version only. Please test with 5.10.1. There the fix is included.
Comment 29 Priit O. 2021-01-05 05:54:47 UTC
Nope, not fixed. It's still here.
jaan  05 07:50:16 Zen kernel: r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
jaan  05 07:50:16 Zen kernel: r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
jaan  05 07:51:56 Zen kernel: r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
jaan  05 07:51:56 Zen kernel: r8169 0000:25:00.0 huginn: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[deemon@Zen ~]$ uname -a
Linux Zen 5.10.2-2-MANJARO #1 SMP PREEMPT Tue Dec 22 08:14:42 UTC 2020 x86_64 GNU/Linux
Comment 30 Heiner Kallweit 2021-01-05 07:40:26 UTC
The patch from comment 21 is included in 5.10.2. And in comment 22 it was confirmed that it fixes the issue.
So maybe we talk about something different here. Do you also see the transmit queue timeouts? Because the log messages are triggered from the chip reset procedure after a tx timeout.

Best test with a mainline non-preempt kernel to rule out that the issue is caused by downstream kernel patches.
Comment 31 Priit O. 2021-01-05 08:41:46 UTC
This is Manjaro default "mainline" kernel. There is also "real-time" variant for audio production people and such. No non-preempt options.

I have no idea what this "transmit queue" even is, but I do lose connection when this thing is happening (pings get lost for like 5-15 seconds.)

But maybe I don't have this "RTL8125B" but just "RTL8125" without B?
$ lspci | grep -i realtek
25:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 04)

Also no wizard to create myself any such packets for testing, but for me the problem appears every time I use qbittorrent to download stuff. Quite easy way for me to check if the problem is still there.
Comment 32 Heiner Kallweit 2021-01-05 09:47:09 UTC
Are you sure that Manjaro didn't add own patches? Best compile a mainline kernel yourself from kernel.org sources.
And please attach full dmesg output.
Comment 33 Priit O. 2021-01-05 10:02:06 UTC
Created attachment 294501 [details]
full dmesg
Comment 34 Heiner Kallweit 2021-01-05 10:19:00 UTC
OK, so it is a RTL8125B. Please provide the output of "ethtool -i huginn" to check for the right firmware version being loaded.
Comment 35 Priit O. 2021-01-05 10:51:49 UTC
$ ethtool -i huginn
driver: r8169
version: 5.10.2-2-MANJARO
firmware-version: rtl8125b-2_0.0.2 07/13/20
expansion-rom-version: 
bus-info: 0000:25:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
Comment 36 Heiner Kallweit 2021-01-05 11:07:46 UTC
All this looks normal. Just strange that the link breaks immediately once it's established. You could re-test with a different cable and best also a different link partner. Maybe it's a platform issue as your dmesg shows several other errors. It would be helpful if you could check whether any special network packets trigger the issue.
Last but not least please check whether the same happens with the r8125 vendor driver.
Comment 37 Priit O. 2021-01-05 11:28:49 UTC
I have never experienced such problem when qbittorrent is not running.
And I have NAS, so I move big files over the network quite often back and forth and never have this link drop problem then -- this should eliminate the cable problem (sadly can't swap it easily out, as it is kind of built into the wall).
Link partner is my only router, so not really? I could put some switch in-between, but I doubt it will change much.

About platform errors -- maybe; Yes, Zen3 is pretty raw also still; and I am eagerly waiting for MSI to provide newer stable BIOSes, but they are slow and release only betas right now, so... don't feel like being their beta tester with possibility to brick the board :D (they just yesterday released newer beta, with "Update to ComboAM4PIV2 1.1.9.0" but ... it's still beta sadly.

About special network packets -- I have no idea how. If you provide some more handholding or guide with what packets to test and how, I could try, maybe.
Comment 38 WGH 2021-01-05 11:37:48 UTC
>About special network packets -- I have no idea how. If you provide
>some more
>handholding or guide with what packets to test and how, I could try,
>maybe.

The tcpreplay command I posted earlier is literally all you need to run.
Comment 39 Priit O. 2021-01-05 11:42:05 UTC
ah yes, and I never experienced this link drop problem with my other (previous) computer in the same router port with the same cable with qbittorrent. So I assume it's either this NIC or the NIC drivers issue, not the "other end of the link". But I can do the switch in-between test later. Also will look into the tcpreplay thing a bit later when back at home.
Comment 40 Priit O. 2021-01-05 14:34:11 UTC
Downloaded the distilled_hang1.pcap and ran the command on the NIC. 20 times. No hang. How can I create my own version? :)
Comment 41 WGH 2021-01-05 14:53:44 UTC
It's Heiner's call, but I'd say that yours is a different bug.

> How can I create my own version?

Capture the network traffic with Wireshark or tcpdump when the hang occurs. You'll likely notice TCP retransmissions when it happens. If tcpreplaying it gives the same problem - good news, you're mostly there. Wireshark lets you export a selected subset of packets to a new file, which may help to minimize the testcase.

If _incoming_ traffic causes the problem, then you might need to replay the traffic from a different machine.
Comment 42 Priit O. 2021-01-05 17:30:46 UTC
it's kind of stupid to find the correct package because tcpreplay fails to deal with big packages (does not fragment them?) and just dies. Is there some flag or trick I can use to remedy that?

Warning: Unable to send packet: Error with PF_PACKET send() [119]: Message too long (errno = 90)
Actual: 118 packets (87206 bytes) sent in 0.009997 seconds
Rated: 8723216.9 Bps, 69.78 Mbps, 11803.54 pps
Flows: 19 flows, 1900.57 fps, 119 flow packets, 0 non-flow
Statistics for network device: huginn
	Successful packets:        118
	Failed packets:            1
	Truncated packets:         0
	Retried packets (ENOBUFS): 0
	Retried packets (EAGAIN):  0
Comment 43 WGH 2021-01-05 17:40:16 UTC
This might be caused by TSO/GSO.

(When TSO is active, the in-kernel TCP/IP stack implementation sends seemingly oversized TCP segments to the network card, which cuts it up to appropriately sized segments, offloading the segmentation to the network card instead of CPU. Note that it has nothing to do with IP fragmentation. As a side effect, tcpdump usually captures such impossibly large IP datagrams, even though they will appear normal on the wire.)

Try to temporarily disable TSO/GSO with ethtool, that should fix the capture.
Comment 44 Priit O. 2021-01-05 18:01:28 UTC
Created attachment 294519 [details]
30 packets to cause crash

probably takes less than all those 30.
Comment 45 WGH 2021-01-05 18:41:05 UTC
Yup, I can also reproduce it with your pcap.
Comment 46 WGH 2021-01-05 19:02:21 UTC
The vendor r8125 driver hangs the same way. However, their driver doesn't even have a fix for my "distilled pcap" yet.

(After several rmmod/insmod the r8169 driver to longer detects the link, but r8125 works fine. I think I'll need a reboot, lol)
Comment 47 Heiner Kallweit 2021-01-05 19:49:29 UTC
It looks weird that a packet of 1480 bytes is fragmented to 1472 + 8 bytes.
Maybe something with your MTU setting is incorrect. Try to reduce your MTU by 8 bytes.
This may still not explain the hw issue, but it may help to avoid triggering it.
Comment 48 Priit O. 2021-01-06 01:28:32 UTC
Not quite sure which one of us you are talking, but my network MTU is 1500 like it's by default. Normally the unfragmented traffic in Wireshark is also 1514 bytes "on wire" (technically it should be 1518, 14 header and 4 tail checksum; no idea why wireshark doesn't show this):
Ethernet II: 14 bytes
IP header: 20 bytes
UDP header: 8 bytes
data or "payload": 1472 bytes.
(checksum: 4 bytes which wireshark doesn't count)

https://en.wikipedia.org/wiki/Ethernet_frame#Ethernet_II

1472(data) + 8(udp) + 20(ip) = 1500 MTU size exactly

I have no clear understanding why it sometimes starts fragmenting normal size packets to smaller pieces -- maybe because the other end asks for smaller (or bigger) chunks? Smaller makes sense if they are connected over PPPoE or VPN or something (those protocol encapsulations need to fit into the frame also) and instead of multiple fragmentations on the way they ask smaller size from the beginning somehow?

Not sure where you got this 1480 and what exactly you mean by it, but when the payload is 1480 (anything over 1472) for some reason, it has to be fragmented to 1472 byte chunks I think.
Comment 49 Priit O. 2021-01-06 16:49:36 UTC
So, will you reopen this bug or let it remain "fixed" and create a new one? :)
Technically it's a problem with the same hardware and same driver.
Comment 50 WGH 2021-01-06 16:57:57 UTC
Your pcap causes the same problem, and looks strikingly similar to mine, except 

1) UDP instead of TCP (inside the fragmented IP datagrams),
2) the mininum IP datagram size (of the last fragment) is 28 instead of 25,
3) the padding workaround implemented earlier doesn't fix it.

In my opinion, the underlying problem is likely the same, it's just the fix was incomplete. I'm reopening it.
Comment 51 Priit O. 2021-01-06 21:45:46 UTC
Thank you WGH.

Just to clarify one thing -- since the minimum Ethernet frame by IEEE802.3 is 64 bytes/octets (14 header; 46 "payload"; 4 checksum). When the payload is smaller than 46 octets, the padding should be added accordingly by specification.
The question is, which component in the linux network stack is responsible for that (I have no clue), but when it should be the driver, then adding padding is not a "workaround" but actual "fix" and how it should have been done in the first place. :-)
Comment 52 Heiner Kallweit 2021-01-06 21:53:29 UTC
For chips handled by r8169 the hw is supposed to do the padding.
Comment 53 Heiner Kallweit 2021-01-07 15:28:24 UTC
(In reply to Priit O. from comment #48)
> Not sure where you got this 1480 and what exactly you mean by it, but when
> the payload is 1480 (anything over 1472) for some reason, it has to be
> fragmented to 1472 byte chunks I think.

The 1480 comes from the recorded pcap file. A 1480 byte UDP packet is fragmented to 1472 + 8 bytes. IP fragmentation is bad in general. Data load should be splitted on a higher level. So the question is why a higher level creates 1480 byte UDP packets instead of 1472 bytes packets.
Comment 54 WGH 2021-01-07 17:22:54 UTC
It's hard to tell. libtorrent (which qBittorrent is based on) appears to have MTU discovery code built in its uTP implementation.

https://github.com/arvidn/libtorrent/blob/e3f2b016dcd37a9a6e8a94006c7befcf2cb7bfac/src/utp_stream.cpp#L1752

Perhaps the code is simply not written to avoid fragmentation at all costs, so datagrams will be occasionally fragmented before path MTU is probed. "Don't fragment" is set only on the probe datagrams, so other datagrams might still be fragmented.  The kernel might also already know that the effective path MTU to some specific host is lower (e.g. discovered by Path MTU Discovery earlier), which would cause fragmentation right away (as opposed to on some distant router).

Maybe the libtorrent developer had a reason to not enforce "don't fragment" on all datagrams, as ICMP is known to be occasionally blocked by some retarded ISPs for no reason, causing broken PMTUD (hello, Fastly). TCP has a "clamp MSS" workaround for that scenario, but I don't think there's any for UDP.

Anyway, I believe this is tangential to the problem at hand.
Comment 55 xplo 2021-01-19 23:42:11 UTC
For information i have the same bug
[Tue Jan 19 22:24:00 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[Tue Jan 19 22:24:00 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[Tue Jan 19 22:24:20 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[Tue Jan 19 22:24:20 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[Tue Jan 19 22:24:34 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[Tue Jan 19 22:24:34 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).
[Tue Jan 19 22:24:44 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[Tue Jan 19 22:24:44 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).

When it happens the server is timing out for a few seconds.

It happens randomly when there is heavy traffic. I also use qbitorrent. But i never have seen this problem on an older hardware that used the same connexion and the same rj45 cable.
I tried to change the rj45 cable but it didnt fix it.

Following this thread i tried :
debian kernel 5.9
debian kernel 5.10
ubuntu kernel 5.10.8
ubuntu kernel 5.11.0-051100rc4-generic

I ll try the same setup on windows 10 pro (if i can do it with WSL2) to see if it s not driver related but hardware related.
Comment 56 Heiner Kallweit 2021-01-20 06:41:22 UTC
Please check whether the same issue happens with the r8125 vendor driver.
Comment 57 xplo 2021-01-20 06:54:32 UTC
r8125 realtek driver was worse and it didnt log an error in dmesg. I could see enp1s0 going up multiple time on dmesg when the server was not responding.
It was also random on heavy traffic but the timed out was way longer like 30s-2mn.
The ssh connexion would just never recover when in r8169 it would recover after a few seconds.
Comment 58 WGH 2021-01-21 00:58:57 UTC
Hey, Heiner, are you in touch with Realtek regarding the new pcap this time?
Comment 59 Heiner Kallweit 2021-01-21 06:49:48 UTC
I contacted Realtek, however have no feedback yet.
Comment 60 xplo 2021-01-21 09:39:29 UTC
I searched for similar problem and found out that it was happening with UDP connexion even on windows 10.
https://forum-en.msi.com/index.php?threads/realtek-pcie-2-5gbe-family-controller-resetting.345823/page-4
It seems realtek fixed it but just in the windows driver ( 2020/12/31 version )
https://www.realtek.com/en/component/zoo/category/network-interface-controllers-10-100-1000m-gigabit-ethernet-pci-express-software
All the post strongly suggest heavy udp load that lead the driver to crash on windows. It seems exactly the same problem on linux.
Comment 61 xplo 2021-01-22 09:38:13 UTC
setting qbittorrent to TCP only and bloquing UDP seems to have stabilized the connexion for one day.
Now anyway i just use a usb gigabit ethernet adaptator and everything work fine.
Comment 62 Heiner Kallweit 2021-01-22 10:18:33 UTC
Created attachment 294809 [details]
Test patch v1
Comment 63 Heiner Kallweit 2021-01-22 10:21:41 UTC
Realtek provided a workaround proposal. Please test attached patch.
Comment 64 xplo 2021-01-22 11:28:09 UTC
i dont know how to do that, is there a guide i could follow ?
Comment 65 Heiner Kallweit 2021-01-22 12:12:46 UTC
Apply patch (patch -p1 -i <patch>) on top of linux-next. It should apply also on top of recent stable kernels, but maybe small adjustments are needed.
Then build the kernel as usual, procedure may be distro-dependent.
Comment 66 xplo 2021-01-22 17:15:29 UTC
i think i managed to install it. I compiled the new r8169.ko module and just copied it over the old one so i would not have to recompile the whole kernel ( and probably fail doing it ).

Thank you it s been 1 hour without any timeout so it looks good.
Comment 67 Heiner Kallweit 2021-01-22 17:35:12 UTC
Thanks a lot for testing! Few inquiries, just to be on the safe side:
- After installing the fresh r8169 module, you rmmod/modprobe'd it?
- You reversed the changes described in comment 61?
Comment 68 xplo 2021-01-22 18:35:41 UTC
i did not rmmod/modprobe it. I just rebooted. I saw that modinfo r8169 changed, it lost its signature part so my guess is that it did work but i have no idea.
So now i did modprobe -r r8169 (and found out the hard way it s directly effective as it s headless server and i was connected throught the realtek :o )
modprobe -v r8169
and i rebooted

Yes i reversed what i did to set back udp on in qbittorrent

I acutally got the
[Fri Jan 22 18:47:35 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[Fri Jan 22 18:47:35 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).

But that was before i did the modprobe actions so i hope i was still on the old version.

I m not good enought with linux module compilation to know if what i did was the correct way to update it :/

here is what i get with modinfo (no more signature part)
xplo@nas:~$ modinfo r8169
filename:       /lib/modules/5.10.9-051009-generic/kernel/drivers/net/ethernet/realtek/r8169.ko
firmware:       rtl_nic/rtl8125b-2.fw
firmware:       rtl_nic/rtl8125a-3.fw
firmware:       rtl_nic/rtl8107e-2.fw
firmware:       rtl_nic/rtl8107e-1.fw
firmware:       rtl_nic/rtl8168fp-3.fw
firmware:       rtl_nic/rtl8168h-2.fw
firmware:       rtl_nic/rtl8168h-1.fw
firmware:       rtl_nic/rtl8168g-3.fw
firmware:       rtl_nic/rtl8168g-2.fw
firmware:       rtl_nic/rtl8106e-2.fw
firmware:       rtl_nic/rtl8106e-1.fw
firmware:       rtl_nic/rtl8411-2.fw
firmware:       rtl_nic/rtl8411-1.fw
firmware:       rtl_nic/rtl8402-1.fw
firmware:       rtl_nic/rtl8168f-2.fw
firmware:       rtl_nic/rtl8168f-1.fw
firmware:       rtl_nic/rtl8105e-1.fw
firmware:       rtl_nic/rtl8168e-3.fw
firmware:       rtl_nic/rtl8168e-2.fw
firmware:       rtl_nic/rtl8168e-1.fw
firmware:       rtl_nic/rtl8168d-2.fw
firmware:       rtl_nic/rtl8168d-1.fw
license:        GPL
softdep:        pre: realtek
description:    RealTek RTL-8169 Gigabit Ethernet driver
author:         Realtek and the Linux r8169 crew <netdev@vger.kernel.org>
srcversion:     7022707DE347B6B940D4E30
alias:          pci:v000010ECd00003000sv*sd*bc*sc*i*
alias:          pci:v000010ECd00008125sv*sd*bc*sc*i*
alias:          pci:v00000001d00008168sv*sd00002410bc*sc*i*
alias:          pci:v00001737d00001032sv*sd00000024bc*sc*i*
alias:          pci:v000016ECd00000116sv*sd*bc*sc*i*
alias:          pci:v00001259d0000C107sv*sd*bc*sc*i*
alias:          pci:v00001186d00004302sv*sd*bc*sc*i*
alias:          pci:v00001186d00004300sv*sd*bc*sc*i*
alias:          pci:v00001186d00004300sv00001186sd00004B10bc*sc*i*
alias:          pci:v000010ECd00008169sv*sd*bc*sc*i*
alias:          pci:v000010FFd00008168sv*sd*bc*sc*i*
alias:          pci:v000010ECd00008168sv*sd*bc*sc*i*
alias:          pci:v000010ECd00008167sv*sd*bc*sc*i*
alias:          pci:v000010ECd00008161sv*sd*bc*sc*i*
alias:          pci:v000010ECd00008136sv*sd*bc*sc*i*
alias:          pci:v000010ECd00008129sv*sd*bc*sc*i*
alias:          pci:v000010ECd00002600sv*sd*bc*sc*i*
alias:          pci:v000010ECd00002502sv*sd*bc*sc*i*
depends:
retpoline:      Y
name:           r8169
vermagic:       5.10.9-051009-generic SMP mod_unload
Comment 69 xplo 2021-01-22 18:54:37 UTC
still happening
[Fri Jan 22 19:42:30 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[Fri Jan 22 19:42:30 2021] r8169 0000:01:00.0 enp1s0: rtl_rxtx_empty_cond_2 == 0 (loop: 42, delay: 100).

I just feel it s less frequent but i m not sure at all.
Comment 70 WGH 2021-01-22 19:26:08 UTC
I have applied the patch to 5.10.7 (the first hunk with the includes didn't apply cleanly, but it was easy enough to fix).

Neither my TCP distilled pcap nor fragmented UDP pcap by Priit causes any hangs, so it seems to work.

xplo, could you tcpreplay the attached pcaps on your machine with the patch applied? If they don't cause any hangs, could you try to create a small reproduction pcap the same way we did further up this discussion?
Comment 71 Heiner Kallweit 2021-01-22 20:21:14 UTC
Created attachment 294813 [details]
Test patch v2

I attached a second patch version, also based on a proposal from Realtek.
Maybe the first version fixes most of the problem cases, but not all.
Comment 72 xplo 2021-01-22 20:56:26 UTC
forget everything i said i just fail at patching the driver :/
WGH since i reproduce the problem with Priit udp pcap it just means i m still on the old driver. I dont know how to patch / install corectly a module, each google search got me differents ways.
Comment 73 xplo 2021-01-22 21:21:29 UTC
it seems if i reboot after modprobe r8169 it fall back to the old driver. Sorry i still have to learn a lot about linux kernel.

Without rebooting the pcap udp work fine.
So i wont reboot and let it run to see if it time-out again.
Comment 74 WGH 2021-01-22 21:23:24 UTC
Perhaps, you didn't rebuild your initramfs, and the old module is still in it?
Comment 75 Heiner Kallweit 2021-01-23 09:03:02 UTC
On Ubuntu "update-initramfs -u" seems to be the proper command.
Or you remove r8169 from initramfs, it's not needed except your rootfs is e.g. on NFS. Typically this is done in /etc/mkinitcpio.conf.
Comment 76 xplo 2021-01-23 11:07:23 UTC
i just wont reboot as it s a home server.
so far so good
i just got that in dmesg
[Sat Jan 23 04:19:13 2021] TCP: enp1s0: Driver has suspect GRO implementation, TCP performance may be compromised.

but the smokeping service report no packet loss for all night.

On a sidenote i dont really understand why realtek doesn't fix their own driver. It s like the only 2.5gbe ethernet device that are shiped by realtek with nearly all 2020 motherboard o_o
Comment 77 Heiner Kallweit 2021-01-23 15:03:12 UTC
Sooner or later Realtek is going to come up with a fixed r8125 Linux driver, but apparently their focus is on the Windows driver, especially as their network chips are used primarily on consumer hardware.

Regarding the TCP warning: There has been a number of reports over the last years, with very different network drivers. But I didn't find a good explanation what actually causes the error. One hint was that it can happen if you receive packets bigger than MTU. Do you have any device in your network that may send jumbo packets?
Comment 78 xplo 2021-01-23 21:33:49 UTC
i dont know but at night if it was a jumbo packet it was from the internet (if that is even possible). And i did set mtu to 9000.

24h so far the new driver with Test patch v2. Nothing to report expect it s perfectly working. Thank you so much :)
Comment 79 Heiner Kallweit 2021-01-23 22:39:46 UTC
Good to hear that patch 2 works for you. It has slightly more performance impact than version 1, therefore if possible I'd go with version 1. But only if version 1 definitely fixes the issue. Could you run version 1 again for some time and check whether it's fine as well?
Comment 80 xplo 2021-01-24 00:35:12 UTC
of course performances are important, i m now testing patch1
Comment 81 xplo 2021-01-24 22:11:35 UTC
After one day the patch 1 looks good.
By the way update-initramfs -u did work to keep the driver loaded after reboot thx :)
Comment 82 Heiner Kallweit 2021-01-24 22:52:30 UTC
Great. Just to be sure: You did a update-initramfs after switching from patch v2 to v2?
Comment 83 xplo 2021-01-24 23:25:13 UTC
yes i did it after modprobe r8169 with v1 patch
Comment 84 xplo 2021-01-26 08:16:32 UTC
Still good after 3 days.
I dont see any problem and it s far better than the realtek driver or the current kernel driver.
Comment 85 Heiner Kallweit 2021-01-26 08:39:19 UTC
Great, thanks for the feedback. Then I'll submit patch v1.
Comment 86 Heiner Kallweit 2021-01-30 22:57:00 UTC
Should be fixed with 8d520b4de3ed ("r8169: work around RTL8125 UDP hw bug").
Comment 87 xplo 2021-01-31 20:47:41 UTC
Thank you. 
How can we know in which kernel version it will be merged ?
Comment 88 Heiner Kallweit 2021-01-31 21:27:29 UTC
It may take 1-2 weeks until it's applied to 5.11-rc and the stable kernel versions. Just check the change log of new kernel versions.
Comment 89 curtdept 2021-03-26 21:07:55 UTC
I am also hitting this pretty consistently after some time on an 8111H with all offloads turned off

03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
        Subsystem: Realtek Semiconductor Co., Ltd. Device 0123
        Flags: bus master, fast devsel, latency 0, IRQ 54, IOMMU group 11
        I/O ports at e000 [size=256]
        Memory at fe804000 (64-bit, non-prefetchable) [size=4K]
        Memory at fe800000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
        Capabilities: [170] Latency Tolerance Reporting
        Capabilities: [178] L1 PM Substates
        Kernel driver in use: r8169
Comment 90 Heiner Kallweit 2021-03-26 21:19:07 UTC
Please attach a full dmesg incl. the error. Then:
- Any specific action how you can trigger it?
- Did you test the potential reasons / fixes discussed here?
- Is it a regression?
Comment 91 curtdept 2021-03-26 22:39:13 UTC
Created attachment 296079 [details]
attachment-20686-0.html

For sure sorry, I had already cleared it with a power cycle on a volatile file system image. When it comes around again I'll try try capture it.

-Curtis
________________________________
From: bugzilla-daemon@bugzilla.kernel.org <bugzilla-daemon@bugzilla.kernel.org>
Sent: Friday, March 26, 2021 3:19:07 PM
To: curtdept@me.com <curtdept@me.com>
Subject: [Bug 209839] r8169 (RTL8125B): "rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100)" and connectivity loss, caused by small fragmented datagrams

https://bugzilla.kernel.org/show_bug.cgi?id=209839

--- Comment #90 from Heiner Kallweit (hkallweit1@gmail.com) ---
Please attach a full dmesg incl. the error. Then:
- Any specific action how you can trigger it?
- Did you test the potential reasons / fixes discussed here?
- Is it a regression?

--
You may reply to this email to add a comment.

You are receiving this mail because:
You are on the CC list for the bug.
Comment 92 curtdept 2021-04-08 20:17:49 UTC
No obvious triggers, short of time, it seems to happen after a few days or so, sometimes as short as a day. Reboot always fixes, causes the NIC to stop responding.

Managed to capture this.

237837.769013] ------------[ cut here ]------------
[237837.774087] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
[237837.781113] WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x1f6/0x200
[237837.789995] Modules linked in: xt_connlimit nf_conncount iptable_nat xt_state xt_nat xt_helper xt_conntrack xt_connmark xt_connbytes xt_REDIRECT xt_MASQUERADE xt_FLOWOFFLOAD xt_CT nf_nat nf_flow_table nf_conntrack ipt_REJECT xt_time xt_tcpudp xt_tcpmss xt_statistic xt_recent xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY sch_cake rfcomm nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables hidp hci_uart btusb btintel bnep bluetooth_6lowpan bluetooth sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred 6lowpan evdev xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port
[237837.790226]  ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb vfat fat nls_utf8 nls_iso8859_1 nls_cp437 ecdh_generic ecc kpp button_hotplug
[237837.907427] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.10.27 #0
[237837.913962] Hardware name: Maxtang FP30/FP30, BIOS FP30T306 06/19/2020
[237837.921049] RIP: 0010:dev_watchdog+0x1f6/0x200
[237837.925905] Code: 48 63 75 28 eb 91 4c 89 ef c6 05 0d 3c d1 00 01 e8 2f 09 fd ff 44 89 e1 4c 89 ee 48 c7 c7 d8 22 4d 82 48 89 c2 e8 b9 e1 10 00 <0f> 0b eb bc 66 0f 1f 44 00 00 49 89 f9 48 8d 87 40 01 00 00 31 c9
[237837.945635] RSP: 0018:ffffa128802b0ec0 EFLAGS: 00010282
[237837.951281] RAX: 0000000000000039 RBX: ffff949700934800 RCX: 0000000000000000
[237837.958946] RDX: ffff949818f9e2c8 RSI: ffff949818f973a0 RDI: 0000000000000300
[237837.966824] RBP: ffff949706eb8480 R08: 0000000000000000 R09: ffffa128802b0d10
[237837.974543] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
[237837.982175] R13: ffff949706eb8000 R14: ffff949706eb83dc R15: ffff949700934880
[237837.989942] FS:  0000000000000000(0000) GS:ffff949818f80000(0000) knlGS:0000000000000000
[237837.998721] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[237838.004914] CR2: 0000559c83b1c920 CR3: 0000000107a58000 CR4: 00000000003506e0
[237838.012603] Call Trace:
[237838.015336]  <IRQ>
[237838.017573]  ? dev_trans_start+0x70/0x70
[237838.021868]  ? dev_trans_start+0x70/0x70
[237838.026170]  call_timer_fn.constprop.0+0x11/0x70
[237838.031244]  expire_timers+0x94/0xc0
[237838.035152]  run_timer_softirq+0x21c/0x230
[237838.039597]  ? lapic_next_event+0x18/0x20
[237838.043957]  ? clockevents_program_event+0x8a/0xe0
[237838.049172]  ? hrtimer_interrupt+0x136/0x280
[237838.053830]  __do_softirq+0xb3/0x1de
[237838.057761]  asm_call_irq_on_stack+0x12/0x20
[237838.062423]  </IRQ>
[237838.064781]  do_softirq_own_stack+0x32/0x40
[237838.069331]  irq_exit_rcu+0x83/0xb0
[237838.072987]  sysvec_apic_timer_interrupt+0x2e/0x80
[237838.078254]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[237838.083785] RIP: 0010:cpuidle_enter_state+0xba/0x280
[237838.089209] Code: ac 0d 7c ff 31 ff 49 89 c5 e8 02 21 7c ff 41 83 e7 01 74 12 9c 58 f6 c4 02 0f 85 b6 01 00 00 31 ff e8 ba 57 80 ff fb 45 85 f6 <0f> 88 bd 00 00 00 4c 2b 2c 24 49 63 c6 48 6b c8 68 48 8d 14 40 48
[237838.109497] RSP: 0018:ffffa12880157eb0 EFLAGS: 00000202
[237838.115232] RAX: ffff949818fa0b40 RBX: ffff94970147a400 RCX: 000000000000001f
[237838.122917] RDX: 0000000000000000 RSI: 000000003d112d75 RDI: 0000000000000000
[237838.130642] RBP: ffffffff826c8980 R08: 0000d84febbc0f9b R09: 0000000000000018
[237838.138353] R10: 0000000000027c58 R11: ffff949818f9fde4 R12: 0000000000000002
[237838.146117] R13: 0000d84febbc0f9b R14: 0000000000000002 R15: 0000000000000000
[237838.153849]  cpuidle_enter+0x24/0x40
[237838.157780]  do_idle+0x196/0x1f0
[237838.161293]  cpu_startup_entry+0x14/0x20
[237838.165596]  secondary_startup_64_no_verify+0xc2/0xcb
[237838.171152] ---[ end trace 23fecbc400b1bc82 ]---
[237838.189098] r8169 0000:03:00.0 eth1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[237838.201581] r8169 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffffffdf8d90270 flags=0x0018]
[237843.669099] r8169 0000:03:00.0 eth1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[237843.681078] r8169 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffffffdf8880250 flags=0x0018]
[237848.788563] r8169 0000:03:00.0 eth1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[237848.830082] r8169 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffffffdf8000040 flags=0x0018]
[237858.771434] r8169 0000:03:00.0 eth1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[237858.846681] r8169 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffffffdf8d80200 flags=0x0008]
[237863.890870] r8169 0000:03:00.0 eth1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[237863.933365] r8169 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffffffdf82e0240 flags=0x0018]
[237869.777828] r8169 0000:03:00.0 eth1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[237869.991911] r8169 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffffffdf8c70200 flags=0x0008]
[237874.897662] r8169 0000:03:00.0 eth1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[237875.476296] r8169 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfffffffdf8000040 flags=0x0018]
Comment 93 Heiner Kallweit 2021-04-08 20:41:18 UTC
Typically RTL8168h runs fine with r8169, it can be found on a lot of consumer mainboards. The timeout warning provides no further details, "rtl_rxtx_empty_cond == 0" is caused by the chip not responding. Therefore it doesn't say anything about the root cause.

You could check whether it's a regression, IOW whether some previous kernel version is fine. Also you could check whether it works ok with the r8168 vendor driver from Realtek.
Comment 94 curtdept 2021-04-08 20:46:48 UTC
(In reply to Heiner Kallweit from comment #93)
> Typically RTL8168h runs fine with r8169, it can be found on a lot of
> consumer mainboards. The timeout warning provides no further details,
> "rtl_rxtx_empty_cond == 0" is caused by the chip not responding. Therefore
> it doesn't say anything about the root cause.
> 
> You could check whether it's a regression, IOW whether some previous kernel
> version is fine. Also you could check whether it works ok with the r8168
> vendor driver from Realtek.

Ya tried vendor driver, stability on it is significantly worse. It does seem like a regression to the mid 5.4 series, but I havent been able to pinpoint which one due to the length of time to repro the error.
Comment 95 Gustavo A. Díaz 2021-08-16 20:06:48 UTC
Hi,

I think I am facing similar issue. In my case I have a "Realtek Semiconductor Co., Ltd. Device 8161 (rev 15)", running Ubuntu 20.04, Kernel 5.11.0.

Devices names:

- waneth (the ethernet connected to Internet cablemodem)
- laneth (for the internal lan)
- lanbr (the bridge)
- wlan24g (one USB3.0 wifi dongle, TP-Link Archer T9UH V2, which uses RTLRTL8814AU, and with drivers from https://github.com/morrownr/8814au)
- wlan5g (same wifi dongle)

NOTE: I manage the network by systemd (systemd-networkd) and dnsmasq (for DHCP). Assigned those network devices names via systemd.

My problem, I believe, started when I've attached to my existing bridge (lanbr) the 2 USB Wifi devices, and started to show in logs:

[  456.011918] r8169 0000:06:00.0 laneth: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
[  456.021994] r8169 0000:06:00.0 laneth: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).

So, the LAN network, via UTP cable, is not working (WIFI works perfectly) since this behavior.

This started to happen 3 days ago, which matches the day I've configured/installed/bridged the USB WIFI devices.

Any ideas?
Comment 96 Gustavo A. Díaz 2021-08-17 00:44:04 UTC
(In reply to Gustavo A. Díaz from comment #95)
> Hi,
> 
> I think I am facing similar issue. In my case I have a "Realtek
> Semiconductor Co., Ltd. Device 8161 (rev 15)", running Ubuntu 20.04, Kernel
> 5.11.0.
> 
> Devices names:
> 
> - waneth (the ethernet connected to Internet cablemodem)
> - laneth (for the internal lan)
> - lanbr (the bridge)
> - wlan24g (one USB3.0 wifi dongle, TP-Link Archer T9UH V2, which uses
> RTLRTL8814AU, and with drivers from https://github.com/morrownr/8814au)
> - wlan5g (same wifi dongle)
> 
> NOTE: I manage the network by systemd (systemd-networkd) and dnsmasq (for
> DHCP). Assigned those network devices names via systemd.
> 
> My problem, I believe, started when I've attached to my existing bridge
> (lanbr) the 2 USB Wifi devices, and started to show in logs:
> 
> [  456.011918] r8169 0000:06:00.0 laneth: rtl_txcfg_empty_cond == 0 (loop:
> 42, delay: 100).
> [  456.021994] r8169 0000:06:00.0 laneth: rtl_rxtx_empty_cond == 0 (loop:
> 42, delay: 100).
> 
> So, the LAN network, via UTP cable, is not working (WIFI works perfectly)
> since this behavior.
> 
> This started to happen 3 days ago, which matches the day I've
> configured/installed/bridged the USB WIFI devices.
> 
> Any ideas?

Never mind, seems in my particular case was a problem with a UTP cable inside my LAN, which was causing flood from one device trying to get an IP via DHCP. Changed the cable, and fixed :S
Comment 97 Charles Haithcock 2022-05-16 05:58:33 UTC
Created attachment 300966 [details]
Dmesg output
Comment 98 Charles Haithcock 2022-05-16 07:17:18 UTC
Having the same issue with the same chipset (RTL8168h), so throwing some info in as well.

(In reply to Heiner Kallweit from comment #90)
> Please attach a full dmesg incl. the error. Then:
> - Any specific action how you can trigger it?
> - Did you test the potential reasons / fixes discussed here?
> - Is it a regression?


Full dmesg is attached in c.97


It seems to happen simply by booting now. Dmesg shows a timeout not too long after loading the module and then the empty condition errors start occurring. With all this, I can not get any networking up and going now for some reason. This started occurring right after an upgrade to ubuntu from 20.04 to 22.04 to two separate systems. I was running the 5.4 kernel and jumped to 5.15 with the upgrade. 


For testing, I've tried the driver directly from realtek with no positive change in behavior; in fact, it produces less logs but outright does not provide any interface for NetworkManager to pick up on. I also tested the kmod packages for ubuntu with no luck. I'm getting errors when trying to build both the mainline 5.18-rc7 kernel and 5.17.8 kernel; 



~/linux-5.17.8 via 🐍 v3.10.4 
❯ sudo make 
  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND objtool
  DESCEND bpf/resolve_btfids
  CHK     include/generated/compile.h
  CHK     kernel/kheaders_data.tar.xz
make[1]: *** No rule to make target 'debian/canonical-certs.pem', needed by 'certs/x509_certificate_list'.  Stop.
make: *** [Makefile:1831: certs] Error 2


~/linux-5.18-rc7 via 🐍 v3.10.4 
❯ sudo make 
  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND objtool
  CHK     include/generated/compile.h
  CC [M]  arch/x86/kvm/emulate.o
In function ‘reg_read’,
    inlined from ‘reg_rmw’ at arch/x86/kvm/emulate.c:266:2:
arch/x86/kvm/emulate.c:254:27: error: array subscript 32 is above array bounds of ‘long unsigned int[17]’ [-Werror=array-bounds]
  254 |         return ctxt->_regs[nr];
      |                ~~~~~~~~~~~^~~~
In file included from arch/x86/kvm/emulate.c:23:
arch/x86/kvm/kvm_emulate.h: In function ‘reg_rmw’:
arch/x86/kvm/kvm_emulate.h:366:23: note: while referencing ‘_regs’
  366 |         unsigned long _regs[NR_VCPU_REGS];
      |                       ^~~~~
cc1: all warnings being treated as errors
make[2]: *** [scripts/Makefile.build:288: arch/x86/kvm/emulate.o] Error 1
make[1]: *** [scripts/Makefile.build:550: arch/x86/kvm] Error 2
make: *** [Makefile:1834: arch/x86] Error 2



Any help on the above would be appreciated and I can try building again with that help. Not sure what's going wrong here. FWIW I copied /boot/config-`uname -r`* to .config and, in menuconfig, just open and save it as is (I'm not savvy enough with the kernel to reasonably know what every option should be). 


As an end user, it absolutely feels like a regression introduced by something. I updated two systems, both with the same Realtek chip, from ubuntu 20.04 to 22.04 and both are now failing in the same way without any other changes. Neither system goes back to prior working behavior when booting from an old kernel either and neither shows working networking when booting from an ubuntu 22.04 live usb. 


I can also try other things to see what works and does not. Also please let me know if I missed something, it's late and I need to get rest.
Comment 99 Heiner Kallweit 2022-05-16 07:23:01 UTC
(In reply to Charles Haithcock from comment #97)
> Created attachment 300966 [details]
> Dmesg output

Seems your report refers to an Ubuntu kernel. Please pay attention to the hint on the kernel bugzilla startpage. Issues with distro kernels should be reported to distro support forums.
Alternatively you can build a mainline kernel and re-test.
Comment 100 Charles Haithcock 2022-05-16 12:19:39 UTC
(In reply to Heiner Kallweit from comment #99)
> (In reply to Charles Haithcock from comment #97)
> > Created attachment 300966 [details]
> > Dmesg output
> 
> Seems your report refers to an Ubuntu kernel. Please pay attention to the
> hint on the kernel bugzilla startpage. Issues with distro kernels should be
> reported to distro support forums.
> Alternatively you can build a mainline kernel and re-test.

I understand. That's why I also mentioned in my comment I tried building the two most recent kernels available:

> I'm getting errors when trying to build both the mainline 5.18-rc7 kernel and
> 5.17.8 kernel; 

It's very likely a problem with how I'm building it but not sure what I chose incorrectly in menuconfig to prevent it from building properly. I don't mind helping to isolate an issue within the requirements outlined, but I need some help doing so.
Comment 101 Heiner Kallweit 2022-05-16 16:56:07 UTC
Best don't reuse the existing .config but create a clean new one with "make defconfig".
Comment 102 Charles Haithcock 2022-05-16 18:09:25 UTC
(In reply to Heiner Kallweit from comment #101)
> Best don't reuse the existing .config but create a clean new one with "make
> defconfig".

Thank you! I found various tutorials online which pointed out to just copy over the .config file; I'm unfamiliar with building kernels with just make, where I work I sometimes compile kernels but it's a super convoluted process and I don't even touch make or any build toolchain directly. 

Interestingly, everything works now. This morning, I attempted to boot both ubuntu 21.04 as well as fedora 36 from usb and both had proper networking activity. After rebooting to the kernel which experiened the issue in both systems, the issue is no longer present. This is exceedingly odd due to several reboots for both systems last night into the kernel whih originally experienced the issue as well as all the older kernels (I couldn't boot into one of them due to issues with grub finding / but didn't bother pursuing fixing that). 

Apologies! If it happens again, should a new bz be opened?
Comment 103 Heiner Kallweit 2022-05-17 10:12:07 UTC
(In reply to Charles Haithcock from comment #102)
> (In reply to Heiner Kallweit from comment #101)
> > Best don't reuse the existing .config but create a clean new one with "make
> > defconfig".
> 
> Thank you! I found various tutorials online which pointed out to just copy
> over the .config file; I'm unfamiliar with building kernels with just make,
> where I work I sometimes compile kernels but it's a super convoluted process
> and I don't even touch make or any build toolchain directly. 
> 
> Interestingly, everything works now. This morning, I attempted to boot both
> ubuntu 21.04 as well as fedora 36 from usb and both had proper networking
> activity. After rebooting to the kernel which experiened the issue in both
> systems, the issue is no longer present. This is exceedingly odd due to
> several reboots for both systems last night into the kernel whih originally
> experienced the issue as well as all the older kernels (I couldn't boot into
> one of them due to issues with grub finding / but didn't bother pursuing
> fixing that). 
> 
> Apologies! If it happens again, should a new bz be opened?

This would be better. The bug report here is about RTL8125B and issues caused by small fragmented packets. Even though you faced similar symptoms it's a different story.
Comment 104 Lubos Dolezel 2022-06-23 09:16:36 UTC
I run a network-based desktop with rootfs via iSCSI and /home via NFS4. I also do a lot of development in Golang. All this together means it's a very network heavy system.

I get the error from the title at every boot, but it recovers. Then, during work (e.g. when "gopls" kicks in), I get various intermittent network hangs (anywhere between 5 and 30 seconds). Sometimes the whole system permanently hangs (even the mouse cursor freezes) and I have to reset.

Unfortunately, there are absolutely no warning/error messages in dmesg when hangs occur. When the full freeze happens, I can't access the system in any way to check - no wonder, as it fully depends on the network - but it responds to ping. I intend to get a RS232 port working in the system to capture some final kernel logs, if there are any.

What did help with performance and these hangs, though, was switching to the out-of-tree r8168 kernel module.
Comment 105 Roman Mamedov 2023-01-26 19:47:39 UTC
Hello,

With an add-on PCI-E card using RTL8125A (XID 609),

driver: r8169
version: 5.10.164-rm1+
firmware-version: rtl8125a-3_0.0.1 08/24/19
expansion-rom-version: 
bus-info: 0000:06:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

Got a backtrace once, and then just these timeout messages once in a while, I do not observe any unrecoverable hang or significant link downtime.

The system also has an RTL8125B (XID 641) on the motherboard, that one seems fine.

[382619.488071] ------------[ cut here ]------------
[382619.488074] NETDEV WATCHDOG: rt1 (r8169): transmit queue 0 timed out
[382619.488084] WARNING: CPU: 21 PID: 0 at net/sched/sch_generic.c:467 dev_watchdog+0x25c/0x260
[382619.488084] Modules linked in: tcp_diag(E) udp_diag(E) inet_diag(E) unix_diag(E) dm_snapshot(E) xt_set(E) ip_set_hash_net(E) ip_set(E) nfnetlink(E) veth(E) vhost_net(E) vhost(E) vhost_iotlb(E) tap(E) tun(E) i2c_dev(E) sit(E) tunnel4(E) ip_tunnel(E) xt_comment(E) xt_multiport(E) xt_limit(E) xt_length(E) xt_CT(E) xt_state(E) xt_conntrack(E) ip6t_rpfilter(E) ipt_rpfilter(E) ip6table_nat(E) ip6table_raw(E) ip6table_mangle(E) iptable_raw(E) ip6table_filter(E) ip6_tables(E) iptable_mangle(E) xt_CHECKSUM(E) iptable_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) xt_tcpudp(E) cpufreq_userspace(E) cpufreq_conservative(E) iptable_filter(E) ip_tables(E) cpufreq_ondemand(E) x_tables(E) cpufreq_powersave(E) fuse(E) nbd(E) 8021q(E) garp(E) mrp(E) bridge(E) stp(E) llc(E) tcp_bbr(E) xfs(E) crc32_generic(E) loop(E) dm_thin_pool(E) edac_mce_amd(E) kvm_amd(E) radeon(E) kvm(E) snd_pcsp(E) ttm(E) snd_pcm(E) irqbypass(E) drm_kms_helper(E) snd_timer(E) rapl(E) snd(E) cec(E)
[382619.488124]  huawei_cdc_ncm(E) sp5100_tco(E) soundcore(E) wmi_bmof(E) watchdog(E) qmi_wwan(E) cdc_ncm(E) k10temp(E) ccp(E) evdev(E) option(E) drm(E) cdc_wdm(E) cdc_ether(E) joydev(E) cp210x(E) rng_core(E) usb_wwan(E) usbnet(E) i2c_algo_bit(E) usbserial(E) mii(E) sg(E) acpi_cpufreq(E) button(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) dm_crypt(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) sr_mod(E) xor(E) cdrom(E) raid6_pq(E) raid0(E) multipath(E) linear(E) dm_cache_smq(E) dm_cache(E) dm_persistent_data(E) dm_bio_prison(E) dm_bufio(E) dm_mod(E) libcrc32c(E) crc32c_generic(E) hid_generic(E) usbhid(E) raid1(E) hid(E) sd_mod(E) uas(E) usb_storage(E) crc32_pclmul(E) crc32c_intel(E) md_mod(E) ghash_clmulni_intel(E) aesni_intel(E) libaes(E) crypto_simd(E) cryptd(E) glue_helper(E) r8169(E) ahci(E) xhci_pci(E) libahci(E) realtek(E) nvme(E) mdio_devres(E) xhci_hcd(E) i2c_piix4(E) libata(E) nvme_core(E) libphy(E) usbcore(E)
[382619.488167]  t10_pi(E) crc_t10dif(E) usb_common(E) scsi_mod(E) crct10dif_generic(E) crct10dif_pclmul(E) crct10dif_common(E) wmi(E) gpio_amdpt(E) gpio_generic(E)
[382619.488175] CPU: 21 PID: 0 Comm: swapper/21 Tainted: G            E     5.10.164-rm1+ #324
[382619.488176] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B550 PG Velocita, BIOS P2.10 08/04/2021
[382619.488178] RIP: 0010:dev_watchdog+0x25c/0x260
[382619.488180] Code: eb a9 48 8b 1c 24 c6 05 e0 09 cd 00 01 48 89 df e8 09 32 fa ff 44 89 e9 48 89 c2 48 89 de 48 c7 c7 20 85 f5 a4 e8 6f 4e 15 00 <0f> 0b eb 86 0f 1f 44 00 00 48 83 ec 40 48 89 5c 24 10 48 89 6c 24
[382619.488181] RSP: 0018:ffff98f9005e4eb0 EFLAGS: 00010282
[382619.488182] RAX: 0000000000000000 RBX: ffff88d480c6a000 RCX: ffff88e37ef5ca08
[382619.488183] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff88e37ef5ca00
[382619.488184] RBP: ffff88d480c6a3dc R08: 0000000000000000 R09: ffff98f9005e4cd0
[382619.488185] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88d4a16ff080
[382619.488185] R13: 0000000000000000 R14: ffff88d480c6a480 R15: 0000000000000001
[382619.488187] FS:  0000000000000000(0000) GS:ffff88e37ef40000(0000) knlGS:0000000000000000
[382619.488188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[382619.488188] CR2: fffff8a002e02000 CR3: 000000012be94000 CR4: 0000000000750ee0
[382619.488189] PKRU: 55555554
[382619.488190] Call Trace:
[382619.488192]  <IRQ>
[382619.488195]  ? pfifo_fast_enqueue+0x180/0x180
[382619.488197]  call_timer_fn+0x29/0x100
[382619.488199]  __run_timers.part.0+0x1d4/0x250
[382619.488200]  ? ktime_get+0x38/0xa0
[382619.488203]  ? native_x2apic_icr_read+0x20/0x20
[382619.488204]  ? lapic_next_event+0x1d/0x30
[382619.488206]  ? clockevents_program_event+0xad/0x130
[382619.488207]  run_timer_softirq+0x26/0x50
[382619.488210]  __do_softirq+0xc3/0x275
[382619.488212]  asm_call_irq_on_stack+0x12/0x20
[382619.488213]  </IRQ>
[382619.488216]  do_softirq_own_stack+0x36/0x50
[382619.488218]  irq_exit_rcu+0x92/0xd0
[382619.488221]  sysvec_apic_timer_interrupt+0x36/0x80
[382619.488222]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[382619.488225] RIP: 0010:cpuidle_enter_state+0xc5/0x360
[382619.488226] Code: 8b 3d af e4 b5 5b e8 3a a0 a1 ff 49 89 c5 0f 1f 44 00 00 31 ff e8 5b ac a1 ff 45 84 ff 0f 85 ff 00 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 0b 01 00 00 49 63 ce be 68 00 00 00 4c 2b 2c 24 48
[382619.488227] RSP: 0018:ffff98f9001efea8 EFLAGS: 00000246
[382619.488228] RAX: ffff88e37ef6fbc0 RBX: 0000000000000002 RCX: 000000000000001f
[382619.488229] RDX: 0000000000000000 RSI: 0000000022a8da03 RDI: 0000000000000000
[382619.488229] RBP: ffff88d4a0a78c00 R08: 00015bfd8aab86e1 R09: 000000000000000a
[382619.488230] R10: 0000000000007d99 R11: ffff88e37ef6eb24 R12: ffffffffa51a4ca0
[382619.488231] R13: 00015bfd8aab86e1 R14: 0000000000000002 R15: 0000000000000000
[382619.488233]  ? cpuidle_enter_state+0xb5/0x360
[382619.488234]  cpuidle_enter+0x37/0x60
[382619.488237]  do_idle+0x1fb/0x2c0
[382619.488238]  cpu_startup_entry+0x19/0x20
[382619.488241]  secondary_startup_64_no_verify+0xb0/0xbb
[382619.488243] ---[ end trace b6bb48d8503993d8 ]---
[382619.501118] r8169 0000:06:00.0 rt1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
...
[444128.054196] r8169 0000:06:00.0 rt1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[446939.932765] r8169 0000:06:00.0 rt1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
...
[450948.856709] r8169 0000:06:00.0 rt1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[451152.886863] r8169 0000:06:00.0 rt1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
[451308.789226] r8169 0000:06:00.0 rt1: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
Comment 106 Heiner Kallweit 2023-01-26 21:21:56 UTC
Try disabling ASPM L1.1, either in BIOS or using the respective sysfs property.
Comment 107 Roman Mamedov 2023-01-26 21:48:31 UTC
Hello,

It appears to be disabled (for 0000:06:00.0):

# grep . `find /sys/ -type f | grep aspm`
/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.2/0000:03:00.0/0000:04:00.0/link/l1_aspm:1
/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.2/0000:03:08.0/0000:07:00.0/link/l1_aspm:0
/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.2/link/l1_aspm:1
/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.2/0000:03:07.0/0000:06:00.0/link/l1_aspm:0
/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.0/link/l1_aspm:1
/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.1/link/l1_aspm:1
/sys/devices/pci0000:00/0000:00:08.1/0000:0a:00.0/link/l0s_aspm:0
/sys/devices/pci0000:00/0000:00:08.1/0000:0a:00.0/link/l1_aspm:0
/sys/devices/pci0000:00/0000:00:08.1/0000:0a:00.3/link/l0s_aspm:0
/sys/devices/pci0000:00/0000:00:08.1/0000:0a:00.3/link/l1_aspm:0
/sys/devices/pci0000:00/0000:00:08.1/0000:0a:00.1/link/l0s_aspm:0
/sys/devices/pci0000:00/0000:00:08.1/0000:0a:00.1/link/l1_aspm:0
/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/link/l1_1_aspm:0
/sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/link/l1_aspm:1
/sys/devices/pci0000:00/0000:00:07.1/0000:09:00.0/link/l0s_aspm:0
/sys/devices/pci0000:00/0000:00:07.1/0000:09:00.0/link/l1_aspm:0
grep: /sys/module/pcie_aspm/uevent: Permission denied
/sys/module/pcie_aspm/parameters/policy:[default] performance powersave powersupersave 
/sys/module/radeon/parameters/aspm:-1
Comment 108 Heiner Kallweit 2023-01-26 22:13:06 UTC
Yes, even L1 is disabled, so ASPM should not be the root cause.
You could test whether latest linux-next is better, and you can test with r8125 vendor driver.
Comment 109 Marcin Parka 2023-05-10 06:49:41 UTC
I am not 100% sure if this is related but I am encountering the same kind of error. My setup is proxmox with kernel 5.15, but during tests I have tried also 6.2.

As attachment I am providing kernel.log from 1st of May, when I tried all combinations of kernel version as well as using r8169 and vendor's r8125. The case is that vendor driver r8125 is working without any problems but r8169 is crashing (however, restoring the activity after some time). From the log please note:


May  1 11:27:55 pve kernel: [  291.063523] vmbr0: port 2(enp3s0) entered forwarding state
May  1 11:28:16 pve kernel: [  311.589586] ------------[ cut here ]------------
May  1 11:28:16 pve kernel: [  311.589654] NETDEV WATCHDOG: enp3s0 (r8169): transmit queue 0 timed out
May  1 11:28:16 pve kernel: [  311.589820] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x212/0x220


You may also see r8125 "correct behavior" in similar cases.

Now something that I didn't noticed in previous comments - for me the error happens when I unplug the network cable and plug in again. So, I am able to reproduce it without any problems (no need to wait).

Maybe from reporter's perspective it was similar, just carrier went down etc.?
Comment 110 Marcin Parka 2023-05-10 06:52:46 UTC
I do not know how to attach file (if passible - somebody can attach it? the this comment can be deleted):

https://data.online-mp.pl/s/GrM7tNzGRLRdz9X
Comment 111 Heiner Kallweit 2023-05-10 08:51:23 UTC
(In reply to Marcin from comment #110)
> I do not know how to attach file (if passible - somebody can attach it? the
> this comment can be deleted):
> 
> https://data.online-mp.pl/s/GrM7tNzGRLRdz9X

See the "Attachments" part, there's an option "Add an attachment".

Could you please re-test with 6.4-rc1 and r8169?

I'v heard before about an issue that on a specific RTL8168 chip version and maybe under specific circumstances a tx timeout occured after removing and re-plugging the cable. Maybe some hw erratum.
Comment 112 Marcin Parka 2023-05-10 09:02:25 UTC
Created attachment 304240 [details]
Kernel 5.15 and 6.2 vs. r8169 and r8125

Thank you. File is attached.

For the request to re-test with 6.4-rc1 and r8169 - I may have some difficulties as I do not have proper (spare) hardware for such a test.

But maybe would it be ok to create VM (with Linux 6.4-rc1 and with included r8169) and passthrough PCIe device to it?
Comment 113 Marcin Parka 2023-05-10 09:04:09 UTC
My setup is a following (01:00.0 is add-on card, 02:00.0 and 03:00.0 are built-in):

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 04)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
        Kernel driver in use: r8125
        Kernel modules: r8169, r8125
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 04)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
        Kernel driver in use: r8125
        Kernel modules: r8169, r8125
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 04)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
        Kernel driver in use: r8125
        Kernel modules: r8169, r8125
Comment 114 Heiner Kallweit 2023-05-11 05:44:31 UTC
Not sure whether the VM use case is a fully valid test case. But give it a try.

Do the built-in and add-on NICs behave the same? I'm asking because the BIOS may do some special initialization for the built-in NIC, resulting in a different behavior.
Comment 115 Marcin Parka 2023-05-11 10:57:31 UTC
(In reply to Heiner Kallweit from comment #114)
> Not sure whether the VM use case is a fully valid test case. But give it a
> try.

Sure, I will do this test. But I have a question (it was very long time since I have compiled kernel).

If I did "make defconfig" - compilation was fast but almost without any modules.

So, I did copied /boot/config-6.1.0-0.deb11.6-amd64 as .config and executed "make olddefconfig". Will it be ok (or there is better way to match configuration as much as possible)?

BTW. I will do the test on VM with:

[root@r8169 ~]# uname -a
Linux r8169 6.1.0-0.deb11.6-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.15-1~bpo11+1 (2023-03-16) x86_64 GNU/Linux


and then with new "6.4-rc1" (which I am compiling now). Bot kernels. With firmware for RTL8125B "not installed" and "installed".

So, in total there will be 4 tests. However, in case "VM Test Platform" is not valid test case... I will need to think about different approach ().


> 
> Do the built-in and add-on NICs behave the same?

Yes. By means - the same error is thrown in dmesg. I didn't noticed any differences in behavior. BTW. This is Odroid H2+ - and "built-in" NICs are simply soldered Realtek RTL8125B chips assigned to two exposed (from CPU) PCIe lanes (from remaining 4 - 1 lane is assigned to "add-on" RTL8125B).
Comment 116 Marcin Parka 2023-05-11 18:18:22 UTC
Created attachment 304247 [details]
Test results - kernel 6.1.0
Comment 117 Marcin Parka 2023-05-11 18:18:47 UTC
Created attachment 304248 [details]
Test results - kernel 6.4.0-rc1
Comment 118 Marcin Parka 2023-05-11 18:28:44 UTC
Please see as provided (in comment #c116 and #c117) - for both kernel versions - I am able to reproduce the issue (by simply plugging the cable out and in). Remark - please note the timing between:

[   13.789561] r8169 0000:00:10.0 enx001e064549e5: Link is Down
[   14.418911] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=515 'systemd'
[   17.325981] r8169 0000:00:10.0 enx001e064549e5: Link is Up - 2.5Gbps/Full - flow control off
[   17.326031] IPv6: ADDRCONF(NETDEV_CHANGE): enx001e064549e5: link becomes ready
[  150.591000] r8169 0000:00:10.0 enx001e064549e5: Link is Down
[  164.609442] r8169 0000:00:10.0 enx001e064549e5: Link is Up - 2.5Gbps/Full - flow control off
[  351.022591] ------------[ cut here ]------------
[  351.022636] NETDEV WATCHDOG: enx001e064549e5 (r8169): transmit queue 0 timed out 9408 ms


164 - plug in the cable - dmesg is showing immediately the change "Link is Up" BUT - network is not active (in both ingress - eg. pinging my host from other machine and egress - pinging other host from test machine).
351 - when stack trace is thrown - network is full active again. It mean that "inactivity" lasted for more than 3 minutes (kernel 6.4) or 2 minutes (kernel 6.1)?

Just a guess - besides of the reason (root cause of the issue, which is the most important here) - interface seems to become active after:
6.1.0 - 120 seconds + ~timeout (a few seconds?)
6.4.0 - 180 seconds + ~timeout (a few seconds?)
Comment 119 Marcin Parka 2023-05-11 18:35:24 UTC
Comment for above - timing between "Link is Up" (interface is inactive) and stack trace (interface becomes active again) is really RANDOM, as see below.

Tests WITH or WITHOUT firmware are giving the same results, here is 6.4.0-rc1 without firmware:

[   13.083300] Console: switching to colour frame buffer device 160x50
[   13.100055] bochs-drm 0000:00:02.0: [drm] fb0: bochs-drmdrmfb frame buffer device
[   13.116609] Error: Driver 'pcspkr' is already registered, aborting...
[   13.395255] r8169 0000:00:10.0: Direct firmware load for rtl_nic/rtl8125b-2.fw failed with error -2
[   13.395270] r8169 0000:00:10.0: Unable to load firmware rtl_nic/rtl8125b-2.fw (-2)
[   13.425483] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-80:00: attached PHY driver (mii_bus:phy_addr=r8169-0-80:00, irq=MAC)
[   13.502455] r8169 0000:00:10.0 enx001e064549e5: No native access to PCI extended config space, falling back to CSI
[   13.529332] r8169 0000:00:10.0 enx001e064549e5: Link is Down
[   14.155465] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=513 'systemd'
[   17.113067] r8169 0000:00:10.0 enx001e064549e5: Link is Up - 2.5Gbps/Full - flow control off
[   17.113177] IPv6: ADDRCONF(NETDEV_CHANGE): enx001e064549e5: link becomes ready
[   59.694671] r8169 0000:00:10.0 enx001e064549e5: Link is Down
[   77.076825] r8169 0000:00:10.0 enx001e064549e5: Link is Up - 2.5Gbps/Full - flow control off
[  143.526494] ------------[ cut here ]------------
[  143.526540] NETDEV WATCHDOG: enx001e064549e5 (r8169): transmit queue 0 timed out 5664 ms
[  143.526608] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x215/0x220
[  143.526620] Modules linked in: snd_pcm snd_timer bochs snd drm_vram_helper drm_ttm_helper ttm soundcore virtio_console pcspkr drm_kms_helper virtio_balloon button evdev joydev serio_raw sg drm sunrpc fuse configfs efi                                                                                                 _pstore qemu_fw_cfg ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c crc32c_generic sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic virtio_net net_failover crct10dif_common failover                                                                                                  virtio_scsi ata_generic ata_piix libata r8169 uhci_hcd ehci_hcd virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev i2c_piix4 realtek virtio usbcore mdio_devres libphy psmouse virtio_ring scsi_mod usb_common scsi_com                                                                                                 mon floppy
[  143.526797] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0-rc1 #8
[  143.526802] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
[  143.526805] RIP: 0010:dev_watchdog+0x215/0x220
Comment 120 Marcin Parka 2023-05-11 19:55:44 UTC
Additional aspect - the problem exists no matter if connection is 2.5Gbps or 1Gbps:


[   13.087589] fbcon: bochs-drmdrmfb (fb0) is primary device
[   13.127481] Console: switching to colour frame buffer device 160x50
[   13.136263] Error: Driver 'pcspkr' is already registered, aborting...
[   13.145607] bochs-drm 0000:00:02.0: [drm] fb0: bochs-drmdrmfb frame buffer device
[   14.481616] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-80:00: attached PHY driver (mii_bus:phy_addr=r8169-0-80:00, irq=MAC)
[   14.642369] r8169 0000:00:10.0 enx001e064549e5: No native access to PCI extended config space, falling back to CSI
[   14.669778] r8169 0000:00:10.0 enx001e064549e5: Link is Down
[   15.307257] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=514 'systemd'
[   17.654585] r8169 0000:00:10.0 enx001e064549e5: Link is Up - 1Gbps/Full - flow control off
[   17.654615] IPv6: ADDRCONF(NETDEV_CHANGE): enx001e064549e5: link becomes ready
[   37.268371] r8169 0000:00:10.0 enx001e064549e5: Link is Down
[   44.092858] r8169 0000:00:10.0 enx001e064549e5: Link is Up - 1Gbps/Full - flow control off
[   70.052069] r8169 0000:00:10.0 enx001e064549e5: Link is Down
[   78.661951] r8169 0000:00:10.0 enx001e064549e5: Link is Up - 1Gbps/Full - flow control off
[  153.913556] ------------[ cut here ]------------
[  153.913598] NETDEV WATCHDOG: enx001e064549e5 (r8169): transmit queue 0 timed out 8352 ms
[  153.913638] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:525 dev_watchdog+0x215/0x220
[  153.913651] Modules linked in: snd_pcm snd_timer bochs drm_vram_helper drm_ttm_helper snd ttm soundcore virtio_console virtio_balloon pcspkr drm_kms_helper button evdev joydev serio_raw sg drm fuse sunrpc configfs efi_pstore qemu_fw_cfg ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c crc32c_generic sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic uhci_hcd crct10dif_common r8169 ehci_hcd virtio_net realtek net_failover ata_generic virtio_scsi failover mdio_devres psmouse libphy virtio_pci usbcore virtio_pci_legacy_dev virtio_pci_modern_dev ata_piix virtio libata i2c_piix4 scsi_mod virtio_ring usb_common scsi_common floppy
[  153.913816] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0-rc1 #8
[  153.913821] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
[  153.913825] RIP: 0010:dev_watchdog+0x215/0x220
[  153.913830] Code: ff ff ff 4c 89 e7 c6 05 ba 12 09 01 01 e8 43 62 fa ff 45 89 f8 44 89 f1 4c 89 e6 48 89 c2 48 c7 c7 80 32 4e 8b e8 5b c7 73 ff <0f> 0b e9 2e ff ff ff 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90
[  153.913835] RSP: 0018:ffffb3c600003ea0 EFLAGS: 00010286
Comment 121 Heiner Kallweit 2023-05-11 20:05:14 UTC
Thanks a lot for testing! As I said, apparently this a HW erratum. Unfortunately Realtek doesn't publish such information.
The vendor drivers seem to work around this by doing a full hw initialization on each link-up.
To simulate this for testing:
Supposedly runtime pm for the NIC isn't active on your system.
Do a "echo auto > /sys/class/net/<if>/power/control" and unplug the cable for more than 10s. Then NIC goes into runtime suspend with PCIe switching to D3hot.
The PME triggered by link-up initializes the HW. You should also see the PCIe PME interrupt count incrementing in /proc/interrupts.
Comment 122 Marcin Parka 2023-05-12 19:28:08 UTC
If you could give a little bit more explanation. One thing is that for Proxmox hypervisor ("pve" host) I have disabled ASPM. The reason is that vendor driver (which I am using now) when getting ingress traffic was "waking up slowly". It was visible by iperf3 -s for inbound. I was getting ~300mbit, ~300mbit, ~300mbit, then after 3-5 seconds: 2.47 Gbit, 2.47 Gbit. I have found the solution - I was able to make it "always fast" by changing the ASPM settings in BIOS or giving parameters in command line as given below (see "pcie_port_pm=off pcie_aspm.policy=performance"). And remark - r8169 was NOT impacted by performance by ASPM - it was fast always. But it has the issue, which we are discussing here.

[root@pve ~]# dmesg | head -3
[    0.000000] microcode: microcode updated early to revision 0x3e, date = 2022-09-16
[    0.000000] Linux version 5.15.107-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.107-1 (2023-04-20T10:05Z) ()
[    0.000000] Command line: BOOT_IMAGE=/@/boot/vmlinuz-5.15.107-1-pve root=UUID=7719ce9c-5acb-4d3d-ae50-112101732256 ro rootflags=subvol=@ quiet nosgx intel_iommu=on iommu=pt pcie_port_pm=off pcie_aspm.policy=performance
[root@pve ~]# ll /sys/class/net/enp*s0/power/control
-rw-r--r-- 1 root root 4096 May 12 07:38 /sys/class/net/enp1s0/power/control
-rw-r--r-- 1 root root 4096 May 12 07:38 /sys/class/net/enp2s0/power/control
[root@pve ~]# cat /sys/class/net/enp*s0/power/control
auto
auto
[root@pve ~]#

Note, that I haven't executed the "echo" as given by you - my two interfaces are already in auto control. The 3rd one (enp3s0) is not displayed as it is forwarded to VM (so, host machine stops seeing this interface).

Now the same for VM itself - on which I have been executing all tests. The interface which was forwarded by PCIe passthrough is enx001e064549e5. The enx26ee9bd6b647 is virtio network. But as you can see - they are both in "auto" as well (without doing any additional command):


[root@r8169 ~]# dmesg | head -3
[    0.000000] Linux version 6.4.0-rc1 (root@PK) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #8 SMP PREEMPT_DYNAMIC Thu May 11 18:35:51 CEST 2023
[    0.000000] Command line: BOOT_IMAGE=/@/boot/vmlinuz-6.4.0-rc1 root=UUID=8f1ac5cf-d262-4729-ba23-dd0513ee3db7 ro rootflags=subvol=@ quiet
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[root@r8169 ~]# dmesg | head -3
[    0.000000] Linux version 6.4.0-rc1 (root@PK) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #8 SMP PREEMPT_DYNAMIC Thu May 11 18:35:51 CEST 2023
[    0.000000] Command line: BOOT_IMAGE=/@/boot/vmlinuz-6.4.0-rc1 root=UUID=8f1ac5cf-d262-4729-ba23-dd0513ee3db7 ro rootflags=subvol=@ quiet
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[root@r8169 ~]# ll /sys/class/net/enx*/power/control
-rw-r--r-- 1 root root 4096 May 12 21:10 /sys/class/net/enx26ee9bd6b647/power/control
-rw-r--r-- 1 root root 4096 May 12 21:10 /sys/class/net/enx001e064549e5/power/control
[root@r8169 ~]# cat /sys/class/net/enx*/power/control
auto
auto
[root@r8169 ~]#



Now /proc/interrupts - which one is PCIe PME?

[root@r8169 ~]# cat /proc/interrupts
           CPU0       CPU1
  0:         33          0   IO-APIC   2-edge      timer
  1:          0          9   IO-APIC   1-edge      i8042
  6:          0          3   IO-APIC   6-edge      floppy
  8:          1          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 10:          0         70   IO-APIC  10-fasteoi   virtio0
 11:          0          0   IO-APIC  11-fasteoi   uhci_hcd:usb1
 12:         33          0   IO-APIC  12-edge      i8042
 14:          0          0   IO-APIC  14-edge      ata_piix
 15:          0          0   IO-APIC  15-edge      ata_piix
 24:       1193          0  PCI-MSIX-0000:00:10.0   0-edge      enx001e064549e5
 25:          0          0  PCI-MSIX-0000:00:05.0   0-edge      virtio1-config
 26:          0          0  PCI-MSIX-0000:00:05.0   1-edge      virtio1-control
 27:          0          0  PCI-MSIX-0000:00:05.0   2-edge      virtio1-event
 28:        837          0  PCI-MSIX-0000:00:05.0   3-edge      virtio1-request
 29:          0       1008  PCI-MSIX-0000:00:05.0   4-edge      virtio1-request
 30:          0          0  PCI-MSIX-0000:00:12.0   0-edge      virtio3-config
 31:          0          0  PCI-MSIX-0000:00:12.0   1-edge      virtio3-input.0
 32:          0          0  PCI-MSIX-0000:00:12.0   2-edge      virtio3-output.0
 33:          0          0  PCI-MSIX-0000:00:08.0   0-edge      virtio2-config
 34:          0          3  PCI-MSIX-0000:00:08.0   1-edge      virtio2-virtqueues
NMI:          0          0   Non-maskable interrupts
LOC:       6841       4121   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:        455        935   Rescheduling interrupts
CAL:       3366       4221   Function call interrupts
TLB:          7         12   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          1          1   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0   Posted-interrupt notification event
NPI:          0          0   Nested posted-interrupt event
PIW:          0          0   Posted-interrupt wakeup event
[root@r8169 ~]#



On host (hypervisor) it looks differently (I see some PME PCI records):

[root@pve ~]# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
   0:         17          0          0          0  IR-IO-APIC    2-edge      timer
   8:          0          0          0          0  IR-IO-APIC    8-fasteoi   rtc0
   9:          0          0          0          0  IR-IO-APIC    9-fasteoi   acpi
  14:          0          0          0          0  IR-IO-APIC   14-fasteoi   INT3453:00, INT3453:01, INT3453:03
  15:          0          0          0          0  IR-IO-APIC   15-fasteoi   INT3453:02
  20:          0          0          0         27  IR-IO-APIC   20-fasteoi   i801_smbus
  31:          0          0          0          0  IR-IO-APIC   31-fasteoi   idma64.0, i2c_designware.0
  32:          0          0          0          0  IR-IO-APIC   32-fasteoi   idma64.1, i2c_designware.1
  33:          0          0          0          0  IR-IO-APIC   33-fasteoi   idma64.2, i2c_designware.2
  34:          0          0          0          0  IR-IO-APIC   34-fasteoi   idma64.3, i2c_designware.3
  39:          0          0     725786          0  IR-IO-APIC   39-fasteoi   mmc0
 120:          0          0          0          0  DMAR-MSI    0-edge      dmar0
 121:          0          0          0          0  DMAR-MSI    1-edge      dmar1
 122:          0          0          0          0  IR-PCI-MSI 311296-edge      PCIe PME
 123:          0          0          0          0  IR-PCI-MSI 327680-edge      PCIe PME
 124:          0          0          0          0  IR-PCI-MSI 329728-edge      PCIe PME
 125:          0          0    3333801          0  IR-PCI-MSI 294912-edge      ahci[0000:00:12.0]
 126:          0          0          0          0  IR-PCI-MSI 344064-edge      xhci_hcd
 127:          0    5966557          0          0  IR-PCI-MSI 524288-edge      enp1s0-0
 128:          0          0          0          0  IR-PCI-MSI 524289-edge      enp1s0-1
 129:          0          0          0          0  IR-PCI-MSI 524290-edge      enp1s0-2
 130:          0          0          0          0  IR-PCI-MSI 524291-edge      enp1s0-3
 131:          0          0          0          0  IR-PCI-MSI 524292-edge      enp1s0-4
 132:          0          0          0          0  IR-PCI-MSI 524293-edge      enp1s0-5
 133:          0          0          0          0  IR-PCI-MSI 524294-edge      enp1s0-6
 134:          0          0          0          0  IR-PCI-MSI 524295-edge      enp1s0-7
 135:          0          0          0          0  IR-PCI-MSI 524296-edge      enp1s0-8
 136:          0          0          0          0  IR-PCI-MSI 524297-edge      enp1s0-9
 137:          0          0          0          0  IR-PCI-MSI 524298-edge      enp1s0-10
 138:          0          0          0          0  IR-PCI-MSI 524299-edge      enp1s0-11
 139:          0          0          0          0  IR-PCI-MSI 524300-edge      enp1s0-12
 140:          0          0          0          0  IR-PCI-MSI 524301-edge      enp1s0-13
 141:          0          0          0          0  IR-PCI-MSI 524302-edge      enp1s0-14
 142:          0          0          0          0  IR-PCI-MSI 524303-edge      enp1s0-15
 143:          0    6050052          0          0  IR-PCI-MSI 524304-edge      enp1s0-16
 144:          0          0          0          0  IR-PCI-MSI 524305-edge      enp1s0-17
 145:          0          0          0          0  IR-PCI-MSI 524306-edge      enp1s0-18
 146:          0          0          0          0  IR-PCI-MSI 524307-edge      enp1s0-19
 147:          0          0          0          0  IR-PCI-MSI 524308-edge      enp1s0-20
 148:          0          0          1          0  IR-PCI-MSI 524309-edge      enp1s0-21
 149:          0          0          0          0  IR-PCI-MSI 524310-edge      enp1s0-22
 150:          0          0          0          0  IR-PCI-MSI 524311-edge      enp1s0-23
 151:          0          0          0          0  IR-PCI-MSI 524312-edge      enp1s0-24
 152:          0          0          0          0  IR-PCI-MSI 524313-edge      enp1s0-25
 153:          0          0          0          0  IR-PCI-MSI 524314-edge      enp1s0-26
 154:          0          0          0          0  IR-PCI-MSI 524315-edge      enp1s0-27
 155:          0          0          0          0  IR-PCI-MSI 524316-edge      enp1s0-28
 156:          0          0          0          0  IR-PCI-MSI 524317-edge      enp1s0-29
 157:          0          0          0          0  IR-PCI-MSI 524318-edge      enp1s0-30
 158:          0          0          0          0  IR-PCI-MSI 524319-edge      enp1s0-31
 159:          0          0         36          0  IR-PCI-MSI 245760-edge      mei_me
 160:          0    8028785          0          0  IR-PCI-MSI 1048576-edge      enp2s0-0
 161:          0          0          0          0  IR-PCI-MSI 1048577-edge      enp2s0-1
 162:          0          0          0          0  IR-PCI-MSI 1048578-edge      enp2s0-2
 163:          0          0          0          0  IR-PCI-MSI 1048579-edge      enp2s0-3
 164:          0          0          0          0  IR-PCI-MSI 1048580-edge      enp2s0-4
 165:          0          0          0          0  IR-PCI-MSI 1048581-edge      enp2s0-5
 166:          0          0          0          0  IR-PCI-MSI 1048582-edge      enp2s0-6
 167:          0          0          0          0  IR-PCI-MSI 1048583-edge      enp2s0-7
 168:          0          0          0          0  IR-PCI-MSI 1048584-edge      enp2s0-8
 169:          0          0          0          0  IR-PCI-MSI 1048585-edge      enp2s0-9
 170:          0          0          0          0  IR-PCI-MSI 1048586-edge      enp2s0-10
 171:          0          0          0          0  IR-PCI-MSI 1048587-edge      enp2s0-11
 172:          0          0          0          0  IR-PCI-MSI 1048588-edge      enp2s0-12
 173:          0          0          0          0  IR-PCI-MSI 1048589-edge      enp2s0-13
 174:          0          0          0          0  IR-PCI-MSI 1048590-edge      enp2s0-14
 175:          0          0          0          0  IR-PCI-MSI 1048591-edge      enp2s0-15
 176:          0    8839858          0          0  IR-PCI-MSI 1048592-edge      enp2s0-16
 177:          0          0          0          0  IR-PCI-MSI 1048593-edge      enp2s0-17
 178:          0          0          0          0  IR-PCI-MSI 1048594-edge      enp2s0-18
 179:          0          0          0          0  IR-PCI-MSI 1048595-edge      enp2s0-19
 180:          0          0          0          0  IR-PCI-MSI 1048596-edge      enp2s0-20
 181:          0          0          3          0  IR-PCI-MSI 1048597-edge      enp2s0-21
 182:          0          0          0          0  IR-PCI-MSI 1048598-edge      enp2s0-22
 183:          0          0          0          0  IR-PCI-MSI 1048599-edge      enp2s0-23
 184:          0          0          0          0  IR-PCI-MSI 1048600-edge      enp2s0-24
 185:          0          0          0          0  IR-PCI-MSI 1048601-edge      enp2s0-25
 186:          0          0          0          0  IR-PCI-MSI 1048602-edge      enp2s0-26
 187:          0          0          0          0  IR-PCI-MSI 1048603-edge      enp2s0-27
 188:          0          0          0          0  IR-PCI-MSI 1048604-edge      enp2s0-28
 189:          0          0          0          0  IR-PCI-MSI 1048605-edge      enp2s0-29
 190:          0          0          0          0  IR-PCI-MSI 1048606-edge      enp2s0-30
 191:          0          0          0          0  IR-PCI-MSI 1048607-edge      enp2s0-31
 192:          0       4458          0          0  IR-PCI-MSI 1572864-edge      vfio-msix[0](0000:03:00.0)
 224:          0          0          0         53  IR-PCI-MSI 32768-edge      i915
 225:        362          0          0          0  IR-PCI-MSI 229376-edge      snd_hda_intel:card0
 NMI:       1188       1269       1183       1176   Non-maskable interrupts
 LOC:   34826494   35213804   34565557   34451312   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:       1188       1269       1183       1176   Performance monitoring interrupts
 IWI:         16          0          0          7   IRQ work interrupts
 RTR:          0          0          0          0   APIC ICR read retries
 RES:     622077     574339     560971     611903   Rescheduling interrupts
 CAL:    4991583    5201007    4762941    4735719   Function call interrupts
 TLB:     239654     192327     182074     188543   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:        294        295        295        295   Machine check polls
 ERR:          0
 MIS:          0
 PIN:     170890     183860     164263     165410   Posted-interrupt notification event
 NPI:          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0   Posted-interrupt wakeup event
[root@pve ~]#
Comment 123 Heiner Kallweit 2023-05-12 21:43:34 UTC
I'm not a virtualization expert, maybe PME doesn't exist on the guest.
To ensure that runtime pm kicks in you could unplug the cable a little longer (say 30s). Then /sys/class/net/<if>/device/power_state should show D3hot for the NIC.
Comment 124 Heiner Kallweit 2023-05-13 11:28:43 UTC
Does the problem also occur if you set runtime pm control attribute to "on"?