Created attachment 307007 [details] dmesg when r8169 failed. After resuming, r8169 shows: kernel: r8169 0000:02:00.1 enp2s0f1: PCI error (cmd = 0x0407, status_errs = 0x0000) Hardware: The HP ZBook Power 15.6 inch G10 Ethernet: kernel: r8169 0000:02:00.1 eth0: RTL8168fp/RTL8117, 7c:4d:8f:a4:02:31, XID 54b, IRQ 71 kernel: r8169 0000:02:00.1 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko] kernel: r8169 0000:02:00.1 eth0: DASH disabled The issue is only reproduced on 1 of 4 HP G10. Failrate 1/1. The first regression patch: 5e864d90b2080 r8169: skip DASH fw status checks when DASH is disabled Before this patch, 4 HP G10 works fine. After this patch, only 1 HP G10 failed.
Apart from the error message, is there any actual issue with connectivity? I think the error message is a red herring, because in status_errs not a single error bit is set.
After the error, the ethernet can not link up.
Thanks. Please note that 6.8 isn't an LTS kernel and therefore won't receive fixes. Best re-test with latest mainline kernel 6.11. If the issue persists, please re-test with the following applied: diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 1a4d834c2..322a1e930 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4805,7 +4805,11 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance) struct rtl8169_private *tp = dev_instance; u32 status = rtl_get_events(tp); - if ((status & 0xffff) == 0xffff || !(status & tp->irq_mask)) + if ((status & 0xffff) == 0xffff) + return IRQ_NONE; + + status &= tp->irq_mask | RxFIFOOver; + if (!status) return IRQ_NONE; if (unlikely(status & SYSErr)) { --
Thanks you very much, Kallweit. Your change do fix the issue. It's reproduced on 6.11 upsteam kernel. Do you have plan to make the change upsteam?
Thanks, this patch works out great. https://github.com/torvalds/linux/compare/master...pseudocc:linux:r8169
(In reply to Pengyu Ma from comment #4) > Thanks you very much, Kallweit. > > Your change do fix the issue. > > It's reproduced on 6.11 upsteam kernel. > > Do you have plan to make the change upsteam? Yes, this change will be applied as a fix in mainline. Means it will also be rolled out to LTS kernel versions.
@Kallweit Thank you very much.
I found that the proposed patch causes issues under heavy load. Reason isn't clear yet. Could you please test whether the following fixes the issue for you as well? This change should have no side effects. diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 4166f1ab8..56ff450ad 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4749,7 +4749,8 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance) if ((status & 0xffff) == 0xffff || !(status & tp->irq_mask)) return IRQ_NONE; - if (unlikely(status & SYSErr)) { + if (unlikely(status & SYSErr && + tp->mac_version <= RTL_GIGA_MAC_VER_06)) { rtl8169_pcierr_interrupt(tp->dev); goto out; } -- 2.47.0
(In reply to Heiner Kallweit from comment #8) > I found that the proposed patch causes issues under heavy load. Reason isn't > clear yet. Could you please test whether the following fixes the issue for > you as well? > This change should have no side effects. > > diff --git a/drivers/net/ethernet/realtek/r8169_main.c > b/drivers/net/ethernet/realtek/r8169_main.c > index 4166f1ab8..56ff450ad 100644 > --- a/drivers/net/ethernet/realtek/r8169_main.c > +++ b/drivers/net/ethernet/realtek/r8169_main.c > @@ -4749,7 +4749,8 @@ static irqreturn_t rtl8169_interrupt(int irq, void > *dev_instance) > if ((status & 0xffff) == 0xffff || !(status & tp->irq_mask)) > return IRQ_NONE; > > - if (unlikely(status & SYSErr)) { > + if (unlikely(status & SYSErr && > + tp->mac_version <= RTL_GIGA_MAC_VER_06)) { > rtl8169_pcierr_interrupt(tp->dev); > goto out; > } > -- > 2.47.0 Much appreciate it, this patch also works.
Thanks, Kallweit, The patch fixes the issue.