Created attachment 295165 [details]
dmesg after the first suspend on 5.11.0-rc6

When waking up from suspend, r8169 fails to restart the PHY, preventing any form of networking until a complete reboot.

This bug was introduced in commit e80bd76fbf563cc7ed8c9e9f3bbcdf59b0897f69 ("r8169: work around power-saving bug on some chip versions") and could be reproduced with
- the latest net-next kernel (5.11.0-rc6)
- stable 4.19.0-171
but not with stable 4.19.0-160.

The bug occurs regularly when suspending the machine, but sometimes everything works fine after suspend. However, on stable 4.19.0-171, when suspending without any LAN cable plugged in, the kernel partially freezes and has to be restarted with the case switch.

cat /proc/version:
Linux version 5.11.0-rc6-net-next+ (wolf@MX-Linux-Intel) (gcc (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #3 SMP Sat Feb 6 20:41:37 CET 2021

hostnamectl | grep "Operating System":
Operating System: Debian GNU/Linux 10 (buster)

lspci -nn:
00:00.0 Host bridge [0600]: Intel Corporation 2nd Generation Core Processor Family DRAM Controller [8086:0100] (rev 09)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port [8086:0101] (rev 09)
00:02.0 Display controller [0380]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0102] (rev 09)
00:16.0 Communication controller [0780]: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 [8086:1c3a] (rev 04)
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 05)
00:1b.0 Audio device [0403]: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller [8086:1c20] (rev 05)
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 [8086:1c10] (rev b5)
00:1c.2 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 [8086:1c14] (rev b5)
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 05)
00:1f.0 ISA bridge [0601]: Intel Corporation H61 Express Chipset Family LPC Controller [8086:1c5c] (rev 05)
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02] (rev 05)
00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 05)
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cedar [Radeon HD 7350/8350 / R5 220] [1002:68fa]
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300/7300 Series] [1002:aa68]
03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136] (rev 05)
Created attachment 295167 [details] dmesg after doing a second suspend with kernel 5.11.0-rc6
Created attachment 295169 [details] dmesg when reloading the module after suspend with kernel 5.11.0-rc6
Created attachment 295171 [details] dmesg with kernel 4.19.0-171 (partial freeze)
Thanks for the report. Seems like the fix for some other chip version triggered a hw issue on RTL8105e. I can't reproduce the issue on RTL8168g.
Could you please test whether the following fixes the issue (patch applies up to 5.11, not on net-next).

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 0d78408b4..e7a59dc5f 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -2208,6 +2208,7 @@ static void rtl_pll_power_down(struct rtl8169_private *tp)
 
 	switch (tp->mac_version) {
 	case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_26:
+	case RTL_GIGA_MAC_VER_29 ... RTL_GIGA_MAC_VER_30:
 	case RTL_GIGA_MAC_VER_32 ... RTL_GIGA_MAC_VER_33:
 	case RTL_GIGA_MAC_VER_37:
 	case RTL_GIGA_MAC_VER_39:
@@ -2235,6 +2236,7 @@ static void rtl_pll_power_up(struct rtl8169_private *tp)
 {
 	switch (tp->mac_version) {
 	case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_26:
+	case RTL_GIGA_MAC_VER_29 ... RTL_GIGA_MAC_VER_30:
 	case RTL_GIGA_MAC_VER_32 ... RTL_GIGA_MAC_VER_33:
 	case RTL_GIGA_MAC_VER_37:
 	case RTL_GIGA_MAC_VER_39:
-- 
2.30.0
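For readers unfamiliar with the `case A ... B:` syntax in the patch: it is the GNU C case-range extension (accepted by GCC and Clang), and the change simply adds versions 29 and 30 to the set the switch already matches. The following is a minimal standalone sketch of that matching logic only. The enum, its values, the function name and the `patched` flag are all hypothetical stand-ins for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's chip version enum.
 * Only the relative ordering of the values matters here. */
enum mac_version {
	VER_25 = 25, VER_26, VER_27, VER_28, VER_29, VER_30, VER_31,
	VER_32, VER_33, VER_34, VER_35, VER_36, VER_37, VER_38, VER_39,
};

/* Returns true if the version falls into one of the listed case ranges.
 * 'patched' selects whether the added 29 ... 30 range is taken into account. */
bool matches_pll_case(enum mac_version ver, bool patched)
{
	switch (ver) {
	case VER_29 ... VER_30:		/* range added by the patch */
		return patched;
	case VER_25 ... VER_26:		/* ranges that were already listed */
	case VER_32 ... VER_33:
	case VER_37:
	case VER_39:
		return true;
	default:
		return false;
	}
}

/* Small self-check: versions 29 and 30 only match once the range is added. */
void self_check(void)
{
	assert(!matches_pll_case(VER_29, false));
	assert(matches_pll_case(VER_29, true));
	assert(matches_pll_case(VER_30, true));
	assert(!matches_pll_case(VER_31, true));
}
```

Calling `self_check()` from a test program exercises the before and after behaviour. Versions outside the listed ranges, such as 31, fall through to `default` in both cases.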
To avoid misunderstandings: net-next is the development version for 5.12. 5.11-rc isn't net-next.
My fault, I should have called it net-next and not 5.11-rc6. I will edit the original bug report to change that.
I can't edit the original report, so just ignore the last line.
On net-next you can test the following:

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 04231585e..376dfd011 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -1252,6 +1252,7 @@ static void rtl_set_d3_pll_down(struct rtl8169_private *tp, bool enable)
 {
 	switch (tp->mac_version) {
 	case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_26:
+	case RTL_GIGA_MAC_VER_29 ... RTL_GIGA_MAC_VER_30:
 	case RTL_GIGA_MAC_VER_32 ... RTL_GIGA_MAC_VER_37:
 	case RTL_GIGA_MAC_VER_39 ... RTL_GIGA_MAC_VER_63:
 		if (enable)
-- 
2.30.1
OK, after some testing, I observed the following:
- the bug only appears after the computer has been disconnected from the power line for some time; after a reboot, the bug seems to be gone
- with the patched 5.11 kernel, the bug never appears
So I believe the patch solved the problem, thank you.
Interesting, thanks for the feedback! Not sure what difference it makes for the NIC whether the system runs on battery or AC. At first glance this sounds more like a BIOS bug. But if the patch can avoid the issue, fine with me.
Maybe it partly is: the NIC also disappears from the PCIe bus without reboot=pci, which, however, is a BIOS bug. I think the bug is resolved now.