Bug 205047
Summary: | e1000e driver crashes/resets connection | ||
---|---|---|---|
Product: | Drivers | Reporter: | ginste51 |
Component: | Network | Assignee: | drivers_network (drivers_network) |
Status: | NEW --- | ||
Severity: | normal | CC: | adam, aminux, andymann375, anon.amish, bugzilla.kernel.org, cagnulein, carloscg, chemobejk, christian.rohmann, dan3805, dion, felix, grizzlyuser, hbayindir, jeffrey.t.kirsher, kimmo, lgpserranegra, lucas.yamanishi, marcus, mg05182-kernel, michael.groh, michael.j.lelli, mikegarcia556, nenad, ngodfriedt+bugzilla, null, nvaert1986, peter, pierrick, pnedkov, pyther, raxetul, sam.saffron, sasha.neftin, spamtrap, tabaire, thomas.natschlaeger, tmn505, vikb, vitaly.lifshits, web, wolkenschieber, xxdrshadowxx |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 5.3.1 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
lspci of affected system
dmesg of ThinkPad T470s failing to connect (via NetworkManager)
dmesg of comment 7
journalctl for comment 7
dmesg of ThinkPad T470s - with patch from comment #11 applied.
attachment-5901-0.html
Thinkpad T470s - dmesg output with additional printk - 5.4.0-rc3
e1000e interface fails to connect with NetworkManager
Patch to revert commit 59653e6497d16f7ac1d9db088f3959f57ee8c3db based on 5.4.0-rc8
Rejects from Attachment #285979 when applied to kernel 5.5rc1
attachment-24216-0.html
dmesg output of Lenovo ThinkCentre M900 Tiny with the error
attachment-13735-0.html |
Description
ginste51
2019-09-30 10:24:32 UTC
This bug also affects me. When connected, the interface tries to come up but is disconnected immediately, and this repeats over and over. My interface is an 82577LC [8086:10eb] (more info in the attached lspci output).

I bisected this to:

commit def4ec6dce393e2136b62a05712f35a7fa5f5e56
e1000e: PCIm function state support

After reverting this commit everything went back to normal. The latest 5.4-rc1 does not fix the issue.

Created attachment 285283 [details]
lspci of affected system
Also happening here:

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V
    Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V
    Flags: bus master, fast devsel, latency 0, IRQ 151
    Memory at df400000 (32-bit, non-prefetchable) [size=128K]
    Capabilities: [c8] Power Management version 3
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [e0] PCI Advanced Features
    Kernel driver in use: e1000e
    Kernel modules: e1000e

Please attach the dmesg log. If possible, try applying the mentioned patch on an older kernel (4.19, for example). Also, please try to reproduce this with the out-of-tree e1000e driver:
https://sourceforge.net/projects/e1000/files/e1000e%20stable/3.6.0/

I can confirm this problem with a Lenovo ThinkPad T470s. After upgrading to 5.4-rc1 I was unable to get my Ethernet connection working. According to "lspci -v", I have the following NIC:

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (4) I219-V (rev 21)
    Subsystem: Lenovo Ethernet Connection (4) I219-V
    Flags: bus master, fast devsel, latency 0, IRQ 132
    Memory at e2200000 (32-bit, non-prefetchable) [size=128K]
    Capabilities: [c8] Power Management version 3
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [e0] PCI Advanced Features
    Kernel driver in use: e1000e
    Kernel modules: e1000e

I will attach dmesg. I will also revert said commit and report back whether it works without "e1000e: PCIm function state support".

Thank you for your work,
Michael

Created attachment 285397 [details]
dmesg of ThinkPad T470s failing to connect (via NetworkManager)
I built the in-tree module for 4.19.77 with the cherry-picked commit

def4ec6dce393e2136b62a05712f35a7fa5f5e56
e1000e: PCIm function state support

Result: Does not work. Same problem as reported initially. See the attached logs.

Created attachment 285403 [details]
dmesg of comment 7

Created attachment 285405 [details]
journalctl for comment 7

I can now confirm with said ThinkPad T470s that 5.4.0-rc2 has a working Ethernet connection again if I revert the following commit:

commit def4ec6dce393e2136b62a05712f35a7fa5f5e56
e1000e: PCIm function state support

If someone needs me to test a new version of said patch, I will gladly help. Also, if I should test the out-of-tree module with/without the patch, I can do that too if requested.

Thanks for your work,
Michael

Please try applying this patch:
http://patchwork.ozlabs.org/patch/1172931/

Created attachment 285465 [details]
dmesg of ThinkPad T470s - with patch from comment #11 applied.

I applied the patch from comment 11, and it seems that the bug is not fixed. After login, NetworkManager tries to connect but never succeeds. Attached is the dmesg output; if you need more info I am happy to help.

Thank you for your work,
Michael

Hi,

I can confirm that this issue affects at least 2 users of motherboards in the Z370/Z390 family on kernels 5.3.1 through 5.3.5:
https://bugs.archlinux.org/task/64018

I'm currently using a dkms package for the e1000e driver, which is a build of e1000e without the commit

def4ec6dce393e2136b62a05712f35a7fa5f5e56
e1000e: PCIm function state support

Sorry, I thought there was some kind of edit function here... So, what I meant about the dkms package above is that without commit def4ec6dce393e2136b62a05712f35a7fa5f5e56, the network interface works as expected.

I just wanted to add that, in order to have working Ethernet on the ThinkPad T470s with 5.4.0-rc3, I still had to revert commit def4ec6dce393e2136b62a05712f35a7fa5f5e56. Is there a timeline for a potential fix?
If there is not, I think the revert should be included in 5.4.0, since it is a regression for quite popular hardware.

Thank you for your work,
Michael

Created attachment 285515 [details]
attachment-5901-0.html
We weren't able to reproduce this issue on our side, which is why fixing it is difficult. We are working on reproducing it in order to get more information about this bug. I think that reverting this patch is not possible, since it's a bug fix for LM devices. Until we have a system that reproduces it, I could use your help with debugging. Can you please add this line before and after the patch in the watchdog_task function:

printk("e1000e deb: STATUS = %d\n", er32(STATUS));

Also, please attach the dmesg output.

Created attachment 285517 [details]
Thinkpad T470s - dmesg output with additional printk - 5.4.0-rc3
As requested, here is the dmesg output from a failing kernel (5.4.0-rc3) with the additional printk lines.
It makes no difference whether the laptop is in its dock (with the Ethernet cable connected to the dock) or by itself (Ethernet cable connected to the port on the laptop).
Also, the STATUS value 1074266240 is only reached when I disconnect the Ethernet cable, which I did at 80 and 149 seconds of runtime.
Let me know if I can help to debug this further.
Thank you for your work,
Michael
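
The decimal STATUS values printed by the debug printk are easier to read once split into individual bits. The following is a minimal user-space sketch, not driver code: the mask values are assumed to match drivers/net/ethernet/intel/e1000e/defines.h and should be checked against your kernel tree, and the helper functions (status_link_up, status_full_duplex, status_speed_mbps, describe_status) are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* STATUS register bit masks. ASSUMPTION: these mirror the e1000e
 * driver's defines.h; verify against your kernel source. */
#define E1000_STATUS_FD                0x00000001UL /* full duplex */
#define E1000_STATUS_LU                0x00000002UL /* link up */
#define E1000_STATUS_SPEED_100         0x00000040UL /* 100 Mb/s */
#define E1000_STATUS_SPEED_1000        0x00000080UL /* 1000 Mb/s */
#define E1000_STATUS_GIO_MASTER_ENABLE 0x00080000UL /* master request status */
#define E1000_STATUS_PCIM_STATE        0x40000000UL /* PCIm function state */

/* Hypothetical helpers, not part of the driver. */
int status_link_up(uint32_t status)
{
    return (status & E1000_STATUS_LU) != 0;
}

int status_full_duplex(uint32_t status)
{
    return (status & E1000_STATUS_FD) != 0;
}

/* Speed in Mb/s; 1000 is tested first, then 100, otherwise 10. */
unsigned int status_speed_mbps(uint32_t status)
{
    if (status & E1000_STATUS_SPEED_1000)
        return 1000;
    if (status & E1000_STATUS_SPEED_100)
        return 100;
    return 10;
}

/* Writes a one-line summary of the decoded bits into buf and returns
 * the value returned by snprintf. */
int describe_status(uint32_t status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "0x%08lX: link %s, %s duplex bit, %u Mb/s, "
                    "GIO master %s, PCIm state bit %s",
                    (unsigned long)status,
                    status_link_up(status) ? "up" : "down",
                    status_full_duplex(status) ? "full" : "half",
                    status_speed_mbps(status),
                    (status & E1000_STATUS_GIO_MASTER_ENABLE) ? "on" : "off",
                    (status & E1000_STATUS_PCIM_STATE) ? "set" : "clear");
}
```

Under these assumed masks, 1074266240 is 0x40080080 (link-up and full-duplex bits clear, speed bits reading 1000 Mb/s, PCIm state bit set), which is consistent with the value showing up when the cable is unplugged. The other value that appears in the dmesg excerpts of this thread, 1074266243, is 0x40080083 and differs only in having the link-up and full-duplex bits set.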
Please try:

1. rmmod mei && rmmod mei_me
2. removing the if in the patch and moving the call e1000_phy_hw_reset(&adapter->hw) outside of the while loop:

if (!(pcim_state & E1000_STATUS_PCIM_STATE))
    e1000_phy_hw_reset(&adapter->hw);

I am running Arch Linux on a Dell T5610 with an 82579LM rev 06, and I can easily reproduce the problem with all 5.3.x releases so far. After boot, the e1000e network interface constantly switches between the "activated" and "deactivated" states every few seconds. The LEDs on the network port alternate between going blank and blinking yellow every few seconds. The last known kernel version where e1000e works fine is 5.2.14. Please let me know how I can help.

Created attachment 285549 [details]
e1000e interface fails to connect with NetworkManager
(In reply to Vitaly Lifshits from comment #19)
> Please try:
>
> 1. rmmod mei && rmmod mei_me
> 2. removing the if in the patch and moving the call
> e1000_phy_hw_reset(&adapter->hw) outside of the while loop:
>
> if (!(pcim_state & E1000_STATUS_PCIM_STATE))
> e1000_phy_hw_reset(&adapter->hw);

Hello Vitaly,

I tried both, and the problem is still there. I ran "rmmod mei_hdcp && rmmod mei_me && rmmod mei && rmmod e1000e && modprobe e1000e" but still can't get it to connect.

dmesg says:

[ 959.013605] e1000e 0000:00:1f.6 enp0s31f6: removed PHC
[ 959.088952] e1000e: enp0s31f6 NIC Link is Down
[ 959.133390] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[ 959.133392] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 959.133597] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[ 959.323759] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
[ 959.388088] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) c8:5b:76:fb:b5:47
[ 959.388122] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
[ 959.388222] e1000e 0000:00:1f.6 eth0: MAC: 12, PHY: 12, PBA No: 1000FF-0FF
[ 959.393484] e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
[ 995.521449] e1000e deb: STATUS = 1074266243
[ 997.536794] e1000e deb: STATUS = 1074266240
[ 997.536808] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: None
[ 997.537766] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s31f6: link becomes ready
[ 1001.857343] e1000e deb: STATUS = 1074266243
[ 1003.883432] e1000e deb: STATUS = 1074266240
[ 1003.883440] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: None
[ 1009.217251] e1000e deb: STATUS = 1074266243
[ 1011.240393] e1000e deb: STATUS = 1074266240
[ 1011.240402] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: None
[ 1015.617225] e1000e deb: STATUS = 1074266243
[ 1017.642151] e1000e deb: STATUS = 1074266240

Is there anything more I can do to help debug the problem?
Thank you, Michael My dmesg: 9.642143] audit: type=1130 audit(1572499108.324:29): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-wait-online comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 9.857839] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 9.857874] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready [ 14.009279] audit: type=1131 audit(1572499112.691:30): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 15.976448] audit: type=1130 audit(1572499114.657:31): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 16.065737] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 22.271383] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 26.009253] audit: type=1131 audit(1572499124.691:32): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 28.388391] audit: type=1130 audit(1572499127.071:33): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 28.464720] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 32.437828] random: crng init done [ 32.437833] random: 5 urandom warning(s) missed due to ratelimiting [ 32.710380] aufs 5.3-20190923 [ 32.868380] resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000d8000-0x000dbfff window] [ 32.868507] caller _nv000939rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs [ 32.951333] bridge: filtering via arp/ip/ip6tables is no longer available by default. 
Update your scripts to load br_netfilter if you need this. [ 32.953206] Bridge firewalling registered [ 32.973349] audit: type=1325 audit(1572499131.654:34): table=nat family=2 entries=0 [ 32.977263] audit: type=1325 audit(1572499131.657:35): table=filter family=2 entries=0 [ 32.993180] audit: type=1325 audit(1572499131.674:36): table=nat family=2 entries=5 [ 32.994507] audit: type=1325 audit(1572499131.677:37): table=filter family=2 entries=4 [ 32.996106] audit: type=1325 audit(1572499131.677:38): table=filter family=2 entries=6 [ 32.997528] audit: type=1325 audit(1572499131.681:39): table=filter family=2 entries=8 [ 32.999261] audit: type=1325 audit(1572499131.681:40): table=filter family=2 entries=10 [ 33.000619] audit: type=1325 audit(1572499131.681:41): table=filter family=2 entries=11 [ 33.001245] audit: type=1325 audit(1572499131.684:42): table=filter family=2 entries=12 [ 33.008622] Initializing XFRM netlink socket [ 33.019825] audit: type=1325 audit(1572499131.701:43): table=nat family=2 entries=7 [ 34.581104] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 38.989875] kauditd_printk_skb: 49 callbacks suppressed [ 38.989876] audit: type=1006 audit(1572499137.671:93): pid=1065 uid=0 old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=2 res=1 [ 39.004396] audit: type=1130 audit(1572499137.687:94): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user-runtime-dir@1000 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 39.005892] audit: type=1131 audit(1572499137.687:95): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? 
res=success' [ 39.009556] audit: type=1006 audit(1572499137.691:96): pid=1069 uid=0 old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=3 res=1 [ 39.061782] audit: type=1130 audit(1572499137.744:97): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user@1000 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 39.133667] fuse: init (API version 7.31) [ 39.643913] logitech-hidpp-device 0003:046D:4008.0005: HID++ 2.0 device connected. [ 40.349622] audit: type=1130 audit(1572499139.031:98): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=rtkit-daemon comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 40.716153] audit: type=1130 audit(1572499139.397:99): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 40.821104] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 47.024427] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 49.330765] audit: type=1131 audit(1572499148.014:100): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user@969 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 49.338774] audit: type=1131 audit(1572499148.021:101): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user-runtime-dir@969 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 51.005671] audit: type=1131 audit(1572499149.687:102): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 53.140055] audit: type=1130 audit(1572499151.821:103): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? 
res=success' [ 53.211122] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 59.401131] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 63.004496] audit: type=1131 audit(1572499161.687:104): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 65.518240] audit: type=1130 audit(1572499164.201:105): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 65.584434] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 71.757782] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 76.007103] audit: type=1131 audit(1572499174.687:106): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 77.874156] audit: type=1130 audit(1572499176.557:107): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 77.957836] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 84.117783] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 88.008960] audit: type=1131 audit(1572499186.691:108): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 90.235067] audit: type=1130 audit(1572499188.917:109): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? 
res=success' [ 90.317781] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 96.491127] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 100.008245] audit: type=1131 audit(1572499198.691:110): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 102.608210] audit: type=1130 audit(1572499201.291:111): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 102.690547] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 108.877784] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx ..... ..... Lots of same message ..... [ 1555.006073] audit: type=1131 audit(1572500653.687:364): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 1556.694618] audit: type=1130 audit(1572500655.377:365): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 1556.775603] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 1562.970906] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 1567.007714] audit: type=1131 audit(1572500665.691:366): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 1569.088027] audit: type=1130 audit(1572500667.771:367): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? 
res=success' [ 1569.144215] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 1575.330893] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 1579.007277] audit: type=1131 audit(1572500677.691:368): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 1581.447017] audit: type=1130 audit(1572500680.127:369): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' [ 1581.520891] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx ..... ..... continoues trying to reconnect ..... ---------------------------------------------------------------------------------------- I am using KDE and journalctl is below: Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63372, resource id: 35653485, major code: 19 (DeleteProperty), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63375, resource id: 35653485, major code: 19 (DeleteProperty), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63376, resource id: 35653485, major code: 18 (ChangeProperty), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63377, resource id: 35653485, major code: 19 (DeleteProperty), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63378, resource id: 35653485, major code: 19 (DeleteProperty), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63379, resource id: 35653485, major code: 19 (DeleteProperty), 
minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63380, resource id: 35653485, major code: 7 (ReparentWindow), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63381, resource id: 35653485, major code: 6 (ChangeSaveSet), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63382, resource id: 35653485, major code: 2 (ChangeWindowAttributes), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63383, resource id: 35653485, major code: 10 (UnmapWindow), minor code: 0 Eki 31 08:33:30 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63468, resource id: 35653491, major code: 15 (QueryTree), minor code: 0 Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.7515] device (eno1): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Withdrawing address record for 10.6.1.77 on eno1. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Leaving mDNS multicast group on interface eno1.IPv4 with address 10.6.1.77. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Interface eno1.IPv4 no longer relevant for mDNS. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Withdrawing address record for fe80::9f94:fdd5:ce5d:d1b3 on eno1. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Leaving mDNS multicast group on interface eno1.IPv6 with address fe80::9f94:fdd5:ce5d:d1b3. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Interface eno1.IPv6 no longer relevant for mDNS. 
Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.7579] manager: NetworkManager state is now CONNECTED_LOCAL Eki 31 08:33:31 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63702, resource id: 0, major code: 14 (GetGeometry), minor code: 0 Eki 31 08:33:31 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63703, resource id: 0, major code: 14 (GetGeometry), minor code: 0 Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.8454] device (eno1): carrier: link connected Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.8456] device (eno1): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.8464] policy: auto-activating connection 'MY CONNECTION' (bd544ca7-0af1-4df3-b6d5-82486565ad83) Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.8470] device (eno1): Activation: starting connection 'MY CONNECTION' (bd544ca7-0af1-4df3-b6d5-82486565ad83) Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.8471] device (eno1): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.8474] manager: NetworkManager state is now CONNECTING Eki 31 08:33:31 MY-HOSTNAME kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9370] device (eno1): state change: prepare -> config (reason 'none', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9375] device (eno1): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Joining mDNS multicast group on interface eno1.IPv6 with address fe80::9f94:fdd5:ce5d:d1b3. 
Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: New relevant interface eno1.IPv6 for mDNS. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Registering new address record for fe80::9f94:fdd5:ce5d:d1b3 on eno1.*. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Joining mDNS multicast group on interface eno1.IPv4 with address 10.6.1.77. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: New relevant interface eno1.IPv4 for mDNS. Eki 31 08:33:31 MY-HOSTNAME avahi-daemon[609]: Registering new address record for 10.6.1.77 on eno1.IPv4. Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <warn> [1572500011.9390] acd[0x55f8eecbdb20,3]: couldn't init ACD for announcing addresses on interface 'eno1': İşleme izin verilmedi Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9391] device (eno1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9405] device (eno1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9407] device (eno1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed') Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9410] manager: NetworkManager state is now CONNECTED_LOCAL Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9418] manager: NetworkManager state is now CONNECTED_SITE Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9419] policy: set 'MY CONNECTION' (eno1) as default for IPv4 routing and DNS Eki 31 08:33:31 MY-HOSTNAME NetworkManager[611]: <info> [1572500011.9458] device (eno1): Activation: successful, device activated. 
Eki 31 08:33:31 MY-HOSTNAME kdeinit5[1122]: plasma-nm: Unhandled active connection state change: 1
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63971, resource id: 0, major code: 14 (GetGeometry), minor code: 0
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63972, resource id: 0, major code: 14 (GetGeometry), minor code: 0
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63973, resource id: 0, major code: 14 (GetGeometry), minor code: 0
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63974, resource id: 0, major code: 14 (GetGeometry), minor code: 0
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63975, resource id: 0, major code: 14 (GetGeometry), minor code: 0
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63976, resource id: 0, major code: 14 (GetGeometry), minor code: 0
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63977, resource id: 0, major code: 14 (GetGeometry), minor code: 0
Eki 31 08:33:32 MY-HOSTNAME kwin_x11[1182]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63978, resource id: 0, major code: 14 (GetGeometry), minor code: 0

Please try applying this:

@@ -5208,6 +5208,14 @@
             mac->ops.get_link_up_info(&adapter->hw,
                                       &adapter->link_speed,
                                       &adapter->link_duplex);
+
+            /* Check for Duplex mismatch in 1gb */
+            if (adapter->link_duplex == HALF_DUPLEX &&
+                adapter->link_speed == SPEED_1000) {
+                e1000e_down(adapter, true);
+                e1000e_up(adapter);
+            }
+
             e1000_print_link_info(adapter);
 
             /* check if SmartSpeed worked */

Does not work for me. Intel I219V NIC (on an ASRock Z390 Taichi motherboard). Arch Linux kernel 5.3.8.
#dmesg|grep e1000

Nov 03 23:09:54 kernel: e1000e: loading out-of-tree module taints kernel.
Nov 03 23:09:54 kernel: e1000e: module verification failed: signature and/or required key missing - tainting kernel
Nov 03 23:09:54 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-l
Nov 03 23:09:54 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) 70:85:c2:a4:d0:16
Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eno1: renamed from eth0
Nov 03 23:10:01 kernel: e1000e: eno1 NIC Link is Down

(In reply to Vitaly Lifshits from comment #24)
> Please try applying this:
>
> @@ -5208,6 +5208,14 @@
>              mac->ops.get_link_up_info(&adapter->hw,
>                                        &adapter->link_speed,
>                                        &adapter->link_duplex);
> +
> +            /* Check for Duplex mismatch in 1gb */
> +            if (adapter->link_duplex == HALF_DUPLEX &&
> +                adapter->link_speed == SPEED_1000) {
> +                e1000e_down(adapter, true);
> +                e1000e_up(adapter);
> +            }
> +
>              e1000_print_link_info(adapter);
>
>              /* check if SmartSpeed worked */

Hello Vitaly,

I applied this to 5.4.0-rc5, and it still does not work.
Here is dmesg after plugging in the Ethernet cable:

[ 180.620029] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 180.620136] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s31f6: link becomes ready
[ 186.994114] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 193.835072] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 200.617964] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

(I did not apply the "STATUS" patch.)

Is there anything more I could test?

Thank you,
Michael

(In reply to peter_s from comment #25)
> Does not work for me. Intel I219V nic (on Asrock Z390 Taichi motherboard).
> Arch linux kernel 5.3.8.
>
> #dmesg|grep e1000
>
> Nov 03 23:09:54 kernel: e1000e: loading out-of-tree module taints kernel.
> Nov 03 23:09:54 kernel: e1000e: module verification failed: signature and/or
> required key missing - tainting kernel
> Nov 03 23:09:54 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-l
> Nov 03 23:09:54 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6: Interrupt Throttling Rate
> (ints/sec) set to dynamic conservative mode
> Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized):
> registered PHC clock
> Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width
> x1) 70:85:c2:a4:d0:16
> Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network
> Connection
> Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No:
> FFFFFF-0FF
> Nov 03 23:09:54 kernel: e1000e 0000:00:1f.6 eno1: renamed from eth0
> Nov 03 23:10:01 kernel: e1000e: eno1 NIC Link is Down

Note: in this report I also applied the latest proposed patch (by Vitaly) on the 5.3.8 kernel.

Recently we got a similar complaint that is connected to a different patch, and we are working on reverting it.

Can you please try to revert it and see if it resolves your issue?
The patch is: The commit introducing the bug is 59653e6497d16f7ac1d9db088f3959f57ee8c3db (e1000e: Make watchdog use delayed work) You recommendation was tested this way (https://bugs.archlinux.org/task/64018). It does not work for me either. [ 4.195139] e1000e: loading out-of-tree module taints kernel. [ 4.215400] e1000e: module verification failed: signature and/or required key missing - tainting kernel [ 4.218393] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-l [ 4.218394] e1000e: Copyright(c) 1999 - 2015 Intel Corporation. [ 4.218581] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode [ 4.475106] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock [ 4.543593] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) 70:85:c2:a4:d0:16 [ 4.543596] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection [ 4.543677] e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF [ 4.604465] e1000e 0000:00:1f.6 eno1: renamed from eth0 [ 11.407189] e1000e: eno1 NIC Link is Down [ 12.435905] e1000e: eno1 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: None Maybe the commit was created earlier. Let us know if there is a new patch to test. (In reply to peter_s from comment #29) > You recommendation was tested this way > (https://bugs.archlinux.org/task/64018). It does not work for me either. > > [ 4.195139] e1000e: loading out-of-tree module taints kernel. > [ 4.215400] e1000e: module verification failed: signature and/or required > key missing - tainting kernel > [ 4.218393] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-l > [ 4.218394] e1000e: Copyright(c) 1999 - 2015 Intel Corporation. 
> [ 4.218581] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set > to dynamic conservative mode > [ 4.475106] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered > PHC clock > [ 4.543593] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) > 70:85:c2:a4:d0:16 > [ 4.543596] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection > [ 4.543677] e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF > [ 4.604465] e1000e 0000:00:1f.6 eno1: renamed from eth0 > [ 11.407189] e1000e: eno1 NIC Link is Down > [ 12.435905] e1000e: eno1 NIC Link is Up 1000 Mbps Half Duplex, Flow > Control: None > > Maybe the commit was created earlier. Let us know if there is a new patch to > test. Can you please try to revert the patch I mentioned with the change I offered in line 24? (the patch to revert is: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/intel/e1000e?id=59653e6497d16f7ac1d9db088f3959f57ee8c3db) I saw in Michael's comment that the link comes up with correct speed and duplex. Probably the usage of the struct in the patch creates problems when it is working with network manager. (In reply to Vitaly Lifshits from comment #28) > Recently we got a similar complain that is connected to a different patch, > and we are working on reverting it. > > Can you please try to revert it and see if it resolves your issue? > > The patch is: > The commit introducing the bug is 59653e6497d16f7ac1d9db088f3959f57ee8c3db > (e1000e: Make watchdog use delayed work) Hello Vitaly, i reverted said commit. However, there were merge-issues in the file drivers/net/ethernet/intel/e1000e/netdev.c. I cleaned those up an will post the "git diff" as a patch based on 5.4.0-rc8. I can confirm that reverting said patch does indeed help. I can use ethernet on my ThinkPad T470s again. Will this get reverted from mainline then? Anyway, thank you for your work, have a nice day, Michael yes. We will work up to revert it. 
Created attachment 285979 [details]
Patch to revert commit 59653e6497d16f7ac1d9db088f3959f57ee8c3db based on 5.4.0-rc8
As suggested in comment #28, I reverted commit 59653e6497d16f7ac1d9db088f3959f57ee8c3db. There were merge conflicts, which I tried to resolve; this is the resulting diff against 5.4.0-rc8.
It is working now with reverting back to the right commit (on Archlinux with kernel 5.3.11-arch1-1). [ 4.156613] e1000e: loading out-of-tree module taints kernel. [ 4.162353] e1000e: module verification failed: signature and/or required key missing - tainting kernel [ 4.196559] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-l [ 4.196560] e1000e: Copyright(c) 1999 - 2015 Intel Corporation. [ 4.196767] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode [ 4.437055] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock [ 4.503870] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) 70:85:c2:a4:d0:16 [ 4.503875] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection [ 4.503956] e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF [ 4.584850] e1000e 0000:00:1f.6 eno1: renamed from eth0 [ 11.243194] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None Can we expect this code in final version of 5.4? Disconnecting repeatedly continues in 5.3.12-1 but retry count has got lower. After connection I experience lots of connection drops instead there is no sign of disconnection. This implicit connection drops also exist in 4.9.85-1. Distro is Arch based Manjaro. Also affects me with I219-V on AsRock Z370 motherboard. Kernel 5.3.6 works fine, but 5.4.0-rc8 and 5.4.1 (didn't check other) result in connection being established and then dropped shortly afterwards. No messages in kernel log besides this: [ 23.043952] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx Duplicated as many times as reconnection happened. So whatever happened between 5.3.6 and 5.4.0-rc8 introduced a regression. I'm on Ubuntu 19.04 daily (not using stock kernel obviously). (In reply to Nazar Mokrynskyi from comment #36) > Also affects me with I219-V on AsRock Z370 motherboard. 
> Kernel 5.3.6 works fine, but 5.4.0-rc8 and 5.4.1 (didn't check other) result > in connection being established and then dropped shortly afterwards. > No messages in kernel log besides this: > [ 23.043952] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow > Control: Rx/Tx > > Duplicated as many times as reconnection happened. > > So whatever happened between 5.3.6 and 5.4.0-rc8 introduced a regression. > > I'm on Ubuntu 19.04 daily (not using stock kernel obviously). Did you revert the patch we mentioned in comment 28? Does it happen if you disable network manager? (sudo systemctl stop NetworkManger) It seems that Vitaly's proposal (i.e. eliminating the problematic commit)
> The patch is:
> The commit introducing the bug is 59653e6497d16f7ac1d9db088f3959f57ee8c3db
> (e1000e: Make watchdog use delayed work)
did not hit the 5.4 kernel. The e1000e driver is still broken in kernel 5.4.1. What are the plans now to release this fix?
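The repeated "NIC Link is Up" lines quoted in this thread can be tallied with a short script when comparing kernels. What follows is only a minimal Python sketch, assuming the kernel log uses the same e1000e message format as the lines pasted above. The function name summarize_link_events and the sample text are illustrative and not part of any tool. Flagging 1000 Mbps half duplex simply mirrors the duplex-mismatch check in the patch proposed earlier in the thread.

```python
import re
from collections import Counter

# Pattern modeled on the e1000e messages quoted in this thread, e.g.
# "e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None"
LINK_UP = re.compile(
    r"e1000e: (?P<iface>\S+) NIC Link is Up "
    r"(?P<speed>\d+) Mbps (?P<duplex>Full|Half) Duplex"
)


def summarize_link_events(log_text):
    """Count link-up messages per interface and collect gigabit half-duplex lines."""
    ups = Counter()
    suspicious = []
    for line in log_text.splitlines():
        match = LINK_UP.search(line)
        if match is None:
            continue  # not an e1000e link-up message
        ups[match["iface"]] += 1
        # Assumption: 1000 Mbps half duplex is treated as a mismatch,
        # as in the patch proposed earlier in the thread.
        if match["speed"] == "1000" and match["duplex"] == "Half":
            suspicious.append(line.strip())
    return ups, suspicious


# Sample lines copied from the logs pasted in this thread.
sample = """\
[ 180.620029] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 186.994114] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 12.435905] e1000e: eno1 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: None
"""

ups, suspicious = summarize_link_events(sample)
print(dict(ups))
print(suspicious)
```

Feeding it the saved output of dmesg gives a rough per-interface count of link-up events. Several link-up messages within a few seconds, with no cable change, match the reconnect loop described in this bug.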
(In reply to Michael Groh from comment #33) > Created attachment 285979 [details] > Patch to revert commit 59653e6497d16f7ac1d9db088f3959f57ee8c3db based on > 5.4.0-rc8 > > As suggested in #28 i did revert commit > 59653e6497d16f7ac1d9db088f3959f57ee8c3db. There have been merge conflicts > which i did try to resolve, this is the diff for 5.4.0-rc8. Yeah, applying this patch solved my problem with 5.4.1 and 5.4.2. I had the same problem from Nazar Mokrynskyi (comment #36) before patching the kernel. I use an Intel 82579V with linux-ck 5.4.2 and NetworkManager 1.20. (In reply to Sasha Neftin from comment #32) > yes. We will work up to revert it. While this seems to have missed the 5.4 tree, the problem is still there with 5.5-rc1 Is there somewhere i can contribute to that the patch will get reverted? Maybe an LKML thread? Thanks for your work, Michael Created attachment 286283 [details] Rejects from Attachment #285979 [details] when applied to kernel 5.5rc1 A new patch would be required (unless I'm doing something wrong). Hunk #9 FAILED at 7414. 1 out of 9 hunks FAILED -- saving rejects to file drivers/net/ethernet/intel/e1000e/netdev.c.rej Created attachment 286285 [details]
attachment-24216-0.html
Out of office. Expected delayed response
Not fixed for me as of kernel 5.4.6 with the integrated I218V Intel Ethernet on an ASUS H97M-PLUS board. Unfortunately, I cannot downgrade because older kernels do not support my Realtek RTL8125 2.5GbE Adapter. The patch from comment #33 applied fine to my 5.4.6 kernel though and I'll try over the next few days if this indeed fixes it for me. Please bring a fix to mainline linux. The e1000e driver is really really important and widely used and should not remain broken for so long.

Update to my previous comment: Applying the patch from comment #33 did NOT fix it for me. e1000e still hangs and resets. It could be related to traffic sent through wireguard interfaces at the time, but on the other hand it might be that I just do not notice those e1000e hangs+resets unless I'm using the wireguard tunnels...

However: I'm able to transfer lots of traffic through the e1000e NIC just fine, but once I have small amounts of Remote Desktop traffic going through e1000e + the wireguard interface, I can reproduce the hang+reset within 5 minutes.

Perhaps this is a different bug?

(In reply to Gerald H. from comment #44)
> Update to my previous comment: Applying the patch from comment #33 did NOT
> fix it for me. e1000e still hangs and resets. It could be related to traffic
> sent through wireguard interfaces at the time, but on the other hand it
> might be that I just do not notice those e1000e hangs+resets unless I'm
> using the wireguard tunnels...
>
> However: I'm able to transfer lots of traffic through the e1000e NIC just
> fine, but once I have small amounts of Remote Desktop traffic going through
> e1000e + the wireguard interface, I can reproduce the hang+reset within 5
> minutes.
>
> Perhaps this is a different bug?

It does look like a different bug since the original issue was with the interface not coming up at all. Please try disabling TCP segmentation offload (tso), it may solve your issue.

Vitaly, thank you very much.
"ethtool -K eno1 tso off" has completely solved my hang+reset problems with the e1000e driver that were happening as soon as the e1000e was involved with wireguard traffic.

Hi all, the e1000e driver is still in a broken state (from mid September 5.1.x till now 5.4.7). My understanding is that you successfully identified the commit which causes the problem and many of us tested/confirmed that as well. Could you please elaborate when you plan to remove it from the kernel code? Thanks.

Jeff submit revert by 25/12/2019 - please, stay tuned:
Intel-wired-lan <intel-wired-lan-bounces@osuosl.org>; on behalf of; Jeff Kirsher <jeffrey.t.kirsher@intel.com>
[Intel-wired-lan] [net] e1000e: Revert "e1000e: Make watchdog use delayed work"
This reverts commit 59653e6497d16f7ac1d9db088f3959f57ee8c3db. This is due to this commit causing driver crashes and connections to reset unexpectedly.

I can confirm that the revert from comment #49, backported to stable 5.4.8 (with one minor change to make it apply) fixes the issue on my Dell Latitude E6420:

00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (Lewisville) (rev 04)
	Subsystem: Dell Device 0493
	Flags: bus master, fast devsel, latency 0, IRQ 33
	Memory at e6e00000 (32-bit, non-prefetchable) [size=128K]
	Memory at e6e80000 (32-bit, non-prefetchable) [size=4K]
	I/O ports at 5080 [size=32]
	Capabilities: [c8] Power Management version 2
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [e0] PCI Advanced Features
	Kernel driver in use: e1000e
	Kernel modules: e1000e

I would appreciate a backport to stable once it has been merged to master. Thank you.

So I am facing the same issue on a Dell Precision M4800 with kernel 5.4.12. Question to Stefan: what was necessary to make it apply on 5.4.8?
Actually I took the patch from comment #33 which applied cleanly to 5.4.12 and it solved my problem. Please have it included in stable series for 5.4 Attachment: https://bugzilla.kernel.org/attachment.cgi?id=285979&action=diff from bug: https://bugzilla.kernel.org/show_bug.cgi?id=205047#c33 resolves the issue for me, please revert this patch in the mainline kernel. *** Bug 206219 has been marked as a duplicate of this bug. *** I can confirm that kernel 5.5.0 is working flawless with my NIC 82579V since this release include the revert commit mentioned by https://bugzilla.kernel.org/show_bug.cgi?id=205047#c48. You can see the commit here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.5&id=d5ad7a6a7f3c87b278d7e4973b65682be4e588dd Thanks for confirming that. As 5.4.x is a LTS kernel can you please submit the revert to that stable branch too? Thank you. Please also for the 5.3 stable branch. Even if not LTS, there are people using it. Hello. I make some experiments with this - this bug appear after upgrade kernel in Fedora-31 from 5.3.16-300.fc31.x86_64 to 5.4.x Connection continously start/reset with this messages: янв 26 15:54:08 Host1 kded5[3837]: plasma-nm: Unhandled active connection state change: 1 янв 26 15:54:08 Host1 kded5[3837]: plasma-nm: Unhandled active connection state change: 1 янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5138] device (enp0s31f6): state change: ip-config -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed') янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5282] dhcp4 (enp0s31f6): canceled DHCP transaction янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5283] dhcp4 (enp0s31f6): state changed unknown -> done янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5315] manager: NetworkManager state is now DISCONNECTED янв 26 15:54:15 Host1 kernel: e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx янв 26 15:54:15 
Host1 NetworkManager[949]: <info> [1580043255.0662] device (enp0s31f6): carrier: link connected янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0665] device (enp0s31f6): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed') янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0671] policy: auto-activating connection 'Onboard_LAN' (708b2b50-68f3-4413-9167-b85dc5741983) янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0677] device (enp0s31f6): Activation: starting connection 'Onboard_LAN' (708b2b50-68f3-4413-9167-b85dc5741983) янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0678] device (enp0s31f6): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed') янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0682] manager: NetworkManager state is now CONNECTING янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.1550] device (enp0s31f6): state change: prepare -> config (reason 'none', sys-iface-state: 'managed') янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.1575] device (enp0s31f6): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed') янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.1585] dhcp4 (enp0s31f6): activation: beginning transaction (timeout in 45 seconds) янв 26 15:54:15 Host1 kded5[3837]: plasma-nm: Unhandled active connection state change: 1 Switching from DHCP to static-config don't affect - connection continue resetting; Switching "auto-negotiation" option to "off" can help, also force setting link-speed to 1Gbps/HalfDuplex can help to make connection; But this connection can not survive/reopen after reboot; Sleeping in RAM-standby mode and waking up made connection drop to 10mbps and i detect "TxError" errors on switch port (DGS-1100-10ME): TX OutOctets 193115544851 OutUcastPkts 501851548 OutNUcastPkts 676615 OutErrors 51788 LateCollisions 51788 ExcessiveCollisions 0 
In normal state Errors / Collisions not detected on switch interface even under load about 700-800 Mbps. This bug not all motherboards/NIC appear ! - My Asus STRIX H270F GAMING with bios version 1205 from release date: 05/11/2018 and NIC "Intel Corporation Ethernet Connection (2) I219-V" affected - only in 5.4.14-200 i can start network after setting "Half-Duplex" in connection properties; - Another PC with MB "ASUS P8P67 EVO" bios version: 3602 release date: 11/01/2012 and NIC "Intel Corporation 82579V Gigabit Network Connection (rev 05)" network work correctly without tweaks with same e1000e driver. Kernel 5.3.16 work ideal with both MB/NIC. (In reply to Amin from comment #59) > Hello. > I make some experiments with this - this bug appear after upgrade kernel in > Fedora-31 from 5.3.16-300.fc31.x86_64 to 5.4.x > > Connection continously start/reset with this messages: > > янв 26 15:54:08 Host1 kded5[3837]: plasma-nm: Unhandled active connection > state change: 1 > янв 26 15:54:08 Host1 kded5[3837]: plasma-nm: Unhandled active connection > state change: 1 > > янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5138] device > (enp0s31f6): state change: ip-config -> unavailable (reason > 'carrier-changed', sys-iface-state: 'managed') > янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5282] dhcp4 > (enp0s31f6): canceled DHCP transaction > янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5283] dhcp4 > (enp0s31f6): state changed unknown -> done > янв 26 15:54:14 Host1 NetworkManager[949]: <info> [1580043254.5315] > manager: NetworkManager state is now DISCONNECTED > янв 26 15:54:15 Host1 kernel: e1000e: enp0s31f6 NIC Link is Up 1000 Mbps > Full Duplex, Flow Control: Rx/Tx > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0662] device > (enp0s31f6): carrier: link connected > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0665] device > (enp0s31f6): state change: unavailable -> disconnected (reason > 
'carrier-changed', sys-iface-state: 'managed') > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0671] policy: > auto-activating connection 'Onboard_LAN' > (708b2b50-68f3-4413-9167-b85dc5741983) > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0677] device > (enp0s31f6): Activation: starting connection 'Onboard_LAN' > (708b2b50-68f3-4413-9167-b85dc5741983) > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0678] device > (enp0s31f6): state change: disconnected -> prepare (reason 'none', > sys-iface-state: 'managed') > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.0682] > manager: NetworkManager state is now CONNECTING > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.1550] device > (enp0s31f6): state change: prepare -> config (reason 'none', > sys-iface-state: 'managed') > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.1575] device > (enp0s31f6): state change: config -> ip-config (reason 'none', > sys-iface-state: 'managed') > янв 26 15:54:15 Host1 NetworkManager[949]: <info> [1580043255.1585] dhcp4 > (enp0s31f6): activation: beginning transaction (timeout in 45 seconds) > янв 26 15:54:15 Host1 kded5[3837]: plasma-nm: Unhandled active connection > state change: 1 > > > Switching from DHCP to static-config don't affect - connection continue > resetting; > > Switching "auto-negotiation" option to "off" can help, also force setting > link-speed to 1Gbps/HalfDuplex can help to make connection; > But this connection can not survive/reopen after reboot; > > Sleeping in RAM-standby mode and waking up made connection drop to 10mbps > and i detect "TxError" errors on switch port (DGS-1100-10ME): > > TX > OutOctets 193115544851 > OutUcastPkts 501851548 > OutNUcastPkts 676615 > OutErrors 51788 > LateCollisions 51788 > ExcessiveCollisions 0 > > In normal state Errors / Collisions not detected on switch interface even > under load about 700-800 Mbps. 
> > > > This bug not all motherboards/NIC appear ! > > - My Asus STRIX H270F GAMING with bios version 1205 from release date: > 05/11/2018 and NIC "Intel Corporation Ethernet Connection (2) I219-V" > affected - only in 5.4.14-200 i can start network after setting > "Half-Duplex" in connection properties; > > - Another PC with MB "ASUS P8P67 EVO" bios version: 3602 release date: > 11/01/2012 and NIC "Intel Corporation 82579V Gigabit Network Connection (rev > 05)" network work correctly without tweaks with same e1000e driver. > > Kernel 5.3.16 work ideal with both MB/NIC. It seems that fedora's 5.4.x kernel didn't revert the problematic patch. You can try using the latest stable vanilla kernel 5.5.1, or to wait until Fedora will update the kernel. I'm having the same problem with 82579V on 5.14.13, but only on gigabit or auto-negotiated-to-gigabit connections. If I force driver to work at 100Mbps via network manager, the link works reliably. Another strange point is gigabit link works sometimes in first boot, but if I suspend/resume, it's impossible to connect in gigabit speeds. It just resets the phy indefinitely. Forcing card to 100Mbps keeps connection stable even after suspend/resume cycle. Resolved with Linux 5.5.0-1-MANJARO in Manjaro Dist. I also tested sleep/resume and everything seems to be fine. With kernel 5.6.4-arch1-1, this bug seems to still be present. lspci: 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (6) I219-V (rev 30) Subsystem: Lenovo Ethernet Connection (6) I219-V Flags: bus master, fast devsel, latency 0, IRQ 148 Memory at dd300000 (32-bit, non-prefetchable) [size=128K] Capabilities: <access denied> Kernel driver in use: e1000e Kernel modules: e1000e ip link: 5: enp0s31f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000 link/ether 98:fa:9b:5f:ee:8f brd ff:ff:ff:ff:ff:ff The interface is correctly shown with `ip link`. 
The driver keeps connecting and disconnecting constantly still. In Fedora-31 this bug similar to fully resolve with 5.5.x kernels; I use 5.5.x kernels now and don't have any problems with e1000e in fc31. (In reply to UndeadKernel from comment #63) > With kernel 5.6.4-arch1-1, this bug seems to still be present. > > lspci: > > 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (6) > I219-V (rev 30) > Subsystem: Lenovo Ethernet Connection (6) I219-V > Flags: bus master, fast devsel, latency 0, IRQ 148 > Memory at dd300000 (32-bit, non-prefetchable) [size=128K] > Capabilities: <access denied> > Kernel driver in use: e1000e > Kernel modules: e1000e > > > ip link: > > 5: enp0s31f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel > state DOWN mode DEFAULT group default qlen 1000 > link/ether 98:fa:9b:5f:ee:8f brd ff:ff:ff:ff:ff:ff > > The interface is correctly shown with `ip link`. The driver keeps connecting > and disconnecting constantly still. Are you using the intel e1000e or the in-kernel e1000e driver? As the driver from the intel.com website still contains the bug and overrides the in-kernel driver. The driver also keeps getting re-compiled if you're using dkms. If you tried using the driver from the intel website (like me), then you need to remove the e1000e-dkms package and manually remove the module from the source tree update folder or compile a newer kernel after removing e1000e-dkms. When using 5.6.5 / 5.6.6 or 5.4.30ish then it should work, as I'm having a similar adapter to yours. 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10) Subsystem: Lenovo Ethernet Connection (7) I219-V Flags: bus master, fast devsel, latency 0, IRQ 159 Memory at a4300000 (32-bit, non-prefetchable) [size=128K] Capabilities: [c8] Power Management version 3 Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+ Kernel driver in use: e1000e Kernel modules: e1000e This issue still happens with my 82579V. 
Under any significant load the driver starts to hang and drop packets, tried several kernels from 3.10, 4.18 now to 5.6.7, centos 7, 8 now 8.1, tried the kernel drivers, intel latest 3.8.4-NAPI drivers, tried several settings tso gso nothing helps I also have some servers using I210 and 82574L which no issues whatsoever, I210 never had issues and 82574L only had to update to intel 3.8.4 driver. The issue seems to be with 82579V and I219 only Created attachment 289157 [details]
dmesg output of Lenovo ThinkCentre M900 Tiny with the error
I am seeing this as well and it is reproducible for me.
Kernel: kernel-5.6.12-300.fc32.x86_64
Setup:
Hardware name: LENOVO 10FLS0A000/30D0, BIOS FWKTACA 03/24/2020 (Lenovo ThinkCentre M900 Tiny)
Host OS: Fedora 32 KVM host
Guest OS: CentOS 8.1 KVM guest
KVM network configuration: Bridged to Intel I219-LM on the Host hardware.
Scenario:
When compiling OpenWrt on the CentOS guest over SSH using make V=s (full verbosity), the Ethernet adapter hangs and I receive an SSH error as soon as there is a high volume of console output.
The connection is reset but recovers after some time.
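For hangs under traffic like this, the workaround reported earlier in the thread was to disable TCP segmentation offload with "ethtool -K eno1 tso off". Whether it helps with a bridged KVM setup is untested here. The following is a minimal Python sketch that wraps that command. The helper names tso_off_command and disable_tso are hypothetical, the interface name "eno1" is taken from the earlier comment and has to be replaced with the local one, and the script only prints the command unless dry_run is set to False.

```python
import shlex
import subprocess


def tso_off_command(iface):
    """Build the argv for the command quoted in this thread: ethtool -K <iface> tso off."""
    if not iface or any(ch.isspace() for ch in iface):
        raise ValueError(f"invalid interface name: {iface!r}")
    return ["ethtool", "-K", iface, "tso", "off"]


def disable_tso(iface, dry_run=True):
    """Print the command; execute it only when dry_run is False (needs root)."""
    argv = tso_off_command(iface)
    print(shlex.join(argv))
    if not dry_run:
        # check=True raises CalledProcessError if ethtool exits non-zero.
        subprocess.run(argv, check=True)
    return argv


# "eno1" is the interface name from the earlier comment; substitute your own.
disable_tso("eno1")
```

Keeping the dry run as the default makes it easy to check the interface name before changing offload settings on a remote machine.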
Perhaps a cable issue? I note some logs posted here have "carrier-changed" errors.

I had "carrier-changed" errors, frequently disconnecting my ethernet, and after trying every solution in multiple forums, I finally upgraded the CAT5 cable to a CAT7 cable (10 meter run - cheap on ebay) which cured the problem. Apparently, the old CAT5 cable couldn't handle my ISP's upgraded speeds (Virgin Media, UK). The new CAT7 cable has not only stopped the dropouts, but also now allows the speed to auto-configure from 100Mb/s to 1000Mb/s. Hope that helps anyone looking to fix "carrier-changed" problems.

(In reply to Andy Mann from comment #68)
> Perhaps a cable issue? I note some logs posted here have "carrier-changed"
> errors.
>
> I had "carrier-changed" errors, frequently disconnecting my ethernet, and
> after trying every solution in multiple forums, I finally upgraded the CAT5
> cable to a CAT7 cable (10 meter run - cheap on ebay) which cured the
> problem. Apparently, the old CAT5 cable couldn't handle my ISP's upgraded
> speeds (Virgin Media, UK). The new CAT7 cable has not only stopped the
> dropouts, but also now allows the speed to auto-configure from 100Mb/s to
> 1000Mb/s. Hope that helps anyone looking to fix "carrier-changed" problems.

That is not the cable; I have the same issue with multiple servers at OVH (a datacenter).

Yesterday I installed the kernel version "5.4.44-2-pve #1 SMP PVE 5.4.44-2 (Wed, 01 Jul 2020 16:37:57 +0200)", which includes the patch above. The Ethernet seems fine, but I got this in the logs.
The ethernet is still working but i think something it happened (maybe, with kernel 5.3.x i would had the up/down loop instead) [37932.264016] hrtimer: interrupt took 11516 ns [39959.724485] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <42> TDT <6f> next_to_use <6f> next_to_clean <42> buffer_info[next_to_clean]: time_stamp <100974783> next_to_watch <42> jiffies <100974a40> next_to_watch.status <0> MAC Status <80083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10> [39961.708425] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <42> TDT <6f> next_to_use <6f> next_to_clean <42> buffer_info[next_to_clean]: time_stamp <100974783> next_to_watch <42> jiffies <100974c30> next_to_watch.status <0> MAC Status <80083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10> [39963.724435] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang: TDH <42> TDT <6f> next_to_use <6f> next_to_clean <42> buffer_info[next_to_clean]: time_stamp <100974783> next_to_watch <42> jiffies <100974e28> next_to_watch.status <0> MAC Status <80083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10> [39964.844187] ------------[ cut here ]------------ [39964.844190] NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out [39964.844201] WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270 [39964.844202] Modules linked in: veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink snd_hda_codec_hdmi zfs(PO) zunicode(PO) zlua(PO) zavl(PO) snd_hda_codec_realtek icp(PO) snd_hda_codec_generic ledtrig_audio zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vfio_pci mei_hdcp vfio_virqfd kvmgt intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal 
intel_powerclamp coretemp kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_intel_dspcfg btusb aesni_intel btrtl snd_hda_codec crypto_simd btbcm btintel snd_hda_core cryptd bluetooth glue_helper snd_hwdep snd_pcm ecdh_generic snd_timer ecc intel_cstate snd soundcore mei_me intel_rapl_perf 8250_dw intel_pch_thermal mei intel_wmi_thunderbolt pcspkr mac_hid acpi_pad i915 vfio_mdev mdev vfio_iommu_type1 vfio drm_kms_helper drm i2c_algo_bit [39964.844226] fb_sys_fops syscopyarea sysfillrect sysimgblt kvm irqbypass sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq hid_generic dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c usbhid hid intel_lpss_pci xhci_pci intel_lpss e1000e ahci idma64 i2c_i801 virt_dma libahci xhci_hcd wmi video [39964.844237] CPU: 5 PID: 0 Comm: swapper/5 Tainted: P O 5.4.44-2-pve #1 [39964.844237] Hardware name: SECO S.p.A. MB09/MB09, BIOS MB09 1.03 2018/08/30 [39964.844239] RIP: 0010:dev_watchdog+0x264/0x270 [39964.844240] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 81 1a eb 00 01 e8 80 b1 fa ff 89 d9 4c 89 ee 48 c7 c7 70 2f 03 89 48 89 c2 e8 cd 7a 74 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 [39964.844241] RSP: 0018:ffffbfa280230e58 EFLAGS: 00010282 [39964.844242] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 [39964.844243] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff9a299db578c0 [39964.844243] RBP: ffffbfa280230e88 R08: 00000000000003a1 R09: 0000000000000004 [39964.844244] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001 [39964.844244] R13: ffff9a2990a60000 R14: ffff9a2990a60480 R15: ffff9a2991497c80 [39964.844245] FS: 0000000000000000(0000) GS:ffff9a299db40000(0000) knlGS:0000000000000000 [39964.844246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [39964.844246] CR2: 00007ff83fb170f4 CR3: 00000002fc20a006 CR4: 00000000003626e0 [39964.844247] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 
[39964.844247] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [39964.844248] Call Trace: [39964.844249] <IRQ> [39964.844252] ? pfifo_fast_enqueue+0x160/0x160 [39964.844254] call_timer_fn+0x32/0x130 [39964.844255] run_timer_softirq+0x1a5/0x430 [39964.844257] ? enqueue_hrtimer+0x3c/0x90 [39964.844258] ? ktime_get+0x3c/0xa0 [39964.844260] ? lapic_next_deadline+0x26/0x30 [39964.844261] ? clockevents_program_event+0x93/0xf0 [39964.844264] __do_softirq+0xdc/0x2d4 [39964.844265] irq_exit+0xa9/0xb0 [39964.844267] smp_apic_timer_interrupt+0x79/0x130 [39964.844268] apic_timer_interrupt+0xf/0x20 [39964.844269] </IRQ> [39964.844271] RIP: 0010:cpuidle_enter_state+0xbd/0x450 [39964.844272] Code: ff e8 a7 b4 84 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 63 03 00 00 31 ff e8 ca 22 8b ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8d 02 00 00 49 63 cd 48 8b 75 d0 48 2b 75 c8 48 8d [39964.844273] RSP: 0018:ffffbfa280113e48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [39964.844274] RAX: ffff9a299db6ad40 RBX: ffffffff89357a00 RCX: 000000000000001f [39964.844274] RDX: 000024590a574828 RSI: 000000002aaa9f7b RDI: 0000000000000000 [39964.844275] RBP: ffffbfa280113e88 R08: 0000000000000002 R09: 000000000002a5c0 [39964.844275] R10: 00006d19820469de R11: ffff9a299db699e0 R12: ffffdfa27fd52500 [39964.844276] R13: 0000000000000006 R14: ffffffff89357c58 R15: ffffffff89357c40 [39964.844278] ? cpuidle_enter_state+0x99/0x450 [39964.844280] cpuidle_enter+0x2e/0x40 [39964.844282] call_cpuidle+0x23/0x40 [39964.844283] do_idle+0x22c/0x270 [39964.844285] cpu_startup_entry+0x1d/0x20 [39964.844286] start_secondary+0x166/0x1c0 [39964.844288] secondary_startup_64+0xa4/0xb0 [39964.844289] ---[ end trace 6978f9a6f235f4ac ]--- [39964.844300] e1000e 0000:00:1f.6 enp0s31f6: Reset adapter unexpectedly ok i confirm that the patch DOESN'T SOLVE the issue. It just postpone it, but it doesn't solve it at all. 
After 1 week of uptime, the error came back:
Jul 26 00:00:02 pvei7 rsyslogd: [origin software="rsyslogd" swVersion="8.1901.0" x-pid="798" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
Jul 27 00:00:01 pvei7 rsyslogd: [origin software="rsyslogd" swVersion="8.1901.0" x-pid="798" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
Jul 27 02:59:04 pvei7 lvm[492]: WARNING: Thin pool pve-data-tpool data is now 90.01% full.
Jul 27 18:10:03 pvei7 kernel: [511516.174678] perf: interrupt took too long (3917 > 3911), lowering kernel.perf_event_max_sample_rate to 51000
Jul 28 00:00:01 pvei7 rsyslogd: [origin software="rsyslogd" swVersion="8.1901.0" x-pid="798" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
Jul 28 19:46:58 pvei7 kernel: [603731.430464] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:47:02 pvei7 kernel: [603735.319318] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:47:02 pvei7 kernel: [603735.319389] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:47:02 pvei7 kernel: [603735.319394] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:47:07 pvei7 kernel: [603740.386574] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:47:11 pvei7 kernel: [603744.357497] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:47:11 pvei7 kernel: [603744.357566] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:47:11 pvei7 kernel: [603744.357569] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:47:21 pvei7 kernel: [603754.466795] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:47:25 pvei7 kernel: [603758.362778] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:47:25 pvei7 kernel: [603758.362848] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:47:25 pvei7 kernel: [603758.362851] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:47:35 pvei7 kernel: [603768.547088] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:47:39 pvei7 kernel: [603772.512977] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:47:39 pvei7 kernel: [603772.513047] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:47:39 pvei7 kernel: [603772.513050] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:47:49 pvei7 kernel: [603782.371358] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:47:53 pvei7 kernel: [603786.306186] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:47:53 pvei7 kernel: [603786.306258] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:47:53 pvei7 kernel: [603786.306262] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:48:03 pvei7 kernel: [603796.455426] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:48:07 pvei7 kernel: [603800.433411] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:48:07 pvei7 kernel: [603800.433483] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:48:07 pvei7 kernel: [603800.433487] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:48:17 pvei7 kernel: [603810.531636] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:48:21 pvei7 kernel: [603814.478659] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:48:21 pvei7 kernel: [603814.478732] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:48:21 pvei7 kernel: [603814.478736] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:48:31 pvei7 kernel: [603824.355883] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:48:35 pvei7 kernel: [603828.308836] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:48:35 pvei7 kernel: [603828.308914] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:48:35 pvei7 kernel: [603828.308918] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:48:40 pvei7 kernel: [603833.572009] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:48:44 pvei7 kernel: [603837.425924] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:48:44 pvei7 kernel: [603837.426002] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:48:44 pvei7 kernel: [603837.426005] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:48:49 pvei7 kernel: [603842.536142] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:48:53 pvei7 kernel: [603846.400033] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:48:53 pvei7 kernel: [603846.400124] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:48:53 pvei7 kernel: [603846.400128] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:48:58 pvei7 kernel: [603851.496230] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:49:02 pvei7 kernel: [603855.363223] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:49:02 pvei7 kernel: [603855.363292] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:49:02 pvei7 kernel: [603855.363296] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:49:12 pvei7 kernel: [603865.572422] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:49:16 pvei7 kernel: [603869.442345] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:49:16 pvei7 kernel: [603869.442418] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:49:16 pvei7 kernel: [603869.442422] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:49:21 pvei7 kernel: [603874.532574] vmbr0: port 1(enp0s31f6) entered disabled state
Jul 28 19:49:25 pvei7 kernel: [603878.468504] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jul 28 19:49:25 pvei7 kernel: [603878.468576] vmbr0: port 1(enp0s31f6) entered blocking state
Jul 28 19:49:25 pvei7 kernel: [603878.468579] vmbr0: port 1(enp0s31f6) entered forwarding state
Jul 28 19:49:35 pvei7 kernel: [603888.356714] vmbr0: port 1(enp0s31f6) entered disabled state

I am experiencing behavior similar to that reported here, with 5.10.22-200.fc33.x86_64 (Fedora). It usually works for an extended period (days) and then enters a connect/disconnect cycle with brief or no connectivity. I tried a different switch and cable while it was in a connect/disconnect cycle. Excerpt from logs:
Mar 16 15:47:29 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang: TDH <0> TDT <1> next_to_use <1> next_to_clean <0> buffer_info[next_to_clean]: time_stamp <10b2a22d2> next_to_watch <0> jiffies <10b2a3a00> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Mar 16 15:47:31 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang: TDH <0> TDT <1> next_to_use <1> next_to_clean <0> buffer_info[next_to_clean]: time_stamp <10b2a22d2> next_to_watch <0> jiffies <10b2a41c0> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Mar 16 15:47:33 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang: TDH <0> TDT <1> next_to_use <1> next_to_clean <0> buffer_info[next_to_clean]: time_stamp <10b2a22d2> next_to_watch <0> jiffies <10b2a4980> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Mar 16 15:47:33 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
Mar 16 15:47:33 localhost.localdomain NetworkManager[1140]: <info> [1615924053.4093] device (eno1): state change: disconnected -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 16 15:47:38 localhost.localdomain systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Mar 16 15:47:38 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 16 15:47:39 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Mar 16 15:47:39 localhost.localdomain NetworkManager[1140]: <info> [1615924059.2580] device (eno1): carrier: link connected
Mar 16 15:47:39 localhost.localdomain NetworkManager[1140]: <info> [1615924059.2586] device (eno1): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 16 15:47:41 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang: TDH <0> TDT <1> next_to_use <1> next_to_clean <0> buffer_info[next_to_clean]: time_stamp <10b2a60df> next_to_watch <0> jiffies <10b2a68c1> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Mar 16 15:47:43 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang: TDH <0> TDT <1> next_to_use <1> next_to_clean <0> buffer_info[next_to_clean]: time_stamp <10b2a60df> next_to_watch <0> jiffies <10b2a7080> next_to_watch.status <0> MAC Status <40080083> PHY Status <796d> PHY 1000BASE-T Status <3800> PHY Extended Status <3000> PCI Status <10>
Mar 16 15:47:44 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
Mar 16 15:47:44 localhost.localdomain NetworkManager[1140]: <info> [1615924064.6731] device (eno1): state change: disconnected -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Mar 16 15:47:50 localhost.localdomain kernel: e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

This also affects me. I use a T480s with the 5.4.0-198 generic kernel on Mint.
This also has problems on other distributions.
I boot the notebook normally, it tries to connect to the E1000e ethernet adapter on boot. Then I get the error "Failed to start NetworkManager-wait-online.service". The next log says that the network is online.
Once logged in, I can use my WiFi normally, but the lan is stuck in a loop, timed out every 45 seconds. I found out that the systemd-networkd has an error called Unknown state for interface NetworkctlListState(idx=2, name='enp0s31f6, type='ether' ...).
I decided to restart every service with network in its name... :-)
Which unfortunately did not work, then I killed the driver - reconnected the LAN cable - and when I restarted the driver with modprobe it worked fine.
Weirdly, I rebooted the machine and reset the OS configuration and on reconnecting it worked fine again... I am running this in a school, so telling the students to 'just reconnect the LAN cable' is not an option.

Created attachment 307255 [details]
attachment-13735-0.html

Hello,

I am currently out of the office on Army Reserve Duty, which limits my ability to respond to emails. Please do not communicate through email for existing or new issues; instead, use the IPS system. Instructions on opening an IPS ticket are attached to this auto-reply. We are committed to support all IPS submitted, we might decide to assign it to someone else. For urgent matters, you may contact my manager at shmuel.ben-nisan@intel.com.

To ensure effective monitoring and resolution of this issue, we request that you open an IPS ticket and provide the following information:

1. A comprehensive and detailed description of the failing scenario, broken down into steps, e.g.,
   * Boot to OS
   * Enter S4
   * Return from S4
   * Issue reproduced
1. Confirm if this issue is reproducible on Vpro or Non Vpro SKU (V/LM).
2. Provide fail rate and number of systems that can reproduce the issue, along with the number of tests conducted and failures, e.g.,
   * 2 out of 15 systems failed
   * System 1 fail rate: 1 out of 20
   * System 2 fail rate: 1 out of 15
3. Specify the test environment (Windows/Linux/UEFI/PXE). If Windows, include the LAN device driver version (e.g., 20.0.2.8).
4. Provide the LAN NVM version.
5. Indicate whether the LAN cable is connected or disconnected and if it is part of the test flow.
6. Describe the expected behavior in this scenario.
7. State the pass criteria for this scenario.

Failure to provide this information will result in automatic resolution and closure of the issue.

Thank you for your cooperation.

Best regards,

(In reply to Tin from comment #73)
> This also affects me. I use a T480s with the 5.4.0-198 generic kernel on
> Mint. This also has problems on other distributions.
> I boot the notebook normally, it tries to connect to the E1000e ethernet
> adapter on boot. Then I get the error "Failed to start
> NetworkManager-wait-online.service". The next log says that the network is
> online.
> Once logged in, I can use my WiFi normally, but the lan is stuck in a loop,
> timed out every 45 seconds. I found out that the systemd-networkd has an
> error called Unknown state for interface NetworkctlListState(idx=2,
> name='enp0s31f6, type='ether' ...).
> I decided to restart every service with network in its name... :-)
> Which unfortunately did not work, then I killed the driver - reconnected the
> LAN cable - and when I restarted the driver with modprobe it worked fine.
> Weirdly, I rebooted the machine and reset the OS configuration and on
> reconnecting it worked fine again... I am running this in a school, so
> telling the students to 'just reconnect the LAN cable' is not an option.

Kernel 6.8 has the same error |
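
For anyone comparing their own logs against the symptoms reported in this thread, here is a minimal Python sketch that tallies the relevant messages. The message fragments are copied from the logs quoted in the comments above; the function name `summarize_e1000e_events` and the labels are hypothetical choices made for this illustration. It only counts symptoms and does not diagnose or fix anything.

```python
from collections import Counter

# Message fragments copied verbatim from the logs quoted in this thread.
# The short labels on the right are made up for this sketch.
MARKERS = {
    "Detected Hardware Unit Hang": "hang",
    "Reset adapter unexpectedly": "reset",
    "NIC Link is Up": "link_up",
    "entered disabled state": "port_disabled",
}


def summarize_e1000e_events(lines):
    """Count how often each symptom message appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for fragment, label in MARKERS.items():
            if fragment in line:
                counts[label] += 1
    return counts
```

Passing it the lines of a saved `dmesg` or `journalctl -k` dump (for example an open file object) gives a rough count of hangs, resets and link-up events, which may help when reporting how often the connect/disconnect cycle occurs.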