Bug 47331 - e1000: Detected Tx Unit Hang - network is not operational
Summary: e1000: Detected Tx Unit Hang - network is not operational
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_network@kernel-bugs.osdl.org
Reported: 2012-09-10 19:51 UTC by Khomutov Vladimir
Modified: 2020-09-07 20:46 UTC
CC List: 26 users
Kernel Version: 3.5.3
Tree: Mainline
Regression: No


Attachments
dmesg (1.96 KB, text/plain), 2012-09-10 19:51 UTC, Khomutov Vladimir
ethtool (1.20 KB, text/plain), 2012-09-10 19:52 UTC, Khomutov Vladimir
kernel config (91.25 KB, text/plain), 2012-09-10 19:52 UTC, Khomutov Vladimir
lshw (25.38 KB, text/plain), 2012-09-10 19:52 UTC, Khomutov Vladimir
lspci -vvv (41.34 KB, text/plain), 2012-09-10 19:53 UTC, Khomutov Vladimir
uname (131 bytes, text/plain), 2012-09-10 19:53 UTC, Khomutov Vladimir
dmesg output with verbose settings on (26.17 KB, text/plain), 2012-10-25 05:01 UTC, Khomutov Vladimir
lspci -vvv after the issue occurred (40.12 KB, text/plain), 2012-10-25 05:04 UTC, Khomutov Vladimir
lspci -vvv after reboot (no issues) (40.10 KB, text/plain), 2012-10-25 17:52 UTC, Khomutov Vladimir
kernel log with full dump of registers after the issue (148.67 KB, text/x-log), 2012-10-25 17:53 UTC, Khomutov Vladimir
logs demonstrating the issue with 3.7.9 and 2 gig of ram (133.01 KB, application/x-gzip), 2013-02-27 20:30 UTC, Khomutov Vladimir
dmesg (31.85 KB, text/plain), 2015-02-10 23:02 UTC, abandoned account

Description Khomutov Vladimir 2012-09-10 19:51:47 UTC
Created attachment 79641
dmesg

Under any significant load the driver starts producing
"Detected Tx Unit Hang" messages and resets the adapter, so
networking is not really usable.
By significant load I mean an attempt to download a big (>2 Mb)
file over ssh.
Please see the attached files for system details.
The bug is 100% reproducible.

It looks like the problem has been known for a long time, but
the situation with the Linux e1000 drivers is confusing: the
kernel contains a 7.x version (alive), and there is an
8.x version by Intel on SourceForge (not maintained,
and it doesn't compile for newer kernels). A lot of the suggestions
are about Intel's version of the driver.
Please shed some light on the situation with e1000 in Linux...
Comment 1 Khomutov Vladimir 2012-09-10 19:52:14 UTC
Created attachment 79651
ethtool
Comment 2 Khomutov Vladimir 2012-09-10 19:52:32 UTC
Created attachment 79661
kernel config
Comment 3 Khomutov Vladimir 2012-09-10 19:52:54 UTC
Created attachment 79671
lshw
Comment 4 Khomutov Vladimir 2012-09-10 19:53:18 UTC
Created attachment 79681
lspci -vvv
Comment 5 Khomutov Vladimir 2012-09-10 19:53:37 UTC
Created attachment 79691
uname
Comment 6 Stefan de Konink 2012-10-16 20:08:27 UTC
I want to confirm this also happened to me today using 3.3.1-gentoo after 192 days of uptime, while the system was _not_ under any significant load.

e1000 0000:01:03.0: eth0: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <bd>
  TDT                  <bd>
  next_to_use          <bd>
  next_to_clean        <73>
buffer_info[next_to_clean]
  time_stamp           <5fd26739>
  next_to_watch        <74>
  jiffies              <5fd267fb>
  next_to_watch.status <0>
e1000 0000:01:03.0: eth0: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <bd>
  TDT                  <bd>
  next_to_use          <bd>
  next_to_clean        <73>
buffer_info[next_to_clean]
  time_stamp           <5fd26739>
  next_to_watch        <74>
  jiffies              <5fd268c3>
  next_to_watch.status <0>
e1000 0000:01:03.0: eth0: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <bd>
  TDT                  <bd>
  next_to_use          <bd>
  next_to_clean        <73>
buffer_info[next_to_clean]
  time_stamp           <5fd26739>
  next_to_watch        <74>
  jiffies              <5fd2698b>
  next_to_watch.status <0>
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x1b3/0x1bc()
Hardware name: PowerEdge 650              
NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out
Modules linked in: usb_storage usb_libusual xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables ohci_hcd usbcore usb_common
Pid: 0, comm: swapper Not tainted 3.3.1-gentoo #3
Call Trace:
 [<c101fa16>] warn_slowpath_common+0x67/0x8e
 [<c123d824>] ? dev_watchdog+0x1b3/0x1bc
 [<c123d824>] ? dev_watchdog+0x1b3/0x1bc
 [<c101fab9>] warn_slowpath_fmt+0x2e/0x30
 [<c123d824>] dev_watchdog+0x1b3/0x1bc
 [<c102a8d1>] run_timer_softirq+0xfb/0x28a
 [<c123d671>] ? netif_carrier_off+0x26/0x26
 [<c102461b>] __do_softirq+0x72/0x14a
 [<c10245a9>] ? __tasklet_hi_schedule_first+0x4b/0x4b
 <IRQ>  [<c102485f>] ? irq_exit+0x64/0x85
 [<c100395d>] ? do_IRQ+0x3d/0x84
 [<c12b6029>] ? common_interrupt+0x29/0x30
 [<c10082b3>] ? default_idle+0x4d/0x129
 [<c100167f>] ? cpu_idle+0x40/0x63
 [<c12aa515>] ? rest_init+0x55/0x60
 [<c13ce607>] ? start_kernel+0x24c/0x252
 [<c13ce13f>] ? loglevel+0x2b/0x2b
 [<c13ce044>] ? i386_start_kernel+0x44/0x46
---[ end trace 7a625de8614c18af ]---
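
The hang report above can be read mechanically. Below is a minimal Python sketch that pulls the fields out of one report and derives three numbers from them. The helper names `parse_hang_dump` and `summarize` are hypothetical, and the 256-entry ring size is an assumption (the report does not show it; `ethtool -g` would). Jiffies wraparound is not handled.

```python
import re

# Matches one "name <hex>" pair, ignoring any dmesg timestamp prefix.
FIELD_RE = re.compile(r"([A-Za-z][\w. -]*?)\s+<([0-9a-fA-F]+)>")

# First report from the comment above.
SAMPLE = """\
e1000 0000:01:03.0: eth0: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <bd>
  TDT                  <bd>
  next_to_use          <bd>
  next_to_clean        <73>
buffer_info[next_to_clean]
  time_stamp           <5fd26739>
  next_to_watch        <74>
  jiffies              <5fd267fb>
  next_to_watch.status <0>
"""


def parse_hang_dump(text):
    """Return the first value seen for each field of a hang report."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.search(line)
        if match:
            fields.setdefault(match.group(1), int(match.group(2), 16))
    return fields


def summarize(fields, ring_size=256):
    """Derive ring occupancy and buffer age (ring_size is an assumption)."""
    return {
        # Descriptors handed to the NIC that it has not yet processed.
        "hw_pending": (fields["TDT"] - fields["TDH"]) % ring_size,
        # Descriptors the driver has queued but not yet reclaimed.
        "sw_unreclaimed": (fields["next_to_use"] - fields["next_to_clean"]) % ring_size,
        # Age of the oldest unreclaimed buffer, in jiffies.
        "age_jiffies": fields["jiffies"] - fields["time_stamp"],
    }


print(summarize(parse_hang_dump(SAMPLE)))
# {'hw_pending': 0, 'sw_unreclaimed': 74, 'age_jiffies': 194}
```

For this report TDH equals TDT, so the hardware appears to have consumed everything it was given, while the driver still sees 74 descriptors without a completion status (next_to_watch.status is 0).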
Comment 7 Tushar 2012-10-23 22:10:55 UTC
Set the message level with 'ethtool -s ethx msglvl 0x2c01' so that the driver prints hw ring info when the problem occurs.
Please submit the full dmesg log and lspci -vvv output after the issue occurs.

-Tushar
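
For readers wondering what 0x2c01 selects, here is a small Python sketch that splits the mask into message classes. The bit values follow the NETIF_MSG_* definitions in the kernel's netdevice.h as I recall them, so treat the table as an assumption to check against your kernel source; `decode_msglvl` is a hypothetical helper name.

```python
# NETIF_MSG_* bits (assumed to match include/linux/netdevice.h).
NETIF_MSG_BITS = {
    0x0001: "drv",
    0x0002: "probe",
    0x0004: "link",
    0x0008: "timer",
    0x0010: "ifdown",
    0x0020: "ifup",
    0x0040: "rx_err",
    0x0080: "tx_err",
    0x0100: "tx_queued",
    0x0200: "intr",
    0x0400: "tx_done",
    0x0800: "rx_status",
    0x1000: "pktdata",
    0x2000: "hw",
    0x4000: "wol",
}


def decode_msglvl(mask):
    """List the message classes enabled by an ethtool msglvl bitmask."""
    return [name for bit, name in sorted(NETIF_MSG_BITS.items()) if mask & bit]


print(decode_msglvl(0x2C01))
# ['drv', 'tx_done', 'rx_status', 'hw']
```

So the requested level enables the driver, Tx-done, Rx-status and hardware classes, which is consistent with asking the driver for register and descriptor ring dumps.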
Comment 8 Khomutov Vladimir 2012-10-25 05:01:50 UTC
Created attachment 84761
dmesg output with verbose settings on
Comment 9 Khomutov Vladimir 2012-10-25 05:04:01 UTC
Created attachment 84771
lspci -vvv after the issue occurred
Comment 10 Tushar 2012-10-25 05:14:25 UTC
The dmesg log seems to have been overwritten. It does not contain the tx ring info. If you have not attached the full dmesg, please attach it.

I do see a PCI Master Abort error in lspci.
07:03.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
                                                                                  ^^^^^^^^

You may need to freshly boot system and dump lspci -vvv before and after tx hang occurs to confirm that MAbort is the cause.
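
The before/after comparison suggested here can be scripted. The Python sketch below parses the flags on an lspci `Status:` line and reports which ones changed. The "after" line is the one quoted above; the "before" line is a hypothetical healthy state made up for illustration, and `parse_status_flags` and `changed_flags` are hypothetical helper names. In lspci notation `<MAbort` is the "received master abort" bit of the PCI status register.

```python
import re

# A flag is a name followed by + or -, e.g. "Cap+" or "<MAbort-".
FLAG_RE = re.compile(r"([<>]?[A-Za-z0-9]+)([+-])(?=\s|$)")


def parse_status_flags(line):
    """Map each flag on an lspci 'Status:' line to True (+) or False (-)."""
    _, _, rest = line.partition("Status:")
    return {name: sign == "+" for name, sign in FLAG_RE.findall(rest)}


def changed_flags(before_line, after_line):
    """Return {flag: (before, after)} for flags whose state differs."""
    before = parse_status_flags(before_line)
    after = parse_status_flags(after_line)
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# Hypothetical line from a fresh boot (not taken from the attachments).
BEFORE = ("Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium "
          ">TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-")
# Line quoted in the comment above, captured after the hang.
AFTER = ("Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium "
         ">TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-")

print(changed_flags(BEFORE, AFTER))
# {'<MAbort': (False, True)}
```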
Comment 11 Khomutov Vladimir 2012-10-25 17:52:36 UTC
Created attachment 84821
lspci -vvv after reboot (no issues)
Comment 12 Khomutov Vladimir 2012-10-25 17:53:43 UTC
Created attachment 84831
kernel log with full dump of registers after the issue
Comment 13 Tushar 2012-10-25 23:25:19 UTC
So comparing lspci -vvv before and after confirms that the root cause of the tx hang is a PCI MAbort. (For some reason I don't see the TX descriptor ring dump in the dmesg log.)

It looks like the system has 8 GB of RAM. Can you test with only 2 GB of RAM and see if the issue occurs?
Comment 14 Stefan de Konink 2012-10-25 23:28:13 UTC
My problem was on a much older system, with only 1.5GB of RAM. Doubt that is related.
Comment 15 Tushar 2012-10-25 23:33:14 UTC
Was this working at all before, and did it start appearing with a kernel upgrade?
Comment 16 Khomutov Vladimir 2012-10-26 10:08:05 UTC
(In reply to comment #13)

> Looks like system has 8GB RAM. Can you test with only 2 GB ram and see if
> issue occurs?

When I had just installed the hardware and hit the issue, I spent some time
looking for a solution, and one of the possible causes named was RAM > 4 GB, so I
tried with 4 GB and the result was the same. I can't remember if I tried with
2 GB. Anyway, I need it working with 8 GB.
Comment 17 Tushar 2012-11-01 20:53:44 UTC
A PCI bus trace would be very helpful for finding the cause of the MAbort error. Do you have a facility to capture a bus trace?

If not, can you send me the dmesg log again with the Tx ring info?
(The dmesg log you sent last time, in comment #12, did not include the tx ring dump. Make sure msglvl is set to 0x2c01: 'ethtool -s ethx msglvl 0x2c01')
Comment 18 Khomutov Vladimir 2012-11-01 21:01:27 UTC
I have no idea how to get a PCI bus trace.
If it does not require special hardware, I can try if you explain how.

I don't know why the kernel didn't dump the TX ring, since I entered the
command 'ethtool -s ethx msglvl 0x2c01' and the result was
a lot of verbose messages from the driver, not just errors.

I thought that this line was the one marking the start of what you are looking for:

>> Oct 25 21:22:13 myhost kernel: e1000: Tx descriptor cache in 64bit format
Comment 19 Phil 2012-11-19 18:21:37 UTC
I upgraded a number of boxes using 82546GB from 3.1.x kernels to 3.6.x recently, and a number of them have started having these TX hangs regularly.  So what changed between 3.1 and 3.6 to cause this?  Any output I can provide which would assist?
Comment 20 Stefan de Konink 2012-11-19 18:27:19 UTC
With my report regarding 3.3.1 we can reduce that to: what happened between 3.1 and 3.3.1.
Comment 21 ulf kypke 2013-01-10 15:06:37 UTC
Hey,
I see this with kernel 2.6 as well as with kernel 3.3.8.
I use OpenWrt on a router with 5 Intel e1000 and e1000e cards.
Most of these routers have 1 or 2 GB of RAM.
The bug happens after approx. 5 hours of running, regardless of whether there is a lot of traffic or not.
I can reproduce it on various kernel versions, but with kernel 3.3.8 it happens far more often than with kernel 2.6.
I also posted this bug at 52571 (sorry for crossposting).

Best, ulf
Comment 22 Tushar 2013-01-10 21:22:57 UTC
(In reply to comment #18)
> I have no idea how to get PCI bus trace.
> If it does not require some special hardware, I can try if you explain how.
> I don't know why kernel didn't dump TX ring, since I've entered 
> command 'ethtool -s ethx msglvl 0x2c01' and the result was that
> there were a lot of verbose messages from driver, not just errors.

Yes please send me the full dmesg log taken.
> I thought that this line was one marking start of what you are looking for:
> >> Oct 25 21:22:13 myhost kernel: e1000: Tx descriptor cache in 64bit format

 
Would you also please try disabling tso with 'ethtool -K ethx tso off' and see if that makes any difference? Meanwhile I will see if I can get hold of a system similar to yours and reproduce the issue locally.
Comment 23 Rich Ercolani 2013-02-06 23:42:44 UTC
So this looks a lot like the problem I once had with these cards.

http://sourceforge.net/p/e1000/bugs/266/ is the relevant bug report.

Verbatim quote from the bug report at the time, though it seems to be gone from the updated version (sf.net migrated bug trackers I suppose):

"We were able to reproduce this bug here, and verify that the system in
question has a cache coherency problem.  the driver is correctly updating
the system memory and then requesting hardware to DMA the data.  When the
hardware DMA request is completed by the memory controller, the data in
question is stale (the value prior to the update) and then the software
suffers an apparent "tx hang"

We are still investigating if there is a fix possible."

I believe they eventually did indeed have a fix in the 8.x series - at least, at some point I downloaded it and used it, and the problem went away.

The motherboard being Intel-branded implies it probably isn't a crap board with cache coherency bugs...one hopes, at least.  But that is the same era of hardware we had these problems on.
Comment 24 Tushar 2013-02-06 23:43:13 UTC
I am on vacation 02/06

-Tushar
Comment 25 Khomutov Vladimir 2013-02-27 20:30:05 UTC
We all hope the vacation was great =))

But back on topic:

1) I was able to reproduce the problem with kernel 3.7.9
2) The problem is 100% reproducible with 2 gig of RAM
3) Turning tso off doesn't help
4) I was able to get full logs with "ethtool -s ethx msglvl 0x2c01"

I'm attaching a tarball with logs (kernel log, lspci and some other related information) for all these cases.
Comment 26 Khomutov Vladimir 2013-02-27 20:30:42 UTC
Created attachment 94201
logs demonstrating the issue with 3.7.9 and 2 gig of ram
Comment 27 abandoned account 2015-02-10 23:02:13 UTC
Created attachment 166421
dmesg

I got this same thing, unexpectedly, inside VirtualBox (8 GB RAM) with kernel 3.16.5-gentoo (found on install-amd64-minimal-20141204.iso)
Comment 28 peter.eldridge.bailey 2016-03-09 05:36:49 UTC
I am also affected by this on 4.4.0-1-686-pae #1 SMP Debian 4.4.2-3 (2016-02-21).
Comment 29 Victor Pablos Ceruelo 2016-05-16 15:58:53 UTC
Me too, Ubuntu 4.4.0-22.39-generic 4.4.8 + Intel(R) PRO/1000 Network Driver - 3.2.6-k

Trying now with options:

ethtool -K eno1 gso off gro off tso off
ethtool -s eno1 msglvl 0x2c01
ethtool --set-eee eno1 eee off
ethtool --set-eee eno1 advertise 0

I have not yet tried disabling Active-State Power Management (boot option):

pcie_aspm=off

It seems the link goes off and on at times, but there are no more details in dmesg.

I am thinking of forcing the network adapter to work at 100 Mb speed ...
Comment 30 Victor Pablos Ceruelo 2016-05-16 16:39:17 UTC
Hi again.

It seems the problem is fixed by setting the options I wrote before and reducing the speed to 100 Mb.

ethtool -s eno1 speed 100 duplex full

I've been downloading for half an hour at 600Kb-1Mb.
At home I do not need more, but it would be interesting to know why ...

There are more ideas around, like 

ethtool -K eno1 gso off gro off tso off lro off

and upgrading to e1000e-3.3.3

https://communities.intel.com/thread/70244
https://downloadcenter.intel.com/download/15817/Network-Adapter-Driver-for-PCI-E-Gigabit-Network-Connections-under-Linux-?v=t

Hope it helps others ...
Comment 31 Till Schäfer 2018-03-01 13:29:19 UTC
I can confirm the issue with gentoo sources 4.15.2 on a C220 chipset under heavy load (> 500 Mbit; the adapter hangs every few seconds). I can also confirm that the following workaround helps. The issue only popped up recently; the hardware had worked without any problems up to recent kernels.

ethtool -K eno1 gso off gro off tso off



00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection I217-LM [8086:153a] (rev 05)
        Subsystem: ASUSTeK Computer Inc. Ethernet Connection I217-LM [1043:8535]
        Flags: bus master, fast devsel, latency 0, IRQ 27
        Memory at f7d00000 (32-bit, non-prefetchable) [size=128K]
        Memory at f7d35000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at f080 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] PCI Advanced Features
        Kernel driver in use: e1000e
Comment 32 Nikolay Kichukov 2019-08-16 13:15:21 UTC
After a recent kernel upgrade to 5.2.8 on GNU/Gentoo Linux I started to experience the same problem. This is observed on Dell Precision Tower 5810 with x86_64 kernel.

This also seems to be a regression, as the issue was not there at least on the below earlier kernels: 

kernel-genkernel-x86_64-4.19.27-gentoo-r1
kernel-genkernel-x86_64-5.0.5-gentoo
kernel-genkernel-x86_64-5.0.7-gentoo
kernel-genkernel-x86_64-5.0.10-gentoo
kernel-genkernel-x86_64-5.1.0-gentoo
kernel-genkernel-x86_64-5.1.3-gentoo
kernel-genkernel-x86_64-5.1.7-gentoo
kernel-genkernel-x86_64-5.1.11-gentoo

Same ethernet driver from kernel 5.2.8 is in use on a Dell Latitude laptop and the problem has not yet shown, thus this seems to be hardware related.

On the Precision Tower 5810, disabling the segmentation offloading with:
ethtool -K enp0s25 gso off gro off tso off

does seem to provide relief.

Thanks!
-N
Comment 33 Nikolay Kichukov 2019-08-16 13:17:15 UTC
I overlooked the kernel driver name as originally reported in year 2012, I use 'e1000e' kernel driver nowadays, not 'e1000'.
Comment 34 SB 2019-09-15 12:55:16 UTC
This is still an issue.

I'm running OpenWRT with Kernel 4.14.131.  Any reasonable load on the Intel onboard I217LM NIC causes it to hardware fault repeatedly. 

[  917.996439] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
[  917.996439]   TDH                  <db>
[  917.996439]   TDT                  <f1>
[  917.996439]   next_to_use          <f1>
[  917.996439]   next_to_clean        <db>
[  917.996439] buffer_info[next_to_clean]:
[  917.996439]   time_stamp           <10000efce>
[  917.996439]   next_to_watch        <db>
[  917.996439]   jiffies              <10000f168>
[  917.996439]   next_to_watch.status <0>
[  917.996439] MAC Status             <80083>
[  917.996439] PHY Status             <796d>
[  917.996439] PHY 1000BASE-T Status  <3800>
[  917.996439] PHY Extended Status    <3000>
[  917.996439] PCI Status             <10>

I can confirm the "ethtool -K enp0s25 gso off gro off tso off" workaround does indeed appear to work.
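
The status words at the end of the e1000e report can be decoded by hand. Below is a hedged Python sketch. The MAC status bit masks are taken from my recollection of the e1000e driver's defines.h (E1000_STATUS_FD, _LU, the speed bits and _GIO_MASTER_ENABLE) and should be treated as assumptions. The PCI status bits are the standard ones from the PCI specification. The function names are hypothetical.

```python
# MAC STATUS register bits (assumed, after e1000e defines.h).
STATUS_FD = 0x00000001                 # full duplex
STATUS_LU = 0x00000002                 # link up
STATUS_SPEED_MASK = 0x000000C0
STATUS_SPEEDS = {0x00: 10, 0x40: 100, 0x80: 1000}
STATUS_GIO_MASTER_ENABLE = 0x00080000  # bus master requests enabled

# Standard PCI status register bits.
PCI_STATUS_BITS = {
    0x0008: "interrupt",
    0x0010: "cap_list",
    0x0020: "66mhz",
    0x0080: "fast_back",
    0x0100: "master_data_parity",
    0x0800: "sig_target_abort",
    0x1000: "rec_target_abort",
    0x2000: "rec_master_abort",
    0x4000: "sig_system_error",
    0x8000: "detected_parity",
}


def decode_mac_status(value):
    """Decode the 'MAC Status' word of an e1000e hang report."""
    return {
        "full_duplex": bool(value & STATUS_FD),
        "link_up": bool(value & STATUS_LU),
        "speed_mbps": STATUS_SPEEDS.get(value & STATUS_SPEED_MASK),
        "gio_master_enabled": bool(value & STATUS_GIO_MASTER_ENABLE),
    }


def decode_pci_status(value):
    """List the PCI status bits set in the 'PCI Status' word."""
    return [name for bit, name in sorted(PCI_STATUS_BITS.items()) if value & bit]


print(decode_mac_status(0x80083))
print(decode_pci_status(0x10))
```

Under these assumptions, MAC Status 0x80083 reads as link up at 1000 Mb/s full duplex with bus mastering enabled, and PCI Status 0x10 has only the capabilities-list bit set, that is, no master abort is recorded in this report, unlike the e1000 case in comment 10.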
Comment 35 SB 2019-09-15 13:29:02 UTC
I have this running and repeatable on a test-rig if there are any diagnostics that would be useful.
Comment 36 Charles Curley 2019-11-21 21:46:59 UTC
Still an issue for me also. I'm running on a single board computer w/ a 3 network PC/104+ card and getting these errors like crazy. 

[Thu Nov 21 21:08:35 2019] perf: interrupt took too long (5094 > 5083), lowering kernel.perf_event_max_sample_rate to 39000
[Thu Nov 21 21:09:48 2019] e1000 0000:03:06.0 eth-lcs: Detected Tx Unit Hang
                             Tx Queue             <0>
                             TDH                  <6d>
                             TDT                  <6d>
                             next_to_use          <6d>
                             next_to_clean        <5b>
                           buffer_info[next_to_clean]
                             time_stamp           <1002a3b17>
                             next_to_watch        <5c>
                             jiffies              <1002a3d00>
                             next_to_watch.status <0>
[Thu Nov 21 21:09:50 2019] e1000 0000:03:06.0 eth-lcs: Detected Tx Unit Hang
                             Tx Queue             <0>
                             TDH                  <6d>
                             TDT                  <6d>
                             next_to_use          <6d>
                             next_to_clean        <5b>
                           buffer_info[next_to_clean]
                             time_stamp           <1002a3b17>
                             next_to_watch        <5c>
                             jiffies              <1002a3f80>
                             next_to_watch.status <0>
[Thu Nov 21 21:09:52 2019] e1000 0000:03:06.0 eth-lcs: Detected Tx Unit Hang
                             Tx Queue             <0>
                             TDH                  <6d>
                             TDT                  <6d>
                             next_to_use          <6d>
                             next_to_clean        <5b>
                           buffer_info[next_to_clean]
                             time_stamp           <1002a3b17>
                             next_to_watch        <5c>
                             jiffies              <1002a4200>
                             next_to_watch.status <0>
[Thu Nov 21 21:09:54 2019] e1000 0000:03:06.0 eth-lcs: Detected Tx Unit Hang
                             Tx Queue             <0>
                             TDH                  <6d>
                             TDT                  <6d>
                             next_to_use          <6d>
                             next_to_clean        <5b>
                           buffer_info[next_to_clean]
                             time_stamp           <1002a3b17>
                             next_to_watch        <5c>
                             jiffies              <1002a4480>
                             next_to_watch.status <0>
[Thu Nov 21 21:09:55 2019] e1000 0000:03:06.0 eth-lcs: Reset adapter
[Thu Nov 21 21:10:00 2019] e1000: eth-lcs NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Comment 37 Jonathan Bisson 2020-06-05 15:09:20 UTC
Kernel Version impacted
-----------------------
5.4.43 and 5.6.15

Trigger
-------

On heavy network use (so the bug may be much older than the kernels I found it in).

Consequence
-----------
 I get random packet drops and packets with wrong data (for example checksums of some downloaded files fail)

Log
---

###dmesg
[68832.822344] e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                 TDH                  <23>
                 TDT                  <42>
                 next_to_use          <42>
                 next_to_clean        <23>
               buffer_info[next_to_clean]:
                 time_stamp           <10139b38b>
                 next_to_watch        <24>
                 jiffies              <10139b540>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[68834.742296] e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                 TDH                  <23>
                 TDT                  <42>
                 next_to_use          <42>
                 next_to_clean        <23>
               buffer_info[next_to_clean]:
                 time_stamp           <10139b38b>
                 next_to_watch        <24>
                 jiffies              <10139b780>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[68836.662275] e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                 TDH                  <23>
                 TDT                  <42>
                 next_to_use          <42>
                 next_to_clean        <23>
               buffer_info[next_to_clean]:
                 time_stamp           <10139b38b>
                 next_to_watch        <24>
                 jiffies              <10139b9c0>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[68838.796845] e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
                 TDH                  <23>
                 TDT                  <42>
                 next_to_use          <42>
                 next_to_clean        <23>
               buffer_info[next_to_clean]:
                 time_stamp           <10139b38b>
                 next_to_watch        <24>
                 jiffies              <10139bc40>
                 next_to_watch.status <0>
               MAC Status             <80083>
               PHY Status             <796d>
               PHY 1000BASE-T Status  <3800>
               PHY Extended Status    <3000>
               PCI Status             <10>
[68839.648816] e1000e 0000:00:19.0 enp0s25: Reset adapter unexpectedly
[68843.384565] e1000e 0000:00:19.0 enp0s25: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

###


Temporary solution
------------------
It is solved with: 
ethtool -K enp0s25 tso off gso off
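
To check that the workaround actually took effect (for example after a reboot or an adapter reset), the output of `ethtool -k` can be parsed. This is a minimal Python sketch; the sample output is illustrative and not captured from the reporter's machine, and `parse_offloads` and `still_enabled` are hypothetical helper names.

```python
def parse_offloads(ethtool_k_output):
    """Map feature names from `ethtool -k <dev>` output to True (on) / False (off)."""
    features = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Skip the "Features for <dev>:" header and any non on/off lines.
        if sep and words and words[0] in ("on", "off"):
            features[name.strip()] = words[0] == "on"
    return features


def still_enabled(features, wanted_off=("tcp-segmentation-offload",
                                        "generic-segmentation-offload")):
    """Return the offloads from wanted_off that are still switched on."""
    return [name for name in wanted_off if features.get(name)]


# Illustrative output only (assumed format, abbreviated).
SAMPLE_OUTPUT = """\
Features for enp0s25:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: on
rx-vlan-offload: on [fixed]
"""

features = parse_offloads(SAMPLE_OUTPUT)
print(still_enabled(features))
# []
```

On a real system the text could come from something like `subprocess.run(["ethtool", "-k", "enp0s25"], capture_output=True, text=True).stdout`. Note that settings made with `ethtool -K` do not persist across reboots by themselves.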


HW info (lshw)
--------------
*-network
             description: Ethernet interface
             product: Ethernet Connection (3) I218-LM
             vendor: Intel Corporation
             physical id: 19
             bus info: pci@0000:00:19.0
             logical name: enp0s25
             version: 03
             serial: <SNIP>
             size: 1Gbit/s
             capacity: 1Gbit/s
             width: 32 bits
             clock: 33MHz
             capabilities: pm msi bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
             configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=3.2.6-k duplex=full firmware=0.2-4 latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s
             resources: irq:48 memory:c1300000-c131ffff memory:c133d000-c133dfff ioport:5080(size=32)


HW info (lspci)
---------------

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (3) I218-LM (rev 03)
	Subsystem: Hewlett-Packard Company Ethernet Connection (3) I218-LM
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 48
	Region 0: Memory at c1300000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at c133d000 (32-bit, non-prefetchable) [size=4K]
	Region 2: I/O ports at 5080 [disabled] [size=32]
	Capabilities: [c8] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee00358  Data: 0000
	Capabilities: [e0] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: e1000e
	Kernel modules: e1000e
Comment 38 Thomas Clark 2020-09-07 20:46:26 UTC
Running kernel 5.8.4, I am seeing the same errors.

Sep 06 22:30:10 kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                                                 TDH                  <39>
                                                 TDT                  <60>
                                                 next_to_use          <60>
                                                 next_to_clean        <38>
                                               buffer_info[next_to_clean]:
                                                 time_stamp           <1030819ce>
                                                 next_to_watch        <39>
                                                 jiffies              <103082240>
                                                 next_to_watch.status <0>
                                               MAC Status             <80083>
                                               PHY Status             <796d>
                                               PHY 1000BASE-T Status  <3800>
                                               PHY Extended Status    <3000>
                                               PCI Status             <10>
Sep 06 22:30:12 kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                                                 TDH                  <39>
                                                 TDT                  <60>
                                                 next_to_use          <60>
                                                 next_to_clean        <38>
                                               buffer_info[next_to_clean]:
                                                 time_stamp           <1030819ce>
                                                 next_to_watch        <39>
                                                 jiffies              <103082a00>
                                                 next_to_watch.status <0>
                                               MAC Status             <80083>
                                               PHY Status             <796d>
                                               PHY 1000BASE-T Status  <3800>
                                               PHY Extended Status    <3000>
                                               PCI Status             <10>
Sep 06 22:30:14 kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                                                 TDH                  <39>
                                                 TDT                  <60>
                                                 next_to_use          <60>
                                                 next_to_clean        <38>
                                               buffer_info[next_to_clean]:
                                                 time_stamp           <1030819ce>
                                                 next_to_watch        <39>
                                                 jiffies              <103083200>
                                                 next_to_watch.status <0>
                                               MAC Status             <80083>
                                               PHY Status             <796d>
                                               PHY 1000BASE-T Status  <3800>
                                               PHY Extended Status    <3000>
                                               PCI Status             <10>
Sep 06 22:30:16 kernel: e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                                                 TDH                  <39>
                                                 TDT                  <60>
                                                 next_to_use          <60>
                                                 next_to_clean        <38>
                                               buffer_info[next_to_clean]:
                                                 time_stamp           <1030819ce>
                                                 next_to_watch        <39>
                                                 jiffies              <1030839c0>
                                                 next_to_watch.status <0>
                                               MAC Status             <80083>
                                               PHY Status             <796d>
                                               PHY 1000BASE-T Status  <3800>
                                               PHY Extended Status    <3000>
                                               PCI Status             <10>
Sep 06 22:30:17 kernel: NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
Sep 06 22:30:17 kernel:  snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device at24 snd_pcm snd_timer i2c_i801 snd intel_pch_thermal lpc_ich mei_me i2c_sm>
Sep 06 22:30:17 kernel: e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
Sep 06 22:30:20 kernel: e1000e 0000:00:19.0 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
