Most recent kernel where this bug did not occur: Unknown. The bug occurs on 2.6.11, 2.6.12, 2.6.13 and 2.6.14; I don't know of a version where it does not occur.
Distribution: Debian 3.1
Hardware Environment: PC P4 Xeon (see dmesg for further details)
Software Environment: Squid Web Cache
Problem Description: The NIC doesn't come up and starts filling the console with "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" and other errors.
Steps to reproduce: The problem starts right after the link is detected.
Created attachment 7154 [details] dmesg sysout
Created attachment 7156 [details] The error sysout
Created attachment 7157 [details] lspci sysout
Created attachment 7166 [details] e1000 patch between 2.6.16-rc1 and current
The changelog and netdev suggest that the bug may already be fixed in post-2.6.16-rc1 kernels. Can you report how the patch behaves? -- Ueimor
Hello Fran
Created attachment 7179 [details] The error on Linux kernel version 2.6.16-rc1 Fran
Rodrigo: [...] > Fran
It didn't work with lowmem (and your patch). The error still occurs. Any suggestions? Besides that, I'm going to change the board.
The issue still persists on certain hardware in 2.6.16-rc5.

I am not a kernel developer, so I don't know the right steps to nail down the cause; apologies if my comment is incomplete. Here is my short history with e1000 and the Intel PRO/1000MT (82540EM):

Host 1 - OK with 2.6.15.4:
Debian 3.1 Sarge
VIA P4M266A (Gigabyte 8VM533M-RZ) with Celeron 2.53
It ran the Debian 2.6.8 kernel, which worked but with frequent e1000 down/up events; I didn't really dig into it. I compiled the official 2.6.15.4 and everything works great.

lspci:
0000:00:00.0 Host bridge: VIA Technologies, Inc. P4M266 Host Bridge
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP]
0000:00:05.0 Communication controller: Tiger Jet Network Inc. Tiger3XX Modem/ISDN interface
0000:00:06.0 Unknown mass storage controller: Promise Technology, Inc. 20269 (rev 02)
0000:00:07.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)
0000:00:0f.0 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
0000:00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
0000:01:00.0 VGA compatible controller: S3 Inc. VT8375 [ProSavage8 KM266/KL266]

Host 2 - e1000 not OK with 2.6.16-rc5:
Debian 3.1 Sarge
Intel 865G (Gigabyte GA-8IG1000) with Celeron 1.8
I had issues with 2.6.15.4, so I tried 2.6.15.5 and 2.6.16-rc5, and the issue persists just as described in this bug report. The Intel PRO/1000MT adapter is from the exact same series as the one in host 1.

lspci:
0000:00:00.0 Host bridge: Intel Corp. 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
0000:00:02.0 VGA compatible controller: Intel Corp. 82865G Integrated Graphics Device (rev 02)
0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1 (rev 02)
0000:00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #2 (rev 02)
0000:00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02)
0000:00:1d.3 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #4 (rev 02)
0000:00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
0000:00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Bridge (rev 02)
0000:00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) Ultra ATA 100 Storage Controller (rev 02)
0000:00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
0000:01:00.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)
0000:01:02.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
0000:01:03.0 RAID bus controller: 3ware Inc 3ware 7000-series ATA-RAID (rev 01)
0000:01:09.0 Ethernet controller: Marvell Technology Group Ltd. Yukon Gigabit Ethernet 10/100/1000Base-T Adapter (rev 13)

dmesg output when bringing up the Intel card (eth1) and trying a ping:

e1000: eth1: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <0>
  TDT                  <2>
  next_to_use          <2>
  next_to_clean        <0>
buffer_info[next_to_clean]
  time_stamp           <ffff34b1>
  next_to_watch        <0>
  jiffies              <ffff365e>
  next_to_watch.status <0>
e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang

This is repeated several times with new time_stamp and jiffies values, and after about five repetitions it ends with:

  jiffies              <ffff3c3a>
  next_to_watch.status <0>
NETDEV WATCHDOG: eth1: transmit timed out
e1000: eth1: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
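For anyone reading the hang report quoted above, the numbers can be pulled out and compared with a few lines of Python. This is only a rough sketch, not part of the driver or of any patch in this thread: the helper names (parse_hang, pending_descriptors, jiffies_elapsed) are made up for illustration, all fields are assumed to be printed in hexadecimal, and jiffies is assumed to be a 32-bit counter on this machine.

```python
import re

# Excerpt of the hang report quoted in this comment.
SAMPLE = """\
e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <0>
  TDT                  <2>
  next_to_use          <2>
  next_to_clean        <0>
buffer_info[next_to_clean]
  time_stamp           <ffff34b1>
  next_to_watch        <0>
  jiffies              <ffff365e>
  next_to_watch.status <0>
"""

# Matches lines of the form "  name   <hexvalue>".
FIELD = re.compile(r"^\s*([A-Za-z_. ]+?)\s*<([0-9a-fA-F]+)>\s*$")


def parse_hang(text):
    """Return a dict mapping field name to integer value (assumed hex)."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def pending_descriptors(fields):
    """Descriptors queued by the driver (TDT) but not yet passed by the
    hardware head pointer (TDH). Ring wrap-around is ignored because the
    ring size is not part of the report."""
    return fields["TDT"] - fields["TDH"]


def jiffies_elapsed(fields):
    """Ticks between queuing the oldest buffer and the hang report,
    assuming a 32-bit jiffies counter (an assumption, not from the log)."""
    return (fields["jiffies"] - fields["time_stamp"]) & 0xFFFFFFFF


report = parse_hang(SAMPLE)
print("pending descriptors:", pending_descriptors(report))
print("jiffies since queued:", jiffies_elapsed(report))
```

For the report above this gives 2 pending descriptors and 429 jiffies. How long 429 jiffies is depends on the kernel's HZ setting, which is not stated in this comment: roughly 1.7 seconds at HZ=250 or 0.43 seconds at HZ=1000. TDH staying at 0 while TDT is 2 is at least consistent with the hardware never having processed the queued descriptors.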
Created attachment 7510 [details] 2.6.16-rc5 config
Created attachment 7511 [details] 2.6.16-rc5 with e1000-7.0.33 sysinfo
I forgot to mention that I also tried the latest driver from SourceForge, 7.0.33, with the 2.6.16-rc5 kernel; it made no difference. I have attached some more system info. I can do more testing or gather info on both of my systems if you like; it just may take a day or two on the one that is OK (host 1), since it's my backup storage box.
I've been testing all kinds of things with no luck. I found a Zonet Zen3300 card based on the rtl-8169 chip, and it behaves similarly. So I am starting to think I did something idiotic in my kernel config (attachment 7510), but if I didn't, then what? The 10/100 rtl8139 card works fine, and the onboard Marvell/Yukon gigabit NIC works flawlessly as well. Please let me know if I should provide more info or try some tests/patches.
Jon, FYI, a UTP Broadcom Tigon (tg3) card works flawlessly on this hardware. BTW, I tried swapping the Intel card (I have some spares; I thought that specific board might have a hardware failure), but the problem persisted.
Created attachment 7932 [details] debug patch for tx hang
You can apply this patch to 7.0.33 and send the output from dmesg. Please don't send it from syslog (i.e. /var/log/messages); it gets corrupted because we print too much info with this debug patch.
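One way to capture the kernel ring buffer directly, rather than copying it out of syslog, is sketched below. This is only an illustration: the function name save_dmesg and the output file name are hypothetical, and it assumes the standard dmesg utility is on the PATH and readable by the current user.

```python
import subprocess


def save_dmesg(path, cmd=("dmesg",)):
    """Run cmd (dmesg by default) and write its raw stdout to path.

    Bytes are written unmodified so that nothing is re-encoded or
    truncated on the way to the file. Returns the number of lines saved.
    """
    result = subprocess.run(
        list(cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the command fails
    )
    with open(path, "wb") as out:
        out.write(result.stdout)
    return result.stdout.count(b"\n")
```

Calling save_dmesg("dmesg-e1000-debug.txt") right after reproducing the hang would produce a file that can be attached to this bug. The cmd parameter exists only so the helper can be tried with another command.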
Is this issue still outstanding?
Please reopen this bug if it's still present in kernel 2.6.17.