Bug 5965 - Problems with Intel Pro 1000 (82545GM)
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: i386 Linux
Importance: P2 high
Assignee: Jesse Brandeburg
Reported: 2006-01-26 09:50 UTC by Rodrigo A B Freire
Modified: 2006-07-10 12:58 UTC
Kernel Version: 2.6.15


Attachments
dmesg sysout (18.00 KB, text/plain)
2006-01-26 09:58 UTC, Rodrigo A B Freire
The error sysout (19.44 KB, text/plain)
2006-01-26 10:00 UTC, Rodrigo A B Freire
lspci sysout (1.48 KB, text/plain)
2006-01-26 10:02 UTC, Rodrigo A B Freire
e1000 patch between 2.6.16-rc1 and current (170.50 KB, patch)
2006-01-28 09:18 UTC, Francois Romieu
The error on Linux kernel version 2.6.16-rc1 (11.01 KB, text/plain)
2006-01-30 06:14 UTC, Rodrigo A B Freire
2.6.16-rc5 config (32.66 KB, text/plain)
2006-03-05 19:36 UTC, Jon Thomas Stokkeland
2.6.16-rc5 with e1000-7.0.33 sysinfo (17.17 KB, text/plain)
2006-03-05 21:42 UTC, Jon Thomas Stokkeland
debug patch for tx hang (252 bytes, text/html)
2006-04-21 14:31 UTC, Jesse Brandeburg

Description Rodrigo A B Freire 2006-01-26 09:50:59 UTC
Most recent kernel where this bug did not occur: Unknown. It happened on 2.6.11,
2.6.12, 2.6.13 and 2.6.14; I don't know of a version where this error doesn't
come up.
Distribution: Debian 3.1
Hardware Environment: PC P4 Xeon (see dmesg for further details)
Software Environment: Squid Web Cache
Problem Description: The NIC doesn't come up, and the console starts filling with
"e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" and other errors.

Steps to reproduce: Just after the link is detected, the problem starts to happen.
Comment 1 Rodrigo A B Freire 2006-01-26 09:58:11 UTC
Created attachment 7154 [details]
dmesg sysout
Comment 2 Rodrigo A B Freire 2006-01-26 10:00:09 UTC
Created attachment 7156 [details]
The error sysout
Comment 3 Rodrigo A B Freire 2006-01-26 10:02:19 UTC
Created attachment 7157 [details]
lspci sysout
Comment 4 Francois Romieu 2006-01-28 09:18:47 UTC
Created attachment 7166 [details]
e1000 patch between 2.6.16-rc1 and current

The changelog and netdev suggest that the bug may already be fixed in
post-2.6.16-rc1 kernels.

Can you report how the patch behaves?

-- 
Ueimor
Comment 5 Rodrigo A B Freire 2006-01-30 05:04:02 UTC
Hello Fran
Comment 6 Rodrigo A B Freire 2006-01-30 06:14:31 UTC
Created attachment 7179 [details]
The error on Linux kernel version 2.6.16-rc1

Fran
Comment 7 Francois Romieu 2006-02-02 16:34:26 UTC
Rodrigo:
[...]
> Fran
Comment 8 Rodrigo A B Freire 2006-02-03 07:13:46 UTC
Fran
Comment 9 Rodrigo A B Freire 2006-02-03 10:43:56 UTC
Didn't work with lowmem (and your patch). The error is still happening.

Any suggestions? Besides that, I'm going to change the board.
Comment 10 Jon Thomas Stokkeland 2006-03-05 19:31:51 UTC
The issue still persists on certain hardware in 2.6.16-rc5.
I am not a kernel developer, so I don't know the right steps to nail down the
cause; apologies if my comment is incomplete. Here is my short history with
e1000 and the Intel PRO/1000MT (82540EM):

Host 1 - ok with 2.6.15.4: 
Debian 3.1 Sarge 
VIA P4M266A (Gigabyte 8VM533M-RZ) with Celeron 2.53
It ran Debian kernel 2.6.8, which worked but showed frequent e1000 down/up
events; I didn't really dig into it.
Compiled the official 2.6.15.4 and everything is working great.
lspci:
0000:00:00.0 Host bridge: VIA Technologies, Inc. P4M266 Host Bridge
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP]
0000:00:05.0 Communication controller: Tiger Jet Network Inc. Tiger3XX Modem/ISDN interface
0000:00:06.0 Unknown mass storage controller: Promise Technology, Inc. 20269 (rev 02)
0000:00:07.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)
0000:00:0f.0 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
0000:00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
0000:01:00.0 VGA compatible controller: S3 Inc. VT8375 [ProSavage8 KM266/KL266]


Host 2 - e1000 not ok with 2.6.16-rc5:
Debian 3.1 Sarge
Intel 865G (Gigabyte GA-8IG1000) with Celeron 1.8
Had issues with 2.6.15.4, so I tried 2.6.15.5 and 2.6.16-rc5; the issue persists
just as described in this bug report. The Intel PRO/1000MT adapter is the exact
same series as the one in host 1.
lspci:
0000:00:00.0 Host bridge: Intel Corp. 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
0000:00:02.0 VGA compatible controller: Intel Corp. 82865G Integrated Graphics Device (rev 02)
0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1 (rev 02)
0000:00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #2 (rev 02)
0000:00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02)
0000:00:1d.3 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #4 (rev 02)
0000:00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
0000:00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Bridge (rev 02)
0000:00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) Ultra ATA 100 Storage Controller (rev 02)
0000:00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
0000:01:00.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)
0000:01:02.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
0000:01:03.0 RAID bus controller: 3ware Inc 3ware 7000-series ATA-RAID (rev 01)
0000:01:09.0 Ethernet controller: Marvell Technology Group Ltd. Yukon Gigabit Ethernet 10/100/1000Base-T Adapter (rev 13)

dmesg output when bringing up the Intel card (eth1) and trying a ping:

e1000: eth1: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <0>
  TDT                  <2>
  next_to_use          <2>
  next_to_clean        <0>
buffer_info[next_to_clean]
  time_stamp           <ffff34b1>
  next_to_watch        <0>
  jiffies              <ffff365e>
  next_to_watch.status <0>
e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang

This is repeated several times with new time_stamp and jiffies values and, after
about five repetitions, ends with:
  jiffies              <ffff3c3a>
  next_to_watch.status <0>
NETDEV WATCHDOG: eth1: transmit timed out
e1000: eth1: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
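
For anyone decoding the dump above, here is a minimal Python sketch (not part of the original report) that parses the "name <value>" fields and derives two numbers: how many jiffies the oldest pending buffer has been waiting, and how many descriptors sit between the hardware head (TDH) and the software tail (TDT). It assumes the values are hexadecimal, that jiffies is a 32-bit counter as on i386, and that the Tx ring holds 256 descriptors. The function names and the ring_size default are hypothetical, chosen for illustration only.

```python
import re

# Matches lines such as "  TDH                  <0>".
# Assumption: every value in the dump is printed in hexadecimal.
FIELD_RE = re.compile(r"^\s*(\S.*?)\s+<([0-9a-fA-F]+)>\s*$")

# First hang report from the dmesg output quoted above.
SAMPLE = """\
e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <0>
  TDT                  <2>
  next_to_use          <2>
  next_to_clean        <0>
buffer_info[next_to_clean]
  time_stamp           <ffff34b1>
  next_to_watch        <0>
  jiffies              <ffff365e>
  next_to_watch.status <0>
"""


def parse_hang_dump(text):
    """Return {field name: int}, keeping the first value seen for each field."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields.setdefault(match.group(1), int(match.group(2), 16))
    return fields


def summarize(fields, ring_size=256):
    """Derive stall time and queue depth from parsed fields.

    ring_size is an assumed value; adjust it to the ring size actually in use.
    """
    # Mask to 32 bits so the subtraction survives a jiffies wraparound.
    stalled = (fields["jiffies"] - fields["time_stamp"]) & 0xFFFFFFFF
    # Descriptors handed to the hardware that it has not fetched yet.
    pending = (fields["TDT"] - fields["TDH"]) % ring_size
    return {"stalled_jiffies": stalled, "pending_descriptors": pending}


fields = parse_hang_dump(SAMPLE)
summary = summarize(fields)
print(summary)
```

For the first report this gives 429 stalled jiffies and 2 pending descriptors: TDH is still 0 while TDT is 2, so the hardware has not consumed either queued descriptor. Converting jiffies to seconds depends on the kernel's HZ setting, which is not shown in this report.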

Comment 11 Jon Thomas Stokkeland 2006-03-05 19:36:09 UTC
Created attachment 7510 [details]
2.6.16-rc5 config
Comment 12 Jon Thomas Stokkeland 2006-03-05 21:42:09 UTC
Created attachment 7511 [details]
2.6.16-rc5 with e1000-7.0.33 sysinfo

I forgot to mention that I also tried the latest driver from SourceForge, 7.0.33,
with the 2.6.16-rc5 kernel; it made no difference.
I have attached some more system info. I can do more testing or get info on both
my systems if you like; it just may take a day or two on the one that is OK
(host 1), since it's my backup storage box.
Comment 13 Jon Thomas Stokkeland 2006-03-06 20:52:10 UTC
I've been testing all kinds of things with no luck. I found a Zonet Zen3300 card
based on the RTL8169 chip, and it behaves similarly. So I am starting to think I
did something idiotic in my kernel config (attachment 7510)? But if I didn't,
then what? The 10/100 RTL8139 card works fine, and the onboard Marvell/Yukon
gigabit NIC works flawlessly as well.
Please let me know if I should provide more info or try some tests/patches.
Comment 14 Rodrigo A B Freire 2006-03-07 02:13:36 UTC
Jon,

FYI, a UTP Broadcom Tigon3 (tg3) card works flawlessly on this hardware.
BTW, I tried swapping the Intel card for a spare (in case it was a hardware
failure on that specific board), but the problem persisted.
Comment 15 Jesse Brandeburg 2006-04-21 14:31:26 UTC
Created attachment 7932 [details]
debug patch for tx hang

You can apply this patch to 7.0.33 and send the output from dmesg. Please don't
send it from syslog (aka /var/log/messages); it gets corrupted because this
debug patch prints too much info.
Comment 16 Jesse Brandeburg 2006-06-16 11:42:34 UTC
Is this issue still outstanding?
Comment 17 Adrian Bunk 2006-07-10 12:58:25 UTC
Please reopen this bug if it's still present in kernel 2.6.17.
