Bug 13568 - Intel e1000 4-port NIC - unable to communicate, slowly blinking
Summary: Intel e1000 4-port NIC - unable to communicate, slowly blinking
Status: CLOSED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 high
Assignee: drivers_network@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-18 13:45 UTC by Peter
Modified: 2012-06-08 12:06 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.30
Subsystem:
Regression: No
Bisected commit-id:


Attachments
2.6.30 dmesg (37.12 KB, application/octet-stream)
2009-06-18 13:53 UTC, Peter
ifconfig output (2.60 KB, text/x-log)
2009-06-18 13:54 UTC, Peter
ethtool output (431 bytes, application/octet-stream)
2009-08-13 07:02 UTC, Peter
lspci output (4.60 KB, application/x-bzip)
2009-08-13 07:03 UTC, Peter
lspci after nomsi output (4.40 KB, application/x-bzip)
2009-08-13 07:03 UTC, Peter
tcpdump output (281 bytes, application/x-bzip)
2009-08-13 07:04 UTC, Peter
tcpdump output after nomsi (265 bytes, application/x-bzip)
2009-08-13 07:05 UTC, Peter
kernel.log for nomsi bootup (13.11 KB, application/x-bzip)
2009-08-13 07:58 UTC, Peter
ethtool -e eth2 (the first Intel NIC) (28.53 KB, text/plain)
2009-11-25 11:28 UTC, Peter

Description Peter 2009-06-18 13:45:36 UTC
Hello,

I have a Tyan GT20 (B2865G20S4H) barebone server with onboard GbE NICs (Broadcom BCM5721 and Marvell 88E1111-CAA PHY).

I have installed a PCI-E Intel® PRO/1000 PT Quad Port Server Adapter (EXPI9404PTBLK) to get 6 ports altogether. This is the card:

http://www.intel.com/products/server/adapters/pro1000pt-quadport/pro1000pt-quadport-overview.htm

With Debian 5.0 Lenny (distribution kernel 2.6.26), only the onboard NICs work. The Intel card is also recognised by the kernel and is assigned interfaces (eth2-5), but none of them works when connected to the network. The NIC notices when a cable is plugged in or unplugged (a message appears on the console), but no communication ever takes place, and the LED of the connected port just blinks slowly.

After some hard-to-reproduce fiddling (plugging, unplugging, configuring interfaces, resetting, etc.), it even worked for a few seconds, and then died again for good.

I have also compiled a fresh 2.6.30 kernel, with no improvement. Interestingly, ifconfig always reports a huge number of errors for the port.
Comment 1 Peter 2009-06-18 13:53:28 UTC
Created attachment 21986
2.6.30 dmesg
Comment 2 Peter 2009-06-18 13:54:38 UTC
Created attachment 21989
ifconfig output

Here is the ifconfig output. Please note the bogus statistics on eth2 (after a few ping attempts).
Comment 3 Andrew Morton 2009-06-29 23:00:08 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

Comment 4 Jesse Brandeburg 2009-06-29 23:32:28 UTC
On Mon, 29 Jun 2009, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Thu, 18 Jun 2009 13:45:38 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=13568
> > 
> >            Summary: Intel e1000 4-port NIC - unable to communicate, slowly
> >                     blinking
> >            Product: Drivers
> >            Version: 2.5
> >     Kernel Version: 2.6.30
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Network
> >         AssignedTo: drivers_network@kernel-bugs.osdl.org
> >         ReportedBy: tuharsky@misbb.sk
> >         Regression: No
> > 
> > 
> > Hello,
> > 
> > I have a Tyan GT20 (B2865G20S4H) barebone server with onboard GbE NICs
> > (Broadcom BCM5721 and Marvell 88E1111-CAA PHY).
> > 
> > I have installed a PCI-E Intel® PRO/1000 PT Quad Port Server Adapter
> > (EXPI9404PTBLK) to get 6 ports altogether. This is the card:
> > 
> > http://www.intel.com/products/server/adapters/pro1000pt-quadport/pro1000pt-quadport-overview.htm
> > 
> > With Debian 5.0 Lenny (distribution kernel 2.6.26), only the onboard
> > NICs work. The Intel card is also recognised by the kernel and is assigned
> > interfaces (eth2-5), but none of them works when connected to the network.
> > The NIC notices when a cable is plugged in or unplugged (a message appears
> > on the console),

that part is a good sign, probably indicating interrupts are working.

> > but no communication ever takes place, and the LED of the connected port
> > just blinks slowly.
> > 
> > After some hard-to-reproduce fiddling (plugging, unplugging, configuring
> > interfaces, resetting, etc.), it even worked for a few seconds, and then
> > died again for good.
> > 
> > I have also compiled a fresh 2.6.30 kernel, with no improvement.
> > Interestingly, ifconfig always reports a huge number of errors for the port.

errors even before traffic?

the ifconfig output in the bug shows lots of rx_errors, 
indicating the counter went negative.

I wonder if you're having some kind of power issue.
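
For readers following along: a count that "went negative" typically shows up as an enormous positive number once it is printed as unsigned. A minimal sketch of that arithmetic, assuming a 32-bit counter (the width is an assumption, `as_u32` is a hypothetical helper, and the inputs are illustrative rather than taken from the attached ifconfig output):

```shell
#!/bin/sh
# Reinterpret a signed value as an unsigned 32-bit counter.
# ASSUMPTION: 32-bit width; the real counter width depends on kernel and arch.
as_u32() {
    echo $(( $1 & 0xFFFFFFFF ))
}

as_u32 -1       # one step below zero
as_u32 -1000    # a thousand steps below zero
as_u32 5        # small positive values are unchanged
```

An error count within a few thousand of 4294967296 (2^32) is therefore more likely a wrapped counter than a real tally of errors.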

can you do lspci -vvv after ifconfig shows errors?  also do 
ethtool -S eth2

do you get the arp/ping requests at the remote end? run tcpdump on the 
remote machine and see if you get the packets.

you might want to (for debugging) try booting with pci=nomsi kernel option 
to disable MSI and see if that is related.

does tcpdump -i eth2 show any packets coming in?

there is a tool at e1000.sourceforge.net called ethregs that you can 
download/build and run, I would appreciate the output of that as well 
(probably gzipped)

Jesse
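
The outputs requested above can be gathered in one pass. The following is only a sketch: `collect_diag`, its two parameters, the output file names, and the capture limits (20 packets or 10 seconds) are illustrative choices, not something from this report. ethregs is left out because it has to be downloaded and built separately. Most of these commands need root.

```shell
#!/bin/sh
# Sketch: save the requested diagnostics for one interface into a directory.
# collect_diag IFACE OUTDIR   (both parameters are hypothetical names)
collect_diag() {
    iface=$1
    outdir=$2
    mkdir -p "$outdir" || return 1

    # run NAME CMD...: store stdout and stderr of CMD in $outdir/NAME.txt.
    # A missing tool or a failing command is recorded, never fatal.
    run() {
        name=$1
        shift
        if command -v "$1" >/dev/null 2>&1; then
            "$@" >"$outdir/$name.txt" 2>&1 || true
        else
            echo "$1: not installed" >"$outdir/$name.txt"
        fi
    }

    run ifconfig  ifconfig "$iface"
    run lspci     lspci -vvv
    run ethtool-S ethtool -S "$iface"
    # Stop after 20 packets or 10 seconds, whichever comes first.
    run tcpdump   timeout 10 tcpdump -n -i "$iface" -c 20
}

# Example: collect_diag eth2 /tmp/e1000-diag
```

Running it once on a normal boot and once after booting with pci=nomsi, into two different directories, gives two sets of files that can be compared with diff.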
Comment 5 Peter 2009-08-13 07:02:32 UTC
Created attachment 22697
ethtool output

ethtool -S eth2
Comment 6 Peter 2009-08-13 07:03:13 UTC
Created attachment 22698
lspci output

lspci -vvv
(after errors)
Comment 7 Peter 2009-08-13 07:03:53 UTC
Created attachment 22699
lspci after nomsi output

Output from lspci -vvv
(after nomsi flag)
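
Before comparing the two lspci outputs, it can help to confirm that the option really reached the kernel. A small sketch, assuming the option is read from /proc/cmdline; `has_nomsi` is a hypothetical helper written for this note:

```shell
#!/bin/sh
# Return 0 if the given kernel command line contains the pci=nomsi option.
# pci= accepts a comma-separated list, so "pci=noaer,nomsi" also counts.
has_nomsi() {
    for arg in $1; do
        case "$arg" in
            pci=*)
                case ",${arg#pci=}," in
                    *,nomsi,*) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}

if [ -r /proc/cmdline ]; then
    if has_nomsi "$(cat /proc/cmdline)"; then
        echo "pci=nomsi is on the kernel command line"
    else
        echo "pci=nomsi not found on the kernel command line"
    fi
fi
```

As a cross-check, with MSI disabled the NIC's lines in /proc/interrupts should no longer be labelled PCI-MSI.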
Comment 8 Peter 2009-08-13 07:04:13 UTC
Created attachment 22700
tcpdump output
Comment 9 Peter 2009-08-13 07:05:05 UTC
Created attachment 22701
tcpdump output after nomsi

These two tcpdump captures show incoming packets on the affected machine (tcpdump -i eth2).
Comment 10 Peter 2009-08-13 07:07:22 UTC
On the remote machine, tcpdump showed something like this for arping:
arp who-has router.misbb.sk tell 10.2.2.49
arp reply router.misbb.sk is-at 00:e0:81:fb:bc:9a (oni-Unknown)

So it seems ARP does go out from the buggy machine.

However, ping does nothing (Destination Host Unknown).
Comment 11 Peter 2009-08-13 07:07:45 UTC
I'll post syslog from nomsi bootup soon...
Comment 12 Peter 2009-08-13 07:58:57 UTC
Created attachment 22702
kernel.log for nomsi bootup
Comment 13 Jesse Brandeburg 2009-11-23 19:00:49 UTC
please include ethtool -e ethX output from first e1000e interface as well
Comment 14 Peter 2009-11-25 11:13:55 UTC
For the record, I have set up a spare system with remote access for Jesse, so that he can debug it live.
Comment 15 Peter 2009-11-25 11:28:41 UTC
Created attachment 23931
ethtool -e eth2 (the first Intel NIC)

This is the output of ethtool -e under the Debian Lenny kernel.
Comment 16 Alan 2012-06-08 12:05:56 UTC
Closing as obsolete. If this is incorrect, please re-open the bug and update the kernel version.
