Bug 15944

Summary: e1000e fails probe, RHEL 5 works
Product: Drivers
Component: Network
Reporter: Pete Zaitcev (zaitcev)
Assignee: drivers_network (drivers_network)
Status: RESOLVED OBSOLETE
Severity: normal
Priority: P1
CC: akpm, alan, tony
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  full dmesg of fail in 2.6.34-rc5
  patch that fixes the issue (diffstat 403+,371-)
  upcoming patch that works

Description Pete Zaitcev 2010-05-08 22:30:12 UTC
The e1000e driver in 2.6.34-rc5 fails to probe an interface, dmesg:

e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
e1000e: Copyright (c) 1999 - 2009 Intel Corporation.
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: setting latency timer to 64
e1000e 0000:00:19.0: irq 34 for MSI/MSI-X
0000:00:19.0: 0000:00:19.0: MDI Error
e1000e 0000:00:19.0: PCI INT A disabled
e1000e: probe of 0000:00:19.0 failed with error -2

"MDI Error" means that an error bit is set in PHY, so I thought the
motherboard was dead. But then I found that RHEL 5 has a driver that
works (does both probe and subsequent traffic). So while the hardware
may be broken, a workaround is possible.
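
For readers unfamiliar with the message: as far as I can tell, the driver prints "MDI Error" when the MDI Control (MDIC) register comes back with its error bit set after a PHY register access, and the -2 in the probe failure matches the driver's PHY error code. The following is a minimal sketch of that check, not the driver's code. The constant values are written from memory of the e1000e headers and the helper names mdic_read_cmd and mdic_decode are hypothetical, so verify everything against the kernel tree you are debugging.

```c
#include <stdint.h>

/* MDIC bit layout; values recalled from the e1000e headers (assumption). */
#define MDIC_DATA_MASK 0x0000FFFFu
#define MDIC_REG_SHIFT 16
#define MDIC_PHY_SHIFT 21
#define MDIC_OP_READ   0x08000000u
#define MDIC_READY     0x10000000u
#define MDIC_ERROR     0x40000000u
#define ERR_PHY        2  /* assumed to mirror E1000_ERR_PHY, hence "error -2" */

/* Build the command word that asks for PHY register `reg` at address `phy`. */
static uint32_t mdic_read_cmd(uint32_t phy, uint32_t reg)
{
    return (reg << MDIC_REG_SHIFT) | (phy << MDIC_PHY_SHIFT) | MDIC_OP_READ;
}

/* Interpret an MDIC value read back after polling.
 * Returns 0 and stores the 16-bit payload on success, -ERR_PHY otherwise. */
static int mdic_decode(uint32_t mdic, uint16_t *data)
{
    if (!(mdic & MDIC_READY))
        return -ERR_PHY;  /* the access never completed */
    if (mdic & MDIC_ERROR)
        return -ERR_PHY;  /* the case logged as "MDI Error" */
    *data = (uint16_t)(mdic & MDIC_DATA_MASK);
    return 0;
}
```

In this sketch a timeout and an error bit return the same code, so the log message is the only way to tell which of the two happened.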
Comment 1 Pete Zaitcev 2010-05-08 22:32:16 UTC
Created attachment 26289
full dmesg of fail in 2.6.34-rc5
Comment 2 Pete Zaitcev 2010-05-09 04:46:05 UTC
Created attachment 26293
patch that fixes the issue (diffstat 403+,371-)

I diffed 2.6.34-rc5 with RHEL 5 and this is the first cut. Since I have
no clue what is actually going on, the patch has a lot of unrelated noise.
Comment 3 Pete Zaitcev 2010-05-09 07:10:18 UTC
It all starts with e1000e_get_phy_id failing (tried in both modes).
Then, a complex chain of defaults and workarounds brings up the link
with some possibly unrelated parameters.
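
For context on what e1000e_get_phy_id is after: the PHY identifier is normally assembled from the two standard MII identifier registers (2 and 3), with the low four bits of the second one treated as the revision. The sketch below shows that combination only. The struct, the helper name phy_ident_from_regs, and the mask value are my assumptions about how the driver does it, and the sample register values in the usage note are made up.

```c
#include <stdint.h>

#define PHY_ID1           0x02         /* MII PHY identifier register 1 */
#define PHY_ID2           0x03         /* MII PHY identifier register 2 */
#define PHY_REVISION_MASK 0xFFFFFFF0u  /* assumed: low nibble is the revision */

struct phy_ident {
    uint32_t id;        /* identifier with the revision bits cleared */
    uint32_t revision;  /* low four bits of register PHY_ID2 */
};

/* Combine the raw contents of registers PHY_ID1 and PHY_ID2. */
static struct phy_ident phy_ident_from_regs(uint16_t id1, uint16_t id2)
{
    struct phy_ident p;

    p.id = ((uint32_t)id1 << 16) | ((uint32_t)id2 & PHY_REVISION_MASK);
    p.revision = (uint32_t)id2 & ~PHY_REVISION_MASK;
    return p;
}
```

With made-up register contents 0x0154 and 0x00A1, this gives an id of 0x015400A0 and a revision of 1. If either register read fails with an MDI error, there is nothing to combine, and the lookup cannot succeed.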
Comment 4 Andrew Morton 2010-05-10 22:33:36 UTC
Pete, the netdev guys tend to avoid bugzilla.  Please email this as a regular old patch to the e1000 developers and cc netdev@vger.kernel.org.

IIRC there have been regular bunfights over the e1000 guys' liking to disable perfectly workable NICs.  They like it, but David (and I) don't.  I suggest you cc davem too ;)
Comment 5 Pete Zaitcev 2010-05-11 00:53:25 UTC
This should do it, hopefully:
 http://marc.info/?t=127353902700007&r=1&w=3
Forgot to cc: DaveM though.
Comment 6 Andrew Morton 2010-05-11 01:13:13 UTC
You didn't cc the e1000e developers either ;(  Might need a resend later on if nothing happens.
Comment 7 Pete Zaitcev 2010-05-11 03:51:49 UTC
Created attachment 26328
upcoming patch that works
Comment 8 Anthony Awtrey 2010-10-21 17:32:06 UTC
Debian Squeeze uses the 2.6.32 kernel. This issue stops network installs on CF-19 mk4 hardware (8086:10ea Intel Corp 82577LM). When is this fix going in? Or has it already gone in, but the bug is still open?
Comment 9 Anthony Awtrey 2010-10-21 21:45:38 UTC
Never mind, I answered my own question by looking in the source. This is not my issue.
Comment 10 Anthony Awtrey 2010-10-21 21:52:08 UTC
Crap, I was on the wrong box... yes this is an issue for older kernels like the Debian install kernel.