Bug 14748

Summary: e1000e NIC not working after reboot
Product: Drivers Reporter: Maciek Sitarz (macieks)
Component: Network Assignee: Tushar (tushar.n.dave)
Status: CLOSED CODE_FIX    
Severity: normal CC: bruce.w.allan, florian, jbrandeb, rjw, tushar.n.dave
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.36 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 14230    
Attachments: lspci -vvv output NIC working properly
ethregs output NIC working properly
lspci -vvv output NIC NOT working
ethregs output NIC NOT working properly
control mdi-x mode
Logs for test scenario described in comment #13
adds debug output
Kernel logs with debug output and ethregs

Description Maciek Sitarz 2009-12-06 13:04:18 UTC
When I power up my system the NIC is working properly.
After every reboot the NIC is not working. I mean the eth0 is created, but neither dhcpcd gets IP nor static setup helps.
ifconfig eth0 shows zero packets on Rx and Tx (no errors, overruns, etc.)

logs after modprobing e1000e (NIC working OK):
Dec  6 12:29:28 mcxR kernel: e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
Dec  6 12:29:28 mcxR kernel: e1000e: Copyright (c) 1999-2008 Intel Corporation.
Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: setting latency timer to 64
Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
Dec  6 12:29:28 mcxR kernel: 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:15:58:cc:0f:35
Dec  6 12:29:28 mcxR kernel: 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
Dec  6 12:29:28 mcxR kernel: 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
Dec  6 12:29:30 mcxR kernel: e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX


logs after rebooting system and modprobing e1000e (NIC not working):
Dec  6 11:57:46 mcxR kernel: e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
Dec  6 11:57:46 mcxR kernel: e1000e: Copyright (c) 1999-2008 Intel Corporation.
Dec  6 11:57:46 mcxR kernel: e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
Dec  6 11:57:46 mcxR kernel: e1000e 0000:00:19.0: setting latency timer to 64
Dec  6 11:57:46 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
Dec  6 11:57:46 mcxR kernel: 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:15:58:cc:0f:35
Dec  6 11:57:46 mcxR kernel: 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
Dec  6 11:57:46 mcxR kernel: 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No: ffffff-0ff
Dec  6 11:57:48 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
Dec  6 11:57:48 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X


Additional info:
Software:
 - distro: Arch Linux
 - kernel version: 2.6.32
 - e1000e version: 1.0.2-k2  


Hardware:
 - notebook: Lenovo ThinkPad R61
 - network card: Intel Gigabit

# lspci -v
00:19.0 Ethernet controller: Intel Corporation 82566MC Gigabit Network Connection (rev 03)
        Subsystem: Lenovo Device 20ba
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at fe200000 (32-bit, non-prefetchable) [size=128K]
        Memory at fe224000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at 1800 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Kernel modules: e1000e

$ uname -a
Linux mcxR 2.6.32-ARCH #7 SMP PREEMPT Fri Dec 4 15:39:16 CET 2009 x86_64 Intel(R) Core(TM)2 Duo CPU T7100 @ 1.80GHz GenuineIntel GNU/Linux
Comment 1 Andrew Morton 2009-12-07 21:50:32 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sun, 6 Dec 2009 13:04:20 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=14748
> 
>            Summary: e1000e NIC not working after reboot
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.32
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network@kernel-bugs.osdl.org
>         ReportedBy: macieks@freesco.pl
>         Regression: Yes
> 
> 
> When I power up my system the NIC is working properly.
> After every reboot the NIC is not working. I mean the eth0 is created, but
> neither dhcpcd gets IP nor static setup helps.
> ifconfig eth0 shows zero packets on Rx and Tx (no errors, overruns, etc.)
> 
> logs after modprobing e1000e (NIC working OK):
> Dec  6 12:29:28 mcxR kernel: e1000e: Intel(R) PRO/1000 Network Driver -
> 1.0.2-k2
> Dec  6 12:29:28 mcxR kernel: e1000e: Copyright (c) 1999-2008 Intel
> Corporation.
> Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level,
> low) -> IRQ 20
> Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: setting latency timer to 64
> Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> Dec  6 12:29:28 mcxR kernel: 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width
> x1)
> 00:15:58:cc:0f:35
> Dec  6 12:29:28 mcxR kernel: 0000:00:19.0: eth0: Intel(R) PRO/1000 Network
> Connection
> Dec  6 12:29:28 mcxR kernel: 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No:
> ffffff-0ff
> Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> Dec  6 12:29:28 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> Dec  6 12:29:30 mcxR kernel: e1000e: eth0 NIC Link is Up 100 Mbps Full
> Duplex,
> Flow Control: RX/TX
> 
> 
> logs after rebooting system and modprobing e1000e (NIC not working):
> Dec  6 11:57:46 mcxR kernel: e1000e: Intel(R) PRO/1000 Network Driver -
> 1.0.2-k2
> Dec  6 11:57:46 mcxR kernel: e1000e: Copyright (c) 1999-2008 Intel
> Corporation.
> Dec  6 11:57:46 mcxR kernel: e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level,
> low) -> IRQ 20
> Dec  6 11:57:46 mcxR kernel: e1000e 0000:00:19.0: setting latency timer to 64
> Dec  6 11:57:46 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> Dec  6 11:57:46 mcxR kernel: 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width
> x1)
> 00:15:58:cc:0f:35
> Dec  6 11:57:46 mcxR kernel: 0000:00:19.0: eth0: Intel(R) PRO/1000 Network
> Connection
> Dec  6 11:57:46 mcxR kernel: 0000:00:19.0: eth0: MAC: 6, PHY: 6, PBA No:
> ffffff-0ff
> Dec  6 11:57:48 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> Dec  6 11:57:48 mcxR kernel: e1000e 0000:00:19.0: irq 30 for MSI/MSI-X
> 
> 
> Additional info:
> Software:
>  - distro: Arch Linux
>  - kernel version: 2.6.32
>  - e1000e version: 1.0.2-k2  
> 
> 
> Hardware:
>  - notebook: Lenovo ThinkPad R61
>  - network card: Intel Gigabit
> 
> # lspci -v
> 00:19.0 Ethernet controller: Intel Corporation 82566MC Gigabit Network
> Connection (rev 03)
>         Subsystem: Lenovo Device 20ba
>         Flags: bus master, fast devsel, latency 0, IRQ 11
>         Memory at fe200000 (32-bit, non-prefetchable) [size=128K]
>         Memory at fe224000 (32-bit, non-prefetchable) [size=4K]
>         I/O ports at 1800 [size=32]
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
>         Kernel modules: e1000e
> 
> $ uname -a
> Linux mcxR 2.6.32-ARCH #7 SMP PREEMPT Fri Dec 4 15:39:16 CET 2009 x86_64
> Intel(R) Core(TM)2 Duo CPU T7100 @ 1.80GHz GenuineIntel GNU/Linux
> 

Thanks.  You don't mention which previous kernel version worked OK. 
Was it 2.6.31?
Comment 2 Jesse Brandeburg 2009-12-07 22:01:23 UTC
On Mon, 7 Dec 2009, Andrew Morton wrote:
> > When I power up my system the NIC is working properly.
> > After every reboot the NIC is not working. I mean the eth0 is created, but
> > neither dhcpcd gets IP nor static setup helps

We have a userspace tool called ethregs downloadable from 
http://downloads.sourceforge.net/project/e1000/Register%20Dump%20Tool/1.7.2/ethregs-1.7.2.tar.gz?use_mirror=iweb

If it is not too much trouble, can you build this tool and run it before 
(when the port is working) and after (when the link didn't come up)?

You can attach the outputs to the bug; replying to this thread would be best.

Also, please include the output of lspci -vvv after the failure.

Thanks,
  Jesse
Comment 3 Maciek Sitarz 2009-12-07 23:17:26 UTC
Created attachment 24082 [details]
lspci -vvv output NIC working properly
Comment 4 Maciek Sitarz 2009-12-07 23:18:45 UTC
Created attachment 24083 [details]
ethregs output NIC working properly
Comment 5 Maciek Sitarz 2009-12-07 23:20:16 UTC
Created attachment 24084 [details]
lspci -vvv output NIC NOT working
Comment 6 Maciek Sitarz 2009-12-07 23:21:14 UTC
Created attachment 24085 [details]
ethregs output NIC NOT working properly
Comment 7 Maciek Sitarz 2009-12-07 23:26:54 UTC
lspci -vvv and ethregs outputs attached to Bugzilla.
I checked kernel 2.6.31.6 and the problem exists there also.

PS. I had a problem building ethregs:

gcc -Wall -W -Wno-parentheses -Wstrict-prototypes -Wmissing-prototypes 
-Winline -DEXTERNAL_RELEASE -g -O2   -o ethregs  ethregs.o 8254x.o 
8257x.o ichlan.o 82575.o 82576.o 80003es2lan.o 82598.o 82599.o  -lpci -lz
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.4.2/../../../../lib/libpci.a(names-net.o): 
In function `pci_id_net_lookup':
(.text+0x138): undefined reference to `__res_query'
collect2: ld returned 1 exit status

I built it on another system and used it on mine.

Best regards
Comment 8 Rafael J. Wysocki 2010-01-04 20:05:17 UTC
On Monday 04 January 2010, Maciej Sitarz wrote:
> On 29.12.2009 16:28, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> > be listed and let me know (either way).
> 
> I confirm. The problem still exists.
Comment 9 Anonymous Emailer 2010-01-27 00:41:03 UTC
Reply-To: jesse.brandeburg@gmail.com

On Mon, Dec 7, 2009 at 2:01 PM, Brandeburg, Jesse
<jesse.brandeburg@intel.com> wrote:
> On Mon, 7 Dec 2009, Andrew Morton wrote:
>> > When I power up my system the NIC is working properly.
>> > After every reboot the NIC is not working. I mean the eth0 is created, but
>> > neither dhcpcd gets IP nor static setup helps
>
> We have a userspace tool called ethregs downloadable from
>
> http://downloads.sourceforge.net/project/e1000/Register%20Dump%20Tool/1.7.2/ethregs-1.7.2.tar.gz?use_mirror=iweb
>
> if it is not too much trouble can you build this tool and run it before
> (when the port is working) and after (when the link didn't come up)
>
> you can attach them to the bug, and reply to this thread would be best.

I've looked at the ethregs dumps. The good news is that the hardware
appears to self-init successfully, but for ethregs-fails.txt, did you
load the driver?  It appears you did not, or at least didn't do
# ip link set eth0 up
# ethregs > regs.txt

I also looked at the lspci -vvv information. In both cases MSI was
enabled, but in the failing case the value in the data field for the MSI
vector is different, which seems a little strange, but I'm not sure
if it is responsible for the failure.

If the driver was loaded and DHCP failed, what happens when you run
ethtool -t eth0 offline?

When the driver is loaded and DHCP fails, can you assign an
address manually (and bring the interface up) and have it work?

One more thing: can you send the output of cat /proc/interrupts taken
10 seconds apart while the driver is loaded and the port is UP, but not
working?  dhcpcd and dhclient both have a tendency to put the port DOWN
after they fail to get an address, so that's why you may need to run the
# ip link command above before gathering /proc/interrupts.

Is your BIOS up to date?

Thanks, and sorry for the delay. Let's see if we can figure out what is up.

Jesse
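
The /proc/interrupts comparison requested above could be sketched roughly as follows. This is only an illustration, not a tool from the thread: the function names irq_total and irq_delta are made up here, and it assumes the usual /proc/interrupts layout in which the per-CPU counters directly follow the IRQ label.

```shell
# Sum the per-CPU interrupt counters on every line of a saved
# /proc/interrupts snapshot that mentions the given name (e.g. "eth0").
# irq_total is a hypothetical helper name, not an existing utility.
irq_total() {
    # $1 = snapshot file, $2 = device name to look for
    awk -v name="$2" '
        index($0, name) {
            # Field 1 is the IRQ label ("30:"); the numeric fields that
            # follow are the per-CPU counters. Stop at the first
            # non-numeric field (the interrupt type).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        }
        END { print total + 0 }
    ' "$1"
}

# Print how many interrupts arrived between two snapshots.
irq_delta() {
    # $1 = earlier snapshot, $2 = later snapshot, $3 = device name
    before=$(irq_total "$1" "$3")
    after=$(irq_total "$2" "$3")
    echo $((after - before))
}
```

For example, after "cat /proc/interrupts > t0; sleep 10; cat /proc/interrupts > t1" with the interface up, "irq_delta t0 t1 eth0" prints the number of interrupts seen in those 10 seconds. A result of 0 would suggest that no interrupts are reaching the driver, though that alone does not identify the cause.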
Comment 10 Rafael J. Wysocki 2010-01-27 00:52:09 UTC
On Tuesday 26 January 2010, Maciej Sitarz wrote:
> On 24.01.2010 23:22, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> > be listed and let me know (either way).
> 
> The problem still exists. I'm using kernel version 2.6.32.6 now.
> 
> I have one more observation:
> After the reboot, when the NIC is not working, both LEDs are on. They are not 
> blinking; they stay lit all the time, even if I remove the plug.
Comment 11 Jesse Brandeburg 2010-01-27 23:48:53 UTC
Created attachment 24750 [details]
control mdi-x mode

Can you try the attached patch with the module parameter set in /etc/modprobe.d/e1000e.conf?

# also try value 2
options e1000e mdix=1

<apply patch>
make M=drivers/net/e1000e modules modules_install
rmmod e1000e; modprobe e1000e mdix=1

This is just a shot in the dark: I made this patch for another issue and thought we should test it here too, just in case.
Comment 12 Henrique de Moraes Holschuh 2010-01-29 22:25:33 UTC
Some data Maciej provided in LKML:

---
The problem still exists. I'm using kernel version 2.6.32.6 now.

I have one more observation:
After the reboot, when the NIC is not working, both LEDs are on. They are not
blinking; they stay lit all the time, even if I remove the plug.
---
Comment 13 Maciek Sitarz 2010-01-30 00:55:19 UTC
I tried to test all the cases and provide all logs.

(In reply to comment #9)
> Reply-To: jesse.brandeburg@gmail.com
> 
> On Mon, Dec 7, 2009 at 2:01 PM, Brandeburg, Jesse
> <jesse.brandeburg@intel.com> wrote:
> > On Mon, 7 Dec 2009, Andrew Morton wrote:
> >> > When I power up my system the NIC is working properly.
> >> > After every reboot the NIC is not working. I mean the eth0 is created, but
> >> > neither dhcpcd gets IP nor static setup helps
> >
> > We have a userspace tool called ethregs downloadable from
> >
> http://downloads.sourceforge.net/project/e1000/Register%20Dump%20Tool/1.7.2/ethregs-1.7.2.tar.gz?use_mirror=iweb
> >
> > if it is not too much trouble can you build this tool and run it before
> > (when the port is working) and after (when the link didn't come up)
> >
> > you can attach them to the bug, and reply to this thread would be best.
> 
> I've looked at the ethregs dumps, the good news is it looks like the
> hardware succeeds to self-init, but on the ethregs-fails.txt did you
> load the driver?  it appears you did not, or at least didn't do
> # ip link set eth0 up
> # ethregs > regs.txt
> 
> also looked at the lspci -vvv information and in both cases MSI was
> enabled, but in the fails case the value in the data field for the MSI
> vector is different, which seems a little strange but I'm not sure
> if it is responsible for failure
> 
> if the driver was loaded, and failed dhcp, what happens when you run
> ethtool -t eth0 offline?
> 
> when the driver is loaded, and the dhcp fails, can you assign an
> address manually (and bring the interface up) and have it work?
> 
> one more thing to note please, can you send cat /proc/interrupts from
> 10 seconds apart when the driver is loaded and the port is UP, but not
> working.  dhcpcd or dhclient both have a tendency to put the port DOWN
> after they fail to get address, so that's why you may need to do # ip
> link command above before gathering /proc/interrupts.

I did some tests you proposed. I described all the scenarios below and attached the gathered logs.

> is your bios up to date?

I updated the BIOS today and tested, but the problem remained.

> Thanks, sorry for the delay, lets see if we can figure out what is up.

No problem, I tried not to reboot too often :)


Test scenarios:
  STATUS: Working
	Description: I shut down the computer and started it, then I did the steps
		before loading module: green LED shines all the time, orange LED blinks
		after loading module : green LED shines all the time, orange LED blinks

    mkdir working
    modprobe e1000e
    dhcpcd eth0  # OK
    ip link set eth0 up
    sleep 10 && cat /proc/interrupts > working/interrupts.log
    lspci -vvv > working/lspci_vvv.log
    ./ethregs > working/ethregs.log

  STATUS: Not working (module not loaded)
	Description: I rebooted the computer and then I did the steps
		before loading module: green LED shines all the time, orange LED blinks

    mkdir not_working
    sleep 10 && cat /proc/interrupts > not_working/interrupts.log
    lspci -vvv > not_working/lspci_vvv.log
    ./ethregs > not_working/ethregs.log

  STATUS: Not working (module loaded)
	Description: After the previous test I did the steps below
		before loading module: green LED shines all the time, orange LED blinks
		after loading module : green LED shines all the time, orange LED shines all the time (!) KEYBOARD HAS LAG once every few seconds!!!

    mkdir not_working_module_loaded
    modprobe e1000e
    dhcpcd eth0                     # Times out waiting for carrier
    ip link set eth0 up
    sleep 10 && cat /proc/interrupts > not_working_module_loaded/interrupts.log
    lspci -vvv > not_working_module_loaded/lspci_vvv.log
    ./ethregs > not_working_module_loaded/ethregs.log

  STATUS: Not working (module loaded and then unloaded )
	Description: After the previous test I just removed the module
		before loading module: green LED shines all the time, orange LED blinks
		after loading module : green LED shines all the time, orange LED shines all the time (!) KEYBOARD HAS LAG once every few seconds!!!
		after removing module : Keyboard is working fine again! green and orange LED shine

    mkdir not_working_module_unloaded
    modprobe -r e1000e
    sleep 10 && cat /proc/interrupts > not_working_module_unloaded/interrupts.log
    lspci -vvv > not_working_module_unloaded/lspci_vvv.log
    ./ethregs > not_working_module_unloaded/ethregs.log



  STATUS: Patched module mdix=1 WORKS
	Description: I shut down the system, started it, loaded the new module, then I did the steps
		before loading module: green LED shines all the time, orange LED blinks
		after loading module : green LED shines all the time, orange LED blinks

    mkdir patched_module_mdix1_loaded
    modprobe e1000e mdix=1
    dhcpcd eth0               # OK
    ip link set eth0 up
    sleep 10 && cat /proc/interrupts > patched_module_mdix1_loaded/interrupts.log
    lspci -vvv > patched_module_mdix1_loaded/lspci_vvv.log
    ./ethregs > patched_module_mdix1_loaded/ethregs.log

  STATUS: Patched module mdix=1 after reboot NOT WORKING
	Description: I rebooted the system (new module was loaded), then I did the steps
		before loading module: green LED shines all the time, orange LED blinks
		after loading module : green LED shines all the time, orange LED shines all the time (!) KEYBOARD HAS LAG once every few seconds!!!

    mkdir patched_module_mdix1_loaded_reboot
    modprobe e1000e mdix=1
    dhcpcd eth0                         # Times out waiting for carrier
    ip link set eth0 up
    sleep 10 && cat /proc/interrupts > patched_module_mdix1_loaded_reboot/interrupts.log
    lspci -vvv > patched_module_mdix1_loaded_reboot/lspci_vvv.log
    ./ethregs > patched_module_mdix1_loaded_reboot/ethregs.log


  STATUS: Patched module mdix=2 WORKS
	Description: I shut down the system, started it, loaded the new module, then I did the steps
		before loading module: green LED shines all the time, orange LED blinks
		after loading module : green LED shines all the time, orange LED blinks

    mkdir patched_module_mdix2_loaded
    modprobe e1000e mdix=2
    dhcpcd eth0           # OK
    ip link set eth0 up
    sleep 10 && cat /proc/interrupts > patched_module_mdix2_loaded/interrupts.log
    lspci -vvv > patched_module_mdix2_loaded/lspci_vvv.log
    ./ethregs > patched_module_mdix2_loaded/ethregs.log

  STATUS: Patched module mdix=2 after reboot NOT WORKING
	Description: I shut down the system, started it, loaded the module and rebooted, then I did the steps
		before loading module: green LED shines all the time, orange LED blinks
		after loading module : green LED shines all the time, orange LED shines all the time (!) KEYBOARD HAS LAG once every few seconds!!!

    mkdir patched_module_mdix2_loaded_reboot
    modprobe e1000e mdix=2
    dhcpcd eth0                   # Times out waiting for carrier
    ip link set eth0 up
    sleep 10 && cat /proc/interrupts > patched_module_mdix2_loaded_reboot/interrupts.log
    lspci -vvv > patched_module_mdix2_loaded_reboot/lspci_vvv.log
    ./ethregs > patched_module_mdix2_loaded_reboot/ethregs.log
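
The collection steps repeated in each scenario above could be wrapped in one helper so that every run gathers the same three files. This is a sketch only: the function name collect_logs and the SETTLE_SECS and ETHREGS variables are assumptions introduced here, not something used in the original tests. Loading the module, running dhcpcd and bringing the interface up are left to the caller, as in the scenarios.

```shell
# Gather the three logs used in the scenarios into one directory.
# collect_logs, SETTLE_SECS and ETHREGS are hypothetical names.
collect_logs() {
    # $1 = output directory, e.g. "working" or "not_working_module_loaded"
    dir=$1
    mkdir -p "$dir" || return 1

    # Wait before sampling, as the scenarios do (10 seconds by default).
    sleep "${SETTLE_SECS:-10}"

    cat /proc/interrupts > "$dir/interrupts.log"
    lspci -vvv > "$dir/lspci_vvv.log" 2>&1

    # ethregs is assumed to sit in the current directory unless
    # ETHREGS points elsewhere.
    "${ETHREGS:-./ethregs}" > "$dir/ethregs.log" 2>&1
}
```

A scenario then reduces to the module and interface commands followed by a single call such as "collect_logs not_working_module_loaded". Full lspci -vvv output and ethregs generally need root.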
Comment 14 Maciek Sitarz 2010-01-30 00:58:42 UTC
Created attachment 24784 [details]
Logs for test scenario described in comment #13
Comment 15 Maciek Sitarz 2010-02-08 10:05:19 UTC
Still an issue. Kernel version 2.6.32.7
Comment 16 Maciek Sitarz 2010-02-16 14:39:04 UTC
Still an issue. Kernel version 2.6.32.8
Comment 17 Jesse Brandeburg 2010-02-16 18:18:42 UTC
Thank you for your diligent testing; we have someone looking at it.
Comment 18 Maciek Sitarz 2010-02-16 19:42:19 UTC
Sorry if I was too importunate, but I got this message every two weeks:

"This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32.  Please verify if it still should
be listed and let the tracking team know (either way)."

so I felt obliged to let you know :) I'll just wait now for news from your side.
Comment 19 Anonymous Emailer 2010-02-24 21:53:39 UTC
Reply-To: nicholasx.d.nunley@intel.com

I am looking into this bug but I am not able to reproduce it on my test machine, so if you could provide some debug info it would be very helpful. First, could you download the e1000e driver on sourceforge (http://sourceforge.net/projects/e1000/files/) and see if the problem is present there? The in-tree driver and the sourceforge driver are generally kept in-sync but sometimes an update to the kernel is overlooked. 

Secondly, please apply the attached patch and provide the debug output when the driver is working/not working as well as the phy register dumps accessible through ethtool -d ethX. Turning on the debug messages allows us to see if the driver is encountering any unusual conditions that we may be ignoring otherwise and printing the phy registers will allow us to see if the phy is being configured correctly.

Thanks,
Nick
Comment 20 nicholasx.d.nunley 2010-02-24 23:27:54 UTC
Created attachment 25199 [details]
adds debug output
Comment 21 Maciek Sitarz 2010-02-25 00:07:30 UTC
e1000e module compiled from sf.net didn't help.

I attached logs from patched module, but I can't find any lines containing "e1000e: phy reg offset" you wanted to print out.
It didn't show up in any file in my /var/log directory.
Comment 22 Maciek Sitarz 2010-02-25 00:09:36 UTC
Created attachment 25202 [details]
Kernel logs with debug output and ethregs

Kernel logs and ethregs output from module from sf.net patched with additional debug messages
Comment 23 Florian Mickler 2010-10-07 20:36:15 UTC
Is this issue still present in current mainline kernels?

(In reply to comment #18)
> Sorry if I was too importunate, but I got this message every two weeks:
> 
> "This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> be listed and let the tracking team know (either way)."
> 
> so I felt obliged to let you know :) I'll just wait now for news from your
> side.

There are two sides... one side is the regression tracking view. From that viewpoint, status updates are very much appreciated! Especially towards the end of the release cycle, when Linus has to decide when to cut it. 

From the bug fixing perspective, I guess, as soon as the bug is acknowledged by a developer and worked upon, it is not that important anymore... yet, a ping from time to time does no harm.. 

Regards,
Flo
Comment 24 Florian Mickler 2010-10-26 09:11:31 UTC
Is this issue still a problem in current mainline kernels?

Regards,
Flo
Comment 25 Maciek Sitarz 2010-10-31 07:43:01 UTC
Still an issue. Kernel version 2.6.36

Regards,
Maciek
Comment 26 Maciek Sitarz 2010-12-15 15:26:14 UTC
Still an issue. Kernel version 2.6.36.2

Regards,
Maciek
Comment 27 Jesse Brandeburg 2010-12-15 18:54:45 UTC
updated kernel version and reassigned to Tushar
Comment 28 Tushar 2011-02-09 22:41:56 UTC
Maciek,
Sorry for responding so late.
Have you tried the latest e1000e driver from SF (i.e. 1.2.20)? If not, can you give it a try?
Comment 29 Maciek Sitarz 2011-02-20 21:23:23 UTC
The notebook with the network card is broken (graphics card). It's being fixed right now, so I can't check whether this will fix the problem. As soon as I get it back I'll try to upgrade the driver and reproduce the issue.
Comment 30 Maciek Sitarz 2011-03-16 18:27:54 UTC
Tushar,
I got my notebook back and I tried to reproduce the problem. I did about 5-6 reboots and it seems to work fine.

I tested on:
$ uname -a
Linux mcxR 2.6.38-ARCH #1 SMP PREEMPT Tue Mar 15 09:36:10 CET 2011 x86_64 Intel(R) Core(TM)2 Duo CPU T7100 @ 1.80GHz GenuineIntel GNU/Linux

$ modinfo e1000e
filename:       /lib/modules/2.6.38-ARCH/kernel/drivers/net/e1000e/e1000e.ko.gz
version:        1.2.20-k2
license:        GPL
description:    Intel(R) PRO/1000 Network Driver
author:         Intel Corporation, <linux.nics@intel.com>
srcversion:     566D897FE2181A99FA51235


We can assume the bug is fixed now.
Comment 31 Florian Mickler 2011-03-17 08:32:50 UTC
Is it fixed in 2.6.38 or only in the sourceforge driver? If only in sf, will it be fixed in Linus' tree?
Comment 32 Jesse Brandeburg 2011-03-17 16:26:48 UTC
@Maciek: thanks for the feedback!

@florian: the 1.2.20-k2 driver version indicates it was the in-kernel driver.
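
The version check described above can be expressed as a small test. This is a sketch that assumes the convention implied by the comment, namely that a "-k" suffix on the version string marks the in-kernel driver; the function name is_in_kernel_version is made up for illustration.

```shell
# Return success if a driver version string carries the "-k" suffix
# that, per the comment above, marks the in-kernel driver.
# is_in_kernel_version is a hypothetical helper name.
is_in_kernel_version() {
    case "$1" in
        *-k|*-k[0-9]*) return 0 ;;   # e.g. 1.2.20-k2
        *)             return 1 ;;   # e.g. 1.2.20 (no suffix)
    esac
}
```

It could be fed from the loaded module's metadata, for example: is_in_kernel_version "$(modinfo -F version e1000e)" && echo "in-kernel driver".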
Comment 33 Tushar 2011-06-28 21:44:34 UTC
Updating status to Resolved.
Comment 34 Florian Mickler 2011-06-29 07:54:14 UTC
[Hm.. bugzilla eating notification emails again :( ]

Thanks for the update and closing down yet another regression.