Bug 10990
| Summary: | e1000/e1000e driver doesn't work with gigabit connection | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | mjc (mjc) |
| Component: | Network | Assignee: | Jesse Brandeburg (jbrandeb) |
| Status: | RESOLVED PATCH_ALREADY_AVAILABLE | | |
| Severity: | normal | CC: | bruce.w.allan, bunk, jbrandeb, jeffrey.t.kirsher |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.25.6-55.fc9.i686 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Attachments: | proposed fix for iAMT interaction | | |
Description
mjc
2008-06-26 11:49:14 UTC
Reply-To: akpm@linux-foundation.org

(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 26 Jun 2008 11:49:14 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10990
>
> Summary: e1000/e1000e driver doesn't work with gigabit connection
> Product: Drivers
> Version: 2.5
> KernelVersion: 2.6.25.6-55.fc9.i686
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Network
> AssignedTo: jgarzik@pobox.com
> ReportedBy: mjc@avtechpulse.com
>
>
> Linux lyra 2.6.25.6-55.fc9.i686 #1 SMP Tue Jun 10 16:27:49 EDT 2008 i686 i686 i386 GNU/Linux
>
> I have multiple HP DC7700s with integrated Intel ethernet adapters. They work
> when the network cable is plugged into a 10/100 port on a switch. They do not
> work when plugged into a 10/100/1000 port. I also have DC7700s with integrated
> Broadcom chips (tg3 driver) and they work fine.
>
> From dmesg:
> e1000e: Intel(R) PRO/1000 Network Driver - 0.2.0
> e1000e: Copyright (c) 1999-2007 Intel Corporation.
> ...
> eth0: (PCI Express:2.5GB/s:Width x1) 00:0f:fe:4a:68:37
> eth0: Intel(R) PRO/1000 Network Connection
> eth0: MAC: 4, PHY: 6, PBA No: 1002ff-0ff
>
> When plugged into a 10/100/1000 port, forcing:
> ethtool -s eth0 autoneg off speed 1000 duplex full
>
> results in erratic (some pings work) or dead operation (no pings work).
>
> Using:
> ethtool -s eth0 autoneg off speed 100 duplex full
>
> seems to work OK.
>
> Returning it to autoneg results in erratic/dead operation:
>
> [root@lyra ~]# ethtool -s eth0 autoneg on
> [root@lyra ~]# ethtool eth0
> Settings for eth0:
>         Supported ports: [ TP ]
>         Supported link modes:   10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Supports auto-negotiation: Yes
>         Advertised link modes:  10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Advertised auto-negotiation: Yes
>         Speed: 1000Mb/s
>         Duplex: Full
>         Port: Twisted Pair
>         PHYAD: 1
>         Transceiver: internal
>         Auto-negotiation: on
>         Supports Wake-on: pumbag
>         Wake-on: g
>         Current message level: 0x00000001 (1)
>         Link detected: yes
> [root@lyra ~]# ping server2
> PING server2.domain.avtechpulse.com (192.168.0.3) 56(84) bytes of data.
> 64 bytes from server2.domain.avtechpulse.com (192.168.0.3): icmp_seq=1 ttl=64 time=0.079 ms
> 64 bytes from server2.domain.avtechpulse.com (192.168.0.3): icmp_seq=2 ttl=64 time=0.158 ms
> 64 bytes from server2.domain.avtechpulse.com (192.168.0.3): icmp_seq=7 ttl=64 time=0.161 ms
> 64 bytes from server2.domain.avtechpulse.com (192.168.0.3): icmp_seq=11 ttl=64 time=0.113 ms
> 64 bytes from server2.domain.avtechpulse.com (192.168.0.3): icmp_seq=12 ttl=64 time=0.131 ms
> 64 bytes from server2.domain.avtechpulse.com (192.168.0.3): icmp_seq=17 ttl=64 time=0.127 ms
> ^C
> --- server2.domain.avtechpulse.com ping statistics ---
> 18 packets transmitted, 6 received, 66% packet loss, time 17553ms
> rtt min/avg/max/mdev = 0.079/0.128/0.161/0.028 ms
> [root@lyra ~]#
>
> This was confirmed with multiple 10/100/1000 switches from different
> manufacturers.
>
>
> [root@lyra ~]# more /etc/modprobe.conf
> alias eth0 e1000e #same behaviour with "e1000"
> alias scsi_hostadapter libata
> alias scsi_hostadapter1 ata_piix
> alias snd-card-0 snd-hda-intel
> options snd-card-0 index=0
> options snd-hda-intel index=0
>
>
> Need any additional data?
>
> - Mike
>> When plugged into a 10/100/1000 port, forcing:
>> ethtool -s eth0 autoneg off speed 1000 duplex full
Gigabit requires autonegotiation, and if you force the speed you would have to
force it on both ends anyway. The above is therefore not a valid
configuration.
I suggest changing the autonegotiation mask, which leaves autoneg enabled:
ethtool -s eth0 advertise 0x20
This is supported and should accomplish what you want.
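
For readers unfamiliar with the advertise mask, here is a minimal sketch of how a value such as 0x20 maps to link modes. The bit values are the twisted-pair ones documented in the ethtool(8) man page; the helper name `decode_advertise` is made up for illustration and is not part of ethtool or the driver.

```python
# Bit values for the ethtool "advertise" mask (twisted-pair link modes),
# as documented in the ethtool(8) man page.
ADVERTISE_BITS = {
    0x001: "10baseT/Half",
    0x002: "10baseT/Full",
    0x004: "100baseT/Half",
    0x008: "100baseT/Full",
    0x010: "1000baseT/Half",
    0x020: "1000baseT/Full",
}


def decode_advertise(mask):
    """Return the link-mode names whose bits are set in mask, lowest bit first.

    Hypothetical helper for illustration only.
    """
    return [name for bit, name in sorted(ADVERTISE_BITS.items()) if mask & bit]


# 0x20 advertises gigabit full duplex only.
print(decode_advertise(0x20))
# 0x2F corresponds to the five modes this adapter lists as supported.
print(decode_advertise(0x2F))
```

With 0x20 the adapter still autonegotiates, but offers only 1000baseT/Full to the link partner.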
> I suggest changing the autonegotiation mask, which leaves autoneg enabled:
>
> ethtool -s eth0 advertise 0x20
>
> this is supported and should accomplish what you want.
That doesn't fix anything for me - I still get unreliable operation:
[root@lyra ~]# ethtool -s eth0 advertise 0x20
...wait...
[root@lyra ~]# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbag
Wake-on: g
Current message level: 0x00000001 (1)
Link detected: yes
[root@lyra ~]# ping 192.168.0.3
PING 192.168.0.3 (192.168.0.3) 56(84) bytes of data.
64 bytes from 192.168.0.3: icmp_seq=3 ttl=64 time=0.163 ms
64 bytes from 192.168.0.3: icmp_seq=7 ttl=64 time=0.106 ms
64 bytes from 192.168.0.3: icmp_seq=8 ttl=64 time=0.127 ms
64 bytes from 192.168.0.3: icmp_seq=13 ttl=64 time=0.144 ms
64 bytes from 192.168.0.3: icmp_seq=17 ttl=64 time=0.118 ms
64 bytes from 192.168.0.3: icmp_seq=18 ttl=64 time=0.168 ms
^C
--- 192.168.0.3 ping statistics ---
19 packets transmitted, 6 received, 68% packet loss, time 18829ms
rtt min/avg/max/mdev = 0.106/0.137/0.168/0.026 ms
[root@lyra ~]#
Note the packet loss.
These systems worked OK before upgrading to FC9.
- Mike
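
The loss pattern in a ping transcript like the one above can be tabulated with a short script. This is only a sketch: the function name `ping_loss` is hypothetical, it assumes sequence numbers start at 1 (as Linux ping does), and it uses integer division for the percentage, which is an assumption that happens to agree with the 68% reported in the transcript.

```python
import re


def ping_loss(transcript, transmitted):
    """Return (missing_seqs, loss_percent) for a ping transcript.

    transcript  -- text containing the "icmp_seq=N" reply lines
    transmitted -- number of packets sent, taken from ping's summary line
    Hypothetical helper; assumes icmp_seq starts at 1.
    """
    received = {int(seq) for seq in re.findall(r"icmp_seq=(\d+)", transcript)}
    missing = [seq for seq in range(1, transmitted + 1) if seq not in received]
    loss_percent = 100 * len(missing) // transmitted
    return missing, loss_percent


# Reply lines copied from the transcript above.
transcript = """\
64 bytes from 192.168.0.3: icmp_seq=3 ttl=64 time=0.163 ms
64 bytes from 192.168.0.3: icmp_seq=7 ttl=64 time=0.106 ms
64 bytes from 192.168.0.3: icmp_seq=8 ttl=64 time=0.127 ms
64 bytes from 192.168.0.3: icmp_seq=13 ttl=64 time=0.144 ms
64 bytes from 192.168.0.3: icmp_seq=17 ttl=64 time=0.118 ms
64 bytes from 192.168.0.3: icmp_seq=18 ttl=64 time=0.168 ms
"""

missing, loss = ping_loss(transcript, transmitted=19)
print("missing sequence numbers:", missing)
print("loss percent:", loss)
```

Listing the missing sequence numbers makes it easier to see that replies are dropped in runs of several seconds rather than one at a time.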
Some extra data points / summary:

- 1 Gb is erratic/dead on DC7700 Intel-integrated network adapter with Fedora 9. 100 Mb is OK. Tested on multiple machines and switches.
- It was fine before the upgrade to F9.
- DC7700s with embedded Broadcom chipsets are OK.
- In one DC7700 I added a PCI Intel PRO/1000 Gigabit adapter in a free PCI slot (not pci-e), and it worked fine. So it is something specific about the embedded Intel network adapter.

- Mike

Also, the problem occurs with both the original BIOS (1.05) and the latest applicable BIOS (1.14?).

Mike, we've heard several reports about this kind of problem. I think it is related to the driver not communicating to firmware that it is loaded.

If you're not using iAMT, you can fix this by disabling the iAMT management in the BIOS. Looks like you have to hit CTRL-P as soon as the monitor light turns on. This will get you into the Management BIOS options, which should allow you to disable the management.

If you want to see the CTRL-P prompt at boot, you can turn it on in the BIOS options, under Advance Setup, MEBx Setup Prompt = Enabled.

See http://bizsupport.austin.hp.com/bc/docs/support/SupportManual/c01082181/c01082181.pdf and the BIOS user guide at http://bizsupport.austin.hp.com/bc/docs/support/SupportManual/c01302182/c01302182.pdf

I'll be trying to get our internal lab to reproduce this so we can make sure the right patch gets to the in-kernel driver sooner rather than later.

Created attachment 16850: proposed fix for iAMT interaction

This patch is against Linus' tree for the v2.6.26 tag or later. Let me know if you need a patch against another kernel.
This patch is untested.
Jesse, I disabled iAMT on one test computer, but it didn't seem to help anything. I'm not set up to test kernel patches. (Is it possible to patch a Fedora 9 system easily?)

- Mike

My previous comment was incorrect. Disabling iAMT (using ctrl+p) seems to fix the issue.

- Mike

Sorry Mike, I've pushed this patch upstream, and it is in 2.6.27-rc2 and newer. I mentioned this in the Red Hat bugzilla too. I don't know if there is a fedora-development kernel that you might be able to test.

If you're running Fedora and have our SourceForge driver, you can try that, as the fix was already in it: rpmbuild -tb e1000e-0.4.1.7.tar.gz, then install that RPM.

Thanks for the fix! The current rawhide kernels don't boot for me for other reasons, so I'll have to delaying the e1000e fix.

- Mike

That should read "delay testing the fix".