Bug 1394 - (net 3c59x) problems with WOL changes
Summary: (net 3c59x) problems with WOL changes
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: i386 Linux
Importance: P2 high
Assignee: Jeff Garzik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-10-21 09:54 UTC by Lenar L
Modified: 2004-12-07 02:27 UTC

See Also:
Kernel Version: 2.6.0-test11
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Revert the changes which enabled Wake-On-LAN changes by default in 2.6 (4.50 KB, patch)
2004-02-11 09:06 UTC, Tom Rini

Description Lenar L 2003-10-21 09:54:38 UTC
Hardware Environment: nforce2 athlon xp2500+ 1.5GB RAM  
  
Problem Description:   
Computer is unusable due to failing ifconfig up on every second reboot.  
Never happens from cold start. But when I give 'reboot' command my  
linux won't come up completely. Instead I need to re-reboot when it gets  
stuck and then it comes up with no problem:  
  
Steps to reproduce:  
# modprobe 3c59x  
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html  
0000:01:08.0: 3Com PCI 3c905B Cyclone 100baseTx at 0x9000. Vers  
LK1.1.19  
> ifconfig eth0 192.168.0.2 &  
eth0: command 0x5800 did not complete! Status=0xffff  
> # second or two of silence  
eth0: command 0x2804 did not complete! Status=0xffff  
eth0: command 0x2804 did not complete! Status=0xffff  
> ping 192.168.0.1  
eth0: command 0x3002 did not complete! Status=0xffff  
eth0: command 0x3002 did not complete! Status=0xffff  
eth0: command 0x3002 did not complete! Status=0xffff  
# ... and this goes on and on. finally NETDEV WATCHDOG cries out  
# something about I should check my cabling and so on.  
# Didn't manage to capture that. If needed I can try harder.  
  
At this point I do sysrq+s,u,b and at next reboot it's all ok.  
At least until I want to reboot machine again.  
Never happens just after power-on.  
In 2.4 no such problems ever.
Comment 1 Andrew Morton 2003-10-21 11:45:03 UTC
Are you using ACPI?

If so, what happens if you disable it?
Comment 2 Lenar L 2003-10-21 12:46:51 UTC
Yes, I'm using ACPI. So I tried right now with acpi=off as boot 
parameter. No difference. The problem is still present. 
Comment 3 gary ng 2003-11-27 12:27:17 UTC
I can confirm the same behaviour on my pretty old Dell PII 400 with a 3c905TX.
The 2.6.0-test11 3c59x driver does something strange to the card that makes the
next reboot (warm boot) always fail, regardless of whether the subsequent boot is
2.4.22 or 2.6.0.

I have to either cold boot the machine or, surprisingly, warm boot into
Windows XP (which has no problem using the card) and then warm boot again, after
which both versions of Linux like the card again. The following is the output when I
'modprobe 3c59x':
======================================================================== 
PCI: Enabling device 0000:00:0d.0 (0000 -> 0003) 
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html 
0000:00:0d.0: 3Com PCI 3c905B Cyclone 100baseTx at 0x1080. Vers LK1.1.19 
PCI: Setting latency timer of device 0000:00:0d.0 to 64 
  ***WARNING*** No MII transceivers found! 
======================================================================== 
 
When the driver loads properly, the output looks like this:
 
======================================================================== 
PCI: Enabling device 00:0d.0 (0114 -> 0117) 
PCI: Found IRQ 11 for device 00:0d.0 
PCI: Sharing IRQ 11 with 01:00.0 
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html 
See Documentation/networking/vortex.txt 
00:0d.0: 3Com PCI 3c905B Cyclone 100baseTx at 0x1080. Vers LK1.1.18-ac 
 00:10:5a:ce:5f:50, IRQ 11 
  product code 5143 rev 00.9 date 11-11-98 
Full duplex capable 
  Internal config register is 2000000, transceivers 0xa. 
  8K byte-wide RAM 5:3 Rx:Tx split, 10baseT interface. 
  Enabling bus-master transmits and whole-frame receives. 
00:0d.0: scatter/gather enabled. h/w checksums enabled 
======================================================================= 
 
I thought it might be ACPI related, but I have tried both enabling and disabling it and
the result is the same.
 
 
Comment 4 gary ng 2003-11-28 23:33:37 UTC
Another minor bug, probably related to the 3c59x driver (again on my 6-year-old
3c905TX). The card is capable of 10/100 and auto-negotiation. I can see it
working under XP: when I plug in a 10Mbps hub it indicates so, and when I plug in
a 100Mbps switch/router it also indicates so correctly. However, under
Linux (both 2.4 and 2.6) it only works at 10Mbps and never uses the higher speed.

Another annoying problem, which may or may not be related to the driver:
when I check with mii-tool, it ALWAYS says no link, even though both the hub's
and the card's LEDs indicate there is a link and the card is working
properly (other than at the low speed). This breaks the Red Hat script, as it uses
mii-tool to check the link before it brings up the network (ifup) and so always
fails.
 
Comment 5 eksot 2003-12-02 12:16:54 UTC
Hi! I've exactly the same problem as in this comment: Additional Comment #3 From 
gary ng 2003-11-27 12:27.

Mobo chipset is KT400 and card is old 3com 10/100 fast etherlink xl.

Kernel is 2.6.0-test11. On cold boots things work just fine, but when warm 
booting from linux to linux, eth0 refuses to work!

dmesg when warm booting:

=============================================================
PCI: Enabling device 0000:00:0b.0 (0000 -> 0003)
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:0b.0: 3Com PCI 3c905B Cyclone 100baseTx at 0x1000. Vers LK1.1.19
PCI: Setting latency timer of device 0000:00:0b.0 to 64
  ***WARNING*** No MII transceivers found!
=============================================================

...and when programs try to use internet...

=============================================================
eth0: command 0x3002 did not complete! Status=0xffff
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 3 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 2 
bytes away.
eth0: command 0x3002 did not complete! Status=0xffff
NETDEV WATCHDOG: eth0: transmit timed out
eth0: transmit timed out, tx_status ff status ffff.
  diagnostics: net ffff media ffff dma ffffffff fifo ffff
eth0: Transmitter encountered 16 collisions -- network cable problem?
eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
  Flags; bus-master 1, dirty 0(0) current 16(0)
  Transmit list ffffffff vs. dfd7e200.
eth0: command 0x3002 did not complete! Status=0xffff
  0: @dfd7e200  length 8000002a status 0000002a
  1: @dfd7e2a0  length 8000002a status 0000002a
  2: @dfd7e340  length 8000002a status 0000002a
  3: @dfd7e3e0  length 8000002a status 0000002a
  4: @dfd7e480  length 80000049 status 0c000049
  5: @dfd7e520  length 8000002a status 0000002a
  6: @dfd7e5c0  length 8000002a status 0000002a
  7: @dfd7e660  length 8000002a status 0000002a
  8: @dfd7e700  length 8000002a status 0000002a
  9: @dfd7e7a0  length 8000002a status 0000002a
  10: @dfd7e840  length 8000002a status 0000002a
  11: @dfd7e8e0  length 8000002a status 0000002a
  12: @dfd7e980  length 8000002a status 0000002a
  13: @dfd7ea20  length 80000049 status 0c000049
  14: @dfd7eac0  length 8000004a status 8c00004a
  15: @dfd7eb60  length 8000002a status 8000002a
eth0: command 0x5800 did not complete! Status=0xffff
eth0: Resetting the Tx ring pointer.
============================================================

When things work as they should, cold boot:

============================================================
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:09.0: 3Com PCI 3c905B Cyclone 100baseTx at 0xa000. Vers LK1.1.19
============================================================

I've tried with and without ACPI and APIC. I've tried moving the 3Com card to a
different physical PCI slot. No luck. 2.4 kernels work fine. As for the
psmouse.c lines above, unplugging the mouse doesn't help get networking working.
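
The "did not complete" lines above are easy to lose among the unrelated psmouse.c noise. Below is a minimal sketch, in Python, of one way to tally them from a saved dmesg dump. The function names are hypothetical, and the pattern is copied from the log lines quoted in this report rather than from the driver source. It only counts the timeouts and checks whether every reported status is the all-ones value 0xffff seen in the logs here.

```python
import re
from collections import Counter

# Pattern copied from the log lines quoted in this report (an assumption:
# other driver versions may word the message differently).
TIMEOUT_RE = re.compile(
    r"^(?P<dev>\w+): command (?P<cmd>0x[0-9a-fA-F]+) did not complete! "
    r"Status=(?P<status>0x[0-9a-fA-F]+)"
)


def summarize_timeouts(log_text):
    """Count command timeouts per (device, command, status) triple."""
    counts = Counter()
    for line in log_text.splitlines():
        m = TIMEOUT_RE.match(line.strip())
        if m:
            counts[(m["dev"], m["cmd"], m["status"])] += 1
    return counts


def looks_unresponsive(counts):
    """True when at least one timeout was seen and all report status 0xffff."""
    return bool(counts) and all(
        status.lower() == "0xffff" for _dev, _cmd, status in counts
    )
```

Feeding it the warm-boot log above would group the timeouts by command code (0x3002, 0x5800), which makes it easier to compare reports from different machines.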
Comment 6 Jeff Garzik 2003-12-17 18:36:18 UTC
Any chance you can give 2.6.0-test11 a try?
Comment 7 eksot 2003-12-19 02:26:44 UTC
like i said, my kernel is 2.6.0-test11 and that problem is present there.
Comment 8 skol 2004-01-09 17:24:56 UTC
I have the same (or a similar) problem, which exists in 2.6.0, 2.6.1 and started
sometime in 2.6.0-test*.

The interface is recognised correctly if I boot 2.6.1 after it has been running
in a known good kernel (eg 2.4.24).  If I boot a good kernel after 2.6.1 it
doesn't get recognised properly, but after a second reboot to another kernel it
does.

dmesg when working:
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:0d.0: 3Com PCI 3c905B Cyclone 100baseTx at 0xe800. Vers LK1.1.19

dmesg when broken:
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:0d.0: 3Com PCI 3c905B Cyclone 100baseTx at 0x1000. Vers LK1.1.19
PCI: Setting latency timer of device 0000:00:0d.0 to 64
  ***WARNING*** No MII transceivers found!

I notice that the address of the device differs between the two cases.
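
For anyone comparing their own working and broken boots the same way, here is a small Python sketch that pulls the I/O base address out of the probe line and notes whether the MII warning is present. The function name is hypothetical, and the pattern is taken from the probe lines quoted in this report. It says nothing about why the address differs; it only makes the comparison repeatable.

```python
import re

# Matches probe lines such as:
#   0000:00:0d.0: 3Com PCI 3c905B Cyclone 100baseTx at 0xe800. Vers LK1.1.19
PROBE_RE = re.compile(r"3Com PCI .*? at (0x[0-9a-fA-F]+)\.")
MII_WARNING = "No MII transceivers found"


def probe_summary(dmesg_text):
    """Return the first probed I/O base (or None) and whether the MII warning appears."""
    m = PROBE_RE.search(dmesg_text)
    return {
        "io_base": int(m.group(1), 16) if m else None,
        "mii_warning": MII_WARNING in dmesg_text,
    }
```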
Comment 9 Ian Monroe 2004-01-28 08:05:31 UTC
We are having similar problems with the IBM 300GLs (at least, the two we've
tried 2.6 on) and the 3c59x. If the power is turned off and on, the machine
boots and connects to the internet normally. However, after a soft reboot it
outputs "***WARNING*** No MII transceivers found!" and does not connect to
the internet. Apart from that message and the lack of connectivity, it boots normally.

We have run 2.6.0, 2.6.1 and 2.6.2-rc1 and it shows this behavior with all of those.
Comment 10 Ian Monroe 2004-01-28 08:17:07 UTC
Another description of the problem from LKML: "problems with 3c59x in
2.6.0-test11" http://lkml.org/lkml/2003/12/8/240
Comment 11 Jasmin Buchert 2004-01-28 14:31:07 UTC
I can confirm this bug. I have this problem with all 2.6 kernels (even 2.6.1 and
2.6.2-rc's). If I reboot the computer the network card doesn't work anymore. I
have to turn off and on my computer to get it working again.
Turning off ACPI etc does not help.
Comment 12 Tom Rini 2004-02-11 09:04:13 UTC
<Sending this to lkml/netdev as well>
Hello.  I believe I have tracked down a problem with the WOL support in
the 3c59x driver (2.6 variant only right now).  The problem has been
seen, I believe, by a few people:
(my own reports)
http://marc.theaimsgroup.com/?l=linux-kernel&m=106297008218993&w=2
http://lkml.org/lkml/2003/9/2/167

I believe that the problem here is not with the driver per se but with
the BIOS:
http://www.asus.com.tw/download/mbdriver/slot1-440lx.htm

The short description of the problem is that when the WOL code in the
driver is enabled, on some presumably buggy BIOSes the card ends up
getting put into a sleep state or something similar, as something like the
following gets reported when the driver initializes:
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
See Documentation/networking/vortex.txt
0000:00:0e.0: 3Com PCI 3c905B Cyclone 100baseTx at 0xe480. Vers LK1.1.19
PCI: Setting latency timer of device 0000:00:0e.0 to 64
 ff:ff:ff:ff:ff:ff, IRQ 9
  product code ffff rev ffff.15 date 15-31-127
Full duplex capable
  Internal config register is ffffffff, transceivers 0xffff.
  1024K word-wide RAM 3:5 Rx:Tx split, autoselect/<invalid transceiver> interface.
  ***WARNING*** No MII transceivers found!
  Enabling bus-master transmits and early receives.
0000:00:0e.0: scatter/gather enabled. h/w checksums enabled

If the patch in 1.1046.589.6 (key:
akpm@osdl.org|ChangeSet|20030801165536|51693) is reversed, the problem
goes away.  But since reverting would be a drastic step just for some buggy
BIOSes, would a patch to add a disable_wol option (and maybe a printk about
it, if no MII transceivers are found) be an OK fix for this?
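
The probe output quoted above shows every value read back from the card as all ones (MAC ff:ff:ff:ff:ff:ff, config register ffffffff, transceivers 0xffff). A rough Python sketch of a check for that signature in a saved probe log follows. The function name and field labels are hypothetical, the patterns are taken from the output quoted in this report, and the check is only a heuristic: a value made up solely of "f" digits is flagged regardless of its width.

```python
import re

# Patterns copied from the probe output quoted in this report.
FIELD_PATTERNS = {
    "mac": re.compile(r"^\s*((?:[0-9a-f]{2}:){5}[0-9a-f]{2}), IRQ", re.M | re.I),
    "config": re.compile(r"Internal config register is ([0-9a-f]+),", re.I),
    "transceivers": re.compile(r"transceivers 0x([0-9a-f]+)\.", re.I),
}


def all_ones_fields(probe_text):
    """List the probe fields whose hex digits are all 'f'."""
    flagged = []
    for name, pattern in FIELD_PATTERNS.items():
        m = pattern.search(probe_text)
        if m:
            digits = m.group(1).replace(":", "").lower()
            if set(digits) == {"f"}:
                flagged.append(name)
    return flagged
```

An empty result does not prove the card is healthy; in some of the other reports here the probe stops before printing these fields at all.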
Comment 13 Tom Rini 2004-02-11 09:06:00 UTC
Created attachment 2080
Revert the changes which enabled Wake-On-LAN changes by default in 2.6

This patch adds back the enable_wol param, and fixes the problem on at least
one machine for me.
Comment 14 Domen Puncer Kugler 2004-03-23 02:19:29 UTC
This looks like the bug that bit me some time ago.

options=8 (autonegotiate) parameter fixed it for me.
Comment 15 Jeff Garzik 2004-03-25 19:31:48 UTC
Is this problem still present in 2.6.5-rc?
Comment 16 Tom Rini 2004-03-25 20:24:42 UTC
I do not believe so.  Shortly after I posted my workaround, this went into
kernel.org (via akpm) and I haven't seen the problem since.
Comment 17 Lenar L 2004-03-25 23:53:42 UTC
I think it was fixed at least in -mm series. About a week ago I rebooted many 
times in a row (soft reboot) and 3com NIC got up and correctly running every 
time. 
Comment 18 pagnon stephane 2004-07-27 13:23:37 UTC
After reading what you have said, I'm not sure I've got the solution, but...

On my 3C905TX card, I got the same problem with kernels 2.4.20 and 2.4.26.

I saw the problem with mii-tool, which stated that the card was running at 10Mb, no link,
half duplex.

I solved it by compiling the driver into my kernel instead of as a module; now
autonegotiation works and the card runs at 100Mb/full duplex/flow control, with mii-tool
working.
Comment 19 Chris Lent 2004-11-23 06:31:16 UTC
Seems like this might be related to :
(net 3c59x) Cyclone Cards just recognized on cold boot not on restart
Bug#: 2248
http://bugzilla.kernel.org/show_bug.cgi?id=2248
Comment 20 Tom Rini 2004-11-23 06:48:49 UTC
> ------- Additional Comments From bzlent@cooper.edu  2004-11-23 06:31 -------
> Seems like this might be related to :
> (net 3c59x) Cyclone Cards just recognized on cold boot not on restart
> Bug#:          2248
> http://bugzilla.kernel.org/show_bug.cgi?id=2248

Someone should close this bug, as it's been fixed since 2.6.5....
Comment 21 Alexander Nyberg 2004-12-07 02:27:02 UTC
In a mail exchange, Lenar tells me the problem has been solved since about 2.6.7. If anyone
still has this problem, please speak up and reopen the bug, because I'm closing it.
