Bug 5379 - skge driver turns off 3C940
Summary: skge driver turns off 3C940
Status: REJECTED UNREPRODUCIBLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: i386 Linux
Importance: P2 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-10-06 09:33 UTC by Stephen Hemminger
Modified: 2007-01-12 11:56 UTC

See Also:
Kernel Version: 2.6.13
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
add pci posting and delays (3.43 KB, patch)
2006-12-18 11:56 UTC, Stephen Hemminger

Description Stephen Hemminger 2005-10-06 09:33:00 UTC
From: Wolfgang Breyha <wbreyha@gmx.net>

Sorry for writing to you directly, but I couldn't find any information
about my problem on Google, Usenet, or from ASUS support.

I have an ASUS P4P800 with BIOS rev 1019 (the latest).

This board has a 3Com 3C940 NIC.

Since I couldn't fix my troubles with sk98lin, I tried skge. With skge I
at least got a little more information, but the same troubles still
exist.

lspci -x shows...
02:05.0 Ethernet controller: 3Com Corporation 3c940 10/100/1000Base-T [Marvell] (rev 12)
00: b7 10 00 17 17 01 b0 02 12 00 00 02 04 40 00 00
10: 00 80 ff f7 01 e8 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 eb 80
30: 00 00 00 00 48 00 00 00 00 00 00 00 05 01 17 1f

My problem is that after doing an "ifconfig eth0 down", "lspci -x" shows...
02:05.0 Ethernet controller: 3Com Corporation 3c940 10/100/1000Base-T [Marvell] (rev ff)
00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

If I do an "rmmod skge" at this point, I'm not able to modprobe it again
because it complains about an unknown chip rev (0xff). That's no big
surprise given the lspci output ;-) Rebooting has no effect: the device
is completely missing from the PCI device listing after a reboot.
The only thing that helps is to pull the plug until the mainboard
completely loses power. After that the NIC reappears and is accessible.

The funny thing (and the only difference between sk98lin and skge) is
that if I do an "ifconfig eth0 down" and an "ifconfig eth0 up" without
an "rmmod skge" in between, the NIC reappears with correct lspci data and
is fully functional again.
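
Editor's note: a configuration-space dump consisting entirely of ff bytes is what a PCI read typically returns when no device answers at that address, which matches the "unknown chiprev (0xff)" complaint. A minimal user-space sketch of that check follows. The function name config_space_is_dead is hypothetical and is not part of skge or lspci.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when every byte of a PCI
 * configuration-space dump is 0xff, i.e. the device did not answer.
 */
static bool config_space_is_dead(const uint8_t *cfg, size_t len)
{
    assert(cfg != NULL && len > 0);

    for (size_t i = 0; i < len; i++) {
        if (cfg[i] != 0xff)
            return false;   /* at least one real byte: device responded */
    }
    return true;
}
```

Fed the first four bytes of the healthy dump (b7 10 00 17) it reports false; fed the dump taken after "ifconfig eth0 down" it reports true.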
Comment 1 Stephen Hemminger 2005-10-06 09:33:36 UTC
I made some experiments with your new code.

First of all I had to remove two lines to get it to compile with a
vanilla 2.6.13.2 kernel (see the patch below at lines ~734 and ~3141).
Then I had to put some udelay into the CHECK_DEAD() macro. The NIC seems
to need some time to disable. After doing that I was able to identify
the call which disables it (line 1873). Meanwhile I've verified that by
removing the function call. Funny thing is... after doing an "ifconfig
eth0 down; rmmod skge" and re-enabling the NIC, it has a partly changed
MAC address.

I noticed that sometimes registers are set with skge_write32 and
sometimes the same register (value) is set using skge_write8. Is this
correct? I have no clue how to handle this piece of hardware, but that
was something that bothered me ;-)

eg:
# grep -n GMAC_CTRL skge.c
1698:   skge_write32(hw, SK_REG(port, GMAC_CTRL), GMC_RST_SET);
1717:   skge_write32(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON | GMC_RST_CLR);
1737:           skge_write32(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
1856:   skge_write32(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
1873:// skge_write8(hw, SK_REG(port, GMAC_CTRL), GMC_RST_SET);
2044:                   skge_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
2046:                   skge_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON);
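
Editor's note: the difference between the two access widths can be illustrated with a small model. This is only a sketch under stated assumptions: the register is modelled as four bytes of ordinary little-endian memory, and the fake_* names are hypothetical stand-ins, not the driver's accessors. Whether the real chip treats an 8-bit and a 32-bit write to GMAC_CTRL the same way is device-specific and cannot be read off this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 32-bit little-endian register. */
struct fake_reg {
    uint8_t bytes[4];
};

/* 8-bit write: only the lowest byte of the register changes. */
static void fake_write8(struct fake_reg *r, uint8_t val)
{
    assert(r != NULL);
    r->bytes[0] = val;
}

/* 32-bit write: all four bytes are overwritten. */
static void fake_write32(struct fake_reg *r, uint32_t val)
{
    assert(r != NULL);
    for (int i = 0; i < 4; i++)
        r->bytes[i] = (uint8_t)(val >> (8 * i));
}

/* Assemble the four bytes back into one 32-bit value. */
static uint32_t fake_read32(const struct fake_reg *r)
{
    uint32_t v = 0;

    assert(r != NULL);
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)r->bytes[i] << (8 * i);
    return v;
}
```

In this model, writing a small flag value with fake_write8 leaves the upper three bytes as they were, while fake_write32 with the same value clears them. If every defined control bit lives in the low byte, both widths deliver the same bits there.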

Here is what my messages.log says about the changed MAC etc...
first boot...
kernel: ACPI: PCI Interrupt 0000:02:05.0[A] -> GSI 22 (level, low) -> IRQ 177
kernel: skge addr 0xf7ff8000 irq 177 chip Yukon rev 1
kernel: skge eth0: addr 00:0c:6e:a0:f7:98
kernel: skge eth0: enabling interface
kernel: skge eth0: Link is up at 100 Mbps, full duplex, flow control none

...then "ifconfig eth0 down"....
kernel: skge eth0: disabling interface
kernel: ACPI: PCI interrupt for device 0000:02:05.0 disabled
...then "rmmod skge"....the kernel then automagically reloads the module....
kernel: ACPI: PCI Interrupt 0000:02:05.0[A] -> GSI 22 (level, low) -> IRQ 177
kernel: skge addr 0xf7ff8000 irq 177 chip Yukon rev 1
kernel: skge eth0: addr 00:04:6e:a0:f7:81
kernel: skge eth0: enabling interface
kernel: skge eth0: Link is up at 100 Mbps, full duplex, flow control none

Funny thing is... 00:04:6e is Cisco ;-) 00:0c:6e is ASUSTek ;-)
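
Editor's note: XOR-ing the two logged addresses byte by byte shows exactly which bits changed between the first load and the reload. The sketch below is plain user-space C; mac_diff is a hypothetical helper name, not a kernel or driver function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN 6

/*
 * Hypothetical helper: stores a[i] ^ b[i] in out[i] for each byte and
 * returns the total number of bits that differ between the two addresses.
 */
static int mac_diff(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    int bits = 0;

    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < MAC_LEN; i++) {
        out[i] = a[i] ^ b[i];
        /* count the set bits in this byte's difference */
        for (uint8_t x = out[i]; x != 0; x >>= 1)
            bits += x & 1;
    }
    return bits;
}
```

For 00:0c:6e:a0:f7:98 and 00:04:6e:a0:f7:81 the difference is 00:08:00:00:00:19, four bits in total: one bit in the second byte and three in the last. Only two of the six bytes changed.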

Last but not least, the diff against the version you sent me...

Regards, Wolfgang

# diff -u skge/drivers/net/skge.c linux/drivers/net/skge.c
--- skge/drivers/net/skge.c     2005-09-29 20:42:53.000000000 +0200
+++ linux/drivers/net/skge.c    2005-10-03 12:38:19.000000000 +0200
@@ -730,7 +730,7 @@
        .phys_id        = skge_phys_id,
        .get_stats_count = skge_get_stats_count,
        .get_ethtool_stats = skge_get_ethtool_stats,
-       .get_perm_addr  = ethtool_op_get_perm_addr,
+//     .get_perm_addr  = ethtool_op_get_perm_addr,
 };

 /*
@@ -1813,7 +1813,7 @@
 }

 #define CHECK_DEAD(hw)                                                 \
-       { if  (skge_read8(hw, B2_CHIP_ID) == 0xff)                      \
+       { udelay(10000); if  (skge_read8(hw, B2_CHIP_ID) == 0xff)       \
                 printk(PFX "killed at %s:%d\n", __FILE__, __LINE__); }

 static void yukon_txstop(struct skge_hw *hw, int port)
@@ -1869,7 +1869,9 @@
        /* set GPHY Control reset */
        CHECK_DEAD(hw);
        skge_write8(hw, SK_REG(port, GPHY_CTRL), GPC_RST_SET);
-       skge_write8(hw, SK_REG(port, GMAC_CTRL), GMC_RST_SET);
+       CHECK_DEAD(hw);
+//     skge_write8(hw, SK_REG(port, GMAC_CTRL), GMC_RST_SET);
+       CHECK_DEAD(hw);
 }

 static void yukon_rxstop(struct skge_hw *hw, int port)
@@ -3138,7 +3140,7 @@

        /* read the mac address */
        memcpy_fromio(dev->dev_addr, hw->regs + B2_MAC_1 + port*8, ETH_ALEN);
-       memcpy(dev->perm_addr, dev->dev_addr, dev->addr_len);
+//     memcpy(dev->perm_addr, dev->dev_addr, dev->addr_len);

        /* device is off until link detection */
        netif_carrier_off(dev);
Comment 2 Stephen Hemminger 2005-12-14 16:15:13 UTC
Is this still reproducible with latest version (2.6.15-rc5)?
Comment 3 Adrian Bunk 2006-03-08 13:37:07 UTC
Please reopen this bug if it's still present in recent 2.6 kernels.
Comment 4 Stephen Hemminger 2006-03-08 14:54:05 UTC
Yes, this is still a real bug. But I don't have that hardware to
reproduce the problem.  It works fine on other chip versions, so
I suspect an interaction specific to the 3C940 as opposed to other
Yukon versions.
Comment 5 Stephen Hemminger 2006-12-18 11:56:41 UTC
Created attachment 9867
add pci posting and delays

This may fix issues where the driver wasn't waiting long enough.
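
Editor's note: the attachment itself is not reproduced on this page, so the following only illustrates the general idea behind "pci posting", not the patch. Writes to a PCI device may be buffered (posted) on their way to it, and a read from the same device forces the earlier writes to complete first. In user-space C the pattern looks like the sketch below. reg_write32_flush is a hypothetical name and the register is an ordinary variable here. In kernel code the same pattern is usually a writel() followed by a readl() of a register on the same device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write a register, then read it back.
 * On real memory-mapped PCI hardware the read-back would force the
 * posted write out to the device; here it only shows the call shape.
 */
static uint32_t reg_write32_flush(volatile uint32_t *reg, uint32_t val)
{
    assert(reg != NULL);

    *reg = val;      /* the write, which may be posted on a real bus */
    return *reg;     /* the read-back, which acts as the flush */
}
```

A delay placed after the read-back then starts counting once the write has reached the device, rather than while it may still sit in a bridge buffer.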
