Most recent kernel where this bug did *NOT* occur: do not remember any more
Distribution: Gentoo
Hardware Environment: amd64, lspci output follows at the end of the post
Software Environment: gcc version 4.1.1 (Gentoo 4.1.1-r3)

Problem Description:
The sky2 driver starts OK. After some time, or under heavy load, it crashes. Sometimes I can use

rmmod sky2
modprobe sky2

to bring it back, but this only works a few times. Usually the network becomes slow first, then dies. I have two NICs: Marvell (sky2) and nVidia (forcedeth). The problem occurs whether the Marvell is used for the Internet and the nVidia for the local network, or vice versa. I tried compiling the kernel with and without "Optimize for size". The network dies in either case. I had high hopes for these kernels (2.6.19-*) after reading the changelogs. Good work guys, but my card still doesn't work.

Here are the outputs from lspci:

02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 19)
        Subsystem: Giga-byte Technology Marvell 88E8053 Gigabit Ethernet Controller (Gigabyte)
        Flags: fast devsel, IRQ 17
        Memory at f1000000 (64-bit, non-prefetchable) [size=16K]
        I/O ports at b000 [size=256]
        [virtual] Expansion ROM at 50000000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Capabilities: [e0] Express Legacy Endpoint IRQ 0
        Capabilities: [100] Advanced Error Reporting

00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
        Subsystem: Giga-byte Technology GA-K8N Ultra-9 Mainboard
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 20
        Memory at f2101000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at e400 [size=8]
        Capabilities: [44] Power Management version 2

Steps to reproduce:
Turn on the network. It dies sooner under heavy load (rsync, bittorrent), especially when both NICs are loaded.
The system keeps working afterwards; only the network doesn't. Sometimes both NICs die, sometimes only the Marvell, and sometimes the Marvell dies and the nVidia slows down.
As far as I remember, it has not worked with any kernel I ever tried: gentoo-sources, any mm, and other experimental kernels (many mm-based). With 2.6.19-rc6-mm2 it looked promising at the beginning, but after a day it is the same. I just can't get the Marvell to work properly. I'm writing this with the nVidia (forcedeth) only. After the network crashed I couldn't get online again. I needed a hard reset of the machine (or two) to get forcedeth to work again. Sorry for the two additions; I remembered to add the data afterwards. Ziga
Your problem is a duplicate of an earlier bug. It occurs only on the 88E8053 version of the chip. I don't have that hardware to debug/fix the problem, so resolution will be slow. You might try the vendor driver, but it has other problems.
*** This bug has been marked as a duplicate of 7579 ***
Since the behavior I see resembles this bug, I'll post it here (hope the CC was right). I just encountered a hang with 2.6.23-rc7 & sky2.

hardware: Asus P5W DH Deluxe, 2 Gigabit LAN ports with the 8053 chipset of the Marvell Yukon2 LAN adapter; one port is connected to a Linksys WRT54GL router with dd-wrt
internet connection speed: 8 MBit/s up, 0.5 MBit/s down
kernel: 2.6.23-rc7, gcc-4.2.1 hardened, GNU/Gentoo hardened x86 (32bit)

03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 20)
        Subsystem: ASUSTeK Computer Inc. Marvell 88E8053 Gigabit Ethernet controller PCIe (Asus)
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at ebcfc000 (64-bit, non-prefetchable) [size=16K]
        I/O ports at a800 [size=256]
        Expansion ROM at ebcc0000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-
        Capabilities: [e0] Express Legacy Endpoint IRQ 0

04:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 20)
        Subsystem: ASUSTeK Computer Inc. Marvell 88E8053 Gigabit Ethernet controller PCIe (Asus)
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at ebdfc000 (64-bit, non-prefetchable) [size=16K]
        I/O ports at b800 [size=256]
        Expansion ROM at ebdc0000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-
        Capabilities: [e0] Express Legacy Endpoint IRQ 0

steps taken:

ethtool -s eth4 autoneg off speed 100
ethtool -s eth4 wol d

(as recommended on http://www.lesswatts.org/tips/index.php to save some energy ;) )

I surfed the net for some minutes, started MultiGet (download manager), added http://ftp.utcluj.ro/pub/rofreesbie/devel/RoFreeSBIE-1.3_rc4.iso , surfed some more (while downloading at a constant 7 MBit/s), then wanted to download some documentation: http://www.rofreesbie.org/GUIDE_USER.pdf#

BAM! Everything stood still (network dead, no ping possible, not even to the router, ...)

Here is the output (I didn't find it at once, so hopefully there's nothing missing ;) ):

cat /sys/kernel/debug/sky2/eth4
IRQ src=0 mask=c000001d control=0 Status ring (empty) Tx ring pending=498...68 report=498 done=498 498: 0x3640908e(66) 499: 0x3640988e(66) 500: 0x363c64be(66) frag=0x343cd246(680) 502: 0x364094be(66) frag=0x343cd246(680) 504: 0x3630c48e(66) 505: 0x3630ca8e(66) 506: 0x36a2c68e(66) 507: 0x363c6c8e(66) 508: 0x363c6a8e(66) 509: 0x366eee8e(66) 510: 0x366a8e86(74) 511: 0x35dd9486(74) 0: 0x36708286(74) 1: 0x37091e8e(66) 2: 0x37091c8e(66) 3: 0x377fe28e(66) 4: 0x35dd968e(66) 5: 0x37f2c68e(66) 6: 0x37f2ca8e(66) 7: 0x35dd908e(66) 8: 0x366b288e(66) 9: 0x3670808e(66) 10: 0x36a02c8e(66) 11: 0x3670888e(66) 12: 0x36310e8e(66) 13: 0x36025a8e(66) 14: 0x3673508e(66) 15: 0x36a0248e(66) 16: 0x366a808e(66) 17: 0x3630ae9a(54) 18: 0x3630a29a(54) 19: csum=0x220028 0x352ebc02(73) 21: 0x3630a802(42) 22: 0x3630a002(42) 23: 0x36116002(42) 24: 0x352eba02(73) 25: 0x352eb402(42) 26: 0x352eb002(42) 27: 
0x352ebe02(42) 28: 0x352eb802(42) 29: 0x352eb602(42) 30: 0x36116602(42) 31: 0x358a2a02(42) 32: 0x352eb202(42) 33: 0x33cc2e02(42) 34: 0x33cc2002(42) 35: 0x33cc2402(42) 36: 0x33cc2802(42) 37: 0x33cc2602(42) 38: 0x33cc2202(42) 39: 0x33cc2c02(42) 40: 0x33cda202(42) 41: 0x33cdac02(42) 42: 0x33cda602(42) 43: 0x33cdaa02(42) 44: 0x33cda402(42) 45: 0x36a02a02(42) 46: 0x33cda802(42) 47: 0x33cdae02(42) 48: 0x358a2802(42) 49: 0x33c7f202(42) 50: 0x33c7f402(42) 51: 0x33c7fc02(42) 52: 0x33c7fe02(42) 53: 0x33c7f002(42) 54: 0x33c7f802(42) 55: 0x33c7f602(42) 56: 0x3513d602(42) 57: 0x3513dc02(42) 58: 0x3513d202(42) 59: 0x3513d402(42) 60: 0x3513d802(42) 61: 0x3513d002(42) 62: 0x33c7fa02(42) 63: 0x33cda002(42) 64: 0x34154e02(42) 65: 0x34154002(42) 66: 0x34154c02(42) 67: 0x33aefc02(42) Rx ring hw get=828 put=988 last=1023 after that I made some pinging to 192.168.1.1 and started to reach some sites, but no reaction: cat /sys/kernel/debug/sky2/eth4 IRQ src=0 mask=c000001d control=0 Status ring (empty) Tx ring pending=498...145 report=498 done=498 498: 0x3640908e(66) 499: 0x3640988e(66) 500: 0x363c64be(66) frag=0x343cd246(680) 502: 0x364094be(66) frag=0x343cd246(680) 504: 0x3630c48e(66) 505: 0x3630ca8e(66) 506: 0x36a2c68e(66) 507: 0x363c6c8e(66) 508: 0x363c6a8e(66) 509: 0x366eee8e(66) 510: 0x366a8e86(74) 511: 0x35dd9486(74) 0: 0x36708286(74) 1: 0x37091e8e(66) 2: 0x37091c8e(66) 3: 0x377fe28e(66) 4: 0x35dd968e(66) 5: 0x37f2c68e(66) 6: 0x37f2ca8e(66) 7: 0x35dd908e(66) 8: 0x366b288e(66) 9: 0x3670808e(66) 10: 0x36a02c8e(66) 11: 0x3670888e(66) 12: 0x36310e8e(66) 13: 0x36025a8e(66) 14: 0x3673508e(66) 15: 0x36a0248e(66) 16: 0x366a808e(66) 17: 0x3630ae9a(54) 18: 0x3630a29a(54) 19: csum=0x220028 0x352ebc02(73) 21: 0x3630a802(42) 22: 0x3630a002(42) 23: 0x36116002(42) 24: 0x352eba02(73) 25: 0x352eb402(42) 26: 0x352eb002(42) 27: 0x352ebe02(42) 28: 0x352eb802(42) 29: 0x352eb602(42) 30: 0x36116602(42) 31: 0x358a2a02(42) 32: 0x352eb202(42) 33: 0x33cc2e02(42) 34: 0x33cc2002(42) 35: 0x33cc2402(42) 36: 
0x33cc2802(42) 37: 0x33cc2602(42) 38: 0x33cc2202(42) 39: 0x33cc2c02(42) 40: 0x33cda202(42) 41: 0x33cdac02(42) 42: 0x33cda602(42) 43: 0x33cdaa02(42) 44: 0x33cda402(42) 45: 0x36a02a02(42) 46: 0x33cda802(42) 47: 0x33cdae02(42) 48: 0x358a2802(42) 49: 0x33c7f202(42) 50: 0x33c7f402(42) 51: 0x33c7fc02(42) 52: 0x33c7fe02(42) 53: 0x33c7f002(42) 54: 0x33c7f802(42) 55: 0x33c7f602(42) 56: 0x3513d602(42) 57: 0x3513dc02(42) 58: 0x3513d202(42) 59: 0x3513d402(42) 60: 0x3513d802(42) 61: 0x3513d002(42) 62: 0x33c7fa02(42) 63: 0x33cda002(42) 64: 0x34154e02(42) 65: 0x34154002(42) 66: 0x34154c02(42) 67: 0x33aefc02(42) 68: 0x33aef802(42) 69: 0x33aef602(42) 70: 0x33aefa02(42) 71: 0x34154602(42) 72: 0x34154202(42) 73: 0x33aef202(42) 74: 0x33aefe02(42) 75: 0x3630a602(42) 76: 0x33d97202(42) 77: 0x34154802(42) 78: 0x33d97802(42) 79: 0x33ade202(42) 80: 0x33ade002(42) 81: 0x33ade802(42) 82: 0x33adee02(42) 83: 0x3340a402(42) 84: 0x3340a802(42) 85: 0x33465a02(42) 86: 0x33465402(42) 87: 0x33465602(42) 88: 0x33465e02(42) 89: 0x3350a602(42) 90: 0x3340a202(42) 91: 0x33465802(42) 92: 0x3350a402(42) 93: 0x3350ac02(42) 94: 0x33414402(42) 95: 0x33414202(42) 96: 0x33465002(42) 97: 0x3350a002(42) 98: 0x335c3402(42) 99: 0x335c3e02(42) 100: 0x33792002(42) 101: 0x33646602(42) 102: 0x33646a02(42) 103: 0x3306d802(42) 104: 0x3306da02(42) 105: 0x3306d002(42) 106: 0x330aaa02(42) 107: 0x330aae02(42) 108: 0x330aa002(42) 109: 0x33646e02(42) 110: 0x3350ae02(42) 111: 0x330aac02(42) 112: 0x33792202(42) 113: 0x330dac02(42) 114: 0x330da202(42) 115: 0x330dae02(42) 116: 0x330da802(42) 117: 0x330da402(42) 118: 0x330ed802(42) 119: 0x330ed202(42) 120: 0x330ede02(42) 121: 0x330e9c02(42) 122: 0x330e9002(42) 123: 0x330cf802(42) 124: 0x330eda02(42) 125: 0x330edc02(42) 126: 0x335c3c02(42) 127: 0x330da002(42) 128: 0x330daa02(42) 129: 0x335c3802(42) 130: 0x330ed602(42) 131: 0x3306d402(42) 132: 0x330e9202(42) 133: 0x330e9e02(42) 134: 0x330e9802(42) 135: 0x330da602(42) 136: 0x330e9402(42) 137: 0x330ed002(42) 138: 0x330cfa02(42) 139: 
0x330cf402(42) 140: 0x33646202(42) 141: 0x33792c02(42) 142: 0x33112a02(42) 143: 0x330cfc02(42) 144: 0x330e9a02(42) Rx ring hw get=828 put=988 last=1023

After that:

ifconfig eth4 down

waited some time, then

ifconfig eth4 up

lexa mat # ping -c 3 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
From 192.168.1.140 icmp_seq=2 Destination Host Unreachable
From 192.168.1.140 icmp_seq=3 Destination Host Unreachable
--- 192.168.1.1 ping statistics ---
3 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1999ms, pipe 2

lexa mat # cat /sys/kernel/debug/sky2/eth4
IRQ src=0 mask=c000001d control=0
Status ring (empty)
Tx ring pending=21...21 report=21 done=21
Rx ring hw get=60 put=169 last=1023

That didn't help ;( After modprobe -r sky2 && modprobe sky2 it is working again.
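For readers comparing the three dumps above, the "Tx ring pending=A...B" line can be turned into a count of outstanding transmit descriptors. The following is only a rough sketch: the ring size of 512 is an assumption inferred from the dump (index 511 is followed by index 0), and the helper name tx_outstanding is made up for illustration.

```python
import re

# Assumption: ring size inferred from the dump, where index 511 wraps to 0.
RING_SIZE = 512

TX_LINE = re.compile(r"Tx ring pending=(\d+)\.\.\.(\d+) report=(\d+) done=(\d+)")


def tx_outstanding(dump: str, ring_size: int = RING_SIZE) -> int:
    """Return how many Tx ring slots are queued but not yet completed."""
    m = TX_LINE.search(dump)
    if m is None:
        raise ValueError("no 'Tx ring' line found in dump")
    first, last = int(m.group(1)), int(m.group(2))
    # Modulo arithmetic handles the wrap from the end of the ring back to 0.
    return (last - first) % ring_size


# The three "Tx ring" lines quoted in the report above.
print(tx_outstanding("Tx ring pending=498...68 report=498 done=498"))   # 82
print(tx_outstanding("Tx ring pending=498...145 report=498 done=498"))  # 159
print(tx_outstanding("Tx ring pending=21...21 report=21 done=21"))      # 0
```

Read this way, the queue grew from 82 to 159 slots between the first two dumps while "done" stayed at 498, which is consistent with transmit completions having stopped.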
Don't turn flow control off! Some of the chip versions (EC/XL) have a hardware bug where, if the receive FIFO gets full, it gets stuck... With hardware flow control enabled, the FIFO should never get full. Also, if you turn off flow control and the other side sends a flow control packet, the chip might stop as well.
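To check whether flow control is currently enabled, the pause settings printed by "ethtool -a eth4" can be inspected. Below is a small sketch that parses such output. The sample text is hypothetical (it was not posted in this report) and the helper names are made up; the "Autonegotiate / RX / TX" layout is what ethtool normally prints, but treat the exact format as an assumption.

```python
def parse_pause(text: str) -> dict:
    """Parse 'ethtool -a' style output into {name: bool} for on/off fields."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Only keep "name: on/off" lines; the header line has an empty value.
        if sep and value in ("on", "off"):
            settings[key.strip()] = value == "on"
    return settings


def flow_control_enabled(settings: dict) -> bool:
    """True only if both RX and TX pause are reported as on."""
    return bool(settings.get("RX", False) and settings.get("TX", False))


# Hypothetical sample, roughly what might appear after "autoneg off".
sample = """Pause parameters for eth4:
Autonegotiate:\toff
RX:\t\toff
TX:\t\toff
"""

settings = parse_pause(sample)
print(settings)
print("flow control enabled:", flow_control_enabled(settings))
```

If both RX and TX show "off", the receive FIFO problem described above becomes possible on the affected chip versions.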
OK, acknowledged, sorry, all my bad. It nevertheless still hangs from time to time, but in those cases a simple ifconfig up & down does the trick. Thanks for the fast reply, btw, Stephen =) Is it still advisable to append pci=nomsi during bootup?
MSI works fine if the underlying BIOS and hardware aren't broken. By now, all chipsets should either work or have MSI automatically disabled via the PCI quirk table (for example, AMD Opteron PCI-X chipsets have MSI marked broken).
This should at least be handled by the receive hang detection and recovery logic in 2.6.23 (and later kernels). Reopen the bug if it still occurs.
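For anyone reopening this with fresh data, two debugfs dumps taken a short time apart can be compared from user space to show that the rings made no progress. The sketch below is NOT the driver's in-kernel hang detection; it is only an illustration of the general idea, with made-up helper names, using the field names from the dumps posted earlier in this bug.

```python
import re

TX = re.compile(r"Tx ring pending=(\d+)\.\.\.(\d+) report=(\d+) done=(\d+)")
RX = re.compile(r"Rx ring hw get=(\d+) put=(\d+) last=(\d+)")


def snapshot(dump: str) -> dict:
    """Extract the few ring indices needed for the comparison."""
    tx, rx = TX.search(dump), RX.search(dump)
    if tx is None or rx is None:
        raise ValueError("dump is missing the Tx ring or Rx ring line")
    return {
        "tx_prod": int(tx.group(2)),  # end of the pending range
        "tx_done": int(tx.group(4)),  # last completed Tx index
        "rx_get": int(rx.group(1)),   # Rx get index reported by the hardware
    }


def looks_stalled(before: str, after: str) -> bool:
    """Heuristic: new packets were queued, but neither ring advanced."""
    a, b = snapshot(before), snapshot(after)
    queued_more = b["tx_prod"] != a["tx_prod"]
    no_progress = b["tx_done"] == a["tx_done"] and b["rx_get"] == a["rx_get"]
    return queued_more and no_progress


# Summary lines taken from the first two dumps in this bug.
first = ("Tx ring pending=498...68 report=498 done=498 "
         "Rx ring hw get=828 put=988 last=1023")
second = ("Tx ring pending=498...145 report=498 done=498 "
          "Rx ring hw get=828 put=988 last=1023")
print(looks_stalled(first, second))  # True
```

Attaching both dumps plus the kernel version when reopening should make it easier to tell whether the recovery logic actually triggered.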