Bug 10885 - CK804 Ethernet Controller (rev a3) failure
Summary: CK804 Ethernet Controller (rev a3) failure
Status: CLOSED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 blocking
Assignee: drivers_network@kernel-bugs.osdl.org
Reported: 2008-06-07 14:43 UTC by Karen Shaeffer
Modified: 2012-05-21 15:48 UTC

Kernel Version: 2.6.23.12
Regression: No

Description Karen Shaeffer 2008-06-07 14:43:34 UTC
Latest working kernel version: 2.6.25.4
Earliest failing kernel version: 2.6.22.10
Distribution: modified Rock linux 2.0.2
Hardware Environment: Sun Netra X4200 M2 Server, CK804 Ethernet Controller (rev a3)
Software Environment: Any recent kernel.org kernel, either 32 bit or 64 bit
Problem Description: The Nvidia CK804 chipset N2200 chip has an integrated NIC that hangs under specific conditions. The hang leaves the NIC completely unable to send or receive packets. The conditions are as follows:

1.) Configure the NIC for 100 Mb with autoneg on.
2.) Connect the NIC to a managed switch port configured for 100 Mb with autoneg on.
3.) Boot up the server. Under these conditions, the NIC will always link at 100 Mb half duplex, while the switch will link at 100 Mb full duplex. This is a bug in itself, but it is not the failure mode. (Note that this NIC can be forced to 100 Mb full duplex by configuring it as such with ethtool.) This link mismatch is a necessary condition for reproducing the NIC hang.
4.) The NIC can fail in normal service. It can also fail during boot-time link negotiation. The boot-time failure is the easiest to reproduce.
Feb  1 22:31:06 dut kernel: nv_stop_tx: TransmitterStatus remained busy<6>NETDEV WATCHDOG: eth2: transmit timed out
Feb  1 22:31:06 dut kernel: eth2: Got tx_timeout. irq: 00000020
Feb  1 22:31:06 dut kernel: eth2: Ring at 21ebd8000
Feb  1 22:31:06 dut kernel: eth2: Dumping tx registers
Feb  1 22:31:06 dut kernel:   0: 00000020 00000000 00000003 009803ca 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel:  20: 00000014 4f9e6480 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel:  40: 0420e20e 0000a855 00002e20 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel:  60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel:  80: 003b0f3e 00000000 00000001 007f0020 0000061c 00000000 00000000 00002d5f
Feb  1 22:31:06 dut kernel:  a0: 0016070f 00000016 9e4f1400 00008064 00000001 00000000 6100cccd 0000049b
Feb  1 22:31:06 dut kernel:  c0: 10000101 00000001 00000001 00000001 00000001 00000001 00000001 00000001
Feb  1 22:31:06 dut kernel:  e0: 00000001 00000001 00000001 00000001 00000001 00000001 00000001 00000001
Feb  1 22:31:06 dut kernel: 100: 1ebd8800 1ebd8000 007f00ff 00008000 00000000 00000000 0000005f 1ebd8ae0
Feb  1 22:31:06 dut kernel: 120: 1ebd80e0 1fb1f440 ac0000ea 00000000 00000000 1ebd880c 1ebd800c 01e08000
Feb  1 22:31:06 dut kernel: 140: 00304120 c000260c 00000002 00000002 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel: 160: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel: 180: 00000016 00000008 0194796d 00008103 00000021 0000796d 0194000d 0000000f
Feb  1 22:31:06 dut kernel: 1a0: 00000016 00000008 0194796d 00008103 00000021 0000796d 0194000d 0000000f
Feb  1 22:31:06 dut kernel: 1c0: 00000016 00000008 0194796d 00008103 00000021 0000796d 0194000d 0000000f
Feb  1 22:31:06 dut kernel: 1e0: 00000016 00000008 0194796d 00008103 00000021 0000796d 0194000d 0000000f
Feb  1 22:31:06 dut kernel: 200: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel: 220: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel: 240: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:06 dut kernel: 260: 00000000 00000000 fe020001 00000100 00000000 00000000 7e020001 00000100
Feb  1 22:31:07 dut kernel: 280: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 2a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 2c0: 00000000 00000000 00000000 00000000 00000000 00000001 00000001 00000001
Feb  1 22:31:07 dut kernel: eth2: Dumping tx ring
Feb  1 22:31:07 dut kernel: 000: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 004: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 008: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 00c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 010: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 014: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 018: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 01c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 020: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 024: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 028: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 02c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 030: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 034: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 038: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 03c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 040: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 044: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 048: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 04c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 050: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 054: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 058: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 05c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 060: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 064: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 068: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 06c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 070: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 074: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 078: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 07c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 080: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 084: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 088: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 08c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 090: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 094: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 098: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 09c: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0a0: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0a4: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0a8: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0ac: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0b0: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0b4: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0b8: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0bc: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0c0: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0c4: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0c8: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0cc: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0d0: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0d4: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0d8: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0dc: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:07 dut kernel: 0e0: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:08 dut kernel: 0e4: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:08 dut kernel: 0e8: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:08 dut kernel: 0ec: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:08 dut kernel: 0f0: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:08 dut kernel: 0f4: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:08 dut kernel: 0f8: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:31:08 dut kernel: 0fc: 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000 // 00000000 00000000 00000000
Feb  1 22:35:48 dut init: Switching to runlevel: 0
rebooting...
Feb  1 22:48:50 dut kernel: eth2: forcedeth.c: subsystem: 010de:cb84 bound to 0000:00:0a.0

shutting down.
Feb  2 19:36:02 dut kernel: nv_stop_tx: TransmitterStatus remained busy<6>nv_stop_tx: TransmitterStatus remained busy<5>audit(1202009762.086:103): audit_pid=0 old=2625 by auid=4294967295

Note -- a soft reboot did not clear the failure.
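
The failure signature in the log above could be picked out of a syslog file with a small filter such as the one below. This is only a sketch: the function name count_tx_hangs is made up for illustration, and the two message strings are copied from the log in this report, so other forcedeth versions may word them differently.

```shell
#!/bin/sh
# Sketch: count syslog lines that carry the hang signature seen in this report.
# Reads the log on stdin. The strings below are taken from the log above and
# are not guaranteed to be stable across driver versions.

count_tx_hangs() {
    # grep -c prints the number of matching lines (and exits non-zero
    # when that number is 0).
    grep -c \
        -e 'nv_stop_tx: TransmitterStatus remained busy' \
        -e 'Got tx_timeout'
}

# Usage (log path is an assumption and varies by distribution):
#   count_tx_hangs < /var/log/messages
```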

Steps to reproduce:
The NIC must be in 100 Mb half duplex mode while the switch port is in 100 Mb full duplex mode. The quickest way to recreate this failure is to write a boot
script that repeatedly reboots the server and stops the reboot sequence once the NIC has failed.
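
The half duplex precondition on the NIC side could be confirmed before each run with a helper along these lines. This is a minimal sketch, assuming the usual "Speed:" and "Duplex:" lines printed by "ethtool IFACE"; the function names are hypothetical, and eth2 is simply the interface name that appears in the log.

```shell
#!/bin/sh
# Sketch: report speed and duplex from `ethtool IFACE` output read on stdin.
# Assumes lines of the form "Speed: 100Mb/s" and "Duplex: Half".

link_mode() {
    # Prints e.g. "100Mb/s Half"; prints nothing if either field is missing.
    awk -F': *' '
        /^[ \t]*Speed:/  { speed = $2 }
        /^[ \t]*Duplex:/ { duplex = $2 }
        END { if (speed != "" && duplex != "") print speed, duplex }
    '
}

is_mismatch_precondition() {
    # Succeeds (exit 0) when the NIC side reports 100 Mb half duplex.
    [ "$(link_mode)" = "100Mb/s Half" ]
}

# Usage on the affected host:
#   ethtool eth2 | is_mismatch_precondition && echo "NIC is at 100 Mb half duplex"
```

The switch side still has to be checked on the switch itself; this only covers what the NIC reports.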

After configuring the NIC as described above, and with a sustained
packet rate of about 3000 packets per second, run a reboot test on
the Sun X4200 M2 server. The reboot test simply tests if the NIC is
linked and running after the boot. If it is working OK, then the
reboot test reboots the server.
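
A reboot test of this kind might look roughly like the sketch below. It is not the script actually used for this report. How "linked and running" was tested is not stated above, so the ping check, the PEER address and the function names are all assumptions, and eth2 is taken from the log.

```shell
#!/bin/sh
# Sketch of a per-boot reboot test; run it once after each boot, for example
# from a late init script. IFACE and PEER are hypothetical values.

IFACE=${IFACE:-eth2}
PEER=${PEER:-192.0.2.1}   # documentation address; replace with a real host

nic_ok() {
    # "Linked": ethtool reports a link on the interface.
    # "Running": a ping sent through the interface gets an answer
    # (assumption; the report does not say how this was tested).
    ethtool "$1" 2>/dev/null | grep -q 'Link detected: yes' &&
        ping -c 3 -W 2 -I "$1" "$2" >/dev/null 2>&1
}

reboot_test() {
    if nic_ok "$IFACE" "$PEER"; then
        # NIC survived this boot: go around again.
        reboot
    else
        # NIC has failed: stop the reboot sequence and keep the state.
        echo "NIC $IFACE failed; leaving the system up for inspection" >&2
        return 1
    fi
}

# Usage: call reboot_test at the end of the boot sequence.
```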

My reboot test is quite effective at inducing this failure. Many
failures occur within a couple of hours. The longest I have seen this
test run without a failure is about 7 hours.

The failure can be definitively identified by running the
ethtool offline selftest on the NIC. This test always
fails once the NIC has failed. Note that this applies at 100 Mb. I am
aware that the ethtool selftest always fails when the NIC is
configured for 1000 Mb. Also note that the failure I am describing
does not occur when the CK804 NIC is configured for 1000 Mb.
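
The selftest check could be scripted roughly as follows. This is a sketch: it assumes that "ethtool -t IFACE offline" prints a line starting with "The test result is" followed by PASS or FAIL, and the function name selftest_failed is made up for illustration.

```shell
#!/bin/sh
# Sketch: decide from `ethtool -t IFACE offline` output (read on stdin)
# whether the selftest failed. Only meaningful at 100 Mb, since the
# selftest is reported above to always fail at 1000 Mb.

selftest_failed() {
    # Succeeds (exit 0) when the summary line reports FAIL.
    grep -q '^The test result is FAIL'
}

# Usage on the affected host (the offline test interrupts traffic):
#   ethtool -t eth2 offline | selftest_failed && echo "NIC has failed"
```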

Once the NIC fails, it is fatally hung. A warm reboot will not clear
this failure. I have learned that I can clear the failure
as follows: power the appliance off, power it up, then
power it down again immediately after the power is on and the
BIOS POST is starting to execute. Then power it up once more and let the
X4200 M2 server boot up fully this time. The NIC will have the
error cleared.

Note that this failure mode also reproduces with RHEL 5.1 kernels on both 32-bit and 64-bit installs. The failure is always the same with any of the kernels mentioned in this bug report. I did the extra testing with the RHEL kernels because Sun Microsystems required it, even though the production kernels for the product are kernel.org kernels running on Rock-2.0.2 distributions.

More info:
# lspci
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
04:00.0 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (A-Segment Bridge) (rev 09)
04:00.2 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (B-Segment Bridge) (rev 09)
80:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
80:01.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
80:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
80:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
80:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
80:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
80:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
80:10.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8132 PCI-X Bridge (rev 12)
80:10.1 PIC: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC (rev 12)
80:11.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8132 PCI-X Bridge (rev 12)
80:11.1 PIC: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC (rev 12)
83:00.0 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (A-Segment Bridge) (rev 09)
83:00.2 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (B-Segment Bridge) (rev 09)
8e:01.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 03)
8e:01.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 03)
8e:02.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064 PCI-X Fusion-MPT SAS (rev 02)

# lspci -t
-+-[0000:80]-+-00.0
 |           +-01.0
 |           +-0a.0
 |           +-0b.0-[0000:81]--
 |           +-0c.0-[0000:82]--
 |           +-0d.0-[0000:83-85]--+-00.0-[0000:85]--
 |           |                    \-00.2-[0000:84]--
 |           +-0e.0-[0000:86]--
 |           +-10.0-[0000:87]--
 |           +-10.1
 |           +-11.0-[0000:8e]--+-01.0
 |           |                 +-01.1
 |           |                 \-02.0
 |           \-11.1
 \-[0000:00]-+-00.0
             +-01.0
             +-01.1
             +-02.0
             +-02.1
             +-06.0
             +-09.0-[0000:01]----03.0
             +-0a.0
             +-0b.0-[0000:02]--
             +-0c.0-[0000:03]--
             +-0d.0-[0000:04-06]--+-00.0-[0000:06]--
             |                    \-00.2-[0000:05]--
             +-0e.0-[0000:07]--
             +-18.0
             +-18.1
             +-18.2
             +-18.3
             +-19.0
             +-19.1
             +-19.2
             \-19.3

# lspci -v
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
        Flags: bus master, 66MHz, fast devsel, latency 0
        Capabilities: [44] HyperTransport: Slave or Primary Interface
        Capabilities: [e0] HyperTransport: MSI Mapping

00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
        Subsystem: nVidia Corporation Unknown device cb84
        Flags: bus master, 66MHz, fast devsel, latency 0

00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
        Subsystem: nVidia Corporation Unknown device cb84
        Flags: 66MHz, fast devsel
        I/O ports at 2800 [size=32]
        I/O ports at 0400 [size=64]
        I/O ports at 0440 [size=64]
        Capabilities: [44] Power Management version 2

00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2) (prog-if 10 [OHCI])
        Subsystem: Sun Microsystems Computer Corp. Unknown device cb84
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 58
        Memory at fe3ff000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2

00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3) (prog-if 20 [EHCI])
        Subsystem: Sun Microsystems Computer Corp. Unknown device cb84
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 66
        Memory at fe3fec00 (32-bit, non-prefetchable) [size=256]
        Capabilities: [44] Debug port
        Capabilities: [80] Power Management version 2

00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2) (prog-if 8a [Master SecP PriP])
        Subsystem: nVidia Corporation Unknown device cb84
        Flags: bus master, 66MHz, fast devsel, latency 0
        I/O ports at 2000 [size=16]
        Capabilities: [44] Power Management version 2

00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2) (prog-if 01 [Subtractive decode])
        Flags: bus master, 66MHz, fast devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=128
        I/O behind bridge: 0000c000-0000cfff
        Memory behind bridge: fc200000-fe2fffff
        Prefetchable memory behind bridge: e2000000-e20fffff

00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
        Subsystem: nVidia Corporation Unknown device cb84
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 90
        Memory at fe3fd000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at dc00 [size=8]
        Capabilities: [44] Power Management version 2

00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=04, subordinate=06, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=07, subordinate=07, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
        Flags: fast devsel
        Capabilities: [80] HyperTransport: Host or Secondary Interface
        Capabilities: [a0] HyperTransport: Host or Secondary Interface
        Capabilities: [c0] HyperTransport: Host or Secondary Interface

00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
        Flags: fast devsel

00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
        Flags: fast devsel

00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
        Flags: fast devsel
        Capabilities: [f0] #0f [0010]

00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
        Flags: fast devsel
        Capabilities: [80] HyperTransport: Host or Secondary Interface
        Capabilities: [a0] HyperTransport: Host or Secondary Interface
        Capabilities: [c0] HyperTransport: Host or Secondary Interface

00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
        Flags: fast devsel

00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
        Flags: fast devsel

00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
        Flags: fast devsel
        Capabilities: [f0] #0f [0010]

01:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA])
        Subsystem: ATI Technologies Inc Rage XL
        Flags: bus master, stepping, medium devsel, latency 64, IRQ 10
        Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
        I/O ports at c800 [size=256]
        Memory at fe2ff000 (32-bit, non-prefetchable) [size=4K]
        Expansion ROM at e2000000 [disabled] [size=128K]
        Capabilities: [5c] Power Management version 2

04:00.0 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (A-Segment Bridge) (rev 09) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=04, secondary=06, subordinate=06, sec-latency=64
        Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
        Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [6c] Power Management version 2
        Capabilities: [d8] PCI-X bridge device

04:00.2 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (B-Segment Bridge) (rev 09) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=04, secondary=05, subordinate=05, sec-latency=64
        Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
        Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [6c] Power Management version 2
        Capabilities: [d8] PCI-X bridge device

80:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
        Flags: bus master, 66MHz, fast devsel, latency 0
        Capabilities: [44] HyperTransport: Slave or Primary Interface
        Capabilities: [e0] HyperTransport: MSI Mapping

80:01.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
        Subsystem: nVidia Corporation Unknown device cb84
        Flags: bus master, 66MHz, fast devsel, latency 0
        Memory at feaff000 (32-bit, non-prefetchable) [size=4K]

80:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
        Subsystem: nVidia Corporation Unknown device cb84
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 106
        Memory at feafe000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at fc00 [size=8]
        Capabilities: [44] Power Management version 2

80:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=80, secondary=81, subordinate=81, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

80:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=80, secondary=82, subordinate=82, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

80:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=80, secondary=83, subordinate=85, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

80:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=80, secondary=86, subordinate=86, sec-latency=0
        Capabilities: [40] Power Management version 2
        Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
        Capabilities: [58] HyperTransport: MSI Mapping
        Capabilities: [80] Express Root Port (Slot+) IRQ 0

80:10.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8132 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 64
        Bus: primary=80, secondary=87, subordinate=87, sec-latency=64
        Capabilities: [60] PCI-X bridge device
        Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration
        Capabilities: [c0] HyperTransport: Slave or Primary Interface
        Capabilities: [f4] HyperTransport: MSI Mapping

80:10.1 PIC: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC (rev 12) (prog-if 10 [IO-APIC])
        Subsystem: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
        Flags: bus master, medium devsel, latency 0
        Memory at feafd000 (64-bit, non-prefetchable) [size=4K]

80:11.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8132 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 64
        Bus: primary=80, secondary=8e, subordinate=8e, sec-latency=64
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: fe500000-fe9fffff
        Capabilities: [60] PCI-X bridge device
        Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration
        Capabilities: [c0] HyperTransport: Revision ID: 2.00
        Capabilities: [f4] HyperTransport: MSI Mapping

80:11.1 PIC: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC (rev 12) (prog-if 10 [IO-APIC])
        Subsystem: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC
        Flags: bus master, medium devsel, latency 0
        Memory at feafc000 (64-bit, non-prefetchable) [size=4K]

83:00.0 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (A-Segment Bridge) (rev 09) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=83, secondary=85, subordinate=85, sec-latency=64
        Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
        Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [6c] Power Management version 2
        Capabilities: [d8] PCI-X bridge device

83:00.2 PCI bridge: Intel Corporation 41210 [Lanai] Serial to Parallel PCI Bridge (B-Segment Bridge) (rev 09) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=83, secondary=84, subordinate=84, sec-latency=64
        Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
        Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [6c] Power Management version 2
        Capabilities: [d8] PCI-X bridge device

8e:01.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 03)
        Subsystem: Intel Corporation PRO/1000 MT Dual Port Server Adapter
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 82
        Memory at fe9e0000 (64-bit, non-prefetchable) [size=128K]
        I/O ports at ec00 [size=64]
        Capabilities: [dc] Power Management version 2
        Capabilities: [e4] PCI-X non-bridge device
        Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-

8e:01.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 03)
        Subsystem: Intel Corporation PRO/1000 MT Dual Port Server Adapter
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 98
        Memory at fe9c0000 (64-bit, non-prefetchable) [size=128K]
        I/O ports at e800 [size=64]
        Capabilities: [dc] Power Management version 2
        Capabilities: [e4] PCI-X non-bridge device
        Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-

8e:02.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064 PCI-X Fusion-MPT SAS (rev 02)
        Subsystem: LSI Logic / Symbios Logic Unknown device 3060
        Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 74
        I/O ports at e400 [disabled] [size=256]
        Memory at fe9bc000 (64-bit, non-prefetchable) [size=16K]
        Memory at fe9a0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fe600000 [disabled] [size=2M]
        Capabilities: [50] Power Management version 2
        Capabilities: [98] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [68] PCI-X non-bridge device
        Capabilities: [b0] MSI-X: Enable- Mask- TabSize=1
Comment 1 Karen Shaeffer 2008-06-07 19:03:54 UTC
One more detail. I personally reproduced the NIC mismatch and the NIC TX
failure many times in the lab on 5 different Netra servers, using quite
a few different kernel.org and RHEL kernels. It happened on every kernel I
tested, with both 32 and 64 bit builds. It was also reproduced several times
in a data center using the 2.6.22.10 kernel, completely independently of me.
For more details, please see the bug ID 10885.

And I need to clarify that the last kernel I tested this on was actually
linux-2.6.24-rc8-git6. I misstated in the bug that I observed this failure
on the 2.6.25.4 kernel. That is inaccurate. I don't know whether it exists in
the 2.6.25.4 kernel, because I never tested that kernel. My error.
Comment 2 Roland Kletzing 2008-06-08 04:46:03 UTC
Could you test how that NIC behaves with either "irqpoll" or "noapic acpi=off"?
(see http://bugzilla.kernel.org/show_bug.cgi?id=9015 - maybe related)
Comment 3 Ayaz Abdulla 2008-06-09 10:04:49 UTC
Can you try the latest ethtool? I recall there was an issue with older ethtool versions that would not send the correct settings down to the NIC driver.
Comment 4 Laurent Jean-Rigaud 2008-09-08 08:41:07 UTC
For info, the problem also appears with the latest RHEL4 kernels (2.6.9-78), and maybe with earlier ones. Forcedeth module versions 0.60 and 0.61 have the problem.

The forcedeth module fails a few seconds into big transfers IF a static configuration is set on the switch (no autoneg, full duplex) and autoneg is set on the forcedeth card. In this case, duplex cannot be negotiated and the interface falls back to 100 Mb half duplex. In other cases, the transfers complete without problems.
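
The fallback described above is the standard behaviour when only one end autonegotiates: that end can sense the partner's speed from the signal but has no way to learn its duplex, so it assumes half duplex. Here is a minimal Python sketch of that rule, not of the forcedeth driver itself. The Port class and resolve() function are hypothetical names made up for illustration, and taking the lower of the two speeds when both ends negotiate is a simplification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Port:
    """Hypothetical model of one end of an Ethernet link."""
    autoneg: bool
    speed: int = 100           # Mb/s: forced speed, or best advertised speed
    full_duplex: bool = True   # forced duplex, or advertised capability


def resolve(local: Port, partner: Port) -> Tuple[int, bool]:
    """Return (speed, full_duplex) that `local` ends up using."""
    if not local.autoneg:
        # A forced port ignores the partner and uses its own settings.
        return local.speed, local.full_duplex
    if partner.autoneg:
        # Both ends negotiate: simplified to the best common mode.
        return (min(local.speed, partner.speed),
                local.full_duplex and partner.full_duplex)
    # Partner does not negotiate: speed is detected from the signal,
    # duplex cannot be detected, so half duplex is assumed.
    return partner.speed, False


nic = Port(autoneg=True)                                   # forcedeth card
switch = Port(autoneg=False, speed=100, full_duplex=True)  # static switch port

nic_mode = resolve(nic, switch)
switch_mode = resolve(switch, nic)
print("nic:", nic_mode, "switch:", switch_mode)
print("duplex mismatch:", nic_mode[1] != switch_mode[1])
```

This only models how the mismatch arises. It says nothing about why the transmitter then hangs, which is the actual subject of this bug.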

Once a transfer has stalled, no more traffic gets through and the network must be restarted.

dmesg :
../..
forcedeth: Reverse Engineered nForce ethernet driver. Version 0.61.
ACPI: PCI Interrupt 0000:00:14.0[A] -> GSI 21 (level, low) -> IRQ 201
PCI: Setting latency timer of device 0000:00:14.0 to 64
divert: allocating divert_blk for eth0
../..
forcedeth 0000:00:14.0: ifname eth0, PHY OUI 0x1c1 @ 0, addr 00:19:db:44:b6:b8
forcedeth 0000:00:14.0: highdma pwrctl timirq gbit lnktim desc-v3

../..
nv_stop_tx: TransmitterStatus remained busy<6>eth0: link down.
nv_stop_tx: TransmitterStatus remained busy<6>eth0: link up.
nv_stop_tx: TransmitterStatus remained busy
../..
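
For anyone collecting logs from several machines, here is a rough Python sketch that counts the messages seen in the excerpt above. It assumes the "<6>" in the middle of a line is a kernel log level marker left inline because the preceding message was printed without a trailing newline, so each fused line is split there first. The function names and counter keys are made up for illustration.

```python
import re
from collections import Counter

# Matches an inline kernel log level marker such as "<6>".
_LEVEL = re.compile(r"<\d>")


def split_fused(line):
    """Split a log line into the individual messages fused onto it."""
    return [part.strip() for part in _LEVEL.split(line) if part.strip()]


def summarize(lines):
    """Count TX-busy and link state messages in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for msg in split_fused(line):
            if "TransmitterStatus remained busy" in msg:
                counts["tx_busy"] += 1
            elif msg.endswith("link down."):
                counts["link_down"] += 1
            elif msg.endswith("link up."):
                counts["link_up"] += 1
    return counts


# The three lines from the dmesg excerpt above.
sample = [
    "nv_stop_tx: TransmitterStatus remained busy<6>eth0: link down.",
    "nv_stop_tx: TransmitterStatus remained busy<6>eth0: link up.",
    "nv_stop_tx: TransmitterStatus remained busy",
]

result = summarize(sample)
print(dict(result))
```

The same function could be fed the lines of a saved dmesg file to compare how often the transmitter stall shows up across kernels.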


As requested, "irqpoll" and "noapic acpi=off" options change nothing.


HW config: 
NEC POWERMATE_VL360 C51MCP51 AMD3800+
00:14.0 Bridge: nVidia Corporation MCP51 Ethernet Controller (rev a3)

Behaviour with different HW:
Problem does not appear with TG3 and E1000 modules.


PS: restarting the network requires the MACADDRESS field to be set to HWADDRESS in the ifcfg-eth0 config file (RHEL^h^h^h^hLSB ;-)) to avoid the reverse-numbering address problem.
Comment 5 Alan 2012-05-21 15:48:04 UTC
Please re-open if seen on a modern (2.6.32+) kernel
