Bug 2030

Summary: (net b44) driver kills integrated NIC
Product: Drivers Reporter: FRLinux (frlinux)
Component: Network Assignee: Jeff Garzik (jgarzik)
Status: RESOLVED CODE_FIX    
Severity: blocking    
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.2 Subsystem:
Regression: --- Bisected commit-id:
Attachments: Always enable PHY on chip reset

Description FRLinux 2004-02-05 13:09:35 UTC
Distribution: Gentoo (same under LFS and Mandrake 10 Beta with kernel 2.6.2-rc3)
Hardware Environment: Asus a7v8x
Software Environment: Gentoo 1.4
Problem Description: kernel 2.6.2-rc3 and later kill my integrated
Broadcom card. To recover, you need to unplug the power, toggle the power
button to drain any remaining power from the motherboard, then plug the cable
back in. The bug wasn't in 2.6.2-rc2 (from which I'm currently filing this bug
report) but is definitely in 2.6.2-rc3 and 2.6.2 final.

Here's my hardware spec:
00:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 01)

Here's the error message:
b44.c:v0.92 (Nov 4, 2003)
eth0: Broadcom 4400 10/100BaseT Ethernet 00:e0:18:b7:0b:11
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
b44: eth0: BUG!  Timeout waiting for bit 80000000 of register 428 to clear.
request_module: failed /sbin/modprobe -- net-pf-10. error = 256
b44: eth0: Link is down.
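
For readers unfamiliar with this kind of message: it is what a driver typically prints when it polls a hardware status bit and gives up. Below is a minimal user-space sketch of that pattern, not the b44 source. The names fake_nic, read_reg and wait_bit_clear are hypothetical, the register file is simulated with an array, and the sketch assumes the register offset in the message (428) is printed in hex like the bit mask (0x80000000, i.e. bit 31).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BUSY_BIT UINT32_C(0x80000000)   /* bit 31, as in the log message */

/* Hypothetical stand-in for a memory-mapped register file. A real driver
 * would read the device through MMIO instead of an array. */
struct fake_nic {
    uint32_t regs[0x1000 / 4];
    unsigned reads_until_clear;  /* 0 means the bit never clears (stuck chip) */
};

/* Simulated register read: the "hardware" drops BUSY_BIT on the Nth read. */
static uint32_t read_reg(struct fake_nic *nic, unsigned long reg)
{
    if (nic->reads_until_clear > 0 && --nic->reads_until_clear == 0)
        nic->regs[reg / 4] &= ~BUSY_BIT;
    return nic->regs[reg / 4];
}

/* Poll `reg` up to `timeout` times until `bit` reads as 0.
 * Returns 0 on success, -1 (after logging) if the bit never clears. */
static int wait_bit_clear(struct fake_nic *nic, unsigned long reg,
                          uint32_t bit, unsigned long timeout)
{
    unsigned long i;

    assert(bit != 0);  /* waiting on an empty mask would always "succeed" */

    for (i = 0; i < timeout; i++) {
        if (!(read_reg(nic, reg) & bit))
            return 0;
        /* a real driver would insert a short delay between reads here */
    }

    fprintf(stderr,
            "BUG!  Timeout waiting for bit %08x of register %lx to clear.\n",
            (unsigned int)bit, reg);
    return -1;
}
```

With reads_until_clear set to a small value the wait returns 0; with it left at 0 the bit stays set, every poll fails, and the function logs the timeout and returns -1, which corresponds to the repeated lines above.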

And here's someone who hit the same problem:
http://www.spinics.net/lists/kernel/msg239176.html

Steps to reproduce:
Install kernel 2.6.2-rc3 or newer
reboot
watch your logs
no network ...
Comment 1 FRLinux 2004-02-07 08:32:48 UTC
I installed 2.6.2-mm1 today and that one works properly. I have no idea what
was fixed, as the changelog doesn't mention any fixes to the Broadcom
driver.

Steph
Comment 2 Pekka Pietikainen 2004-02-23 14:43:46 UTC
Hi

I was finally able to reproduce the bug. On my box it only happens if you
load b44 after you have loaded the Broadcom bcm4400 driver (and b44 works
again after unplugging the power).

An easier way to "fix" things is to add a return LM_STATUS_FAILURE around line
417 of b44lm.c (before b44_LM_SetMacAddress()), then load and unload the
bcm4400 driver; after that, b44 works just fine again :-)

I've been trying to pinpoint the reason why this happens and why the "fix" works
(there seem to be some clues in the Broadcom changelog, which says that
accessing certain registers in some situations makes things break), but so far
I haven't figured out what it is exactly.

Datapoints gladly accepted.
Comment 3 Andy Schofield 2004-04-29 02:56:49 UTC
I too have this bug in kernel 2.6.5 (I have the b44 compiled into the kernel).

Steps to reproduce:
(1) Install the latest drivers for the Broadcom 4401 under Windows XP. (I used
the HP service paq for my computer, an HP tc1100 tablet PC.)
(2) Boot into windows XP and then restart into linux
(3) Symptoms as reported in the bug report. The card is recognized (LEDs lit
etc) but when trying to bring up the interface the card dies with the bug and
the card LEDs go out.

Work around:
Always do a cold boot into linux.
(Or roll back to the previous Windows driver, I guess, though I haven't tried
this.)
Comment 4 Pekka Pietikainen 2004-05-15 11:39:31 UTC
Created attachment 2873 [details]
Always enable PHY on chip reset
Comment 5 Pekka Pietikainen 2004-05-15 11:41:15 UTC
Please test the attached patch, which should fix the problem...
Comment 6 FRLinux 2004-05-18 17:01:05 UTC
Nice one. I'll test that on Mandrake 10.0 Official; I just recompiled a
2.6.4-ck2 on it and got the same behavior again. I'll keep you posted.

Steph
Comment 7 FRLinux 2004-05-20 18:45:45 UTC
Had the bug again and applied the patch on a 2.6.4-ck2 kernel. It applied
cleanly; I recompiled the modules and they loaded successfully!
 
b44.c:v0.95 (May 15, 2004) 
PCI: Found IRQ 4 for device 0000:00:09.0 
eth1: Broadcom 4400 10/100BaseT Ethernet 00:e0:18:b7:0b:11 
b44: eth1: Link is down. 
b44: eth1: Link is up at 100 Mbps, full duplex. 
b44: eth1: Flow control is off for TX and off for RX. 
eth1: no IPv6 routers present 
Disabled Privacy Extensions on device c0397e00(lo) 
b44: eth1: Link is up at 100 Mbps, full duplex. 
b44: eth1: Flow control is off for TX and off for RX. 
NETDEV WATCHDOG: eth1: transmit timed out 
b44: eth1: transmit timed out, resetting 
b44: eth1: Link is down. 
eth1: no IPv6 routers present 
b44: eth1: Link is up at 100 Mbps, full duplex. 
b44: eth1: Flow control is off for TX and off for RX. 
 
Thanks a lot ! 
Steph