Bug 1494

Summary: machine locks up while accessing NIC
Product: Platform Specific/Hardware
Reporter: Christian Kujau (kernel)
Component: PPC-32
Assignee: Tom Rini (trini)
Status: CLOSED CODE_FIX
Severity: normal
CC: bzkernel, paulus
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.5.72...2.6-current
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  i posted the bug to lkml some time ago, but nobody cared :-(
  changing 3 lines in arch/ppc/platforms/prep_pci.c does solve the issue.

Description Christian Kujau 2003-11-05 12:26:28 UTC
Distribution: Debian/GNU Linux (powerpc)

Hardware Environment: PReP Utah (Powerstack II Pro4000)

Software Environment:
Gnu C                  3.3.2
Gnu make               3.80
util-linux             2.12
mount                  2.12
module-init-tools      0.9.15-pre3
e2fsprogs              1.35-WIP
jfsutils               1.1.3
nfs-utils              1.0.6
Linux C Library        2.3.2
Dynamic linker (ldd)   2.3.2
Procps                 3.1.14
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               5.0.91

Problem Description:
see http://marc.theaimsgroup.com/?l=linux-kernel&m=106735412521952&w=2

2 NICs, both working with 2.4.xx. under 2.6.0-test{6...9} the machine locks up
as soon as one of the NICs is configured.

Steps to reproduce:

"modprobe tulip" or "modprobe 3c59x" is ok, and even "ifconfig eth0" then shows
an (unconfigured) eth0. but configuring the interface locks the machine up
(no oops msg).
Comment 1 Christian Kujau 2003-11-05 12:27:57 UTC
Created attachment 1365 [details]
i posted the bug to lkml some time ago, but nobody cared :-(
Comment 2 Martin J. Bligh 2003-11-05 12:48:22 UTC
Try alt+sysrq+t ... if that doesn't work, try nmi_watchdog.
Google for info on how to use them ;-)
Comment 3 Sven Hartge 2004-01-25 17:01:55 UTC
Just to bring this bug entry up to date:

2.6.1 and 2.6.2-rc1 both show the same problem: loading tulip.ko or de4x5.ko
works fine, but when you try to configure the interface with ifconfig, the box
freezes hard; even sysrq does _not_ work.

And since the kdbg is a little bit out of order, I am not really able to provide
any more input.

Hopefully the patch from http://stop.crashing.org:16080/~trini/ for kdbg will
shed some light onto the problem, if I am able to resolve the conflicts.
Comment 4 Christian Kujau 2004-12-03 07:03:19 UTC
i've checked again with a recent 2.6-BK kernel and the issue is still unresolved:

the PReP i have does not have a hard disk any more, so nfsroot is the way to
go here. but this is the very problem: networking has not been working on PReP
since > 2.5.30, and i am only able to get netconsole running; sometimes a ping
will get through to the machine. as stated earlier, the PReP has an on-board
NIC, and de4x5 or tulip did fine with 2.4 kernels. i can boot off the
(openfirmware?) boot prompt, the kernel gets loaded, then i can try different
drivers for the NIC:

http://nerdbynature.de/bits/sheep/2.6.10-rc2/3c59x-boot.log
http://nerdbynature.de/bits/sheep/2.6.10-rc2/8139too-boot.log
-> more to come: http://nerdbynature.de/bits/sheep/2.6.10-rc2/

then the nfsroot is supposed to be mounted, but the log says
 "mount: server 192.168.10.10 not responding"
192.168.10.10 *would* respond, but it does not even receive any requests
(tcpdump told me...) and i am not able to ping the machine. the only time i was
able to ping the machine was with the 8139too driver, but reply-time went from
130ms to 32202ms...

i've compiled the kernels with gcc-3.4.2 and binutils-2.15 in an i386
cross-compile environment.

if anybody cares, tell me if other information is needed / new patches are
available to address this issue.

thanks,
Christian.
Comment 5 Christian Kujau 2004-12-03 07:07:59 UTC
oh, just to fulfill an old request from martin (yeah, i'm really late):

> Try alt+sysrq+t ... if that doesn't work, try nmi_watchdog.
> Google for info on how to use them ;-)

http://nerdbynature.de/bits/sheep/2.6.10-rc2/3c59x-sysrq+t.log

thanks,
Christian.
Comment 6 Christian Kujau 2004-12-05 14:25:17 UTC
Created attachment 4235 [details]
changing 3 lines in arch/ppc/platforms/prep_pci.c does solve the issue.

Sebastian Heutling gave me the following hint:
> It turned out that the pci slot numbering has changed
> sometime and this wasn't reflected in arch/ppc/platforms/prep_pci.c.
> After having set up the slot0...slot8 using the values of
> slot10...slot18 (except for slot1 which got value 4 so IDE is usable out
> of the box), the machine booted a 2.6 kernel (2.6.8).

i only had to create an appropriate patch (attached) and now everything works.
it's been a long time since 2.5.30....

now the patch has to reach mainline somehow to close this bug. (hint, hint!)

thanks to all involved,
Christian.
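
For readers who want to picture the renumbering described in Sebastian's hint,
here is a minimal sketch in C. It is not the attached patch and not the code in
arch/ppc/platforms/prep_pci.c: the table size, the name old_routes, the function
remap_slots and every routing value are hypothetical placeholders. Only the
rule itself (slot0...slot8 take the values of slot10...slot18, except slot1,
which gets the value 4) comes from the hint above.

```c
#include <stddef.h>

/* Hypothetical table size; the real size is not given in this report. */
#define NUM_SLOTS 21

/* Value the hint says slot 1 received so that IDE works out of the box. */
#define SLOT1_VALUE 4

/*
 * Made-up routing values in the OLD numbering, where the populated
 * entries sit at indices 10..18. These are placeholders, not the
 * contents of any real PReP routing table.
 */
const unsigned char old_routes[NUM_SLOTS] = {
    [10] = 1, [11] = 2, [12] = 3, [13] = 4, [14] = 1,
    [15] = 2, [16] = 3, [17] = 4, [18] = 1,
};

/* Build a table for the new numbering from one in the old numbering. */
void remap_slots(const unsigned char *old_tab, unsigned char *new_tab, size_t n)
{
    size_t slot;

    /* Start from a copy; the hint does not say the old entries were removed. */
    for (slot = 0; slot < n; slot++)
        new_tab[slot] = old_tab[slot];

    /* slot0..slot8 take the values of slot10..slot18. */
    for (slot = 0; slot <= 8 && slot + 10 < n; slot++)
        new_tab[slot] = old_tab[slot + 10];

    /* Exception from the hint: slot1 gets the value 4. */
    if (n > 1)
        new_tab[1] = SLOT1_VALUE;
}
```

Calling remap_slots(old_routes, out, NUM_SLOTS) with an output array of
NUM_SLOTS bytes fills out[0]...out[8] from the old entries 10...18 and then
overrides out[1]. Whether the real fix keeps or clears the old entries is not
stated in this report; the sketch simply leaves them in place.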
Comment 7 Tom Rini 2004-12-07 15:14:39 UTC
Linus has applied this patch, closing.