Bug 7808 - Xilinx ML403 does not boot
Summary: Xilinx ML403 does not boot
Status: CLOSED INVALID
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: PPC-32
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Grant Likely
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-01-11 00:45 UTC by Bernd Sowislo
Modified: 2007-01-14 06:41 UTC
CC List: 1 user

See Also:
Kernel Version: 2.6.19.1
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
the correct version (4.24 KB, text/plain)
2007-01-13 07:09 UTC, Bernd Sowislo
with weak statement (4.07 KB, text/plain)
2007-01-13 07:10 UTC, Bernd Sowislo

Description Bernd Sowislo 2007-01-11 00:45:59 UTC
Most recent kernel where this bug did *NOT* occur:
I only tried out 2.6.19.1
Distribution:
kernel.org
Hardware Environment:
Xilinx ML403 board
Software Environment:
PPC crosscompiler on Centos linux i386
Problem Description:

A random number (most likely 0x00000000)
is displayed as the RAM size on the boot console:

Now booting the kernel

loaded at:     00400000 004DF13C
board data at: C01D16F8 C01D1710
relocated to:  00404040 00404058
zimage at:     00404E39 004DC857
avail ram:     004E0000 00000000

Linux/PPC load: console=ttyS0,9600
Uncompressing Linux...oops... out of memory
pause
done.
Now booting the kernel

Reason:
The board info is not returned from the call to embed_config() in
arch/ppc/boot/simple/embed_config.c.
Caller: load_kernel() in arch/ppc/boot/simple/misc_embedded.c

First solution:
I removed the weak definition of embed_config() in
arch/ppc/boot/simple/misc_embedded.c to force
the use of the function defined in
arch/ppc/boot/simple/embed_config.c

Steps to reproduce:
use the arch/ppc/configs/ml403_defconfig as .config
for creating the kernel
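
To illustrate the weak/strong mechanism the report refers to, here is a minimal sketch, not the kernel code. The names board_info_t, board_config() and load_board_info() are hypothetical stand-ins for bd_t, embed_config() and load_kernel(), and the sketch assumes GCC or a compatible compiler that accepts __attribute__((weak)).

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's bd_t board-info structure. */
typedef struct {
    unsigned long memsize;
} board_info_t;

/*
 * Weak default: an empty stub, like the one in the report.
 * If another object file in the link provides a non-weak
 * board_config(), the linker is expected to use that one instead.
 */
void __attribute__((weak)) board_config(board_info_t **bdp)
{
    (void)bdp; /* the stub deliberately leaves *bdp untouched */
}

/* Caller: returns whatever pointer the hook left behind. */
board_info_t *load_board_info(void)
{
    board_info_t *bp = NULL;

    board_config(&bp);
    return bp;
}
```

When only this file is linked, the empty stub runs and load_board_info() returns NULL. That is the kind of result to expect whenever the stub is called in place of the board-specific function: the caller's pointer is left exactly as it was.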
Comment 1 Andrew Morton 2007-01-11 00:57:20 UTC

Begin forwarded message:

Date: Thu, 11 Jan 2007 00:52:43 -0800
From: bugme-daemon@bugzilla.kernel.org
To: bugme-new@lists.osdl.org
Subject: [Bugme-new] [Bug 7808] New: Xilinx ML403 does not boot


http://bugzilla.kernel.org/show_bug.cgi?id=7808

           Summary: Xilinx ML403 does not boot
    Kernel Version: 2.6.19.1
            Status: NEW
          Severity: blocking
             Owner: platform_ppc-32@kernel-bugs.osdl.org
         Submitter: BSowislo@gmx.de




Comment 2 Grant Likely 2007-01-11 06:27:33 UTC
On 1/11/07, Andrew Morton <akpm@osdl.org> wrote:
>
> Begin forwarded message:
>
> Date: Thu, 11 Jan 2007 00:52:43 -0800
> From: bugme-daemon@bugzilla.kernel.org
> To: bugme-new@lists.osdl.org
> Subject: [Bugme-new] [Bug 7808] New: Xilinx ML403 does not boot

okay, I will investigate today

g.

Comment 3 Grant Likely 2007-01-11 21:52:28 UTC
On 1/11/07, Grant Likely <grant.likely@secretlab.ca> wrote:
> okay, I will investigate today

Hmmm...

Using the top of Linus' tree, I cannot reproduce this problem.  Could
this be something exposed by the compiler version?

Cross compiler: ELDK 4.0 on an amd64 Linux host.
Target: Xilinx EDK 8.1 reference design bitstream.

Console output:

loaded at:     00400000 004DC13C
board data at: 004DA124 004DA13C
relocated to:  0040408C 004040A4
zimage at:     00404E19 004D93A0
avail ram:     004DD000 04000000

Linux/PPC load: console=ttyS0,9600
Uncompressing Linux...done.
Now booting the kernel
[    0.000000] Linux version 2.6.20-rc4-g0404f87f
(grant@weasley-twins) (gcc version 4.0.0 (DENX ELDK 4.0 4.0.0)) #317
Thu Jan 11 14:28:05 MST 2007
[    0.000000] Xilinx ML403 Reference System (Virtex-4 FX)
[    0.000000] Zone PFN ranges:
...
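
The working and failing boot logs differ mainly in the second column of the "avail ram" line. As a small worked example, and assuming that column is the end of available RAM (which is how the reporter's "ramsize" reading suggests it is used), the usable range can be computed as below. avail_ram_bytes() is a hypothetical helper, not a function from the boot wrapper.

```c
#include <stdint.h>

/*
 * Bytes usable between start and end.
 * Returns 0 when the range is empty or inverted (end <= start).
 */
uint32_t avail_ram_bytes(uint32_t start, uint32_t end)
{
    return (end > start) ? (end - start) : 0;
}
```

With the values from the working log (004DD000, 04000000) this gives 0x03B23000 bytes. With the values from the failing log (004E0000, 00000000) it gives 0, which would be consistent with the "out of memory" message.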

Comment 4 Bernd Sowislo 2007-01-12 00:40:01 UTC
Hello,
crosstool:
gcc-4.1.0-glibc-2.3.6/powerpc-405-linux-gnu/bin/

I tried kernel 2.6.17.1: same problem.
I use EDK 8.2 plus the latest patches, with the
upgraded reference design ml403_emb_ppc_EDK8.1

There is another question:
the embed-config declares a static pointer bp
for the info structure
this pointer is used by the calling load_kernel
routine.
is a variable valid for the caller if it is declared
in the context of a subroutine?

where can I find the matching sysace,
TEMAC and Ethernet 10/100 sources?

best regards + thanks for your work
Bernd


Comment 5 Grant Likely 2007-01-12 08:35:30 UTC
> There is another question:
> the embed-config declares a static pointer bp
> for the info structure
> this pointer is used by the calling load_kernel
> routine.
> is a variable valid for the caller if it is declared
> in the context of a subroutine?

You mean this at line 28?

/* For those boards that don't provide one.
*/
#if !defined(CONFIG_MBX)
static  bd_t    bdinfo;
#endif

Yes, this is okay to pass around.
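
A minimal sketch of why this is safe, using hypothetical names (bd_sketch_t, fill_board_info) instead of the real bd_t and embed_config(): an object declared static at file scope has static storage duration in C, so it exists for the whole run of the program and its address stays valid after the function that hands it out has returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's bd_t board-info structure. */
typedef struct {
    unsigned long memsize;
} bd_sketch_t;

/* File-scope static: lives for the whole program, visible only in this file. */
static bd_sketch_t bdinfo;

/* Fills in the static structure and returns its address through bdp. */
void fill_board_info(bd_sketch_t **bdp)
{
    assert(bdp != NULL);

    bdinfo.memsize = 0x04000000UL; /* illustrative value only */
    *bdp = &bdinfo;
}
```

The keyword only limits the name bdinfo to this source file; it does not limit the lifetime of the object. Handing out the address of a non-static local variable would be a different matter, because that object ends when the function returns.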

...

Let's get one possible issue out of the way: You said you used
ml403_defconfig as .config.  Did you do a 'make ml403_defconfig', or
did you just copy the file?  If you just copied the file, did you do a
'make oldconfig' before building?  If you didn't do either 'make
ml403_defconfig' or 'cp; make oldconfig' then you will probably have
config problems.

...

Try this: get an object dump of zImage and see where embed_config is
referenced.  Below are the relevant sections from my image.

The interesting bit is that there are 2 definitions for embed_config (1
weak) in my zImage, but it is easy to see that the call to
embed_config is linked to the strong (correct) one at address 4005fc.
The weak one is at 400158.

What do you see?

$ ppc_4xx-objdump -dS arch/ppc/boot/images/zImage.elf | grep embed_config -C 5
  40014c:       39 20 00 00     li      r9,0
  400150:       7d 28 03 a6     mtlr    r9
  400154:       4e 80 00 20     blr
 */
void __attribute__ ((weak))
embed_config(bd_t **bdp)
{
}
  400158:       4e 80 00 20     blr

0040015c <load_kernel>:
--
        unsigned long initrd_size;

        /* First, capture the embedded board information.  Then
         * initialize the serial console port.
         */
        embed_config(&bp);
  400170:       38 61 00 18     addi    r3,r1,24
  400174:       90 01 00 44     stw     r0,68(r1)
  400178:       90 c1 00 18     stw     r6,24(r1)
  40017c:       48 00 04 81     bl      4005fc <embed_config>
#if defined(CONFIG_SERIAL_CPM_CONSOLE) || defined(CONFIG_SERIAL_8250_CONSOLE)
        com_port = serial_init(0, bp);
  400180:       80 81 00 18     lwz     r4,24(r1)
  400184:       38 60 00 00     li      r3,0
  400188:       48 00 0f b5     bl      40113c <serial_init>
--
  4005ec:       bb 01 00 20     lmw     r24,32(r1)
  4005f0:       7c 08 03 a6     mtlr    r0
  4005f4:       38 21 00 40     addi    r1,r1,64
  4005f8:       4e 80 00 20     blr

004005fc <embed_config>:
         * - If the data cache is turned on this must have been done by
         *   a bootloader and we assume that the cache contents are
         *   valid.
         */
        __asm__("mfdccr %0": "=r" (dccr));
  4005fc:       7c 1a fa a6     mfdccr  r0
        if (dccr == 0) {
  400600:       2f 80 00 00     cmpwi   cr7,r0,0
  400604:       40 9e 00 1c     bne-    cr7,400620 <embed_config+0x24>
  400608:       38 00 01 00     li      r0,256
  40060c:       7c 09 03 a6     mtctr   r0
  400610:       39 20 00 00     li      r9,0
                for (addr = 0;
                     addr < (congruence_classes * line_size);
                     addr += line_size) {
                        __asm__("dccci 0,%0": :"b"(addr));
  400614:       7c 00 4b 8c     dccci   r0,r9
  400618:       39 29 00 20     addi    r9,r9,32
  40061c:       42 00 ff f8     bdnz+   400614 <embed_config+0x18>
                }
        }

        bd = &bdinfo;
        *bdp = bd;

Comment 6 Bernd Sowislo 2007-01-12 09:00:51 UTC
Hello

Here is a quick first answer to your questions.

That's right: line 28.
It's static, but I think it is only valid within
the subroutine; after leaving the subroutine
it could be overwritten?

I copied the example config for ml403 to .config.
I also modified the original .config
for my options; always the same effect.

I swapped the serial init call with
the call to the subroutine and added some print statements
in the subroutine.
With the weak statement I had no output
on screen; with it commented out I had output.
So I was sure the code did not go through the subroutine
with the statement in line 28.

best regards
Bernd


Comment 7 Bernd Sowislo 2007-01-13 07:09:26 UTC
Created attachment 10067 [details]
the correct version
Comment 8 Bernd Sowislo 2007-01-13 07:10:04 UTC
Created attachment 10068 [details]
with weak statement
Comment 9 Grant Likely 2007-01-13 10:13:25 UTC
I believe this is a bug in gcc 4.1.0; see link below.

Upgrading gcc should solve your problem.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27781
Comment 10 Grant Likely 2007-01-14 06:41:40 UTC
The reporter changed the version of gcc he was using and confirmed that this
solves the problem.
