Bug 7808

Summary: Xilinx ML403 does not boot
Product: Platform Specific/Hardware
Reporter: Bernd Sowislo (BSowislo)
Component: PPC-32
Assignee: Grant Likely (grant.likely)
Status: CLOSED INVALID
Severity: blocking
CC: grant.likely
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.19.1
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
the correct version
with weak statement

Description Bernd Sowislo 2007-01-11 00:45:59 UTC
Most recent kernel where this bug did *NOT* occur:
I only tried out 2.6.19.1
Distribution:
kernel.org
Hardware Environment:
Xilinx ML403 board
Software Environment:
PPC crosscompiler on Centos linux i386
Problem Description:

A random number (most likely 0x00000000)
is displayed as the RAM size on the boot console:

Now booting the kernel

loaded at:     00400000 004DF13C
board data at: C01D16F8 C01D1710
relocated to:  00404040 00404058
zimage at:     00404E39 004DC857
avail ram:     004E0000 00000000

Linux/PPC load: console=ttyS0,9600
Uncompressing Linux...oops... out of memory
pause
done.
Now booting the kernel
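
The "avail ram" line in the log gives a start address of 004E0000 and an end address of 00000000, so the window is empty. The following is a minimal sketch of why an empty window would lead to "out of memory", assuming the boot wrapper hands out decompression buffers with a simple bump allocator over that window. The names ram_window_t and window_alloc are hypothetical and are not taken from the kernel source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the window printed as "avail ram: <start> <end>". */
typedef struct {
    uintptr_t avail_ram;  /* next free address */
    uintptr_t end_avail;  /* first address past the usable RAM */
} ram_window_t;

/* Bump allocator: returns the start of the new block, or 0 when the
 * request does not fit between avail_ram and end_avail. */
uintptr_t window_alloc(ram_window_t *w, size_t size)
{
    size = (size + 7) & ~(size_t)7;  /* round up to a multiple of 8 */

    if (w->avail_ram > w->end_avail ||
        size > w->end_avail - w->avail_ram)
        return 0;                    /* "oops... out of memory" case */

    uintptr_t block = w->avail_ram;
    w->avail_ram += size;
    return block;
}
```

With the values from the failing log (start 0x004E0000, end 0x00000000) every request fails. With an end address of 0x04000000 (64 MiB) the same request succeeds.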

reason:
board info is not returned from call to embed_config() in
arch/ppc/boot/simple/embed_config.c
caller: routine arch/ppc/boot/simple/misc_embedded.c, load_kernel()

first solution:
I removed the weak declaration of embed_config() in
arch/ppc/boot/simple/misc_embedded.c to force
the use of the procedure defined within 
arch/ppc/boot/simple/embed_config.c
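
For readers unfamiliar with weak symbols, here is a minimal sketch of the mechanism, using GCC's weak attribute. The names bd_sketch_t, board_config and probe_ram_size are hypothetical stand-ins, not the kernel's own code. A weak definition is only a fallback: when another object file supplies a normal (strong) definition of the same function, the linker is expected to pick the strong one. If the empty weak stub is the one that runs, the caller's pointer is never filled in.

```c
#include <stddef.h>

/* Hypothetical stand-in for the board information structure. */
typedef struct {
    unsigned long mem_size;
} bd_sketch_t;

/* Weak fallback that does nothing. A strong definition of
 * board_config() in another object file would replace it at link time. */
__attribute__((weak)) void board_config(bd_sketch_t **bdp)
{
    (void)bdp;  /* the caller's pointer is left untouched */
}

/* Returns the RAM size reported by whichever board_config() was linked,
 * or 0 when the pointer was never filled in. */
unsigned long probe_ram_size(void)
{
    bd_sketch_t *bp = NULL;

    board_config(&bp);
    return bp ? bp->mem_size : 0;
}
```

Built on its own, only the weak stub exists, so probe_ram_size() returns 0. Linking a second file that defines a non-weak board_config() which sets *bdp would change the result without touching this file.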

Steps to reproduce:
use the arch/ppc/configs/ml403_defconfig as .config
for creating the kernel
Comment 1 Andrew Morton 2007-01-11 00:57:20 UTC

Begin forwarded message:

Date: Thu, 11 Jan 2007 00:52:43 -0800
From: bugme-daemon@bugzilla.kernel.org
To: bugme-new@lists.osdl.org
Subject: [Bugme-new] [Bug 7808] New: Xilinx ML403 does not boot


http://bugzilla.kernel.org/show_bug.cgi?id=7808

           Summary: Xilinx ML403 does not boot
    Kernel Version: 2.6.19.1
            Status: NEW
          Severity: blocking
             Owner: platform_ppc-32@kernel-bugs.osdl.org
         Submitter: BSowislo@gmx.de



Comment 2 Grant Likely 2007-01-11 06:27:33 UTC
On 1/11/07, Andrew Morton <akpm@osdl.org> wrote:
>
> Begin forwarded message:
>
> Date: Thu, 11 Jan 2007 00:52:43 -0800
> From: bugme-daemon@bugzilla.kernel.org
> To: bugme-new@lists.osdl.org
> Subject: [Bugme-new] [Bug 7808] New: Xilinx ML403 does not boot

okay, I will investigate today

g.

Comment 3 Grant Likely 2007-01-11 21:52:28 UTC
On 1/11/07, Grant Likely <grant.likely@secretlab.ca> wrote:
> okay, I will investigate today

Hmmm...

Using the top of Linus' tree, I cannot reproduce this problem.  Could
this be something exposed by the compiler version?

Cross compiler: ELDK 4.0 on an amd64 Linux host.
Target: Xilinx EDK 8.1 reference design bitstream.

Console output:

loaded at:     00400000 004DC13C
board data at: 004DA124 004DA13C
relocated to:  0040408C 004040A4
zimage at:     00404E19 004D93A0
avail ram:     004DD000 04000000

Linux/PPC load: console=ttyS0,9600
Uncompressing Linux...done.
Now booting the kernel
[    0.000000] Linux version 2.6.20-rc4-g0404f87f
(grant@weasley-twins) (gcc version 4.0.0 (DENX ELDK 4.0 4.0.0)) #317
Thu Jan 11 14:28:05 MST 2007
[    0.000000] Xilinx ML403 Reference System (Virtex-4 FX)
[    0.000000] Zone PFN ranges:
...

Comment 4 Bernd Sowislo 2007-01-12 00:40:01 UTC
Hello,
crosstool:
gcc-4.1.0-glibc-2.3.6/powerpc-405-linux-gnu/bin/

I tried out Kernel 2.6.17.1, same problem
I use EDK8.2 + latest patches with
upgraded reference design ml403_emb_ppc_EDK8.1

There is another question:
the embed-config declares a static pointer bp
for the info structure
this pointer is used by the calling load_kernel
routine.
is a variable valid for the caller if it is declared
in the context of a subroutine?

where can I find the matching sysace,
TEMAC and Ethernet 10/100 sources?

best regards + thanks for your work
Bernd


Comment 5 Grant Likely 2007-01-12 08:35:30 UTC
> There is another question:
> the embed-config declares a static pointer bp
> for the info structure
> this pointer is used by the calling load_kernel
> routine.
> is a variable valid for the caller if it is declared
> in the context of a subroutine?

You mean this at line 28?

/* For those boards that don't provide one.
*/
#if !defined(CONFIG_MBX)
static  bd_t    bdinfo;
#endif

Yes, this is okay to pass around.
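
A minimal sketch of the same pattern, with hypothetical names (board_info_t, get_board_info) and made-up values rather than the kernel's bd_t: a static object at file scope has static storage duration, so it exists for the whole run of the program, and its address stays valid after the function that hands it out has returned. Only automatic (stack) locals become invalid on return.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's bd_t. */
typedef struct {
    unsigned long mem_size;  /* bytes of RAM */
    unsigned long bus_freq;  /* Hz */
} board_info_t;

/* File-scope static: a single instance that lives for the whole run. */
static board_info_t board_info;

/* Fills in the static structure and passes its address to the caller. */
void get_board_info(board_info_t **out)
{
    board_info.mem_size = 64UL * 1024 * 1024;  /* illustrative value */
    board_info.bus_freq = 100000000UL;         /* illustrative value */
    *out = &board_info;
}

/* Caller side: the pointer is used after get_board_info() has returned. */
unsigned long ram_size_seen_by_caller(void)
{
    board_info_t *bp = NULL;

    get_board_info(&bp);
    return bp->mem_size;
}
```

The static keyword here only limits the name board_info to this source file; it does not limit the lifetime of the object.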

...

Let's get one possible issue out of the way: You said you used
ml403_defconfig as .config.  Did you do a 'make ml403_defconfig', or
did you just copy the file?  If you just copied the file, did you do a
'make oldconfig' before building?  If you didn't do either 'make
ml403_defconfig' or 'cp; make oldconfig' then you will probably have
config problems.

...

Try this: get an object dump of zImage and see where embed_config is
referenced.  Below are the relevant sections from my image.

The interesting bits are there are 2 definitions for embed_config (1
weak) in my zImage; but it is easy to see that the call to
embed_config is linked to the strong (correct) one at address 4005fc.
The weak one is at 400158.

What do you see?

$ ppc_4xx-objdump -dS arch/ppc/boot/images/zImage.elf | grep embed_config -C 5
  40014c:       39 20 00 00     li      r9,0
  400150:       7d 28 03 a6     mtlr    r9
  400154:       4e 80 00 20     blr
 */
void __attribute__ ((weak))
embed_config(bd_t **bdp)
{
}
  400158:       4e 80 00 20     blr

0040015c <load_kernel>:
--
        unsigned long initrd_size;

        /* First, capture the embedded board information.  Then
         * initialize the serial console port.
         */
        embed_config(&bp);
  400170:       38 61 00 18     addi    r3,r1,24
  400174:       90 01 00 44     stw     r0,68(r1)
  400178:       90 c1 00 18     stw     r6,24(r1)
  40017c:       48 00 04 81     bl      4005fc <embed_config>
#if defined(CONFIG_SERIAL_CPM_CONSOLE) || defined(CONFIG_SERIAL_8250_CONSOLE)
        com_port = serial_init(0, bp);
  400180:       80 81 00 18     lwz     r4,24(r1)
  400184:       38 60 00 00     li      r3,0
  400188:       48 00 0f b5     bl      40113c <serial_init>
--
  4005ec:       bb 01 00 20     lmw     r24,32(r1)
  4005f0:       7c 08 03 a6     mtlr    r0
  4005f4:       38 21 00 40     addi    r1,r1,64
  4005f8:       4e 80 00 20     blr

004005fc <embed_config>:
         * - If the data cache is turned on this must have been done by
         *   a bootloader and we assume that the cache contents are
         *   valid.
         */
        __asm__("mfdccr %0": "=r" (dccr));
  4005fc:       7c 1a fa a6     mfdccr  r0
        if (dccr == 0) {
  400600:       2f 80 00 00     cmpwi   cr7,r0,0
  400604:       40 9e 00 1c     bne-    cr7,400620 <embed_config+0x24>
  400608:       38 00 01 00     li      r0,256
  40060c:       7c 09 03 a6     mtctr   r0
  400610:       39 20 00 00     li      r9,0
                for (addr = 0;
                     addr < (congruence_classes * line_size);
                     addr += line_size) {
                        __asm__("dccci 0,%0": :"b"(addr));
  400614:       7c 00 4b 8c     dccci   r0,r9
  400618:       39 29 00 20     addi    r9,r9,32
  40061c:       42 00 ff f8     bdnz+   400614 <embed_config+0x18>
                }
        }

        bd = &bdinfo;
        *bdp = bd;
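
To check by hand which embed_config a call lands on, the target of a bl instruction can be computed from its encoding. The sketch below decodes the PowerPC I-form branch (primary opcode 18: b, bl, ba, bla); the function names are hypothetical helpers written for this note, not part of binutils.

```c
#include <stdint.h>

/* Returns 1 when insn is an I-form branch (primary opcode 18). */
int ppc_is_iform_branch(uint32_t insn)
{
    return (insn >> 26) == 18;
}

/* Computes the branch target of an I-form branch located at insn_addr. */
uint32_t ppc_branch_target(uint32_t insn_addr, uint32_t insn)
{
    /* LI field with its two implied low zero bits: a 26-bit signed value. */
    int32_t disp = (int32_t)(insn & 0x03FFFFFCu);

    if (disp & 0x02000000)
        disp -= 0x04000000;          /* sign-extend from 26 bits */

    if ((insn >> 1) & 1)             /* AA bit set: absolute address */
        return (uint32_t)disp;

    return insn_addr + (uint32_t)disp;  /* relative to the branch itself */
}
```

For the call in the listing above, the word 48 00 04 81 at address 40017c decodes to 0x40017c + 0x480 = 0x4005fc, the strong embed_config. A call that resolved to the weak stub would decode to 0x400158 instead.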

Comment 6 Bernd Sowislo 2007-01-12 09:00:51 UTC
Hello

here is a quick first answer to your questions

that's right: line 28
it's static, but I think it is only valid within
the subroutine; after leaving the subroutine,
could it be overwritten?

I copied the example config for ml403 to .config
I also modified the original .config
for my options, always the same effect

I swapped the serial init call with
the call to the subroutine and added some print
statements inside the subroutine.
With the weak statement in place I had no output
on screen; with it commented out I had output.
So I am sure that, with the weak statement, execution
does not go through the subroutine that uses
the declaration at line 28.

best regards
Bernd


Comment 7 Bernd Sowislo 2007-01-13 07:09:26 UTC
Created attachment 10067 [details]
the correct version
Comment 8 Bernd Sowislo 2007-01-13 07:10:04 UTC
Created attachment 10068 [details]
with weak statement
Comment 9 Grant Likely 2007-01-13 10:13:25 UTC
I believe this is a bug in gcc 4.1.0; see link below.

Upgrading gcc should solve your problem

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27781
Comment 10 Grant Likely 2007-01-14 06:41:40 UTC
The reporter changed the version of gcc he was using and confirmed that this
solves the problem.