Bug 15392

Summary: [bisected] The kernel does not start up.
Product: Platform Specific/Hardware
Reporter: Kristóf Ralovich (kristof.ralovich)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: CLOSED INVALID
Severity: normal
CC: akpm, florian, hjl.tools, hpa, kristof.ralovich, rjw, suresh.b.siddha
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 2.6.33
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 14885
Attachments:
  This is the configuration file I compiled the kernel with.
  the .config for 2.6.33-rc1 (should be the same as for 2.6.33)
  System.map
  System.map
  dmesg for 2.6.32-good

Description Kristóf Ralovich 2010-02-25 06:52:03 UTC
After grub starts the kernel, the last line the kernel prints is "Decompressing linux..."; a few seconds after that, the machine reboots.

The machine is a ThinkPad T500 2089-A81.

The same issue was present with 2.6.33-rc8. The 2.6.32.y series works for me.
Comment 1 Kristóf Ralovich 2010-02-25 06:53:34 UTC
Created attachment 25206 [details]
This is the configuration file I compiled the kernel with.
Comment 2 Andrew Morton 2010-03-01 22:25:44 UTC
I'll reassign this to x86_64.  The problem could of course lie elsewhere, hard to tell.
Comment 3 Rafael J. Wysocki 2010-03-01 23:46:35 UTC
Can you test 2.6.33-rc1, please?
Comment 4 Kristóf Ralovich 2010-03-02 00:21:26 UTC
I will give it a try tonight after work.
Comment 5 Kristóf Ralovich 2010-03-02 02:40:54 UTC
I have checked with 2.6.33-rc1, the issue is the same as with 2.6.33.

I have checked again, and the last lines the kernel prints are the following (for both 2.6.33 and 2.6.33-rc1):

Decompressing Linux. Parsing ELF...done.
Booting the kernel.


At this point the output stops and the machine reboots in a few seconds.
Comment 6 Kristóf Ralovich 2010-03-02 02:41:48 UTC
Created attachment 25304 [details]
the .config for 2.6.33-rc1 (should be the same as for 2.6.33)
Comment 7 Kristóf Ralovich 2010-03-19 03:03:43 UTC
The issue is still there with 2.6.33.1.
Comment 8 Rafael J. Wysocki 2010-03-22 21:21:52 UTC
On Monday 22 March 2010, RALOVICH, Kristóf wrote:
> The issue is still there with 2.6.33-rc1, 2.6.33 and 2.6.33.1 too.
Comment 9 Kristóf Ralovich 2010-03-24 00:36:17 UTC
I took a look: I tested 2.6.33.1 with bz2 instead of LZMA compression, and the
same issue is still there.
Comment 10 Kristóf Ralovich 2010-04-04 16:20:19 UTC
The issue is still there with 2.6.33.2.
Comment 11 Florian Mickler 2010-04-09 08:12:48 UTC
what is your kernel commandline?
Comment 12 H. Peter Anvin 2010-04-09 18:34:09 UTC
Also, could you do a "git bisect" between 2.6.32 and 2.6.33, or if that is not possible try the 2.6.33 release candidates?
Comment 13 Kristóf Ralovich 2010-04-09 18:48:53 UTC
(In reply to comment #11)
> what is your kernel commandline?

I'll post it tonight when I get to that machine.
Comment 14 Kristóf Ralovich 2010-04-09 18:56:00 UTC
(In reply to comment #12)
> Also, could you do a "git bisect" between 2.6.32 and 2.6.33, or if that is
> not possible try the 2.6.33 release candidates?

I have tried 2.6.33-rc1 (see comment #5 and comment #6) and that already fails. I am afraid I currently don't have time to do further bisecting between 2.6.32 and 2.6.33-rc1.

Summary:

2.6.32 - works fine
2.6.32.y - works fine (stable tree)
2.6.33-rc1 - fails
2.6.33.1 - fails
2.6.33.2 - fails
2.6.34-rc3 - fails

My guess is that the regression occurred somewhere between 2.6.32 and 2.6.33-rc1.
Comment 15 H. Peter Anvin 2010-04-09 18:58:24 UTC
That's a merge window... without a git bisect there isn't much more to go on.
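
For readers unfamiliar with what a bisect does, here is a minimal sketch of the idea, assuming a linear list of commits where the first entry is known good and the last is known bad. The names `first_bad` and `is_bad` are hypothetical; the real "git bisect" also handles merges and non-linear history, which this sketch does not.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search a linear history for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1  # indexes of known-good / known-bad
    while bad - good > 1:
        mid = (good + bad) // 2
        # In practice is_bad() means: build this revision, boot it, report.
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]
```

Each step halves the remaining range, so even a merge window with thousands of commits needs only a dozen or so build-and-boot cycles.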
Comment 16 Florian Mickler 2010-04-09 19:39:04 UTC
Have you specified the panic=x parameter (reboot on panic) on your kernel command line? If so, leaving it out might let an error message show up instead of a reboot. Other than that, a git bisect might be the easiest way to get to the culprit.
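
A quick way to answer this is to look at the command line of a kernel that does boot (for example via /proc/cmdline). Below is a minimal sketch, assuming a simple space-separated command line without quoted values; `parse_cmdline` and `panic_setting` are hypothetical helper names, not part of any kernel tooling.

```python
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None.

    Simplification: quoted values containing spaces are not handled.
    """
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def panic_setting(cmdline: str) -> Optional[str]:
    """Return the value given to panic=, or None if it is not set."""
    return parse_cmdline(cmdline).get("panic")
```

On a running system the input could come from `open("/proc/cmdline").read()`; the same dictionary also shows whether "quiet" is present.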
Comment 17 Rafael J. Wysocki 2010-04-09 19:47:24 UTC
On Friday 09 April 2010, RALOVICH, Kristóf wrote:
> The issue still exists!
Comment 18 Kristóf Ralovich 2010-04-11 06:16:12 UTC
I have bisected the failure; the regression was introduced by the following commit:

74e081797bd9d2a7d8005fe519e719df343a2ba8

x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA
Comment 19 Florian Mickler 2010-04-11 15:33:31 UTC
did you double check that the kernel boots if you disable CONFIG_DEBUG_RODATA? (or check that reverting that commit on top of current git (if possible) boots?)

also you can put a '[bisected]'
Comment 20 Florian Mickler 2010-04-11 15:36:22 UTC
... in the bug title, if you verified that this is the culprit. 

i have added suresh siddha to the cc'list ..

cheers,
Flo

p.s. sorry for not finishing the last comment... somehow i fat-fingered my browser...
Comment 21 Kristóf Ralovich 2010-04-11 16:55:36 UTC
I compiled a 2.6.33.2 kernel with CONFIG_DEBUG_RODATA disabled, and it works for me. So this option has something to do with the kernel not starting up.
Comment 22 Suresh B Siddha 2010-04-12 18:12:23 UTC
Kristof, I am not able to reproduce the problem with your failing config. Perhaps some difference in the linker etc. is triggering the bug somewhere. Can you please provide your failing kernel's System.map, and also check whether earlyprintk=vga shows any messages on the console before it reboots? Thanks.
Comment 23 Kristóf Ralovich 2010-04-18 23:43:59 UTC
Created attachment 26045 [details]
System.map

A System.map from a failing build: this is for the kernel (74e081797bd9d2a7d8005fe519e719df343a2ba8) introducing the regression. Don't let the filename fool you; this is for 74e081797bd9d2a7d8005fe519e719df343a2ba8, not for 2.6.32-rc4!
Comment 24 Kristóf Ralovich 2010-04-19 01:16:09 UTC
Created attachment 26046 [details]
System.map

In comment #23 I was wrong about attachment (id=26045).

These are the correct System.map files, for the last good (b9af7c0d44b8bb71e3af5e94688d076414aa8c87) and first regressed (74e081797bd9d2a7d8005fe519e719df343a2ba8) revisions.
Comment 25 Suresh B Siddha 2010-04-19 23:09:02 UTC
Kristof, I didn't find anything wrong with the System.map-2.6.32-bad. No luck again today for me in reproducing the failure here. Also, fedora 13 kernels (using 2.6.33.x) etc have this CONFIG_DEBUG_RODATA enabled with no similar failure reports. There must be something unique on your setup that exposes this problem. Can you please help me get the below info, to evaluate my next steps.

a) Kernel log of your system for the successful boot case.

b) mark_rodata_ro() function gets called quite a bit later in the boot. If there is something wrong in mark_rodata_ro() code, then we should be seeing quite a few kernel messages before it crashes. Can you please remove "quiet" boot param and add something like "vga=6 earlyprintk=vga" and check if you see any kernel log for the failure case.

c) For the failing case, can you please apply just this change to static_protections() in arch/x86/mm/pageattr.c and see if it changes anything.

-#if defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
+#if 0 && defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
Comment 26 Kristóf Ralovich 2010-04-21 01:26:42 UTC
Created attachment 26074 [details]
dmesg for 2.6.32-good
Comment 27 Kristóf Ralovich 2010-04-21 01:50:45 UTC
(In reply to comment #25)
> Kristof, I didn't find anything wrong with the System.map-2.6.32-bad. No luck
> again today for me in reproducing the failure here. Also, fedora 13 kernels
> (using 2.6.33.x) etc have this CONFIG_DEBUG_RODATA enabled with no similar
> failure reports. There must be something unique on your setup that exposes
> this problem. Can you please help me get the below info, to evaluate my next
> steps.
> 
> a) Kernel log of your system for the successful boot case.

See previous comment for dmesg on the last good revision.

> 
> b) mark_rodata_ro() function gets called quite a bit later in the boot. If
> there is something wrong in mark_rodata_ro() code, then we should be seeing
> quite a few kernel messages before it crashes. Can you please remove "quiet"
> boot param and add something like "vga=6 earlyprintk=vga" and check if you
> see any kernel log for the failure case.

I do not have "quiet" on my kernel command line.

Adding "vga=6 earlyprintk=vga" changes the console resolution, but no additional text is printed beyond what I had originally reported.

> 
> c) For the failing case, can you please apply just this change to
> static_protections() in arch/x86/mm/pageattr.c and see if it changes
> anything.
> 
> -#if defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
> +#if 0 && defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)

I applied this on the first bad revision, but it did not help anything.
Comment 28 Suresh B Siddha 2010-04-21 20:26:34 UTC
Thanks Kristof. It is interesting that we don't see any kernel log messages in the failing case. Seems like we have some problem in grub loading the kernel itself. Can you please

a) comment out the mark_rodata_ro() call in init_post() in init/main.c and see if that changes anything. If the problem is with grub loading the kernel, this should also fail.

b) Can you please provide your failing vmlinux and bzImage so that I can try a few things here?

I am hoping to find the root cause in the next couple of days, otherwise I will post a patch which reverts that commit :(
Comment 29 Kristóf Ralovich 2010-04-22 00:30:39 UTC
(In reply to comment #28)
> Thanks Kristof. It is interesting that we don't see any kernel log messages
> in the failing case. Seems like we have some problem in grub loading the kernel
> itself. Can you please
> 
> a) comment the mark_rodata_ro() call in init_post() in init/main.c and see if
> that changes in anything. If the problem is with the grub loading it, this
> should also fail.

I commented that line, but the failure is still there.

> 
> b) Can you please provide your failing vmlinux and bzImage so that I can try
> few things here?

look here:
http://people.freedesktop.org/~tade/kernel_debugging/

> 
> I am hoping to find the root cause in the next couple of days, otherwise I
> will post a patch which reverts that commit :(

I am using standard debian grub:

ii  grub                                     0.97-47lenny2                            GRand Unified Bootloader (Legacy version)
ii  grub-common                              1.96+20080724-16                         GRand Unified Bootloader, version 2 (common files)
Comment 31 Suresh B Siddha 2010-04-22 21:17:07 UTC
Kristof, the problem you are seeing is a linker issue.

On my working kernel, if I do "readelf -l my-vmlinux" I see this for the first entry:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0xffffffff81000000 0x0000000001000000
                 0x00000000008ec000 0x00000000008ec000  R E    200000

And for your kernel, I see this:

  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0xffffffff81000000 0x0000000001000000
                 0x000000000056e000 0x000000000056e000  R E    1000

If you look at the last column, Align, mine has 2MB alignment and yours has 4K. I talked to HJ, and he says that when the linker uses 4K alignment, the kernel might have a problem if it uses 2MB alignment.

What linker (ld --version) are you using? Have you recompiled your linker, or perhaps the Debian linker is using 4K as the max-page-size? HJ says Debian might have two linkers (ld and ld.gold) and they might be using different page sizes.

I can reproduce your problem if I add this to arch/x86/Makefile for my kernel compilation:

LDFLAGS_vmlinux += -z max-page-size=0x1000

Can you check if the kernel boots if you add this to the arch/x86/Makefile:

LDFLAGS_vmlinux += -z max-page-size=0x200000

Also HJ pointed me to this ld.gold bug now http://sourceware.org/bugzilla/show_bug.cgi?id=11490

Depending on your results and my discussion with HJ, we can see if we can have a workaround in the kernel or we need to address this in the linker.
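
The Align comparison above can also be done without readelf. The following is a rough sketch only, assuming a little-endian ELF64 vmlinux; the helper names `load_segment_alignments` and `segments_below_alignment` are made up for illustration, and the 2MB threshold is taken from the working kernel's header shown in this comment.

```python
import struct
from typing import List, Tuple

PT_LOAD = 1          # program header type for loadable segments
TWO_MB = 0x200000    # alignment seen on the working kernel in this thread


def load_segment_alignments(data: bytes) -> List[Tuple[int, int]]:
    """Return (p_offset, p_align) for each PT_LOAD segment of an ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    segments = []
    for i in range(e_phnum):
        # Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align.
        fields = struct.unpack_from("<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        if fields[0] == PT_LOAD:
            segments.append((fields[2], fields[7]))
    return segments


def segments_below_alignment(segments: List[Tuple[int, int]],
                             required: int = TWO_MB) -> List[Tuple[int, int]]:
    """Return the segments whose p_align is smaller than `required`."""
    return [seg for seg in segments if seg[1] < required]
```

Feeding it the bytes of a vmlinux (`open("vmlinux", "rb").read()`) and getting a non-empty list from `segments_below_alignment` would correspond to the 0x1000 case shown for the failing kernel.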
Comment 32 Kristóf Ralovich 2010-04-23 04:02:52 UTC
(In reply to comment #31)
> Kristof, It is the linker issue why you are seeing this issue.
> 
> On my working kernel, if I do "readelf -l my-vmlinux" I see this for the
> first entry:
> 
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000200000 0xffffffff81000000 0x0000000001000000
>                  0x00000000008ec000 0x00000000008ec000  R E    200000
> 
> And for your kernel, I see this:
> 
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000001000 0xffffffff81000000 0x0000000001000000
>                  0x000000000056e000 0x000000000056e000  R E    1000
> 
> If you see the last column which is Align, mine has 2MB alignment and yours
> has 4k. I talked to HJ and he says when the linker uses 4K alignment, kernel
> might have a problem if it uses 2MB alignment.
> 
> What linker (ld --version) are you using? Have you recompiled your linker or
> perhaps debian linker is using 4k as the max-page-size? HJ says, debian might
> have two linkers (ld and ld.gold) and they might be using different
> page-sizes.
> 
> I can reproduce your problem if I add this to (arch/x86/Makefile) my kernel
> compilation:
> 
> LDFLAGS_vmlinux += -z max-page-size=0x1000
> 
> Can you check if the kernel boots if you add this to the arch/x86/Makefile:
> 
> LDFLAGS_vmlinux += -z max-page-size=0x200000
> 
> Also HJ pointed me to this ld.gold bug now
> http://sourceware.org/bugzilla/show_bug.cgi?id=11490
> 
> Depending on your results and my discussion with HJ, we can see if we can
> have a workaround in the kernel or we need to address this in the linker.

Mea culpa - I always had this feeling in the back of my head that gold is still experimental. Right now I don't have much time, but I purged gold and linked the kernel with regular ld, and the problem is gone for me. As soon as I get around to it, I'll try gold + the LDFLAGS_vmlinux flag.
Comment 33 Kristóf Ralovich 2010-05-02 21:18:56 UTC
I will not have time any time soon to experiment with linking a kernel with gold.

I am confirming that 74e081797bd9d2a7d8005fe519e719df343a2ba8 built with gold exhibits the problem and the same kernel and config built with ld works for me.

Thanks for your help in finding the source of the problem.