Bug 8040 - Hang before INIT when CONFIG_HIGHMEM4G=y
Summary: Hang before INIT when CONFIG_HIGHMEM4G=y
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Andrew Morton
URL:
Keywords:
Duplicates: 8039
Depends on:
Blocks:
 
Reported: 2007-02-19 05:59 UTC by Nilshar
Modified: 2008-02-14 23:19 UTC

See Also:
Kernel Version: 2.6.20
Subsystem:
Regression: ---
Bisected commit-id:



Description Nilshar 2007-02-19 05:59:40 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.20-rc2
Distribution: Debian Sarge
Hardware Environment: Dell PowerEdge 850
Software Environment: 
Problem Description: 
The boot hangs just after "Freeing unused kernel memory", at the point where
INIT is supposed to start. It is not a total freeze: the keyboard still works
and I can press Ctrl-Alt-Del, but INIT never starts.
This happens only with CONFIG_HIGHMEM64G=y, not with CONFIG_HIGHMEM4G=y.

Diff between working 2.6.20 and non working 2.6.20 :

181,182c181,182
< # CONFIG_HIGHMEM4G is not set
< CONFIG_HIGHMEM64G=y
---
> CONFIG_HIGHMEM4G=y
> # CONFIG_HIGHMEM64G is not set
185d184
< CONFIG_X86_PAE=y
191c190
< CONFIG_RESOURCES_64BIT=y
---
> # CONFIG_RESOURCES_64BIT is not set

On 2.6.20-rc2 and earlier, everything works fine with CONFIG_HIGHMEM64G=y.

Steps to reproduce:
Compile a kernel with CONFIG_HIGHMEM64G=y and boot it on a Dell PowerEdge 850.
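
The config comparison above can be reproduced with a short script. This is a
minimal Python sketch, not part of the original report: parse_kconfig and
config_diff are hypothetical helper names, and the two sample fragments
contain only the four options that appear in the diff.

```python
import re

def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            opts[m.group(1)] = "n"
    return opts

def config_diff(a_text, b_text):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    An option missing from one file shows up as None on that side.
    """
    a, b = parse_kconfig(a_text), parse_kconfig(b_text)
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(set(a) | set(b))
        if a.get(k) != b.get(k)
    }

# Sample fragments taken from the two sides of the diff in the report.
CONFIG_64G = """\
# CONFIG_HIGHMEM4G is not set
CONFIG_HIGHMEM64G=y
CONFIG_X86_PAE=y
CONFIG_RESOURCES_64BIT=y
"""

CONFIG_4G = """\
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
# CONFIG_RESOURCES_64BIT is not set
"""

for option, (left, right) in config_diff(CONFIG_64G, CONFIG_4G).items():
    print(option, left, right)
```

In practice the two strings would be read from the real .config files of the
two builds; an option that is absent from one file, such as CONFIG_X86_PAE
here, is reported as None for that file.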
Comment 1 Nilshar 2007-02-19 06:04:30 UTC
*** Bug 8039 has been marked as a duplicate of this bug. ***
Comment 2 Hrvoje 2007-03-05 07:38:50 UTC
This bug also happens on a Supermicro Xeon 3000 system.
Comment 3 js 2007-03-07 22:17:44 UTC
Please see bug 8148 too.
I have neither of those kernel options enabled, and I see exactly the same
symptoms on an older system based on a 600 MHz K6.
Comment 4 Nilshar 2007-03-14 03:13:24 UTC
Any news on this bug, please?
Comment 5 Anonymous Emailer 2007-03-14 04:17:22 UTC
Reply-To: akpm@linux-foundation.org

> On Wed, 14 Mar 2007 03:13:25 -0700 bugme-daemon@bugzilla.kernel.org wrote:
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=8040
>
> ------- Additional Comments From Nilshar@gmail.com  2007-03-14 03:13 -------
> Any news on that bug please ?

None whatsoever.  Three people are reporting this and it's a drop-dead
showstopper for a 2.6.21 release so we just have to wait until someone
wakes up and thinks about it.

It would be very useful if one of the reporters could perform a git-bisect
search to identify the offending change, please.

I would dearly like to point you at a document or web page which describes
kernel-git-bisect-for-newbies, but afaik there isn't such a thing, which is
a huge failing.

Comment 6 Randy Dunlap 2007-03-14 04:25:46 UTC
On Wed, 14 Mar 2007, Andrew Morton wrote:

> > On Wed, 14 Mar 2007 03:13:25 -0700 bugme-daemon@bugzilla.kernel.org wrote:
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=8040
> >
> >
> > ------- Additional Comments From Nilshar@gmail.com  2007-03-14 03:13 -------
> > Any news on that bug please ?
>
> None whatsoever.  Three people are reporting this and it's a drop-dead
> showstopper for a 2.6.21 release so we just have to wait until someone
> wakes up and thinks about it.
>
> It would be very useful if one of the reporters could perform a git-bisect
> search to identify the offending change, please.
>
> I would dearly like to point you at a document or web page which describes
> kernel-git-bisect-for-newbies, but afaik there isn't such a thing, which is
> a huge failing.

I have one of those one-of-my-machines-wont-boot-2.6.21-rc*
and I expect that I'll try to use git bisect on it, in which case
I will also document it.

Comment 7 Michal Piotrowski 2007-03-14 04:31:03 UTC
On 14/03/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Wed, 14 Mar 2007 03:13:25 -0700 bugme-daemon@bugzilla.kernel.org wrote:
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=8040
> >
> > ------- Additional Comments From Nilshar@gmail.com  2007-03-14 03:13 -------
> > Any news on that bug please ?
>
> None whatsoever.  Three people are reporting this and it's a drop-dead
> showstopper for a 2.6.21 release so we just have to wait until someone
> wakes up and thinks about it.
>
> It would be very useful if one of the reporters could perform a git-bisect
> search to identify the offending change, please.
>
> I would dearly like to point you at a document or web page which describes
> kernel-git-bisect-for-newbies, but afaik there isn't such a thing, which is
> a huge failing.

"Linux testers handbook" should be translated in a few weeks.

Here is a "git-bisect basics" movie :)
http://www.youtube.com/watch?v=R7_LY-ceFbE

Regards,
Michal

Comment 8 Nilshar 2007-03-14 04:33:35 UTC
I'll try git-bisect. I'm not sure exactly what it is, but I can certainly do it
to help resolve this bug. I'll search Google, but if you have a good link, I
can use it :)
Comment 9 Leroy Raymond van Logchem 2007-03-14 14:47:16 UTC
Bisecting went well, after 13 compiles this commit was found:

a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f is first bad commit
commit a1f3bb9ae4497a2ed3eac773fd7798ac33a0371f
Author: Roland McGrath <roland <at> redhat.com>
Date:   Fri Jan 26 00:56:46 2007 -0800

    [PATCH] Fix CONFIG_COMPAT_VDSO

    I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely.  But if it's there,
    it should work properly.  Currently it's quite haphazard: both real vma and
    fixmap are mapped, both are put in the two different AT_* slots, sysenter
    returns to the vma address rather than the fixmap address, and core dumps yet
    are another story.

    This patch makes CONFIG_COMPAT_VDSO disable the real vma and use the fixmap
    area consistently.  This makes it actually compatible with what the old vdso
    implementation did.

    Signed-off-by: Roland McGrath <roland <at> redhat.com>
    Cc: Ingo Molnar <mingo <at> elte.hu>
    Cc: Paul Mackerras <paulus <at> samba.org>
    Cc: Benjamin Herrenschmidt <benh <at> kernel.crashing.org>
    Cc: Andi Kleen <ak <at> suse.de>
    Signed-off-by: Andrew Morton <akpm <at> osdl.org>
    Signed-off-by: Linus Torvalds <torvalds <at> linux-foundation.org>

:040000 040000 802ab3366a651ecba28c8677fa84a9f7c506392b
f44adc4dcdab733e5965b68ccd0d643f0a550a80 M      arch
:040000 040000 be1e217152d8b3fcd05f09aa2b3f4f9dcb8208aa
46cc86427e861350dd3fef9469474c55119f27ce M      include

I had both CONFIG_COMPAT_VDSO=y and CONFIG_HIGHMEM64G=y configured.
Using a 4GB Supermicro 7044 SMP dual Xeon. Details upon request.

--
Leroy
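
Why a bisect needs only about 13 compiles can be shown with a small
simulation. This is a simplified Python sketch, not how git is implemented:
it assumes a linear history (real git bisect works on a commit graph), and
the history size of 5000 and the position of the bad commit are made-up
numbers chosen only for illustration.

```python
def bisect_first_bad(commits, is_bad):
    """Binary-search for the first bad commit in a linear history.

    commits[0] must be known good and commits[-1] known bad.
    Returns (first_bad_commit, number_of_builds_tested).
    """
    lo, hi = 0, len(commits) - 1  # lo: newest known good, hi: oldest known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # one compile-and-boot cycle
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests

# Hypothetical history: 5000 commits, the regression enters at index 3217.
history = list(range(5000))
first_bad, builds = bisect_first_bad(history, lambda c: c >= 3217)
print(first_bad, builds)
```

Each test halves the remaining range, so the number of builds grows with the
base-2 logarithm of the number of commits between the known-good and
known-bad points; a few thousand commits need a dozen or so builds.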
Comment 10 Nilshar 2007-03-15 01:00:54 UTC
So if I set CONFIG_COMPAT_VDSO=n, I should be able to boot?



Comment 11 Leroy Raymond van Logchem 2007-03-15 15:36:25 UTC
Chuck Ebbert at redhat.com asked:

> Can you please double check this by trying with/without again -- sometimes
> bisects go bad.

As requested, I redid the test, this time without git, using the kernel.org
tarballs. The results, still with the same .config:
linux-2.6.20.tar.gz:   bad
linux-2.6.20.1.tar.gz: bad (boot log identical)
linux-2.6.20.2.tar.gz: good
linux-2.6.20.3.tar.gz: good
(triple checked)

Really strange. Nilshar, please try these kernels too, with:
CONFIG_COMPAT_VDSO=y
CONFIG_HIGHMEM64G=y

Nilshar tried it and reports that 2.6.20.3 works fine. So only 2.6.20 and
2.6.20.1 showed this hang at boot.
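
The tarball results narrow the fix down to a single stable update. A minimal
Python sketch of that reasoning follows; fix_window is a hypothetical helper
name, and the list simply restates the four results reported above.

```python
# (version, result) pairs in release order, as reported in this comment.
RESULTS = [
    ("2.6.20", "bad"),
    ("2.6.20.1", "bad"),
    ("2.6.20.2", "good"),
    ("2.6.20.3", "good"),
]

def fix_window(results):
    """Return (last_bad, first_good) for an ordered list of test results.

    Returns None if no bad release is followed by a good one.
    """
    last_bad = None
    for version, status in results:
        if status == "bad":
            last_bad = version
        elif status == "good" and last_bad is not None:
            return last_bad, version
    return None

print(fix_window(RESULTS))
```

Under these results, whatever change resolved the hang went in between
2.6.20.1 and 2.6.20.2, so only the changes in that one stable update need to
be examined.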
Comment 12 Natalie Protasevich 2008-02-14 23:19:12 UTC
Since there have been no more reports against later kernel releases, I guess the bug can be closed as fixed. Please reopen it if you believe the problem is still there.
