Bug 14279 - Suspend to RAM freeze totally since 2.6.32-rc1 - Acer Aspire 1511Lmi laptop
Status: CLOSED CODE_FIX
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: All Linux
Importance: P1 normal
Assigned To: platform_x86_64@kernel-bugs.osdl.org
Depends on:
Blocks: 7216 14230
Reported: 2009-09-30 18:14 UTC by Christian Casteyde
Modified: 2009-10-26 18:27 UTC (History)

See Also:
Kernel Version: 2.6.32-rc1
Tree: Mainline
Regression: Yes


Attachments
dmesg log for the failing platform (26.74 KB, text/plain)
2009-09-30 18:14 UTC, Christian Casteyde
bisection log (2.64 KB, text/plain)
2009-10-05 21:49 UTC, Christian Casteyde

Description Christian Casteyde 2009-09-30 18:14:56 UTC
Created attachment 23209 [details]
dmesg log for the failing platform

Hardware: Athlon 64 3000 in 64 bits mode (Acer Aspire 1511Lmi laptop)
Distro: Bluewhite64

I have no further information: the laptop freezes completely (black screen, nothing responds, and I must hold the power button for more than 5 s to switch it off).

The dmesg output from before the suspend is attached.
Comment 1 Andrew Morton 2009-09-30 23:43:49 UTC
It's a regression from which previous kernel version?  2.6.31?

Thanks.
Comment 2 Christian Casteyde 2009-10-01 17:15:47 UTC
Yes, 2.6.31 works, 2.6.32-rc1 freezes.

More precisely:
2.6.30 works perfectly, with no warnings whatsoever.

2.6.31 suspends and resumes, but it prints an error at resume saying that there is an unexpected NMI error, that this is a hardware problem on PCI, etc.
This is already logged in http://bugzilla.kernel.org/show_bug.cgi?id=13987

2.6.32 occasionally issues the same warning **at init** (as said in the same bugzilla entry), and freezes completely at suspend. It therefore cannot resume (I have to reboot).
Comment 3 Rafael J. Wysocki 2009-10-01 20:33:13 UTC
Since this looks 100% reproducible, can you try to find the breaking commit by bisection?

Of course, first please check if the problem hasn't been fixed already.
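
For readers unfamiliar with bisection, the idea can be sketched as a binary search over an ordered list of revisions. This is only an illustration under simplifying assumptions: it treats history as linear and numbers revisions with integers, whereas git bisect works on the real commit graph. The names first_bad_revision and bad_from_threshold are hypothetical and not part of git or the kernel.

```c
#include <stddef.h>

/* Hypothetical test callback: nonzero means revision `rev` shows the bug. */
typedef int (*is_bad_fn)(size_t rev, void *ctx);

/*
 * Return the first bad revision in the range (good, bad].
 * Assumes `good` is known good, `bad` is known bad, and every revision
 * after the first bad one is also bad. `tests`, if non-NULL, counts how
 * many revisions had to be built and tried.
 */
size_t first_bad_revision(size_t good, size_t bad,
                          is_bad_fn is_bad, void *ctx, unsigned *tests)
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (tests)
            ++*tests;
        if (is_bad(mid, ctx))
            bad = mid;   /* the regression is at mid or earlier */
        else
            good = mid;  /* the regression is after mid */
    }
    return bad;
}

/* Example callback: every revision at or after *ctx is bad. */
int bad_from_threshold(size_t rev, void *ctx)
{
    return rev >= *(size_t *)ctx;
}
```

Each round halves the remaining range, so the number of kernels to build and boot grows only with the logarithm of the number of candidate commits.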
Comment 4 Christian Casteyde 2009-10-05 21:47:41 UTC
10 compiles and 15 reboots later ;-)

The faulty commit is the following:

root@athor:/home/christian# cd /usr/src/linux-git
root@athor:/usr/src/linux-git# git bisect bad
5f68563996e812f9ca35b3939ad2a42e5d254d66 is first bad commit
commit 5f68563996e812f9ca35b3939ad2a42e5d254d66
Author: Jan Beulich <JBeulich@novell.com>
Date:   Fri Sep 4 09:16:22 2009 +0100

    x86: cpuinit-annotate SMP boot trampolines properly

    Add missing annotations, and make use of include/linux/init.h's
    macros.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
    LKML-Reference: <4AA0E8F60200007800013703@vpn.id2.novell.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

:040000 040000 cc41898cf8c087ce4d5b1a9d7f98cef17b4766a1 311e4ed7f38529ce629d5ef130d3bf52e15f783c M      arch
root@athor:/usr/src/linux-git#

I'm appending the git log for more info.
I've also checked -rc3; the problem is still present.

Since it seems to be related to code annotation, the compiler, or the linker: I'm running BW64 as mentioned, that is, gcc 4.3.3, ld 2.18.50.0.9 and glibc 2.9:

root@athor:/usr/src/linux-git# gcc -v
Reading specs from /usr/lib/gcc/x86_64-pc-linux/4.3.3/specs
Target: x86_64-pc-linux
Configured with: ../gcc-4.3.3/configure --prefix=/usr --libdir=/usr/lib --enable-shared --enable-bootstrap --enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix --enable-checking=release --with-system-zlib --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp --with-gnu-ld --verbose --disable-multilib --target=x86_64-pc-linux --build=x86_64-pc-linux --host=x86_64-pc-linux
Thread model: posix
gcc version 4.3.3 (GCC)
root@athor:/usr/src/linux-git# ld -v
GNU ld (Linux/GNU Binutils) 2.18.50.0.9.20080822
root@athor:/usr/src/linux-git# ldd --version
ldd (GNU libc) 2.9
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
Comment 5 Christian Casteyde 2009-10-05 21:49:43 UTC
Created attachment 23272 [details]
bisection log
Comment 6 Rafael J. Wysocki 2009-10-05 22:18:58 UTC
[Switched to e-mail, please reply in this thread and don't drop
bugzilla-daemon@bugzilla.kernel.org from the Cc list.]

On Monday 05 October 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14279
> 
> --- Comment #4 from Christian Casteyde <casteyde.christian@free.fr>  2009-10-05 21:47:41 ---
> 10 compile and 15 reboots later ;-)
> 
> The faulty commit is the following:
> 
> root@athor:/home/christian# cd /usr/src/linux-git
> root@athor:/usr/src/linux-git# git bisect bad
> 5f68563996e812f9ca35b3939ad2a42e5d254d66 is first bad commit
> commit 5f68563996e812f9ca35b3939ad2a42e5d254d66
> Author: Jan Beulich <JBeulich@novell.com>
> Date:   Fri Sep 4 09:16:22 2009 +0100
> 
>     x86: cpuinit-annotate SMP boot trampolines properly
> 
>     Add missing annotations, and make use of include/linux/init.h's
>     macros.
> 
>     Signed-off-by: Jan Beulich <jbeulich@novell.com>
>     LKML-Reference: <4AA0E8F60200007800013703@vpn.id2.novell.com>
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> :040000 040000 cc41898cf8c087ce4d5b1a9d7f98cef17b4766a1
> 311e4ed7f38529ce629d5ef130d3bf52e15f783c M      arch
> root@athor:/usr/src/linux-git#

Thanks a lot for bisecting!
 
> I'm appendig the git log for more info.
> I've also checked -rc3, the problem is still present.
> 
> Since it seems to be related to code annotation/compiler/linker, I'm running
> BW64 as said, that is gcc 4.3.3, ld 2.18.50.0.9 and glibc 2.9:
> 
> root@athor:/usr/src/linux-git# gcc -v
> Reading specs from /usr/lib/gcc/x86_64-pc-linux/4.3.3/specs
> Target: x86_64-pc-linux
> Configured with: ../gcc-4.3.3/configure --prefix=/usr --libdir=/usr/lib
> --enable-shared --enable-bootstrap
> --enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix
> --enable-checking=release --with-system-zlib --disable-libunwind-exceptions
> --enable-__cxa_atexit --enable-libssp --with-gnu-ld --verbose
> --disable-multilib --target=x86_64-pc-linux --build=x86_64-pc-linux
> --host=x86_64-pc-linux
> Thread model: posix
> gcc version 4.3.3 (GCC)
> root@athor:/usr/src/linux-git# ld -v
> GNU ld (Linux/GNU Binutils) 2.18.50.0.9.20080822
> root@athor:/usr/src/linux-git# ldd --version
> ldd (GNU libc) 2.9
> Copyright (C) 2008 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> Written by Roland McGrath and Ulrich Drepper.

Jan, Ingo, can you have a look at this, please?

The trampoline is also used for resuming from suspend to RAM on x86_64.

Rafael
Comment 7 Rafael J. Wysocki 2009-10-05 22:19:08 UTC
First-Bad-Commit : 5f68563996e812f9ca35b3939ad2a42e5d254d66
Comment 8 Jan Beulich 2009-10-06 07:52:20 UTC
>>> "Rafael J. Wysocki" <rjw@sisk.pl> 06.10.09 00:20 >>>
>On Monday 05 October 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
>> http://bugzilla.kernel.org/show_bug.cgi?id=14279 
>> 
>> --- Comment #4 from Christian Casteyde <casteyde.christian@free.fr>  2009-10-05 21:47:41 ---
>> 10 compile and 15 reboots later ;-)
>> 
>> The faulty commit is the following:
>> 
>> root@athor:/home/christian# cd /usr/src/linux-git
>> root@athor:/usr/src/linux-git# git bisect bad
>> 5f68563996e812f9ca35b3939ad2a42e5d254d66 is first bad commit
>> commit 5f68563996e812f9ca35b3939ad2a42e5d254d66
>> Author: Jan Beulich <JBeulich@novell.com>
>> Date:   Fri Sep 4 09:16:22 2009 +0100
>> 
>>     x86: cpuinit-annotate SMP boot trampolines properly
>> 
>>     Add missing annotations, and make use of include/linux/init.h's
>>     macros.
>> 
>>     Signed-off-by: Jan Beulich <jbeulich@novell.com>
>>     LKML-Reference: <4AA0E8F60200007800013703@vpn.id2.novell.com>
>>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
>> 
>> :040000 040000 cc41898cf8c087ce4d5b1a9d7f98cef17b4766a1
>> 311e4ed7f38529ce629d5ef130d3bf52e15f783c M      arch
>> root@athor:/usr/src/linux-git#
>
>Thanks a lot for bisecting!
 
>...
>Jan, Ingo, can you have a look at this, please?
>
>The trampoline is also used for resuming from suspend to RAM on x86_64.

Correct, but SUSPEND+SMP select HOTPLUG_CPU, and that results in
the trampoline not being discarded. So this must be a !SMP config, and
I must have missed a section mismatch warning resulting from
acpi_save_state_mem() referencing setup_trampoline() (it's too bad that
these warnings continue to be suppressed by default).

In order to not introduce even uglier CONFIG_X86_64 conditionals, I
think the only solution is to revert the whole patch, even though I
dislike retaining the trampoline on 32-bits just because it's needed on
64-bits.

Jan
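
As an illustration of the mechanism described above, here is a minimal user-space sketch, assuming GCC or Clang on an ELF target such as Linux. It is not the kernel code: DEMO_CPUINIT, demo_setup_trampoline and demo_save_state are hypothetical stand-ins, and in user space nothing is ever discarded. It only shows how an annotation can move a function into its own section while a caller stays in ordinary .text, which is the kind of cross-section reference a section mismatch check looks for.

```c
/*
 * Assumption: GCC/Clang on ELF. The section name is made up for this demo;
 * the kernel's own annotations use different section names.
 */
#define DEMO_CPUINIT __attribute__((section(".demo.cpuinit.text"), noinline))

/* Stand-in for an annotated helper: lives in the custom section. */
DEMO_CPUINIT unsigned long demo_setup_trampoline(void)
{
    return 0x1000UL; /* placeholder value, not a real address */
}

/*
 * Stand-in for a caller that is not annotated: it lives in .text but
 * references a symbol in the custom section. If that section were freed
 * after boot, this call would jump into memory that no longer holds code.
 */
unsigned long demo_save_state(void)
{
    return demo_setup_trampoline();
}
```

Compiling this to an object file and listing its sections (for example with objdump -h) should show .demo.cpuinit.text alongside .text.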
Comment 9 Jan Beulich 2009-10-06 15:32:47 UTC
>>> "Rafael J. Wysocki" <rjw@sisk.pl> 06.10.09 00:20 >>>
>The trampoline is also used for resuming from suspend to RAM on x86_64.

Ingo, if you didn't already revert the original patch, here's a proper fix,
moving the trampoline and accessors back out of .cpuinit.* for the
case of 64-bits+ACPI_SLEEP.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

---
 arch/x86/kernel/trampoline.c    |   12 ++++++++++--
 arch/x86/kernel/trampoline_64.S |    4 ++++
 2 files changed, 14 insertions(+), 2 deletions(-)

--- linux-2.6.32-rc3/arch/x86/kernel/trampoline.c	2009-10-05 12:02:21.000000000 +0200
+++ 2.6.32-rc3-x86_64-trampoline/arch/x86/kernel/trampoline.c	2009-10-06 16:50:58.000000000 +0200
@@ -3,8 +3,16 @@
 #include <asm/trampoline.h>
 #include <asm/e820.h>
 
+#if defined(CONFIG_X86_64) && defined(CONFIG_ACPI_SLEEP)
+#define __trampinit
+#define __trampinitdata
+#else
+#define __trampinit __cpuinit
+#define __trampinitdata __cpuinitdata
+#endif
+
 /* ready for x86_64 and x86 */
-unsigned char *__cpuinitdata trampoline_base = __va(TRAMPOLINE_BASE);
+unsigned char *__trampinitdata trampoline_base = __va(TRAMPOLINE_BASE);
 
 void __init reserve_trampoline_memory(void)
 {
@@ -26,7 +34,7 @@ void __init reserve_trampoline_memory(vo
  * bootstrap into the page concerned. The caller
  * has made sure it's suitably aligned.
  */
-unsigned long __cpuinit setup_trampoline(void)
+unsigned long __trampinit setup_trampoline(void)
 {
 	memcpy(trampoline_base, trampoline_data, TRAMPOLINE_SIZE);
 	return virt_to_phys(trampoline_base);
--- linux-2.6.32-rc3/arch/x86/kernel/trampoline_64.S	2009-10-05 12:02:21.000000000 +0200
+++ 2.6.32-rc3-x86_64-trampoline/arch/x86/kernel/trampoline_64.S	2009-10-06 16:30:18.000000000 +0200
@@ -32,8 +32,12 @@
 #include <asm/segment.h>
 #include <asm/processor-flags.h>
 
+#ifdef CONFIG_ACPI_SLEEP
+.section .rodata, "a", @progbits
+#else
 /* We can free up the trampoline after bootup if cpu hotplug is not supported. */
 __CPUINITRODATA
+#endif
 .code16
 
 ENTRY(trampoline_data)
Comment 10 Christian Casteyde 2009-10-12 19:46:58 UTC
With the proposed patch, my hardware suspends and resumes OK.
Thanks
Comment 11 Rafael J. Wysocki 2009-10-12 20:56:25 UTC
Handled-By : Jan Beulich <jbeulich@novell.com>
Patch : http://bugzilla.kernel.org/show_bug.cgi?id=14279#c9
Comment 12 Rafael J. Wysocki 2009-10-26 18:27:17 UTC
Fixed by commit 7a4b7e5e741fe0a72a517b0367a2659aa53f7c44.
