Bug 12878 - [regression after 2.6.18] resume failure - MSI PR200WX-058EU
Status: CLOSED CODE_FIX
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: All Linux
Importance: P1 normal
Assigned To: H. Peter Anvin
Duplicates: 33752
Depends on:
Blocks: 7216
Reported: 2009-03-15 08:03 UTC by Eddy Petrișor
Modified: 2014-02-10 19:41 UTC (History)

See Also:
Kernel Version: 3.2
Tree: Mainline
Regression: Yes


Attachments
dmesg, fresh right after boot (42.55 KB, text/plain)
2009-03-16 04:02 UTC, Eddy Petrișor
dmesg after reboot with i915 loaded (42.95 KB, text/plain)
2009-03-16 04:13 UTC, Eddy Petrișor
pm_debug: dmesg after recovery (pm_test==core) (51.39 KB, text/plain)
2009-03-19 14:04 UTC, Eddy Petrișor
pm_debug: dmesg after recovery (pm_test==core) - without old_ordering (51.54 KB, text/plain)
2009-03-19 15:05 UTC, Eddy Petrișor
patch: introduce acpi_sleep=s3_sci_enable (2.00 KB, patch)
2009-03-23 01:47 UTC, Zhang Rui
use the RTC cmos area(0x60-0x64) to track whether suspend/resume hangs (5.20 KB, patch)
2009-04-03 01:19 UTC, ykzhao
/proc/cmos after reboot (30 bytes, application/octet-stream)
2009-04-03 10:39 UTC, Eddy Petrișor
dmesg after reboot with cmos hack (49.39 KB, text/plain)
2009-04-03 10:41 UTC, Eddy Petrișor
dmesg before sleep in 2.6.18-6-amd64 (working sleep/resume) (19.56 KB, application/octet-stream)
2009-06-13 11:47 UTC, Eddy Petrișor
dmesg after sleep resume in 2.6.18-6-amd64 (working sleep/resume) (29.28 KB, application/octet-stream)
2009-06-13 11:48 UTC, Eddy Petrișor
Output of dmesg before Suspend with Kernel 2.6.29 sci enabled (38.99 KB, text/plain)
2009-09-02 13:29 UTC, Oliver
dmesg before sleep with 2.6.32-rc2 (45.91 KB, text/plain)
2009-10-12 07:06 UTC, Eddy Petrișor
dmesg before sleep (62.62 KB, text/plain)
2011-04-01 06:31 UTC, Eddy Petrișor
sleeptest: the script I used for the bisect (useful on debian based systems) (1.70 KB, application/octet-stream)
2011-04-01 06:39 UTC, Eddy Petrișor
sleepit: the script that tries to extract the dmesgs before and after the sleep (336 bytes, application/octet-stream)
2011-04-01 06:44 UTC, Eddy Petrișor
linux-build: the script that does the kernel building and creates the deb package for it (3.13 KB, application/octet-stream)
2011-04-01 06:47 UTC, Eddy Petrișor
2.6.38.2 dmesg after echo core > /sys/power/pm_test (53.63 KB, application/octet-stream)
2011-04-21 18:51 UTC, tadziu23
2.6.38.2 dmesg after echo 1 > /sys/power/pm_test (53.24 KB, application/octet-stream)
2011-04-21 18:52 UTC, tadziu23
2.6.38.2 dmesg after echo core > /sys/power/pm_test (14.60 KB, application/x-gzip)
2011-04-22 11:36 UTC, tadziu23
2.6.38.2 dmesg after echo 1 > /sys/power/pm_trace (14.70 KB, application/x-gzip)
2011-04-22 11:38 UTC, tadziu23
2.6.38.2 dmesg after echo core > /sys/power/pm_test (proper) (17.56 KB, application/x-gzip)
2011-04-26 12:16 UTC, tadziu23
dmesg output from suspend/resume (29.55 KB, application/x-gzip)
2012-01-19 19:30 UTC, tadziu23
[PATCH] Enable A20 using KBC for some MSI laptops (2.77 KB, patch)
2012-10-23 21:18 UTC, Ondrej Zary

Description Eddy Petrișor 2009-03-15 08:03:08 UTC
Latest working kernel version: -
Earliest failing kernel version: 2.6.25
Distribution: Debian GNU/Linux Lenny
Hardware Environment: MSI PR200WX-058EU laptop (Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03))
Software Environment: Debian Lenny (GNOME) 
Problem Description: Putting the laptop to sleep works without any problem. When I press the power button to wake it, the laptop seems to resume, but the screen remains blank, the fans spin up, and nothing can be done until I press the power button once more, at which point the laptop restarts.

Neither opening the lid nor pressing other keys wakes the laptop up from sleep, but this is consistent with the way it wakes up under Windows.

The laptop recovers fine from sleep in Windows XP. The DSDT disassembly shows different paths for Windows and Linux.

Steps to reproduce:
1. put the laptop to sleep from gnome-power-manager
2. try to wake up the laptop by pressing the power button
3. the screen remains blank
Comment 1 Zhang Rui 2009-03-15 18:23:24 UTC
what's the X server version of your system?

please verify if this bug still exists in the latest kernel.
please attach the dmesg output after boot.
please attach the acpidump output.
Comment 2 ykzhao 2009-03-15 18:26:30 UTC
Hi, Eddy
    Will you please load the i915 driver under the console mode and see whether the box can be resumed?
    It would be great if you could confirm whether the box fails to resume from S3 or only the screen stays blank.
    Will you please add the boot option of "acpi_sleep=beep" and do the following test?
    a. kill the process which is using /proc/acpi/event
    b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after; sync;
    c. press the power button and see whether the box can be resumed. 
    If it can't be resumed, please reboot the box and check whether there exists the file of dmesg_after.

   
    Thanks.
Comment 3 Eddy Petrișor 2009-03-16 02:24:40 UTC
(In reply to comment #1)
> what's the X server version of your system?

0 eddy@heidi ~/usr/src/linux/linux-2.6 $ Xorg -version

X.Org X Server 1.4.2
Release Date: 11 June 2008
X Protocol Version 11, Revision 0
Build Operating System: Linux Debian (xorg-server 2:1.4.2-10)
Current Operating System: Linux heidi 2.6.29-rc7-heidi #1 SMP Fri Mar 13 00:21:39 EET 2009 x86_64
Build Date: 09 January 2009  02:16:05AM
 
	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
Module Loader present

> please verify if this bug still exists in the latest kernel.

TBH, the kernel was not 2.6.29.rc7, but it was this git version:

0 eddy@heidi ~/usr/src/linux/linux-2.6 $ git show
commit ebdcc81c71937b30e09110c02a1e8a21fa770b6f
Merge: 01f6750... 260cf8a...
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Mar 11 12:14:55 2009 -0700

    Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
    
    * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
      drm: fix EDID parser problem with positive/negative hsync/vsync



Do I need to test the current master?

> please attach the dmesg output after boot.
> please attach the acpidump output.

The acpidump is at http://bugzilla.kernel.org/show_bug.cgi?id=10855#c11

Comment 4 Eddy Petrișor 2009-03-16 02:37:30 UTC
I forgot to say, before using this kernel, I was trying to address some issues in the older kernels and I am booting regularly with the following kernel options:

quiet ec_intr=0 usbcore.autosuspend=1

or

quiet ec_intr=0 usbcore.autosuspend=1 splash video=intelfb

(The second variant is needed to allow splashy to work).



Do these matter now? Do I have to remove any of them?
Comment 5 Eddy Petrișor 2009-03-16 04:02:58 UTC
Created attachment 20541
dmesg, fresh right after boot

(In reply to comment #1)
> please attach the dmesg output after boot.

This is the dmesg output right after boot, in console mode, gdm not started.
Comment 6 Eddy Petrișor 2009-03-16 04:13:05 UTC
Created attachment 20542
dmesg after reboot with i915 loaded
Comment 7 Eddy Petrișor 2009-03-16 04:14:00 UTC
(In reply to comment #2)
> Hi, Eddy
>     Will you please load the i915 driver under the console mode and see whether
> the box can be resumed?
>     It will be great if you can confirm whether it can't be resumed from S3 or
> the screen is blank.

Actually my initial report was somewhat inaccurate. When pressing the power button during sleep, the following happens:

1. laptop seems to wake up, screen remains blank
2. after about 1 or 2 seconds from the press the laptop reboots itself without me doing anything
3. after the self triggered reboot the screen remains blank/black and looks as if the laptop doesn't do anything while the fans keep on spinning


In order to recover I have to press the power button (long press). After these things occur, the led indicating sleep remains lit (during sleep it flashes). I tried to boot in Windows and put the laptop in stand-by then recover in the hope it will shut down the led, but it didn't (after recovering from sleep, the led was on).

>     Will you please add the boot option of "acpi_sleep=beep" and do the
> following test?

I suspect you are talking about a single test in this paragraph as well as in the previous one. If not, please elaborate.

>     a. kill the process which is using /proc/acpi/event

I stopped acpid (invoke-rc.d acpid stop).

>     b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after;
> sync;
>     c. press the power button and see whether the box can be resumed. 
>     If it can't be resumed, please reboot the box and check whether there
> exists the file of dmesg_after.

The laptop didn't resume (see details above). No dmesg_after was written.


Comment 8 Eddy Petrișor 2009-03-16 04:16:27 UTC
It seems I forgot to say that the laptop hibernates and recovers properly.
Comment 9 Zhang Rui 2009-03-16 20:17:30 UTC
(In reply to comment #3)
> (In reply to comment #1)
> 
> TBH, the kernel was not 2.6.29.rc7, but it was this git version:
> 
> 0 eddy@heidi ~/usr/src/linux/linux-2.6 $ git show
> commit ebdcc81c71937b30e09110c02a1e8a21fa770b6f
> Merge: 01f6750... 260cf8a...
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Wed Mar 11 12:14:55 2009 -0700
> 
>     Merge branch 'drm-fixes' of
> git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
> 
>     * 'drm-fixes' of
> git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
>       drm: fix EDID parser problem with positive/negative hsync/vsync
> 
> 
> 
> Do I need to test the current master?
> 
No, this one is okay.


(In reply to comment #7)
> (In reply to comment #2)
> >     Will you please add the boot option of "acpi_sleep=beep" and do the
> > following test?
> 
> I suspect you are talking about a single test in this paragraph as well as in
> the previous one. If not, please elaborate.
> 
can you hear the beep after pressing the power button?

a few more things to verify:
1. problem exists in all the kernels you've tried.
2. the symptom is the same in both console/X mode
3. problem doesn't exist in Windows
right?
Comment 10 Eddy Petrișor 2009-03-18 02:56:57 UTC
(In reply to comment #9)
> (In reply to comment #3)
> > (In reply to comment #1)
> > 
> > TBH, the kernel was not 2.6.29.rc7, but it was this git version:
> > 
> > 0 eddy@heidi ~/usr/src/linux/linux-2.6 $ git show
> > commit ebdcc81c71937b30e09110c02a1e8a21fa770b6f

[..]
> > 
> > Do I need to test the current master?
> > 
> No, this one is okay.

Great!

> (In reply to comment #7)
> > (In reply to comment #2)
> > >     Will you please add the boot option of "acpi_sleep=beep" and do the
> > > following test?
> > 
> > I suspect you are talking about a single test in this paragraph as well as in
> > the previous one. If not, please elaborate.
> > 
> can you hear the beep after pressing the power button?

No. I suspected that should happen after passing that option, but I forgot to report.

> another some things to verify,
> 1. problem exists in all the kernels you've tried.

All the kernels I have tried had this issue: 2.6.25, 2.6.26, 2.6.27, 2.6.29.rc7+. The 2.6.25 and 2.6.26 kernels were the stock ones from Debian. 2.6.27 and 2.6.29.rc7+ were compiled from Linus' git (2.6.27 from the tag, 2.6.29.rc7 from ebdcc81c71).

I don't recall explicitly trying sleep with 2.6.24, since its stay on my laptop was short-lived, but since it was the first Linux kernel installed I suspect I tried and it failed.

> 2. the symptom is the same in both console/X mode

If there's no answer within 2-3 hours please consider an implicit "yes, the same happens in X".

I'll submit this comment and add a reply stating otherwise, if the behaviour is different for X (I have to leave for work and sleep would be the last thing I'd try).

> 3. problem doesn't exist in Windows

Yes

> right?
> 

Comment 11 ykzhao 2009-03-18 18:44:42 UTC
Will you please try the boot option of "acpi_sleep=old_ordering" and do the test as mentioned in comment #2?
   Thanks.
Comment 12 Zhang Rui 2009-03-18 23:00:52 UTC
please,
1. set CONFIG_PM_DEBUG and rebuild the kernel
2. boot into the new kernel
3. echo {freezer, devices, platform, processor, core} > /sys/power/pm_test
4. echo mem > /sys/power/state
5. please attach the dmesg output when pm_test==core if the system can come back in a few seconds
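
Steps 3 and 4 can be read as one suspend cycle per test level, since each write to /sys/power/pm_test replaces the previously selected level. A minimal sketch of that reading follows. `PM_DIR` and `run_pm_tests` are names made up for illustration, the level names are the ones the kernel lists when /sys/power/pm_test is read (note that it spells the CPU level `processors`), and the loop would have to run as root on a kernel built with CONFIG_PM_DEBUG.

```shell
#!/bin/sh
# Sketch only: one suspend test cycle per pm_test level.
# PM_DIR and run_pm_tests are illustrative names, not part of any tool.
PM_DIR="${PM_DIR:-/sys/power}"

run_pm_tests() {
    for level in freezer devices platform processors core; do
        # Select the test level; this overwrites the previously selected one.
        echo "$level" > "$PM_DIR/pm_test" || return 1
        # Start the test-mode suspend; the write returns once the system is back.
        echo mem > "$PM_DIR/state" || return 1
        # Keep one log per level so a failing level can be identified later.
        dmesg > "dmesg_pm_test_$level" 2>/dev/null || true
        sync
        echo "completed: $level"
    done
    # Switch test mode off again so that the next suspend is a real one.
    echo none > "$PM_DIR/pm_test"
}
```

Calling `run_pm_tests` from a root shell would walk through the levels in the order given in step 3; if the machine hangs, the dmesg_pm_test_* files already on disk show which levels still completed.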
Comment 13 Eddy Petrișor 2009-03-19 02:59:36 UTC
(In reply to comment #11)
> Will you please try the boot option of "acpi_sleep=old_ordering" and do the
> test as mentioned in comment #2?

I tried with old_ordering with and without acpi_sleep=beep, the result was the same.
Comment 14 Eddy Petrișor 2009-03-19 14:04:15 UTC
Created attachment 20599
pm_debug: dmesg after recovery (pm_test==core)
Comment 15 Eddy Petrișor 2009-03-19 14:05:05 UTC
(In reply to comment #12)
> please,
> 1. set CONFIG_PM_DEBUG and rebuild the kernel
> 2. boot into the new kernel
> 3. echo {freezer, devices, platform, processor, core} > /sys/power/pm_test

I wasn't sure if these should be given in a sequence as:

echo freezer > /sys/power/pm_test
echo devices > /sys/power/pm_test
echo platform > /sys/power/pm_test
echo processor > /sys/power/pm_test
echo core > /sys/power/pm_test

then step 4 or if they should have been like:

echo freezer > /sys/power/pm_test
4.
echo devices > /sys/power/pm_test
4.
echo platform > /sys/power/pm_test
4.
echo processor > /sys/power/pm_test
4.
echo core > /sys/power/pm_test


I did the sequence as in the first variant.

> 4. echo mem > /sys/power/state
> 5. please attach the dmesg output when pm_test==core if the system can come
> back in a few seconds

It came back and I did a dmesg before and after the sleep. They are both attached.


Comment 16 Eddy Petrișor 2009-03-19 14:09:24 UTC
Err, I meant I attached the after, since it didn't make sense to add the before if the after was present.
Comment 17 Eddy Petrișor 2009-03-19 15:05:44 UTC
Created attachment 20603
pm_debug: dmesg after recovery (pm_test==core) - without old_ordering

Because of the previous tests, my system defaulted to booting with acpi_sleep=old_ordering, which (from a diff) looks relevant to my untrained eye.

This dmesg is also obtained in the same way after resuming from sleep, pm_test==core, but without old_ordering.
Comment 18 Zhang Rui 2009-03-23 01:46:35 UTC
Linux kernel seems to work perfectly during suspend/resume.
this is probably a BIOS/Hardware issue.
please apply the debug patch attached below on top of 2.6.29-rc8 kernel, reboot with boot option "acpi_sleep=s3_sci_enable" and see if there is any difference.
Comment 19 Zhang Rui 2009-03-23 01:47:23 UTC
Created attachment 20633
patch: introduce acpi_sleep=s3_sci_enable
Comment 20 Eddy Petrișor 2009-03-24 01:59:47 UTC
2.6.29 was just released. Is it OK if I try that kernel with your patch?
Comment 21 Zhang Rui 2009-03-24 18:19:33 UTC
yes, please. :)
Comment 22 Eddy Petrișor 2009-03-25 23:07:54 UTC
(In reply to comment #21)
> yes, please. :)

I tried the 2.6.29-rc8 with your patch, the result was the same.

Note that I still have in my boot parameters "quiet ec_intr=0 usbcore.autosuspend=1". Does this matter in any way?


I am compiling 2.6.29 with your patch and will try that too. Maybe it works better.
Comment 23 Eddy Petrișor 2009-03-28 11:03:02 UTC
Same thing happened with 2.6.29 with your patch.
Comment 24 Len Brown 2009-04-01 01:13:07 UTC
> acpi_sleep=beep

was there any sound from the "PC speaker" on (failed) resume when you used this option?
Comment 25 Eddy Petrișor 2009-04-02 19:20:36 UTC
(In reply to comment #24)
> > acpi_sleep=beep
> 
> was there any sound from the "PC speaker" on (failed) resume when you used this
> option?

No. I would have said if there was anything different.
Comment 26 ykzhao 2009-04-03 01:19:07 UTC
Created attachment 20782
use the RTC cmos area(0x60-0x64) to track whether suspend/resume hangs

Will you please use the debug patch on the latest kernel(2.6.29) and do the following test?
   a.echo 25 > /proc/cmos ; echo mem > /sys/power/state so that the box enters
the suspend state
   b. press the power button. If the box can't be resumed, please reboot the
system.
   c. after the system is rebooted, please cat /proc/cmos and attach the output
of dmesg.

    Thanks.
Comment 27 Eddy Petrișor 2009-04-03 08:54:26 UTC
At some point in the past I extracted the DSDT and saw that Linux had a different table than Windows. Is this relevant in any way?
Comment 28 Eddy Petrișor 2009-04-03 09:09:16 UTC
(In reply to comment #26)
> Created an attachment (id=20782) [details]
> use the RTC cmos area(0x60-0x64) to track whether suspend/resume hangs
> 
> Will you please use the debug patch on the latest kernel(2.6.29) and do the
> following test?

By "the debug patch" you mean this patch, right?

Yes, no problem.

>    a.echo 25 > /proc/cmos ; echo mem > /sys/power/state so that the box enters
> the suspend state
>    b. press the power button. If the box can't be resumed, please reboot the
> system.
>    c. after the system is rebooted, please cat /proc/cmos and attach the output
> of dmesg.
> 
>     Thanks.

No, thank you!
Comment 29 Eddy Petrișor 2009-04-03 10:35:57 UTC
(In reply to comment #28)
> (In reply to comment #26)
> > Created an attachment (id=20782) [details] [details]
> > use the RTC cmos area(0x60-0x64) to track whether suspend/resume hangs
> > 
> > Will you please use the debug patch on the latest kernel(2.6.29) and do the
> > following test?
> 
> By "the debug patch" you mean this patch, right?
> 
> Yes, no problem.
> 
> >    a.echo 25 > /proc/cmos ; echo mem > /sys/power/state so that the box enters
> > the suspend state
> >    b. press the power button. If the box can't be resumed, please reboot the
> > system.
> >    c. after the system is rebooted, please cat /proc/cmos and attach the output
> > of dmesg.

I installed the 2.6.29 with the last patch (I also have the s3_sci_enable patch applied, too) and did what you asked me. As usual, nothing changed, but I have the information you requested after booting with these parameters:

0 eddy@heidi ~ $ cat /proc/cmdline 
BOOT_IMAGE=Linux ro root=fe00 quiet ec_intr=0 usbcore.autosuspend=1 splash video=intelfb acpi_sleep=beep
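
Since several tests in this report depend on a particular boot option being active, a command line like the one above can also be checked mechanically. The following is a small sketch; `CMDLINE_FILE`, `boot_param_value`, `has_boot_param` and `acpi_sleep_has` are hypothetical names introduced here. It assumes that acpi_sleep takes a comma-separated list of flags, and it only inspects /proc/cmdline, so it says nothing about whether the kernel actually understood an option.

```shell
#!/bin/sh
# Sketch only: helper names below are illustrative, not an existing tool.
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

# Print the value of a "name=value" (or bare "name") boot parameter;
# return 1 if the parameter is absent.
boot_param_value() {
    cmdline=""
    read -r cmdline < "$CMDLINE_FILE" || [ -n "$cmdline" ] || return 1
    for word in $cmdline; do
        case "$word" in
            "$1"=*) echo "${word#*=}"; return 0 ;;
            "$1")   echo ""; return 0 ;;
        esac
    done
    return 1
}

# Succeed if the named parameter appears at all.
has_boot_param() {
    boot_param_value "$1" > /dev/null
}

# Succeed if acpi_sleep=... contains the given flag (assumed comma-separated).
acpi_sleep_has() {
    value=$(boot_param_value acpi_sleep) || return 1
    case ",$value," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `acpi_sleep_has s3_sci_enable || echo "s3_sci_enable is not on the command line"` could be run right before a test that relies on that option.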
Comment 30 Eddy Petrișor 2009-04-03 10:39:41 UTC
Created attachment 20792
/proc/cmos after reboot

After the failed sleep recovery the BIOS detected that the CMOS checksums were broken. I suspect this is correct behaviour since CMOS memory is written to.


I booted with what the BIOS offered as fail-safe settings and got this out of /proc/cmos
Comment 31 Eddy Petrișor 2009-04-03 10:41:52 UTC
Created attachment 20793
dmesg after reboot with cmos hack

dmesg output after rebooting the 2.6.29+cmos kernel, post sleep recovering failure.
Comment 32 Eddy Petrișor 2009-04-12 20:00:49 UTC
Are there any new things/patches I should try?
Comment 33 Alexey Starikovskiy 2009-04-12 20:39:30 UTC
There is a last patch in bug #12011, could you try it?
Comment 34 Eddy Petrișor 2009-04-16 06:40:25 UTC
(In reply to comment #33)
> There is a last patch in bug #12011, could you try it?

I tried the patch over the 2.6.29 kernel and I got the same result. I made sure ec_intr wasn't present in the boot command line.

Note that I don't seem to have any issues with battery status disappearing with any of the kernels I tried from 2.6.27 onwards (2.6.27 included).
Comment 35 Eddy Petrișor 2009-04-16 06:43:04 UTC
(In reply to comment #34)
> (In reply to comment #33)
> > There is a last patch in bug #12011, could you try it?
> 
> I tried the patch over the 2.6.29 kernel and I got the same result. I made sure
> ec_intr wasn't present in the boot command line.
> 
> Note that I don't seem to have any issues with battery status disappearing with
> any of the kernels I tired which were newer 2.6.27 (including 2.6.27).

Oh, and the kernel was the 2.6.29 kernel with the patches proposed before in this bug report:

commit a90b2eeeb208567241aa21eced696ec010a9b6cc
Author: Eddy Petrișor <eddy.petrisor@gmail.com>
Date:   Tue Apr 14 09:20:05 2009 +0300

    merge modes and disable burst #2 (patch from #12011)
    
    Burst mode should be automatically disabled by controller, if it is not
    accessed for 400us. Now there is a delay of 550us and some are saying that
    550us is better. Thus, enabling of burst mode in first place seems to be a
    wrong move.

commit 8877cd8b24c26dfc7560f94d83b90e95e1d4d58f
Author: Eddy Petrișor <eddy.petrisor@gmail.com>
Date:   Fri Apr 3 12:04:32 2009 +0300

    Use the RTC cmos area to track where suspend/resume hangs

commit 6fd63c2f584e62355675c1735020acb3e4fad76f
Author: Eddy Petrișor <eddy.petrisor@gmail.com>
Date:   Tue Mar 24 11:02:59 2009 +0200

    Introduce kernel parameter acpi_sleep=s3_sci_enable
    
    some laptop requires SCI_EN being set directly on resume,
    or else they hung somewhere in the resume code path.
    
    We already have a blacklist for these lattops but we still needs
    this option, especially for debugging some suspend/resume problems.
    
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    ---
     arch/x86/kernel/acpi/sleep.c |    4 ++++
     drivers/acpi/sleep.c         |    6 ++++++
     include/linux/acpi.h         |    3 +++
     3 files changed, 13 insertions(+)

commit 8e0ee43bc2c3e19db56a4adaa9a9b04ce885cd84
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Mon Mar 23 16:12:14 2009 -0700

    Linux 2.6.29
Comment 36 Eddy Petrișor 2009-05-20 08:36:10 UTC
Hello,

I just checked 2.6.30.rc6 (22ef37eed..) and it has the same problem wrt sleep.

Note that I haven't experienced battery misreadings since 2.6.27.
Comment 37 Oliver 2009-06-09 11:20:07 UTC
Several entries on the net suggest, that this actually is a regression. It is said, that resume worked with kernel 2.6.18 and stopped working with kernel 2.6.22 onwards. I have a MSI VR201 which is the direct successor of the PR200 and is hardwarewise almost the same. The VR201 suffers from exactly the same bug (no wonder because HW and BIOS are almost the same as with the PR200 / Daru2 machines). If I can be of any help solving this issue, please let me know. I can offer testing.
Comment 38 Zhang Rui 2009-06-10 02:52:15 UTC
well, I have no idea how to debug this bug.
it would be great if you can run git-bisect to find out which commit introduces this regression.
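
A bisect between a kernel reported as working and one known to fail could be driven roughly as sketched below. The tags `v2.6.18` and `v2.6.25` in the usage note are assumptions taken from the versions mentioned in this report (2.6.18 reported as resuming, 2.6.25 listed as the earliest failing version); `GIT`, `start_bisect` and `record_verdict` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: thin wrappers around "git bisect", meant to be run inside
# a checkout of the kernel tree. GIT can be overridden for a dry run.
GIT="${GIT:-git}"

# start_bisect GOOD BAD: begin a bisect between a good and a bad revision.
start_bisect() {
    "$GIT" bisect start &&
    "$GIT" bisect bad "$2" &&
    "$GIT" bisect good "$1"
}

# record_verdict good|bad|skip: report the result for the revision that was
# just built and tested; git then checks out the next candidate.
record_verdict() {
    case "$1" in
        good|bad|skip) "$GIT" bisect "$1" ;;
        *) echo "usage: record_verdict good|bad|skip" >&2; return 2 ;;
    esac
}
```

A session would start with `start_bisect v2.6.18 v2.6.25`, followed by building and booting the checked-out revision, testing suspend/resume, and calling `record_verdict good` or `record_verdict bad` until git names the first bad commit. Setting `GIT=echo` prints the commands instead of running them.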
Comment 39 Eddy Petrișor 2009-06-13 10:56:28 UTC
(In reply to comment #37)
> Several entries on the net suggest, that this actually is a regression. It is
> said, that resume worked with kernel 2.6.18 and stopped working with kernel
> 2.6.22 onwards. I have a MSI VR201 which is the direct successor of the PR200
> and is hardwarewise almost the same. The VR201 suffers from exactly the same
> bug (no wonder because HW and BIOS are almost the same as with the PR200 /
> Daru2 machines). If I can be of any help solving this issue, please let me
> know. I can offer testing.

GREAT NEWS! I tested with the Debian kernel 2.6.18 from Debian Etch, since it was the easiest way to get a kernel that old, and after resume the voice synthesizer started to acknowledge the laptop RESUMED properly from sleep, although the display was bla[nc]k the whole time (probably a display driver issue since X didn't start with that old kernel).


I'll try to run a bisect on the kernel tree after I compile the pristine 2.6.18 to confirm it works with that version.
Comment 40 Eddy Petrișor 2009-06-13 11:47:58 UTC
Created attachment 21890
dmesg before sleep in 2.6.18-6-amd64 (working sleep/resume)
Comment 41 Eddy Petrișor 2009-06-13 11:48:50 UTC
Created attachment 21891
dmesg after sleep resume in 2.6.18-6-amd64 (working sleep/resume)
Comment 42 Eddy Petrișor 2009-06-13 11:49:46 UTC
(In reply to comment #38)
> well, I have no idea how to debug this bug.
> it would be great if you can run git-bisect to find out which commit introduces
> this regression.

I'll do that, now that I know it is a regression ;-) .
Comment 43 Zhang Rui 2009-06-17 07:20:02 UTC
ping Eddy, any updates?
Comment 44 Eddy Petrișor 2009-06-17 09:23:34 UTC
I am having difficulties booting my self-compiled kernels, although the config was the one from Debian (with mild changes). The initramfs stops, complaining about a syntax error in the bootkeymap. Since I have root on LVM, I am forced to use an initramfs.


I can confirm that Debian's 2.6.24-etchnhalf.1-amd64 is bad, while Debian's 2.6.18 is good.
Comment 45 Alexey Starikovskiy 2009-06-17 20:18:16 UTC
Could you please try vanilla 2.6.30?
Comment 46 Oliver 2009-06-17 21:16:01 UTC
(In reply to comment #45)
> Could you please try vanilla 2.6.30?

I tried it with vanilla 2.6.30 on my MSI VR201. Unfortunately that does not solve the issue.
Comment 47 Eddy Petrișor 2009-06-20 19:03:32 UTC
I managed to do the bisect with this core script:


0 eddy@heidi ~/usr/src/linux $ cat /root/bin/sleepit 
#!/bin/sh

FAILEDRESUME=/failed-resume
RESUMED=/resumed

modprobe i915
invoke-rc.d acpid stop
echo "$(uname -r)" > $FAILEDRESUME
dmesg >dmesg_before_$(uname -r); echo mem > /sys/power/state; dmesg >dmesg_after_$(uname -r); sync
echo 'resumed, oh my god' > resumed
echo "$(uname -r)" >> $RESUMED
rm -f $FAILEDRESUME
sync
sleep 10
reboot


So any kernel that failed left the /failed-resume file behind after reboot. What I find strange is that, although I always had these in the command line, I heard neither a beep nor the speech that I heard with older kernels (e.g. 2.6.18).
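
After the reboot that follows each test, the two marker files written by the script are enough to decide the outcome for a given kernel version. A minimal sketch of that check, assuming the same /failed-resume and /resumed paths as in the script; `resume_verdict` is a name invented here and is not part of the attached scripts.

```shell
#!/bin/sh
# Sketch only: classify a kernel from the marker files left by sleepit.
FAILEDRESUME="${FAILEDRESUME:-/failed-resume}"
RESUMED="${RESUMED:-/resumed}"

# resume_verdict VERSION: print "bad", "good" or "untested".
resume_verdict() {
    # The version is written to $FAILEDRESUME before suspending and the file
    # is only removed after a successful resume, so a match means failure.
    if [ -f "$FAILEDRESUME" ] && grep -qxF -- "$1" "$FAILEDRESUME"; then
        echo bad
    # Versions that came back are appended to $RESUMED, one per line.
    elif [ -f "$RESUMED" ] && grep -qxF -- "$1" "$RESUMED"; then
        echo good
    else
        echo untested
    fi
}
```

The printed word maps directly onto the good/bad answer that git bisect expects for the revision under test.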


Please note that during the bisect I never saw the screen recover properly, and that I did all the tests from the console (without X running - I disabled gdm).



Please tell me if you need the dmesg files.


This is the bad commit and appeared after 2.6.22 was released:



91a6c462b02d8dc02dbe95e5a407d78078a38d01 is first bad commit
commit 91a6c462b02d8dc02dbe95e5a407d78078a38d01
Author: H. Peter Anvin <hpa@zytor.com>
Date:   Wed Jul 11 12:18:57 2007 -0700

    Use the new x86 setup code for x86-64; unify with i386

    This unifies arch/*/boot (except arch/*/boot/compressed) between
    i386 and x86-64, and uses the new x86 setup code for x86-64 as well.

    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Comment 48 Eddy Petrișor 2009-06-20 19:06:20 UTC
Looking at the comments of the commit I realised I should clarify that I am running an x86_64 kernel with x86_64 userland (Debian GNU/Linux Lenny 5.0, amd64)
Comment 49 Shaohua 2009-07-01 06:00:16 UTC
The commit changes the boot code, but your bug is about suspend/resume, so the two do not sound related. Can you double check please?
Comment 50 Eddy Petrișor 2009-07-01 11:14:12 UTC
Taking into account that I did this bisect in an almost automatic fashion, I have NO reason to believe that anything was wrong with the bisect*.

OTOH, if all the HW components are not properly set up (the code is about hardware set up at boot time) it could be possible to mess up part of the behaviour.

I would suggest talking to Peter Anvin if he has any idea which part of the setup procedure could be responsible about this bug.



* in spite of that I will double check that the commit in question is the one responsible for this (of course, the problem is in a previous commit part of the set up code rewrite, but this one enables the new set up code on my arch)
Comment 51 H. Peter Anvin 2009-07-01 18:13:19 UTC
Quite likely, none!

Unfortunately it could be that slight differences in the handling of graphics in particular trigger BIOS bugs, but finding them is like looking for a needle in a haystack.

Now, the *current* code uses the new code for the resume path as well, but that is not the commit you fingered...
Comment 52 Eddy Petrișor 2009-07-01 22:16:48 UTC
(In reply to comment #51)
> Quite likely, none!

Sorry, could you clarify what you were answering here? Is it this part: "if he has any idea which part of the setup procedure could be responsible about this bug" ?

> Unfortunately it could be such that slight differences in the handling of
> especially graphics might trigger BIOS bugs, but that's like finding a needle
> in a haystack.

And aiui, I can't simply revert some of the commits in the rewrite and expect things to work or even compile (e.g. revert 91a6c462b02~X and hope we can narrow down the issue). Is that correct or do I have a chance to be able to boot such a kernel?


Note that the working kernels didn't resume the graphics card properly and the screen was always blank after resume.

> Now, the *current* code uses the new code for the resume path as well, but that
> is not the commit you fingered...

By current I assume you mean linux-2.6 master, right? I could try to see if it was fixed by any chance. The last time I tried to compile, it was right after 2.6.30 release and that kernel didn't boot for me (but it might have been incorrect configuration).
Comment 53 Eddy Petrișor 2009-07-01 22:49:22 UTC
I just tried the linux-image packages I created when I did the tests and with those packaged kernels indeed the bad commit seems to be the one indicated before.


Unless I screwed up really big time (although I remember the automation script worked properly that time, too) and versioned the wrong kernel I am 100% sure that commit screwed up sleep for my machine and I suspect that with a 32-bit kernel the failing kernel would have been 4fd06960 (aka 91a6c462~1), but I don't have a system with 32 bit userland to test.
Comment 54 Eddy Petrișor 2009-07-01 23:32:53 UTC
I tried the distribution packaged 2.6.30 (2.6.30-bpo.1-amd64) and it doesn't resume.
Comment 55 ykzhao 2009-07-06 09:26:04 UTC
Hi, Eddy
    Will you please do the following test on the 2.6.30 distribution?
    a. kill the process which is using /proc/acpi/event
    b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after;
sync;
    c. press the power button and see whether the box can be resumed. 
    If it can't be resumed, please reboot the box and check whether there
exists the file of dmesg_after
    
If there is no file of dmesg_after, maybe it can't be resumed from BIOS.

Thanks.
Comment 56 Eddy Petrișor 2009-07-06 09:55:04 UTC
(In reply to comment #55)
> Hi, Eddy
>     Will you please do the following test on the 2.6.30 distribution?
>     a. kill the process which is using /proc/acpi/event
>     b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after;
> sync;
>     c. press the power button and see whether the box can be resumed. 
>     If it can't be resumed, please reboot the box and check whether there
> exists the file of dmesg_after

Something like the script I used?

0 eddy@heidi ~ $ cat /root/bin/sleepit 
#!/bin/sh

FAILEDRESUME=/failed-resume
RESUMED=/resumed

modprobe i915
invoke-rc.d acpid stop
echo "$(uname -r)" > $FAILEDRESUME
dmesg >dmesg_before_$(uname -r); echo mem > /sys/power/state; dmesg >dmesg_after_$(uname -r); sync
echo 'resumed, oh my god' > resumed
echo "$(uname -r)" >> $RESUMED
rm -f $FAILEDRESUME
sync
sleep 10
reboot

> If there is no file of dmesg_after, maybe it can't be resumed from BIOS.

The machine is perfectly capable of resuming; as I said before, I did a bisect and some kernels before the commit I indicated did resume properly.


0 eddy@heidi /root/var/debug/sleep/regression $ ls -l */dmesg_after_*
-rw-r--r-- 1 root root 32896 2009-07-02 02:26 2.6.18-128.el5/dmesg_after_2.6.18-128.el5
-rw-r--r-- 1 root root 30945 2009-06-19 03:03 2.6.20-rc2-g0f5486ec-heidi/dmesg_after_2.6.20-rc2-g0f5486ec-heidi
-rw-r--r-- 1 root root 43534 2009-06-20 12:05 2.6.22-g0a85e9a2-heidi/dmesg_after_2.6.22-g0a85e9a2-heidi
-rw-r--r-- 1 root root 35077 2009-06-20 01:12 2.6.22-g0c73f18b-heidi/dmesg_after_2.6.22-g0c73f18b-heidi
-rw-r--r-- 1 root root 42770 2009-06-20 21:39 2.6.22-g4fd06960-heidi/dmesg_after_2.6.22-g4fd06960-heidi
-rw-r--r-- 1 root root 41439 2009-06-20 02:09 2.6.22-g4fda25a2-heidi/dmesg_after_2.6.22-g4fda25a2-heidi
-rw-r--r-- 1 root root 33534 2009-06-19 10:21 2.6.22-g7dcca30a-heidi/dmesg_after_2.6.22-g7dcca30a-heidi
-rw-r--r-- 1 root root 42821 2009-06-20 05:01 2.6.22-g7e69c3ac-heidi/dmesg_after_2.6.22-g7e69c3ac-heidi
-rw-r--r-- 1 root root 42962 2009-06-20 11:35 2.6.22-gc6e16295-heidi/dmesg_after_2.6.22-gc6e16295-heidi
-rw-r--r-- 1 root root 42871 2009-06-20 03:41 2.6.22-gf2d98ae6-heidi/dmesg_after_2.6.22-gf2d98ae6-heidi
Comment 57 Eddy Petrișor 2009-08-27 23:26:27 UTC
Isn't anyone able to help with the fix for the bug?

It is clear that the bug originates in the new start-up code.

I am willing to test different patches to help diagnose what needs to be changed to be able to put the laptop into sleep mode again.
Comment 58 Zhang Rui 2009-08-31 07:09:34 UTC
(In reply to comment #18)
> Linux kernel seems to work perfectly during suspend/resume.
> this is probably a BIOS/Hardware issue.
> please apply the debug patch attached below on top of 2.6.29-rc8 kernel, reboot
> with boot option "acpi_sleep=s3_sci_enable" and see if there is any difference.

hah, I did not see "acpi_sleep=s3_sci_enable".
could you please re-do the test and verify if the patch in comment #19 works WITH boot option "acpi_sleep=s3_sci_enable"?
Comment 59 Oliver 2009-09-02 12:46:45 UTC
(In reply to comment #58)
> (In reply to comment #18)
> > Linux kernel seems to work perfectly during suspend/resume.
> > this is probably a BIOS/Hardware issue.
> > please apply the debug patch attached below on top of 2.6.29-rc8 kernel, reboot
> > with boot option "acpi_sleep=s3_sci_enable" and see if there is any difference.
> 
> hah, I did not see "acpi_sleep=s3_sci_enable".
> could you please re-do the test and verify if the patch in comment #19 works
> WITH boot option "acpi_sleep=s3_sci_enable"?

I have just done that test. Unfortunately that does not change anything. Logs are to follow asap.
Comment 60 Oliver 2009-09-02 13:29:15 UTC
Created attachment 22975 [details]
Output of dmesg before Suspend with Kernel 2.6.29 sci enabled

I used Eddy's script. As the machine did not resume properly, there is no dmesg after resume. As requested, SCI was enabled.
Comment 61 Rafael J. Wysocki 2009-10-09 20:58:53 UTC
Please try to suspend with 2.6.32-rc3.

If resume still doesn't work, please attach full dmesg output.
Comment 62 Eddy Petrișor 2009-10-12 07:04:47 UTC
(In reply to comment #61)
> Please try to suspend with 2.6.32-rc3.
> 
> If resume still doesn't work, please attach full dmesg output.

Not only does it not work, but the regression in reading the battery information, which was fixed in 2.6.30, is back again.
Comment 63 Eddy Petrișor 2009-10-12 07:06:31 UTC
Created attachment 23355 [details]
dmesg before sleep with 2.6.32-rc2
Comment 64 Eddy Petrișor 2009-10-12 07:07:57 UTC
(In reply to comment #62)
> (In reply to comment #61)
> > Please try to suspend with 2.6.32-rc3.
> > 
> > If resume still doesn't work, please attach full dmesg output.
> 
> Not only it doesn't work, there is again the regression regarding the correct
> reading of the battery information which was fixed in 2.6.30.

... that is bug #10855
Comment 65 Eddy Petrișor 2009-10-13 06:51:26 UTC
(In reply to comment #62)
> (In reply to comment #61)
> > Please try to suspend with 2.6.32-rc3.
> > 
> > If resume still doesn't work, please attach full dmesg output.
> 
> Not only it doesn't work, there is again the regression regarding the correct
> reading of the battery information which was fixed in 2.6.30.

(In reply to comment #63)
> Created an attachment (id=23355) [details]
> dmesg before sleep with 2.6.32-rc2

The system doesn't resume either if ec_intr=0 is not passed as a boot parameter.
Comment 66 Eddy Petrișor 2009-10-23 06:58:05 UTC
(In reply to comment #51)
> Quite likely, none!
> 
> Unfortunately it could be such that slight differences in the handing of
> especially graphics might trigger BIOS bugs, but that's like finding a needle
> in a haystack.
> 
> Now, the *current* code uses the new code for the resume path as well, but that
> is not the commit you fingered...

I was looking over this BR and I wanted to point out that the sleep LED currently does not behave correctly: since the first sleep tests it has remained lit (while the machine is on, obviously), and booting into Windows or an older kernel did not change that.


OTOH, testing sleep with the newer 2.6.32-rc3 I have observed that, although the sleep-resume cycle still doesn't work properly, there is a slight change in behaviour:

Before:
- trigger sleep
- pressing the power button for resume resulted in:
  - some activity
  - auto shut-down
  - trying to power the laptop on again would result in an "on" cycle which didn't initialize the graphics card properly, and the fans would speed up (probably due to some infinite loop)
    - to recover I had to power off again, then power on


With the new kernel the sequence is the same up to and including the auto shut-down, but powering the laptop on afterwards no longer results in the fake, failing power cycle.

Another idea: could someone help me come up with a patch that would let me use the incremental development commits of the setup sequence, so I could pinpoint which of the setup changes is actually at fault?

AIUI, now I am actually pointing to a 'blob' in a way, so I would like to pin point which specific change in that blob is at fault.
Comment 67 Eddy Petrișor 2009-10-23 07:00:29 UTC
(In reply to comment #62)

> Not only it doesn't work, there is again the regression regarding the correct
> reading of the battery information which was fixed in 2.6.30.

For people hitting this issue, this is bug #14446.
Comment 68 Oliver 2009-12-19 23:36:10 UTC
There is not much going on here anymore. Is there anything I can do? Test something new for example?
Comment 69 Zhang Rui 2009-12-22 03:28:39 UTC
well, this is a tough bug.
I don't know how to debug this issue for now.
Comment 70 Len Brown 2011-01-18 06:31:05 UTC
So 2.6.18 suspend/resume worked on this system, but with no video restore.
2.6.22 through 2.6.32 fail.

Does it still fail when using the most recent stable kernel, 2.6.37?
Comment 71 Oliver 2011-01-18 21:56:41 UTC
I have just tested kernel 2.6.37 using an Ubuntu Mainline Kernel. Unfortunately the problem still exists.
Comment 72 Eddy Petrișor 2011-03-30 08:32:29 UTC
(In reply to comment #70)
> So 2.6.18 suspend/resume worked on this system, but with no video restore.
> 2.6.22 through 2.6.32 fail.
> 
> Does it still fail when using the most recent stable kernel, 2.6.37?

I will try and tell you. Sorry for the delayed response.

I have been hitting other issues related to display corruption after hibernate-resume cycles, I hope this won't impact this issue.
Comment 73 Eddy Petrișor 2011-04-01 06:29:07 UTC
I tried 2.6.37 and it has the same problem. Fake reboot, too (see comments above).
Comment 74 Eddy Petrișor 2011-04-01 06:31:00 UTC
Created attachment 52912 [details]
dmesg before sleep

This is the dmesg output before sleep obtained with 2.6.37.
Comment 75 Eddy Petrișor 2011-04-01 06:39:25 UTC
Created attachment 52922 [details]
sleeptest: the script I used for the bisect (useful on debian based systems)

I used this script to make the git bisect and identify the bad and the good versions. It relies on a stable kernel (has no git hash in the uname), a git source of the vanilla kernel, make-kpkg (debian utility to make kernel packages), linux-build (a wrapper I wrote to build the vanilla kernels) and dpkg. It almost does everything automatically, from performing the test to tracking the results and building new versions to test.

I thought that by making this public I would help others test this regression themselves.
Comment 76 Eddy Petrișor 2011-04-01 06:40:21 UTC
I changed the "Regression" value to Yes.
Comment 77 Eddy Petrișor 2011-04-01 06:44:30 UTC
Created attachment 52932 [details]
sleepit: the script that tries to extract the dmesgs before and after the sleep

This script appends to /resumed all kernel versions which managed to resume and will leave in /failed-resume the 'uname -r' of the last failed to resume kernel.

Currently my /resumed file contains:

2.6.20-rc2-g0f5486ec-heidi
2.6.22-g7dcca30a-heidi
2.6.22-g0c73f18b-heidi
2.6.22-g4fda25a2-heidi
2.6.22-gf2d98ae6-heidi
2.6.22-g7e69c3ac-heidi
2.6.22-gc6e16295-heidi
2.6.22-g0a85e9a2-heidi
2.6.22-g4fd06960-heidi
2.6.22-g4fd06960-heidi
2.6.18-128.el5
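
The two result files described above can be summarised with a small helper. This is only a sketch: the function name resume_report is made up for illustration, and the default paths are simply the ones the sleepit script writes to.

```shell
#!/bin/sh
# Sketch: summarise the result files left behind by the sleepit script.
# resume_report is a hypothetical helper name, not part of any attachment.
resume_report() {
    resumed=${1:-/resumed}          # kernels that came back from sleep
    failed=${2:-/failed-resume}     # last kernel that did not come back

    if [ -f "$resumed" ]; then
        echo "kernels that resumed:"
        # The same kernel may have been tested more than once, so
        # collapse duplicate entries before printing.
        sort -u "$resumed" | sed 's/^/  /'
    fi
    if [ -s "$failed" ]; then
        echo "last kernel that failed to resume: $(cat "$failed")"
    fi
    return 0
}
```

Running resume_report with no arguments reads /resumed and /failed-resume; passing two paths reads those files instead.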
Comment 78 Eddy Petrișor 2011-04-01 06:47:47 UTC
Created attachment 52942 [details]
linux-build: the script that does the kernel building and creates the deb package for it

This is the script I use to build kernel .debs . With this script, my entire test frame for this bug is public, in case somebody else wants to do a bisect for themselves.
Comment 79 Zhang Rui 2011-04-19 08:10:47 UTC
(In reply to comment #27)
> At some point in the past I extracted the DSDT and saw that Linux had a
> different table than Windows. Is this relevant in any way?

Oh, how did you know Windows and Linux are using different ACPI tables?
Comment 80 tadziu23 2011-04-20 10:15:25 UTC
my computer is affected by exactly the same bug on an msi ex 600 x machine.

just reporting that 2.6.38 did not resolve the issue.
Comment 81 Rafael J. Wysocki 2011-04-20 19:16:23 UTC
*** Bug 33752 has been marked as a duplicate of this bug. ***
Comment 82 Zhang Rui 2011-04-21 06:33:49 UTC
please build a 2.6.38 kernel with CONFIG_ACPI_DEBUG set, and then run
1. echo core > /sys/power/pm_test
   echo mem > /sys/power/state
   and then attach the dmesg output after this time.
2. echo 1 > /sys/power/pm_trace
   echo mem > /sys/power/state
   and then attach the dmesg output of the next boot after the hang.
Comment 83 Zhang Rui 2011-04-21 06:34:48 UTC
please build a 2.6.38 kernel with CONFIG_ACPI_DEBUG set, and then run
1. echo core > /sys/power/pm_test
   echo mem > /sys/power/state
   and then attach the dmesg output after this test.
2. echo 1 > /sys/power/pm_trace
   echo mem > /sys/power/state
   and then attach the dmesg output of the next boot after the hang.
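
For the second test, the interesting part of the next boot's dmesg is usually just a handful of lines. A minimal sketch for pulling them out, assuming the kernel's PM trace code reports its findings with "Magic number" and "hash matches" lines; pm_trace_hits is a made-up helper name. Note that pm_trace keeps its marker in the RTC, so the system clock is typically wrong after such a test.

```shell
#!/bin/sh
# Sketch: filter a dmesg log (read from stdin) down to the pm_trace lines.
# pm_trace_hits is a hypothetical helper name used only in this example.
pm_trace_hits() {
    # "Magic number" is the value recovered from the RTC; "hash matches"
    # lines name the source location or device the hash points at.
    # "|| true" keeps the exit status clean when nothing matches.
    grep -E 'Magic number:|hash matches' || true
}
```

On the boot after the hang this would be used as: dmesg | pm_trace_hits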
Comment 84 tadziu23 2011-04-21 07:31:42 UTC
okay, but i'm at work right now so you will have to wait till evening gmt+1.
Comment 85 tadziu23 2011-04-21 18:51:53 UTC
Created attachment 54892 [details]
2.6.38.2 dmesg after echo core > /sys/power/pm_test
Comment 86 tadziu23 2011-04-21 18:52:37 UTC
Created attachment 54902 [details]
2.6.38.2 dmesg after echo 1 > /sys/power/pm_test
Comment 87 tadziu23 2011-04-21 19:00:08 UTC
i'm not sure whether the first dmesg log is proper; the first echo mem > /sys/power/state caused my machine to shut down and halt, so the dmesg output was created after reboot. 

the command from the second point caused a standard suspend, and unfortunately the same result: blank screen; self reboot with blank screen after; manual shutdown and normal boot, dmesg log done after that.

please be patient if i did something wrong or just misunderstood, i'm just a librarian with semi-advanced computer and linux skills.

best regards

/t
Comment 88 Zhang Rui 2011-04-22 01:37:45 UTC
(In reply to comment #87)
> i'm not shure whether first dmesg log is proper, first echo mem >
> sys/power/state caused my machine to shutdown and halt, so dmesg output was
> created after reboot. 
> 
did you run "echo core > /sys/power/pm_test" first?
If yes, please run "echo mem > /sys/power/state" and wait for about 10 seconds to see if the machine resumes automatically.

(In reply to comment #86)
> Created an attachment (id=54902) [details]
> 2.6.38.2 dmesg after echo 1 > /sys/power/pm_test

it should be "echo 1 > /sys/power/pm_trace"
Comment 89 tadziu23 2011-04-22 06:26:18 UTC
>did you run "echo core > /sys/power/pm_test" first?
>If yes, please run "echo mem > /sys/power/state" and wait for about 10 seconds
>to see if the machine resumes automatically.

yes, but i didn't wait, and resumed manually.

>it should be "echo 1 > /sys/power/pm_trace"

that's the way i did it, i just made a mistake in the comment copy/paste.
i'll send new logs later.
Comment 90 tadziu23 2011-04-22 11:36:37 UTC
Created attachment 55032 [details]
2.6.38.2 dmesg after echo core > /sys/power/pm_test
Comment 91 tadziu23 2011-04-22 11:38:23 UTC
Created attachment 55042 [details]
2.6.38.2 dmesg after echo 1 > /sys/power/pm_trace
Comment 92 tadziu23 2011-04-22 11:43:35 UTC
echo core > /sys/power/pm_test
echo mem > /sys/power/state

causes my machine to shut down and halt. no automatic resume after that (i've waited five minutes). dmesg taken after manual bootup.

echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state

reproduces bug, dmesg output after reboot.
Comment 93 Zhang Rui 2011-04-25 08:46:32 UTC
(In reply to comment #92)
> echo core > /sys/power/pm_test
> echo mem > /sys/power/state
> 
> causes my machine to shutdown and halt. no automatic resume after that (i've
> waited five minutes). dmesg after manual bootup.
> 
oh, this sounds like a kernel issue.
please echo one of these items {core processors platform devices freezer} > /sys/power/pm_test each time, and check which one starts to give you the automatic resume.
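
Stepping through the levels by hand is tedious, so here is a rough sketch of a loop that does it. It assumes the levels are tried from the shallowest (freezer) to the deepest (core); PM_ROOT, LOG_DIR and the function names are hypothetical knobs added for this example, not existing tools. Must be run as root, and if a level powers the machine off, the last dmesg file written before the reboot tells you how far it got.

```shell
#!/bin/sh
# Sketch: try each pm_test level in turn and keep a dmesg log per level.
# PM_ROOT and LOG_DIR are assumptions of this example; by default they
# point at the real sysfs directory and the current directory.
PM_ROOT=${PM_ROOT:-/sys/power}
LOG_DIR=${LOG_DIR:-.}

# Shallowest test first, deepest last.
PM_TEST_LEVELS="freezer devices platform processors core"

run_pm_test_level() {
    lvl=$1
    # Select the test level; the next suspend request should stop at
    # that stage and come back on its own after a few seconds.
    echo "$lvl" > "$PM_ROOT/pm_test" || return 1
    sync
    echo mem > "$PM_ROOT/state" || return 1
    # Reaching this point means the machine came back; save the log.
    dmesg > "$LOG_DIR/dmesg_pm_test_$lvl" 2>/dev/null || true
    echo "$lvl: returned"
}

run_all_pm_test_levels() {
    for level in $PM_TEST_LEVELS; do
        run_pm_test_level "$level" || { echo "$level: write failed"; return 1; }
    done
    # Switch test mode off again so that a later suspend is a real one.
    echo none > "$PM_ROOT/pm_test"
}
```

Calling run_all_pm_test_levels prints one "returned" line per level that came back.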
Comment 94 tadziu23 2011-04-26 12:16:12 UTC
Created attachment 55562 [details]
2.6.38.2 dmesg after echo core > /sys/power/pm_test (proper)
Comment 95 tadziu23 2011-04-26 12:28:07 UTC
i don't know why, but running echo core > /sys/power/pm_test in recovery mode does not result in a system shutdown. the same goes for processors, platform and devices, which cause the system to shut down in standard kernel mode.

only freezer works in both recovery and normal kernel modes.

do you need the rest of the dmesg suspend debug logs (after echoing processors, platform and devices)?
Comment 96 Rafael J. Wysocki 2011-04-26 17:00:26 UTC
(In reply to comment #95)
> i dont know why, but running echo core > /sys/power/pm_test in recovery mode
> does not result in system shutdown. same goes with processors, platform and
> devices echoed, which cause system to shutdown in standard kernel mode.

What do you mean by "shutdown"?  The test modes are supposed to simulate
suspend without putting the system into the sleep state (i.e. they should
return to command prompt after several seconds).

> only freezer work both in recovery and normal kernel modes.
> 
> do you need rest of dmesg suspend debug logs (after echoing of processor,
> platform and devices)?

If they don't work as intended, then yes, we do.
Comment 97 tadziu23 2011-04-26 20:00:06 UTC
as i said in comment #92, by shutdown i mean that executing the two commands 'echo core > /sys/power/pm_test' and 'echo mem > /sys/power/state' one after another causes my machine to simply turn off. i used the words shutdown and halt because that is what happens when you execute 'shutdown -h 0', but the difference is that when testing suspend it just turns off immediately. 

i guess that all of the tests (core, processors and so on) work properly in recovery mode, since my machine does not shut down and the whole system runs flawlessly after the test.
Comment 98 Rafael J. Wysocki 2011-04-26 20:28:40 UTC
Yes, the tests appear to work correctly in the recovery mode.

What exactly is the difference between the recovery mode and the normal
working mode (at least from the kernel's perspective)?
Comment 99 Zhang Rui 2011-04-27 07:24:39 UTC
recovery mode equals single mode?

so it seems that some device driver breaks the resume?
Comment 100 tadziu23 2011-04-27 14:36:42 UTC
i meant single user mode (called recovery in the grub menu) and multiuser mode with network services. 

it is possible that it is some device driver, but i don't use any proprietary or closed source drivers. the problem occurs on pure debian squeeze even without an x server installed. 

i can't find the link right now, but this bug was also present in windows xp with one version of the nvidia graphics driver. i'm not sure if it is the right clue, though, since i don't use either the windows or the linux nvidia drivers.  

it's just an idea, but this bug is common to the msi pr200 and msi ex600x and i haven't found any bug reports for other msi laptops, so maybe comparing hardware specs could narrow down the suspected driver? 

i've also tried contacting msi poland about the problem, but the only answer i got was that they don't support linux (what could i expect anyway).
Comment 101 Zhang Rui 2012-01-18 01:41:56 UTC
It's great that kernel bugzilla is back.

can you please verify if the problem still exists in the latest upstream kernel?
Comment 102 tadziu23 2012-01-19 18:48:21 UTC
problem persists with the 3.2.1 kernel. 

as soon as i finish writing my phd i really can donate this problematic machine to some developer, since (although still working) it's falling apart by itself ;) 

cheers
Comment 103 tadziu23 2012-01-19 19:30:49 UTC
Created attachment 72134 [details]
dmesg output from suspend/resume

dmesg output created with script from comment #77
Comment 104 Len Brown 2012-06-05 04:41:35 UTC
[  568.550249] [drm] Initialized drm 1.1.0 20060810
[  568.607973] [drm:i915_init] *ERROR* drm/i915 can't work without intel_agp module!

Can you fix this config issue?
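
One way to check this, sketched below: look at whether the Intel AGP driver (CONFIG_AGP_INTEL, which provides the intel_agp module) is enabled in the running kernel's config. The default path /boot/config-$(uname -r) is an assumption that holds on Debian-style installs, and check_agp_intel is a made-up helper name.

```shell
#!/bin/sh
# Sketch: report how CONFIG_AGP_INTEL is set in a kernel config file.
# check_agp_intel is a hypothetical helper; the default config path is
# an assumption (Debian-style /boot/config-<version>).
check_agp_intel() {
    config=${1:-/boot/config-$(uname -r)}
    case $(grep -E '^CONFIG_AGP_INTEL=' "$config" 2>/dev/null) in
        CONFIG_AGP_INTEL=y) echo "built-in" ;;
        CONFIG_AGP_INTEL=m) echo "module" ;;   # may still need: modprobe intel_agp
        *)                  echo "missing" ;;  # not set, or config not readable
    esac
}
```

If it prints "module", loading intel_agp before i915 would be the thing to try; if it prints "missing", the kernel needs to be rebuilt with that option.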
Comment 105 Ondrej Zary 2012-10-22 06:26:09 UTC
I have the same problem with MSI EX600. It's still present in 3.6 kernel.
2.6.18-amd64 works but 2.6.18-i386 does not. 2.6.22 does not work. I'm going to test if commit 91a6c462b02d8dc02dbe95e5a407d78078a38d01 really breaks it.
Comment 106 Ondrej Zary 2012-10-22 15:57:12 UTC
Resume works with 2.6.22 amd64 (x86_64) kernel when booted using linux16 command of grub2 but does not work when booted using linux command.
Resume does not work with 2.6.23 even with linux16.
2.6.21 and older can only be booted with linux16.
Comment 107 H. Peter Anvin 2012-10-22 16:34:05 UTC
[Slightly off topic]

The "linux" command in Grub2 is just plain broken.  "linux16" is the right thing on a BIOS platform; the fact that it is a non-obvious default is just another case of massive Grub2 brain damage.
Comment 108 Ondrej Zary 2012-10-22 17:11:46 UTC
Agreed, this crap causes various weird problems (like APM breakage on older machines). Just downgraded to grub-legacy for the rest of this testing.

Bisect between 2.6.22 and 2.6.23 does not seem to work, the first kernel was non-bootable and "git bisect skip" produced 4 more unbootable kernels so I gave up.

But reverting commit "91a6c462b02d8dc02dbe95e5a407d78078a38d01" (and also "c39736823232bc3ca113c8228fa852c09fba300e" for the build to work) produced 2.6.23 kernel that resumes properly.
Comment 109 Ondrej Zary 2012-10-22 19:41:41 UTC
Found out by copying parts of x86_64 setup.S to i386 setup.S in 2.6.22 kernel that the problem is A20-related. When the x86_64 A20 code is put into i386 setup.S, resume works with i386 kernel.

More testing revealed that the a20_test always succeeds, preventing any A20 switching. This allowed me to produce a 3.6 kernel with working resume by commenting out all a20_test_short() and a20_test_long() calls (except the last one) in enable_a20() in arch/x86/boot/a20.c
Comment 110 Ondrej Zary 2012-10-22 21:57:23 UTC
Seems that the BIOS requires enable_a20_kbc() even if A20 seems to be enabled. Luckily, it can be done later (just did it with a userspace program on running system and resume worked then).

So the early code can be left unmodified and a DMI-based quirk can be created that will do this later. Where should that code be put? Maybe into i8042.c?

Also I wonder what Windows does with A20...
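
For a DMI-based quirk, the identification strings of each affected machine are needed. A small sketch for collecting them from sysfs, assuming the standard /sys/class/dmi/id directory is available; dmi_ident is a made-up helper name, and which of these fields a quirk would actually match on is left open here.

```shell
#!/bin/sh
# Sketch: print the DMI strings a quirk table entry could be matched on.
# dmi_ident is a hypothetical helper; the directory argument exists only
# so the function can be pointed at a copy of the files.
dmi_ident() {
    dir=${1:-/sys/class/dmi/id}
    for field in sys_vendor product_name board_vendor board_name chassis_vendor; do
        # Skip fields the firmware does not provide or we cannot read.
        if [ -r "$dir/$field" ]; then
            printf '%s=%s\n' "$field" "$(cat "$dir/$field")"
        fi
    done
}
```

Owners of the other affected models could run dmi_ident and paste the output here.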
Comment 111 H. Peter Anvin 2012-10-22 22:14:10 UTC
OK, I guess what happens is that the bootloader (or possibly the BIOS) probably enables A20 via port 92h whereas the BIOS expects it to have been enabled via the KBC.  There is a reason the Linux code does the order BIOS, KBC, port 92h even though port 92h is faster.

What I'd like to know is if calling enable_a20_bios() unconditionally works on your system (i.e. disable the first a20_test_short() only?)
Comment 112 Ondrej Zary 2012-10-23 06:11:51 UTC
I've already tried that - it does not work, unfortunately.
Comment 113 Ondrej Zary 2012-10-23 19:23:39 UTC
Some tests in DOS show that A20 is disabled on boot. BIOS A20 functions (enable/disable/read status) seem to work correctly. And debugger shows that BIOS uses port 92h...

So probably GRUB enables A20 using BIOS. Linux would do the same if A20 was disabled. And BIOS itself is buggy, INT 15h uses 92h but resume requires KBC. 

Is there anything that we can do except adding some DMI-based quirk?
Comment 114 Ondrej Zary 2012-10-23 21:18:35 UTC
Created attachment 84551 [details]
[PATCH] Enable A20 using KBC for some MSI laptops



This patch fixes the problem on my EX600 laptop and should also fix it on EX700, GX700, VR201, VR601 and PR200 (list and DMI data found in bug reports at Ubuntu Launchpad). Patched kernel works also with Grub2 linux command, both i386 and x86_64.

Is something like this acceptable?
Comment 115 frank 2012-10-23 21:43:04 UTC
works on PR200, thank you!
Comment 116 H. Peter Anvin 2012-10-24 00:52:20 UTC
Seems reasonable to me.  Send me the patch with proper header and Signed-off-by: and I'll apply it.
Comment 117 Eddy Petrișor 2012-10-24 17:11:01 UTC
(In reply to comment #114)
> Created an attachment (id=84551) [details]
> [PATCH] Enable A20 using KBC for some MSI laptops
> 
> 
> This patch fixes the problem on my EX600 laptop and should also fix it on
> EX700, GX700, VR201, VR601 and PR200 (list and DMI data found in bug reports at
> Ubuntu Launchpad). Patched kernel works also with Grub2 linux command, both
> i386 and x86_64.
> 
> Is something like this acceptable?

I have a MSI PR200, over what tree should this patch be applied?
I want to test this myself, too.
Comment 118 Ondrej Zary 2012-10-24 18:15:04 UTC
I've created it with 3.6-rc5. I hope that it will apply to 3.7 too.
Comment 119 Eddy Petrișor 2012-10-25 07:45:53 UTC
(In reply to comment #118)
> I've created it with 3.6-rc5. I hope that it will apply to 3.7 too.

It works for my MSI PR200 (patch applied over v3.6.0). Thank you very much for this fix.

Peter, do you know which official release will be the first to contain this patch?
Comment 120 Ondrej Zary 2012-12-13 08:12:20 UTC
Finally, a much simpler patch was merged:
http://git.kernel.org/tip/ad68652412276f68ad4fe3e1ecf5ee6880876783
Comment 121 Len Brown 2013-02-08 18:35:06 UTC
commit ad68652412276f68ad4fe3e1ecf5ee6880876783
Author: Ondrej Zary <linux@rainbow-software.org>
Date:   Tue Dec 11 22:18:05 2012 +0100

    x86, 8042: Enable A20 using KBC to fix S3 resume on some MSI laptops


shipped in 3.8-rc1

closed.
Comment 122 Eddy Petrișor 2013-07-28 20:41:47 UTC
In case somebody is wondering, I have been using kernel 3.10, which contains this patch, and haven't encountered any issues related to sleep (except some oopses, but those are from the video driver).
Comment 123 tadziu23 2014-02-09 10:12:14 UTC
was that patch removed from the 3.12 kernel? hibernation is broken again in the same way after updating from 3.8 in debian.
Comment 124 Alan 2014-02-10 10:18:46 UTC
It's still present. Please try and find which kernel version your laptop fails at again. Probably some other bug
Comment 125 tadziu23 2014-02-10 12:28:48 UTC
3.12-1-amd64. it is debian jessie default kernel in main branch.  

yesterday i recompiled the 3.12 kernel with the patch provided here by Ondrej Zary applied, and hibernation works again. i copied the .config from the debian stock kernel, if that matters.
Comment 126 Ondrej Zary 2014-02-10 12:57:32 UTC
The patch attached to this bug report is old version and completely different from the patch present in upstream kernel:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ad68652412276f68ad4fe3e1ecf5ee6880876783
Comment 127 tadziu23 2014-02-10 19:41:06 UTC
(In reply to Ondrej Zary from comment #126)
> The patch attached to this bug report is old version and completely
> different from the patch present in upstream kernel:

yes i know. to clear some things up:

- for the debian 3.2 stock kernel hibernation works (the patch had to be applied by the debian devs)
- the 3.8 kernel was compiled by myself without applying any additional patch (yours or upstream); hibernation was working.
- upgraded wheezy to jessie; the 3.12 stock kernel has broken hibernation 
- recompiling the 3.12 kernel with the upstream patch (taken from comment 120) failed for me (i could not apply the patch, which is hard for me to debug as i'm not a skilled coder; i got an error in line 6)
- recompiling with your patch went fine, hibernation is working again in 3.12

i can try to make some debug logs with stock 3.12 using the method from comment 83,
but it can take a few days as i have very little free time recently.
