Latest working kernel version: -
Earliest failing kernel version: 2.6.25
Distribution: Debian GNU/Linux Lenny
Hardware Environment: MSI PR200WX-058EU laptop (Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03))
Software Environment: Debian Lenny (GNOME)
Problem Description: Putting the laptop to sleep works without any problem, but when I press the power button to wake it, the laptop seems to recover while the screen remains blank and the fans spin up; nothing can be done until I press the power button once more. At that point the laptop restarts. Neither opening the lid nor pressing other keys wakes the laptop from sleep, but this is consistent with the way it wakes up under Windows. The laptop recovers fine from sleep in Windows XP. The DSDT disassembly shows different paths for Windows and Linux.
Steps to reproduce:
1. put the laptop to sleep from gnome-power-manager
2. try to wake up the laptop by pressing the power button
3. the screen remains blank
what's the X server version of your system? please verify if this bug still exists in the latest kernel. please attach the dmesg output after boot. please attach the acpidump output.
Hi, Eddy
Will you please load the i915 driver in console mode and see whether the box can be resumed? It would be great if you could confirm whether it can't be resumed from S3 or whether only the screen is blank.
Will you please add the boot option "acpi_sleep=beep" and do the following test?
a. kill the process which is using /proc/acpi/event
b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after; sync;
c. press the power button and see whether the box can be resumed.
If it can't be resumed, please reboot the box and check whether the file dmesg_after exists.
Thanks.
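For anyone repeating test (b), here is a minimal sketch of the same sequence wrapped in a function. The variables STATE_FILE, DMESG and LOG_DIR are hypothetical knobs that are not part of the original request; they exist only so the function can be dry-run against scratch files. The extra sync before suspending is also an addition, so that dmesg_before is on disk even if resume fails.

```shell
#!/bin/sh
# Sketch only. STATE_FILE, DMESG and LOG_DIR are hypothetical overrides;
# the defaults are the paths and command used in the thread.
STATE_FILE=${STATE_FILE:-/sys/power/state}
DMESG=${DMESG:-dmesg}
LOG_DIR=${LOG_DIR:-.}

# Step (a) of the test (stopping whatever reads /proc/acpi/event, e.g.
# acpid) is assumed to have been done by hand before calling this.
suspend_and_log() (
    # Snapshot the kernel log and flush it before suspending.
    "$DMESG" > "$LOG_DIR/dmesg_before" || exit 1
    sync
    # Writing "mem" requests suspend-to-RAM (S3).
    echo mem > "$STATE_FILE" || exit 1
    # Only reached if the machine actually came back.
    "$DMESG" > "$LOG_DIR/dmesg_after"
    sync
)
```

If dmesg_after is missing after the forced reboot, the kernel never got far enough in resume to run the second dmesg.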
(In reply to comment #1) > what's the X server version of your system? 0 eddy@heidi ~/usr/src/linux/linux-2.6 $ Xorg -version X.Org X Server 1.4.2 Release Date: 11 June 2008 X Protocol Version 11, Revision 0 Build Operating System: Linux Debian (xorg-server 2:1.4.2-10) Current Operating System: Linux heidi 2.6.29-rc7-heidi #1 SMP Fri Mar 13 00:21:39 EET 2009 x86_64 Build Date: 09 January 2009 02:16:05AM Before reporting problems, check http://wiki.x.org to make sure that you have the latest version. Module Loader present > please verify if this bug still exists in the latest kernel. TBH, the kernel was not 2.6.29.rc7, but it was this git version: 0 eddy@heidi ~/usr/src/linux/linux-2.6 $ git show commit ebdcc81c71937b30e09110c02a1e8a21fa770b6f Merge: 01f6750... 260cf8a... Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed Mar 11 12:14:55 2009 -0700 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm: fix EDID parser problem with positive/negative hsync/vsync Do I need to test the current master? > please attach the dmesg output after boot. > please attach the acpidump output. The acpidump is at http://bugzilla.kernel.org/show_bug.cgi?id=10855#c11
I forgot to say, before using this kernel, I was trying to address some issues in the older kernels and I am booting regularly with the following kernel options: quiet ec_intr=0 usbcore.autosuspend=1 or quiet ec_intr=0 usbcore.autosuspend=1 splash video=intelfb (The second variant is needed to allow splashy to work). Do these matter now? Do I have to remove any of them?
Created attachment 20541 [details] dmesg, fresh right after boot (In reply to comment #1) > please attach the dmesg output after boot. This is the dmesg output right after boot, in console mode, gdm not started.
Created attachment 20542 [details] dmesg after reboot with i915 loaded
(In reply to comment #2) > Hi, Eddy > Will you please load the i915 driver under the console mode and see > whether > the box can be resumed? > It will be great if you can confirm whether it can't be resumed from S3 > or > the screen is blank. Actually my initial report was somewhat inaccurate. When pressing the power button during sleep the following happen: 1. laptop seems to wake up, screen remains blank 2. after about 1 or 2 seconds from the press the laptop reboots itself without me doing anything 3. after the self triggered reboot the screen remains blank/black and looks as if the laptop doesn't do anything while the fans keep on spinning In order to recover I have to press the power button (long press). After these things occur, the led indicating sleep remains lit (during sleep it flashes). I tried to boot in Windows and put the laptop in stand-by then recover in the hope it will shut down the led, but it didn't (after recovering from sleep, the led was on). > Will you please add the boot option of "acpi_sleep=beep" and do the > following test? I suspect you are talking about a single test in this paragraph as well as in the previous one. If not, please elaborate. > a. kill the process which is using /proc/acpi/event I stopped acpid (invoke-rc.d acpid stop). > b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after; > sync; > c. press the power button and see whether the box can be resumed. > If it can't be resumed, please reboot the box and check whether there > exists the file of dmesg_after. The laptop didn't resume (see details above). No dmesg_after was written.
It seems I forgot to say that the laptop hibernates and recovers properly.
(In reply to comment #3) > (In reply to comment #1) > > TBH, the kernel was not 2.6.29.rc7, but it was this git version: > > 0 eddy@heidi ~/usr/src/linux/linux-2.6 $ git show > commit ebdcc81c71937b30e09110c02a1e8a21fa770b6f > Merge: 01f6750... 260cf8a... > Author: Linus Torvalds <torvalds@linux-foundation.org> > Date: Wed Mar 11 12:14:55 2009 -0700 > > Merge branch 'drm-fixes' of > git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 > > * 'drm-fixes' of > git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: > drm: fix EDID parser problem with positive/negative hsync/vsync > > > > Do I need to test the current master? > No, this one is okay. (In reply to comment #7) > (In reply to comment #2) > > Will you please add the boot option of "acpi_sleep=beep" and do the > > following test? > > I suspect you are talking about a single test in this paragraph as well as in > the previous one. If not, please elaborate. > can you hear the beep after pressing the power button? another some things to verify, 1. problem exists in all the kernels you've tried. 2. the symptom is the same in both console/X mode 3. problem doesn't exist in Windows right?
(In reply to comment #9) > (In reply to comment #3) > > (In reply to comment #1) > > > > TBH, the kernel was not 2.6.29.rc7, but it was this git version: > > > > 0 eddy@heidi ~/usr/src/linux/linux-2.6 $ git show > > commit ebdcc81c71937b30e09110c02a1e8a21fa770b6f [..] > > > > Do I need to test the current master? > > > No, this one is okay. Great! > (In reply to comment #7) > > (In reply to comment #2) > > > Will you please add the boot option of "acpi_sleep=beep" and do the > > > following test? > > > > I suspect you are talking about a single test in this paragraph as well as > in > > the previous one. If not, please elaborate. > > > can you hear the beep after pressing the power button? No. I suspected that should happen after passing that option, but I forgot to report. > another some things to verify, > 1. problem exists in all the kernels you've tried. All kernels I have tried had this issue in: 2.6.25, 2.6.26, 2.6.27, 2.6.29.rc7+ . The 2.6.25 and 2.6.26 kernels were the stock ones from Debian. 2.6.27 and 2.6.29.rc7+ were compiled from linus' git (2.6.27 from the tag, 2.6.29.rc7 from ebdcc81c71. I don't recall explicitly trying sleep with 2.6.24 since it's stay on my laptop was short lived, but since it was the first linux kernel installed I suspect I tried and it failed. > 2. the symptom is the same in both console/X mode If there's no answer within 2-3 hours please consider an implicit "yes, the same happens in X". I'll submit this comment and add a reply stating otherwise, if the behaviour is different for X (I have to leave for work and sleep would be the last thing I'd try). > 3. problem doesn't exist in Windows Yes > right? >
Will you please try the boot option of "acpi_sleep=old_ordering" and do the test as mentioned in comment #2? Thanks.
please, 1. set CONFIG_PM_DEBUG and rebuild the kernel 2. boot into the new kernel 3. echo {freezer, devices, platform, processor, core} > /sys/power/pm_test 4. echo mem > /sys/power/state 5. please attach the dmesg output when pm_test==core if the system can come back in a few seconds
(In reply to comment #11) > Will you please try the boot option of "acpi_sleep=old_ordering" and do the > test as mentioned in comment #2? I tried with old_ordering with and without acpi_sleep=beep, the result was the same.
Created attachment 20599 [details] pm_debug: dmesg after recovery (pm_test==core)
(In reply to comment #12) > please, > 1. set CONFIG_PM_DEBUG and rebuild the kernel > 2. boot into the new kernel > 3. echo {freezer, devices, platform, processor, core} > /sys/power/pm_test I wasn't sure if these should be given in a sequence as: echo freezer > /sys/power/pm_test echo devices > /sys/power/pm_test echo platform > /sys/power/pm_test echo processor > /sys/power/pm_test echo core > /sys/power/pm_test then step 4 or if they should have been like: echo freezer > /sys/power/pm_test 4. echo devices > /sys/power/pm_test 4. echo platform > /sys/power/pm_test 4. echo processor > /sys/power/pm_test 4. echo core > /sys/power/pm_test I did the sequence as in the first variant. > 4. echo mem > /sys/power/state > 5. please attach the dmesg output when pm_test==core if the system can come > back in a few seconds It came back and I did a dmesg before and after the sleep. They are both attached.
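To make the intended procedure less ambiguous, here is a hedged sketch of the "one level, one suspend" reading of step 3, which is the second variant above. It assumes that a kernel built with CONFIG_PM_DEBUG lists the available levels in /sys/power/pm_test with the active one in brackets, and it reads the level names from that file instead of hardcoding them. PM_TEST_FILE and STATE_FILE are hypothetical overrides for dry runs.

```shell
#!/bin/sh
# Sketch only. PM_TEST_FILE and STATE_FILE are hypothetical overrides;
# the defaults are the sysfs paths used in the thread.
PM_TEST_FILE=${PM_TEST_FILE:-/sys/power/pm_test}
STATE_FILE=${STATE_FILE:-/sys/power/state}

# Print the offered test levels, one per line, without "none" and
# without the [brackets] that mark the currently selected level.
pm_test_levels() {
    sed 's/[][]//g' "$PM_TEST_FILE" | tr ' ' '\n' | grep -v -e '^none$' -e '^$'
}

# For each level: select it, then trigger one test suspend.
# Levels are tried in the order the kernel lists them.
run_pm_tests() {
    for level in $(pm_test_levels); do
        echo "$level" > "$PM_TEST_FILE" || return 1
        echo mem > "$STATE_FILE" || return 1
        echo "pm_test=$level: came back"
    done
    # Return to normal suspend behaviour afterwards.
    echo none > "$PM_TEST_FILE"
}
```

Writing all five levels in a row and suspending once, as in the first variant, only exercises the last level written, because each write replaces the previous selection.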
Err, I meant I attached the after, since it didn't make sense to add the before if the after was present.
Created attachment 20603 [details] pm_debug: dmesg after recovery (pm_test==core) - without old_ordering Because of the previous tests, my system defaulted to booting with acpi_sleep=old_ordering, which (from a diff) looks relevant to my untrained eye. This dmesg is also obtained in the same way after resuming from sleep, pm_test==core, but without old_ordering.
Linux kernel seems to work perfectly during suspend/resume. this is probably a BIOS/Hardware issue. please apply the debug patch attached below on top of 2.6.29-rc8 kernel, reboot with boot option "acpi_sleep=s3_sci_enable" and see if there is any difference.
Created attachment 20633 [details] patch: introduce acpi_sleep=s3_sci_enable
2.6.29 was just released. Is it OK if I try that kernel with your patch?
yes, please. :)
(In reply to comment #21) > yes, please. :) I tried 2.6.29-rc8 with your patch; the result was the same. Note that I still have "quiet ec_intr=0 usbcore.autosuspend=1" in my boot parameters. Does this matter in any way? I am compiling 2.6.29 with your patch and will try that too. Maybe something will work better.
Same thing happened with 2.6.29 with your patch.
> acpi_sleep=beep was there any sound from the "PC speaker" on (failed) resume when you used this option?
(In reply to comment #24) > > acpi_sleep=beep > > was there any sound from the "PC speaker" on (failed) resume when you used > this > option? No. I would have said if there was anything different.
Created attachment 20782 [details] use the RTC cmos area(0x60-0x64) to track whether suspend/resume hangs
Will you please use the debug patch on the latest kernel (2.6.29) and do the following test?
a. echo 25 > /proc/cmos ; echo mem > /sys/power/state so that the box enters the suspend state
b. press the power button. If the box can't be resumed, please reboot the system.
c. after the system is rebooted, please cat /proc/cmos and attach the output of dmesg.
Thanks.
At some point in the past I extracted the DSDT and saw that Linux had a different table than Windows. Is this relevant in any way?
(In reply to comment #26) > Created an attachment (id=20782) [details] > use the RTC cmos area(0x60-0x64) to track whether suspend/resume hangs > > Will you please use the debug patch on the latest kernel(2.6.29) and do the > following test? By "the debug patch" you mean this patch, right? Yes, no problem. > a.echo 25 > /proc/cmos ; echo mem > /sys/power/state so that the box > enters > the suspend state > b. press the power button. If the box can't be resumed, please reboot the > system. > c. after the system is rebooted, please cat /proc/cmos and attach the > output > of dmesg. > > Thanks. No, thank you!
(In reply to comment #28) > (In reply to comment #26) > > Created an attachment (id=20782) [details] [details] > > use the RTC cmos area(0x60-0x64) to track whether suspend/resume hangs > > > > Will you please use the debug patch on the latest kernel(2.6.29) and do the > > following test? > > By "the debug patch" you mean this patch, right? > > Yes, no problem. > > > a.echo 25 > /proc/cmos ; echo mem > /sys/power/state so that the box > enters > > the suspend state > > b. press the power button. If the box can't be resumed, please reboot > the > > system. > > c. after the system is rebooted, please cat /proc/cmos and attach the > output > > of dmesg. I installed the 2.6.29 with the last patch (I also have the s3_sci_enable patch applied, too) and did what you asked me. As usual, nothing changed, but I have the information you requested after booting with these parameters: 0 eddy@heidi ~ $ cat /proc/cmdline BOOT_IMAGE=Linux ro root=fe00 quiet ec_intr=0 usbcore.autosuspend=1 splash video=intelfb acpi_sleep=beep
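Since several tests in this report depend on which options were actually on the kernel command line, a small helper like the following can confirm it before each run. This is a sketch; has_boot_param is a hypothetical name, and the optional second argument exists only so the function can be tried against a saved copy of /proc/cmdline.

```shell
#!/bin/sh
# has_boot_param NAME [CMDLINE_FILE]
# Succeeds if NAME is present as a whole parameter, either bare
# ("quiet") or with a value ("acpi_sleep=beep"). Hypothetical helper.
has_boot_param() (
    set -f    # do not glob-expand words taken from the command line
    for word in $(cat "${2:-/proc/cmdline}"); do
        case "$word" in
            "$1"|"$1"=*) exit 0 ;;
        esac
    done
    exit 1
)
```

For example, `has_boot_param ec_intr && echo "ec_intr is still set"` would flag a leftover option before a test that is supposed to run without it.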
Created attachment 20792 [details] /proc/cmos after reboot After the failed sleep recovery the BIOS detected that the CMOS checksums were broken. I suspect this is correct behaviour since CMOS memory is written to. I booted with what the BIOS offered as fail-safe settings and got this out of /proc/cmos
Created attachment 20793 [details] dmesg after reboot with cmos hack dmesg output after rebooting the 2.6.29+cmos kernel, post sleep recovering failure.
Are there any new things/patches I should try?
There is a last patch in bug #12011, could you try it?
(In reply to comment #33) > There is a last patch in bug #12011, could you try it? I tried the patch on top of the 2.6.29 kernel and got the same result. I made sure ec_intr wasn't present in the boot command line. Note that I don't seem to have any issues with battery status disappearing with any of the kernels I tried from 2.6.27 onwards.
(In reply to comment #34) > (In reply to comment #33) > > There is a last patch in bug #12011, could you try it? > > I tried the patch over the 2.6.29 kernel and I got the same result. I made > sure > ec_intr wasn't present in the boot command line. > > Note that I don't seem to have any issues with battery status disappearing > with > any of the kernels I tired which were newer 2.6.27 (including 2.6.27). Oh, and the kernel was the 2.6.29 kernel with the patches proposed before in this bug report: commit a90b2eeeb208567241aa21eced696ec010a9b6cc Author: Eddy Petrișor <eddy.petrisor@gmail.com> Date: Tue Apr 14 09:20:05 2009 +0300 merge modes and disable burst #2 (patch from #12011) Burst mode should be automatically disabled by controller, if it is not accessed for 400us. Now there is a delay of 550us and some are saying that 550us is better. Thus, enabling of burst mode in first place seems to be a wrong move. commit 8877cd8b24c26dfc7560f94d83b90e95e1d4d58f Author: Eddy Petrișor <eddy.petrisor@gmail.com> Date: Fri Apr 3 12:04:32 2009 +0300 Use the RTC cmos area to track where suspend/resume hangs commit 6fd63c2f584e62355675c1735020acb3e4fad76f Author: Eddy Petrișor <eddy.petrisor@gmail.com> Date: Tue Mar 24 11:02:59 2009 +0200 Introduce kernel parameter acpi_sleep=s3_sci_enable some laptop requires SCI_EN being set directly on resume, or else they hung somewhere in the resume code path. We already have a blacklist for these lattops but we still needs this option, especially for debugging some suspend/resume problems. Signed-off-by: Zhang Rui <rui.zhang@intel.com> --- arch/x86/kernel/acpi/sleep.c | 4 ++++ drivers/acpi/sleep.c | 6 ++++++ include/linux/acpi.h | 3 +++ 3 files changed, 13 insertions(+) commit 8e0ee43bc2c3e19db56a4adaa9a9b04ce885cd84 Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Mon Mar 23 16:12:14 2009 -0700 Linux 2.6.29
Hello, I just checked 2.6.30.rc6 (22ef37eed..) and it has the same problem with regard to sleep. Note that I haven't experienced battery misreadings since 2.6.27.
Several entries on the net suggest that this actually is a regression. It is said that resume worked with kernel 2.6.18 and stopped working from kernel 2.6.22 onwards. I have an MSI VR201, which is the direct successor of the PR200 and is almost the same hardware-wise. The VR201 suffers from exactly the same bug (no wonder, because HW and BIOS are almost the same as on the PR200 / Daru2 machines). If I can be of any help solving this issue, please let me know. I can offer testing.
well, I have no idea how to debug this bug. it would be great if you can run git-bisect to find out which commit introduces this regression.
(In reply to comment #37) > Several entries on the net suggest, that this actually is a regression. It is > said, that resume worked with kernel 2.6.18 and stopped working with kernel > 2.6.22 onwards. I have a MSI VR201 which is the direct successor of the PR200 > and is hardwarewise almost the same. The VR201 suffers from exactly the same > bug (no wonder because HW and BIOS are almost the same as with the PR200 / > Daru2 machines). If I can be of any help solving this issue, please let me > know. I can offer testing. GREAT NEWS! I tested with the Debian kernel 2.6.18 from Debian Etch, since it was the easiest way to get a kernel that old, and after resume the voice synthesizer started to acknowledge the laptop RESUMED properly from sleep, although the display was bla[nc]k the whole time (probably a display driver issue since X didn't start with that old kernel). I'll try to run a bisect on the kernel tree after I compile the pristine 2.6.18 to confirm it works with that version.
Created attachment 21890 [details] dmesg before sleep in 2.6.18-6-amd64 (working sleep/resume)
Created attachment 21891 [details] dmesg after sleep resume in 2.6.18-6-amd64 (working sleep/resume)
(In reply to comment #38) > well, I have no idea how to debug this bug. > it would be great if you can run git-bisect to find out which commit > introduces > this regression. I'll do that, now that I know it is a regression ;-) .
ping Eddy, any updates?
I am having difficulties booting my self-compiled kernels, although the config was the one from Debian (with mild changes). The initramfs stops, complaining about a syntax error in the bootkeymap. Since I have root on LVM I am forced to use an initramfs. I can confirm that Debian's 2.6.24-etchnhalf.1-amd64 is bad, while Debian's 2.6.18 is good.
Could you please try vanilla 2.6.30?
(In reply to comment #45) > Could you please try vanilla 2.6.30? I tried it with vanilla 2.6.30 on my MSI VR201. Unfortunately that does not solve the issue.
I managed to do the bisect with this core script:

0 eddy@heidi ~/usr/src/linux $ cat /root/bin/sleepit
    #!/bin/sh
    FAILEDRESUME=/failed-resume
    RESUMED=/resumed
    modprobe i915
    invoke-rc.d acpid stop
    echo "$(uname -r)" > $FAILEDRESUME
    dmesg >dmesg_before_$(uname -r); echo mem > /sys/power/state; dmesg >dmesg_after_$(uname -r); sync
    echo 'resumed, oh my god' > resumed
    echo "$(uname -r)" >> $RESUMED
    rm -f $FAILEDRESUME
    sync
    sleep 10
    reboot

So any kernel which ever failed had the /failed-resume file left behind after reboot.
What I find strange is that, although I always had these in the command line, I haven't heard any beep, nor the speech that I did with older kernels (e.g. 2.6.18).
Please note that during the bisect I never saw the screen recover properly, and I did all the tests from the console (without X running - I disabled gdm).
Please tell me if you need the dmesg files.
This is the bad commit; it appeared after 2.6.22 was released:

91a6c462b02d8dc02dbe95e5a407d78078a38d01 is first bad commit
commit 91a6c462b02d8dc02dbe95e5a407d78078a38d01
Author: H. Peter Anvin <hpa@zytor.com>
Date: Wed Jul 11 12:18:57 2007 -0700

    Use the new x86 setup code for x86-64; unify with i386

    This unifies arch/*/boot (except arch/*/boot/compressed) between i386 and x86-64, and uses the new x86 setup code for x86-64 as well.

    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
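The marker-file scheme above can be turned into a bisect verdict after rebooting into a known-good kernel. The following is a minimal sketch of that idea and not the actual sleeptest automation; bisect_verdict, FAILED_MARK and RESUMED_LOG are hypothetical names, with the defaults matching the two files the script writes.

```shell
#!/bin/sh
# Sketch only. FAILED_MARK and RESUMED_LOG default to the files written
# by the sleepit script; override them for dry runs.
FAILED_MARK=${FAILED_MARK:-/failed-resume}
RESUMED_LOG=${RESUMED_LOG:-/resumed}

# bisect_verdict RELEASE
# Prints "bad" if RELEASE left the failure marker behind, "good" if it
# is recorded as having resumed, and "untested" otherwise.
bisect_verdict() {
    if [ -f "$FAILED_MARK" ] && grep -q -x -F -e "$1" "$FAILED_MARK"; then
        echo bad
    elif [ -f "$RESUMED_LOG" ] && grep -q -x -F -e "$1" "$RESUMED_LOG"; then
        echo good
    else
        echo untested
    fi
}
```

A "good" or "bad" result can then be passed to `git bisect good` or `git bisect bad` in the kernel tree; "untested" means the kernel under test never ran the script and should be retried by hand.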
Looking at the comments of the commit I realised I should clarify that I am running an x86_64 kernel with x86_64 userland (Debian GNU/Linux Lenny 5.0, amd64)
the commit is changing the boot, but your bug is about suspend/resume, sounds not related. Can you double check please?
Taking into account that I did this bisect in an almost automatic fashion, I have NO reason to believe that anything was wrong with the bisect*. OTOH, if all the HW components are not properly set up (the code is about hardware set up at boot time) it could be possible to mess up part of the behaviour. I would suggest talking to Peter Anvin if he has any idea which part of the setup procedure could be responsible about this bug. * in spite of that I will double check that the commit in question is the one responsible for this (of course, the problem is in a previous commit part of the set up code rewrite, but this one enables the new set up code on my arch)
Quite likely, none! Unfortunately it could be such that slight differences in the handing of especially graphics might trigger BIOS bugs, but that's like finding a needle in a haystack. Now, the *current* code uses the new code for the resume path as well, but that is not the commit you fingered...
(In reply to comment #51) > Quite likely, none! Sorry, could you clarify what you were answering here? Is it this part: "if he has any idea which part of the setup procedure could be responsible about this bug" ? > Unfortunately it could be such that slight differences in the handing of > especially graphics might trigger BIOS bugs, but that's like finding a needle > in a haystack. And aiui, I can't simply revert some of the commits in the rewrite series and expect things to work or even compile (e.g. revert 91a6c462b02~X and hope we can narrow down the issue). Is that correct, or do I have a chance of being able to boot such a kernel? Note that the working kernels didn't resume the graphics card properly and the screen was always blank after resume. > Now, the *current* code uses the new code for the resume path as well, but > that > is not the commit you fingered... By current I assume you mean linux-2.6 master, right? I could try to see if it was fixed by any chance. The last time I tried to compile was right after the 2.6.30 release, and that kernel didn't boot for me (but it might have been an incorrect configuration).
I just tried the linux-image packages I created when I did the tests and with those packaged kernels indeed the bad commit seems to be the one indicated before. Unless I screwed up really big time (although I remember the automation script worked properly that time, too) and versioned the wrong kernel I am 100% sure that commit screwed up sleep for my machine and I suspect that with a 32-bit kernel the failing kernel would have been 4fd06960 (aka 91a6c462~1), but I don't have a system with 32 bit userland to test.
I tried the distribution packaged 2.6.30 (2.6.30-bpo.1-amd64) and it doesn't resume.
Hi, Eddy
Will you please do the following test on the 2.6.30 distribution kernel?
a. kill the process which is using /proc/acpi/event
b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after; sync;
c. press the power button and see whether the box can be resumed.
If it can't be resumed, please reboot the box and check whether the file dmesg_after exists. If there is no dmesg_after file, maybe it can't be resumed from the BIOS.
Thanks.
(In reply to comment #55) > Hi, Eddy > Will you please do the following test on the 2.6.30 distribution? > a. kill the process which is using /proc/acpi/event > b. dmesg >dmesg_before; echo mem > /sys/power/state; dmesg >dmesg_after; > sync; > c. press the power button and see whether the box can be resumed. > If it can't be resumed, please reboot the box and check whether there > exists the file of dmesg_after
Something like the script I used?

0 eddy@heidi ~ $ cat /root/bin/sleepit
    #!/bin/sh
    FAILEDRESUME=/failed-resume
    RESUMED=/resumed
    modprobe i915
    invoke-rc.d acpid stop
    echo "$(uname -r)" > $FAILEDRESUME
    dmesg >dmesg_before_$(uname -r); echo mem > /sys/power/state; dmesg >dmesg_after_$(uname -r); sync
    echo 'resumed, oh my god' > resumed
    echo "$(uname -r)" >> $RESUMED
    rm -f $FAILEDRESUME
    sync
    sleep 10
    reboot

> If there is no file of dmesg_after, maybe it can't be resumed from BIOS.
The machine is perfectly capable of resuming; as I said before, I did a bisect and some kernels before the commit I indicated did resume properly.
0 eddy@heidi /root/var/debug/sleep/regression $ ls -l */dmesg_after_* -rw-r--r-- 1 root root 32896 2009-07-02 02:26 2.6.18-128.el5/dmesg_after_2.6.18-128.el5 -rw-r--r-- 1 root root 30945 2009-06-19 03:03 2.6.20-rc2-g0f5486ec-heidi/dmesg_after_2.6.20-rc2-g0f5486ec-heidi -rw-r--r-- 1 root root 43534 2009-06-20 12:05 2.6.22-g0a85e9a2-heidi/dmesg_after_2.6.22-g0a85e9a2-heidi -rw-r--r-- 1 root root 35077 2009-06-20 01:12 2.6.22-g0c73f18b-heidi/dmesg_after_2.6.22-g0c73f18b-heidi -rw-r--r-- 1 root root 42770 2009-06-20 21:39 2.6.22-g4fd06960-heidi/dmesg_after_2.6.22-g4fd06960-heidi -rw-r--r-- 1 root root 41439 2009-06-20 02:09 2.6.22-g4fda25a2-heidi/dmesg_after_2.6.22-g4fda25a2-heidi -rw-r--r-- 1 root root 33534 2009-06-19 10:21 2.6.22-g7dcca30a-heidi/dmesg_after_2.6.22-g7dcca30a-heidi -rw-r--r-- 1 root root 42821 2009-06-20 05:01 2.6.22-g7e69c3ac-heidi/dmesg_after_2.6.22-g7e69c3ac-heidi -rw-r--r-- 1 root root 42962 2009-06-20 11:35 2.6.22-gc6e16295-heidi/dmesg_after_2.6.22-gc6e16295-heidi -rw-r--r-- 1 root root 42871 2009-06-20 03:41 2.6.22-gf2d98ae6-heidi/dmesg_after_2.6.22-gf2d98ae6-heidi
Isn't anyone able to help with a fix for this bug? It is clear the bug originates in the new start-up code. I am willing to test different patches to help diagnose what needs to be changed to be able to put the laptop into sleep mode again.
(In reply to comment #18) > Linux kernel seems to work perfectly during suspend/resume. > this is probably a BIOS/Hardware issue. > please apply the debug patch attached below on top of 2.6.29-rc8 kernel, > reboot > with boot option "acpi_sleep=s3_sci_enable" and see if there is any > difference. hah, I did not see "acpi_sleep=s3_sci_enable". could you please re-do the test and verify if the patch in comment #19 works WITH boot option "acpi_sleep=s3_sci_enable"?
(In reply to comment #58) > (In reply to comment #18) > > Linux kernel seems to work perfectly during suspend/resume. > > this is probably a BIOS/Hardware issue. > > please apply the debug patch attached below on top of 2.6.29-rc8 kernel, > reboot > > with boot option "acpi_sleep=s3_sci_enable" and see if there is any > difference. > > hah, I did not see "acpi_sleep=s3_sci_enable". > could you please re-do the test and verify if the patch in comment #19 works > WITH boot option "acpi_sleep=s3_sci_enable"? I have just done that test. Unfortunately that does not change anything. Logs are to follow asap.
Created attachment 22975 [details] Output of dmesg before Suspend with Kernel 2.6.29 sci enabled I used Eddy's script, as the machine did not resume properly there is no dmesg after resume. As wished sci was enabled.
Please try to suspend with 2.6.32-rc3. If resume still doesn't work, please attach full dmesg output.
(In reply to comment #61) > Please try to suspend with 2.6.32-rc3. > > If resume still doesn't work, please attach full dmesg output. Not only does it not work, but the regression in reading the battery information correctly, which was fixed in 2.6.30, is back.
Created attachment 23355 [details] dmesg before sleep with 2.6.32-rc2
(In reply to comment #62) > (In reply to comment #61) > > Please try to suspend with 2.6.32-rc3. > > > > If resume still doesn't work, please attach full dmesg output. > > Not only it doesn't work, there is again the regression regarding the correct > reading of the battery information which was fixed in 2.6.30. ... that is bug #10855
(In reply to comment #62) > (In reply to comment #61) > > Please try to suspend with 2.6.32-rc3. > > > > If resume still doesn't work, please attach full dmesg output. > > Not only it doesn't work, there is again the regression regarding the correct > reading of the battery information which was fixed in 2.6.30. (In reply to comment #63) > Created an attachment (id=23355) [details] > dmesg before sleep with 2.6.32-rc2 The system doesn't resume either if ec_intr=0 is not passed as a boot parameter.
(In reply to comment #51) > Quite likely, none! > > Unfortunately it could be such that slight differences in the handing of > especially graphics might trigger BIOS bugs, but that's like finding a needle > in a haystack. > > Now, the *current* code uses the new code for the resume path as well, but > that > is not the commit you fingered...
I was looking over this BR and wanted to point out that the sleep LED behaviour is currently not correct: since the first sleep tests it has remained lit (while the machine is on, obviously), and booting into Windows or an older kernel did not change that.
OTOH, testing sleep with the newer 2.6.32-rc3 I have observed that, although the sleep-resume cycle still doesn't work properly, there is a slight change in behaviour.
Before:
- trigger sleep
- pressing the power button for resume resulted in:
  - some activity
  - auto shut-down
- trying to power the laptop on again would result in an "on" cycle which didn't initialize the graphics card properly, and the fans would speed up (probably due to some infinite loop)
- to recover I had to power off again, then power on
With the new kernel the sequence is the same up to and including the auto shut-down, but powering the laptop on afterwards no longer results in the fake failing power cycle.
Another idea: could someone help me come up with a patch that would let me use even the incremental development commits of the setup sequence, so I could pinpoint which of the setup changes is actually at fault? AIUI, I am currently pointing at a 'blob' of sorts, so I would like to pinpoint which specific change in that blob is at fault.
(In reply to comment #62) > Not only it doesn't work, there is again the regression regarding the correct > reading of the battery information which was fixed in 2.6.30. For people hitting this issue, this is bug #14446.
There is not much going on here anymore. Is there anything I can do? Test something new for example?
well, this is a tough bug. I don't know how to debug this issue for now.
So 2.6.18 suspend/resume worked on this system, but with no video restore. 2.6.22 through 2.6.32 fail. Does it still fail when using the most recent stable kernel, 2.6.37?
I have just tested kernel 2.6.37 using an Ubuntu Mainline Kernel. Unfortunately the problem still exists.
(In reply to comment #70) > So 2.6.18 suspend/resume worked on this system, but with no video restore. > 2.6.22 through 2.6.32 fail. > > Does it still fail when using the most recent stable kernel, 2.6.37? I will try and tell you. Sorry for the delayed response. I have been hitting other issues related to display corruption after hibernate-resume cycles, I hope this won't impact this issue.
I tried 2.6.37 and it has the same problem. Fake reboot, too (see comments above).
Created attachment 52912 [details] dmesg before sleep This is the dmesg output before sleep obtained with 2.6.37.
Created attachment 52922 [details] sleeptest: the script I used for the bisect (useful on debian based systems) I used this script to run the git bisect and identify the bad and the good versions. It relies on a stable kernel (one with no git hash in the uname), a git source of the vanilla kernel, make-kpkg (debian utility to make kernel packages), linux-build (a wrapper I wrote to build the vanilla kernels) and dpkg. It does almost everything automatically, from performing the test to tracking the results and building new versions to test. I thought that making this public would help others test this regression themselves.
I changed the "Regression" value to Yes.
Created attachment 52932 [details] sleepit: the script that tries to extract the dmesgs before and after the sleep
This script appends to /resumed all kernel versions which managed to resume and leaves in /failed-resume the 'uname -r' of the last kernel that failed to resume. Currently my /resumed file contains:
2.6.20-rc2-g0f5486ec-heidi
2.6.22-g7dcca30a-heidi
2.6.22-g0c73f18b-heidi
2.6.22-g4fda25a2-heidi
2.6.22-gf2d98ae6-heidi
2.6.22-g7e69c3ac-heidi
2.6.22-gc6e16295-heidi
2.6.22-g0a85e9a2-heidi
2.6.22-g4fd06960-heidi
2.6.22-g4fd06960-heidi
2.6.18-128.el5
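For readers without access to the attachment, the bookkeeping described above could be sketched roughly as follows. This is not the attached sleepit script: the function names `mark_attempt` and `mark_resumed` and the `STATE_DIR` variable are hypothetical, and the "assume failure before suspending, clear the marker after resume" approach is an assumption about how such tracking would typically be done.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the attached sleepit script.
# STATE_DIR is an illustrative knob: empty means the files live in "/",
# as in the comment above; point it elsewhere to try the logic safely.
STATE_DIR="${STATE_DIR-}"

# Call right before suspending: assume failure until proven otherwise.
mark_attempt() {
    # $1: kernel version string, e.g. the output of `uname -r`
    printf '%s\n' "$1" > "$STATE_DIR/failed-resume"
}

# Call after a successful resume: log the kernel and clear the marker.
mark_resumed() {
    printf '%s\n' "$1" >> "$STATE_DIR/resumed"
    rm -f "$STATE_DIR/failed-resume"
}
```

A caller would run `mark_attempt "$(uname -r)"`, sync, trigger the suspend, and call `mark_resumed "$(uname -r)"` only if execution continues after the resume. If the machine hangs, the marker file survives to the next boot.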
Created attachment 52942 [details] linux-build: the script that does the kernel building and creates the deb package for it This is the script I use to build kernel .debs . With this script, my entire test frame for this bug is public, in case somebody else wants to do a bisect for themselves.
(In reply to comment #27) > At some point in the past I extracted the DSDT and saw that Linux had a > different table than Windows. Is this relevant in any way? Oh, how did you know Windows and Linux are using different ACPI tables?
my computer, an msi ex600x, is affected by exactly the same bug. just reporting that 2.6.38 did not resolve the issue.
*** Bug 33752 has been marked as a duplicate of this bug. ***
please build a 2.6.38 kernel with CONFIG_ACPI_DEBUG set, and then run
1. echo core > /sys/power/pm_test
   echo mem > /sys/power/state
   and then attach the dmesg output after this test.
2. echo 1 > /sys/power/pm_trace
   echo mem > /sys/power/state
   and then attach the dmesg output of the next boot after the hang.
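For the second step, the lines of interest in the next boot's dmesg are the ones pm_trace leaves behind. A minimal sketch for pulling them out of a saved log follows. The function name `pm_trace_matches` and its file argument are hypothetical, and the sketch assumes the kernel reports the trace as "Magic number" and "hash matches" lines. Note also that pm_trace stores its data in the RTC, so the system clock is likely to be wrong after such a test.

```shell
#!/bin/sh
# Hypothetical helper: print the pm_trace result lines from a saved dmesg.
# Assumes the kernel logs them as "Magic number: ..." and "hash matches ...".
pm_trace_matches() {
    # $1: file holding the dmesg output of the boot after the hang
    grep -E 'Magic number|hash matches' "$1"
}
```

Typical use would be `dmesg > dmesg_next_boot` right after the reboot, followed by `pm_trace_matches dmesg_next_boot`.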
okay, but i'm at work right now so you will have to wait till evening gmt+1.
Created attachment 54892 [details] 2.6.38.2 dmesg after echo core > /sys/power/pm_test
Created attachment 54902 [details] 2.6.38.2 dmesg after echo 1 > /sys/power/pm_test
i'm not sure whether the first dmesg log is proper: the first echo mem > /sys/power/state caused my machine to shut down and halt, so the dmesg output was created after reboot. the command from the second point caused a standard suspend and unfortunately the same result: blank screen; self reboot with blank screen after; manual shutdown and normal boot; dmesg log done after that. please be patient if i did something wrong or just misunderstood, i'm just a librarian with semi-advanced computer and linux skills. best regards /t
(In reply to comment #87)
> i'm not shure whether first dmesg log is proper, first echo mem >
> sys/power/state caused my machine to shutdown and halt, so dmesg output was
> created after reboot.
Did you run "echo core > /sys/power/pm_test" first? If yes, please run "echo mem > /sys/power/state" and wait for about 10 seconds to see if the machine resumes automatically.
(In reply to comment #86)
> Created an attachment (id=54902) [details]
> 2.6.38.2 dmesg after echo 1 > /sys/power/pm_test
It should be "echo 1 > /sys/power/pm_trace".
>did you run "echo core > /sys/power/pm_test" first? >If yes, please run "echo mem > /sys/power/state" and wait for about 10 seconds >to see if the machine resumes automatically. yes, but didn't wait, and resumed manualy. >it should be "echo 1 > /sys/power/pm_trace" that's the way i did it, just mistaken in comment copy/paste. i'll send new logs later.
Created attachment 55032 [details] 2.6.38.2 dmesg after echo core > /sys/power/pm_test
Created attachment 55042 [details] 2.6.38.2 dmesg after echo 1 > /sys/power/pm_trace
echo core > /sys/power/pm_test
echo mem > /sys/power/state
causes my machine to shut down and halt. no automatic resume after that (i've waited five minutes). dmesg after manual bootup.
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state
reproduces the bug, dmesg output after reboot.
(In reply to comment #92)
> echo core > /sys/power/pm_test
> echo mem > /sys/power/state
>
> causes my machine to shutdown and halt. no automatic resume after that (i've
> waited five minutes). dmesg after manual bootup.
Oh, this sounds like a kernel issue. Please echo one of these items, {core processors platform devices freezer}, to /sys/power/pm_test, one at a time, and check which one starts to give you the automatic resume.
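The per-level test requested above can be scripted so that a log survives even if the machine powers off mid-test. This is only a sketch: `PM_DIR` and `LOG` are hypothetical variables introduced here (they are not kernel interfaces), the order "least invasive first" is a choice made for this example, and on a real system it has to run as root.

```shell
#!/bin/sh
# Sketch only; PM_DIR and LOG are illustrative knobs, not kernel interfaces.
PM_DIR="${PM_DIR:-/sys/power}"
LOG="${LOG:-$HOME/pm_test.log}"

# Run one suspend test at the given pm_test level.
run_pm_test() {
    level="$1"
    echo "starting $level" >> "$LOG"
    sync                                  # keep the log if the machine powers off
    echo "$level" > "$PM_DIR/pm_test" || return 1
    echo mem > "$PM_DIR/state" || return 1
    echo "returned from $level" >> "$LOG"
}

# Walk the levels from least to most invasive; stop at the first write error.
run_all_pm_tests() {
    for level in freezer devices platform processors core; do
        run_pm_test "$level" || break
    done
    echo none > "$PM_DIR/pm_test"         # restore normal suspend behaviour
}
```

After a run (or after the reboot following a power-off), the last "starting" line without a matching "returned from" line in the log names the level at which the machine failed to come back.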
Created attachment 55562 [details] 2.6.38.2 dmesg after echo core > /sys/power/pm_test (proper)
i don't know why, but running echo core > /sys/power/pm_test in recovery mode does not result in system shutdown. the same goes for processors, platform and devices, which cause the system to shut down in standard kernel mode. only freezer works in both recovery and normal kernel modes. do you need the rest of the dmesg suspend debug logs (after echoing processors, platform and devices)?
(In reply to comment #95)
> i dont know why, but running echo core > /sys/power/pm_test in recovery mode
> does not result in system shutdown. same goes with processors, platform and
> devices echoed, which cause system to shutdown in standard kernel mode.
What do you mean by "shutdown"? The test modes are supposed to simulate suspend without putting the system into the sleep state (i.e. they should return to the command prompt after several seconds).
> only freezer work both in recovery and normal kernel modes.
>
> do you need rest of dmesg suspend debug logs (after echoing of processor,
> platform and devices)?
If they don't work as intended, then yes, we do.
as i said in comment #92, by shutdown i mean that executing the two commands 'echo core > /sys/power/pm_test' and 'echo mem > /sys/power/state' one after another causes my machine to simply turn off. i used the words shutdown and halt because that is what happens when you execute 'shutdown -h 0', but the difference is that when testing suspend it just turns off immediately. i guess that all of the tests (core, processors and so on) work properly in recovery mode, since my machine does not shut down and the whole system runs flawlessly after the test.
Yes, the tests appear to work correctly in the recovery mode. What exactly is the difference between the recovery mode and the normal working mode (at least from the kernel's perspective)?
recovery mode equals single mode? so it seems that some device driver breaks the resume?
i meant single user mode (called recovery in the grub menu) and multiuser mode with network services. it is possible that it is some device driver, but i don't use any proprietary or closed source drivers. the problem occurs using pure debian squeeze even without an x server installed. i can't find the link right now, but this bug was also present in windows xp with one version of the nvidia graphics driver; i'm not sure it is the right clue, though, since i don't use either the windows or the linux nvidia drivers. it's just an idea, but this bug is common to the msi pr200 and msi ex600x and i haven't found any bug reports for other msi laptops, so maybe comparing hardware specs could narrow down the suspected driver? i've also tried contacting msi poland about the problem, but the only answer i got was that they don't support linux (what could i expect anyway).
It's great that kernel bugzilla is back. can you please verify if the problem still exists in the latest upstream kernel?
problem persists with the 3.2.1 kernel. as soon as i finish writing my phd i really can donate this problematic machine to some developer, since (although still working) it's falling apart by itself ;) cheers
Created attachment 72134 [details] dmesg output from suspend/resume dmesg output created with script from comment #77
[ 568.550249] [drm] Initialized drm 1.1.0 20060810
[ 568.607973] [drm:i915_init] *ERROR* drm/i915 can't work without intel_agp module!
Can you fix this config issue?
I have the same problem with MSI EX600. It's still present in 3.6 kernel. 2.6.18-amd64 works but 2.6.18-i386 does not. 2.6.22 does not work. I'm going to test if commit 91a6c462b02d8dc02dbe95e5a407d78078a38d01 really breaks it.
Resume works with 2.6.22 amd64 (x86_64) kernel when booted using linux16 command of grub2 but does not work when booted using linux command. Resume does not work with 2.6.23 even with linux16. 2.6.21 and older can only be booted with linux16.
[Slightly off topic] The "linux" command in Grub2 is just plain broken. "linux16" is the right thing on a BIOS platform; the fact that it is a non-obvious default is just another case of massive Grub2 brain damage.
Agreed, this crap causes various weird problems (like APM breakage on older machines). Just downgraded to grub-legacy for the rest of this testing. Bisect between 2.6.22 and 2.6.23 does not seem to work, the first kernel was non-bootable and "git bisect skip" produced 4 more unbootable kernels so I gave up. But reverting commit "91a6c462b02d8dc02dbe95e5a407d78078a38d01" (and also "c39736823232bc3ca113c8228fa852c09fba300e" for the build to work) produced 2.6.23 kernel that resumes properly.
Found out by copying parts of x86_64 setup.S to i386 setup.S in 2.6.22 kernel that the problem is A20-related. When the x86_64 A20 code is put into i386 setup.S, resume works with i386 kernel. More testing revealed that the a20_test always succeeds, preventing any A20 switching. This allowed me to produce a 3.6 kernel with working resume by commenting out all a20_test_short() and a20_test_long() calls (except the last one) in enable_a20() in arch/x86/boot/a20.c
Seems that the BIOS requires enable_a20_kbc() even if A20 seems to be enabled. Luckily, it can be done later (just did it with a userspace program on running system and resume worked then). So the early code can be left unmodified and a DMI-based quirk can be created that will do this later. Where should that code be put? Maybe into i8042.c? Also I wonder what Windows does with A20...
OK, I guess what happens is that the bootloader (or possibly the BIOS) enables A20 via port 92h, whereas the BIOS expects it to have been enabled via the KBC. There is a reason the Linux code uses the order BIOS, KBC, port 92h even though port 92h is faster. What I'd like to know is whether calling enable_a20_bios() unconditionally works on your system (i.e. disable the first a20_test_short() only?)
I've already tried that - it does not work, unfortunately.
Some tests in DOS show that A20 is disabled on boot. BIOS A20 functions (enable/disable/read status) seem to work correctly. And debugger shows that BIOS uses port 92h... So probably GRUB enables A20 using BIOS. Linux would do the same if A20 was disabled. And BIOS itself is buggy, INT 15h uses 92h but resume requires KBC. Is there anything that we can do except adding some DMI-based quirk?
Created attachment 84551 [details] [PATCH] Enable A20 using KBC for some MSI laptops This patch fixes the problem on my EX600 laptop and should also fix it on EX700, GX700, VR201, VR601 and PR200 (list and DMI data found in bug reports at Ubuntu Launchpad). Patched kernel works also with Grub2 linux command, both i386 and x86_64. Is something like this acceptable?
works on PR200, thank you!
Seems reasonable to me. Send me the patch with proper header and Signed-off-by: and I'll apply it.
(In reply to comment #114)
> Created an attachment (id=84551) [details]
> [PATCH] Enable A20 using KBC for some MSI laptops
>
> This patch fixes the problem on my EX600 laptop and should also fix it on
> EX700, GX700, VR201, VR601 and PR200 (list and DMI data found in bug reports
> at Ubuntu Launchpad). Patched kernel works also with Grub2 linux command, both
> i386 and x86_64.
>
> Is something like this acceptable?
I have an MSI PR200. On top of which tree should this patch be applied? I want to test this myself, too.
I've created it with 3.6-rc5. I hope that it will apply to 3.7 too.
(In reply to comment #118) > I've created it with 3.6-rc5. I hope that it will apply to 3.7 too. It works for my MSI PR200 (patch applied over v3.6.0). Thank you very much for this fix. Peter, do you know which official release will be the first to contain this patch?
Finally, a much simpler patch was merged: http://git.kernel.org/tip/ad68652412276f68ad4fe3e1ecf5ee6880876783
commit ad68652412276f68ad4fe3e1ecf5ee6880876783
Author: Ondrej Zary <linux@rainbow-software.org>
Date: Tue Dec 11 22:18:05 2012 +0100
x86, 8042: Enable A20 using KBC to fix S3 resume on some MSI laptops
shipped in 3.8-rc1
closed.
In case somebody is wondering, I have been using kernel 3.10, which contains this patch, and haven't encountered any issues related to sleep (except some oopses, but those are from the video driver).
is that patch removed from 3.12 kernel? hibernation is broken again the same way after updating from 3.8 in debian.
It's still present. Please try and find which kernel version your laptop fails at again. Probably some other bug
3.12-1-amd64. it is the debian jessie default kernel in the main branch. yesterday i recompiled the 3.12 kernel with the patch provided here by Ondrej Zary applied, and hibernation works again. i copied the .config from the debian stock kernel, if that matters.
The patch attached to this bug report is old version and completely different from the patch present in upstream kernel: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ad68652412276f68ad4fe3e1ecf5ee6880876783
(In reply to Ondrej Zary from comment #126)
> The patch attached to this bug report is old version and completely
> different from the patch present in upstream kernel:
yes i know. to clear some things up:
- with the debian 3.2 stock kernel hibernation works (the patch must have been applied by the debian devs)
- the 3.8 kernel was compiled by myself without applying any additional patch (yours or upstream); hibernation was working.
- after upgrading wheezy to jessie, the 3.12 stock kernel has broken hibernation
- recompiling the 3.12 kernel with the upstream patch (taken from comment 120) failed for me (i could not apply the patch, which is hard for me to debug as i'm not a skilled coder; i had an error in line 6)
- recompiling with your patch went fine, and hibernation is working again in 3.12
i can try to make some debug logs with stock 3.12 using the method from comment 83, but it can take a few days as i have very little free time recently.