Created attachment 93771 [details] dmesg
The Toshiba Z930 fails to enter suspend and seems to have a general issue with power management. I have tested a number of distros with different kernels. This affects all distros that use kernel 3.3 or later, including the 3.8 test kernels (using Fedora 18). Any kernel older than 3.3 works as expected.
The machine is able to enter suspend once and resume from it, but it subsequently cannot enter suspend again. Trying to suspend a second time just causes the machine to lock up: the screen goes off, the fan runs at full speed and the laptop gets hot. A hard reset is required to get it back.
This laptop seems to run Linux well other than the power management issues. Using a 3.2 kernel with the Sabayon distro, the laptop seems to work well.
The BIOS is in legacy mode, as I could not get the machine to boot at all (or install) with UEFI switched on. I have fiddled around with the BIOS settings to see if anything would help, to no avail.
Steps to Reproduce:
1. Suspend the laptop
2. Wake the laptop
3. Suspend the laptop
4. Can't enter suspend
5. Locks up
Hi Daniel,
Thanks for your report! Is it possible for you to follow the kernel document Documentation/basic-pm-debugging.txt to see what might be the problem after a suspend/resume cycle? Simply put, to test whether devices failed to suspend, you can do:
# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state
If there is any error, please attach the dmesg here. Thanks.
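The two commands above can be wrapped in a small helper so that each test level is exercised the same way. This is only a sketch: the function name `suspend_at_level` and the `POWER_DIR` override are hypothetical conveniences introduced here (the override lets the function be dry-run against a scratch directory), and the list of levels is the one described in basic-pm-debugging.txt.

```shell
#!/bin/sh
# Sketch: one suspend attempt at a given pm_test level.
# POWER_DIR is a hypothetical override for dry runs; on a real
# system the files live in /sys/power and writing them needs root.
POWER_DIR="${POWER_DIR:-/sys/power}"

# suspend_at_level LEVEL
#   LEVEL is a pm_test level: none, core, processors, platform,
#   devices or freezer. "none" requests a real suspend.
suspend_at_level() {
    level="$1"
    # Select how far the suspend sequence should go.
    echo "$level" > "$POWER_DIR/pm_test" || return 1
    # Start the suspend-to-RAM sequence.
    echo mem > "$POWER_DIR/state" || return 1
}

# Example (as root): suspend_at_level devices && dmesg | tail -n 50
```

Running the helper once per level and saving dmesg after each run gives one log per stage, which makes it easier to see which stage first reports an error.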
Running all of the tests in the documentation seems to work. This was run on 3.7.9-201.fc18.x86_64 on Fedora 18. Anyway attached is the dmesg after a couple of "echo mem > /sys/power/state".
Created attachment 94001 [details] dmesg from pm testing.
Did an "echo core > /sys/power/pm_test" and a couple of "echo mem > /sys/power/state".
Thanks for the test. I wonder if it is possible for you to boot into console mode with the following kernel parameters:
nomodeset no_console_suspend
and then reproduce the problem, to see what is blocking the 2nd suspend? It would also be better to test on the upstream kernel, not the distro one.
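Before reproducing, it can be worth confirming that the parameters really reached the kernel. A minimal sketch, assuming the running kernel exposes its command line in /proc/cmdline; the helper name `has_boot_param` and its optional file argument are hypothetical and exist only so the check can be tried on a sample file.

```shell
#!/bin/sh
# has_boot_param NAME [CMDLINE_FILE]
#   Succeeds if NAME appears as a whole word on the kernel command
#   line. CMDLINE_FILE defaults to /proc/cmdline.
has_boot_param() {
    file="${2:-/proc/cmdline}"
    # One parameter per line, then an exact fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$1"
}

# Example:
#   has_boot_param nomodeset && has_boot_param no_console_suspend \
#       && echo "both parameters are active"
```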
Hi Daniel, Any update on this?
Hi
I have booted to init=/bin/sh with the other params and it seems to do the "echo mem > /sys/power/state" fine. I can't see any errors and it seems to be able to do this test fine any number of times.
This was on the distro kernel, which is at 3.8.1 at the moment. It still won't suspend the second time normally.
I have downloaded and built a vanilla kernel and will see what I get with that and will report back.
(In reply to comment #6)
> Hi
> I have booted to init=/bin/sh with the other params and it seems to do the
> "echo mem > /sys/power/state" fine. I can't see any errors and it seems to be
> able to do this test fine any number of times.
Thanks for the test. Please try to boot with init=/bin/sh and no_console_suspend (don't add nomodeset this time). If S3 still works OK, then it is possibly a graphics driver issue.
> This was on the distro kernel, which is at 3.8.1 at the moment.
> It still won't suspend the second time normally.
> I have downloaded and built a vanilla kernel and will see what I get with
> that and will report back.
Thanks.
Hi Daniel, Any update?
Hi
Sorry for the delay again.
I have done a fair bit of playing around tonight.
If I don't do an 'echo <something> > /sys/power/pm_test', it has the issue even from the console.
If I boot to init=/bin/sh and do an 'echo mem > /sys/power/state', the second suspend fails as it does in normal mode.
I have also tried with i915 RC6=0 to see if it makes a difference and it does not.
It seems that as soon as I echo something to /sys/power/pm_test, the bug does not happen.
For example, if I 'echo devices > /sys/power/pm_test', then suspend works.
I have tested all the levels in basic-pm-debugging.txt and all work once I echo the level to pm_test.
I also tested with nomodeset and it makes no difference.
I noticed Toshiba had new firmware out for this device. I have just put the latest BIOS/firmware on this device (had to install Windows to do it, grrrrr) and it is still the same.
(In reply to comment #9)
> Hi
>
> Sorry for the delay again.
>
> I have done a fair bit of playing around tonight.
>
> If I don't do an 'echo <something> > /sys/power/pm_test', it has the issue
> even from the console.
>
> If I boot to init=/bin/sh and do an 'echo mem > /sys/power/state', the second
> suspend fails as it does in normal mode.
What's the error message?
>
> I have also tried with i915 RC6=0 to see if it makes a difference and it does
> not.
>
> It seems that as soon as I echo something to /sys/power/pm_test, the bug does
> not happen.
>
> For example, if I 'echo devices > /sys/power/pm_test', then suspend works.
>
> I have tested all the levels in basic-pm-debugging.txt and all work once I
> echo the level to pm_test.
Work once? So do you mean that after a fresh boot:
# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state
works OK
# echo mem > /sys/power/state
fails this time?
I mean:
# echo mem > /sys/power/state
fails the second time after a fresh boot.
# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state
This does not fail at all. It seems to work correctly.
As for error messages, I cannot see any. The system locks up on the second suspend as described above, so errors are not visible. On the first suspend, which does work, I cannot see anything unusual in the dmesg logs.
It seems that something is getting corrupted after the machine resumes from the first suspend, so the second one does not work, and this does not happen in test mode (after echoing something to the /sys/power/pm_test file).
Thanks for the help, and please let me know how I can help further.
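Because the pm_test setting persists until it is changed, it helps to check which level is active before deciding whether a run was a real suspend or a test run. A minimal sketch, assuming the usual sysfs format in which all levels are listed and the active one is wrapped in brackets (for example "[none] core processors platform devices freezer"); the helper name `active_pm_test_level` and its optional file argument are hypothetical.

```shell
#!/bin/sh
# active_pm_test_level [FILE]
#   Print the pm_test level that is currently selected, i.e. the
#   entry shown in square brackets. FILE defaults to the sysfs node.
active_pm_test_level() {
    file="${1:-/sys/power/pm_test}"
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}

# Example (as root), returning to real suspends after a test run:
#   [ "$(active_pm_test_level)" = "none" ] || echo none > /sys/power/pm_test
```

In the test modes the kernel stops short of actually entering the sleep state and resumes after a short wait, so a run at any level other than "none" does not exercise the final platform step of a real suspend.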
Can you do a git bisect to find the offending commit?
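For reference, a bisect of this kind could be driven roughly as follows. This is a sketch only: the wrapper names `start_bisect` and `mark_result` are hypothetical, and the revisions passed to them are placeholders to be filled in with a kernel known to fail and one known to work. Only the standard `git bisect start`, `git bisect good`, `git bisect bad` and `git bisect reset` subcommands are used.

```shell
#!/bin/sh
# start_bisect BAD_REV GOOD_REV
#   Run inside a kernel git checkout. Git then checks out a revision
#   roughly halfway between the two for testing.
start_bisect() {
    git bisect start "$1" "$2"
}

# mark_result good|bad
#   Record the outcome for the revision that was just built and tested.
mark_result() {
    case "$1" in
        good|bad) git bisect "$1" ;;
        *) echo "usage: mark_result good|bad" >&2; return 2 ;;
    esac
}

# Typical manual loop:
#   start_bisect <bad-rev> <good-rev>
#   (build, install and boot the checked-out kernel, suspend twice)
#   mark_result good    # or: mark_result bad
#   (repeat until git names the first bad commit, then: git bisect reset)
```

Since a bad kernel locks up on the second suspend and needs a hard reset, each step has to be judged by hand after rebooting, so an automated `git bisect run` does not fit this bug well.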
There is some info here with details of what commit causes this: b74f05d61b73af584d0c39121980171389ecfaaa
And the link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1094800
So this is also the commit that breaks your system?
I can confirm that this issue also exists on the Toshiba Portege Z935-ST4N04 with Arch Linux. As in the Ubuntu bug that Daniel referenced, stock kernel 3.3.8 works properly; v3.4 and above do not.
Starting with v3.4-rc1, the system reboots when trying to resume from the first suspend. Starting with v3.4-rc3, the reboot issue no longer occurs, but the "second suspend" issue exists. I am testing more now, but I assume that the commit from the Ubuntu bug is the one that switches from the reboot issue to the second suspend issue.
The common ancestor of 3.3.8 and 3.4 (v3.3) also appears to have the second suspend issue, so it may have been fixed between 3.3 and 3.3.8 before a regression in 3.4. Once my bisecting of v3.4-rc1 and v3.4-rc2 is complete, I will start bisecting 3.3 and 3.3.8.
Please let me know if there is any other information I can provide.
So, the true culprit appears to be:
commit 2feec47d4c5f80b05f1650f5a24865718978eea4
Author: Bob Moore <robert.moore@intel.com>
Date: Tue Feb 14 15:00:53 2012 +0800
    ACPICA: ACPI 5: Support for new FADT SleepStatus, SleepControl registers
    Adds sleep and wake support for systems with these registers. One new file, hwxfsleep.c
When the new extended functions are used, the "second sleep freezes" issue occurs. If I force the system to use the legacy_sleep and legacy_wake functions, the machine properly resumes, even on 3.8.4.
I am investigating now if there is a way to fix the ACPI 5 sleep functions, or if there is just a way to force it to use the legacy functions without changing the source.
Thanks Brint, this is very helpful. Adding the commit author, Bob.
Hi Bob,
Can you please take a look? Commit 2feec47d4c5f80b05f1650f5a24865718978eea4 makes some Toshiba laptops fail to suspend the 2nd time. Thanks.
Hi Brint,
Please attach acpidump output:
# acpidump > acpidump.out
Thanks.
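Once the FADT from such a dump has been disassembled to text (typically done with the ACPICA tools `acpixtract` and `iasl -d`; the exact invocation and output file names vary between versions and are not shown here), the fields relevant to this bug can be pulled out with a filter like the one below. The helper name `fadt_sleep_fields` is hypothetical, and the field labels are assumed to match the ones quoted elsewhere in this report.

```shell
#!/bin/sh
# fadt_sleep_fields FILE
#   FILE is a text disassembly of the FADT. Prints the HW-reduced flag,
#   the legacy PM1A block addresses and the two extended sleep registers.
fadt_sleep_fields() {
    grep -E 'Hardware Reduced|PM1A (Event|Control) Block Address' "$1"
    # -A 5 also prints the five Generic Address Structure lines that
    # follow each register header (space ID, widths, address).
    grep -E -A 5 'Sleep (Control|Status) Register' "$1"
}

# Example: fadt_sleep_fields facp.dsl
#   (facp.dsl is a placeholder name for the disassembled table)
```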
Created attachment 96401 [details] acpidump from Toshiba 935 kernel v3.8.4 acpidump from Toshiba 935 kernel v3.8.4
In the FADT, the "hardware reduced" bit is not set, but the extended sleep control and sleep status registers are populated:
    Hardware Reduced (V5) : 0
    [0F4h 0244 12] Sleep Control Register : [Generic Address Structure]
    [0F4h 0244 1] Space ID : 01 [SystemIO]
    [0F5h 0245 1] Bit Width : 08
    [0F6h 0246 1] Bit Offset : 00
    [0F7h 0247 1] Encoded Access Width : 03 [DWord Access:32]
    [0F8h 0248 8] Address : 0000000000000405
    [100h 0256 12] Sleep Status Register : [Generic Address Structure]
    [100h 0256 1] Space ID : 01 [SystemIO]
    [101h 0257 1] Bit Width : 08
    [102h 0258 1] Bit Offset : 00
    [103h 0259 1] Encoded Access Width : 03 [DWord Access:32]
    [104h 0260 8] Address : 0000000000000401
The extended sleep control and the sleep status registers overlap the legacy PM1A registers:
    [038h 0056 4] PM1A Event Block Address : 00000400
    [040h 0064 4] PM1A Control Block Address : 00000404
I don't see anything really obviously wrong here, unless the use of the extended registers somehow confuses the machine.
One possibility is that the extended registers specify an access width of 32 bits, meaning that a 32-bit read or write will always be performed (even though the actual bit width of each register is only 8 bits).
You may have to dig down and figure out what the difference in I/O behavior is between the legacy case and the extended case.
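The reach of a 32-bit access at these addresses can be worked out with a little arithmetic. The sketch below uses the addresses from the FADT excerpt above; the block lengths used in the example (4 bytes for the PM1A event block, 2 bytes for the PM1A control block) are assumptions based on typical values, since the excerpt does not show the length fields. The helper names `last_port` and `ranges_overlap` are hypothetical.

```shell
#!/bin/sh
# last_port ADDR WIDTH_BITS
#   Print the last I/O port touched by an access of WIDTH_BITS that
#   starts at ADDR.
last_port() {
    printf '0x%03x\n' $(( $1 + $2 / 8 - 1 ))
}

# ranges_overlap A ALEN B BLEN
#   Succeeds if [A, A+ALEN) and [B, B+BLEN) share at least one port.
ranges_overlap() {
    a=$(( $1 )); alen=$(( $2 ))
    b=$(( $3 )); blen=$(( $4 ))
    [ "$a" -lt $(( b + blen )) ] && [ "$b" -lt $(( a + alen )) ]
}

# Examples with the values from this machine:
#   last_port 0x405 8     # 0x405: an 8-bit access stays on one port
#   last_port 0x405 32    # 0x408: a 32-bit access reaches three ports further
#   last_port 0x401 32    # 0x404: sleep status access reaches the control block
#   ranges_overlap 0x401 4 0x400 4 && echo "status access overlaps PM1A event block"
```

Under those assumed lengths, a 32-bit access to the sleep status register at 0x401 covers most of the PM1A event block and the first byte of the PM1A control block, while an 8-bit access would touch only the single port named in the FADT. Whether that difference is what upsets the machine is not established here.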
Thanks, Bob. Do you have any suggestions on how to start tracking this down? I'm comfortable tweaking the code, but don't know much about ACPI internals, so I'm not sure where to begin. Is it reasonable to introduce some kind of boot-time option to force the use of the legacy functions, or is it more worth the effort to dissect the extended functions to see if we can fix them?
I would hope that we don't need yet another boot-time option.
I'm not really sure that the code that uses the extended registers has actually ever seen a real ACPI 5.0 FADT. So, it is possible that there is a problem.
The code in question:
    acpi_hw_extended_sleep (and wake): hwesleep.c
    acpi_hw_legacy_sleep (and wake): hwsleep.c
A bit of instrumentation here to see what exactly is being written to which registers (in both extended and legacy cases) may reveal the problem.
Created attachment 96521 [details] acpidump from the z930
Comment on attachment 96521 [details] acpidump from the z930 This is on 3.8.4
Hi Bob, I'll move this bug to ACPICA and assign it to you, is it OK?
It is not clear what the problem is here. Please do not move it yet.
Hi Bob,
Section 4.8.3.7 of the ACPI spec has wording like this:
    The optional ACPI sleep registers (SLEEP_CONTROL_REG and SLEEP_STATUS_REG) specify a standard mechanism for system sleep state entry on HW-Reduced ACPI systems.
So it seems to me that if this is not a HW-reduced system, we shouldn't use these registers, no matter what their values are in the FADT.
Does this make sense to you?
I find the spec to be a bit ambiguous: "When implemented, the Sleep registers are a replacement for the SLP_TYP, SLP_EN and WAK_STS registers in the PM1_BLK" For most (or all) of the other extended GAS registers in the FADT, "when implemented" means "non-zero". In any case, the system in question has apparently valid values in these register fields. So we still don't really know why the system is broken.
Created attachment 99981 [details] [RFC Patch 2/3] ACPICA: Hardware: Modify sleep
Thanks for reporting this bug. Your report can be a guide for us in implementing sleep functionality that is compatible with ACPI 5.0.
Could you please test with the attached patches to confirm whether the fix is valid? The patch set is generated against the latest kernel release. If you have problems applying this patch set, please let me know and post your kernel version number on the bugzilla.
Thanks in advance.
I confirm that the patch solved the second suspend problem for me on a Toshiba Tecra R950 notebook. I tested kernel 3.9-rc8 with the patch applied. The second suspend problem still occurs on kernel 3.9-rc8 without the patch. Thanks in advance.
Confirming the same findings as kk1 on the Z935 with 3.9-rc8. Without the patches, the second suspend issue occurs. With the patches, the system suspends and resumes multiple times without issue. Thanks, Lv!
Hi,
I'll propose this fix to ACPICA. Thanks for reporting and testing. I'll Cc you guys when it is accepted by ACPICA and converted to a Linuxized patch during an ACPICA release process.
Thanks
*** Bug 53011 has been marked as a duplicate of this bug. ***
Note that there is a loadable kernel module here to use while we wait for mainline: https://bugzilla.redhat.com/show_bug.cgi?id=904303
The patches that were posted to fix the sleep flow and the bit-masked register access have not been merged yet. The solution from comment #29 has been upstreamed; it also fixes the issues found on the Toshiba machines.
author Lv Zheng <lv.zheng@intel.com> 2013-06-08 00:59:18 (GMT)
committer Rafael J. Wysocki <rafael.j.wysocki@intel.com> 2013-06-15 22:56:22 (GMT)
commit 7cec7048fe22e3e92389da2cd67098f6c4284e7f (patch)
tree 69ace1151d4870173b6bf35dce23a9d3bd608f11
parent 42f47869c6a73a6893c998725365b587b0311f9a (diff)
    ACPICA: Do not use extended sleep registers unless HW-reduced bit is set
    Previous implementation incorrectly used the ACPI 5.0 extended sleep registers if they were simply populated. This caused problems on some non-HW-reduced machines. As per the ACPI spec, they should only be used if the HW-reduced bit is set. Lv Zheng, ACPICA BZ 1020.
This bug is going to be closed.
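The rule described in the commit message can be illustrated with a small model. This is a shell sketch of the decision only, not the ACPICA C code: the function name `choose_sleep_path` and the two policy labels are hypothetical, and treating "populated" as "both register addresses are non-zero" is an assumption made for the illustration.

```shell
#!/bin/sh
# choose_sleep_path POLICY HW_REDUCED SLEEP_CONTROL_ADDR SLEEP_STATUS_ADDR
#   POLICY "populated":  previous behaviour as described in the commit
#                        message (extended registers used when populated).
#   POLICY "hw-reduced": behaviour after the fix (extended registers used
#                        only when the HW-reduced bit is set).
#   Prints "extended" or "legacy".
choose_sleep_path() {
    policy="$1"; hw_reduced="$2"
    ctrl=$(( $3 )); stat=$(( $4 ))
    case "$policy" in
        populated)
            # Assumption: "populated" means both addresses are non-zero.
            if [ "$ctrl" -ne 0 ] && [ "$stat" -ne 0 ]; then
                echo extended
            else
                echo legacy
            fi ;;
        hw-reduced)
            if [ "$hw_reduced" -eq 1 ]; then
                echo extended
            else
                echo legacy
            fi ;;
        *)
            echo "unknown policy: $policy" >&2
            return 2 ;;
    esac
}

# Values from the Toshiba FADT in this report (HW-reduced bit 0,
# sleep control at 0x405, sleep status at 0x401):
#   choose_sleep_path populated  0 0x405 0x401   # extended
#   choose_sleep_path hw-reduced 0 0x405 0x401   # legacy
```

With the values from this report, the model selects the extended path under the old rule and the legacy path under the new one, which matches the observation that forcing the legacy functions avoids the second-suspend hang.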