I opened a bug report in Ubuntu, but they suggested that I should open a bug here. The bug in Launchpad is: https://bugs.launchpad.net/bugs/962142 I have a Toshiba Tecra R840, and with Ubuntu 11.10 suspend/resume works fine. The last kernel I used is 3.0.0-16-generic-pae. After upgrading to 12.04 beta, resume stopped working. The laptop enters the suspend state, but I am unable to resume. The PC becomes unresponsive; only the cooler fan runs, at maximum speed. I tested it with 3.2.0 and with mainline kernel 3.3.0 from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/. The symptoms are the same. Thanks, Artur
Is /sys/power/pm_test available in your new kernel? If not, please rebuild the kernel with CONFIG_PM_DEBUG set. If yes, please try "echo processors > /sys/power/pm_test; echo mem > /sys/power/state". Does the system come back after about 10s?
Per above, please try the steps noted in the source tree: Documentation/power/basic-pm-debugging.txt
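The same test can be repeated for every pm_test level rather than just "processors". A minimal sketch, assuming a root shell: the level names are the ones listed in Documentation/power/basic-pm-debugging.txt, while the function name pm_test_cycle and its optional directory argument are made up for illustration (the argument exists only so the loop can be dry-run against a scratch directory).

```shell
#!/bin/sh
# Walk the pm_test levels from least to most invasive.
# $1 (optional, hypothetical) = directory holding pm_test and state;
# defaults to the real /sys/power.
pm_test_cycle() {
    power_dir="${1:-/sys/power}"
    if [ ! -w "$power_dir/pm_test" ]; then
        echo "pm_test not writable (is CONFIG_PM_DEBUG set? are you root?)" >&2
        return 1
    fi
    for level in freezer devices platform processors core; do
        echo "testing level: $level"
        echo "$level" > "$power_dir/pm_test" || return 1
        # On real sysfs this write blocks until the test suspend has finished.
        echo mem > "$power_dir/state" || return 1
    done
    # Switch test mode off again so the next suspend is a real one.
    echo none > "$power_dir/pm_test"
}
```

Called with no argument as root, each level should bring the machine back after a short delay; the first level that does not come back narrows down where suspend or resume breaks.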
Hi, I ran the test you asked for, and the system came back with no problem. I repeated the test for every level from freezer to core, and all resumed OK. The only things I noticed in dmesg are:
[ 2582.577508] ACPI Error: [GTF0] Namespace lookup failure, AE_NOT_FOUND (20110623/psargs-359)
[ 2582.577546] ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT0.PRT0._SDD] (Node ffff880136a47cd0), AE_NOT_FOUND (20110623/psparse-536)
[ 2582.577656] ACPI Error: [GTF0] Namespace lookup failure, AE_NOT_FOUND (20110623/psargs-359)
[ 2582.577666] ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT0.PRT0._GTF] (Node ffff880136a47cf8), AE_NOT_FOUND (20110623/psparse-536)
I booted the system with init=/bin/bash, suspended with s2ram and tried to resume, but the system became unresponsive. I tried to boot with "init=/bin/bash acpi=off", but the system seems to freeze after something like "Probing PCI devices". There is no "Caps Lock" blinking. The system just freezes: no disk activity, no response to the keyboard, no panic...
Hi, last night I tried s2disk (I hadn't tried it before) and it works: suspend to disk and resume both succeed. I don't know if this helps.
Hi, I've tried kernels 3.3.1 and 3.4.0-rc1. In 3.3.1 the problem remains. In 3.4.0-rc1 the computer reboots after resuming.
After resuming from s2ram, is there any output? Please add the "no_console_suspend" kernel parameter and test again to see whether any output appears.
No, the display never turns on. There is no output with "no_console_suspend". I tried with a normal boot and with init=/bin/bash.
Last night I tried the latest Ubuntu live CD, 32-bit version. Suspend/resume works OK. All the versions I tested before, which failed, are 64-bit. My laptop suspends/resumes without problems with Ubuntu 11.10 32-bit. So I think this is not related to the kernel version, but rather to whether the kernel is 32-bit or 64-bit.
I'm encountering the same problem as described above, but on a Toshiba Portege Z830. The test described in comment #1 succeeds on this machine as well, but resuming from suspend is broken. When I trigger a resume (tapping a key), the power light changes from amber to green, but there is *zero* disk activity and the screen stays off. Anything I can do to help debug the issue?
Oops, should mention I'm running v3.5.3.
Hi, today I decided that I needed to debug this, because I need the 64-bit kernel. I tried acpi_sleep=s3_beep to see if that could give me some clue, and... surprise, my laptop resumed without problems. I did several suspend/resume cycles, and all worked correctly. I tried with kernels 3.2.0-24 and 3.7.0-999 from the Ubuntu repository and both work. Then I tried all the other acpi_sleep options (removing acpi_sleep=s3_beep), but none worked.
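When comparing results like these, it helps to confirm that the running kernel really was booted with the option being tested. A minimal sketch: has_kparam is a hypothetical helper name, and its second argument exists only so the check can be tried on an arbitrary command-line string instead of /proc/cmdline.

```shell
#!/bin/sh
# Succeed if a kernel parameter appears as a whole word on the command line.
# $1 = parameter, e.g. acpi_sleep=s3_beep
# $2 = command line text (optional; defaults to the running kernel's)
has_kparam() {
    cmdline="${2:-$(cat /proc/cmdline)}"
    # Pad with spaces so only whole, space-separated words match.
    case " $cmdline " in
        *" $1 "*) return 0 ;;
    esac
    return 1
}
```

For example, `has_kparam acpi_sleep=s3_beep && echo "booted with s3_beep"` prints the message only when the option is actually in effect for the current boot.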
Fascinating. I just tested on the Portege Z830 with Linux 3.6.9, and I can confirm that acpi_sleep=s3_beep somehow makes things resume just fine.
Looking at the code path for s3_beep, I am guessing that the actual code in s3_beep isn't directly affecting suspend/resume. The code doesn't really do anything too different from the normal resume path (enable speakers, beep for some time, disable speakers, and so on). I think the real impact here is the udelay() calls in there. I'm about to head to my office, but can someone try something like this and see if the resume works without s3_beep enabled?

diff --git a/arch/x86/realmode/rm/wakemain.c b/arch/x86/realmode/rm/wakemain.c
index 91405d5..bc9bdc9 100644
--- a/arch/x86/realmode/rm/wakemain.c
+++ b/arch/x86/realmode/rm/wakemain.c
@@ -71,6 +71,8 @@ void main(void)
 	if (wakeup_header.realmode_flags & 4)
 		send_morse("...-");
 
+	udelay(US_PER_DOT * 20);
+
 	if (wakeup_header.realmode_flags & 1)
 		asm volatile("lcallw $0xc000,$3");
Just tested locally with the above patch. The udelay() makes resume take ~1630ms, but it -does- successfully resume. Anyone have any ideas why the delay is needed?
Hi Artur and Steven, thanks for the findings! Can you please test whether the latest upstream kernel still has this problem? According to Artur, v3.0 doesn't have this problem, while 3.2 and later all have it. What about v3.1? And Steven, is this the same for you? I ask because I want to collect more information for the x86 code maintainers, so that they can better understand what the problem is and then fix it. Thanks.
Hi Aaron, maybe I didn't explain properly. With all x86 32-bit kernels I was able to resume. The problem is with amd64. All the kernels I tried (3.2, 3.3, 3.4 and 3.7) have the same problem: they fail to resume. I haven't tried this lately because I solved the problem by booting with "acpi_sleep=s3_beep". I'm going to try the latest upstream kernel from the Ubuntu Mainline Kernels Archive (3.9-rc2). Thanks, Artur
I haven't used my Toshiba machine for ages solely because of this bug, but I'll re-image it and see how it behaves on the latest.
I just tested Linux 3.9.0-rc2-00188-g6c23cbb on my Toshiba Portege Z830. It still fails to resume from suspend without that extra udelay thrown in.
Hi, I also tested Linux 3.9.0-rc2. Resume fails without acpi_sleep=s3_beep. Resume works with acpi_sleep=s3_beep.
Thanks for your test. So this only occurs on 64-bit kernels, right? The kernel versions I mention below are all 64-bit; let's forget 32-bit kernels for now, since they don't have any problem :-) According to Artur, v3.2 and later all fail, and v3.0 resumes OK. Is this correct? If so, what about v3.1? I hope we can find the first failing kernel and the last working kernel if possible; that would help people find the problem more quickly. Thanks.
Just tested 3.0.68. Same failure to resume.
(In reply to comment #22)
> Just tested 3.0.68. Same failure to resume.

Thanks Steven. So this means there is no known working 64-bit kernel. Let's see what Artur's situation is (I hope it's the same :-).
Hi, all the 32-bit kernels I've tested resume OK. All the 64-bit kernels fail. I hadn't noticed this at the beginning of this thread. Initially I thought the problem was related to kernel 3.2, but that's not true. The real problem is related to whether the kernel is 32-bit or 64-bit.
Adding more people. A brief description of the problem: Artur and Steven experience an S3 resume problem only on 64-bit kernels. They have to boot with acpi_sleep=s3_beep to make resume work, or the system hangs on resume. Steven also tried using a delay instead of s3_beep, as shown in comment #14, and that worked too. There are no working 64-bit kernels for them.
Hello x86 experts, Any suggestions about this bug? Thanks.
Hi Artur and Steven, Perhaps we can raise this question to the kernel mailing list, it doesn't seem there are people looking at this bug page. So can you please send an email to linux-kernel@vger.kernel.org?
Created attachment 97771
debug patch: check if outb helps

Please try the attached debug patch: boot without any s3_xxx options and see if it helps.
Created attachment 97781
debug patch 2: check if any function call helps

Please try this patch: boot without any s3_xxx options and see if it helps.
Well. The delay added in comment #14 is in the real mode code that should be the same for 32-bit and 64-bit kernels, so it's really puzzling. I wonder if the amount of delay added actually matters.
The 64-bit kernel does a WBINVD (which I have no idea why it's there) at the top of trampoline_64.S. This is a very slow instruction, and might have the effect of a delay.
On Tuesday, April 09, 2013 03:47:53 AM bugzilla-daemon@bugzilla.kernel.org wrote:
> --- Comment #31 from H. Peter Anvin <hpa@zytor.com> 2013-04-09 03:47:52 ---
> The 64-bit kernel does a WBINVD (which I have no idea why it's there) at the
> top of trampoline_64.S.

Interesting. I have no idea why it's there either.

> This is a very slow instruction, and might have the effect of a delay.

I wonder what happens if we remove that instruction?
Created attachment 97811
x86: Remove wbinvd from trampoline_64.S

Whoever can reproduce this problem, can you please test the attached patch too (apply it without any previous patches/workarounds from this bug entry)?
Well, there are some more apparently arbitrary differences between trampoline_64.S and trampoline_32.S. For example, the 64-bit trampoline sets up the stack at rm_stack_end, while the 32-bit one doesn't do that. Moreover, the 32-bit trampoline doesn't touch the stack segment. Not to mention the ordering differences.
Actually, the 32-bit trampoline executes the wbinvd too.
Some of those differences aren't arbitrary at all, rather they are a reflection of the inherent differences between the 32- and 64-bit environments. That being said, the differences are probably bigger than they need to be. The uses of the trampolines are also different; the 32-bit trampoline isn't used *at all* for the BSP during resume for example (the APs will still use it, of course.)
This is interesting, because APs are not resumed. They are turned on via CPU online (hotplug), so the BSP is the only CPU that executes the real mode resume code (the main() function in wakemain.c in particular). Also, I wonder if checking cpuid in the 64-bit trampoline is actually necessary? It looks like we could do without it.
The APs go through the 32-bit trampoline exactly because they are not resumed. The checking of CPUID in the 64-bit trampoline isn't necessary for resume, but is important for AP bringup: we can't transfer to a set of NX-containing page tables until we have made sure NX is enabled, for example.
Well, perhaps the 64-bit BSP resume should follow the 32-bit variant, then?
That would require a lot more complexity.
OK Artur, Steven, can you please check the debug patches from comments #28, #29, and #33?
Tested all three patches (#28, #29, #33). None had any discernible impact on resume behavior.
I too am experiencing this issue on a Portege Z830. As far as I can see all suggested patches have been tried without success. Any further suggestions? Any further data I can gather?
Same here. I have a Toshiba R840. I installed the latest kernel, 3.8.0-21... still no luck. Same problem. Which kernel is likely to carry a fix, and on what timeframe?
Can you please try this patch https://patchwork.kernel.org/patch/2593741/ and see if it helps? Note that this is probably not a fix, according to HPA's comments. But let's see if this is the same problem addressed in this patch.
I tried the patch as per #45. I downloaded the latest stable kernel, 3.9.3, patched it and recompiled it. No effect. Same problem. So it didn't help.
Created attachment 102941
DMI for Toshiba Portege Z830

Today I received a Toshiba Portege Z830. I've tested v3.2 as shipped with Debian and v3.9 as shipped with Fedora, both 64-bit kernels, and both resumed fine. DMI for this laptop model attached.
Aaron, can you please compare .configs?
Hi Steven & Jamin, I suspect we are using different processors in our Z830s. What model is yours: is it a Sandy Bridge or an Ivy Bridge? Thanks.
It's a Sandy Bridge.
I'll happily upload any details, dumps, diagnostic information anyone is interested in seeing from the unit.
(In reply to comment #51)
> I'll happily upload any details, dumps, diagnostic information anyone is
> interested in seeing from the unit.

Output of dmidecode please, thanks.

(In reply to comment #50)
> It's a Sandy Bridge.

Looks like this is related to the processor.
Created attachment 105211
dmidecode output Sandybridge

I'm attaching the requested dmidecode output from the system. Please let me know if there is anything else you would like or other testing that I can perform.
(In reply to comment #53)
> Created an attachment (id=105211)
> dmidecode output Sandybridge
>
> I'm attaching the requested dmidecode output from the system. Please let me
> know if there is anything else you would like or other testing that I can
> perform.

So it looks like this only occurs on Sandy Bridge CPUs.
It may be worth trying the nox2apic kernel command line option: x2apic is only available on x86_64, and according to Artur's dmesg, x2apic is enabled.
Please also test intremap=off, which disables interrupt remapping in addition to x2apic, thanks.
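Before and after changing these options, the kernel log can be checked for what the kernel says about x2apic and interrupt remapping. A rough sketch: apic_lines is a hypothetical helper name, and since the exact wording of these messages differs between kernel versions, the pattern is deliberately loose and may also match unrelated lines.

```shell
#!/bin/sh
# Print log lines that mention x2apic or interrupt remapping.
# Reads log text on stdin, so it works with live dmesg output or a saved log.
apic_lines() {
    grep -i -E 'x2apic|remapping'
}
```

Typical use is `dmesg | apic_lines`, run once on a normal boot and once with nox2apic or intremap=off, then comparing the two outputs.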
Anyone?
Hi Aaron, sorry for the late answer... I tested nox2apic and did several suspend/resume cycles with success. I haven't tested intremap=off. Do you still want me to test it? Thank you very much for your help. Artur
(In reply to Artur from comment #58)
> Hi Aaron,
>
> sorry for the late answer...
>
> I tested nox2apic and did several suspend/resume cycles with success.
>
> I haven't tested intremap=off. Do you still want me to test it?

Yes, if possible, but I think the fact that nox2apic works is already a good hint. Thanks a lot for your test.
Using intremap=off without nox2apic also allows suspend/resume with success.
Adding Youquan and David. Some users have reported that their systems don't resume on 64-bit kernels, while 32-bit kernels are OK. Adding acpi_sleep=s3_beep to the command line is a workaround, and what actually cures the hang is the small delay it adds early in resume. It turned out that x2apic is enabled on these systems; if nox2apic is used, no workaround is needed and resume is just fine. Any ideas? Thanks.
Sorry for the late update, but I can confirm that using nox2apic allows for a functional suspend/resume.
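For anyone who wants the workaround to survive reboots on an Ubuntu-style GRUB setup, the parameter can be appended to GRUB_CMDLINE_LINUX_DEFAULT. A minimal sketch under several assumptions: add_grub_param is a hypothetical helper name, the variable is set on a single line with a double-quoted value, and the parameter contains no characters that are special to sed (true for nox2apic and intremap=off).

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless already present.
# $1 = parameter, $2 = config file (optional; defaults to /etc/default/grub)
add_grub_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    # Already present as a whole word? Then leave the file untouched.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"(.* )?$param( .*)?\"" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Rewrite into a temp file, then copy back so the original's
    # ownership and permissions are kept.
    sed "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

After running `add_grub_param nox2apic` as root, the GRUB configuration still has to be regenerated (`update-grub` on Ubuntu and Debian) and the machine rebooted before the option takes effect. Keeping a backup copy of the file first is a sensible precaution.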
Hello, is there any update on this bug? I also have a Toshiba Tecra R840 with this problem; I tested nox2apic and it works. The latest kernels, 3.11 and 3.12, still have this problem. Thanks