Bug 21952 - resume hangs unless intel_idle.max_cstate=3 or maxcpus=1 - Samsung N145, N148, N150, N210, Lenovo S10-3
Status: CLOSED OBSOLETE
Product: ACPI
Classification: Unclassified
Component: Power-Sleep-Wake
Hardware: All Linux
Importance: P1 normal
Assigned To: Len Brown
Depends on:
Blocks: 7216 16055
Reported: 2010-11-04 00:36 UTC by Benjamin Nave
Modified: 2013-08-15 21:39 UTC (History)

See Also:
Kernel Version: 3.2
Tree: Mainline
Regression: Yes


Attachments
lspci -v (Samsung N150+) (8.58 KB, text/plain)
2011-01-20 03:35 UTC, David P Kleinschmidt
default boot parameters (3.32 KB, text/plain)
2011-01-20 03:38 UTC, David P Kleinschmidt
default boot parameters, hyperthreading disabled in BIOS (2.00 KB, text/plain)
2011-01-20 03:39 UTC, David P Kleinschmidt
maxcpus=1 (2.01 KB, text/plain)
2011-01-20 03:40 UTC, David P Kleinschmidt
intel_idle.max_cstate=0 (682 bytes, text/plain)
2011-01-20 03:42 UTC, David P Kleinschmidt
intel_idle.max_cstate=3 (2.87 KB, text/plain)
2011-01-20 03:43 UTC, David P Kleinschmidt
nolapic_timer (3.30 KB, text/plain)
2011-01-20 03:44 UTC, David P Kleinschmidt
patch vs 2.6.37 (3.37 KB, patch)
2011-01-20 05:26 UTC, Len Brown
patch vs 2.6.36 (3.37 KB, patch)
2011-01-20 05:30 UTC, Len Brown
patch vs 2.6.35.9 (3.39 KB, patch)
2011-01-20 05:32 UTC, Len Brown
lspci -v output on Samsung N145 Plus (6.46 KB, text/plain)
2011-01-20 11:27 UTC, Richard Schütz
dmesg|grep idle and cpuidle sysfs with default configuration (3.00 KB, text/plain)
2011-01-20 11:30 UTC, Richard Schütz
dmesg|grep idle with intel_idle.max_cstate=0 (359 bytes, text/plain)
2011-01-20 11:34 UTC, Richard Schütz
cpuidle sysfs with nolapic_timer (2.65 KB, text/plain)
2011-01-20 11:36 UTC, Richard Schütz
Debug output with intel_idle.max_cstate=0 (2.66 KB, text/plain)
2011-02-28 20:50 UTC, Seth Forshee

Description Benjamin Nave 2010-11-04 00:36:01 UTC
Samsung N150 Fails to Resume after Suspend/Hibernate on Kernel Version 2.6.35.

Please see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/640100 for further information.
Comment 1 Aleksandr Tishin 2010-11-04 15:54:18 UTC
Also affects Samsung N148 Plus (same hardware as in N150 Plus but without preinstalled Windows)
Comment 2 Mark Lord 2010-11-06 14:19:18 UTC
Ditto for Samsung N210 netbook.  Very difficult to debug, too.
Suspend (to RAM) appears to work, but the wakeup simply hangs with a dark screen.  No serial ports, no way to attach one either (USB is no good for this).
-ml
Comment 3 Davide Lima Duarte Daum 2010-11-07 04:28:27 UTC
Same problem for Samsung N150 Plus. 
With 2.6.35.4 (Kubuntu Maverick) suspend does not work, but with 2.6.34.7 it works. Is there some way to get useful information about the problem?
I tried the procedure described here: https://wiki.ubuntu.com/DebuggingKernelSuspend
but it reports a different hash every time.

If you need someone to run tests on this platform, I offer my support.
Comment 4 Юрий Чудновский 2010-11-07 12:24:16 UTC
Problem exists with 2.6.34-2.6.36 kernels.
On 2.6.33 no hang-ups detected for now.
2.6.27 not tested (since is not released).
Comment 5 Davide Lima Duarte Daum 2010-11-07 14:00:16 UTC
I don't know if this helps to solve the problem, but on the Ubuntu bug tracker I found a bug that covered the same problem in June: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/594885.
The focus was on the setting CONFIG_PM_DISABLE_CONSOLE.

This email contains the Ubuntu patch that created the problem:
https://lists.ubuntu.com/archives/kernel-team/2010-June/011203.html
In that thread the Ubuntu kernel team dropped this patch from the kernel build, but now we suffer from the same problem...
Comment 6 Юрий Чудновский 2010-11-08 09:27:28 UTC
Davide, that bug is definitely not the same. In bug 594885 the only problem is that no picture is shown on the display while everything else works; the current bug is about a totally hung system where nothing works except the power button. It is a much bigger problem with a different (for now still unknown) cause.
Comment 7 Mark Lord 2010-11-08 13:25:51 UTC
Well, on my N210 at least, that's what happens -- no backlight, and total system lockup on resume.  ALT-SYSRQ-X buttons don't work either.

-ml
Comment 8 Davide Lima Duarte Daum 2010-11-09 08:00:12 UTC
Right Юрий Чудновский! Sorry for the mistake...
Comment 9 Юрий Чудновский 2010-11-10 05:21:56 UTC
Re-tested the 2.6.34 kernel - it seems to work fine. Maybe the kernel I tested earlier was by mistake not from the 2.6.34 branch. I confirm the regression begins within the 2.6.35 branch.
Comment 10 Mark Lord 2010-11-19 00:48:37 UTC
Looks very similar to the symptoms from this more generic bug:

http://bugzilla.kernel.org/show_bug.cgi?id=21652
Comment 11 Anna Nachesa 2010-12-12 09:54:07 UTC
Also affects Samsung N210 Plus (Ubuntu 10.10 network edition, kernel 2.6.35-23-generic; dual-boot with Windows 7 Starter)
Comment 12 abhijeet.1989 2010-12-27 18:06:21 UTC
I had this issue on my netbook samsung n210 with 2.6.32.6 kernel.

Appending intel_idle.max_cstate=0 to the kernel boot parameters solves the issue. I wonder what this means and how it solves the issue! Can anyone throw some light?
Comment 13 Anna Nachesa 2010-12-27 22:10:22 UTC
Appending intel_idle.max_cstate=0 to the kernel boot parameters solved the issue for my samsung n210 as well (using 2.6.35.24 kernel now).
Comment 14 Mark Lord 2010-12-30 19:52:38 UTC
Setting max_cstate=0 essentially disables aggressive power-saving on the CPU.  So battery life will likely be reduced.

Why does this help?  Dunno, but it does indicate some kind of race condition in the kernel.  With normal settings, operations can have higher latency as the CPU transitions from lower (slower) cstates up to cstate-0 (the fastest).

Disabling cstates means the timing is always fast, and predictable.

Cheers
Comment 15 Mark Lord 2010-12-30 19:54:19 UTC
So.. possibly a better fix is to have the suspend script read/save the current max_cstate value, then set it to zero before suspending.  Then on resume, the resume script could restore the old value.

This way, everything ought to work well enough.  Ideally, I expect this should really be done in-kernel, since that's where the bug is.

Cheers
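A rough sketch of that save/restore idea, for illustration only. It assumes a kernel whose cpuidle sysfs tree exposes a writable per-state "disable" file (not something tested in this thread), and it selects states by the string in their "name" file rather than by max_cstate, since the module parameter itself is set at boot. The function name set_state_by_name and the CPU_ROOT variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: mask one idle state on every CPU via the cpuidle sysfs tree.
# Assumption: each stateN directory has "name" and a writable "disable" file.

CPU_ROOT=${CPU_ROOT:-/sys/devices/system/cpu}

# set_state_by_name NAME VALUE
#   Write VALUE (1 = disabled, 0 = enabled) to the "disable" file of every
#   cpuidle state whose "name" file contains exactly NAME.
set_state_by_name() (
    name=$1
    value=$2
    for d in "$CPU_ROOT"/cpu[0-9]*/cpuidle/state[0-9]*; do
        # Skip unmatched globs and kernels without the "disable" attribute.
        [ -r "$d/name" ] && [ -e "$d/disable" ] || continue
        if [ "$(cat "$d/name")" = "$name" ]; then
            echo "$value" > "$d/disable"
        fi
    done
)
```

A suspend hook could call "set_state_by_name ATM-C4 1" before sleeping and "set_state_by_name ATM-C4 0" after resume, as root. The state name to use has to be read from the machine's own sysfs "name" files.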
Comment 16 David P Kleinschmidt 2011-01-01 16:19:10 UTC
I am able to set intel_idle.max_cstate=3 on my Samsung N150+ without triggering this bug, and it's not until intel_idle.max_cstate=4 that I run into problems.  That indicates to me that the problem isn't so much that the intel_idle driver needs to be disabled, but rather that cstate-4 is glitchy.

Power management is a mystery to me so I don't know whether there's much to be gained by running with max_cstate=3 rather than max_cstate=0, but I hope this can at least help someone more knowledgeable narrow down the problem.

Linux tinytronix 2.6.35-24-generic #43~ppa1~loms~maverick-Ubuntu SMP Fri Dec 24 18:15:40 UTC 2010 i686 GNU/Linux

BOOT_IMAGE=/boot/vmlinuz-2.6.35-24-generic root=UUID=66817f18-a5c7-4d9e-8a29-3220974cb618 ro intel_idle.max_cstate=3 quiet splash

- dpk
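When comparing reports it helps to confirm which value a workaround parameter actually had at boot. A small sketch that pulls one NAME=value pair out of a kernel command line such as the BOOT_IMAGE line above; the function name cmdline_value is hypothetical, and reading /proc/cmdline by default is an assumption about where the test is run.

```shell
#!/bin/sh
# cmdline_value NAME [FILE]
#   Print the value of the last NAME=value word in FILE (default: the
#   running kernel's /proc/cmdline). Exit status 1 if NAME is absent.
cmdline_value() (
    name=$1
    file=${2:-/proc/cmdline}
    set -f                      # no globbing while word-splitting the line
    found=1
    value=
    for word in $(cat "$file"); do
        case $word in
            "$name"=*) value=${word#*=}; found=0 ;;
        esac
    done
    [ "$found" -eq 0 ] && printf '%s\n' "$value"
    exit "$found"
)
```

For example, "cmdline_value intel_idle.max_cstate" prints 3 on a system booted with the command line quoted above.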
Comment 17 Rafael J. Wysocki 2011-01-16 22:41:45 UTC
Is the problem still present in 2.6.37?
Comment 18 Richard Schütz 2011-01-18 15:41:17 UTC
I am experiencing the problem with my Samsung N145 Plus and kernel 2.6.37, too. I noticed that disabling Hyper Threading in BIOS or at runtime (echo 0 > /sys/devices/system/cpu/cpu1/online) helps. Can someone confirm this? Furthermore all the devices mentioned in topic seem to have Hyper Threading capable processors.
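The runtime variant of this workaround can be wrapped in a small helper so that it covers every hot-pluggable CPU rather than only cpu1. This is a sketch, assuming the usual sysfs "online" files; the function name set_secondary_cpus and the CPU_ROOT variable are hypothetical, and it has to run as root.

```shell
#!/bin/sh
# Sketch only: take all secondary CPUs offline (0) or bring them back (1).

CPU_ROOT=${CPU_ROOT:-/sys/devices/system/cpu}

# set_secondary_cpus VALUE
#   Write VALUE to the "online" file of every CPU except cpu0.
set_secondary_cpus() (
    value=$1
    for f in "$CPU_ROOT"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue          # glob did not match anything
        case $f in
            */cpu0/online) continue ;;   # leave the boot CPU alone
        esac
        echo "$value" > "$f"
    done
)
```

Calling "set_secondary_cpus 0" before suspend and "set_secondary_cpus 1" after resume mirrors the manual echo described above.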
Comment 19 David P Kleinschmidt 2011-01-19 02:58:27 UTC
Confirmed, disabling hyperthreading in BIOS resolves the issue without requiring intel_idle.max_cstate to be specified, on 2.6.35 as well.

- dpk
Comment 20 Len Brown 2011-01-20 02:17:29 UTC
please show the output from

dmesg |grep idle
grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*

in the default configuration, and also when
booting with intel_idle.max_cstate=0 (to fall back to ACPI).

does booting with "maxcpus=1" also work around the problem?
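For anyone gathering this data on several boots, the two commands above can be bundled into one helper that writes a single file per configuration. A minimal sketch; the function name collect_idle_info, the section headers, and the placeholder messages are made up for illustration.

```shell
#!/bin/sh
# Sketch only: capture the requested idle diagnostics into one file.

CPU_ROOT=${CPU_ROOT:-/sys/devices/system/cpu}

# collect_idle_info OUTFILE
collect_idle_info() (
    out=$1
    {
        echo "== kernel command line =="
        cat /proc/cmdline 2>/dev/null || echo "(unavailable)"
        echo "== dmesg | grep idle =="
        dmesg | grep idle || echo "(no idle lines in dmesg)"
        echo "== cpuidle sysfs =="
        # Same command as requested above; an empty result is itself
        # informative, so say so explicitly.
        grep . "$CPU_ROOT"/cpu*/cpuidle/*/* || echo "(no cpuidle entries)"
    } > "$out" 2>&1
)
```

Running it once per boot configuration, for example "collect_idle_info default.txt" and then "collect_idle_info max_cstate0.txt", gives files that can be attached directly.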
Comment 21 Len Brown 2011-01-20 02:33:54 UTC
does booting with "nolapic_timer" help?
If yes, does the kernel being tested include the
patch from bug #21032?

please show the lspci output for each of the failing systems.
Comment 22 David P Kleinschmidt 2011-01-20 03:35:05 UTC
Created attachment 44232 [details]
lspci -v (Samsung N150+)
Comment 23 David P Kleinschmidt 2011-01-20 03:38:18 UTC
Created attachment 44242 [details]
default boot parameters
Comment 24 David P Kleinschmidt 2011-01-20 03:39:55 UTC
Created attachment 44252 [details]
default boot parameters, hyperthreading disabled in BIOS
Comment 25 David P Kleinschmidt 2011-01-20 03:40:43 UTC
Created attachment 44262 [details]
maxcpus=1
Comment 26 David P Kleinschmidt 2011-01-20 03:42:13 UTC
Created attachment 44272 [details]
intel_idle.max_cstate=0
Comment 27 David P Kleinschmidt 2011-01-20 03:43:43 UTC
Created attachment 44282 [details]
intel_idle.max_cstate=3
Comment 28 David P Kleinschmidt 2011-01-20 03:44:58 UTC
Created attachment 44292 [details]
nolapic_timer
Comment 29 David P Kleinschmidt 2011-01-20 03:54:09 UTC
'maxcpus=1' and 'nolapic_timer' both work around the issue on 2.6.35. I don't know whether the patch has been applied.

- dpk
Comment 30 Len Brown 2011-01-20 05:18:23 UTC
> 00:00.0 Host bridge: Intel Corporation N10 Family DMI Bridge

Intel NM10.  Good to know.

> intel_idle: lapic_timer_reliable_states 0x6

This should be 0x2

This means that your kernel (vmlinuz-2.6.35-24-generic)
is older than upstream 2.6.35.9, which is when the patch
from bug #21032 shipped.

However, I don't think that patch will actually help here.
That is because comment #27 shows that intel_idle.max_cstate=3
fixes your system, and that disables ATM-C4, yet leaves
ATM-C2 active.

So the problem here is with ATM-C4.

> intel_idle.max_cstate=0 (the acpi_idle case)

No output from "# grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*" ?

If that is the case, then ACPI C-states deeper than C1
are somehow disabled on this system.
Is it running with default BIOS SETUP settings?

Do you see the same thing when on AC vs on Battery?

> nolapic_timer output in comment #28

The output shows that with nolapic_timer you are not entering
*any* c-states, not even c1; instead you are polling.
It looks like something is broken related to the nolapic_timer
option -- need to look into that; because it would otherwise
implicate the lapic timer; but here it nukes all the c-states,
which doesn't tell us anything about the lapic timer.
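To make the 0x6 versus 0x2 remark concrete, here is a small sketch that decodes such a mask. It assumes that bit N of lapic_timer_reliable_states stands for C-state N, which is how the intel_idle source of that era appears to use it; the function name reliable_states is hypothetical.

```shell
#!/bin/sh
# reliable_states MASK
#   Print the C-states whose bit is set in MASK (hex or decimal).
#   Assumption: bit N corresponds to C-state N.
reliable_states() (
    mask=$(( $1 ))
    n=0
    out=
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="$out${out:+ }C$n"
        fi
        mask=$(( mask >> 1 ))
        n=$(( n + 1 ))
    done
    printf '%s\n' "$out"
)
```

Under that assumption "reliable_states 0x6" prints "C1 C2" and "reliable_states 0x2" prints "C1", so the reported value treats the LAPIC timer as reliable in C2 while the expected value does not.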
Comment 31 Len Brown 2011-01-20 05:26:09 UTC
Created attachment 44302 [details]
patch vs 2.6.37

Please try Shaohua's broadcast clock event patch from 2.6.38-rc1.
This version should apply cleanly to 2.6.37
Comment 32 Len Brown 2011-01-20 05:30:32 UTC
Created attachment 44312 [details]
patch vs 2.6.36

Here is the same patch, back-ported to apply cleanly to 2.6.36.3
Comment 33 Len Brown 2011-01-20 05:32:28 UTC
Created attachment 44322 [details]
patch vs 2.6.35.9

Here is the same patch, back-ported to apply cleanly to 2.6.35.9
Comment 34 Richard Schütz 2011-01-20 11:27:58 UTC
Created attachment 44382 [details]
lspci -v output on Samsung N145 Plus
Comment 35 Richard Schütz 2011-01-20 11:30:50 UTC
Created attachment 44392 [details]
dmesg|grep idle and cpuidle sysfs with default configuration
Comment 36 Richard Schütz 2011-01-20 11:34:00 UTC
Created attachment 44402 [details]
dmesg|grep idle with intel_idle.max_cstate=0

there's no data for cpuidle in sysfs
Comment 37 Richard Schütz 2011-01-20 11:36:41 UTC
Created attachment 44412 [details]
cpuidle sysfs with nolapic_timer

patch from bug #21032 should be applied as it's 2.6.37
Comment 38 Richard Schütz 2011-01-20 11:40:03 UTC
maxcpus=1 as well as nolapic_timer (obviously because it disables the c-states)  help.
Comment 39 Richard Schütz 2011-01-20 13:04:46 UTC
The patch from #31 does not fix the problem in my case.
Comment 40 Rolf Kutz 2011-01-20 14:05:24 UTC
The patch from #33 didn't fix it for me, with 2.6.35.10.
Comment 41 Martin Rogge 2011-01-23 20:51:50 UTC
I am not sure this is the same issue or not (pls ignore if not) but I just wanted to make you aware of a similar 2.6.35 regression that went away in 2.6.36: 

https://bugzilla.kernel.org/show_bug.cgi?id=16532 

2.6.37 is still good on the hardware in question.
Comment 42 Seth Forshee 2011-02-24 01:03:23 UTC
I've been debugging this issue a little with an N150. It's still present as of v2.6.38-rc6.

In addition to what's been reported here, I've observed that acpi_skip_timer_override and nohpet seem to make this issue go away. I've traced it down to something going wrong when bringing the secondary logical CPU online. When the hpet code receives the CPU_ONLINE notification the primary CPU schedules some work on the secondary CPU and waits for it to complete, but the work is never getting executed. The secondary CPU is coming online and executing instructions, and I haven't isolated exactly where it hangs.

I've also noticed that this problem seems to be timing sensitive, so it's entirely possible that some of the command-line options that "fix" the issue just alter the timing enough to mask it.

Let me know if there's anything you want me to try, and I'll post any further findings here as well.
Comment 43 Seth Forshee 2011-02-28 20:50:39 UTC
Created attachment 49672 [details]
Debug output with intel_idle.max_cstate=0

Attached requested data with intel_idle.max_cstate=0. I got some data from the cpuidle sysfs nodes, it just seemed to take a while after boot before they appeared for some reason. I still see the same nolapic_timer behavior with 2.6.38.

I also note that acpi_idle only seems to utilize C3 and higher, so if this is a problem with C4 it makes sense that disabling intel_idle eliminates the issue.
Comment 44 Юрий Чудновский 2011-03-01 09:19:40 UTC
Does anyone get rare (let's say, once per day) sudden hangs on 2.6.35+ kernels (even with intel_idle.max_cstate=3 etc.)? If so, it may be the same regression, because I don't remember any hangs on 2.6.32.
Comment 45 Seth Forshee 2011-03-16 19:03:05 UTC
I got some time to look into this a little more. I have some more information, but still no clear answer.

The secondary CPU starts executing and hits idle at least once. It hangs after coming out of idle and re-enabling irqs -- I can see that it makes it as far as local_irq_enable() in intel_idle(), but no farther in that function. Seeing where it goes from there is more of a challenge, given the limited debug capabilities in this state. However, I don't see it hitting smp_reschedule_interrupt() which is expected from the schedule_delayed_work_on() call from hpet_cpuhp_notify().
Comment 46 Olivier Parent-Colombel 2011-04-08 18:20:03 UTC
I have the same problem with a samsung n220 with kernel 2.6.38.2. I tried to change intel_idle.max_cstate to 0 (I also tried 1,2 and 3) and I still can't resume from suspend.
Comment 47 Len Brown 2011-08-01 18:41:55 UTC
re: comment #43
/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0
/sys/devices/system/cpu/cpu0/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10

it seems that when running acpi_idle via "intel_idle.max_cstate=0"
that only C1 and C2 are exposed.

Is this on AC?  Please try it on DC to see if additional
C-states show up under ACPI.
Comment 48 Seth Forshee 2011-08-01 18:51:30 UTC
Sorry, I'm not in possession of that machine any more so I'll be unable to do any more testing.
Comment 49 Richard Schütz 2011-08-07 19:43:58 UTC
FYI: With Linux 3.0.1 (didn't try 3.0) on my Samsung N145P netbook suspend is working fine now.

What could be the patch, that fixed it?
Comment 50 Olivier Parent-Colombel 2011-08-21 14:18:47 UTC
same thing for me, resume now works without any workaround with linux 3.0.1 (Samsung N220)
Comment 51 Zhang Rui 2012-01-18 02:22:51 UTC
Good to know.
Bug closed.
Comment 52 Richard Schütz 2012-01-18 08:59:41 UTC
Please reopen. I was a bit overhasty: sometimes suspend works fine on my Samsung N145P, but sometimes it still fails in the same way as before.
Comment 53 Viktor_D 2012-03-30 18:23:19 UTC
I have been using and learning Ubuntu since Maverick, and ever since then the system has not woken up normally after I initiate suspend. The disk activity indicator just flashes for a short time and the screen remains black.

DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-20-generic 3.2.0-20.33
ProcVersionSignature: Ubuntu 3.2.0-20.33-generic 3.2.12
Uname: Linux 3.2.0-20-generic i686

MachineType: LENOVO S10-3
Proc: Intel Atom N450 1.66 GHz
Motherboard: Intel NM10
Video: Intel Graphics Media Accelerator (GMA) 3150
Network: Realtek PCIe GBE Family Controller (10/100/1000MBit), Atheros AR9285 Wireless Network Adapter (bgn), 2.1+EDR Bluetooth
Comment 54 Viktor_D 2012-04-03 10:06:54 UTC
Just updated system with update manager and installed Linux kernel 3.3.1 to test - problem remains. My Lenovo can't get up from suspend.
Comment 55 Benji 2012-04-29 10:33:53 UTC
Like Viktor, I have the same issue with a Lenovo S10. After suspend, the screen remains black. I have no idea how to debug it, since I can't even access the netbook through SSH.
Comment 56 Feng Tang 2012-07-05 05:27:30 UTC
(In reply to comment #54)
> Just updated system with update manager and installed Linux kernel 3.3.1 to
> test - problem remains. My Lenovo can't get up from suspend.

I saw a similar problem on my Lenovo s10-3t, but my machine can resume from suspend, sometimes after 120 seconds, sometimes after 150 or 300 seconds.

So could you try waiting 6 minutes to see whether it comes back?
Comment 57 Viktor_D 2012-07-05 07:26:19 UTC
(In reply to comment #56)
> 
> So could you try to wait 6 minutes to see whether it could come back.

No, without intel_idle.max_cstate=3 it doesn't wake up even after 10 minutes.
With max_cstate=3 the computer wakes up fast (but it badly affects the browser: Firefox starts to use a lot of memory and CPU).
Comment 58 Len Brown 2012-08-24 03:34:11 UTC
Regarding the Lenovo S10-3...

originally its resume problem was fixed by this patch in 2.6.36:

commit 4731fdcf6f7bdab3e369a3f844d4ea4d4017284d
Author: Len Brown <len.brown@intel.com>
Date:   Fri Sep 24 21:02:27 2010 -0400

    intel_idle: PCI quirk to prevent Lenovo Ideapad s10-3 boot hang

You can tell if that quirk is running b/c it spews a dmesg line:
[    0.624375] pci 0000:00:1f.0: [Firmware Bug]: TigerPoint LPC.BM_STS cleared

The way that original issue was debugged was finding the difference
between the working acpi_idle and the failing intel_idle.

But today the failure is different.

I have access to a Lenovo S10-3
I just dropped FC17 on it, which is 3.5.2, and resume hangs
with a black screen.  The quirk above is in place.

intel_idle.max_cstate=3 allows resume to work.

But some cmdline params that fail are surprising:
intel_idle.max_cstate=1

maxcpus=1
intel_idle.max_cstate=0
intel_idle.max_cstate=0 processor.max_cstate=1
intel_idle.max_cstate=0 processor.max_cstate=2 (gives MWAIT 0x10)
cpuidle.off=1 crashes on boot
nohpet
idle=poll

here is a clue, after leaving the system "failed" for about 5 minutes
it actually resumed, and dmesg says this:
[  118.624575] PM: Syncing filesystems ... done.
[  118.627338] PM: Preparing system for mem sleep
[  118.779349] Freezing user space processes ... (elapsed 0.01 seconds) done.
[  118.791287] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[  118.802295] PM: Entering mem sleep
[  118.802338] Suspending console(s) (use no_console_suspend to debug)
[  118.803318] sd 1:0:0:0: [sda] Synchronizing SCSI cache
[  118.803577] sd 1:0:0:0: [sda] Stopping disk
[  119.169169] ACPI handle has no context!
[  119.365118] PM: suspend of devices complete after 562.255 msecs
[  119.365590] PM: late suspend of devices complete after 0.459 msecs
[  119.377247] pcieport 0000:00:1c.0: wake-up capability enabled by ACPI
[  119.388261] ehci_hcd 0000:00:1d.7: wake-up capability enabled by ACPI
[  119.399244] uhci_hcd 0000:00:1d.3: wake-up capability enabled by ACPI
[  120.757907] uhci_hcd 0000:00:1d.1: wake-up capability enabled by ACPI
[  120.758006] uhci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
[  120.758137] PM: noirq suspend of devices complete after 1392.536 msecs
[  120.758174] ACPI: Preparing to enter system sleep state S3
[  120.784273] PM: Saving platform NVS memory
[  120.784330] Disabling non-boot CPUs ...
[  120.786548] CPU 1 is now offline
[  120.787428] Extended CMOS year: 2000
[  120.787428] ACPI: Low-level resume complete
[  120.787428] PM: Restoring platform NVS memory
[  120.787428] CPU0: Thermal monitoring handled by SMI
[  120.787428] Extended CMOS year: 2000
[  120.787428] microcode: CPU0 updated to revision 0x107, date = 2009-08-25
[  120.792528] Enabling non-boot CPUs ...
[  120.792749] Booting Node 0 Processor 1 APIC 0x1
[  120.807014] microcode: CPU1 updated to revision 0x107, date = 2009-08-25
[  120.812158] CPU1 is up
[  120.812539] ACPI: Waking up from system sleep state S3
[  424.700077] ACPI Exception: AE_TIME, Returned by Handler for [EmbeddedControl] (20120320/evregion-501)
[  424.700185] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPCB.EC0_.DSSV] (Node ffff88003d1d0488), AE_TIME (20120320/psparse-536)
[  424.700228] ACPI Error: Method parse/execution failed [\_WAK] (Node ffff88003d1cbaf0), AE_TIME (20120320/psparse-536)
[  424.700402] ACPI Exception: AE_TIME, While executing method \_WAK (20120320/hwesleep-82)
[  424.721458] uhci_hcd 0000:00:1d.0: wake-up capability disabled by ACPI
[  424.728327] uhci_hcd 0000:00:1d.1: wake-up capability disabled by ACPI
[  424.732428] uhci_hcd 0000:00:1d.3: wake-up capability disabled by ACPI
[  424.732540] ehci_hcd 0000:00:1d.7: wake-up capability disabled by ACPI
[  424.733523] PM: noirq resume of devices complete after 21.159 msecs
[  424.733927] PM: early resume of devices complete after 0.323 msecs


So it appears that we had some kind of time-out in the EC while evaluating _WAK

This is not intel_idle specific, and it looks like ACPI,
so moving bug categories.
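The stall shows up as a jump between consecutive timestamps in the log above: 424.700077 - 120.812539 is roughly 304 seconds between "Waking up from system sleep state S3" and the first AE_TIME line. A small sketch for finding such a jump in a saved dmesg; the function name largest_gap is hypothetical and it assumes the default "[seconds.microseconds]" timestamp prefix.

```shell
#!/bin/sh
# largest_gap [FILE...]
#   Print the largest gap (in seconds) between consecutive timestamped
#   dmesg lines, followed by the line that ended the gap.
largest_gap() {
    awk '
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets; awk ignores the leading blanks.
            ts = substr($0, 2, RLENGTH - 2) + 0
            if (seen && ts - prev > gap) { gap = ts - prev; line = $0 }
            prev = ts
            seen = 1
        }
        END { if (seen) printf "%.3f %s\n", gap, line }
    ' "$@"
}
```

Used as "dmesg | largest_gap" or "largest_gap saved-dmesg.txt", it points at the line where time was lost, which makes it easier to compare failing and working boots.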
Comment 59 Len Brown 2012-08-24 03:45:04 UTC
acpi_sleep=nonvs


hmm, we get the _WAK EC timeout also in the working intel_idle.max_cstate=3 case:


[   62.018118] PM: Syncing filesystems ... done.
[   63.634734] PM: Preparing system for mem sleep
[   63.768209] Freezing user space processes ... (elapsed 0.01 seconds) done.
[   63.779276] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[   63.790285] PM: Entering mem sleep
[   63.790330] Suspending console(s) (use no_console_suspend to debug)
[   63.791278] sd 1:0:0:0: [sda] Synchronizing SCSI cache
[   63.791622] sd 1:0:0:0: [sda] Stopping disk
[   64.157146] ACPI handle has no context!
[   64.392113] PM: suspend of devices complete after 601.259 msecs
[   64.392513] PM: late suspend of devices complete after 0.389 msecs
[   64.403259] pcieport 0000:00:1c.0: wake-up capability enabled by ACPI
[   64.414241] ehci_hcd 0000:00:1d.7: wake-up capability enabled by ACPI
[   64.425214] uhci_hcd 0000:00:1d.3: wake-up capability enabled by ACPI
[   65.783698] uhci_hcd 0000:00:1d.1: wake-up capability enabled by ACPI
[   65.783791] uhci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
[   65.783900] PM: noirq suspend of devices complete after 1391.379 msecs
[   65.783936] ACPI: Preparing to enter system sleep state S3
[   65.811197] PM: Saving platform NVS memory
[   65.811251] Disabling non-boot CPUs ...
[   65.813252] CPU 1 is now offline
[   65.814435] Extended CMOS year: 2000
[   65.814435] ACPI: Low-level resume complete
[   65.814435] PM: Restoring platform NVS memory
[   65.814435] CPU0: Thermal monitoring handled by SMI
[   65.814435] Extended CMOS year: 2000
[   65.814435] microcode: CPU0 updated to revision 0x107, date = 2009-08-25
[   65.819131] Enabling non-boot CPUs ...
[   65.819449] Booting Node 0 Processor 1 APIC 0x1
[   65.853992] microcode: CPU1 updated to revision 0x107, date = 2009-08-25
[   65.858078] CPU1 is up
[   65.858479] ACPI: Waking up from system sleep state S3
[   69.862782] ACPI Exception: AE_TIME, Returned by Handler for [EmbeddedControl] (20120320/evregion-501)
[   69.862803] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPCB.EC0_.DSSV] (Node ffff88003d1d0488), AE_TIME (20120320/psparse-536)
[   69.862829] ACPI Error: Method parse/execution failed [\_WAK] (Node ffff88003d1cbaf0), AE_TIME (20120320/psparse-536)
[   69.862861] ACPI Exception: AE_TIME, While executing method \_WAK (20120320/hwesleep-82)
[   69.864039] Clocksource tsc unstable (delta = 1099511324104 ns)
[   69.864227] Switching to clocksource hpet
Comment 60 Feng Tang 2012-08-24 04:51:15 UTC
Len,

The resume problems on the s10-3 are of 2 types:
1. resume hangs forever
2. resume hangs for 2-5 minutes, and then the machine comes back to life.

For the 2nd one I have a debug patch that fixes it, please see bugzilla 41932
https://bugzilla.kernel.org/show_bug.cgi?id=41932
Comment 61 Len Brown 2012-08-24 05:01:11 UTC
acpi_sleep=nonvs
no joy

acpi.ec_delay=5000
no joy
Comment 62 Len Brown 2012-08-24 05:21:31 UTC
Feng,
Apparently I'm mostly seeing failure #1, because the patch
in bug 41932 doesn't seem to help.  (applied w/ typo fixed to 3.5.2)
Comment 63 Len Brown 2012-08-24 05:26:44 UTC
that said...

suspend seems to always work on my lenovo s10-3
when I use intel_idle.max_cstate=3 on Linux-3.5.2, and I've
not the foggiest idea why.

The same c-state accessed by acpi-idle doesn't work,
and shallower c-states don't work.  bizarre.
Comment 64 Feng Tang 2012-08-24 05:53:49 UTC
(In reply to comment #62)
> Feng,
> Apparently I'm mostly seeing failure #1, because the patch
> in bug 41932 doesn't seem to help.  (applied w/ typo fixed to 3.5.2)

I see, my machine is a s10-3t, which is different from your s10-3.

the bios version is Rev 0.25, released on 05/26/2010
Comment 65 Zhang Rui 2013-04-17 13:51:01 UTC
Len, any idea/progress on this?
Comment 66 Len Brown 2013-08-15 21:39:41 UTC
Resume fails on my Lenovo s10-3 running Ubuntu 13.04 (Linux 3.8.0-27-generic),
but the recent upstream kernels I try all work fine:

3.11.0-rc5-gf1d6e17
3.10.7
3.8.13
3.8.0

For the newest one, I tested AC, DC, intel_idle.max_cstate=0 -- all OK.

Ubuntu 13.04's kernel still fails always --
no matter if running intel_idle, acpi_idle, or even idle=halt
I grabbed the latest -- 3.8.0-29-generic from raring-proposed, but no joy.

So I installed Ubuntu 13.10's daily build -- 3.11.0-2-generic (Aug 12th)
and suspend/resume on the Lenovo Ideapad S10-3 works fine.
I don't know what 13.04's problem was, but since upstream is
working and 13.10 is working, we seem to be done here.
