Bug 16252

Summary: Since 2.6.35-rc2 suspend to ram is very shaky
Product: Power Management Reporter: Norbert Preining (preining)
Component: Hibernation/Suspend Assignee: power-management_other
Status: CLOSED INVALID    
Severity: normal CC: chxanders, error27, maciej.rutecki, preining, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.35-rc3 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: config of the running kernel
dmesg after a successful suspend to ram

Description Norbert Preining 2010-06-19 16:08:51 UTC
This bug is a clone of bug 16179, opened to track the regression that suspend to RAM now often hangs.

+++ This bug was initially created as a clone of Bug #16179 +++

Subject    : 2.6.35-rc2 completely hosed on intel gfx?
Submitter  : Norbert Preining <preining@logic.at>
Date       : 2010-06-06 11:55
Message-ID : 20100606115534.GA9399@gamma.logic.tuwien.ac.at
References : http://marc.info/?l=linux-kernel&m=127582534931581&w=2

This entry is being used for tracking a regression from 2.6.34.  Please don't
close it until the problem is fixed in the mainline.

=====================================================

Up to 2.6.35-rc1 suspend was rock-solid on my computer. Since 2.6.35-rc2 I am often seeing hard freezes (not even SysRq works).

Following the suggestions in the original bug report, I attach the .config plus the dmesg output after a successful suspend.

I cannot attach the dmesg output from a failed suspend: nothing gets logged, since it is a hard freeze in which not even SysRq works.

Hope that helps

Norbert
Comment 1 Norbert Preining 2010-06-19 16:09:29 UTC
Created attachment 26862
config of the running kernel
Comment 2 Norbert Preining 2010-06-19 16:10:33 UTC
Created attachment 26863
dmesg after a successful suspend to ram
Comment 3 Dan Carpenter 2010-06-21 10:32:50 UTC
Can you try suspending from outside X?

echo 7 7 7 7 > /proc/sys/kernel/printk
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state

Having the last couple lines of output would really help.
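
The three commands above can be wrapped in a small script that refuses to suspend when one of the files is missing. This is only a sketch: the helper names write_knob and trace_suspend and the SYSROOT variable are made up for illustration (SYSROOT exists purely so the script can be dry-run against a scratch directory), and nothing here is an official tool.

```shell
#!/bin/sh
# Sketch only: wraps the three echo commands with existence checks.
# SYSROOT is a hypothetical prefix for dry runs; leave it empty on a real system.
SYSROOT="${SYSROOT:-}"

write_knob() {
    # $1 = value to write, $2 = path of the /proc or /sys file
    if [ -w "$SYSROOT$2" ]; then
        echo "$1" > "$SYSROOT$2"
    else
        echo "skipping $2: not present or not writable" >&2
        return 1
    fi
}

trace_suspend() {
    # Raise the console log level, enable suspend tracing, then suspend to RAM.
    # Stops at the first file that is missing, so the machine is never
    # suspended without tracing enabled.
    write_knob "7 7 7 7" /proc/sys/kernel/printk || return 1
    write_knob 1 /sys/power/pm_trace || return 1
    write_knob mem /sys/power/state
}
```

Run trace_suspend as root from a text console, not from inside X. If /sys/power/pm_trace does not exist, the function returns non-zero before touching /sys/power/state.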
Comment 4 Norbert Preining 2010-06-21 13:44:22 UTC
> echo mem > /sys/power/state

Do I need config PM_TRACE? With my config I don't have /sys/power/pm_trace,
arrrrgg:
config CAN_PM_TRACE
        def_bool y
        depends on PM_DEBUG && PM_SLEEP && EXPERIMENTAL
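
A quick way to see which of the options in that Kconfig snippet are actually set is to grep the config file. The sketch below assumes a plain-text kernel config (for example the .config attached to this bug); the function names config_has and report_pm_trace_deps are hypothetical, and the option list is just the names quoted above plus PM_TRACE.

```shell
#!/bin/sh
# Sketch: report which of the options named above are set to y in a kernel config.
# The config path is an assumption; pass the .config of the running kernel.
config_has() {
    # $1 = config file, $2 = option name without the CONFIG_ prefix
    grep -q "^CONFIG_$2=y\$" "$1"
}

report_pm_trace_deps() {
    # $1 = config file
    for opt in PM_DEBUG PM_SLEEP EXPERIMENTAL PM_TRACE; do
        if config_has "$1" "$opt"; then
            echo "$opt=y"
        else
            echo "$opt missing"
        fi
    done
}
```

Usage: report_pm_trace_deps /path/to/.config prints one line per option, so a missing dependency shows up before a rebuild.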
Comment 5 Dan Carpenter 2010-06-21 20:03:47 UTC
Gar...  Sorry about that.  If you end up recompiling your kernel, can you enable CONFIG_ACPI_DEBUG as well?

In the end I'm mostly just poking through:
http://ubuntuforums.org/showthread.php?p=3066404
Comment 6 Norbert Preining 2010-06-23 15:55:27 UTC
Hi everyone!

OK, I have to say that it seems to be not a regression but rather a *developed* incompatibility, which I am tracking down.

Because doing
  echo mem > /sys/power/state
works 100% fine, without any problem.

BUT suspending via pm-utils breaks. At first I thought this was because it uses uswsusp, so I switched pm-utils to the kernel suspend mode, but there are still some hangs.
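
For reference, switching pm-utils to the kernel method is usually done with a config fragment like the one below. This is a sketch under assumptions, not necessarily what was done on this machine: it assumes a pm-utils version that reads shell-style fragments from /etc/pm/config.d and honours the SLEEP_MODULE variable, and the file name is made up.

```shell
# Hypothetical file: /etc/pm/config.d/sleep-module (the file name is an assumption)
# Ask pm-utils to use the in-kernel suspend path instead of uswsusp.
SLEEP_MODULE="kernel"
```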

I am trying to find the culprit, please see also Debian bug #586674
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=586674)

So I am not sure how to go from here with that bug report, please do whatever you think is appropriate.

Best

Norbert
Comment 7 Rafael J. Wysocki 2010-06-23 19:57:50 UTC
Well, apparently it is not a bug from the kernel point of view, so I'm marking it as resolved.  Please reopen if you find out that this is a kernel problem after all.
Comment 8 Norbert Preining 2010-07-02 15:23:35 UTC
Hi Rafael,

I have to say it is a bit mysterious. In most cases suspend/resume works without any problems with the echo mem > ... incantation, while with pm-utils it sometimes hangs.

The pm-utils hangs always occurred in the resume stage.

Just now I got a very similar hang with the pure kernel suspend/resume, and I could see where it hangs: at the backlight restore.

I took a photo of the hang; here are the last lines:
[16273.....] PM: resume of devices complete after 2115.046 msecs
[16273.....] Restarting tasks ... done
[16273.....] video LNXVIDEO:00: Restoring backlight state

and there it stopped: the CPU seemingly started to work like hell (the fans spun up), and not even SysRq worked, only holding the power button for 4 seconds.

Whether this qualifies for a reopen I leave to you.

If you need any further information let me know.

Norbert

Ah, sorry, that was with a git kernel up to commit 97e02140.