Most recent kernel where this bug did not occur: 2.6.20
Hardware Environment: IBM Thinkpad T43
Suspend and resume to/from disk randomly fail. Sometimes pressing any key during suspend or resume leads to a successful suspend/resume.
Subject: Re: [Bugme-new] New: Suspend/resume fails/stalls until
keyboard interrupt occurs
On Tue, 26 Jun 2007 13:05:44 -0700 (PDT)
> Summary: Suspend/resume fails/stalls until keyboard interrupt
> Product: Power Management
> Version: 2.5
> KernelVersion: 2.6.22-rc6
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Hibernation/Suspend
> AssignedTo: email@example.com
> ReportedBy: firstname.lastname@example.org
> Most recent kernel where this bug did not occur: 2.6.20
> Hardware Environment: IBM Thinkpad T43
> Problem Description:
> Suspend and resume to/from disk randomly fails. Sometimes pressing any key
> during suspend or resume leads to successful suspend/resume.
(please respond via emailed reply-to-all)
This might be a post-2.6.21 regression. Are you able to find out if 2.6.21
had the same bug?
I think I'm seeing this same bug on 2.6.23-rc1+git, with my T43.
As it was coming out of S3, I noticed a very slowly flashing cursor, so I waited and pressed keys. About 5 minutes later the laptop finished coming out of suspend, and I'm working on it now.
There seem to be no interesting messages in dmesg, but I'll attach mine + config. What can I do to help debug?
Created attachment 12205
my dmesg during suspend/resume
Created attachment 12206
(In reply to comment #4)
> my .config
Can you please try without CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS?
It happened once more, but I've had many suspend/resume cycles where it was fine. Just wanted to add a note that it is difficult to reproduce.
I still have to try the CONFIG_NO_HZ=n
There is a fix that may be related to this issue in Linus' current tree (2.6.23-rc7-git4 as of today). Please test it.
I tried with the latest git, rc7+, and still reproduced this issue with CONFIG_NO_HZ=y
All I did was boot into X, log in, run powertop, and suspend/resume a few times, first with the power cord plugged in. I ran into the problem with the power cord unplugged, on the third or fourth resume. It's not immediately apparent whether key presses help (or hurt), I just can't tell, but it took 4 or more minutes to resume, with a very slowly flashing cursor in the upper left.
btw, I'm running in vesafb mode 0x318 on console, don't know if that matters or not.
Well, please try CONFIG_NO_HZ=n and CONFIG_HIGH_RES_TIMERS=n, then.
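Before rebuilding, it can help to confirm what the kernel under test was actually built with. A minimal sketch, assuming the .config text is available as a string; the helper name option_states is made up for illustration and is not part of any kernel tool:

```python
import re

# The two options named in this thread.
OPTIONS = ("CONFIG_NO_HZ", "CONFIG_HIGH_RES_TIMERS")


def option_states(config_text, options=OPTIONS):
    """Return {option: value} for each requested option.

    Options that are absent, or present only as "# CONFIG_X is not set",
    are reported as "n".
    """
    states = {name: "n" for name in options}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in states:
            states[match.group(1)] = match.group(2)
    return states
```

Calling option_states(open(".config").read()) on the build tree gives a quick check that both options really are off in the test kernel.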
it seems it is *probably* related to CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS.
I've done 10 S3/resume cycles with varying states of plugged in/unplugged, and have not been able to repro.
I'll reboot again to the old kernel with NO_HZ and HRT enabled and try once more to repro, just to make sure I'm using a valid test.
ah, bugzilla is back. with the NO_HZ kernel my laptop eventually had the problem again, but after 7-10 s3/resume cycles.
Are there some more debugging options that I can turn on? Maybe timestamping dmesg? I'm willing to try some more stuff, and I'm sorry that I haven't been able to nail down a test case to 100% repro.
unrelated to this bug but FYI, the hot-keys only work once to put the machine to sleep, I suppose it could be my user-space.
(In reply to comment #12)
> ah, bugzilla is back. with the NO_HZ kernel my laptop eventually had the
> problem again, but after 7-10 s3/resume cycles.
> Are there some more debugging options that I can turn on?
Well, nothing obviously useful comes to mind.
> Maybe timestamping dmesg?
Yes, you can try that.
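With timestamps in dmesg, the stall should show up as an unusually large gap between two consecutive messages. A rough sketch of how to find such gaps, assuming the kernel was built with CONFIG_PRINTK_TIME (the thread does not name the option) and that lines start with the usual "[seconds.microseconds]" prefix. The function name largest_gaps is hypothetical, and whether the timestamp advances across the sleep itself depends on the clock the kernel uses, so treat the numbers as approximate:

```python
import re

# Matches a timestamped dmesg line such as "[  123.456789] some message".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gaps(dmesg_text, top=3):
    """Return up to `top` (gap_seconds, message_before, message_after)
    tuples, largest gap first. Lines without a timestamp are skipped."""
    entries = []
    for line in dmesg_text.splitlines():
        match = STAMP.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    gaps = [
        (later[0] - earlier[0], earlier[1], later[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]
```

The messages on either side of the largest gap are a reasonable first place to look for whatever the resume path was waiting on.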
> I'm willing to try some more stuff, and I'm sorry that I haven't been able
> to nail down a test case to 100% repro.
There's nothing to be sorry about. :-) Thanks for your patience.
> unrelated to this bug but FYI, the hot-keys only work once to put the machine
> to sleep, I suppose it could be my user-space.
Yes, it could be, but it may also be an ACPI problem. Please try to rule out user space and create a separate bugzilla entry if it turns out to be a kernel problem.
- Is this ever happening, when you do suspend/resume while AC is plugged in ?
- Did you unplug AC while the box was in suspend ?
- Can you please try "clocksource=acpi_pm" on the kernel command line ?
Please provide the output of /proc/acpi/processor/CPU0/power for AC plugged and unplugged. Also the output of /proc/timer_list for both states.
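The requested files can be captured once per power state and attached. A minimal sketch, assuming Python is available on the test machine; the function name snapshot and the output naming scheme are made up for illustration, and only the two /proc paths come from the request above:

```python
from pathlib import Path

# The two files requested in this comment.
DEFAULT_SOURCES = (
    "/proc/acpi/processor/CPU0/power",
    "/proc/timer_list",
)


def snapshot(label, sources=DEFAULT_SOURCES, out_dir="."):
    """Copy each readable source to <out_dir>/<basename>-<label>.txt.

    Returns the list of paths written. Sources that do not exist or
    cannot be read (for example without root) are skipped.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sources:
        src = Path(source)
        try:
            text = src.read_text()
        except OSError:
            continue
        target = out / f"{src.name}-{label}.txt"
        target.write_text(text)
        written.append(target)
    return written
```

Running snapshot("ac") with the power cord plugged in and snapshot("dc") after unplugging it yields one pair of files per state.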
> - Is this ever happening, when you do suspend/resume while AC is
> plugged in ?
well, honestly I'm not sure. I think it has happened in the past when
plugged into AC.
> - Did you unplug AC while the box was in suspend ?
in the interests of trying various stuff when attempting to repro, yes,
but I don't know if it is required as it is very difficult to repro and
I was trying all sorts of stuff.
It appears I just need to go in and out of suspend with wireless
enabled, until I hit the problem.
> - Can you please try "clocksource=acpi_pm" on the kernel command line
okay, that's up next, I'm assuming you mean with the NO_HZ kernel.
> Please provide the output of /proc/acpi/processor/CPU0/power for AC
> plugged and unplugged. Also the output of /proc/timer_list for both
I'll attach that in a minute.
> It appears I just need to go in and out of suspend with wireless
> enabled, until I hit the problem.
> > - Can you please try "clocksource=acpi_pm" on the kernel command line
> > ?
> okay, that's up next, I'm assuming you mean with the NO_HZ kernel.
okay, with clocksource=acpi_pm:
15 S3/resume cycles on AC
suspend, unplug AC
1 Successful wake
then repro again!
It's not immediately clear that pressing any keys helps it wake up faster, but it will eventually come back (the cursor flashes really, really slowly).
Created attachment 12969
CPU0/power on AC
Created attachment 12970
CPU0/power on DC
Created attachment 12971
timer_list on AC
Created attachment 12972
timer_list on DC
Hmm, that's odd:
active state: C2 (with AC)
active state: C3 (w/o AC)
How can this happen? This is a UP machine, and it reports C2 (or C3) as the active state while it is reading out /proc/acpi/...../CPU0/power
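To compare the two attached power dumps mechanically, the "active state:" line can be pulled out of each one. A small sketch, assuming the file contains a line of the form "active state: C2" as quoted above; the helper name active_state is hypothetical:

```python
import re


def active_state(power_text):
    """Return the C-state named on the 'active state:' line, or None
    if the text has no such line."""
    match = re.search(r"^active state:\s*(\S+)", power_text, re.MULTILINE)
    return match.group(1) if match else None
```

Applied to the AC and DC dumps, this would return "C2" and "C3" respectively for the values quoted in this comment.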
Jesse, is this problem still there with .23 or later ?
I haven't seen it lately. I'm sorry, I'm only able to test infrequently.
I tested 11 S3/wake cycles when on battery with wireless enabled, I'm using 2.6.24-rc4, with HRT and NO_HZ.
I believe it is fixed at this point. I have to say suspend/resume is much more usable now that I can reliably wake the T43 with the Fn key, the power button, or the lid switch.
Thanks for your attention. I'll have to leave it to someone else to mark this fixed because I didn't file the bug.
Thanks for testing, Jesse.