Bug 19422 - [resume] one out of 30 suspend cycles locks the system
Status: CLOSED WILL_NOT_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: All Linux
Importance: P1 normal
Assignee: power-management_other
URL:
Keywords:
Depends on:
Blocks: 7216
Reported: 2010-10-01 11:17 UTC by tomas m
Modified: 2012-01-18 12:03 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.35
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg from an about to hang system (128.50 KB, text/plain)
2010-10-01 11:17 UTC, tomas m
lspci -vv (11.13 KB, text/plain)
2010-10-01 22:44 UTC, tomas m
xorg.log (26.97 KB, text/plain)
2010-10-01 22:45 UTC, tomas m
dmesg from an oops right after hibernate (suspend to ram) (118.33 KB, text/plain)
2010-10-19 11:53 UTC, tomas m

Description tomas m 2010-10-01 11:17:44 UTC
Created attachment 32192
dmesg from an about to hang system

im not sure if this is 1 out of 30, more, less, or what actually triggers this issue.

but with this laptop, every once in a while, after resuming (the system comes back), xorg freezes after several seconds.

this has been happening since the 2.6.20-era kernels, although i was never able to pin the issue down. it may also have been something else entirely back then, with this issue introduced in later versions.

ive been trying to collect data in the past to report this, but i was unable to do so because the lockup is so hard (alt-sysrq does not work once it locks).

now ive sped up the resume cycle by disabling some actions, and managed to grab a dmesg dump, which im attaching.

if i switch back to a vt before the hang, i can use the system (with partially working hardware, eg. wireless does not work).

the dmesg has a stack trace in it.

if there is any other info i can provide, please let me know.
Comment 1 Rafael J. Wysocki 2010-10-01 20:32:27 UTC
We have just merged a patch that may help your box resume correctly.  It is attached to bug #16396.
Comment 2 Rafael J. Wysocki 2010-10-01 20:35:17 UTC
To test it on top of 2.6.35 you need to apply attachment #27170 first and then attachment #31422.  Alternatively, you can wait for 2.6.36-rc7 and test that kernel when it's out.
Comment 3 Len Brown 2010-10-01 20:38:48 UTC
marking RESOLVED, as a patch to test is available
Comment 4 tomas m 2010-10-01 22:12:37 UTC
so none of you read who reported that bug?

thats a different issue.

that was a 100% reproducible bug.

this is a 1 in 20 case

having added nonvs does not fix it.
when this happens the screen actually turns on. the freeze occurs in X.
Comment 5 Rafael J. Wysocki 2010-10-01 22:27:58 UTC
I admit that Len was a bit too quick to mark this bug as resolved,
but your response is not really encouraging.

I don't really remember who reported what and why, I only look for
similarities.  When I saw the name of your machine in the log, I
thought of the other bug.

Anything which is 1 in 20 or so and has never worked is probably beyond our
debugging capabilities.  Since you're saying it was happening with 2.6.20
and now happens with 2.6.35, it doesn't seem to depend on what graphics
driver is used (I guess you're using KMS graphics drivers).

Have you changed user space (most importantly X) since when it started
to happen?
Comment 6 Rafael J. Wysocki 2010-10-01 22:32:23 UTC
BTW, please attach a dmesg output from that box, preferably also
the output of "lspci -vv" and the Xorg server's log.
Comment 7 tomas m 2010-10-01 22:42:25 UTC
Rafael, sorry if my post sounded harsh. it wasnt meant to, i just reread it and realized it could be taken badly.

concerning the issue. i know its quite hard to debug, and i havent posted a report before since i had nothing to show till now.

when i said it happened with 2.6.20 and 2.6.35, i meant from 2.6.20ish through all kernel upgrades up to 2.6.35 (2.6.36-rc too).

all i can say is the wireless adapter goes nuts when this happens (which prompts me to switch to a vt and quickly copy the dmesg).

of course im using KMS, but this used to happen with UMS too, so im almost sure this is not a video issue. i suspect usb or wireless instead, although i already tried removing the usb modules (and any module that depends on them) before suspending and it does not help.

for userspace....what X was available with the 2.6.20 kernels? 1.5? 

im now at 1.8RC2 and downloading 1.9 as we speak.
Comment 8 tomas m 2010-10-01 22:44:16 UTC
Created attachment 32332
lspci -vv

there already is a dmesg from the system with a kernel stack trace in it.

if you need another one, please let me know
Comment 9 tomas m 2010-10-01 22:45:24 UTC
Created attachment 32342
xorg.log
Comment 10 tomas m 2010-10-01 22:46:46 UTC
and ive been using xorg 1.9 already
Comment 11 Rafael J. Wysocki 2010-10-02 21:49:57 UTC
OK

Well, the box is Intel-based so I'm not really expecting problems with
the graphics.

This wireless thing appears to be on USB.  Is this correct?  What happens
if you disconnect it before suspending?
Comment 12 tomas m 2010-10-03 02:14:59 UTC
disconnecting it isnt an option: it is a usb device, but it is plugged in inside the notebook the same way a so-dimm memory module is. i did try removing the kernel modules for wireless and usb, but it does not help.
Comment 13 tomas m 2010-10-19 11:53:06 UTC
Created attachment 34122
dmesg from an oops right after hibernate (suspend to ram)

im not sure if this is actually useful, but im attaching a new dmesg. it might shed some light on the issue.

this line seems suspicious right before the stack trace.

rt73usb 1-2:1.0: no reset_resume for driver rt73usb?
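
Warnings of the kind quoted above can be pulled out of a saved dmesg dump mechanically. What follows is only a minimal Python sketch and not part of the original report: the function name find_missing_reset_resume and the regular expression are assumptions made for illustration, derived solely from the single log line quoted in this comment, so other kernels may word the message differently.

```python
import re

# Assumed message shape, taken from the one line quoted in this report:
#   rt73usb 1-2:1.0: no reset_resume for driver rt73usb?
RESET_RESUME_RE = re.compile(
    r"(?P<device>\S+): no reset_resume for driver (?P<driver>[\w-]+)\?"
)


def find_missing_reset_resume(dmesg_text):
    """Return (device, driver) pairs for each 'no reset_resume' warning."""
    hits = []
    for line in dmesg_text.splitlines():
        match = RESET_RESUME_RE.search(line)
        if match:
            hits.append((match.group("device"), match.group("driver")))
    return hits


if __name__ == "__main__":
    sample = "rt73usb 1-2:1.0: no reset_resume for driver rt73usb?"
    for device, driver in find_missing_reset_resume(sample):
        print(f"device {device}: driver {driver} has no reset_resume")
```

In practice the text of a dump such as the attached dmesg would be read from a file and passed in. The sketch only lists drivers that log this warning; it does not show that any of them causes the hang.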
Comment 14 Zhang Rui 2012-01-18 02:19:22 UTC
It's great that kernel bugzilla is back.

can you please verify if the problem still exists in the latest upstream
kernel?
Comment 15 tomas m 2012-01-18 12:03:19 UTC
the SDHCI device is crap on this system.

blacklisting the kernel module fixes this, and i have no plans to try to fix the sdhci reader (which breaks windows suspend too, afaik).

closing
