Bug 12031

Summary: DRM enabled kernel hangs hard on resume (Intel graphics)
Product: Drivers
Reporter: Rafael J. Wysocki (rjw)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED CODE_FIX
Severity: normal
CC: airlied, axboe, bug-track, jbarnes, jobi, keithp, maximlevitsky
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.28-rc3
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216, 11808
Attachments: Patch fixing the issue on Toshiba Portege R500
             Script for automated stress-testing of suspend/resume with the help of RTC wake alarm

Description Rafael J. Wysocki 2008-11-14 16:20:10 UTC
Subject    : Re: Suspend to disk broken in latest 2.6.28-rc3
Submitter  : Jens Axboe <jens.axboe@oracle.com>
Date       : 2008-11-12 18:42
References : http://marc.info/?l=linux-kernel&m=122651551216820&w=4
Handled-By : Jesse Barnes <jbarnes@virtuousgeek.org>

This entry is being used for tracking a regression from 2.6.27.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2008-11-24 05:15:55 UTC
I can reproduce this on Toshiba Portege R500.

The behavior is the same as on Jens' box:

- suspend to RAM and resume succeed 100% of the time from the console
- resume fails 90% of the time from under X
- resume from hibernation is also broken if hibernated from under X
- suspend to RAM / resume works with 2.6.27.7

The box is all Intel, so suspend/resume ought to work on it.  I suspect there's a problem with DRM.
Comment 2 Maxim Levitsky 2008-11-24 09:14:58 UTC
Does this look similar:

http://sourceforge.net/mailarchive/message.php?msg_name=492A3485.5030407%40gmail.com

Do you run compiz?
Comment 3 Jesse Barnes 2008-11-24 10:56:05 UTC
Yeah, I wonder if this is a DUP of the irq install/uninstall problem.  If so, VT switch would probably also hang.
Comment 4 Rafael J. Wysocki 2008-11-24 12:57:54 UTC
No, VT switches don't hang on this box.
Comment 5 Rafael J. Wysocki 2008-11-24 12:59:20 UTC
Created attachment 19007 [details]
Patch fixing the issue on Toshiba Portege R500

This patch from Keith Packard fixes the issue on Toshiba Portege R500.
Comment 6 Rafael J. Wysocki 2008-11-24 13:00:42 UTC
Patch : http://bugzilla.kernel.org/attachment.cgi?id=19007&action=view
Comment 7 Rafael J. Wysocki 2008-11-24 13:01:19 UTC
Notify-Also : Maxim Levitsky <maximlevitsky@gmail.com>
Comment 8 Eric Anholt 2008-11-25 00:58:41 UTC
There's also a fix for resume with 915-class and original G[M]965 queued in for-airlied at the moment.
Comment 9 Rafael J. Wysocki 2008-11-26 06:20:02 UTC
There still is a problem on Toshiba R500 with the patch from comment #6 applied.

Namely, I tested the patch with a kernel having CONFIG_DEBUG_PAGEALLOC=y and CONFIG_NR_CPUS=128.  After I switched the settings to CONFIG_DEBUG_PAGEALLOC=n and CONFIG_NR_CPUS=2, resume from suspend to RAM occasionally fails (it is reproducible with the wakealarm suspend stress test script I'm going to attach in the next comment).  However, it only fails if s2ram (the binary) is used for suspending; if 'echo mem > /sys/power/state' is used, it works 100% of the time (so far, it hasn't failed for me).

Now, the differences are that (1) with CONFIG_DEBUG_PAGEALLOC=n on x86_64 the kernel's direct memory mapping is set up using 2M pages instead of 4K pages (I have no idea what the impact of this may be) and (2) s2ram does an additional VT switch right prior to suspend and right after the resume, so presumably there still is a problem with VT switching somewhere.
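
One way to check difference (2) in isolation is to keep the plain 'echo mem > /sys/power/state' path and add only the VT switch around it. The following is a rough sketch of that idea, not what s2ram itself does internally: it assumes the fgconsole and chvt utilities from the kbd package are installed, that it runs as root, and that VT 1 is a text console. The function names and the spare_vt parameter are made up for illustration.

```python
import subprocess


def run_command(cmd):
    """Run an external command and return its standard output as text."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def write_sysfs(path, value):
    """Write a single value to a sysfs attribute."""
    with open(path, "w") as f:
        f.write(value)


def suspend_with_vt_switch(spare_vt=1, run=run_command, write=write_sysfs):
    """Switch to a text VT, suspend to RAM, then switch back after resume.

    spare_vt is an assumption: any VT that holds a text console will do.
    """
    original_vt = int(run(["fgconsole"]).strip())  # VT that is active now
    run(["chvt", str(spare_vt)])                   # leave X before suspending
    try:
        write("/sys/power/state", "mem")           # returns after resume
    finally:
        run(["chvt", str(original_vt)])            # go back to the original VT
    return original_vt
```

If resume starts failing with this wrapper but not with the bare write to /sys/power/state, that would point at the VT switch rather than at anything else s2ram does.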
Comment 10 Rafael J. Wysocki 2008-11-26 06:40:05 UTC
Created attachment 19033 [details]
Script for automated stress-testing of suspend/resume with the help of RTC wake alarm
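
The attachment itself is not reproduced in this entry. For readers without access to it, the general technique can be sketched as follows; this is an independent illustration, not the attached script. It assumes the RTC is exposed as /sys/class/rtc/rtc0, that the kernel accepts the relative "+N" form of the wakealarm value, and that it runs as root. The cycle count, wake delay and pause are arbitrary example values.

```python
import time

WAKEALARM = "/sys/class/rtc/rtc0/wakealarm"  # assumed RTC device
POWER_STATE = "/sys/power/state"


def write_sysfs(path, value):
    """Write a single value to a sysfs attribute."""
    with open(path, "w") as f:
        f.write(value)


def suspend_cycle(wake_after, write=write_sysfs):
    """Arm the RTC wake alarm, then suspend to RAM."""
    write(WAKEALARM, "0")                 # clear any alarm left over
    write(WAKEALARM, "+%d" % wake_after)  # wake wake_after seconds from now
    write(POWER_STATE, "mem")             # this write returns after resume


def stress(cycles, wake_after=20, pause=5, write=write_sysfs, sleep=time.sleep):
    """Run repeated suspend/resume cycles; a hang means resume failed."""
    for i in range(1, cycles + 1):
        suspend_cycle(wake_after, write)
        print("cycle %d of %d resumed" % (i, cycles))
        sleep(pause)                      # let the system settle before the next cycle
    return cycles
```

Calling stress(100) as root would attempt 100 cycles. A failed resume shows up as a hard hang, so the last cycle number printed tells how many cycles succeeded.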
Comment 11 Andreas Mohr 2008-11-26 09:45:25 UTC
Re: comment #9: VT switching still has weird behaviour indeed on fixed -rc6 (A110L), in that sometimes switching to tty1 from x.org will not be successful and X will reappear (after screen blinking!).  Switching again then works [usually?].

Yup, just tried it again for good measure, and yes indeed, it failed again and needed a second keypress attempt to work.
Comment 12 Rafael J. Wysocki 2008-11-30 14:42:08 UTC
I have just tried Linus' latest tree, which contains all of the recent DRM fixes, and unfortunately it hasn't fixed the resume issue, which is still present.

I suspect it is related to the VT switching.
Comment 13 Rafael J. Wysocki 2008-12-07 14:00:16 UTC
For me, the problem turned out to be related to interrupts (bug #12121); the $subject problem appears to have been fixed.

For this reason, I'll close this bug now.  Andreas, please see if bug #11947 matches your symptoms and open a new bug entry for your issue if it doesn't.