Bug 16273

Summary: suspend/resume failure with radeon kms driver
Product: Drivers
Reporter: Jose Marino (braket)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED OBSOLETE
Severity: normal
CC: akpm, alan, alexdeucher, bjaglin, bugs+kernel, harn-solo
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.35-rc3
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
- dmesg after suspend/resume test
- dmesg after suspend/resume test with radeon.agpmode=-1
- dmesg after suspend/resume test (2.6.35-rc4)
- dmesg after suspend/resume test (debug test)
- dmesg after suspend/resume (2.6.37-rc2+)

Description Jose Marino 2010-06-23 00:01:41 UTC
Created attachment 26905 [details]
dmesg after suspend/resume test

With the new radeon KMS driver enabled, the computer cannot complete a suspend/resume to RAM cycle. It seems to suspend fine but freezes on resume. The problem appears whether or not X is running.

Here's what I do to reproduce the freeze:
- Boot to console (no X) and log in as root
- echo mem > /sys/power/state
- The computer seems to suspend fine.
- On resume I get a black screen and the laptop seems frozen. It doesn't respond to any keyboard input. To reboot I must do a hard reset.

The laptop is a Dell Inspiron 600m with an ATI Mobility Radeon 9000 (RV250).
I've only tested this with kernels 2.6.34 and 2.6.35-rc3 and both are affected by this problem.
I should also mention that with the old radeon driver (no kms) the computer suspends and resumes fine.


The laptop survives suspend/resume when using /sys/power/pm_test.
I do:
echo core > /sys/power/pm_test 
echo mem > /sys/power/state

After the suspend/resume test I get a black screen again but this time the computer is alive. I can reboot manually from the command line. I attach the dmesg I captured right after the suspend/resume test.
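
For anyone repeating this test, the two commands can be wrapped in a small shell function. This is only a sketch: the function name pm_test_cycle and the POWER_DIR override are assumptions made for illustration (the override exists so the function can be dry-run against a scratch directory), and on a real machine it has to be run as root.

```shell
# Sketch: select a pm_test level, then trigger a suspend-to-RAM test cycle.
# pm_test_cycle and POWER_DIR are illustrative names, not part of the kernel.
pm_test_cycle() {
    level="$1"                              # e.g. "core", as used above
    power_dir="${POWER_DIR:-/sys/power}"    # real sysfs location by default

    [ -n "$level" ] || { echo "usage: pm_test_cycle LEVEL" >&2; return 2; }

    # Choose how far the suspend sequence goes before it is rolled back.
    echo "$level" > "$power_dir/pm_test" || return 1
    # Start the test suspend; with pm_test set, the kernel resumes by itself.
    echo mem > "$power_dir/state" || return 1
}
```

Example: pm_test_cycle core. Writing none to /sys/power/pm_test afterwards switches back to normal suspend behaviour.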

I noticed the line:
PM: Device 0000:00:00.0 failed to resume async: error -16

I tested without async suspend/resume (echo 0 > /sys/power/pm_async) and the error changes to:
PM: Device 0000:00:00.0 failed to resume: error -16
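
Error -16 is -EBUSY on Linux. To pull lines like these out of a saved dmesg capture, something along the following lines could be used. It is a sketch only: pm_resume_failures is a made-up helper name, and the pattern assumes the message format quoted above.

```shell
# Sketch: print "<device> <error code>" for every device that failed to resume.
# pm_resume_failures is a hypothetical helper, not an existing tool.
pm_resume_failures() {
    # $1: path to a saved dmesg capture
    # Matches both the async and the non-async form of the message.
    sed -n -E 's/.*PM: Device ([^ ]+) failed to resume( async)?: error (-?[0-9]+).*/\1 \3/p' "$1"
}
```

Example: dmesg > dmesg.log ; pm_resume_failures dmesg.log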
Comment 1 Andrew Morton 2010-06-23 00:06:03 UTC
Recategorised to DRI.
Comment 2 Alex Deucher 2010-06-23 17:19:23 UTC
This looks like a dupe of bug 16140.
Comment 3 Jose Marino 2010-06-24 00:01:29 UTC
Created attachment 26921 [details]
dmesg after suspend/resume test with radeon.agpmode=-1

I don't see the error messages from bug 16140 in my dmesg.

In any case, I tested with radeon.agpmode=-1 and my problem is still there: the computer freezes on resume with a black screen.

Also, booting with radeon.agpmode=-1 and running the echo core > /sys/power/pm_test test behaves as before: the system resumes from suspend but I get a black screen. The computer is still alive, so I can capture a dmesg and reboot.
I attach the dmesg.
Comment 4 Jose Marino 2010-07-05 18:34:43 UTC
Created attachment 27025 [details]
dmesg after suspend/resume test (2.6.35-rc4)

There has been some improvement with 2.6.35-rc4.
With rc4 the suspend/resume test works fine:
$ echo core > /sys/power/pm_test
$ echo mem > /sys/power/state

The laptop suspends for a few seconds and comes back fine. The display works and everything seems ok. I attach the dmesg I captured after this test.

However, the actual suspend/resume still doesn't work. The symptoms are the same as with rc3: the laptop locks up on resume.
Comment 5 Jose Marino 2010-07-16 22:01:43 UTC
Created attachment 27132 [details]
dmesg after suspend/resume test (debug test)

I was able to get more info about this bug.
I added an early return in the radeon resume routine with this patch, applied on top of f469461df6ff822f71b8737bda86eea20f16ff93:

--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -799,6 +799,10 @@ int radeon_resume_kms(struct drm_device *dev)
        radeon_pm_resume(rdev);
        radeon_restore_bios_scratch_regs(rdev);
 
+       printk(KERN_INFO "radeon: **TEST** resume early\n");
+       release_console_sem();
+       return -1;
+
        /* turn on display hw */
        list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
                drm_helper_connector_dpms(connector, DRM_MODE_DPMS_ON);


I booted this kernel and did a suspend/resume cycle, saved the dmesg and rebooted. After the resume the display was black but I was able to save the dmesg and start the reboot. The laptop locked up on reboot.

There are some new interesting error messages in the dmesg:
[drm:radeon_ring_write] *ERROR* radeon: writting more dword to ring than expected !
[drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xFFFFFFFF)
[drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
radeon 0000:01:00.0: failled initializing CP (-22).

I attach the full dmesg just in case.
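
The return codes in these messages are negative errno values; -22 is -EINVAL on Linux. To list the DRM error lines from a saved dmesg capture in a compact form, a filter like the one below may help. It is a sketch only: drm_errors is a hypothetical helper name, and the pattern assumes the "[drm:function] *ERROR* message" format seen above, so it will not pick up the plain "radeon 0000:01:00.0: ..." line.

```shell
# Sketch: print "<function>: <message>" for each DRM *ERROR* line in a log.
# drm_errors is a hypothetical helper, not an existing tool.
drm_errors() {
    # $1: path to a saved dmesg capture
    sed -n -E 's/.*\[drm:([A-Za-z0-9_]+)\] \*ERROR\* (.*)/\1: \2/p' "$1"
}
```

Example: drm_errors dmesg.log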
Comment 6 Jose Marino 2010-09-13 22:29:41 UTC
Update: I still see this problem with the latest kernel, v2.6.36-rc4.
Comment 7 Jose Marino 2010-11-21 03:14:33 UTC
Created attachment 37792 [details]
dmesg after suspend/resume (2.6.37-rc2+)

Current git kernel 2.6.37-rc2 (actually at b86db4744230c94e480de56f1b7f31117edbf193) still has this issue.
I got another dmesg after a suspend/resume cycle by using the same patch as in comment #5. There doesn't seem to be anything new but I'll attach it here just in case.
These are the steps I take:
- boot patched kernel (console only, no X)
- run this:
echo mem > /sys/power/state ; dmesg > dmesg.log ; sync ; sync ; reboot

The laptop survives the suspend/resume (with a black screen) and writes the dmesg to file but hangs on reboot.
Comment 8 Alan 2012-08-09 13:59:44 UTC
The driver has changed so much that this bug report is really obsolete. If you still see the problem with a modern kernel (3.2/4/5), feel free to update it.