Bug 43441

Summary: [RADEON:KMS:RV370:RESUME] garbage on screen (console or X) after suspend-resume with radeon (KMS only)
Product: Drivers
Reporter: cyberbat (cyberbat)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: NEW
Severity: high
CC: adi, alan, alexdeucher, szg00000
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.11
Subsystem:
Regression: No
Bisected commit-id:
Attachments: full kernel 3.5rc2 log
.config
console with garbage
X with garbage

Description cyberbat 2012-06-15 16:14:58 UTC
Created attachment 73681
full kernel 3.5rc2 log

The screen on my Samsung R50 notebook with an AMD Mobility Radeon X300 (RV370 chipset) becomes completely unusable after resume (see the attached screenshots). I've tested kernels from 2.6.32 through 3.5rc2, all with the same result. It happens only with KMS turned on. I have tried suspending from X (Xfce) and from the console (using pm-suspend).


I repeatedly get the following errors in the kernel log after resume:

Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000000023d last fence id 0x000000000000023c)
Jun 15 18:56:39 localhost kernel: Failed to wait GUI idle while programming pipes. Bad things might happen.
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:392) RBBM_STATUS=0x80010140
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:411) RBBM_STATUS=0x80010140
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:423) RBBM_STATUS=0x00000140
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: GPU reset succeed
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: GPU reset succeed
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: f5eaf400 unpin not necessary
Jun 15 18:56:39 localhost kernel: [drm] radeon: 1 quad pipes, 1 Z pipes initialized.
Jun 15 18:56:39 localhost kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000D0040000).
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: WB enabled
Jun 15 18:56:39 localhost kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x00000000b0000000 and cpu addr 0xff8de000
Jun 15 18:56:39 localhost kernel: [drm] radeon: ring at 0x00000000B0001000
Jun 15 18:56:39 localhost kernel: [drm] ring test succeeded in 1 usecs
Jun 15 18:56:39 localhost kernel: [drm] ib test succeeded in 0 usecs
...
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000490 last fence id 0x000000000000033b)
Jun 15 18:58:32 localhost kernel: Failed to wait GUI idle while programming pipes. Bad things might happen.
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:392) RBBM_STATUS=0x80010140
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:411) RBBM_STATUS=0x80010140
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:423) RBBM_STATUS=0x00000140
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: GPU reset succeed
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: GPU reset succeed
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: f5eaf400 unpin not necessary
Jun 15 18:58:32 localhost kernel: [drm] radeon: 1 quad pipes, 1 Z pipes initialized.
Jun 15 18:58:32 localhost kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000D0040000).
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: WB enabled
Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x00000000b0000000 and cpu addr 0xff8de000
Jun 15 18:58:32 localhost kernel: [drm] radeon: ring at 0x00000000B0001000
Jun 15 18:58:32 localhost kernel: [drm] ring test succeeded in 1 usecs
Jun 15 18:58:32 localhost kernel: [drm] ib test succeeded in 0 usecs
Jun 15 19:00:46 localhost /usr/sbin/gpm[1110]: *** info [daemon/processrequest.c(42)]: 
Jun 15 19:00:46 localhost /usr/sbin/gpm[1110]: Request on 6 (console 1)
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 200579msec
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000000058f)
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
Jun 15 19:01:42 localhost kernel: [drm:radeon_cs_ib_chunk] *ERROR* Failed to get ib !
Jun 15 19:01:42 localhost kernel: Failed to wait GUI idle while programming pipes. Bad things might happen.
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:392) RBBM_STATUS=0x80010140
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:411) RBBM_STATUS=0x80010140
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: (r300_asic_reset:423) RBBM_STATUS=0x00000140
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: GPU reset succeed
Jun 15 19:01:42 localhost kernel: radeon 0000:01:00.0: GPU reset succeed
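
To compare lockups across kernels it may help to condense messages like the ones above into one record per event. The following is only a minimal sketch: it assumes the log is in the syslog format shown here (a 15-character timestamp at the start of each line), and the function name `summarize_lockups` is made up for illustration, not part of any tool.

```python
import re

# Patterns for the two radeon lockup messages quoted in the log above.
STALL_RE = re.compile(r"GPU lockup CP stall for more than (\d+)msec")
FENCE_RE = re.compile(
    r"GPU lockup \(waiting for 0x([0-9a-fA-F]+)"
    r"(?: last fence id 0x([0-9a-fA-F]+))?\)"
)


def summarize_lockups(lines):
    """Return one dict per lockup: timestamp, stall time, awaited and last fence."""
    events = []
    for line in lines:
        stall = STALL_RE.search(line)
        if stall:
            # Assumption: the syslog timestamp occupies the first 15 characters.
            events.append({
                "time": line[:15],
                "stall_ms": int(stall.group(1)),
                "waiting": None,
                "last": None,
            })
            continue
        fence = FENCE_RE.search(line)
        if fence and events:
            # Attach the fence ids to the most recent stall message.
            events[-1]["waiting"] = int(fence.group(1), 16)
            if fence.group(2) is not None:
                events[-1]["last"] = int(fence.group(2), 16)
    return events


# Two lines copied from the log above, used as sample input.
sample = [
    "Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: "
    "GPU lockup CP stall for more than 10000msec",
    "Jun 15 18:58:32 localhost kernel: radeon 0000:01:00.0: "
    "GPU lockup (waiting for 0x0000000000000490 last fence id 0x000000000000033b)",
]
for ev in summarize_lockups(sample):
    # Difference between the awaited fence and the last one signaled.
    behind = None if ev["last"] is None else ev["waiting"] - ev["last"]
    print(ev["time"], ev["stall_ms"], behind)
```

The fence difference is just the arithmetic gap between the two ids printed by the driver; it is shown here as a rough way to tell the entries apart, not as a diagnosis.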

I recognize that this is a really old notebook, but it has enough power for my tasks, so it would be good to be able to use KMS on it, because I lose a lot of X features without it.
Comment 1 cyberbat 2012-06-15 16:15:58 UTC
Created attachment 73691
.config
Comment 2 cyberbat 2012-06-15 16:16:58 UTC
Created attachment 73701
console with garbage
Comment 3 cyberbat 2012-06-15 16:18:19 UTC
Created attachment 73711
X with garbage

Xorg started with Xfce, with garbage on the screen. There is a Thunar window on the screen underneath the garbage.
Comment 4 Adrian Knoth 2013-09-16 14:22:56 UTC
I can report similar (same?) garbage with the following device:

01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RS482M [Mobility Radeon Xpress 200]


Kernel version is 3.11.

While UMS always used to work, KMS is causing trouble.


Note that the garbage goes away when I plug in the A/C power supply. In other words, it only happens when running on battery.
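
Since the corruption seems to depend on the power source, it may be useful to record whether the machine is on A/C each time the screen is checked. Below is a minimal sketch that reads the Linux power-supply class in sysfs; it assumes the adapter shows up under /sys/class/power_supply with a `type` of `Mains` and an `online` attribute, and the helper name `on_ac_power` is hypothetical.

```python
from pathlib import Path


def on_ac_power(base="/sys/class/power_supply"):
    """Return True if a mains supply is online, False if none is online,
    or None if no mains supply is listed under `base`."""
    found = None
    for supply in sorted(Path(base).glob("*")):
        type_file = supply / "type"
        online_file = supply / "online"
        if not (type_file.is_file() and online_file.is_file()):
            continue
        # Only mains adapters are of interest; batteries report type "Battery".
        if type_file.read_text().strip() != "Mains":
            continue
        if online_file.read_text().strip() == "1":
            return True
        found = False
    return found
```

Calling `on_ac_power()` before and after a suspend/resume cycle and noting the result next to each observation would make it easier to confirm the battery-only pattern.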

I've added two videos showing the corruption:

   http://adi.loris.tv/radeon-kms1.mp4

   http://adi.loris.tv/radeon-kms2.mp4


I guess it's PM related, since radeon with KMS sometimes fails to suspend/resume properly, but I haven't noticed a reproducible pattern. As a rough estimate, the first suspend/resume cycle works and the second almost always fails. If it doesn't, the third will.
Comment 6 Adrian Knoth 2013-09-16 16:17:29 UTC
The second commit was already part of 3.11, but I had to add the first one.

I'm afraid it doesn't change a thing: the machine still hangs when I try to suspend for the second time.

Garbage after first resume when on battery, no garbage when A/C connected.


Possibly unimportant observation: when I suspend a freshly booted system, that is, when I suspend at the kdm login prompt, resuming the machine doesn't bring the kdm screen back; I get a black screen with a working X cursor instead. Restarting kdm makes it work again. Of course, the next suspend then fails, too (just to be clear).
Comment 7 Adrian Knoth 2014-01-17 23:45:02 UTC
cyberbat: As mentioned in <https://bugzilla.kernel.org/show_bug.cgi?id=67121>, I'm not sure if I'm seeing the same or a different bug.

Could you try running some kind of cpuburn (there's a Debian package) to see if it fixes the output? If not, it's likely a different issue.

TIA