Bug 60381

Summary: AMD Radeon 7770 GHz Edition crash with DPM active
Product: Drivers
Reporter: rafael castillo (jrch2k10)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED CODE_FIX
Severity: blocking
CC: alexdeucher, arek.rusi
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.10.0-next-20130703
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmidecode
glxinfo
lspci
vbios.rom
dmesg output - kernel drm-fixes-3.11
dmesg output
debugging output
dmesg with patch applied
dmesg drm-fixes-3.11+patch
latest drm-fixes-3.11 dmesg
boot with radeon.aspm=1
boot with radeon.aspm=0
Dmesg crash output for 3.11-rc2

Description rafael castillo 2013-07-04 00:37:29 UTC
Created attachment 106731 [details]
dmidecode

With one monitor it works fine up to the X session, but UVD freezes the system and the card never reclocks.

With multiple monitors the system freezes the instant KMS tries to bring up DVI + HDMI.

With static PM it crashes if you try to change the profile.

If you need any help debugging this issue, please provide instructions, since the hangs disable everything, including ssh.

System info is attached.
Comment 1 rafael castillo 2013-07-04 00:38:04 UTC
Created attachment 106741 [details]
glxinfo
Comment 2 rafael castillo 2013-07-04 00:38:24 UTC
Created attachment 106751 [details]
lspci
Comment 3 Alex Deucher 2013-07-04 01:01:07 UTC
Please attach your dmesg output and a copy of your vbios.  To get a copy of your vbios:

(as root)
(use lspci to get the bus id)
cd /sys/bus/pci/devices/<pci bus id>
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom
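
The steps above can be wrapped in a small helper. This is only a sketch: `dump_vbios` is a hypothetical function name, it takes the device directory and output path as arguments, and it must still be run as root.

```shell
#!/bin/sh
# Sketch: dump a PCI device's ROM through sysfs, following the steps above.
# dump_vbios is a hypothetical helper name, not an existing tool.
dump_vbios() {
    dev_dir=$1   # e.g. /sys/bus/pci/devices/<pci bus id>
    out=$2       # e.g. /tmp/vbios.rom
    [ -e "$dev_dir/rom" ] || { echo "no rom file in $dev_dir" >&2; return 1; }
    echo 1 > "$dev_dir/rom" || return 1   # enable reading the ROM
    cat "$dev_dir/rom" > "$out"; status=$?
    echo 0 > "$dev_dir/rom"               # disable it again, even if the copy failed
    return $status
}
```

It would be called as `dump_vbios /sys/bus/pci/devices/0000:01:00.0 /tmp/vbios.rom`, where `0000:01:00.0` is only a placeholder for the bus id that lspci reports.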
Comment 4 rafael castillo 2013-07-04 01:11:51 UTC
Thanks for your response. I'm attaching the vbios, but I'm unable to get dmesg output at the moment of the crash, since it happens at KMS load: the machine hangs (keyboard LEDs blinking) and even ssh stops.

If I boot without HDMI (everything is clean up to that point) and then plug in HDMI, everything freezes again the same way (UVD or reclocking trigger the same issue).
Comment 5 rafael castillo 2013-07-04 01:12:10 UTC
Created attachment 106781 [details]
vbios.rom
Comment 6 Michel Dänzer 2013-07-04 08:20:06 UTC
(In reply to comment #0)
> with static PM it crashes if you try to change the profile 

FWIW, that might work better if you explicitly set the low profile first.
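
A minimal sketch of that workaround, assuming the profile-based PM sysfs file `power_profile` lives in the card's device directory and that `power_method` is already set to `profile` (neither path is quoted in this report). `set_profile_via_low` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: switch the static PM profile by going through "low" first.
# Assumption: power_profile sits under /sys/class/drm/card0/device.
set_profile_via_low() {
    target=$1                                   # e.g. mid or high
    dev_dir=${2:-/sys/class/drm/card0/device}
    echo low > "$dev_dir/power_profile" || return 1   # explicit low first
    echo "$target" > "$dev_dir/power_profile"         # then the wanted profile
}
```

Run as root, for example `set_profile_via_low high`.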
Comment 7 rafael castillo 2013-07-07 01:37:42 UTC
Hi Alex, I see you posted some new patches. Are there any of those I should test to verify the issue in this bug?

No pressure, just curiosity, and as always many thanks for your hard work.
Comment 8 rafael castillo 2013-07-09 02:07:51 UTC
Well, I tested the latest changes in your drm-3.11-next and the issue is still there. Just reporting.
Comment 9 Arek Ruśniak 2013-07-12 16:14:00 UTC
Hi, 

Rafael, if you plug in only DVI, does DPM change power levels?

Alex, I'm not sure if I should file a new bug (I have an HD7770 GHz Ed. from Asus + only one DVI display). In my case I've got power_level = 0 (low) all the time. It reclocks only when I try UVD, but that kills my machine. Forcing power levels works only for the "low" level; for "auto" (default) or "high" I get: "bash: echo: write error: Invalid argument". It doesn't matter if the power state is set to 'performance' or 'balanced'.

Static PM works OK (UVD too), with something similar to Michel's tip of course.

kernel: drm-next-3.11 or linux-next-20130712
Comment 10 rafael castillo 2013-07-13 01:40:52 UTC
pretty much the same here
Comment 11 Michel Dänzer 2013-07-16 10:22:02 UTC
FWIW, I'm basically seeing the same problems with my 7770 card.
Comment 12 rafael castillo 2013-07-16 14:25:13 UTC
Well, I tried the drm-fixes-3.11 branch very quickly late last night. I got to KDE using dpm=1 and UVD seemed to work, but when I opened xonotic the GPU hard reset and killed the monitors. It's getting closer, though ;)

Today I'll try to debug this issue with xonotic over ssh from my tablet, since I noticed you made the output more verbose. I was too tired last night.
Comment 13 Arek Ruśniak 2013-07-16 16:16:21 UTC
I can confirm this for 3D apps (lightsmark or unvanquished), but UVD still doesn't work for me.
Comment 14 Arek Ruśniak 2013-07-16 20:42:27 UTC
Created attachment 106895 [details]
dmesg output - kernel drm-fixes-3.11

I haven't tried 3.11-rc1, but the radeon code should be the same, I hope.
Comment 15 rafael castillo 2013-07-16 23:44:47 UTC
I'm adding my dmesg for drm-fixes-3.11 just in case.
Comment 16 rafael castillo 2013-07-16 23:45:48 UTC
Created attachment 106897 [details]
dmesg output
Comment 17 Alex Deucher 2013-07-17 01:42:25 UTC
Created attachment 106898 [details]
debugging output

Can you attach a dmesg output with dpm enabled with this patch?
Comment 18 rafael castillo 2013-07-17 02:59:42 UTC
I can't reach X with that patch applied. I'll try later with gcc 4.7 just to be sure.
Comment 19 rafael castillo 2013-07-17 04:06:10 UTC
Well, compiling with the gcc 4.7 series I can reach X. I guess I'll have another fun thing to debug later. Attached dmesg.
Comment 20 rafael castillo 2013-07-17 04:06:38 UTC
Created attachment 106900 [details]
dmesg with patch applied
Comment 21 Arek Ruśniak 2013-07-17 11:58:07 UTC
Created attachment 106902 [details]
dmesg drm-fixes-3.11+patch

It looks almost the same as before. But you are the dev here :) I'll try UVD with this patch.
Comment 22 rafael castillo 2013-07-18 01:44:47 UTC
Created attachment 106917 [details]
latest drm-fixes-3.11 dmesg
Comment 23 rafael castillo 2013-07-18 01:46:19 UTC
OK, xonotic still crashes and plays hell with the GPU, but now it can resume after the failure, and for things like GPU acceleration in the browser or normal kwin usage it seems stable enough.

It seems only real 3D apps like games trigger the crash.
Comment 24 Alex Deucher 2013-07-18 13:18:29 UTC
You might try the latest drm-fixes branch if you were using gcc 4.8.  See:
https://bugs.freedesktop.org/show_bug.cgi?id=66932
Comment 25 rafael castillo 2013-07-18 15:02:11 UTC
Well, I tried drm-fixes-3.11 with both gcc 4.8.1 and 4.7.2, and both reset the GPU on reclock. I attached the dmesg.

I meant that with your recent changes the GPU recovers and allows you to close the game or, in the case of Chrome, falls back to CPU rendering instead of hard-locking the system as before. So it got better, but reclocking still fails.

These Cape Verde XT chips seem to be a really problematic generation, or maybe it's that this chip comes overclocked from the factory.
Comment 26 Arek Ruśniak 2013-07-18 22:03:27 UTC
Created attachment 106938 [details]
boot with radeon.aspm=1

No change with 3d apps.
Comment 27 Arek Ruśniak 2013-07-18 22:13:55 UTC
Created attachment 106939 [details]
boot with radeon.aspm=0

aspm=0 didn't help; 3D apps still hang my PC.
But UVD is finally working, and it doesn't matter whether aspm is on or off.
Comment 28 rafael castillo 2013-07-23 01:31:43 UTC
Created attachment 106992 [details]
Dmesg crash output for 3.11-rc2
Comment 29 rafael castillo 2013-07-23 01:32:27 UTC
Posted an updated crash dmesg with kernel 3.11-rc2 in case it helps.
Comment 30 rafael castillo 2013-07-24 01:51:03 UTC
OK, I tried the latest patches in drm-fixes and the crashes seem to have stopped, but I can't get the GPU to reclock.

I tried setting /sys/class/drm/card0/device/power_dpm_state to performance alone and it never scales up from state 0.

If I try to force it with /sys/class/drm/card0/device/power_dpm_force_performance_level, it only accepts low.

auto or high returns
bash: echo: write error: Invalid argument
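
The check described above can be scripted roughly as follows. This is a sketch only: `try_dpm_levels` is a hypothetical helper name, the sysfs path is the one quoted in this comment, and it needs root.

```shell
#!/bin/sh
# Sketch: write each forced performance level and report whether the
# driver accepted it. try_dpm_levels is a hypothetical helper name.
try_dpm_levels() {
    dev_dir=${1:-/sys/class/drm/card0/device}
    ctl="$dev_dir/power_dpm_force_performance_level"
    # "auto" goes last so the card is left in automatic mode if it is accepted
    for level in low high auto; do
        if echo "$level" 2>/dev/null > "$ctl"; then
            echo "$level: accepted"
        else
            echo "$level: rejected"
        fi
    done
}
```

On the affected kernel the expected result, going by the report above, would be that only `low` is accepted.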
Comment 31 rafael castillo 2013-07-24 01:52:01 UTC
UVD reclocks fine, but the desktop flickers when it does.
Comment 32 Alex Deucher 2013-07-24 02:17:25 UTC
The dynamic re-clocking doesn't currently work reliably on SI ASICs.
Comment 33 rafael castillo 2013-07-24 02:50:58 UTC
Well, step by step it's getting better. I'm happy enough to have a stable desktop now, and since my card renders KDE like a monster, reclocking is not uber important for me right now.

Thanks for an awesome job ;)
Comment 34 rafael castillo 2013-07-30 01:20:31 UTC
Tested with today's drm-fixes patches and it's reclocking like a boss: xonotic went from 30 FPS to a massive 190 FPS in ultimate at 1366x768. I read that you need some fixes for other parts of the ASIC later, so it's up to you whether you wish to close the bug report.

Again, a hundred bazillion thanks, this is just awesome now.
Comment 35 Arek Ruśniak 2013-07-30 07:53:26 UTC
No pain, no gain. Now everything works fast as hell.
Even UVD is flicker-free now. Thanks Alex, best regards to you and the radeon team.
Comment 36 Alex Deucher 2013-09-05 20:36:08 UTC
I guess this bug can be closed now?
Comment 37 rafael castillo 2013-09-05 21:04:56 UTC
I guess so. The only issue I've found after this is that KMS hangs if you compile the kernel with radeon KMS as Y instead of M.

I can't find out why, since it's too early in boot to see anything.
Comment 38 Alex Deucher 2013-09-05 21:11:28 UTC
(In reply to rafael castillo from comment #37)
> i guess yes, the only issue i find after this, is that KMS hang if you
> compile the kernel with radeon kms with Y instead of M
> 
> i can't findout why since is too early to see anything

If you build the driver into the kernel, you also need to build the ucode into the kernel.  I suspect the hang is due to missing ucode in the kernel image.
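
One rough way to check for this situation is to inspect the kernel .config before building. The sketch below assumes the ucode is embedded through `CONFIG_EXTRA_FIRMWARE` and that the radeon firmware files are listed with a `radeon/` prefix; `check_radeon_fw` is a hypothetical helper name, and it does not verify that the right files for a given chip are listed.

```shell
#!/bin/sh
# Sketch: warn when radeon is built in (=y) but no radeon firmware is
# listed for embedding. check_radeon_fw is a hypothetical helper name.
check_radeon_fw() {
    cfg=${1:-.config}
    if grep -q '^CONFIG_DRM_RADEON=y' "$cfg"; then
        # Assumption: embedded firmware entries look like "radeon/<file>"
        if grep -q '^CONFIG_EXTRA_FIRMWARE=".*radeon/' "$cfg"; then
            echo "radeon built in, radeon firmware listed"
        else
            echo "radeon built in, but no radeon firmware in CONFIG_EXTRA_FIRMWARE"
            return 1
        fi
    else
        echo "radeon not built in"
    fi
}
```

Running `check_radeon_fw .config` in the kernel source tree returns non-zero only in the built-in-without-firmware case.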
Comment 39 rafael castillo 2013-09-06 01:17:01 UTC
Yep, makes sense. I thought I did, but it's very probable I'm missing a step or two in the process. I tried it just because I wanted to see if KMS could start earlier in the boot process, since my PC with systemd boots too fast and I can't even see kmscon kicking in, because once the module loads kdm is already there. Anyway, as far as this bug is concerned all is peachy, and since I'm using drm-3.12-next it got even better in some spots.
Comment 40 rafael castillo 2013-09-06 01:18:32 UTC
Many thanks for your time and a nice piece of awesome work.