Bug 71891 - 3.13 fails to boot with the radeon module
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 high
Assignee: drivers_video-dri
Duplicates: 73111
 
Reported: 2014-03-11 14:14 UTC by sdh
Modified: 2020-06-08 10:26 UTC
CC List: 12 users

Kernel Version: 3.13.6-1-ARCH
Regression: No

Attachments
Possible fix. (1.01 KB, patch)
2014-04-10 14:12 UTC, Christian König
Possible fix v2. (1.47 KB, patch)
2014-04-24 08:58 UTC, Christian König

Description sdh 2014-03-11 14:14:58 UTC
$  lspci -k | grep -A3 VGA 
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV730/M96 [Mobility Radeon HD 4650/5165]
        Subsystem: Dell Device 0447
        Kernel driver in use: fglrx_pci
        Kernel modules: radeon, fglrx

Dell Inspiron N5010 (BIOS)

Legacy Catalyst works properly. Using the open source radeon driver either boots to a tty login with the errors below, or boots to a locked-up black screen.

drm syslog:
[drm] Initialized drm 1.1.0 20060810
[drm] radeon kernel modesetting enabled.
fb: conflicting fb hw usage radeondrmfb vs VESA VGA - removing generic driver
[drm] initializing kernel modesetting (RV730 0x1002:0x9480 0x1028:0x0447).
[drm] register mmio base: 0xFBE20000
[drm] register mmio size: 65536
[drm] Detected VRAM RAM=1024M, BAR=256M
[drm] RAM width 128bits DDR
[drm] radeon: 1024M of VRAM memory ready
[drm] radeon: 1024M of GTT memory ready.
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] Loading RV730 Microcode
[drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 1 usecs
[drm] ring test on 3 succeeded in 1 usecs
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[drm:uvd_v1_0_start] *ERROR* UVD not responding, giving up!!!
[drm:rv770_startup] *ERROR* radeon: failed initializing UVD (-1).
[drm] Enabling audio 0 support
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] radeon atom DIG backlight initialized
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   VGA-1
[drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm] Connector 1:
[drm]   HDMI-A-1
[drm]   HPD1
[drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 2:
[drm]   LVDS-1
[drm]   DDC: 0x7f68 0x7f68 0x7f6c 0x7f6c 0x7f70 0x7f70 0x7f74 0x7f74
[drm]   Encoders:
[drm]     LCD1: INTERNAL_UNIPHY2
[drm] Internal thermal controller with fan control
[drm] radeon: dpm initialized
[drm] fb mappable at 0xC045E000
[drm] vram apper at 0xC0000000
[drm] size 4325376
[drm] fb depth is 24
[drm]    pitch is 5632
fbcon: radeondrmfb (fb0) is primary device
[drm:rv770_dpm_set_power_state] *ERROR* rv770_restrict_performance_levels_before_switch failed
radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[drm] Initialized radeon 2.36.0 20080528 for 0000:01:00.0 on minor 0
Comment 1 Alex Deucher 2014-03-11 14:19:48 UTC
Is this a regression?  If so when did it last work?  Does booting with radeon.dpm=0 on the kernel command line in grub help?
Comment 2 sdh 2014-03-11 14:53:54 UTC
Booting with the 3.12 kernel gives no such error. With 3.13, booting from the Arch install ISO gives the above errors.

The only difference with radeon.dpm=0 is that the line "rv770_restrict_performance_levels_before_switch failed" disappears.

I suppose this is related to the issue I faced with 3.10, where I could successfully boot using an install ISO, but installing xorg + ati-dri and rebooting led to a locked-up black screen. In 3.13, it occurs during the initial boot itself.
Comment 3 Alex Deucher 2014-03-11 15:46:24 UTC
If 3.12 works, can you use git to bisect?
Comment 4 sdh 2014-03-11 16:22:20 UTC
Have never done it before. Will attempt to do it when I have some free time and report back :)
Comment 5 sdh 2014-04-06 15:22:20 UTC
Finally got around to doing this. And the result:

ec5891fbe1b078b191b25a13a2cc40b58fb7a693 is the first bad commit
commit ec5891fbe1b078b191b25a13a2cc40b58fb7a693
Author: Christian König <deathsimple@vodafone.de>
Date:   Mon Apr 8 12:41:36 2013 +0200

    drm/radeon: init UVD clocks to sane defaults

    Just until we get proper DPM for that.

    Signed-off-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 e76a3b2d04c78c73e66b5882155dbf1a26651282 900d340c7fc5a2952f1c4e38a17062e466a0b70a M drivers

So this has been present since 3.10, but was triggered only after installing the radeon driver (xf86-video-ati in Arch Linux). From 3.13 onwards, this is always triggered. But that may only be due to config changes in the Arch installer.

This is a one line edit, which later was shifted to a different function. Is enough to resolve the bug (which is still present in the latest kernel tested until v3.14-7247-gcd6362b).

Thanks
Comment 6 sdh 2014-04-06 15:35:09 UTC
s/Is enough/Is it enough/
Sorry for the typo, meant it as a question.
Comment 7 Christian König 2014-04-06 18:23:28 UTC
Well, have you updated the firmware as well while upgrading to 3.13?

The driver now needs both new UVD firmware and updated RLC firmware present.
Comment 8 sdh 2014-04-06 18:50:42 UTC
I suppose you mean these firmwares: http://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/tree/radeon

The Arch Linux package is built from the latest commit, and is installed on my laptop.

% modinfo radeon | grep firmware | grep uvd
firmware:       radeon/BONAIRE_uvd.bin
firmware:       radeon/TAHITI_uvd.bin
firmware:       radeon/SUMO_uvd.bin
firmware:       radeon/CYPRESS_uvd.bin
firmware:       radeon/RV710_uvd.bin

% modinfo radeon | grep firmware | grep  rlc
firmware:       radeon/SUMO_rlc.bin
firmware:       radeon/CYPRESS_rlc.bin
firmware:       radeon/JUNIPER_rlc.bin
firmware:       radeon/REDWOOD_rlc.bin
firmware:       radeon/CEDAR_rlc.bin
firmware:       radeon/R700_rlc.bin
firmware:       radeon/R600_rlc.bin
firmware:       radeon/ARUBA_rlc.bin
firmware:       radeon/CAYMAN_rlc.bin
firmware:       radeon/BTC_rlc.bin
firmware:       radeon/HAINAN_rlc.bin
firmware:       radeon/OLAND_rlc.bin
firmware:       radeon/VERDE_rlc.bin
firmware:       radeon/PITCAIRN_rlc.bin
firmware:       radeon/TAHITI_rlc.bin
firmware:       radeon/KABINI_rlc.bin
firmware:       radeon/KAVERI_rlc.bin
firmware:       radeon/HAWAII_rlc.bin
firmware:       radeon/BONAIRE_rlc.bin
Comment 9 sdh 2014-04-10 09:57:21 UTC
Hi,

Any update on this? Is there any other information required to fix the bug?
Comment 10 Christian König 2014-04-10 14:12:48 UTC
Created attachment 131871
Possible fix.

Please try the attached patch.
Comment 11 sdh 2014-04-12 08:59:42 UTC
Nopes. It makes the UVD errors go away, but the boot still gets stuck at the same point. Everything hangs up, requiring me to do a hard reboot.

Weirdly, I could not find any errors in either syslog or Xorg, which makes me feel I'm doing something wrong on my side (but I compiled it exactly like in the official repo, with the additional patch).
Comment 12 Christian König 2014-04-12 09:25:49 UTC
(In reply to sdh from comment #11)
> Nopes. It makes the UVD errors go away, but the boot still gets stuck at the
> same point. Everything hangs up, requiring me to do a hard reboot.

Well, the fact that the UVD errors go away is already a rather interesting result. Please attach a new dmesg log if possible.

> Weirdly, I could not find any errors in either syslog or Xorg, which makes
> me feel I'm doing something wrong on my side (but I compiled it exactly like
> in the official repo, with the additional patch).

Your compile is probably working right, but it is possible that we actually see two different problems here.
Comment 13 sdh 2014-04-12 10:18:36 UTC
Can you suggest a good way to debug this and read the logs? Everything hangs, so I do a hard reboot and start with the nomodeset parameter to see the logs. Is there a better way to do this?
Comment 14 Christian König 2014-04-12 10:27:47 UTC
(In reply to sdh from comment #13)
> Can you suggest a good way to debug this and read the logs? Everything
> hangs, so I do a hard reboot and start with the nomodeset parameter to see
> the logs. Is there a better way to do this?

Not sure how to do it on Arch, but on Ubuntu I would put the radeon module on the blacklist and boot into text-only mode.

Then SSH into the system over the network and load the radeon module manually. If that doesn't crash, also try to start X manually.
Comment 15 sdh 2014-04-12 18:12:00 UTC
Thanks, that helped :)

Doing a 'modprobe radeon' after booting up gives the following errors:
radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000002 last fence id 0x0000000000000000 on ring 5)
[drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).

Doing a startx completely hangs up the system

Full log:
# journalctl --since="2014-04-12" -o cat | grep 'drm\|radeon'
[drm] Initialized drm 1.1.0 20060810
[drm] radeon kernel modesetting enabled.
fb: conflicting fb hw usage radeondrmfb vs VESA VGA - removing generic driver
[drm] initializing kernel modesetting (RV730 0x1002:0x9480 0x1028:0x0447).
[drm] register mmio base: 0xFBE20000
[drm] register mmio size: 65536
radeon 0000:01:00.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
radeon 0000:01:00.0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[drm] Detected VRAM RAM=1024M, BAR=256M
[drm] RAM width 128bits DDR
[drm] radeon: 1024M of VRAM memory ready
[drm] radeon: 1024M of GTT memory ready.
[drm] Loading RV730 Microcode
[drm] Internal thermal controller with fan control
[drm] radeon: dpm initialized
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
radeon 0000:01:00.0: WB enabled
radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff8800b665cc00
radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff8800b665cc0c
radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c598 and cpu addr 0xffffc90012f1c598
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
radeon 0000:01:00.0: irq 50 for MSI/MSI-X
radeon 0000:01:00.0: radeon: using MSI.
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 1 usecs
[drm] ring test on 3 succeeded in 1 usecs
[drm] ring test on 5 succeeded in 1 usecs
[drm] UVD initialized successfully.
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000002 last fence id 0x0000000000000000 on ring 5)
[drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
Starting Load/Save Screen Backlight Brightness of backlight:radeon_bl0...
Started Load/Save Screen Backlight Brightness of backlight:radeon_bl0.
[drm] radeon atom DIG backlight initialized
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   VGA-1
[drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm] Connector 1:
[drm]   HDMI-A-1
[drm]   HPD1
[drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 2:
[drm]   LVDS-1
[drm]   DDC: 0x7f68 0x7f68 0x7f6c 0x7f6c 0x7f70 0x7f70 0x7f74 0x7f74
[drm]   Encoders:
[drm]     LCD1: INTERNAL_UNIPHY2
[drm] fb mappable at 0xC0460000
[drm] vram apper at 0xC0000000
[drm] size 4325376
[drm] fb depth is 24
[drm]    pitch is 5632
fbcon: radeondrmfb (fb0) is primary device
radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
radeon 0000:01:00.0: registered panic notifier
[drm] Initialized radeon 2.37.0 20080528 for 0000:01:00.0 on minor 0
radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[drm:rv770_dpm_set_power_state] *ERROR* rv770_set_sw_state failed
[drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
radeon 0000:01:00.0: WB enabled
radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff8800b665cc00
radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff8800b665cc0c
radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c598 and cpu addr 0xffffc90012f1c598
Comment 16 sdh 2014-04-15 07:29:23 UTC
ping.

I guess the above response got lost in the huge amount of mails you must be receiving :)
Comment 17 sdh 2014-04-18 14:15:46 UTC
ping again. I'm wondering whether you're very busy, or this email is getting buried in your inbox.
Comment 18 Christian König 2014-04-19 10:25:57 UTC
I did get your original message, I just didn't have time to look into it.

There must be something wrong with the clock settings for your system, I just don't know what it is.
Comment 19 sdh 2014-04-19 13:58:33 UTC
Anything I can do to help debug this?
Comment 20 Ciel Avenir 2014-04-24 03:06:46 UTC
*** Bug 73111 has been marked as a duplicate of this bug. ***
Comment 21 Christian König 2014-04-24 07:58:32 UTC
(In reply to sdh from comment #19)
> Anything I can do to help debug this?

Well, do you have some experience with kernel hacking?

The problem is somewhere in rv770_set_uvd_clocks found in the kernel source file drivers/gpu/drm/radeon/rv770.c. Despite the name this function is used for RV710, RV730 (yours) and RV770.

The implementation works fine on my RV710, but it looks like on some ASICs the reference frequency (or something else) is different, so programming the PLL results in a way too high frequency and the whole box becomes completely unstable because of this.

A good start would be to get me the values of fb_div, vclk_div and dclk_div used in that function.

You could also try to play with the frequencies used for radeon_set_uvd_clocks. Defaults are 533MHz and 400MHz (e.g. radeon_set_uvd_clocks(rdev, 53300, 40000)), but using those seems to make your system unstable.
Comment 22 Christian König 2014-04-24 08:58:41 UTC
Created attachment 133521
Possible fix v2.

Meanwhile please test the attached patch.

It won't fix the issue, but might at least allow your system to boot.
Comment 23 sdh 2014-04-24 11:59:13 UTC
(In reply to Christian König from comment #22)
> Possible fix v2.
> Meanwhile please test the attached patch.
> It won't fix the issue, but might at least allow your system to boot.
Yes indeed, thank you. Successfully booted to desktop using the radeon module. \m/

(In reply to Christian König from comment #21)
> Well, do you have some experience with kernel hacking?
The git bisect I did earlier is the most I have done ^_^

> The problem is somewhere in rv770_set_uvd_clocks found in the kernel source
> file drivers/gpu/drm/radeon/rv770.c. Despite the name this function is used
> for RV710, RV730 (yours) and RV770.
> 
> A good start would be to get me the values of fb_div, vclk_div and dclk_div
> used in that function.

diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index fef3107..2fbc787 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -67,6 +67,9 @@ int rv770_set_uvd_clocks(struct radeon_device *rdev, u32 vclk, u32 dclk)
 	r = radeon_uvd_calc_upll_dividers(rdev, vclk, dclk, 50000, 160000,
 					  43663, 0x03FFFFFE, 1, 30, ~0,
 					  &fb_div, &vclk_div, &dclk_div);
+
+    pr_notice("SDH: %u %u %u\n", fb_div, vclk_div, dclk_div);
+
 	if (r)
 		return r;

gives "SDH: 808574 5 5"
Comment 24 Dieter Nützel 2014-04-26 01:35:06 UTC
Tested it on my RV730 AGP, too.
The later values are after uvd playback.

Both kernels have these additional patches:
drm-radeon-Inline-r100_mm_rreg-wreg-v3.patch
drm-radeon-add-missing-radeon_semaphore_free-to-error-path.patch
0001-drm-radeon-print-uvd-clocks.patch

Other than that I can't see any change.
mplayer startup took ~4 sec for HDTV (.mkv; 1920x1080p) stuff,
sometimes longer with both values - Christian?
Do you need full dmesg?

/home/dieter> grep SDH dmesg-3.15.0-rc2-1-desktop.log
[   11.455469] SDH: 2585818 3 4
[   11.702222] SDH: 2585818 3 4
[ 4837.269874] SDH: 2585818 3 4
[ 4839.337562] SDH: 2585818 3 4
[ 4849.109778] SDH: 2585818 3 4
[ 4860.157849] SDH: 2585818 3 4
[ 4871.636301] SDH: 2585818 3 4
[ 4885.174251] SDH: 2585818 3 4

/home/dieter> grep SDH dmesg-3.15.0-rc2-1-desktop-lower-clocks.log 
[   11.453487] SDH: 808574 5 5
[   11.700600] SDH: 808574 5 5
[  330.936109] SDH: 2585818 3 4
[  332.969812] SDH: 2585818 3 4
[  349.357418] SDH: 2585818 3 4
[  371.532244] SDH: 2585818 3 4
[  372.860656] SDH: 2585818 3 4
[  387.290967] SDH: 2585818 3 4
[  432.414145] SDH: 2585818 3 4
[  433.611991] SDH: 2585818 3 4
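Logs like the two above can be condensed to the points where the divider triple changes. The following is a minimal sketch in Python (chosen only for illustration; the driver itself is C). The function name `divider_changes` is hypothetical, and the regular expression assumes the dmesg timestamp format and the "SDH:" prefix from the debug patch shown in comment 23.

```python
import re

# Matches lines such as "[   11.453487] SDH: 808574 5 5" (format assumed
# from the debug patch in comment 23).
SDH_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+SDH:\s+(\d+)\s+(\d+)\s+(\d+)")


def divider_changes(log_text):
    """Return (timestamp, (fb_div, vclk_div, dclk_div)) at each change."""
    changes = []
    previous = None
    for match in SDH_RE.finditer(log_text):
        timestamp = float(match.group(1))
        triple = tuple(int(g) for g in match.group(2, 3, 4))
        if triple != previous:
            changes.append((timestamp, triple))
            previous = triple
    return changes


# Excerpt of the lower-clocks log from this comment
sample = """\
[   11.453487] SDH: 808574 5 5
[   11.700600] SDH: 808574 5 5
[  330.936109] SDH: 2585818 3 4
[  332.969812] SDH: 2585818 3 4
"""

for timestamp, triple in divider_changes(sample):
    print(f"{timestamp:10.3f}s -> fb_div, vclk_div, dclk_div = {triple}")
```

On the lower-clocks excerpt this reports two entries: the initial (808574, 5, 5) setting and the switch to (2585818, 3, 4) once UVD playback starts.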
Comment 25 Christian König 2014-04-26 08:40:35 UTC
Thanks for testing this Dieter, that actually saved me quite some time.

The formula for the clock is (fb_div*ref_clock)/(fb_factor*post_div).

fb_factor is a fixed value coming from the hardware design (43663).

ref_clock (reference clock) is the external input clock to the RV730 ASIC; its value is burned into the BIOS (usually 27MHz for R7xx).

fb_div and post_div are the values the driver calculates to get a certain frequency for the chip.

So for our values here we get:
(808574*27MHz)/(43663*5) = ~100MHz
(2585818*27MHz)/(43663*3) = ~533MHz
(2585818*27MHz)/(43663*4) = ~400MHz
Overall, that looks perfectly fine.
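The arithmetic above can be reproduced with a short sketch. It is written in Python purely for illustration (the driver itself is C); the function name `upll_output_mhz` is hypothetical, and the 27 MHz reference clock is the typical R7xx value quoted above, not one read from any particular BIOS.

```python
# Sketch of the clock formula quoted above:
#   clock = (fb_div * ref_clock) / (fb_factor * post_div)
# upll_output_mhz is an illustrative name, not a driver function.

FB_FACTOR = 43663      # fixed hardware value mentioned in this comment
REF_CLOCK_MHZ = 27.0   # assumed: the usual R7xx reference clock


def upll_output_mhz(fb_div, post_div, ref_clock_mhz=REF_CLOCK_MHZ):
    """Return the PLL output frequency in MHz for the given dividers."""
    return (fb_div * ref_clock_mhz) / (FB_FACTOR * post_div)


# Divider sets reported earlier in this bug: (fb_div, vclk_div, dclk_div)
for fb_div, vclk_div, dclk_div in [(808574, 5, 5), (2585818, 3, 4)]:
    vclk = upll_output_mhz(fb_div, vclk_div)
    dclk = upll_output_mhz(fb_div, dclk_div)
    print(f"fb_div={fb_div}: vclk ~{vclk:.0f} MHz, dclk ~{dclk:.0f} MHz")
```

With the dividers from this bug it gives roughly 100/100 MHz for the lowered clocks and 533/400 MHz for the defaults. If the real reference clock on a board differed from 27 MHz, passing another `ref_clock_mhz` shows how far the output would drift.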

Please provide the content of the registers CG_UPLL_FUNC_CNTL, CG_UPLL_FUNC_CNTL_2 and CG_UPLL_FUNC_CNTL_3. E.g. install radeontool and execute:

radeonreg regmatch 0x718
radeonreg regmatch 0x71c
radeonreg regmatch 0x720

Thx, Christian.
Comment 26 Dieter Nützel 2014-04-27 00:12:36 UTC
0x718   0x20010002 (536936450)
0x71c   0x021f2111 (35594513)
0x720   0x102774da (271021274)

Only for 3.15.0-rc2-1-desktop-lower-clocks.

Do you need more? --- Need some sleep, badly!

Cheers, Dieter.
Comment 27 sdh 2014-04-27 05:52:57 UTC
(In reply to Christian König from comment #25)
> Please provide the content of the registers CG_UPLL_FUNC_CNTL,
> CG_UPLL_FUNC_CNTL_2 and CG_UPLL_FUNC_CNTL_3. E.g. install radeontool and
> execute:
> 
> radeonreg regmatch 0x718
> radeonreg regmatch 0x71c
> radeonreg regmatch 0x720

On my system (3.14.0-1-git), the values are:
0x718   0x20010002 (536936450)
0x71c   0x021f2222 (35594786)
0x720   0x100c567e (269244030)
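These register dumps can be cross-checked against the dividers printed by the debug patch. The sketch below (Python, for illustration only) assumes that the feedback divider occupies the low 25 bits of CG_UPLL_FUNC_CNTL_3, i.e. a mask of 0x01FFFFFF. That mask is not stated anywhere in this thread and should be verified against the driver's register header; `decode_fb_div` is a hypothetical helper name.

```python
# Assumption: UPLL_FB_DIV is the low 25 bits of CG_UPLL_FUNC_CNTL_3 (0x720).
# The mask is not stated in this thread; check the driver's register header.
UPLL_FB_DIV_MASK = 0x01FFFFFF


def decode_fb_div(cntl_3):
    """Extract the feedback divider from a CG_UPLL_FUNC_CNTL_3 value."""
    return cntl_3 & UPLL_FB_DIV_MASK


# Register values posted in comments 26 and 27
dumps = {
    "comment 26 (RV730 AGP)": 0x102774DA,
    "comment 27 (Inspiron N5010)": 0x100C567E,
}

for source, value in dumps.items():
    print(f"{source}: fb_div = {decode_fb_div(value)}")
```

Under that assumption the decoded values are 2585818 and 808574, the same fb_div figures the "SDH:" debug output reported, which suggests the register holds what the driver intended to program.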
Comment 28 Yuking 2014-04-29 13:51:13 UTC
My video card is HD6850, motherboard is ASUS P8Z68-LE.

The system often fails to boot with the radeon module. After power-up, the screen goes blank with no output at all, and pressing the reset button does not reboot it immediately (the reboot happens 10 or 15s later). It takes 2, 3, or even 7 presses of the reset button before the system boots. Even after a normal boot, the whole system crashes very often.
The system log shows no failure information after several boot failures.
Versions 3.13~3.15 and earlier versions have the same problem.
Any idea about it?

P. S. Windows works very well.
Comment 29 Alex Deucher 2014-04-29 13:55:04 UTC
(In reply to Yuking from comment #28)
> My video card is HD6850, motherboard is ASUS P8Z68-LE.
> 
> The system often fails to boot with the radeon module. After power-up, the
> screen goes blank with no output at all, and pressing the reset button does
> not reboot it immediately (the reboot happens 10 or 15s later). It takes 2,
> 3, or even 7 presses of the reset button before the system boots. Even after
> a normal boot, the whole system crashes very often.
> The system log shows no failure information after several boot failures.
> Versions 3.13~3.15 and earlier versions have the same problem.
> Any idea about it?

Your issue doesn't sound related to this bug.  Please open a new bug and include your xorg log and dmesg output and note whether it is a regression or not.
Comment 30 sdh 2014-05-06 05:15:05 UTC
Hi. Any update on this?
Comment 31 sdh 2014-05-12 05:25:16 UTC
I just noticed I'm getting the following errors during the sleep and wake up cycle:

[drm:rv730_stop_dpm] *ERROR* Could not force DPM to low
[drm:rv770_dpm_set_power_state] *ERROR* rv770_set_sw_state failed

Kernel is 3.14.2-1-git-dirty with above patch
Comment 32 Christian König 2014-05-12 07:42:08 UTC
(In reply to sdh from comment #30)
> Hi. Any update on this?

I've pushed the workaround upstream. So you should at least have a booting system. Just don't try to use any accelerated video decoding since that would crash the box again.

My best guess is that the information in the BIOS about the reference frequency is wrong, but without having the hardware here I can't do much else to get UVD working properly.

(In reply to sdh from comment #31)
> I just noticed I'm getting the following errors during the sleep and wake up
> cycle:
> 
> [drm:rv730_stop_dpm] *ERROR* Could not force DPM to low
> [drm:rv770_dpm_set_power_state] *ERROR* rv770_set_sw_state failed
> 
> Kernel is 3.14.2-1-git-dirty with above patch

Sounds like an unrelated DPM problem to me, but who knows.
Comment 33 Alex Deucher 2014-05-12 14:59:22 UTC
I wonder if UVD uses the reference clock directly, or if it uses xclk.  If it uses xclk, that may explain the problems.  Can you post your dmesg output with this patch applied?

diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index fef3107..bda9137 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -1594,6 +1594,9 @@ static void rv770_gpu_init(struct radeon_device *rdev)
        WREG32(PA_CL_ENHANCE, (CLIP_VTX_REORDER_ENA |
                                          NUM_CLIP_SEQ(3)));
        WREG32(VC_ENHANCE, 0);
+
+       DRM_INFO("ref: %u, xclk: %u\n",
+                rdev->clock.spll.reference_freq, rv770_get_xclk(rdev));
 }
 
 void r700_vram_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc)
Comment 34 Christian König 2014-05-12 15:11:33 UTC
(In reply to Alex Deucher from comment #33)
> I wonder if UVD uses the reference clock directly, or if it uses xclk.  If
> it uses xclk, that may explain the problems.

They can be different? Yeah that would indeed explain the issue. AFAIK RV730 uses XTALIN directly for the UVD PLLs.
Comment 35 Dieter Nützel 2014-05-12 17:07:46 UTC
(In reply to Alex Deucher from comment #33)
> I wonder if UVD uses the reference clock directly, or if it uses xclk.  If
> it uses xclk, that may explain the problems.  Can you post your dmesg output
> with this patch applied?

Here is mine (with latest stuff from Christian):

[   11.159443] [drm] initializing kernel modesetting (RV730 0x1002:0x9495 0x174B:0x0028).

[   11.257005] [drm] ref: 2700, xclk: 2700

[   11.263794] radeon 0000:01:00.0: WB disabled   <--- Is this intended?

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV730 [Radeon HD 4600 AGP Series] (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Radeon HD 4650 AGP DDR2
        Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 16
        Memory at c0000000 (32-bit, prefetchable) [size=256M]
        I/O ports at a800 [size=256]
        Memory at dfdf0000 (32-bit, non-prefetchable) [size=64K]
        Expansion ROM at dfdc0000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 3
        Capabilities: [58] AGP version 3.0
        Kernel driver in use: radeon
        Kernel modules: radeon
Comment 36 Alex Deucher 2014-05-12 18:21:04 UTC
(In reply to Dieter Nützel from comment #35)
> (In reply to Alex Deucher from comment #33)
> > I wonder if UVD uses the reference clock directly, or if it uses xclk.  If
> > it uses xclk, that may explain the problems.  Can you post your dmesg output
> > with this patch applied?
> 
> Here is mine (with latest stuff from Christian):
> 
> [   11.159443] [drm] initializing kernel modesetting (RV730 0x1002:0x9495 0x174B:0x0028).
> 
> [   11.257005] [drm] ref: 2700, xclk: 2700

On your board (and most boards) they are the same, so your board would not be affected if this proves to be an issue.

> 
> [   11.263794] radeon 0000:01:00.0: WB disabled   <--- Is this intended?

On AGP cards, WB is disabled.
Comment 37 sdh 2014-05-13 05:44:20 UTC
(In reply to Christian König from comment #32)
> I've pushed the workaround upstream. So you should at least have a booting
> system. Just don't try to use any accelerated video decoding since that
> would crash the box again.

Cool, so 3.15 will boot normally for me, thanks! \m/
I am able to play games like Batman Arkham Origins (using wine) without crashing, so no issues so far.

> My best guess is that the information in the BIOS about the reference
> frequency is wrong, but without having the hardware here I can't do much
> else to get UVD working properly.

Everything is working properly for now. What issues could I face because of this workaround?

(In reply to Alex Deucher from comment #33)
> I wonder if UVD uses the reference clock directly, or if it uses xclk.  If
> it uses xclk, that may explain the problems.  Can you post your dmesg output
> with this patch applied?

My output is the same as Dieter's:
[drm] ref: 2700, xclk: 2700
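For anyone comparing their own output, here is a small sketch of how a "ref: ..., xclk: ..." line from the debug patch could be checked. It is Python for illustration only, and `parse_ref_xclk` is a hypothetical name. It assumes both values are in units of 10 kHz, which is consistent with radeon_set_uvd_clocks(rdev, 53300, 40000) meaning 533MHz and 400MHz in comment 21, but that unit is an inference and not something the patch states.

```python
import re

# Assumption: both values are in 10 kHz units, as with the UVD clock
# arguments in comment 21 (53300 -> 533 MHz).
LINE_RE = re.compile(r"ref:\s*(\d+),\s*xclk:\s*(\d+)")


def parse_ref_xclk(line):
    """Return (ref_mhz, xclk_mhz) parsed from a debug line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    ref, xclk = (int(v) / 100.0 for v in match.groups())
    return ref, xclk


line = "[   11.257005] [drm] ref: 2700, xclk: 2700"
ref_mhz, xclk_mhz = parse_ref_xclk(line)
print(f"ref = {ref_mhz} MHz, xclk = {xclk_mhz} MHz, same: {ref_mhz == xclk_mhz}")
```

For the value reported here, both clocks come out as 27 MHz, so the two boards in this thread would not be affected by a ref/xclk mismatch.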
Comment 38 Howard Chu 2014-06-26 00:14:58 UTC
I'm also seeing this UVD not responding on a Trinity laptop - Asus N56DP with A10-4600M and discrete GPU. Was working fine on 3.12, fails on 3.13 and 3.14. I tried 3.15 but that crashes within 30 seconds of bootup if radeon.dpm is enabled. I typically can barely login and type "dmesg" before it dies.

Aside from the UVD failing to initialize, X still comes up ok on 3.13 and 3.14.
Comment 39 Christian König 2014-06-26 08:21:41 UTC
(In reply to Howard Chu from comment #38)
> I'm also seeing this UVD not responding on a Trinity laptop - Asus N56DP
> with A10-4600M and discrete GPU. Was working fine on 3.12, fails on 3.13 and
> 3.14. I tried 3.15 but that crashes within 30 seconds of bootup if
> radeon.dpm is enabled. I typically can barely login and type "dmesg" before
> it dies.
> 
> Aside from the UVD failing to initialize, X still comes up ok on 3.13 and
> 3.14.

That's a completely different problem. This tracker is about UVD clock settings on RV7xx.

Please open up a new bug report and attach your dmesg to it.
Comment 40 sdh 2014-07-03 15:11:06 UTC
Hi. Do we close this bug report now that I can successfully boot into 3.15 along with the radeon module?
Comment 41 Christian König 2014-07-04 13:08:16 UTC
(In reply to sdh from comment #40)
> Hi. Do we close this bug report now that I can successfully boot into 3.15
> along with the radeon module?

I think we can close it, but since it's your bug report, you need to update the status.
Comment 42 sdh 2014-07-04 13:16:55 UTC
Ah ok cool. Thanks a lot for getting this resolved.

Cheers!
Comment 43 Dino 2014-08-15 12:51:04 UTC
I have a similar issue, which seems to have been present since the 3.12 kernel.
I get "[drm] initialized", then this weird screen: http://imgbox.com/LXh03H2o
It hangs at that point, so I have to hold the power button and then power it on again.
It does boot with radeon.modeset=0.
Graphics is HD7640G Trinity (AMD A8-4500M, microcode is ARUBA) and the laptop is an ASUS K55N-DS81.
Comment 44 Dino 2014-08-15 15:08:19 UTC
Solved. Last line before this happened was 
fb: conflicting fb hw usage radeondrmfb vs VESA VGA - removing generic driver

It works after passing radeon.dpm=0 to boot options.
Comment 45 Joehni 2014-10-29 18:08:13 UTC
I suffer from the same problem with vanilla-sources from 3.13.0 to 3.17.1, however, this issue is closed as solved ... since when?
Comment 46 Alex Deucher 2014-10-29 19:21:03 UTC
The original issue reported was fixed.  Please file a new bug if you are having an issue.
Comment 47 closesrc 2015-06-03 06:57:17 UTC
how do I apply the patch, any instructions please? thanks.
Comment 48 Christian König 2015-06-03 12:48:18 UTC
(In reply to closesrc from comment #47)
> how do I apply the patch, any instructions please? thanks.

You simply don't. This bug is already closed and the fix is upstream. There is no need to apply anything manually any more.

If you have similar problems please open up a new bug report.
Comment 49 Weber K. 2016-09-20 01:28:18 UTC
Hi!

I have HD 6850 and Kernel 4.4.14.

This problem appeared for me when I changed rootflags.
Solved with rootflags=relatime,lazytime,commit=60 in kernel parameters.

HTH

Best regards
Weber Kai
Comment 50 Weber K. 2016-09-20 01:45:50 UTC
Forgot to mention: relatime,lazytime,commit=60 is also set in fstab.
I believe DPM may need some filesystem information to work well.
