Bug 77181 - radeon -- GPU lockup when hibernating or waking up
Summary: radeon -- GPU lockup when hibernating or waking up
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
Reported: 2014-06-02 07:28 UTC by Mantas Mikulėnas
Modified: 2016-03-23 18:55 UTC

Kernel Version: 3.15.0-rc8
Tree: Mainline
Regression: No


Description Mantas Mikulėnas 2014-06-02 07:28:49 UTC
With 3.15-rc* kernels (including -rc8, built today), hibernation (suspend to disk) doesn't work properly: the system hangs for ~10 seconds and I get a "GPU lockup" message before it continues. This happens both when hibernating and when resuming from hibernation. (Suspend-to-RAM, however, works perfectly.) In the last few attempts, the system recovered from the lockup during resume but remained unstable afterwards (see the second log, with "GPU softreset", at the bottom).

The same happens with and without Xorg running.

Computer: ASUS K52JT.206 laptop, using UEFI.

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Robson CE [Radeon HD 6370M/7370M] [1002:68e4]

dmesg after resuming from hibernate:

[  235.036435] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  235.609112] PM: Hibernation mode set to 'platform'
[  235.765779] wlan0: deauthenticating from 24:a4:3c:ae:df:83 by local choice (Reason: 3=DEAUTH_LEAVING)
[  235.773778] cfg80211: Calling CRDA to update world regulatory domain
[  235.773990] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[  235.776029] cfg80211: World regulatory domain updated:
[  235.776034] cfg80211: [snip]
[  235.817885] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[  235.883507] PM: Syncing filesystems ... done.
[  237.132171] Freezing user space processes ... (elapsed 0.023 seconds) done.
[  237.155971] PM: Marking nosave pages: [mem 0x000a0000-0x000fffff]
[  237.155974] PM: Marking nosave pages: [mem 0x029a9000-0x029a9fff]
[  237.155975] PM: Marking nosave pages: [mem 0x029b8000-0x029b9fff]
[  237.155976] PM: Marking nosave pages: [mem 0x029c3000-0x029c3fff]
[  237.155978] PM: Marking nosave pages: [mem 0xbed06000-0xbed91fff]
[  237.155980] PM: Marking nosave pages: [mem 0xbed95000-0xbed9bfff]
[  237.155981] PM: Marking nosave pages: [mem 0xbed9d000-0xbee02fff]
[  237.155984] PM: Marking nosave pages: [mem 0xbf800000-0xffffffff]
[  237.156474] PM: Basic memory bitmaps created
[  237.156517] PM: Preallocating image memory... done (allocated 362140 pages)
[  237.269398] PM: Allocated 1448560 kbytes in 0.11 seconds (13168.72 MB/s)
[  237.269400] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  237.270653] Suspending console(s) (use no_console_suspend to debug)
[  237.274839] pciehp 0000:00:1c.1:pcie04: slot(1): Link Up event
[  237.283819] pciehp 0000:00:1c.5:pcie04: slot(5): Link Up event
[  237.377141] pciehp 0000:00:1c.1:pcie04: Device 0000:03:00.0 already exists at 0000:03:00, cannot hot-add
[  237.377145] pciehp 0000:00:1c.1:pcie04: Cannot add device at 0000:03:00
[  237.387169] pciehp 0000:00:1c.5:pcie04: Device 0000:05:00.0 already exists at 0000:05:00, cannot hot-add
[  237.387172] pciehp 0000:00:1c.5:pcie04: Cannot add device at 0000:05:00
[  237.802212] PM: freeze of devices complete after 531.509 msecs
[  237.802648] PM: late freeze of devices complete after 0.431 msecs
[  237.803625] PM: noirq freeze of devices complete after 0.974 msecs
[  237.803969] ACPI: Preparing to enter system sleep state S4
[  237.805209] PM: Saving platform NVS memory
[  237.805540] Disabling non-boot CPUs ...
[  237.806878] kvm: disabling virtualization on CPU1
[  237.907118] smpboot: CPU 1 is now offline
[  237.908730] kvm: disabling virtualization on CPU2
[  238.010458] smpboot: CPU 2 is now offline
[  238.012035] kvm: disabling virtualization on CPU3
[  238.113805] smpboot: CPU 3 is now offline
[  238.114503] PM: Creating hibernation image:
[  238.224525] PM: Need to copy 377652 pages
[  238.224530] PM: Normal pages needed: 377652 + 1024, available pages: 635664
[  238.115167] PM: Restoring platform NVS memory
[  238.115648] microcode: CPU0 sig=0x20655, pf=0x10, revision=0x4
[  238.115662] Enabling non-boot CPUs ...
[  238.115712] x86: Booting SMP configuration:
[  238.115712] smpboot: Booting Node 0 Processor 1 APIC 0x4
[  238.126823] kvm: enabling virtualization on CPU1
[  238.129214] microcode: CPU1 sig=0x20655, pf=0x10, revision=0x2
[  238.129417] microcode: CPU1 updated to revision 0x4, date = 2013-06-28
[  238.129422] CPU1 is up
[  238.129447] smpboot: Booting Node 0 Processor 2 APIC 0x1
[  238.140626] kvm: enabling virtualization on CPU2
[  238.143072] microcode: CPU2 sig=0x20655, pf=0x10, revision=0x2
[  238.143249] microcode: CPU2 updated to revision 0x4, date = 2013-06-28
[  238.143255] CPU2 is up
[  238.143276] smpboot: Booting Node 0 Processor 3 APIC 0x5
[  238.154445] kvm: enabling virtualization on CPU3
[  238.156881] microcode: CPU3 sig=0x20655, pf=0x10, revision=0x4
[  238.156888] CPU3 is up
[  238.159235] ACPI: Waking up from system sleep state S4
[  238.572267] ACPI: \_SB_.SLPB: ACPI_NOTIFY_DEVICE_WAKE event
[  238.585241] PM: noirq restore of devices complete after 11.564 msecs
[  238.585438] PM: early restore of devices complete after 0.165 msecs
[  238.643160] mei_me 0000:00:16.0: irq 47 for MSI/MSI-X
[  238.643162] usb usb1: root hub lost power or was reset
[  238.647082] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
[  238.647203] usb usb2: root hub lost power or was reset
[  238.651122] ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
[  238.651363] snd_hda_intel 0000:00:1b.0: irq 48 for MSI/MSI-X
[  238.651681] snd_hda_intel 0000:01:00.1: irq 49 for MSI/MSI-X
[  238.653131] sd 0:0:0:0: [sda] Starting disk
[  238.658947] switching from power state:
[  238.658949] 	ui class: none
[  238.658951] 	internal class: boot 
[  238.658952] 	caps: video 
[  238.658953] 	uvd    vclk: 0 dclk: 0
[  238.658954] 		power level 0    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
[  238.658956] 		power level 1    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
[  238.658957] 		power level 2    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
[  238.658958] 	status: c b 
[  238.658958] switching to power state:
[  238.658958] 	ui class: performance
[  238.658960] 	internal class: none
[  238.658961] 	caps: single_disp video 
[  238.658963] 	uvd    vclk: 0 dclk: 0
[  238.658963] 		power level 0    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
[  238.658964] 		power level 1    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
[  238.658965] 		power level 2    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
[  238.658966] 	status: r 
[  238.664760] [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
[  238.664880] radeon 0000:01:00.0: WB enabled
[  238.664883] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880001912c00
[  238.664884] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880001912c0c
[  238.666037] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90010d9c418
[  238.682922] [drm] ring test on 0 succeeded in 1 usecs
[  238.682981] [drm] ring test on 3 succeeded in 1 usecs
[  238.748484] pciehp 0000:00:1c.1:pcie04: Device 0000:03:00.0 already exists at 0000:03:00, cannot hot-add
[  238.748487] pciehp 0000:00:1c.1:pcie04: Cannot add device at 0000:03:00
[  238.748525] pciehp 0000:00:1c.5:pcie04: Device 0000:05:00.0 already exists at 0000:05:00, cannot hot-add
[  238.748529] pciehp 0000:00:1c.5:pcie04: Cannot add device at 0000:05:00
[  238.860143] [drm] ring test on 5 succeeded in 1 usecs
[  238.860148] [drm] UVD initialized successfully.
[  238.860183] [drm] ib test on ring 0 succeeded in 0 usecs
[  238.860216] [drm] ib test on ring 3 succeeded in 1 usecs
[  238.965196] usb 2-1: reset high-speed USB device number 2 using ehci-pci
[  238.978495] ata6: SATA link down (SStatus 0 SControl 300)
[  238.985171] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  238.985207] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[  238.986567] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[  238.989278] ata2.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded
[  238.990012] ata2.00: ACPI cmd ef/90:03:00:00:00:a0 (SET FEATURES) succeeded
[  238.991760] ata5: SATA link down (SStatus 0 SControl 300)
[  238.992884] ata2.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded
[  238.993618] ata2.00: ACPI cmd ef/90:03:00:00:00:a0 (SET FEATURES) succeeded
[  238.994024] ata2.00: configured for UDMA/100
[  238.995663] ata1.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded
[  239.005635] ata1.00: ACPI cmd ef/90:03:00:00:00:a0 (SET FEATURES) succeeded
[  239.089888] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[  239.099627] ata1.00: ACPI cmd ef/10:06:00:00:00:a0 (SET FEATURES) succeeded
[  239.110256] ata1.00: ACPI cmd ef/90:03:00:00:00:a0 (SET FEATURES) succeeded
[  239.140436] ata1.00: configured for UDMA/133
[  249.009247] radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
[  249.009250] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
[  249.009252] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[  249.009256] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
[  250.435952] PM: restore of devices complete after 11792.082 msecs
[  250.436675] PM: Image restored successfully.
[  250.436698] PM: Basic memory bitmaps freed
[  250.436699] Restarting tasks ... done.
[  250.439326] video LNXVIDEO:00: Restoring backlight state
[  250.456916] jme 0000:05:00.5: irq 50 for MSI/MSI-X
[  250.457017] jme 0000:05:00.5 eth0: Link is up at ANed: 100 Mbps, Full-Duplex, MDI
[  250.458065] jme 0000:05:00.5 eth0: Link is down
[  250.479766] jme 0000:05:00.5 eth0: Link is down
[  250.479834] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  250.494725] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[  250.546164] usb 1-1: new high-speed USB device number 3 using ehci-pci
[  250.670484] hub 1-1:1.0: USB hub found
[  250.670647] hub 1-1:1.0: 6 ports detected
[  250.686380] usb 1-1: USB disconnect, device number 3
[  251.847086] wlan0: authenticate with 24:a4:3c:ae:df:83
[  251.853958] wlan0: send auth to 24:a4:3c:ae:df:83 (try 1/3)
[  251.861038] wlan0: authenticated
[  251.862738] wlan0: associate with 24:a4:3c:ae:df:83 (try 1/3)
[  251.867237] wlan0: RX AssocResp from 24:a4:3c:ae:df:83 (capab=0x431 status=0 aid=2)
[  251.867306] wlan0: associated
[  251.867346] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[  252.248171] jme 0000:05:00.5 eth0: Link is up at ANed: 100 Mbps, Full-Duplex, MDI
[  252.248519] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

dmesg when it reset itself a bit later:

Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: ring 5 stalled for more than 281610msec
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000005 last fence id 0x0000000000000004 on ring 5)
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: Saved 23 dwords of commands on ring 0.
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: GPU softreset: 0x00000009
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0xA7702828
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x7C000005
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200800C0
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010800
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00028004
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80038647
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003828
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000007
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200800C0
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jun 02 10:15:54 rain kernel: switching from power state:
Jun 02 10:15:54 rain kernel:         ui class: none
Jun 02 10:15:54 rain kernel:         internal class: boot 
Jun 02 10:15:54 rain kernel:         caps: video 
Jun 02 10:15:54 rain kernel:         uvd    vclk: 0 dclk: 0
Jun 02 10:15:54 rain kernel:                 power level 0    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 1    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 2    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:         status: c b 
Jun 02 10:15:54 rain kernel: switching to power state:
Jun 02 10:15:54 rain kernel:         ui class: performance
Jun 02 10:15:54 rain kernel:         internal class: none
Jun 02 10:15:54 rain kernel:         caps: single_disp video 
Jun 02 10:15:54 rain kernel:         uvd    vclk: 0 dclk: 0
Jun 02 10:15:54 rain kernel:                 power level 0    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 1    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 2    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:         status: r 
Jun 02 10:15:54 rain kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: WB enabled
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880001912c00
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880001912c0c
Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90010d9c418
Jun 02 10:15:54 rain kernel: [drm] ring test on 0 succeeded in 1 usecs
Jun 02 10:15:54 rain kernel: [drm] ring test on 3 succeeded in 1 usecs
Jun 02 10:15:54 rain kernel: [drm] ring test on 5 succeeded in 1 usecs
Jun 02 10:15:54 rain kernel: [drm] UVD initialized successfully.
Jun 02 10:15:54 rain kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Jun 02 10:15:54 rain kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Jun 02 10:15:54 rain kernel: [drm:uvd_v1_0_ib_test] *ERROR* radeon: failed to get create msg (-22).
Jun 02 10:15:54 rain kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-22).
Jun 02 10:15:54 rain kernel: [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed
Jun 02 10:15:54 rain kernel: switching from power state:
Jun 02 10:15:54 rain kernel:         ui class: none
Jun 02 10:15:54 rain kernel:         internal class: boot 
Jun 02 10:15:54 rain kernel:         caps: video 
Jun 02 10:15:54 rain kernel:         uvd    vclk: 0 dclk: 0
Jun 02 10:15:54 rain kernel:                 power level 0    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 1    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 2    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:         status: c b 
Jun 02 10:15:54 rain kernel: switching to power state:
Jun 02 10:15:54 rain kernel:         ui class: performance
Jun 02 10:15:54 rain kernel:         internal class: none
Jun 02 10:15:54 rain kernel:         caps: single_disp video 
Jun 02 10:15:54 rain kernel:         uvd    vclk: 0 dclk: 0
Jun 02 10:15:54 rain kernel:                 power level 0    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 1    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:                 power level 2    sclk: 75000 mclk: 80000 vddc: 1100 vddci: 0
Jun 02 10:15:54 rain kernel:         status: r
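
For anyone comparing logs from several attempts, the lockup lines can be pulled out mechanically. This is a minimal sketch, assuming only the message format visible in the logs in this report; the function name find_lockups is hypothetical, and the "on ring N" suffix is treated as optional because logs from older kernels may omit it.

```python
import re

# Matches the "GPU lockup" line as it appears in the logs in this report.
# The "on ring N" suffix is optional (assumption: older kernels omit it).
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+) last fence id (0x[0-9a-fA-F]+)"
    r"(?: on ring (\d+))?\)"
)


def find_lockups(lines):
    """Return one dict per lockup line: both fence ids and the ring, if given."""
    found = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m is None:
            continue
        found.append({
            "waiting_for": int(m.group(1), 16),
            "last_fence": int(m.group(2), 16),
            "ring": int(m.group(3)) if m.group(3) is not None else None,
        })
    return found


sample = [
    "[  249.009247] radeon 0000:01:00.0: ring 5 stalled for more than 10000msec",
    "[  249.009250] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)",
    "Jun 02 10:15:54 rain kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000005 last fence id 0x0000000000000004 on ring 5)",
]

for hit in find_lockups(sample):
    print(hit)
```

Both lockups in the logs above are reported on ring 5, the same ring whose IB test fails in uvd_v1_0_ib_test.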
Comment 1 Rostislav Devyatov 2014-08-14 12:52:18 UTC
Same problem here (Mobility Radeon 4550, kernel 3.14.14-gentoo).
Comment 2 Alex Deucher 2014-08-14 13:22:47 UTC
Is this a regression?  If so, can you bisect?
Comment 3 Rostislav Devyatov 2014-08-14 15:29:22 UTC
(In reply to Alex Deucher from comment #2)

I am sorry, but I don't think I understand you. What do you mean by "regression"? What am I trying to bisect?
Comment 4 Alex Deucher 2014-08-14 17:01:28 UTC
(In reply to Rostislav Devyatov from comment #3)
> (In reply to Alex Deucher from comment #2)
> 
> I am sorry, but I don't think I understand you. What do you mean by
> "regression"? What am I trying to bisect?

Did it used to work previously?  If so what kernel?  If you know what kernel used to work, you can use git to bisect the changes and identify what change broke it.
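
A minimal sketch of the idea behind bisecting, assuming a simple linear history in which every commit after the first bad one is also bad. The names first_bad, history and is_bad are hypothetical; in practice `git bisect` performs this search over the real (branching) history and you test each kernel it checks out by hand.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # breakage is at mid or earlier
        else:
            lo = mid  # breakage is after mid
    return commits[hi]


# Hypothetical history: eight commits, broken from index 5 onwards.
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
broken_from = 5
tested = []


def is_bad(commit):
    # Stand-in for "build this kernel, hibernate, check dmesg for the lockup".
    tested.append(commit)
    return history.index(commit) >= broken_from


print(first_bad(history, is_bad))
print(len(tested), "commits tested")
```

Each test halves the remaining range, so even a range of several thousand commits needs only around a dozen build-and-hibernate cycles.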
Comment 5 Rostislav Devyatov 2014-08-14 18:48:35 UTC
(In reply to Alex Deucher from comment #4)
> Did it used to work previously?  If so what kernel?  If you know what kernel
> used to work, you can use git to bisect the changes and identify what change
> broke it.

Previously, with kernel 3.12.21-r1-gentoo, I had the same problem (the screen is sometimes messy after hibernation, with the same "fence wait failed" error in syslog). Before that, I ran the deblobbed kernels 3.8.13-gentoo and 3.4.9-gentoo, and they also had screen problems, but somewhat different ones. The screen simply went black (sometimes after hibernation, sometimes during normal use) while the mouse pointer stayed visible, and the syslog error was different: it started with "kernel BUG" at some location. I don't have the syslog from 3.8.13, but in 3.4.9 it was at mm/slub.c:3474.

On my system, both problems (the one I have now and the one I had previously) occur only intermittently.
Comment 6 Rostislav Devyatov 2014-08-18 15:21:16 UTC
Also, after a "normal" wake-up from hibernation, when the screen does not get distorted, I see the following message in dmesg:

[drm:rv770_dpm_set_power_state] *ERROR* rv770_set_sw_state failed
Comment 7 Mantas Mikulėnas 2014-09-14 01:33:54 UTC
(In reply to Alex Deucher from comment #2)
> Is this a regression?  If so, can you bisect?

Narrowed it down to the UVD merge for 3.10 (f18353eee757).

ec5891fbe1b0 ("drm/radeon: UVD bringup v8") was the first commit with the "fence wait failed" errors:

[  +0.010612] sd 0:0:0:0: [sda] Starting disk
[  +9.661530] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
[  +0.000003] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002)
[  +0.000003] [drm:r600_uvd_ib_test] *ERROR* radeon: fence wait failed (-35).
[  +0.000002] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
[  +1.614706] PM: restore of devices complete after 11819.193 msecs

The seven commits in between (f2ba57b5eab8 through ef0e6e657cfe) were also bad, but with a different error message:

[  +0.010618] sd 0:0:0:0: [sda] Starting disk
[  +0.560078] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020296] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020318] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020299] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020297] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020298] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020298] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020323] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020297] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +1.020297] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[  +0.020007] [drm:r600_uvd_init] *ERROR* UVD not responding, giving up!!!
[  +0.000001] [drm:evergreen_startup] *ERROR* radeon: error initializing UVD (-1).
[  +0.000026] [drm] ib test on ring 0 succeeded in 0 usecs
[  +0.000025] [drm] ib test on ring 3 succeeded in 1 usecs
[  +1.614533] PM: restore of devices complete after 11878.332 msecs
[  +0.055061] PM: Image restored successfully.

4474f3a91f95 was the last good commit:

[  +0.010691] sd 0:0:0:0: [sda] Starting disk
[  +1.154120] PM: restore of devices complete after 1645.354 msecs
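
When testing many commits, the "restore of devices complete" time is a convenient good/bad signal: about 1.6 s on the good commit versus about 11.8 s on the bad ones above. A minimal sketch, assuming that message format; restore_msecs, looks_bad and the 10000 ms threshold (chosen to match the lockup timeout shown in the logs) are assumptions for illustration, not anything the kernel provides.

```python
import re

# Format taken from the dmesg excerpts in this report.
RESTORE_RE = re.compile(r"PM: restore of devices complete after ([0-9.]+) msecs")

# Assumption: a restore that takes at least as long as the 10000 msec
# lockup timeout seen in the logs is treated as a bad run.
STALL_THRESHOLD_MSECS = 10000.0


def restore_msecs(lines):
    """Return the device restore time in msecs, or None if the line is absent."""
    for line in lines:
        m = RESTORE_RE.search(line)
        if m:
            return float(m.group(1))
    return None


def looks_bad(lines):
    """Classify a resume log as bad when the restore time reaches the threshold."""
    ms = restore_msecs(lines)
    return ms is not None and ms >= STALL_THRESHOLD_MSECS


bad_log = ["[  +1.614706] PM: restore of devices complete after 11819.193 msecs"]
good_log = ["[  +1.154120] PM: restore of devices complete after 1645.354 msecs"]

print(restore_msecs(bad_log), looks_bad(bad_log))
print(restore_msecs(good_log), looks_bad(good_log))
```

The timing alone does not distinguish the two failure modes (fence wait failure versus "UVD not responding"), so the error text still has to be checked separately.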
Comment 8 Barto 2014-09-22 08:48:03 UTC
I have the same problem with my Radeon HD 4650 PCIe card (I use the radeon driver) with DPM enabled.

Hibernation doesn't work: it fails with a "fence ring error" message on the console, related to my Radeon HD 4650.

On the next boot, fsck runs, which shows that hibernation failed.
