Bug 107381 - radeon VCE init error (-110) -- AMD/Intel Mars Hybrid Graphics
Summary: radeon VCE init error (-110) -- AMD/Intel Mars Hybrid Graphics
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-11-06 17:09 UTC by Andrew Schmadel
Modified: 2021-09-06 06:50 UTC
CC List: 28 users

See Also:
Kernel Version: 4.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg output (9.67 KB, application/octet-stream)
2015-11-06 17:09 UTC, Andrew Schmadel
dmesg output after gnomeshell locked up, crashed and restarted (13.59 KB, application/octet-stream)
2015-11-24 20:11 UTC, Jean-Pierre van Riel
dmesg output after forcing webgl rendering with radeon and firefox (18.65 KB, application/octet-stream)
2015-11-24 20:12 UTC, Jean-Pierre van Riel
dmesg under kernel 4.4 - archlinux x64 (64.30 KB, text/plain)
2016-03-03 15:07 UTC, Robin KERDILES
Ubuntu 15.10 Kernel 4.5-rc6 (86.43 KB, text/plain)
2016-03-04 10:34 UTC, corentin.dehay

Description Andrew Schmadel 2015-11-06 17:09:02 UTC
Created attachment 192261
dmesg output

Since upgrading to Ubuntu 15.10, I have encountered graphics performance issues, and have occasionally experienced lockups during boot.

I have encountered this issue on kernel 4.2.0 and 4.3.0, and it seems to have affected users on other distributions as well:

https://bugs.launchpad.net/fedora/+source/linux/+bug/1512848
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=803087
https://bugzilla.redhat.com/show_bug.cgi?id=1262649

Notably, this issue seems to primarily affect users with the AMD/ATI "Mars" chipset on machines with an Intel/AMD hybrid graphics configuration.

This shows up in dmesg (full log attached, because there's a fair amount of seemingly-useful context): 
[ 4.917369] radeon 0000:01:00.0: VCE init error (-110).
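
For readers unfamiliar with the number in parentheses: it appears to be a negated Linux errno value. On Linux, 110 is ETIMEDOUT, which would suggest the driver timed out waiting for the VCE block, although that reading is an assumption and not something the log confirms. A minimal sketch of the lookup follows; the helper name `decode_errno` is hypothetical, and only the two codes that occur in this thread (-110 here, -35 in a later comment) are listed.

```shell
# Map the errno values seen in this report to their Linux names.
# Values as defined in the kernel's asm-generic errno headers.
# decode_errno is a hypothetical helper, not part of any tool.
decode_errno() {
    # Strip a leading minus sign so both "110" and "-110" work.
    case "${1#-}" in
        35)  echo "EDEADLK (Resource deadlock would occur)" ;;
        110) echo "ETIMEDOUT (Connection timed out)" ;;
        *)   echo "unknown" ;;
    esac
}
```

For example, `decode_errno -110` prints the ETIMEDOUT line.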


Some other context from my PC: 

$ xrandr --listproviders
Providers: number : 3
Provider 0: id: 0x6a cap: 0x9, Source Output, Sink Offload crtcs: 4 outputs: 5 associated providers: 2 name:Intel
Provider 1: id: 0x41 cap: 0x6, Sink Output, Source Offload crtcs: 2 outputs: 0 associated providers: 2 name:radeon
Provider 2: id: 0x41 cap: 0x6, Sink Output, Source Offload crtcs: 2 outputs: 0 associated providers: 2 name:radeon

$ lspci -k (trimmed to omit likely-irrelevant devices)
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
        Subsystem: Samsung Electronics Co Ltd Device c0e6
        Kernel driver in use: ivb_uncore
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
        Kernel driver in use: pcieport
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
        DeviceName: Onboard IGD
        Subsystem: Samsung Electronics Co Ltd Device c0e6
        Kernel driver in use: i915
01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Mars [Radeon HD 8670A/8670M/8750M] (rev ff)
        Kernel driver in use: radeon
Comment 1 Jean-Pierre van Riel 2015-11-24 20:10:29 UTC
Similar frequent 'radeon 0000:01:00.0: VCE init error (-110)' issues for me. 

Gnome shell lag gets quite bad at times. At one point, it froze for a fairly long while, and restarted before I'd finished switching and logging into another console.

I noticed libgjs.so.0.0.0 segfaulted during the lockup.

I found the following to be an interesting way to compare the iGPU and the dGPU, but it didn't trigger the bug:
$ DRI_PRIME=0 vblank_mode=0 firefox http://oortonline.gl/#run
$ DRI_PRIME=1 vblank_mode=0 firefox http://oortonline.gl/#run
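
To confirm that DRI_PRIME actually selects a different GPU before running a test like the one above, the renderer string reported by glxinfo can be compared. This is only a sketch: it assumes glxinfo is installed (on Ubuntu it ships in mesa-utils), and the helper name `renderer_of` is hypothetical.

```shell
# Print only the renderer name from `glxinfo` output read on stdin.
# renderer_of is a hypothetical helper name.
renderer_of() {
    sed -n 's/^OpenGL renderer string: //p'
}

# Usage (not executed here, requires a running X session):
#   DRI_PRIME=0 glxinfo | renderer_of
#   DRI_PRIME=1 glxinfo | renderer_of
```

If both invocations print the same renderer, the offload to the radeon card is probably not taking effect.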

I was able to trigger more errors with radeon and saw this in dmesg:

[68750.862570] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
[68750.862583] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-35).
[68750.862588] [drm:radeon_resume_kms [radeon]] *ERROR* ib ring test failed (-35).

More info on my hardware

$  xrandr --listproviders
Providers: number : 3
Provider 0: id: 0x6e cap: 0x9, Source Output, Sink Offload crtcs: 4 outputs: 8 associated providers: 2 name:Intel
Provider 1: id: 0x42 cap: 0x6, Sink Output, Source Offload crtcs: 2 outputs: 1 associated providers: 2 name:radeon
Provider 2: id: 0x42 cap: 0x6, Sink Output, Source Offload crtcs: 2 outputs: 1 associated providers: 2 name:radeon

$ lspci -k
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
	Kernel driver in use: pcieport
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
	DeviceName:  Onboard IGD
	Subsystem: Dell Device 05be
	Kernel driver in use: i915
...
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Mars XTX [Radeon HD 8790M] (rev ff)
	Kernel driver in use: radeon

$ sudo lshw -c display
  *-display UNCLAIMED     
       description: VGA compatible controller
       product: Mars XTX [Radeon HD 8790M]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:01:00.0
       version: 00
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list
       configuration: latency=0
       resources: memory:e0000000-efffffff memory:f7c00000-f7c3ffff ioport:e000(size=256) memory:f7c40000-f7c5ffff
  *-display
       description: VGA compatible controller
       product: 4th Gen Core Processor Integrated Graphics Controller
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 06
       width: 64 bits
       clock: 33MHz
       capabilities: msi pm vga_controller bus_master cap_list rom
       configuration: driver=i915 latency=0
       resources: irq:30 memory:f5800000-f5bfffff memory:d0000000-dfffffff ioport:f000(size=64)

To see switcheroo options
# cat /sys/kernel/debug/vgaswitcheroo/switch
0:DIS: :DynOff:0000:01:00.0
1:IGD:+:Pwr:0000:00:02.0
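
The two lines above can be made easier to read with a small filter. This sketch assumes the colon-separated fields are, in order: index, client type (DIS or IGD), an active marker (`+` when active), the power state, and the PCI address. That field layout is my reading of the output, and `switcheroo_summary` is a hypothetical helper name.

```shell
# Summarise vga_switcheroo "switch" lines read on stdin.
# Assumed field layout: index:client:active:power:PCI address.
switcheroo_summary() {
    awk -F: '{
        # The PCI address itself contains colons, so it spans $5..$7.
        active = ($3 == "+") ? "active" : "inactive"
        printf "%s %s %s %s:%s:%s\n", $2, active, $4, $5, $6, $7
    }'
}

# Usage (as root, not executed here):
#   switcheroo_summary < /sys/kernel/debug/vgaswitcheroo/switch
```

On the output above, this would report the discrete GPU as inactive in state DynOff and the integrated GPU as active in state Pwr.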

dmesg files attached
Comment 2 Jean-Pierre van Riel 2015-11-24 20:11:12 UTC
Created attachment 195321
dmesg output after gnomeshell locked up, crashed and restarted
Comment 3 Jean-Pierre van Riel 2015-11-24 20:12:16 UTC
Created attachment 195331
dmesg output after forcing webgl rendering with radeon and firefox
Comment 4 Robin KERDILES 2016-03-03 15:05:49 UTC
This issue still affects kernels 4.4, 4.5-rc6
Archlinux x64
Hardware : radeon HD 8750M
Comment 5 Robin KERDILES 2016-03-03 15:07:41 UTC
Created attachment 206701
dmesg under kernel 4.4 - archlinux x64
Comment 6 corentin.dehay 2016-03-04 10:34:24 UTC
Created attachment 206781
Ubuntu 15.10 Kernel 4.5-rc6

dmesg output from Ubuntu 15.10 with kernel 4.5-rc6, in case it helps to fix this (or not).
Comment 7 Stratos Zolotas 2016-03-06 10:52:26 UTC
I'm experiencing the same issue, and I'm not on a hybrid configuration. I have a setup with two video cards, both Radeon, on openSUSE Tumbleweed.

uname -a
Linux teras 4.4.3-1-default #1 SMP PREEMPT Fri Feb 26 09:54:10 UTC 2016 (171b8f1) x86_64 x86_64 x86_64 GNU/Linux

sudo lspci | grep VGA
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350]
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]

dmesg output:
[    2.750393] radeon 0000:01:00.0: VCE init error (-110).
Comment 8 chico76 2016-05-07 09:58:50 UTC
I have the same problem. I tested with 4.4.8-1-lts and 4.5.1, and I recently compiled the 4.5.3 kernel on Arch Linux.


I have a hybrid setup: the ARUBA card is working fine, and the OLAND is the one getting the VCE init error (-110).


maj 07 10:01:43 noname kernel: radeon 0000:01:00.0: VCE init error (-110).
maj 07 10:01:44 noname kernel: [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
maj 07 10:01:44 noname kernel: [drm:si_resume [radeon]] *ERROR* si startup failed on resume
maj 07 10:01:44 noname acpid[554]: client connected from 666[0:1000]

If it helps, the error seems to come from radeon/si.c, after the comment:
/* allocate wb buffer */

But I am not qualified to look into this; I haven't programmed in 10 years...
Comment 9 Stratos Zolotas 2016-05-07 10:54:31 UTC
After removing my Redwood AMD card and replacing it with an extra Oland, I now get the error twice, with the 4.5.2 kernel:

sudo lspci | grep VGA
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350]
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland PRO [Radeon R7 240/340]

[    2.593576] [drm] Found VCE firmware/feedback version 50.0.1 / 17!
[    2.774303] radeon 0000:01:00.0: VCE init error (-110).
[    4.160741] [drm] Found VCE firmware/feedback version 50.0.1 / 17!
[    4.268492] radeon 0000:02:00.0: VCE init error (-110).
Comment 10 chico76 2016-05-11 19:47:14 UTC
Does anyone have any idea in which kernel version the OLAND chip started to fail?
Maybe I could try to bisect the kernel if I knew a version in which it was working.
Comment 11 bsarels 2016-06-16 07:35:16 UTC
Hello,

I'm affected too on a HP Elitebook 840 G1.

sudo lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Mars [Radeon HD 8730M] (rev ff)

I'm willing to help if possible.
Comment 12 Gauthier P. 2016-10-02 19:05:22 UTC
Hello,

I can confirm the bug on an HP Elitebook 840 G1, running Arch Linux and the latest available kernel (Linux arch 4.7.5-1-ARCH #1 SMP PREEMPT Sat Sep 24 13:04:22 CEST 2016 x86_64 GNU/Linux).

The problem also occurs on my own build of the latest kernel (4.7.6).

I hope to see this problem solved.

Sincerely,
Comment 13 madbiologist 2016-11-07 11:38:25 UTC
VCE 1.0 support was added in kernel 4.2. If you are able to build your own kernel with this commit reverted it should fix the issue:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=a918efab631a5112d9d168700458317ad77f269c
Comment 14 Bhaskar 2016-12-01 12:29:52 UTC
This issue still does not seem to be fixed. My kernel version is 4.4.0-51-generic and the issue still occurs.

Here's the output of dmesg | egrep -i 'vce|error' :

[    2.057617] [drm] Found VCE firmware/feedback version 40.2.2 / 15!
[    2.271399] [drm] VCE initialized successfully.
[    4.627532] kfd kfd: error getting iommu info. is the iommu enabled?
[    4.627538] kfd kfd: Error initializing iommuv2 for device (1002:130a)
[    4.627729] kfd kfd: device (1002:130a) NOT added due to errors
[    4.752479] [drm] Found VCE firmware/feedback version 50.0.1 / 17!
[    6.235140] radeon 0000:01:00.0: VCE init error (-110).
[    7.301364] [drm:radeon_acpi_init [radeon]] *ERROR* Cannot find a backlight controller
[   20.220010] EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro
[   31.605179] radeon 0000:01:00.0: VCE init error (-110).
[   51.400194] radeon 0000:01:00.0: VCE init error (-110).
[   74.917029] radeon 0000:01:00.0: VCE init error (-110).
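
Because the error repeats, a per-device tally can be more useful than the raw filter. The sketch below counts both message variants seen in this thread ("VCE init error" and "failed VCE resume") per PCI address; `count_vce_errors` is a hypothetical helper name, and it assumes the PCI address is the word following `radeon` on each matching line.

```shell
# Count VCE failures per PCI device from kernel log lines on stdin.
# count_vce_errors is a hypothetical helper name.
count_vce_errors() {
    awk '/VCE init error|failed VCE resume/ {
        for (i = 1; i < NF; i++) {
            if ($i == "radeon") {
                dev = $(i + 1)
                sub(/:$/, "", dev)   # drop the colon after the address
                n[dev]++
                break
            }
        }
    } END { for (d in n) print d, n[d] }' | sort
}

# Usage (not executed here):
#   dmesg | count_vce_errors
```

Each output line is a PCI address followed by the number of matching messages for that device.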



uname -a
Linux gublu 4.4.0-51-generic #72-Ubuntu SMP Thu Nov 24 18:29:54 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux


sudo lspci | grep VGA
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Kaveri [Radeon R6 Graphics]
Comment 15 Stratos Zolotas 2016-12-01 12:36:14 UTC
I'm on 4.8.10 and it is still not fixed. I have two AMD GPUs (both Oland) and the issue appears on both. So don't expect to see a fix in any released kernel. I haven't tested any 4.9 pre-release yet.

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350]
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland PRO [Radeon R7 240/340]

dmesg | egrep -i 'vce|error'
[    3.297273] [drm] Found VCE firmware/feedback version 50.0.1 / 17!
[    3.456641] radeon 0000:01:00.0: failed VCE resume (-110).
[    4.878397] [drm] Found VCE firmware/feedback version 50.0.1 / 17!
[    4.985278] radeon 0000:02:00.0: failed VCE resume (-110).

uname -a
Linux teras 4.8.10-1-default #1 SMP PREEMPT Mon Nov 21 13:50:28 UTC 2016 (d1ec066) x86_64 x86_64 x86_64 GNU/Linux
Comment 16 WHAT 2016-12-30 20:01:29 UTC
openSUSE Leap 42.2 with kernel 4.4.36 is also affected: it shows "radeon VCE init error (-110)" on startup, and the Radeon R7 M265 GPU is not working (lspci -k lists it with (rev ff)). I am unable to install the proprietary fglrx driver because the installer can't find the AMD GPU.
Comment 17 Jean-Pierre van Riel 2017-01-08 12:58:42 UTC
Until VCE gets fixed for Mars/, there might be a way to disable VCE (I need to look into it more)

https://github.com/torvalds/linux/commit/fabb5935871db1f31fcd2684fd154e24de04d917#diff-9bc1b4aaf15dd521a1991717e4e2a2e0
Comment 18 Tabs 2017-01-24 08:41:30 UTC
It is possible to disable VCE in the latest drivers (the commit was made on Mar 18, 2016).

To make the change persistent, edit /etc/modprobe.d/radeon.conf and add the line:
options radeon vce=0

After the next reboot, you can check that the change has been applied with:
systool -vm radeon

It stops the error messages for me, but I have the feeling that the UI is much slower (probably 2D acceleration has been disabled in the process).
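
As an alternative check that does not need systool, the value can probably be read straight from sysfs. This sketch assumes the radeon module exposes its `vce` parameter at /sys/module/radeon/parameters/vce, as loaded modules usually do for readable parameters; `radeon_vce_param` is a hypothetical helper name.

```shell
# Print the vce parameter of the loaded radeon module, or "unavailable"
# if the file cannot be read. The default path is an assumption; a
# different file can be passed as the first argument.
radeon_vce_param() {
    param_file="${1:-/sys/module/radeon/parameters/vce}"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unavailable"
    fi
}
```

With the workaround applied, calling `radeon_vce_param` with no argument should print 0.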
Comment 19 Michel Dänzer 2017-01-24 08:56:42 UTC
(In reply to Tabs from comment #18)
> It stops the error messages for me, but I have the feeling that the UI is
> much slower (probably 2D acceleration has been disabled in the process).

You can check this in the Xorg log file.
Comment 20 armenberug 2018-12-29 15:34:07 UTC
Hello Everyone!

Any update on this? I am getting the same problem with AMD Radeon 8750m and Intel HD Graphics 4600
Comment 21 Mahmoud Elagdar 2020-04-24 23:04:33 UTC
(In reply to armenberug from comment #20)
> Hello Everyone!
> 
> Any update on this? I am getting the same problem with AMD Radeon 8750m and
> Intel HD Graphics 4600

A workaround for your card is to use amdgpu instead of radeon.
There are several ways to do this, described here: https://wiki.archlinux.org/index.php/AMDGPU#Enable_Southern_Islands_(SI)_and_Sea_Islands_(CIK)_support
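
One of the approaches on that wiki page, as far as I recall it, is to pass `radeon.si_support=0 amdgpu.si_support=1` on the kernel command line for Southern Islands cards. Treat the parameter names as an assumption and verify them against the page. A minimal sketch for checking whether both flags are present follows; `si_switch_requested` is a hypothetical helper name.

```shell
# Succeed if a kernel command line (read on stdin) asks for amdgpu
# instead of radeon on Southern Islands hardware.
# Parameter names are assumed from the linked wiki section.
si_switch_requested() {
    cmdline=$(cat)
    # Pad with spaces so each flag is matched as a whole word.
    case " $cmdline " in *" radeon.si_support=0 "*) ;; *) return 1 ;; esac
    case " $cmdline " in *" amdgpu.si_support=1 "*) ;; *) return 1 ;; esac
    return 0
}

# Usage (not executed here):
#   si_switch_requested < /proc/cmdline && echo "amdgpu requested for SI"
```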
Comment 22 Alex Deucher 2020-04-27 14:40:07 UTC
(In reply to Mahmoud Elagdar from comment #21)
> 
> A workaround for your card is to use amdgpu instead of radeon
> There're several ways to do this here:
> https://wiki.archlinux.org/index.php/
> AMDGPU#Enable_Southern_Islands_(SI)_and_Sea_Islands_(CIK)_support

amdgpu does not have support for VCE at the moment for these ASICs.
Comment 23 GeorgeQQHu 2021-09-06 06:50:37 UTC
This card doesn't seem to have VCE at all.

Please refer to https://en.wikipedia.org/wiki/Video_Coding_Engine

This is the same as bug https://bugzilla.kernel.org/show_bug.cgi?id=197327
