Bug 214197 - [Asus G713QY] RX6800M not usable after exiting Vulkan application
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-27 10:44 UTC by velemas
Modified: 2022-01-30 11:57 UTC

See Also:
Kernel Version: 5.13.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments
full dmesg output (100.80 KB, text/plain)
2021-08-28 17:48 UTC, velemas

Description velemas 2021-08-27 10:44:35 UTC
The Asus ROG Strix G17 Advantage Edition (G713QY) has hybrid graphics with an RX6800M dGPU. After any Vulkan application exits, the dGPU becomes unusable: vulkaninfo lists the dGPU before the Vulkan application runs but no longer lists the RX6800M afterwards.
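
The before/after check with vulkaninfo described above could be scripted roughly as follows. This is only a sketch: it assumes vulkaninfo prints one `deviceName = ...` line per device, and the function names and the `needle` substring are hypothetical, since the exact device name string depends on the Vulkan driver in use.

```python
import re
import subprocess

# Assumption: each physical device shows up as a "deviceName = ..." line.
DEVICE_NAME_RE = re.compile(r"^[ \t]*deviceName[ \t]*=[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def device_names(vulkaninfo_text):
    """Return the distinct deviceName values, in order of first appearance."""
    names = DEVICE_NAME_RE.findall(vulkaninfo_text)
    return list(dict.fromkeys(names))


def dgpu_visible(vulkaninfo_text, needle):
    """True if any reported device name contains `needle` (case-insensitive)."""
    return any(needle.lower() in name.lower() for name in device_names(vulkaninfo_text))


def query_vulkaninfo():
    """Run vulkaninfo (must be installed) and return its standard output."""
    result = subprocess.run(["vulkaninfo"], capture_output=True, text=True)
    return result.stdout
```

Calling `dgpu_visible(query_vulkaninfo(), "6800M")` once before starting a Vulkan application and once after closing it would show whether the dGPU dropped out of the device list.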

After the Vulkan application closes, dmesg reports:

[  154.385749] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[  154.401405] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[  154.401409] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[  159.038150] amdgpu 0000:03:00.0: amdgpu: message:        RunDcBtc (54)       param: 0x00000000 is timeout (no response)
[  159.038154] amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw!
[  159.038156] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
[  159.038220] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).

Using the amdgpu.runpm=0 parameter avoids the issue.
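
To confirm whether runtime power management is actually in effect for the dGPU, the relevant sysfs files can be read directly. The sketch below assumes the dGPU sits at PCI address 0000:03:00.0 (taken from the dmesg lines above) and that sysfs is mounted at /sys; the function names and the `sysfs_root` parameter are hypothetical and exist only to keep the example testable.

```python
from pathlib import Path


def read_text(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def runtime_pm_report(pci_addr="0000:03:00.0", sysfs_root="/sys"):
    """Collect the amdgpu runpm setting and the device's runtime PM state."""
    root = Path(sysfs_root)
    power = root / "bus" / "pci" / "devices" / pci_addr / "power"
    return {
        # Module parameter as loaded; 0 means runtime PM is disabled.
        "amdgpu.runpm": read_text(root / "module" / "amdgpu" / "parameters" / "runpm"),
        # "auto" lets the kernel runtime-suspend the device, "on" keeps it awake.
        "power/control": read_text(power / "control"),
        # Current runtime PM state, e.g. "active" or "suspended".
        "power/runtime_status": read_text(power / "runtime_status"),
    }


if __name__ == "__main__":
    for key, value in runtime_pm_report().items():
        print(f"{key}: {value}")
```

With amdgpu.runpm=0 on the kernel command line, the first entry should read 0 and the dGPU should no longer be runtime-suspended, which would be consistent with the resume failure not being triggered.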
Comment 1 Alex Deucher 2021-08-27 15:05:20 UTC
Please attach your full dmesg output from boot through the problematic case.
Comment 3 velemas 2021-08-28 17:48:13 UTC
Created attachment 298505
full dmesg output
Comment 4 velemas 2021-08-28 17:49:40 UTC
(In reply to Alex Deucher from comment #2)
> Does this patch fix the issue?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=202ead5a3c589b0594a75cb99f080174f6851fed

Kernel 5.13.13 has this patch already. So apparently it does not fix the problem.
It occurs with radv, amdvlk, and amdvlk-pro. External monitor is attached via HDMI (although it happens without ext. monitor too).

Sometimes dmesg does not contain the lines quoted above, but the dGPU is still unusable. Sometimes DXVK reports VK_ERROR_DEVICE_LOST even while the application is running.
Comment 5 Pablo Cholaky 2021-10-19 05:36:46 UTC
I can confirm this issue on an MSI Delta with an RX6700M as well, which rules out any "laptop-specific issue". Both are Zen3 machines with Navi cards.

While it doesn't break GPU usage, it's a waste of power.

This issue is kinda common, even with kernel 5.15.0-rc5. Sadly, I don't have any steps to reproduce it.
Comment 7 velemas 2021-10-21 08:57:24 UTC
Kernel 5.14.14 already has it, but the issue is not fixed. The dmesg output is mostly the same, with some differences:

[  367.167527] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[  367.183399] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[  367.183406] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[  371.863082] amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw!
[  371.863085] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
[  371.863147] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
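
Both dmesg excerpts in this report share the same signature: "SMU is resuming..." followed several seconds later by a failed resume of the smu IP block with code -62 (on Linux, errno 62 is ETIME, "timer expired"). A small parser along these lines could pull that signature out of a full dmesg dump. It is only a sketch based on the line formats quoted in this report; the function and field names are hypothetical.

```python
import re

# Patterns derived from the dmesg lines quoted in this report.
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
FAIL_RE = re.compile(r"resume of IP block <(?P<block>[^>]+)> failed (?P<code>-?\d+)")
TIMEOUT_RE = re.compile(r"message:\s+(?P<name>\S+) \(\d+\)\s+param:\s+0x[0-9a-fA-F]+ is timeout")


def summarize_resume_failure(dmesg_text):
    """Extract the failed IP block, error code, and stall time from dmesg text."""
    summary = {"block": None, "code": None, "timed_out_message": None, "stall_seconds": None}
    resume_ts = None
    for raw in dmesg_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if not line:
            continue
        ts, msg = float(line["ts"]), line["msg"]
        if "SMU is resuming" in msg:
            resume_ts = ts  # remember when the SMU resume started
        timeout = TIMEOUT_RE.search(msg)
        if timeout:
            summary["timed_out_message"] = timeout["name"]
        failure = FAIL_RE.search(msg)
        if failure:
            summary["block"] = failure["block"]
            summary["code"] = int(failure["code"])
            if resume_ts is not None:
                # Time between "SMU is resuming..." and the reported failure.
                summary["stall_seconds"] = round(ts - resume_ts, 6)
    return summary
```

Run against the two excerpts, the only field expected to differ apart from the timing is `timed_out_message`, since the RunDcBtc timeout line is absent from the 5.14.14 log.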
Comment 8 velemas 2022-01-30 11:56:36 UTC
Recent kernels in 5.15.* and 5.16.* fix the issue for me.
