Bug 214921

Summary: amdgpu hangs HP Laptop on shutdown
Product: Drivers Reporter: spasswolf
Component: Video(DRI - non Intel) Assignee: drivers_video-dri
Status: RESOLVED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: alexdeucher, bjo, pmw.gover
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.15, 5.15.1, 5.15.0-next-20211112 Subsystem:
Regression: Yes Bisected commit-id:

Description spasswolf 2021-11-03 02:46:53 UTC
Commit bf756fb833cbe8c6881c964f09db718bade6e591 leads to an improper shutdown: the system does not switch off and has to be powered off by holding down the power button. The problem seems to occur relatively late in the shutdown sequence, as it leaves no trace in the log files.
The commit also does not fix the hangs on suspend on this laptop.
Reverting this commit in 5.15 makes shutdown work again, while resuming from suspend still does not work.
Hardware:
HP bw064-ng
CPU:
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 101
model name	: AMD A10-9620P RADEON R5, 10 COMPUTE CORES 4C+6G
stepping	: 1
microcode	: 0x6006118
GPU:
00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] [1002:9874] (rev ca)
Comment 1 spasswolf 2021-11-04 08:15:56 UTC
It turns out that resuming from suspend has been broken for a long time on the above hardware. The last kernel where it works is 5.12.
The first commit where resuming from suspend leads to screen corruption is
4588f7b7dd5f09e70b6e223490a0d054c3d64071
Comment 2 Paul Gover 2021-11-05 16:21:35 UTC
Spasswolf's comment doesn't seem in any way related to this bug report.  I presume it was filed against the wrong bug!

I can confirm this on my setup, an HP laptop with an AMD "Stoney" chipset: on kernel 5.15.0 the system doesn't shut down when you use the KDE power menu "shutdown", nor if you issue the "halt" or "poweroff" commands, nor "shutdown -P now".
The keyboard is dead, so Ctrl-Alt-Del doesn't reboot; all that's left is holding the power button down.

Behaviour was correct on kernel 5.14 (at least, up to 5.14.14).
Comment 3 spasswolf 2021-11-07 13:21:28 UTC
Shutdown still hangs with linux-5.15.1.
This fixes the shutdown issue for me:
diff -aur linux-5.15.1.orig/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c linux-5.15.1/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
--- linux-5.15.1.orig/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c	2021-11-06 14:13:31.000000000 +0100
+++ linux-5.15.1/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c	2021-11-07 14:19:39.194630084 +0100
@@ -554,18 +554,6 @@
 	 * jobs for clockgating/powergating/dpm setting to
 	 * ->set_powergating_state().
 	 */
-	cancel_delayed_work_sync(&adev->uvd.idle_work);
-
-	if (adev->pm.dpm_enabled) {
-		amdgpu_dpm_enable_uvd(adev, false);
-	} else {
-		amdgpu_asic_set_uvd_clocks(adev, 0, 0);
-		/* shutdown the UVD block */
-		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-						       AMD_PG_STATE_GATE);
-		amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-						       AMD_CG_STATE_GATE);
-	}
 
 	if (RREG32(mmUVD_STATUS) != 0)
 		uvd_v6_0_stop(adev);
Comment 4 spasswolf 2021-11-14 20:31:44 UTC
This bug is still present in 5.15.0-next-20211112, where it breaks suspend. Here is uvd_v6_0_suspend() with the cleanup block disabled via #if 0:

static int uvd_v6_0_suspend(void *handle)
{
	int r;
	struct amdgpu_device *adev = (struct amdgpu_device *)handle;

	/*
	 * Proper cleanups before halting the HW engine:
	 *   - cancel the delayed idle work
	 *   - enable powergating
	 *   - enable clockgating
	 *   - disable dpm
	 *
	 * TODO: to align with the VCN implementation, move the
	 * jobs for clockgating/powergating/dpm setting to
	 * ->set_powergating_state().
	 */
#if 0
	cancel_delayed_work_sync(&adev->uvd.idle_work);

	if (adev->pm.dpm_enabled) {
		amdgpu_dpm_enable_uvd(adev, false);
	} else {
		amdgpu_asic_set_uvd_clocks(adev, 0, 0);
		/* shutdown the UVD block */
		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
						       AMD_PG_STATE_GATE);
		amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
						       AMD_CG_STATE_GATE);
	}
#endif

	r = uvd_v6_0_hw_fini(adev);
	if (r)
		return r;

	return amdgpu_uvd_suspend(adev);
}
With that block disabled, suspend works again.
Comment 5 Alex Deucher 2021-11-15 16:56:46 UTC
Should be fixed with this patch:
https://patchwork.freedesktop.org/series/96646/
Comment 6 spasswolf 2021-11-15 23:19:12 UTC
Tested the patch with linux-5.15.2, linux-next-20211115 and linux-5.16-rc1. It fixes the hang on suspend (or shutdown) in all cases, but resuming from suspend is still broken on linux-5.15.2 when the IOMMU is missing:
https://bugzilla.kernel.org/show_bug.cgi?id=214963
Comment 7 Paul Gover 2021-11-28 19:16:00 UTC
Kernel 5.15.5 (which IIUC contains the patch or equivalent) works for me.
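For anyone triaging a similar machine, the version information in this thread can be turned into a quick check. The following is only a rough sketch, not part of any fix: it flags 5.15.x releases below 5.15.5 as falling in the range reported here. The function names are hypothetical, and treating 5.15.2 through 5.15.4 as affected is an assumption (the thread only confirms hangs on 5.15.0 and 5.15.1, and a working 5.15.5). Distribution kernels may carry backports, and pre-release trees such as 5.16-rc1 are not covered by this check.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Parse "major.minor[.patch]" from the start of a release string such as
 * "5.15.0-next-20211112". A missing patch level is read as 0.
 * Returns 0 on success, -1 if major.minor cannot be parsed.
 */
int parse_release(const char *release, int *major, int *minor, int *patch)
{
	int n;

	*patch = 0;
	n = sscanf(release, "%d.%d.%d", major, minor, patch);
	return n >= 2 ? 0 : -1;
}

/*
 * Returns 1 if the release is a 5.15.x kernel below 5.15.5, i.e. in the
 * range this thread reports (or is assumed here) to hang on shutdown,
 * 0 if it is outside that range, and -1 if the string is unparsable.
 */
int in_reported_range(const char *release)
{
	int major, minor, patch;

	if (parse_release(release, &major, &minor, &patch) != 0)
		return -1;
	return major == 5 && minor == 15 && patch < 5;
}

/* Same check applied to the running kernel, using uname(2). */
int running_kernel_in_reported_range(void)
{
	struct utsname u;

	if (uname(&u) != 0)
		return -1;
	return in_reported_range(u.release);
}
```

Calling running_kernel_in_reported_range() from a small main() and printing the result is enough to use it. A result of 1 only means the version number matches the reports above, not that the hang will actually occur on that hardware.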