Bug 211799

Summary: On AMD Ryzen 4500U laptop, running kernel 5.11 fails to shutdown correctly with immediate reboot
Product: Platform Specific/Hardware
Reporter: Mike Cloaked (mike.cloaked)
Component: x86-64
Assignee: EFI Virtual User (efi)
Status: RESOLVED CODE_FIX
Severity: high
CC: chukhanhhoang97, dominik.gierlach, factorio, jan.steffens, lithium
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.11
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: Journal log for one 5.11 build to the point where the poweroff command was given
Journal log for a second build of kernel 5.11 also to the point of issuing poweroff
dmidecode output
ver_linux for the current kernel 5.11 -ignore or delete this attachment
ver_linux output
Journal including log for when the laptop was suspended, then resumed
Journal log for kernel 5.11 containing the bad commit
Journal for custom kernel with the fix patches now working correctly
config for kernel with the patches that fix this issue

Description Mike Cloaked 2021-02-16 10:01:29 UTC
Created attachment 295313 [details]
Journal log for one 5.11 build to the point where the poweroff command was given

The HP Envy X360 15-en0000na laptop runs kernel 5.11 without problems until the machine is shut down. If shutdown is requested from within the Plasma desktop, or if the command "systemctl poweroff" or "systemctl halt" is issued from a VT console, the system shuts down but immediately reboots, and will not remain powered off or halted.

I was not sure which category to file this report under, but it is a serious regression. All 5.10.x kernels boot fine and shut down to poweroff correctly.

I will attach two journal logs, one for each of two builds of kernel 5.11, and the dmidecode output.
Comment 1 Mike Cloaked 2021-02-16 10:02:42 UTC
Created attachment 295315 [details]
Journal log for a second build of kernel 5.11 also to the point of issuing poweroff
Comment 2 Mike Cloaked 2021-02-16 10:03:21 UTC
Created attachment 295317 [details]
dmidecode output
Comment 3 Hoang Chu 2021-02-16 11:43:19 UTC
Hello, I am having the same issue with kernel 5.11. My laptop (Ryzen 5 4650U) shuts down properly with kernel 5.8. I noticed a warning when kernel 5.11 is booted:

kernel: amdgpu 0000:04:00.0: amdgpu: Unsupported power profile mode 0 on RENOIR

however, I am not sure whether this is the cause.
Comment 4 Mike Cloaked 2021-02-16 12:09:32 UTC
I don't see the same amdgpu message about an unsupported power profile in my logs, so maybe the cause is something else. For completeness, I am running Arch Linux and only see this problem on my AMD-based laptop; Intel laptops shut down fine for me.
Comment 5 Mike Cloaked 2021-02-16 12:11:49 UTC
Is there any workaround or kernel parameter at boot that might resolve this until there is a patch to fix the problem?
Comment 6 Hoang Chu 2021-02-16 12:20:22 UTC
Oh, I guess you overlooked them. They are at lines 1590 and 1616 in your log files 1 and 2 respectively. I am on Mint and have the same problem, so I suppose the problem is not distro-related.
Comment 7 Hoang Chu 2021-02-16 12:23:36 UTC
I have been googling for a while but haven't found any solution yet. If you find any, please let me know. Thanks!
Comment 8 Mike Cloaked 2021-02-16 14:05:27 UTC
Yes you are right - I missed the line in my own log:

Feb 16 09:35:36 ryzen1 kernel: amdgpu 0000:04:00.0: amdgpu: Unsupported power profile mode 0 on RENOIR

in my second journal file, which is exactly the same log line as in yours.
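For anyone else checking their own logs for this line, a minimal sketch of a filter is below. The function name is made up for illustration, and the thread does not establish that this warning is related to the shutdown problem; it only helps confirm whether the same message is present.

```shell
# Print every journal line carrying the amdgpu power profile warning,
# prefixed with its line number (as referenced in comment 6).
# find_power_profile_warning is a hypothetical helper name.
find_power_profile_warning() {
    grep -n 'amdgpu: Unsupported power profile mode'
}

# Usage against a saved journal file or the kernel log of the current boot:
#   find_power_profile_warning < journal.txt
#   journalctl -k -b | find_power_profile_warning
```

The function reads from standard input, so it works equally on an attached log file and on live journalctl output.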
Comment 9 Mike Cloaked 2021-02-16 15:33:44 UTC
I have also tried as root: "systemctl stop sddm" before systemctl poweroff, as well as "systemctl stop graphical.target" before systemctl poweroff, and the laptop still reboots after shutting down.
Comment 10 Mike Cloaked 2021-02-16 15:52:50 UTC
Created attachment 295321 [details]
ver_linux for the current kernel 5.11 -ignore or delete this attachment
Comment 11 Mike Cloaked 2021-02-16 15:56:13 UTC
Created attachment 295323 [details]
ver_linux output
Comment 12 Mike Cloaked 2021-02-16 16:00:55 UTC
Created attachment 295325 [details]
Journal including log for when the laptop was suspended, then resumed
Comment 13 Mike Cloaked 2021-02-16 17:52:33 UTC
The assignee should be changed to platform_x86_64@kernel-bugs.osdl.org
Comment 14 Dominikus Gierlach 2021-02-16 19:49:32 UTC
Same issue on my HP Envy 13 with a Ryzen 4700U.
I tested it back to 5.11-rc1.
Comment 15 Jan Steffens 2021-02-16 23:54:23 UTC
Can you bisect to find the first bad commit?
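A rough sketch of what such a bisection involves, assuming a clone of the mainline kernel tree and that 5.10 is good and 5.11 is bad, as reported above. The helper names and the "off"/"rebooted" labels are invented for this sketch; building, installing and booting each candidate kernel is left out because it depends on the distribution.

```shell
# Mark the known endpoints; v5.10 and v5.11 are the upstream release tags.
start_bisect() {
    git bisect start
    git bisect bad v5.11     # shutdown reboots immediately
    git bisect good v5.10    # shutdown powers off correctly
}

# Translate what was observed after "systemctl poweroff" on a test boot
# into the matching git bisect subcommand.
bisect_verdict() {
    case "$1" in
        off)      echo good ;;
        rebooted) echo bad ;;
        *)        echo skip ;;   # kernel could not be built or booted
    esac
}

# After each test boot, run in the kernel tree (not executed here):
#   git bisect "$(bisect_verdict rebooted)"
# When git prints "<sha> is the first bad commit", clean up with:
#   git bisect reset
```

Each round roughly halves the remaining commit range, so the span between two releases usually needs on the order of a dozen or so test boots.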
Comment 16 Mike Cloaked 2021-02-17 13:28:37 UTC
I am not set up to build and do bisecting unfortunately.
Comment 17 Mike Cloaked 2021-02-17 13:37:48 UTC
The kernel I am using is built by someone who maintains up-to-date custom kernels, and I download and install it from his repo. So I can run the latest kernel but don't do the builds myself.
Comment 18 Dominikus Gierlach 2021-02-17 20:17:27 UTC
I started bisecting, let's see how long it takes me :)
Comment 19 Mike Cloaked 2021-02-18 23:09:08 UTC
Bisection has been done between 5.10 and 5.11, and the first bad commit is shown below:

git bisect good
628c36d7b238e2d72158e8aba229ec79c69c157e is the first bad commit
commit 628c36d7b238e2d72158e8aba229ec79c69c157e
Author: Prike Liang <Prike.Liang@amd.com>
Date:   Wed Sep 9 14:40:24 2020 +0800

    drm/amdgpu: update amdgpu device suspend/resume sequence for s0i3 support

    - Need skip the RLC/CP/GFX disable for let GFXOFF enter during suspend period.
    - For s0i3 suspend only need suspend DCE and each IP interrupt.
    - Before VBIOS POSTed check and atom HW INT need set the GPU power status change
      to D0 in the resume period, otherwise the HW will be mess up and see the SDMA hang.
    - Need handle the GPU reset path during amdgpu device suspend.

    Signed-off-by: Prike Liang <Prike.Liang@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Acked-by: Huang Rui <ray.huang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

This commit is the same as identified by kernel bisection in the report at:

https://gitlab.freedesktop.org/drm/amd/-/issues/1499

However, the patch proposed there does not work in my case. A kernel built with the patch from the linked GitLab report on top of 5.11 does not fix the problem for my machine, which still fails to remain powered off and reboots itself.

The graphics device on my system is:

04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev c2)
        DeviceName: AMD Radeon(TM) Graphics
        Subsystem: Hewlett-Packard Company Device 876f
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
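Others comparing hardware can pull the same detail from their own machine. The sketch below assumes output in the format shown above, as produced by "lspci -k", and uses an invented helper name.

```shell
# Print the kernel driver bound to each VGA controller found in
# "lspci -k" output read from standard input.
# vga_driver is a hypothetical helper name.
vga_driver() {
    awk '
        # Device header lines start in column 0 with the PCI address.
        /^[0-9a-f]/ { in_vga = ($0 ~ /VGA compatible controller/) }
        # Indented detail lines belong to the most recent device header.
        in_vga && /Kernel driver in use:/ { print $NF }
    '
}

# Usage:
#   lspci -k | vga_driver
```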

Are there any diagnostics that I can run with the 5.11 kernel to provide more information and help find what needs to be changed for a fix?
Comment 20 Mike Cloaked 2021-02-18 23:11:15 UTC
Created attachment 295357 [details]
Journal log for kernel 5.11 containing the bad commit

The log for kernel 5.11 boot containing the bad commit identified
Comment 21 Dominikus Gierlach 2021-02-19 08:08:50 UTC
Huh.
The proposed patch fixes the issue for me on an HP Envy x360 13, which should be very similar to your device except for its screen size. It uses the same chipset, including the Radeon graphics controller.

Did you apply the patch on the bad commit or the 5.11 release?
Comment 22 Mike Cloaked 2021-02-19 11:34:15 UTC
I have now tested the new kernel v5.11-arch2 referenced at 
https://git.archlinux.org/linux.git/log/?h=v5.11-arch2

and downloaded from the arch linux [testing] repo.

The running kernel reports:

$ uname -r
5.11.0-arch2-1

and the patches included in this build fix the shutdown issue for me.

My machine now shuts down and powers off normally, and no longer reboots itself.
Comment 23 Mike Cloaked 2021-02-19 15:24:44 UTC
Created attachment 295363 [details]
Journal for custom kernel with the fix patches now working correctly
Comment 24 Mike Cloaked 2021-02-19 15:25:24 UTC
Created attachment 295365 [details]
config for kernel with the patches that fix this issue
Comment 25 factorio 2021-02-22 04:05:16 UTC
I have experienced the same issue on Renoir. 

While the patches included in 5.11.0-arch2-1 fix the poweroff-on-shutdown issue, there is an additional issue where the computer cannot power off on hibernation.
Comment 26 Mike Cloaked 2021-03-05 12:38:43 UTC
This is now resolved in kernel 5.11.3 for me.
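To check whether a machine is already on an upstream release at or beyond 5.11.3, a small comparison like the following may help. It is only a sketch: the function name is invented, it assumes GNU "sort -V" is available, and it drops any distribution suffix before comparing. It cannot detect distribution kernels that carry the fix as backported patches, such as the 5.11.0-arch2-1 build mentioned earlier.

```shell
# Succeed if the given kernel release is at least the required version.
# kernel_at_least is a hypothetical helper name.
kernel_at_least() {
    running=${1%%-*}     # drop distro suffix, e.g. 5.11.3-arch1-1 -> 5.11.3
    required=$2
    # Version-sort both values; the required one must not be the larger.
    lowest=$(printf '%s\n%s\n' "$running" "$required" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Usage:
#   kernel_at_least "$(uname -r)" 5.11.3 && echo "at or beyond 5.11.3"
```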
Comment 27 mskmsk 2021-03-06 13:47:47 UTC
I solved this by adding amdgpu to the MODULES array in /etc/mkinitcpio.conf, e.g. MODULES=(amdgpu), and regenerating the initramfs.
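As a sketch of that workaround on a mkinitcpio-based system such as Arch Linux: the helper below only checks whether an uncommented MODULES line already lists amdgpu. Its name is invented for illustration, and whether early loading helps on a given machine is not confirmed by this thread.

```shell
# Succeed if the given mkinitcpio config has an uncommented MODULES=(...)
# line that lists amdgpu. has_early_amdgpu is a hypothetical helper name.
has_early_amdgpu() {
    grep -Eq '^[[:space:]]*MODULES=\(([^)#]*[[:space:]])?amdgpu([[:space:]][^)]*)?\)' "$1"
}

# Usage (not executed here):
#   has_early_amdgpu /etc/mkinitcpio.conf || echo "add amdgpu to MODULES=(...)"
# After editing the file, regenerate the initramfs for all presets:
#   sudo mkinitcpio -P
```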
Comment 28 Mike Cloaked 2021-03-06 17:22:13 UTC
I already had that in my config file, which is perhaps why I didn't see the same behaviour as you did, but the original problem is resolved in 5.11.3.