Bug 212293
| Summary: | [amdgpu] divide error: 0000 on resume from S3 | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Sefa Eyeoglu |
| Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | normal | CC: | alexdeucher |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 5.11.6 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | kernel log since resume; git bisect log | | |
ADDITIONAL SYSTEM INFO

OS: Arch Linux (with testing repos)
Kernels with this issue: 5.11.6.arch1, 5.11.6.zen1, 5.12rc2 (built from Arch Linux User Repository)
Kernels without this issue: 5.10.23-1-lts

This took some time, as I apparently went down the wrong path a few times. Anyway, I bisected between tags v5.10 (good) and v5.11 (bad), looking only at the path "drivers/gpu/drm/amd". I ended up at commit 12f4849a1cfd69f3c37cca042f2e9c512f923741 by Simon Ser (emersion). I will do some debugging myself to confirm that it is the real culprit, but that change might very well be it.

Created attachment 295887
git bisect log

I was unable to add Simon Ser to CC.

Okay, I tried to debug it by printing.

    diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
    index 573cf17262da..8e6b890ad611 100644
    --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
    +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
    @@ -9271,6 +9271,8 @@ static int dm_check_crtc_cursor(struct drm_atomic_state *state,
                     return 0;
             }
     
    +        printk("SCRUMPLEX_DEBUG %d %d %d %d", new_cursor_state->src_w, new_cursor_state->src_h, new_primary_state->src_w, new_primary_state->src_h);
    +
             cursor_scale_w = new_cursor_state->crtc_w * 1000 /
                              (new_cursor_state->src_w >> 16);
             cursor_scale_h = new_cursor_state->crtc_h * 1000 /
    -- 
    2.31.0

This adds my very professional printk, which prints every value that is later used as a divisor. While reproducing the issue I got the following output:

    [ 89.850437] SCRUMPLEX_DEBUG 8388608 8388608 0 0

So some weird state is causing the src_w and src_h values of "new_primary_state" to be 0. That would explain the divide error. I don't know enough about drm_plane_state and drm_atomic_get_new_plane_state to say why this happens, but as with most issues of this kind, a simple condition check beforehand would solve it.

I submitted a patch here: https://lists.freedesktop.org/archives/amd-gfx/2021-March/060754.html

Fixed in 5.11 and 5.12
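To make the arithmetic behind the debug output concrete, here is a minimal userspace sketch. It is not the patch that was submitted and not the driver code: `struct plane_rect`, `fixed16_to_pixels` and `plane_scale` are hypothetical names chosen for illustration. It only assumes what the diff shows (the scale is computed as `crtc_w * 1000 / (src_w >> 16)`) and that `src_w`/`src_h` are 16.16 fixed-point values, so 8388608 corresponds to 128 pixels and 0 stays 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few plane state fields involved. */
struct plane_rect {
    uint32_t crtc_w, crtc_h; /* on-screen size in whole pixels */
    uint32_t src_w, src_h;   /* source size in 16.16 fixed point */
};

/* Drop the 16 fractional bits, as the `>> 16` in the diff does. */
static uint32_t fixed16_to_pixels(uint32_t v)
{
    return v >> 16;
}

/*
 * Compute the scale factors in thousandths.
 * Returns false, leaving the outputs untouched, when a source
 * dimension is zero, instead of dividing by it.
 */
static bool plane_scale(const struct plane_rect *p, int *scale_w, int *scale_h)
{
    uint32_t w = fixed16_to_pixels(p->src_w);
    uint32_t h = fixed16_to_pixels(p->src_h);

    if (w == 0 || h == 0)
        return false; /* the "simple condition check beforehand" */

    *scale_w = (int)(p->crtc_w * 1000 / w);
    *scale_h = (int)(p->crtc_h * 1000 / h);
    return true;
}
```

With the values from the log line, the cursor plane (8388608, 8388608) converts to 128 x 128 pixels and scales normally, while the primary plane (0, 0) converts to 0 x 0 and is rejected before any division takes place.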
Created attachment 295869
kernel log since resume

My system experiences a kernel panic, coming from amdgpu, when resuming from S3. The GPU has to be in a specific state for this to happen: mainly when my desktop environment turns off the screens after some inactivity and the system is subsequently suspended. This issue only occurs with kernel versions 5.11.x. I tested KDE Plasma / KWin on both Xorg and Wayland and could only reproduce it on Wayland (Xorg seems to work fine).

REPRODUCTION

1. Start KDE Plasma / KWin on Wayland
2. Set Screen Energy Saving "Switch off after" to a low value like 1min
3. Wait until Plasma has turned off screens
4. Suspend the system (via SSH for example)
5. Try to wake from sleep

SYSTEM INFO

CPU: AMD Ryzen 9 3900X
Mainboard: ASUS ROG STRIX B450-F GAMING II
GPU: GIGABYTE Radeon RX VEGA 56 GAMING OC 8G

ATTACHMENTS

I attached the kernel panic I could capture via ttyS0.