I just upgraded from 5.18 to 6.0.0-rc1.
Suspending seems a little slow,
and after waking up there is only a black screen.
Everything worked fine before.
amd_gpio AMDI0030:00: failed to get iomux index
Can you please bisect (https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html)?
Also, please share a full dmesg from the previous working kernel and full dmesg from broken kernel.
Created attachment 301657 [details]
Sorry, I don't have a dmesg from the bad 6.0 kernel,
but I have some other information:
Kernel version : 5.19.3-gs - ACPI version : 20220331
Freq. scaling driver : amd-pstate
Kernel version : 6.0.0-rc2-gs - ACPI version : 20220331
Freq. scaling driver : acpi-cpufreq
On 6.0.0 it seems `amd-pstate` is not loaded and `acpi-cpufreq` is used instead.
That may be a completely separate bug. Maybe you can try to explicitly set which frequency scaling driver is used in both cases to isolate if that's the cause of your issue.
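To confirm which frequency scaling driver a given boot actually uses, the value can be read from sysfs. This is a minimal sketch, assuming the standard cpufreq sysfs layout; the function name `scaling_driver` and its optional root argument are hypothetical helpers, not part of any tool mentioned in this report.

```shell
#!/bin/sh
# Print the cpufreq scaling driver in use for cpu0.
# The optional first argument (a sysfs root, default /sys) is only a
# hypothetical convenience so the function can be pointed at a copy.
scaling_driver() {
    if [ -r "${1:-/sys}/devices/system/cpu/cpu0/cpufreq/scaling_driver" ]; then
        cat "${1:-/sys}/devices/system/cpu/cpu0/cpufreq/scaling_driver"
    else
        # No cpufreq driver registered, or sysfs is not available.
        echo "unknown"
        return 1
    fi
}

scaling_driver || true
```

Recording this value next to each good/bad verdict makes it easier to see whether a result depends on the driver.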
I added `amd_pstate.shared_mem=1` to the kernel command line ('Command line: BOOT_IMAGE=/boot/vmlinuz-xxx root=/dev/nvme0n1p4 amd_pstate.shared_mem=1'), which I think enables `amd-pstate` on all kernels.
When 6.0.0-rc3 comes out, I will try again and upload the dmesg here.
Once you test 6.0-rc3 if it's still failing, please perform a bisect to find the root cause.
I ran a git bisect:
[cb6b81b21bd9cf09d72b7fe711be1b55001eb166] Merge tag 'drm-misc-next-fixes-2022-07-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
# git bisect bad
Bisecting: 5 revisions left to test after this (roughly 3 steps)
[676ad8e997036e2f815c293b76c356fb7cc97a08] drm: rcar-du: Lift z-pos restriction on primary plane for Gen3
# git bisect good
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[c96cfaf8fc02d4bb70727dfa7ce7841a3cff9be2] drm/nouveau: Don't pm_runtime_put_sync(), only pm_runtime_put_autosuspend()
This one used acpi-cpufreq, but it woke up fine.
So I don't know whether to mark it good or bad, because
if I say bad, it actually woke up fine;
if I say good, it used acpi-cpufreq instead of the expected amd-pstate.
But there are only 2 revisions left, so can I leave it to you, the developers?
By the way, I started from tags/v6.0-rc1 (bad) and tags/v5.19 (good), with 7097 revisions.
I hope there is not more than one regression in that range, and that the resulting range is meaningful.
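As a rough sanity check on the size of that range: a bisect halves the remaining revisions at each step, so about ceil(log2(N)) steps are needed for N revisions. The sketch below computes that in plain shell; the function name `bisect_steps` is made up for illustration, and git's own "roughly N steps" estimate may differ by one.

```shell
#!/bin/sh
# Estimate the number of bisect steps for a range of N revisions:
# the smallest k such that 2^k >= N, i.e. ceil(log2(N)).
# The function body runs in a subshell so its variables stay private.
bisect_steps() (
    revisions=$1
    steps=0
    span=1
    while [ "$span" -lt "$revisions" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
)

bisect_steps 7097
```

For the 7097 revisions between v5.19 and v6.0-rc1 this prints 13.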
Created attachment 301670 [details]
Another dmesg from a bad boot.
> So I don't know whether to mark it good or bad, because
> if I say bad, it actually woke up fine;
> if I say good, it used acpi-cpufreq instead of the expected amd-pstate.
Since you know the frequency scaling driver changes partway through the range, it's better to force all the tests to use acpi-cpufreq. That removes a source of variability from the test results.
> But there are only 2 revisions left, so can I leave it to you, the developers?
I don't know what is actually left; I think I'd need to see the whole log to see what happened. With the above guidance, can you narrow it down to a specific commit?
Shot in the dark until we know the causing commit - a6250bdb6c4677ee77d699b338e077b900f94c0c in 6.0-rc2 and latest 5.19.y helps some other people with VT freezes.
Wow, maybe my result is not valid, because I really got lost in the bisect game.
[/home/neoe/oss/linux/linux] git bisect good
cb6b81b21bd9cf09d72b7fe711be1b55001eb166 is the first bad commit
Merge: 3cfb5bc94fab 6f2c8d5f1659
Author: Dave Airlie <email@example.com>
Date: Fri Jul 22 13:43:46 2022 +1000
Merge tag 'drm-misc-next-fixes-2022-07-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
Short summary of fixes pull:
* amdgpu: Fix for drm buddy memory corruption
* nouveau: PM fixes; DP fixes
Signed-off-by: Dave Airlie <firstname.lastname@example.org>
From: Thomas Zimmermann <email@example.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 16 ++++++++--------
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 2 +-
drivers/gpu/drm/nouveau/nouveau_connector.c | 8 +++-----
drivers/gpu/drm/nouveau/nouveau_display.c | 4 ++--
drivers/gpu/drm/nouveau/nouveau_fbcon.c | 2 +-
5 files changed, 15 insertions(+), 17 deletions(-)
[/home/neoe/oss/linux/linux] git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [1b54a0121dba12af268fb75c413feabdb9f573d4] drm/amd/display: Reduce stack size in the mode support function
git bisect good 1b54a0121dba12af268fb75c413feabdb9f573d4
# status: waiting for bad commit, 1 good commit known
# bad: [2bc7ea71a73747a77e7f83bc085b0d2393235410] Merge tag 'topic/nouveau-misc-2022-07-27' of git://anongit.freedesktop.org/drm/drm into drm-next
git bisect bad 2bc7ea71a73747a77e7f83bc085b0d2393235410
# good: [c877bed82e1017c102c137d432933ccbba92c119] drm/i915/gt: Only kick the signal worker if there's been an update
git bisect good c877bed82e1017c102c137d432933ccbba92c119
# bad: [ee8b1ef9a6b089abf7a9c7d094b6e93fa05f15b9] Merge tag 'amd-drm-next-5.20-2022-07-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
git bisect bad ee8b1ef9a6b089abf7a9c7d094b6e93fa05f15b9
# bad: [cb6b81b21bd9cf09d72b7fe711be1b55001eb166] Merge tag 'drm-misc-next-fixes-2022-07-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
git bisect bad cb6b81b21bd9cf09d72b7fe711be1b55001eb166
# good: [676ad8e997036e2f815c293b76c356fb7cc97a08] drm: rcar-du: Lift z-pos restriction on primary plane for Gen3
git bisect good 676ad8e997036e2f815c293b76c356fb7cc97a08
# good: [c96cfaf8fc02d4bb70727dfa7ce7841a3cff9be2] drm/nouveau: Don't pm_runtime_put_sync(), only pm_runtime_put_autosuspend()
git bisect good c96cfaf8fc02d4bb70727dfa7ce7841a3cff9be2
# good: [6f2c8d5f16594a13295d153245e0bb8166db7ac9] drm/amdgpu: Fix for drm buddy memory corruption
git bisect good 6f2c8d5f16594a13295d153245e0bb8166db7ac9
# good: [3cfb5bc94fab39c456dccee75553f7f6c52ee7f7] Merge tag 'du-next-20220707' of git://linuxtv.org/pinchartl/media into drm-next
git bisect good 3cfb5bc94fab39c456dccee75553f7f6c52ee7f7
# first bad commit: [cb6b81b21bd9cf09d72b7fe711be1b55001eb166] Merge tag 'drm-misc-next-fixes-2022-07-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
It's unfortunate it ends on a merge commit but not unheard of.
If that's a true result, you should be able to add nomodeset which will turn off amdgpu and check whether you can suspend/resume. If so it does confirm this is still an amdgpu bug.
The real commit from that one should be 6f2c8d5f16594a13295d153245e0bb8166db7ac9, but you have that marked as good above. The only other stuff in that merge request is nouveau which shouldn't affect your system.
If nomodeset helps I think you should redo your bisect.
I added nomodeset to 6.0.0-rc2 and it still wakes up to a black screen.
That means this is more likely an ACPI bug than an amdgpu bug.
In that case I think you should redo the bisect with amdgpu blacklisted for the entire duration so that any instability in the middle of the release doesn't lead to a bad result.
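One way to do that, assuming amdgpu is built as a module rather than built into the kernel, is to boot every test kernel with `modprobe.blacklist=amdgpu` on the kernel command line and then confirm after boot that the module really is absent. The helper below is a hypothetical sketch of that check, not an established tool.

```shell
#!/bin/sh
# Succeed if the named module is listed in /proc/modules (i.e. loaded).
# The optional second argument (default /proc/modules) is a hypothetical
# convenience so the check can be run against a saved copy of that file.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded amdgpu; then
    echo "amdgpu is loaded: this boot is not a valid blacklisted test"
else
    echo "amdgpu is not loaded"
fi
```

Running the check before each suspend test avoids marking a commit good or bad on a boot where the blacklist silently did not apply.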
neoe, any news? Did you try what Mario suggested?
Sorry, not yet.
A compile takes 12 minutes, a bisect needs about 14 iterations, and it is not automated yet, so it is quite expensive for me.
I don't know whether other AMD Zen 2 systems have the same problem.
So if 6.0.0 is released and the problem is still there, I will write a script to automate the git bisect test, unless there is a better way.
I saw that 6.0.0-rc4 came out today, so I tested it.
"acpitool -e" shows "Freq. scaling driver : amd-pstate", which is better than rc2.
But I also noticed that the HDMI audio output is gone. All I can say is that there seem to be a lot of amdgpu changes, and I cannot follow the test process given the current state.
As a Linux user and lover, I have decided to stick with 5.19.y until 6.0.0 becomes stable.
Thank you all.
6.0.0-rc4 also wakes up to a black screen after suspending with 'acpitool -s'.
(In reply to neoe from comment #18)
> a bisect needs about 14 iterations
FWIW, it's likely just about 8 if it's really between v6.0-rc1..v6.0-rc2
This might be worth looking at: https://gitlab.freedesktop.org/drm/amd/-/issues/2164
I found that this bug is fixed in v6.0.9 (no idea what happened).