Bug 207561
| Summary: | DRM? broke for AMDGPU in 5.6.10 (worked in 5.6.6) | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Artem S. Tashkinov (aros) |
| Component: | Video(Other) | Assignee: | other_other |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | blocking | CC: | alexdeucher, torvalds |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 5.6.10 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Artem S. Tashkinov
2020-05-03 17:33:05 UTC
OK, both issues are gone with 5.6.6 compiled with GCC10. Looks like we have two regressions: Audio broke. DRM broke for my RX 5600 XT.

Looks like the same issue is discussed here: https://lkml.org/lkml/2020/4/10/545

Alex, am I correct that patch https://cgit.freedesktop.org/drm/drm-misc/commit/?h=drm-misc-fixes&id=8623b5255ae7ccaf276aac3920787bf575fa6b37 should fix my issue? Is this patch scheduled for 5.6.11?

Bisection?

The oops looks like __kthread_should_park() called with a NULL argument. That should have been fixed by commit 8623b5255ae7 ("drm/scheduler: fix drm_sched_get_cleanup_job") in mainline. I'm not sure why this started showing up in -stable.

Can you try to bisect the audio breakage, that seems to be something else.

You'd need to add that oneliner from commit 8623b5255ae7 to avoid the drm breakage.

(In reply to Linus Torvalds from comment #5)
> Can you try to bisect the audio breakage, that seems to be something else.
>
> You'd need to add that oneliner from commit 8623b5255ae7 to avoid the drm
> breakage.

So, after three compilations:

Kernel 5.6.6 compiled with GCC 10 (Fedora 32): all is fine.
Kernel 5.6.10 compiled with GCC 10 (Fedora 32): DRM bug/ALSA broken.
Kernel 5.6.10 compiled with GCC 9 (Fedora 31): all is fine.

Looks like GCC 10 generates invalid code once again, at least its Fedora 32 version. GCC 10 hasn't yet been formally released.

I'm now running 5.6.10 compiled with GCC 9 and everything is OK. No idea what to do next. Perhaps it's worth closing this bug report as RESOLVED/INVALID. And bug 207563 as well.

(In reply to Artem S. Tashkinov from comment #6)
>
> So, after three compilations:
>
> Kernel 5.6.6 compiled with GCC 10 (Fedora 32): all is fine.
> Kernel 5.6.10 compiled with GCC 10 (Fedora 32): DRM bug/ALSA broken.
> Kernel 5.6.10 compiled with GCC 9 (Fedora 31): all is fine.
>
> Looks like GCC 10 generates invalid code once again, at least its Fedora 32
> version. GCC 10 hasn't yet been formally released.

Uhhuh. Potential compilers bugs are not fun to chase.

It might be a real kernel bug that is just exposed by the compiler change, of course, but ...

Is there any obvious change in the ALSA output to give a hint of where the breakage might be?

(In reply to Linus Torvalds from comment #7)
> Uhhuh. Potential compilers bugs are not fun to chase.
>
> It might be a real kernel bug that is just exposed by the compiler change,
> of course, but ...
>
> Is there any obvious change in the ALSA output to give a hint of where the
> breakage might be?

But ... I'm now running 5.6.10 compiled with GCC 9.3 and everything works. :-)

There are ALSA related changes between 5.6.6 and 5.6.10 but I'm not sure they are relevant if the problem is down to the compiler. I'm thinking if GCC 10 miscompiles the kernel, it can break something so much as to cause breakage all over the place.

Really don't know what to do. I'm compiling the vanilla kernel using the default GCC 10 compiler in Fedora 32, so I guess you could take a look at that. I can attach my .config if that's of any help.

(In reply to Artem S. Tashkinov from comment #8)
>
> > Is there any obvious change in the ALSA output to give a hint of where the
> > breakage might be?
>
> But ... I'm now running 5.6.10 compiled with GCC 9.3 and everything works.
> :-)

I meant between the (working) gcc-9 and (broken) gcc-10 case.

Apply the drm one-liner fix to make it all past that one (assuming it does fix the gcc10 case for drm).

Linus

(In reply to Linus Torvalds from comment #9)
> (In reply to Artem S. Tashkinov from comment #8)
> >
> > > Is there any obvious change in the ALSA output to give a hint of where the
> > > breakage might be?
> >
> > But ... I'm now running 5.6.10 compiled with GCC 9.3 and everything works.
> > :-)
>
> I meant between the (working) gcc-9 and (broken) gcc-10 case.
>
> Apply the drm one-liner fix to make it all past that one (assuming it does
> fix the gcc10 case for drm).
>

With the applied one liner everything works as intended with GCC 10. I'm baffled.

(In reply to Artem S. Tashkinov from comment #10)
>
> With the applied one liner everything works as intended with GCC 10. I'm
> baffled.

Ok, then the sound problem was likely just due to the oops.

The oops might just have broken some kthread functionality that sound depended on or whatever.

I think you can close this, although you should probably make sure the stable people know about that oneliner fix.

(In reply to Linus Torvalds from comment #11)
> (In reply to Artem S. Tashkinov from comment #10)
> >
> > With the applied one liner everything works as intended with GCC 10. I'm
> > baffled.
>
> Ok, then the sound problem was likely just due to the oops.
>
> The oops might just have broken some kthread functionality that sound
> depended on or whatever.
>
> I think you can close this, although you should probably make sure the
> stable people know about that oneliner fix.

I'd like to hear from Alex Deucher as I'm not even sure why kernel 5.6.10/GCC93 without this patch works, but the same kernel compiled with GCC10 breaks.

I guess he can also propose this oneliner for stable.

(In reply to Artem S. Tashkinov from comment #12)
>
> I'd like to hear from Alex Deucher as I'm not even sure why kernel
> 5.6.10/GCC93 without this patch works, but the same kernel compiled with
> GCC10 breaks.

Well, it is touted as a fix for a race. So just timing differences may make the race happen or not. And a compiler change will obviously cause timing differences.

So that part I'm not surprised about, although getting Alex to say "yes, backport it", is likely a good idea anyway.

That said, bugzilla often gets a lot less attention than just emailing people. Hint hint.

Linus

Yes, backport it :)

(In reply to Alex Deucher from comment #14)
> Yes, backport it :)

Tested-by: Artem S. Tashkinov

Please submit for stable (5.6.11). Thank you!
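
For readers following the diagnosis in this thread, here is a minimal userspace sketch of the kind of race being discussed: a worker thread asks "should I park?" through a pointer that the creating thread only stores after the worker has already started. It assumes this is how the NULL argument to `__kthread_should_park()` came about; the thread itself does not spell out the mechanism. All names (`fake_task`, `fake_sched`, `should_park_via`, `run_worst_case`) are hypothetical stand-ins, not kernel APIs, and the bad interleaving is forced with `pthread_join()` so the result is deterministic rather than timing dependent.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-thread state (not the kernel's struct). */
struct fake_task {
    bool should_park;
};

/* Hypothetical stand-in for a scheduler that owns one worker thread. */
struct fake_sched {
    struct fake_task *thread; /* stored by the creator AFTER the worker starts */
    struct fake_task task;    /* the worker's own state */
    bool use_current;         /* true: worker checks its own state instead */
    bool null_deref;          /* set where a kernel would have oopsed */
};

/* Userspace imitation of "current": each thread's pointer to its own state. */
static _Thread_local struct fake_task *current_task;

/* Park check through an arbitrary pointer; records a NULL instead of crashing. */
static bool should_park_via(struct fake_task *t, bool *null_deref)
{
    if (t == NULL) {
        *null_deref = true; /* the kernel would dereference NULL here */
        return false;
    }
    return t->should_park;
}

static void *worker(void *arg)
{
    struct fake_sched *s = arg;

    current_task = &s->task;
    /* Racy variant reads s->thread; the other variant never depends on it. */
    struct fake_task *t = s->use_current ? current_task : s->thread;
    (void)should_park_via(t, &s->null_deref);
    return NULL;
}

/*
 * Forces the worst-case ordering: the worker runs to completion before the
 * creator stores the pointer. Returns 1 if the worker saw NULL, 0 if not,
 * and -1 if the thread could not be created.
 */
static int run_worst_case(bool use_current)
{
    struct fake_sched s = { .thread = NULL, .use_current = use_current };
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, &s) != 0)
        return -1;
    pthread_join(tid, NULL); /* worker has already done its check */
    s.thread = &s.task;      /* the creator's store arrives too late */
    return s.null_deref ? 1 : 0;
}
```

Under these assumptions `run_worst_case(false)` returns 1 (the worker saw NULL) and `run_worst_case(true)` returns 0, because the second variant never relies on the creator's store. In a real kernel the ordering is not forced, which fits the observation in the thread that a compiler change alone (GCC 9 vs GCC 10) can shift timing enough to make the race appear or disappear. Build with `-pthread`.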