Bug 206225
Summary: | nouveau: Screen distortion and lockup on resume | ||
---|---|---|---|
Product: | Drivers | Reporter: | Christoph Marz (derchiller-foren) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | RESOLVED DOCUMENTED | ||
Severity: | high | CC: | imirkin, lukasz.wojnilowicz |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 5.4.12 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
5.3.9 nouveau resume bug
5.3.9 nouveau resume ok
5.4.12: Syslog excerpt: Resume after hibernation
5.4.12: Syslog excerpt: Resume after suspend
5.4.11: Syslog excerpt: Resume after hibernation: No error |
Description
Christoph Marz
2020-01-16 16:03:47 UTC
Created attachment 286845
5.3.9 nouveau resume bug
dmesg output after resume from hibernation before installing 'firmware-misc-nonfree'

Created attachment 286847
5.3.9 nouveau resume ok
dmesg output after resume from hibernation right after installing 'firmware-misc-nonfree'

Created attachment 286849
5.4.12: Syslog excerpt: Resume after hibernation
nouveau error messages and call trace; I was able to switch to a VT

Created attachment 286851
5.4.12: Syslog excerpt: Resume after suspend
nouveau error messages, no call trace; I was NOT able to switch to a VT

Created attachment 286875
5.4.11: Syslog excerpt: Resume after hibernation: No error

I see you have nouveau.config=PCRYPT=0 in your kernel config. Why did you add this -- was there some kind of issue with the engine? Did someone in #nouveau tell you to do it to help some issue? It's normally used for copy acceleration on G96 (which would, in turn, be used to copy off any vram data to ram on suspend).

The reason I ask is that starting with kernel 4.3, that will no longer have the effect of disabling PCRYPT. The new config to achieve that would be nouveau.config=cipher=0.

Note that for G96, I don't think anything in firmware-misc-nonfree would affect it either way.

(In reply to Ilia Mirkin from comment #6)
> I see you have nouveau.config=PCRYPT=0 in your kernel config. Why did you
> add this -- was there some kind of issue with the engine? Did someone in
> #nouveau tell you to do it to help some issue?

Hello Ilia,

I had found a bug report (https://bugs.freedesktop.org/show_bug.cgi?id=58378) dealing with a similar issue. There you suggested trying that option (https://bugs.freedesktop.org/show_bug.cgi?id=58378#c46), and it seemingly solved the issue, so I gave it a try, but removed it after I noticed that it had no effect at all.

> It's normally used for copy
> acceleration on G96 (which would, in turn, be used to copy off any vram data
> to ram on suspend).
>
> The reason I ask is that starting with kernel 4.3, that will no longer have
> the effect of disabling PCRYPT. The new config to achieve that would be
> nouveau.config=cipher=0.

Ok, thanks for the clarification. Copy acceleration sounds good; is there any downside?

> Note that for G96, I don't think anything in firmware-misc-nonfree would
> affect it either way.

I will uninstall that package and report back.

BTW: No problems with 5.4.13 so far.

Well, the problem the other users were having is that their GPUs were actually missing the crypt engine entirely, and we were not properly reading the capabilities bits that indicated this. Trying to use the crypt engine when it's not actually there has some obvious downsides :) But I don't see an indication that this would be the case on your setup. (First of all, we now respect the capability bit, and secondly, you don't have any mmio read/write errors in that range.)

Well, then this might sound strange: I purged firmware-misc-nonfree, rebooted, sent the system to sleep and resumed, and the distortion was back. Instead of reinstalling it, I set nouveau.config=cipher=0 and tested again, and everything is fine. Furthermore, now I can use the firmware for Video Acceleration. Before, I always had distortion after resume with that firmware installed. So everything seems fine now, but is there any downside in disabling the crypt engine?

Sounds like there are things going on that we don't quite understand then... maybe Ben can weigh in. If the cipher method is disabled (aka CRYPT), it will fall back to M2MF for copy acceleration. In experiments, this is slightly slower but still accelerated.

Ok. However, thank you for telling me the right option for disabling the crypt engine on current kernels. If you need any logs or want me to do certain tests, let me know.

Follow-up: After a dist-upgrade, the error returned. I deleted the video acceleration firmware and it was ok again. When I installed 5.4.14, there were warnings about possibly missing firmware (the nvidia files from firmware-misc-nonfree), so I reinstalled that package and updated the initramfs (I think I missed that step after purging the package). Furthermore, I removed nouveau.config=cipher=0 since that doesn't seem to be related to the error.

To conclude: When it works, I do a dist-upgrade one day and the error returns. Doing a dist-upgrade a few days after makes it work again. The same holds for kernel upgrades.

It is 2024 and https://nouveau.freedesktop.org/KernelModuleParameters.html (updated in 2024) still recommends using nouveau.config=PCRYPT=0 instead of nouveau.config=cipher=0. I believe the correct names can be taken from the directory names at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/nouveau/nvkm/engine.
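
For anyone checking their own setup against the rename discussed in this thread (PCRYPT no longer has an effect from kernel 4.3 on, cipher is the replacement), here is a minimal Python sketch. It is not part of nouveau or any official tool. The function names are made up for illustration, the rename table contains only the one mapping mentioned in this report, and the comma-separated `key=value` layout inside `nouveau.config=` is an assumption. The last helper follows the suggestion above about engine directory names and assumes a local kernel source checkout.

```python
import os

# Legacy -> current option names. Only PCRYPT -> cipher comes from this
# bug report; anything else would have to be verified separately.
RENAMED = {"PCRYPT": "cipher"}
PREFIX = "nouveau.config="


def parse_nouveau_config(cmdline):
    """Collect key=value pairs from every nouveau.config= argument."""
    options = {}
    for arg in cmdline.split():
        if not arg.startswith(PREFIX):
            continue
        # Assumption: several options in one argument are comma-separated.
        for pair in arg[len(PREFIX):].split(","):
            key, _, value = pair.partition("=")
            if key:
                options[key] = value
    return options


def legacy_warnings(options):
    """Return one message per option that still uses a legacy name."""
    return [
        "{0}={2} uses a legacy name; try {1}={2}".format(old, new, options[old])
        for old, new in RENAMED.items()
        if old in options
    ]


def engine_dir_names(kernel_tree):
    """List directory names under nvkm/engine in a local kernel checkout."""
    engine_dir = os.path.join(
        kernel_tree, "drivers", "gpu", "drm", "nouveau", "nvkm", "engine"
    )
    return sorted(e.name for e in os.scandir(engine_dir) if e.is_dir())
```

To use it, pass the contents of /proc/cmdline to `parse_nouveau_config` and print whatever `legacy_warnings` returns. Whether a given directory name is accepted as a config option on a particular kernel still needs to be confirmed by testing.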