Bug 74751
| Summary: | resume from suspend broken with 3.15-rc1 and rc2 kernels | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Tasev Nikola (tasev.stefanoska) |
| Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | normal | CC: | daniel, montonen.niko, wshuman3 |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.15-rc2 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Attachments: | bisect.log, dmesg, lspci grep VGA, dmidecode, dmesg working kernel after suspend/resume, dmesg broken kernel before suspend/resume, dmesg from working kernel, dmesg from broken kernel, commit 25f397a429dfa43f22c278d0119a60a343aa568f from gitk, dmesg with devices for sys-power-pm-test, resume fbcon later | | |
Description
Tasev Nikola
2014-04-24 16:32:38 UTC
Created attachment 133621 [details]
dmesg
Created attachment 133631 [details]
lspci grep VGA
Created attachment 133641 [details]
dmidecode
Can you please boot with drm.debug=0xe on broken kernels and on working kernels, do a suspend/resume and then attach dmesg for each? Please make sure early boot messages are not cut off, increasing dmesg logsize with log_buf_len if needed.

Hm, 25f397a429dfa43f22c was already merged into 3.11, but you say that 3.14 works fine. Is this really the right first bad commit git bisect found? I'm confused ...

Hi Daniel,
First, sorry if I confused you, I'm just an average user. But yes, this is really the first bad commit git bisect found. I saw that it is just a simple one-line patch, but it probably masked another problem elsewhere, because the break in the patch is just before the dpms screen code. And yes, moving the break 3 lines up, to where it was before the patch, and recompiling the kernel fixes the problem for now. I don't know where the problem could come from. And yes, all the kernels before 3.15-rc1 work without a problem.
With the broken 3.15-rc2 kernel (not patched), my computer froze immediately after resume. I only have a black screen, I can't log into a console and I must shut down the computer by pressing the power button for 10 sec. I attached the dmesg from before suspend/resume.
For the working 3.15-rc2 with the patch, I attached the dmesg with drm.debug=0xe after suspend/resume.

Created attachment 133791 [details]
dmesg working kernel after suspend/resume
Created attachment 133801 [details]
dmesg broken kernel before suspend/resume
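
Note on the log request above: before attaching a dmesg it can help to check that the kernel was really booted with drm.debug=0xe and that the early boot messages are still in the log. The following is only a minimal sketch; the function names are hypothetical, and looking for the "Linux version" banner is a heuristic assumption for "early boot messages are not cut off", not something prescribed in this report.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from this bug report).

# Succeeds if the given kernel command line string contains drm.debug=0xe.
has_drm_debug() {
    case " $1 " in
        *" drm.debug=0xe "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Heuristic (assumption): a saved dmesg that still contains the
# "Linux version" banner has most likely not lost its early boot messages.
log_has_early_boot() {
    grep -q 'Linux version' "$1"
}
```

Possible usage: `has_drm_debug "$(cat /proc/cmdline)" || echo "drm.debug=0xe missing"`, then `dmesg > dmesg.txt` and `log_has_early_boot dmesg.txt || echo "log truncated, try a larger log_buf_len"`.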
Hi, I just tried now 3.15-rc3, the bug is still there.

I have a Lenovo ThinkPad Edge E325 with the upgraded version of Tasev's APU, the E450, with the HD6320 Wrestler graphics chipset, and I'm suffering from an issue that would appear to be the same as this. I'm currently running 3.14.1 on the machine, and it works just fine, but 3.15-rc1 and 3.15-rc3 both fail to resume from suspend (I haven't tested rc2). The machine hangs completely (network interfaces do not resume etc.), so I'm unable to get dmesg after suspend.
I feel it's worth noting that the machine also completely ignores the lid closing with 3.15-rc3, and I believe there are lots of ACPI changes in 3.15, which would explain a lot. Are there notes somewhere on how to get some useful debug info for situations like this?

Created attachment 134161 [details]
dmesg from working kernel
Created attachment 134171 [details]
dmesg from broken kernel
Now I'm even more confused, since 25f397a429dfa is a lot more than a one-line patch. And it _really_ is included in 3.14 already, so git bisect can't possibly list that one as the offending commit for a post-3.14 regression, the tool doesn't work like that. Can you please double-check the sha1 and perhaps cite the full commit message + patch to make sure we're talking about the same?

Hi,
Here is the patch from the commit that I found with git bisect. It's a copy-paste from gitk for the commit 25f397a429dfa43f22c278d0119a60a343aa568f

---------------------- drivers/gpu/drm/drm_crtc_helper.c ----------------------
index c0f2d62..8108db9 100644
@@ -695,12 +695,13 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set)
 			if (new_encoder == NULL)
 				/* don't break so fail path works correct */
 				fail = 1;
-			break;

 			if (connector->dpms != DRM_MODE_DPMS_ON) {
 				DRM_DEBUG_KMS("connector dpms not on, full mode switch\n");
 				mode_changed = true;
 			}
+
+			break;
 		}
 	}

I just moved the break back to where it was before the patch, and after recompiling 3.15-rc2 suspend/resume works again. But like I said, I'm just an average user, so sorry if I did something wrong.
I attached a full copy-paste from gitk for the commit 25f397a429dfa43f22c278d0119a60a343aa568f

Created attachment 134191 [details]
commit 25f397a429dfa43f22c278d0119a60a343aa568f from gitk
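
Note on double-checking the sha1: when the output of the git bisect commands has been saved to a file, the hash can be read straight from the line git prints at the end of a bisect, `<sha1> is the first bad commit`, instead of being copied from a commit message. The following is a minimal sketch under that assumption; the function name is hypothetical and the log path is passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not from this bug report).
# Prints the 40-character hash from the "<sha1> is the first bad commit"
# line in a saved bisect output file. If the file contains several such
# lines (the log was appended to across runs), the last one is printed.
first_bad_commit() {
    sed -n 's/^\([0-9a-f]\{40\}\) is the first bad commit$/\1/p' "$1" | tail -n 1
}
```

Possible usage: `git show "$(first_bad_commit /root/bisect.log)"` prints the full commit message and patch of the commit bisect actually landed on, which can then be compared with any other commit that message refers to.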
For bisecting I did this:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-git
git bisect start | tee -a /root/bisect.log
git bisect bad v3.15-rc1 | tee -a /root/bisect.log
git bisect good v3.14 | tee -a /root/bisect.log

Then I compiled the kernel each time with

CONCURRENCY_LEVEL=4 make-kpkg --initrd kernel_image modules_image

After testing the kernel: git bisect good or git bisect bad, then make clean and build again, test, and so on.

Hi, I just tested 3.15-rc4 today and the bug is still there. Is there something that I could do/test to help debug this?

Hi, just tested 3.15-rc5 now, still not working.

try the patch mentioned in https://bugzilla.kernel.org/show_bug.cgi?id=75651

I tried the patch just in case but it didn't help. Thank you anyway.

Ok, the offending commit is actually 177cf92de4aa97ec1435987e91696ed8b5023130, at least that one matches the diff of your change. It references the other commit, but that's not the commit itself.
For debugging it might be useful to do the suspend partially. You can disable certain parts of suspend (it will immediately resume if you do this)

# echo <mode> > /sys/power/pm_test

See

# cat /sys/power/pm_test

for a list of all possible values. If you manage to reproduce the bug with this please attach the drm.debug=0xe dmesg.

Hi Daniel,
I didn't notice your last reply until today, when I was testing the 3.15-rc6 kernel to report that it is still not working.
cat /sys/power/pm_test gives me:

[none] core processors platform devices freezer

So I echoed the different modes to /sys/power/pm_test one by one and ran pm-suspend each time, but I could not reproduce the bug; the computer resumes normally after 3-4 seconds. If you need the dmesg after each suspend/resume I can send it to you, just tell me.
Just for info, I also tested the rc6 kernel with the patch in comment 14 reverted, and it works OK after resume.
Hi,
I tried different values for /sys/power/pm_test again with the 3.15-rc7 kernel, and I noticed that with devices selected in /sys/power/pm_test it took a long time to resume (12-15 seconds). Looking into dmesg I can see GPU lockup and uvd errors (at the 363.74 line). I don't know if this is related to my resume problem. Dmesg is attached.

Created attachment 137481 [details]
dmesg with devices for sys-power-pm-test
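
Note on the pm_test runs above: going through the modes one by one can be scripted. The following is a minimal sketch, not a tested procedure from this report; the function names, the `PM_TEST_FILE` variable and the dmesg file names are hypothetical, and it assumes pm-suspend (as used in the comments above) is installed and the script runs as root.

```shell
#!/bin/sh
# Hypothetical sketch (names are illustrative, not from this bug report).
# PM_TEST_FILE can be overridden, e.g. to try the parsing on a copy.
PM_TEST_FILE=${PM_TEST_FILE:-/sys/power/pm_test}

# Print the available modes one per line, without the brackets that
# mark the currently selected mode (e.g. "[none]" becomes "none").
pm_test_modes() {
    tr -s ' ' '\n' < "$PM_TEST_FILE" | tr -d '[]' | grep -v '^$'
}

# Run one partial suspend per mode and save a dmesg after each.
# Must be called explicitly, as root; nothing happens on sourcing.
pm_test_cycle() {
    for mode in $(pm_test_modes); do
        [ "$mode" = none ] && continue
        echo "$mode" > "$PM_TEST_FILE"
        pm-suspend
        dmesg > "dmesg-pm_test-$mode.log"
    done
    # Restore normal suspend behaviour.
    echo none > "$PM_TEST_FILE"
}
```

Calling `pm_test_cycle` leaves one dmesg file per mode in the current directory, so a slow resume such as the 12-15 seconds seen with devices can be matched to the mode that produced it.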
Created attachment 137721 [details]
resume fbcon later
Hm, somehow forgotten to attach this yesterday. Please test this patch instead of any reverts.
(In reply to Daniel Vetter from comment #25)
> Created attachment 137721 [details]
> resume fbcon later
>
> Hm, somehow forgotten to attach this yesterday. Please test this patch
> instead of any reverts.

The patch works fine. Thank you.

Patch is merged upstream, thanks for the report and testing.