Bug 13512

Summary: D43 on 2.6.30 doesn't suspend anymore
Product: Power Management Reporter: Daniel Smolik (marvin)
Component: Hibernation/Suspend Assignee: ykzhao (yakui.zhao)
Status: CLOSED UNREPRODUCIBLE    
Severity: normal CC: acpi-bugzilla, hyc, lenb, rjw, rui.zhang, yakui.zhao
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.30 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216, 13070    
Attachments: lspci -vvv output
dmidecode output
lspci -vvv on D810
dmidecode on D810
lspci -vvv HP dv5z
dmidecode HP dv5z
dmesg before sleep
dmesg after sleep
OOPS during suspend
OOPS decoded by ksymoops
Don't register the connector device under the drm_class
dmesg with 2.6.31rc3 before sleep
dmesg with 2.6.31rc3 after resume
dmesg with 2.6.31rc3 before second sleep, but after X server start/stop

Description Daniel Smolik 2009-06-11 20:12:58 UTC
On 2.6.30 suspend/resume stopped working. If I boot only to the console, the system suspends and wakes up, but it doesn't restore the console and the screen stays black.
When I am in X (intel driver) and do echo mem > /sys/power/state, the system starts suspending but freezes with a black screen. The power button doesn't flash.
On 2.6.29.3 everything works with the same kernel configuration.
Comment 1 Rafael J. Wysocki 2009-06-11 20:30:15 UTC
Does 2.6.30-rc1 also fail?
Comment 2 Daniel Smolik 2009-06-11 21:47:17 UTC
Yes, it is the same as 2.6.30. I will try 2.6.29.4.
Dan
Comment 3 Rafael J. Wysocki 2009-06-11 21:51:16 UTC
I'm afraid this is not going to help a lot.  Please check the kernel where the commit:

commit 2ed8d2b3a81bdbb0418301628ccdb008ac9f40b7
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Mon Mar 16 22:34:06 2009 +0100

    PM: Rework handling of interrupts during suspend-resume

is the head.
Comment 4 Daniel Smolik 2009-06-11 22:05:43 UTC
Sorry, I don't know how to do it.
Comment 5 Rafael J. Wysocki 2009-06-11 22:46:25 UTC
Please read http://linux.yyz.us/git-howto.html#download_first_time to learn how to clone Linus' kernel repository.  When you have the repository cloned, go to the directory containing it, do

$ git checkout -b test 2ed8d2b3a81bdbb0418301628ccdb008ac9f40b7

build the kernel and install it as usual.
Comment 6 Daniel Smolik 2009-06-12 08:23:58 UTC
Thanks for the advice. I am starting to work on it.
Comment 7 Daniel Smolik 2009-06-13 21:53:38 UTC
I built the kernel as you recommended, and when I try to suspend, the computer freezes.
The situation is the same as in 2.6.30.

Dan
Comment 8 Rafael J. Wysocki 2009-06-13 22:12:47 UTC
Thanks.

Now, can you please do the same as in comment #5, but for commit 0a0c5168df270a50e3518e4f12bddb31f8f5f38f (one commit before the one you have tested)?
Comment 9 Len Brown 2009-06-16 01:26:58 UTC
can you tell us more about what "D43" is?
eg. lspci and dmidecode
in case there is some common issue here with similar machines?
Comment 10 Daniel Smolik 2009-06-16 05:39:21 UTC
bugzilla-daemon@bugzilla.kernel.org wrote:
> --- Comment #9 from Len Brown <len.brown@intel.com>  2009-06-16 01:26:58 ---
> can you tell us more about what "D43" is?
> eg. lspci and dmidecode
> in case there is some common issue here with similar machines?
> 
Yes, this is a Dell D43 notebook. I will send lspci and dmidecode later.
Comment 11 Daniel Smolik 2009-06-16 19:20:50 UTC
Here are the dmidecode output and the lspci -vvv output.
Comment 12 Daniel Smolik 2009-06-16 19:22:02 UTC
Created attachment 21944 [details]
lspci -vvv output
Comment 13 Daniel Smolik 2009-06-16 19:22:33 UTC
Created attachment 21945 [details]
dmidecode output
Comment 14 Alan Crosswell 2009-06-17 20:44:25 UTC
Seeing what looks like the same bug on a Dell Latitude D810 running 2.6.29.4-167.fc11.i586.  Suspend works.  Upon resume (opening the cover), the screen is blank.  The keyboard is mostly non-responsive. FN-F2 (turn on/off the wifi & bluetooth radio) does cause the bluetooth indicator to toggle.  This used to work on a much older kernel (2.2.26?? on FC9).

lspci and dmidecode will follow.
Comment 15 Alan Crosswell 2009-06-17 20:45:21 UTC
Created attachment 21974 [details]
lspci -vvv on D810
Comment 16 Alan Crosswell 2009-06-17 20:45:58 UTC
Created attachment 21975 [details]
dmidecode on D810
Comment 17 Zhang Rui 2009-06-18 09:03:12 UTC
It would be great if you could run git bisect to find out which commit introduced this bug.
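
For readers unfamiliar with bisection, a minimal sketch of how such a run could be started follows. It assumes the 2.6.29 and 2.6.30 releases named in this report as the good and bad endpoints; the helper name bisect_start is hypothetical and is not part of git.

```shell
# Hypothetical helper (not part of git): begin a bisection between a
# known-good and a known-bad revision.
#   $1 = last revision where suspend worked (e.g. v2.6.29)
#   $2 = first revision where suspend fails (e.g. v2.6.30)
bisect_start() {
    git bisect start "$2" "$1"
}

# Assumed usage inside a clone of the kernel tree:
#   bisect_start v2.6.29 v2.6.30
# Then build and boot the revision git checks out, try to suspend, and report:
#   git bisect good    # suspend/resume worked
#   git bisect bad     # suspend hung or the screen stayed black
# Repeat until git prints "<sha> is the first bad commit", then clean up with:
#   git bisect reset
```

Each round needs a kernel build and a reboot, so the number of rounds (roughly the base-2 logarithm of the number of commits between the two endpoints) is what determines how long the search takes.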
Comment 18 Daniel Smolik 2009-06-18 21:02:47 UTC
Rafael, I did this:

git checkout -b test_pokus 0a0c5168df270a50e3518e4f12bddb31f8f5f38f
But I used the previously cloned tree and only executed this command. Is that right?
If yes, the computer still freezes as before.
Comment 19 Daniel Smolik 2009-06-19 11:19:26 UTC
Now I cloned a fresh git tree, ran
git checkout -b test 0a0c5168df270a50e3518e4f12bddb31f8f5f38f
and compiled the kernel. The situation is the same: the kernel freezes.
Comment 20 Alan Crosswell 2009-06-19 20:37:07 UTC
(In reply to comment #14)
> Seeing what looks like the same as this bug on Dell Latitude D810 running
> 2.6.29.4-167.fc11.i586.  Suspend works.  Upon resume (opening cover), screen
> is blank.  Keyboard is mostly non-responsive. FN-F2 (turn on/off wifi&blutooth
> radio does cause the bluetooth indicator to toggle).  This used to work on a
> much older kernel (2.2.26?? on FC9).
> 
> lspci and dmidecode will follow.

I had a brain fart.  The older kernel version was kernel-2.6.27.24-78.2.53.fc9.i686
Comment 21 Daniel Smolik 2009-06-28 12:54:13 UTC
2.6.31-rc1 is still buggy.
Comment 22 Howard Chu 2009-06-28 21:23:16 UTC
I have a similar problem with an HP dv5z; up to 2.6.29.4 it suspended OK, but on 2.6.30 and 2.6.31-rc1 it hangs during suspend - all I see is a blank screen with a blinking text cursor in the top left corner. One of the times this happened, pressing the power button caused it to finish suspending, and then the power button blinked like it usually does. But upon resume, while the power light came on steady, nothing else happened. I haven't been able to duplicate that situation; every other time it has just hung with the blinking cursor.

I also still have i8042.reset in my boot command line. On 2.6.29 and previous kernels this was needed otherwise my keyboard wouldn't come back. Should this still be required on newer kernels?
Comment 23 Howard Chu 2009-06-28 21:26:33 UTC
Created attachment 22133 [details]
lspci -vvv HP dv5z
Comment 24 Howard Chu 2009-06-28 21:27:04 UTC
Created attachment 22134 [details]
dmidecode HP dv5z
Comment 25 Rafael J. Wysocki 2009-06-29 23:19:25 UTC
On Monday 29 June 2009, Daniel Smolik wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.29 and 2.6.30.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.29 and 2.6.30.  Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=13512
> > Subject             : D43 on 2.6.30 doesn't suspend anymore
> > Submitter   : Daniel Smolik <marvin@mydatex.cz>
> > Date                : 2009-06-11 20:12 (18 days old)
> >
> >
> >   
> Yes, the problem still exists. I am now bisecting and I am close to finding
> the affected patch.
Comment 26 Daniel Smolik 2009-06-30 22:42:21 UTC
e70049b9e74267dd47e1ffa62302073487afcb48 is first bad commit

Bisecting points me to this commit, but I think I did something wrong. The bug seems to be in the i915 driver, because when I unload i915.ko the notebook suspends OK. But I don't think this commit affects i915.
Comment 27 Howard Chu 2009-07-05 04:41:28 UTC
Suspend is working for me again on 2.6.31-rc2.
Comment 28 ykzhao 2009-07-06 01:39:14 UTC
Hi, Daniel 
    Will you please also try the latest kernel (2.6.31-rc2) and see whether it also works for you?
    thanks.
Comment 29 Daniel Smolik 2009-07-06 14:29:59 UTC
Still the same. Unloading i915 solves the problem. Maybe it matters that I don't use the framebuffer console, only the text console.

Dan
Comment 30 Rafael J. Wysocki 2009-07-07 10:46:42 UTC
On Tuesday 07 July 2009, Daniel Smolik wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.29 and 2.6.30.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.29 and 2.6.30.  Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=13512
> > Subject             : D43 on 2.6.30 doesn't suspend anymore
> > Submitter   : Daniel Smolik <marvin@mydatex.cz>
> > Date                : 2009-06-11 20:12 (26 days old)
> >
> >
> >   
> 
> The bug still exists. The latest kernel I tested is 2.6.31-rc2.
Comment 31 ykzhao 2009-07-13 03:42:24 UTC
Hi, Daniel
    Will you please do the following test to see whether the box can't be resumed or whether there is just no display after resume?
    a. Kill the process using /proc/acpi/event.
    b. echo mem > /sys/power/state; dmesg >dmesg_after; sync; 
    c. Press the power button and see whether the box can be resumed.
    d. If the box can't be resumed, please reboot it and check whether the file dmesg_after exists.
    
    If dmesg_after exists, please try the latest Linus git tree and see whether the box can be resumed correctly.

BTW: It would be better to do the test in KMS mode. CONFIG_DRM_I915_KMS should be enabled in the kernel configuration. In that case the framebuffer console should also be enabled.

    Thanks.
Comment 32 Daniel Smolik 2009-07-13 06:19:46 UTC
Yes, I can, but I think the notebook doesn't go to sleep at all. I tested this with the X server stopped and sleep/resume works. I will try it with KMS + framebuffer enabled.
I will try what you recommend with 2.6.31-rc2, OK?

Dan
Comment 33 ykzhao 2009-07-13 06:35:55 UTC
If you try the 2.6.31-rc2 kernel, it would be better to also apply the patch in https://bugs.freedesktop.org/show_bug.cgi?id=21576#C16.

thanks.
Comment 34 Daniel Smolik 2009-07-13 20:52:33 UTC
Created attachment 22332 [details]
dmesg before sleep
Comment 35 Daniel Smolik 2009-07-13 20:53:16 UTC
Created attachment 22333 [details]
dmesg after sleep
Comment 36 Daniel Smolik 2009-07-13 20:57:21 UTC
I did everything you asked. Without the framebuffer console the situation is the same:
the computer freezes and there is nothing in dmesg. But if I enable the framebuffer console, suspend starts working. I attached new dmesg_before and dmesg_after. Another problem is that when I enable the fb console I don't see any output before the X server starts.
Comment 37 ykzhao 2009-07-14 01:31:51 UTC
Hi, Daniel
    It seems that the wrong driver is used. From the dmesg log it appears that the intel fb driver is in use, but the drm i915 driver should be used instead. The fb console should also be enabled if the system is booted with KMS enabled. (BTW: please boot the system with KMS enabled.)
    Thanks.
Comment 38 ykzhao 2009-07-14 01:36:18 UTC
Please enable the following in the kernel configuration:
   >CONFIG_FRAMEBUFFER_CONSOLE
   >CONFIG_DRM_I915_KMS

At the same time, CONFIG_FB_INTEL should be disabled in the kernel configuration.
Thanks.
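
A quick way to check an existing kernel configuration against these three options could look like the sketch below. The function name check_kms_config and its messages are assumptions made for illustration; only the CONFIG_ symbols come from this comment.

```shell
# Hypothetical helper: verify the three options discussed above.
#   $1 = path to a kernel config file (defaults to ./.config)
# Returns 0 only if both required options are enabled and
# CONFIG_FB_INTEL is not.
check_kms_config() {
    cfg=${1:-.config}
    status=0
    for opt in CONFIG_FRAMEBUFFER_CONSOLE CONFIG_DRM_I915_KMS; do
        # An enabled option appears as OPTION=y (built in) or OPTION=m (module).
        if grep -Eq "^${opt}=[ym]$" "$cfg"; then
            echo "ok: $opt is enabled"
        else
            echo "missing: $opt is not enabled"
            status=1
        fi
    done
    if grep -Eq "^CONFIG_FB_INTEL=[ym]$" "$cfg"; then
        echo "conflict: CONFIG_FB_INTEL is still enabled"
        status=1
    else
        echo "ok: CONFIG_FB_INTEL is disabled"
    fi
    return "$status"
}

# Assumed usage from the top of the kernel source tree:
#   check_kms_config .config
```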
Comment 39 Daniel Smolik 2009-07-14 19:58:34 UTC
I did everything you recommended, and the situation is now a little different.
If I don't start the X server, suspend/resume works. But if I start it and then stop it, suspend freezes after suspending the console. I think my version of the X server can't handle KMS properly. I will continue investigating.
Comment 40 Daniel Smolik 2009-07-14 21:33:40 UTC
If I see this in the log:
[drm:i915_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
[drm:i915_initialize] *ERROR* Client tried to initialize ringbuffer in GEM mode

and try to suspend, the system freezes.
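
To check for these messages before attempting a suspend, the kernel log can be filtered for DRM errors. This is only a minimal sketch: the helper name drm_errors is hypothetical, and the pattern is assumed to match lines shaped like the two quoted above.

```shell
# Hypothetical helper: read kernel log text on stdin and print only
# the lines where a drm function reported an error.
drm_errors() {
    grep -E '\[drm:[^]]*\] \*ERROR\*'
}

# Assumed usage:
#   dmesg | drm_errors
```

An empty result only means that no line of this shape was logged; it does not show that a following suspend will succeed.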
Comment 41 Daniel Smolik 2009-07-14 21:59:38 UTC
New information: I found an oops during sleep preparation. Tomorrow I will connect a serial console and post the OOPS.
Comment 42 Daniel Smolik 2009-07-15 21:37:09 UTC
Created attachment 22364 [details]
OOPS during suspend
Comment 43 Daniel Smolik 2009-07-15 21:37:56 UTC
Created attachment 22365 [details]
OOPS decoded by ksymoops
Comment 44 ykzhao 2009-07-16 03:12:54 UTC
Created attachment 22369 [details]
Don't register the connector device under the drm_class

Will you please try the debug patch on the latest kernel and see whether the oops still exists?
This debug patch doesn't register the connector device under the drm_class, so drm_sysfs_suspend/resume won't be called for the connector device.

Thanks.
Comment 45 Daniel Smolik 2009-07-16 05:59:53 UTC
Yes. Do you want the latest git or 31-rc3?
Dan
Comment 46 ykzhao 2009-07-16 07:07:01 UTC
31-rc3 is OK.
Thanks.
Comment 47 Daniel Smolik 2009-07-18 23:00:42 UTC
I compiled 2.6.31-rc3 with the same config as rc2 and the situation is worse than before. The system still oopses and I didn't see your debug message. But when I use the serial console, the system freezes and doesn't OOPS :-(.
Comment 48 ykzhao 2009-07-20 15:12:44 UTC
Hi, Daniel
    Sorry that I didn't pay attention to the info in comment #39 that your X server can't handle KMS correctly.
    From the info in comment #39 it seems that suspend/resume works well if X is not started. Right? Will you please double check it? Please also confirm whether the OOPS still exists on 2.6.31-rc3 when KMS is enabled.
    If it still exists, please try the patch in comment #44 and see whether the OOPS still exists.
    Thanks.
Comment 49 Daniel Smolik 2009-07-20 15:31:41 UTC
(In reply to comment #48)
> Hi, Daniel
>     Sorry that I don't pay attention to the info in comment #39 that your
> xserver can't handle the KMS correctly.
>     From the info in comment #39 it seems that the suspend/resume can work
> well if the X is not started. Right? Will you please double check it again? Please
> also confirm whether the OOPS still exists on the 2.6.31-rc3 when KMS is
> enabled.
>     If it still exists, please try the patch in comment #44 and see whether
> the OOPS still exists?
>     Thanks.

On 2.6.31-rc3 suspend stopped working. The OOPS still exists in 2.6.31-rc3 with the patch you recommended applied. If I connect the serial console, the system freezes and doesn't OOPS.
I will check whether suspend works without the X server and let you know.

Dan
Comment 50 ykzhao 2009-07-21 07:06:26 UTC
Hi, Daniel
    Will you please attach the output of dmesg when the OOPS happens?
    It would be great if you could also attach the output of dmesg after doing suspend/resume without the X server.
    
Thanks.
Comment 51 Daniel Smolik 2009-07-25 10:51:15 UTC
Created attachment 22485 [details]
dmesg with 2.6.31rc3 before sleep
Comment 52 Daniel Smolik 2009-07-25 10:52:05 UTC
Created attachment 22486 [details]
dmesg with 2.6.31rc3 after resume
Comment 53 Daniel Smolik 2009-07-25 10:52:56 UTC
Created attachment 22487 [details]
dmesg with 2.6.31rc3 before second sleep, but after X server start/stop
Comment 54 Daniel Smolik 2009-07-25 10:56:33 UTC
I added some dmesg output from before/after suspend/resume without X. After running and stopping the X server, dmesg shows an error message about drm.
If I try to suspend after this, the system OOPSes. But I can't capture this OOPS, because if I attach the serial console the system freezes instead of OOPSing. I can only take a screenshot. Is that suitable for you?

Dan
Comment 55 ykzhao 2009-07-27 01:56:28 UTC
Hi, Daniel
    From the log in comment #52 it seems that the box can be suspended/resumed correctly on 2.6.31-rc3, and there is no OOPS.
    Will you please double check what the issue is on the D43 box?
    Thanks.
Comment 56 Daniel Smolik 2009-07-27 05:55:17 UTC
Yes, but the X server is not running there. Please look at comments #53 and #54. After I run the X server, the system doesn't sleep anymore but OOPSes. I tested it many times, and after running the X server the system always OOPSes. It is enough to start and stop the X server; after that the system always OOPSes.
Comment 57 Rafael J. Wysocki 2009-08-03 14:41:32 UTC
On Monday 03 August 2009, Daniel Smolik wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.29 and 2.6.30.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.29 and 2.6.30.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=13512
> > Subject             : D43 on 2.6.30 doesn't suspend anymore
> > Submitter   : Daniel Smolik <marvin@mydatex.cz>
> > Date                : 2009-06-11 20:12 (53 days old)
> > 
> > 
> Yes, it still exists.
Comment 58 Alan Crosswell 2009-08-03 14:46:30 UTC
As of the most recent FC rawhide distro (sorry, kernel version not immediately available) suspend/resume with X works for me on a Dell Latitude D810.
Comment 59 Daniel Smolik 2009-08-03 16:27:37 UTC
That is possible, but if your X server uses some old interface, the kernel OOPSes or freezes.
Comment 60 Rafael J. Wysocki 2009-08-10 13:43:35 UTC
On Monday 10 August 2009, Daniel Smolik wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.29 and 2.6.30.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.29 and 2.6.30.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=13512
> > Subject             : D43 on 2.6.30 doesn't suspend anymore
> > Submitter   : Daniel Smolik <marvin@mydatex.cz>
> > Date                : 2009-06-11 20:12 (60 days old)
> > 
> > 
> I will try 2.6.31-rc5 and let you know. But the problem still exists.
Comment 61 Daniel Smolik 2009-08-20 07:31:27 UTC
2.6.31-rc6: the oops is still the same.
Comment 62 Daniel Smolik 2009-09-03 22:03:15 UTC
2.6.31-rc8 still OOPSes in i915_gem_idle.
Comment 63 Daniel Smolik 2009-09-28 10:01:51 UTC
The X server in squeeze has now been upgraded, and with 2.6.31 suspend/resume started working again. But with the old X server it still OOPSes.
Comment 64 Zhang Rui 2009-09-29 00:55:07 UTC
So I don't think this is a Linux kernel problem.
Please file a new bug at bugs.freedesktop.org.
Comment 65 Daniel Smolik 2009-09-29 04:54:46 UTC
Sorry, but if userspace calls the kernel in the wrong way and the kernel OOPSes, is that a userspace problem and not a kernel problem? Thanks for the reply.
Comment 66 ykzhao 2009-09-29 07:02:05 UTC
Thanks for such a quick response.
    It now seems that this is not a Linux kernel problem.
    If the OOPS still exists, please open a bug at bugs.freedesktop.org.

Thanks.