Bug 13243

Summary: suspend fails, bisected 2.6.30 regression, pnp_bus_suspend() returns -5 -- HP Compaq nc6000
Product: ACPI
Reporter: cedric (cedric)
Component: Other
Assignee: ykzhao (yakui.zhao)
Status: CLOSED CODE_FIX
Severity: normal
CC: acpi-bugzilla, bjorn.helgaas, lenb, rjw, shaohua.li, yakui.zhao
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.30-rc4
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216, 13070    
Attachments:
    .config file
    acpidump result
    dmidecode result
    patch vs 2.6.30-rc4 to ignore _PS3 failure
    Add HP nc6000 into DMI power check table to disable the power state check

Description cedric 2009-05-04 15:02:08 UTC
Created attachment 21212
.config file

When trying to suspend my laptop with 2.6.30-rc4, I get the following dmesg output:
...                           
pnp 00:07: legacy suspend                                          
pnp 00:06: legacy suspend                                          
pnp 00:05: legacy suspend                                          
pnp 00:04: legacy suspend                                          
pnp 00:03: legacy suspend                                          
serial 00:02: legacy suspend                                       
ACPI: Transitioning device [C169] to D3                            
serial 00:02: disable failed                                       
suspend_device(): pnp_bus_suspend+0x0/0x6b returns -5              
PM: Device 00:02 failed to suspend: error -5                       
PM: Some devices failed to suspend                                 
pnp 00:03: legacy resume                                           
pnp 00:04: legacy resume                                           
pnp 00:05: legacy resume                                           
pnp 00:06: legacy resume                                           
pnp 00:07: legacy resume                                           
pnp 00:08: legacy resume                                           
...
Comment 1 Rafael J. Wysocki 2009-05-04 19:23:35 UTC
One of the pnp devices fails to suspend.  I'm not sure what it is, though.

What kernel was the last working one?
Comment 2 cedric 2009-05-05 07:59:34 UTC
The last working kernel was 2.6.29. I'll try to bisect. I think 2.6.30-rc1 was already broken.
Comment 3 cedric 2009-05-05 18:30:22 UTC
Bisect done; the first bad commit is:
# bad: [6328a57401dc5f5cf9931738eb7268fcd8058c49] Enable PNPACPI _PSx Support, v3
git bisect bad 6328a57401dc5f5cf9931738eb7268fcd8058c49

When I revert it from 2.6.30-rc4, suspend works.
Comment 4 Rafael J. Wysocki 2009-05-05 21:54:13 UTC
Thanks for bisecting this!

First-Bad-Commit : 6328a57401dc5f5cf9931738eb7268fcd8058c49

Caused by:

commit 6328a57401dc5f5cf9931738eb7268fcd8058c49
Author: Witold Szczeponik <Witold.Szczeponik@gmx.net>
Date:   Mon Mar 30 19:31:06 2009 +0200

    Enable PNPACPI _PSx Support, v3

    Signed-off-by: Witold Szczeponik <Witold.Szczeponik@gmx.net>
    Acked-by: Zhao Yakui <yakui.zhao@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>
Comment 5 ykzhao 2009-05-06 00:56:11 UTC
Hi, Cedric
    Will you please attach the output of acpidump?
    Thanks.
Comment 6 cedric 2009-05-06 08:47:57 UTC
Created attachment 21243
acpidump result
Comment 7 ykzhao 2009-05-07 01:28:35 UTC
Hi, Cedric
    Will you please add the boot option "acpi.power_nocheck=1" and see whether the issue still exists?
Comment 8 ykzhao 2009-05-07 01:29:03 UTC
Will you please attach the output of dmidecode?
   Thanks.
Comment 9 cedric 2009-05-07 07:44:41 UTC
Created attachment 21255
dmidecode result

I will try "acpi.power_nocheck=1" in 5 minutes :-)
Comment 10 cedric 2009-05-07 08:26:09 UTC
I tried "acpi.power_nocheck=1" and suspend worked, but I got a warning in dmesg:

------------[ cut here ]------------
WARNING: at kernel/hrtimer.c:625 hres_timers_resume+0x3c/0x48()
Hardware name: HP Compaq nc6000 (DU358S#UUG)
hres_timers_resume() called with IRQs enabled!Modules linked in: nfs lockd nfs_acl sunrpc adm1031 snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss acpi_cpufreq usbhid pcmcia snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm sr_mod snd_timer ehci_hcd cdrom snd rtc_cmos uhci_hcd yenta_socket rtc_core usbcore i2c_i801 tg3 rsrc_nonstatic snd_page_alloc pcmcia_core libphy rtc_lib battery ac
Pid: 2943, comm: pm-suspend Not tainted 2.6.30-rc4 #1
Call Trace:
 [<c011ce4b>] warn_slowpath+0x74/0xc3
 [<c0310038>] ? rt_mutex_slowunlock+0xe8/0x13b
 [<c010728b>] ? pit_next_event+0x19/0x1e
 [<c0135d0c>] ? clockevents_program_event+0x92/0x128
 [<c012f78a>] ? ktime_get_ts+0x45/0x49
 [<c0136add>] ? tick_dev_program_event+0x33/0xb4
 [<c0136ba5>] ? tick_program_event+0x17/0x19
 [<c0136bc8>] ? tick_resume_oneshot+0x21/0x39
 [<c013036b>] ? notifier_call_chain+0x2b/0x5a
 [<c012f72b>] hres_timers_resume+0x3c/0x48
 [<c013299f>] timekeeping_resume+0xd1/0xf0
 [<c0247b6d>] __sysdev_resume+0x14/0x42
 [<c0247bd2>] sysdev_resume+0x37/0x66
 [<c013e0f7>] suspend_devices_and_enter+0x18e/0x1b5
 [<c030ee9f>] ? printk+0x18/0x21
 [<c013e261>] enter_state+0x11e/0x17b
 [<c013e33a>] state_store+0x7c/0xad
 [<c013e2be>] ? state_store+0x0/0xad
 [<c01d551b>] kobj_attr_store+0x20/0x27
 [<c019e5cc>] sysfs_write_file+0x97/0xe0
 [<c0166eaf>] vfs_write+0x8a/0x110
 [<c019e535>] ? sysfs_write_file+0x0/0xe0
 [<c0166fde>] sys_write+0x3d/0x6b
 [<c0102cc8>] sysenter_do_call+0x12/0x26
---[ end trace fbfd487192b3ff22 ]---

If you need more info, just ask.
Comment 11 Len Brown 2009-05-08 03:46:47 UTC
> serial 00:02: legacy suspend                                       
> ACPI: Transitioning device [C169] to D3    

the printk above should actually say "Failed to transition device..."
         
> serial 00:02: disable failed                                       
> suspend_device(): pnp_bus_suspend+0x0/0x6b returns -5 

pnp_bus_suspend() is getting -5 back from pnp_stop_dev(),
which is returning -EIO
because pnpacpi_disable_resources(),
which evaluates _PS3 (and then _DIS), returns an error.

Before the regression commit, 6328a57401dc5f5cf9931738eb7268fcd8058c49
pnpacpi_disable_resources evaluated _DIS without first evaluating _PS3.

acpi.power_nocheck=1 bypassed the failure
by disabling a sanity check in acpi_bus_set_power()
and acpi_power_off() that verifies that the device actually got into D3.

It is possible that the error checking on _PS3 that went
into the final version of the offending commit was really
a mistake and we should continue and evaluate _DIS
even if _PS3 was not verified.

I don't know what the backtrace is about in the
acpi.power_nocheck=1 case -- it may be an additional
unrelated problem.
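
As a rough illustration of the call chain described above, here is a userspace sketch in C. It is not the kernel code: `struct fake_dev`, `set_power_d3()`, `evaluate_dis()`, `disable_resources_strict()`, `disable_resources_lenient()` and `stop_dev()` are hypothetical stand-ins, and the exact error values returned by the real pnpacpi_disable_resources() are assumed. It only models the point made in this comment: a _PS3 failure that aborts before _DIS surfaces as -EIO (-5), while ignoring the _PS3 failure lets _DIS run and suspend proceed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a PNPACPI device; none of this is kernel API. */
struct fake_dev {
    bool has_ps3;        /* device declares a _PS3 method */
    int  ps3_result;     /* 0 on success, negative errno on failure */
    int  dis_result;     /* 0 on success, non-zero if _DIS fails */
    bool dis_evaluated;  /* set once _DIS has been evaluated */
};

/* Stand-in for the D3 transition, including the power state check. */
static int set_power_d3(struct fake_dev *dev)
{
    return dev->ps3_result;
}

/* Stand-in for evaluating the _DIS method. */
static int evaluate_dis(struct fake_dev *dev)
{
    dev->dis_evaluated = true;
    return dev->dis_result;
}

/* Behaviour reported in this bug: a _PS3 failure aborts before _DIS. */
int disable_resources_strict(struct fake_dev *dev)
{
    if (dev->has_ps3) {
        int ret = set_power_d3(dev);
        if (ret)
            return ret;
    }
    return evaluate_dis(dev) ? -ENODEV : 0;
}

/* Suggested behaviour: try _PS3, ignore its result, still evaluate _DIS. */
int disable_resources_lenient(struct fake_dev *dev)
{
    if (dev->has_ps3)
        (void)set_power_d3(dev);
    return evaluate_dis(dev) ? -ENODEV : 0;
}

/* Models pnp_stop_dev(): any disable failure is reported as -EIO. */
int stop_dev(struct fake_dev *dev, int (*disable)(struct fake_dev *))
{
    return disable(dev) < 0 ? -EIO : 0;
}
```

With a device whose _PS3 fails but whose _DIS succeeds, `stop_dev()` with the strict variant returns -EIO without ever reaching _DIS, whereas the lenient variant returns 0.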
Comment 12 Len Brown 2009-05-08 04:32:30 UTC
Created attachment 21273
patch vs 2.6.30-rc4 to ignore _PS3 failure

Please test.

I think this should behave the same way as acpi.power_nocheck=1
Comment 13 ykzhao 2009-05-08 07:55:48 UTC
Agree with what Len said in comment #11.
   It should report "Failed to transit the device into Dx state". And the problem can be fixed by the patch in comment #12.
   Of course I will add this box to the DMI check table.

   And from the log in comment #10 it seems that there is another issue.
   >WARNING: at kernel/hrtimer.c:625 hres_timers_resume+0x3c/0x48()
   This is related to the following commit:
    >commit 1d4a7f1c4faf53eb9e822743ec8a70b3019a26d2
    >Author: Peter Zijlstra <peterz@infradead.org>
    >Date:   Sun Jan 18 16:39:29 2009 +0100
       >hrtimers: fix inconsistent lock state on resume in hres_timers_resume
    
    Please open another bug about this issue.
    Thanks.
Comment 14 cedric 2009-05-08 08:17:23 UTC
So, you have my test-by ;-)
I'll open another bug for the warning.

Many thanks
Comment 15 ykzhao 2009-05-11 01:19:45 UTC
Created attachment 21299
Add HP nc6000 into DMI power check table to disable the power state check

Hi, Cedric
    Will you please try the debug patch on the latest kernel and see whether it fixes the issue?
    In the debug patch the HP nc6000 is added to the DMI power check table so that the power state check is skipped during power transitions. It is equivalent to the boot option "acpi.power_nocheck=1".
    Thanks.
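
The idea behind the DMI table entry can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `power_nocheck_products` and `skip_power_state_check()` are hypothetical names, not the kernel's DMI API, and the product string is taken from the "Hardware name" line in comment #10. The check is skipped either when the boot option is given or when the machine's product name matches a table entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk table: machines whose power state check is skipped. */
static const char *const power_nocheck_products[] = {
    "HP Compaq nc6000",
};

/*
 * Returns true when the power state check should be skipped.
 * nocheck_param models the boot option "acpi.power_nocheck=1";
 * product_name models the DMI product name reported by the firmware.
 */
bool skip_power_state_check(bool nocheck_param, const char *product_name)
{
    size_t i;
    size_t n = sizeof(power_nocheck_products) / sizeof(power_nocheck_products[0]);

    if (nocheck_param)
        return true;
    if (product_name == NULL)
        return false;

    /* Substring match, so suffixes such as "(DU358S#UUG)" still match. */
    for (i = 0; i < n; i++) {
        if (strstr(product_name, power_nocheck_products[i]) != NULL)
            return true;
    }
    return false;
}
```

For example, `skip_power_state_check(false, "HP Compaq nc6000 (DU358S#UUG)")` is true, while an unlisted machine only skips the check when the boot option is set.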
Comment 16 cedric 2009-05-11 14:53:48 UTC
Hi, 
I tested it and it's working. I can suspend and resume (and still have the warning).
Comment 17 Rafael J. Wysocki 2009-05-13 09:03:53 UTC
Handled-By : Zhao Yakui <yakui.zhao@intel.com>
Patch : http://bugzilla.kernel.org/attachment.cgi?id=21299
Comment 18 Rafael J. Wysocki 2009-05-16 21:58:40 UTC
Fixed by commit ddc50b6ad634d9ce2526a777d4b7da80effdfb60 .
Comment 19 Rafael J. Wysocki 2009-05-18 17:46:35 UTC
Comment #18 is wrong, this is fixed by commit 19bde778c1fd2574cc020a618d7d576f260271ca .