Bug 15407

Summary: All Lenovo i5/i7 notebooks do not resume properly after suspend - system resumes but will hang if resumed a second time
Product: ACPI
Reporter: Bryn Hughes (linux)
Component: Power-Sleep-Wake
Assignee: acpi_power-sleep-wake
Status: CLOSED CODE_FIX
Severity: high
CC: achiang, adamw, jason, jerone.young, jlgoolsbee, jryans, lenb, lenovox201s, manisandro, maximlevitsky, me, mjg59-kernel, rui.zhang, sarmbruster, sergio, thejoe
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.32
Subsystem:
Regression: No
Bisected commit-id:
Attachments: output of dmesg from Thinkpad W510 (4319-29G)
output of acpidump from Thinkpad W510 (4319-29G)
/proc/interrupts from Thinkpad W510 (4319-29G)
Output of dmesg on W510 (4318-CTO)
Output of acpidump on W510 (4318-CTO)
/proc/interrupts on W510 (4318-CTO)
Contents of /sys/firmware/acpi/tables/dynamic/SSDT* (pre and post suspend)
dmesg output post resume
dmesg output after 1st resume when "irqpoll" is used
add various lenovo machines to acpisleep_dmi_table
dmidecode of laptop that needs acpi_sleep=sci_force_enable

Description Bryn Hughes 2010-02-27 00:31:24 UTC
After unloading drivers that cause problems with suspend (xhci, bluetooth, etc) the system can suspend OK.  On resume however this message is displayed in dmesg:

[18446744064.763411] irq 9: nobody cared (try booting with the "irqpoll" option)
[18446744064.763419] Pid: 0, comm: swapper Tainted: G        W  2.6.32-14-generic #20-Ubuntu
[18446744064.763423] Call Trace:
[18446744064.763426]  <IRQ>  [<ffffffff810c4c1b>] __report_bad_irq+0x2b/0xa0
[18446744064.763439]  [<ffffffff810c4e1c>] note_interrupt+0x18c/0x1d0
[18446744064.763446]  [<ffffffff81019f09>] ? read_tsc+0x9/0x20
[18446744064.763452]  [<ffffffff810c551d>] handle_fasteoi_irq+0xdd/0x100
[18446744064.763457]  [<ffffffff81015d52>] handle_irq+0x22/0x30
[18446744064.763465]  [<ffffffff81563bdc>] do_IRQ+0x6c/0xf0
[18446744064.763472]  [<ffffffff81013b53>] ret_from_intr+0x0/0x11
[18446744064.763475]  <EOI>  [<ffffffff813086c0>] ? acpi_idle_enter_bm+0x283/0x2b7
[18446744064.763486]  [<ffffffff813086b9>] ? acpi_idle_enter_bm+0x27c/0x2b7
[18446744064.763492]  [<ffffffff8144c567>] ? cpuidle_idle_call+0xa7/0x140
[18446744064.763497]  [<ffffffff81011ea3>] ? cpu_idle+0xb3/0x110
[18446744064.763504]  [<ffffffff81558aad>] ? start_secondary+0xa8/0xaa
[18446744064.763508] handlers:
[18446744064.763510] [<ffffffff812de974>] (acpi_irq+0x0/0x31)
[18446744064.763517] Disabling IRQ #9


If the system is suspended a second time, it will not resume and instead shows a BIOS screen and must be hard booted.
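
For anyone comparing their own logs against the trace above, here is a minimal sketch that scans dmesg text for the two messages quoted in this report ("nobody cared" and "Disabling IRQ"). The function name and the idea of intersecting the two sets are my own assumptions, not part of any kernel tooling.

```python
import re

# Patterns for the two kernel messages quoted in the description above.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
DISABLING = re.compile(r"Disabling IRQ #(\d+)")


def find_disabled_irqs(dmesg_text):
    """Return the sorted IRQ numbers that were reported as unhandled
    and subsequently disabled in the given dmesg text."""
    unhandled = {int(m.group(1)) for m in NOBODY_CARED.finditer(dmesg_text)}
    disabled = {int(m.group(1)) for m in DISABLING.finditer(dmesg_text)}
    # Only IRQs that appear in both messages are of interest here.
    return sorted(unhandled & disabled)
```

Passing it the saved dmesg output from after the first resume should list IRQ 9 if the system is hitting the same problem.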

Note this issue likely affects the following Lenovo models:

T410
T410s
T510
T510i
W510
W510i
W710
Comment 1 Zhang Rui 2010-03-02 08:50:46 UTC
please attach 
1. the dmesg output after boot
2. the acpidump output
3. the content of "/proc/interrupts" before suspend.

Is this a regression? I mean, are you aware of any previous kernel release that doesn't have this problem?
If yes, what's the earliest kernel that has this problem?
Comment 2 Stefan Armbruster 2010-03-02 09:03:35 UTC
Created attachment 25314 [details]
output of dmesg from Thinkpad W510 (4319-29G)
Comment 3 Stefan Armbruster 2010-03-02 09:04:16 UTC
Created attachment 25315 [details]
output of acpidump from Thinkpad W510 (4319-29G)
Comment 4 Stefan Armbruster 2010-03-02 09:04:46 UTC
Created attachment 25316 [details]
/proc/interrupts from Thinkpad W510 (4319-29G)
Comment 5 Stefan Armbruster 2010-03-02 09:07:35 UTC
I observed the same behaviour as Bryn on a Thinkpad W510 (4319-29G).

The problem occurs with 2.6.32 and 2.6.33. I did not test any previous kernel versions.
Comment 6 Bryn Hughes 2010-03-02 16:44:45 UTC
Created attachment 25320 [details]
Output of dmesg on W510 (4318-CTO)
Comment 7 Bryn Hughes 2010-03-02 16:45:50 UTC
Created attachment 25321 [details]
Output of acpidump on W510 (4318-CTO)
Comment 8 Bryn Hughes 2010-03-02 16:46:26 UTC
Created attachment 25322 [details]
/proc/interrupts on W510 (4318-CTO)
Comment 9 Bryn Hughes 2010-03-02 16:47:13 UTC
I have tested on 2.6.31, 2.6.32 and 2.6.33 - all versions currently have the same behaviour.
Comment 10 Zhang Rui 2010-03-03 01:48:33 UTC
please also attach all the files in /sys/firmware/acpi/tables/dynamic.
you can get the table by running "cat /sys/firmware/acpi/tables/dynamic/SSDTx > ssdtx.dat".

please attach the dmesg output after the first resume.

please attach the screen shot when the system hangs during the second resume.
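
The "cat ... > ssdtx.dat" step above has to be repeated once per table, so a small helper may save some typing. This is only a sketch: the function name and the lower-case ".dat" naming are assumptions chosen to mirror the example command, and reading these sysfs files typically needs root.

```python
from pathlib import Path


def dump_dynamic_tables(src_dir="/sys/firmware/acpi/tables/dynamic", dest_dir="."):
    """Copy every table file in src_dir to dest_dir as <name>.dat.

    Returns the list of file names written, in sorted order.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for table in sorted(src.iterdir()):
        if table.is_file():
            # e.g. SSDT1 -> ssdt1.dat, matching the naming in the comment above
            target = dest / (table.name.lower() + ".dat")
            target.write_bytes(table.read_bytes())
            written.append(target.name)
    return written
```

Running it once before suspend and once after resume, with different destination directories, gives the two sets of tables to attach.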
Comment 11 Bryn Hughes 2010-03-03 02:50:18 UTC
It is impossible to provide a screenshot of the second resume unfortunately - the system gets dumped back to a BIOS screen that just has the Thinkpad logo, but is completely frozen.  Linux is not running anymore at that point.

The other attachments will be uploaded shortly...
Comment 12 Bryn Hughes 2010-03-03 03:01:27 UTC
Created attachment 25332 [details]
Contents of /sys/firmware/acpi/tables/dynamic/SSDT* (pre and post suspend)
Comment 13 Bryn Hughes 2010-03-03 03:02:30 UTC
Created attachment 25333 [details]
dmesg output post resume
Comment 14 Zhang Rui 2010-03-03 03:26:43 UTC
[    2.068747] ------------[ cut here ]------------
[    2.068759] WARNING: at /build/buildd/linux-2.6.32/arch/x86/kernel/hpet.c:392 hpet_next_event+0x7a/0x90()
[    2.068763] Hardware name: 4318CTO
[    2.068765] Modules linked in:
[    2.068770] Pid: 0, comm: swapper Not tainted 2.6.32-15-generic #22-Ubuntu
[    2.068773] Call Trace:
[    2.068781]  [<ffffffff81064f8b>] warn_slowpath_common+0x7b/0xc0
[    2.068787]  [<ffffffff81064fe4>] warn_slowpath_null+0x14/0x20
[    2.068791]  [<ffffffff810375da>] hpet_next_event+0x7a/0x90
[    2.068796]  [<ffffffff81037620>] hpet_legacy_next_event+0x10/0x20
[    2.068804]  [<ffffffff81090f94>] clockevents_program_event+0x54/0xa0
[    2.068810]  [<ffffffff810924c8>] tick_dev_program_event+0x48/0xd0
[    2.068816]  [<ffffffff81091e1e>] tick_broadcast_oneshot_control+0x11e/0x120
[    2.068821]  [<ffffffff81091630>] tick_notify+0x130/0x200
[    2.068829]  [<ffffffff81561cf7>] notifier_call_chain+0x47/0x90
[    2.068835]  [<ffffffff81088566>] raw_notifier_call_chain+0x16/0x20
[    2.068841]  [<ffffffff81090db7>] clockevents_notify+0x37/0x160
[    2.068847]  [<ffffffff813080a1>] lapic_timer_state_broadcast+0x46/0x48
[    2.068852]  [<ffffffff81308609>] acpi_idle_enter_bm+0x180/0x2b7
[    2.068857]  [<ffffffff81561cc6>] ? notifier_call_chain+0x16/0x90
[    2.068863]  [<ffffffff8144c9e7>] cpuidle_idle_call+0xa7/0x140
[    2.068870]  [<ffffffff81011ea3>] cpu_idle+0xb3/0x110
[    2.068877]  [<ffffffff81558f69>] start_secondary+0xa8/0xaa
[    2.068881] ACPI Warning for \_PR_.CPU0._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer
[    2.068894] ---[ end trace 0bcff51cc3f6d229 ]---
[    2.068898]  (20090903/nspredef-1012)
[    2.068910] ACPI: Invalid _PSD data
[    2.069070] ACPI Warning for \_PR_.CPU1._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer (20090903/nspredef-1012)
[    2.069084] ACPI: Invalid _PSD data
[    2.069239] ACPI Warning for \_PR_.CPU2._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer (20090903/nspredef-1012)
[    2.069252] ACPI: Invalid _PSD data
[    2.069406] ACPI Warning for \_PR_.CPU3._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer (20090903/nspredef-1012)
[    2.069419] ACPI: Invalid _PSD data
[    2.069572] ACPI Warning for \_PR_.CPU4._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer (20090903/nspredef-1012)
[    2.069585] ACPI: Invalid _PSD data
[    2.069739] ACPI Warning for \_PR_.CPU5._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer (20090903/nspredef-1012)
[    2.069752] ACPI: Invalid _PSD data
[    2.069905] ACPI Warning for \_PR_.CPU6._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer (20090903/nspredef-1012)
[    2.069918] ACPI: Invalid _PSD data
[    2.070071] ACPI Warning for \_PR_.CPU7._PSD: Return Package type mismatch at index 4 - found [NULL Object Descriptor], expected Integer (20090903/nspredef-1012)
[    2.070084] ACPI: Invalid _PSD data

Please blacklist processor.ko, boot with the option "hpet=disable", and see if the error messages above go away. Then check whether the resume problem still exists.
Comment 15 Bryn Hughes 2010-03-03 19:02:42 UTC
Unfortunately the acpi processor module is compiled in to my kernel (Ubuntu) - I can compile a fresh one if you feel it would help.

hpet=disable removes the hpet-related message, but the system starts running poorly - it runs slower and it has poor video performance at times.
Comment 16 Bryn Hughes 2010-03-03 19:03:10 UTC
oh, and hpet=disable does not correct the problem on resume...
Comment 17 Zhang Rui 2010-03-04 05:48:49 UTC
(In reply to comment #15)
> Unfortunately the acpi processor module is compiled in to my kernel (Ubuntu) -
> I can compile a fresh one if you feel it would help.
> 
yes, please. thanks.

BTW: does the second resume still hang if you boot with "irqpoll"?
Please attach the dmesg output after the first resume (booted with irqpoll).
Comment 18 Stefan Armbruster 2010-03-07 09:54:24 UTC
Yes, the second resume also hangs when irqpoll is added to the kernel command line, see the attachment.
Comment 19 Stefan Armbruster 2010-03-07 09:55:04 UTC
Created attachment 25393 [details]
dmesg output after 1st resume when "irqpoll" is used
Comment 20 Bryn Hughes 2010-03-19 20:11:38 UTC
OK, was able to compile a kernel with ACPI_PROCESSOR as a module and then blacklist it.  Confirmed that it was NOT loaded...

The "Invalid _PSD data" messages do go away with ACPI_PROCESSOR blacklisted, but the system still gets the same error message on resume, and can't resume a second time.
Comment 21 lenovox201s 2010-03-25 16:54:04 UTC
I can confirm this is also the case for the Lenovo x201s.  I have tried kernels 2.6.32-2, 2.6.33-1 and 2.6.34-rc4; they all exhibit the same problem.  I would be happy to provide more info, please tell me what is needed.
Comment 22 lenovox201s 2010-03-25 17:01:46 UTC
(In reply to comment #21)
> I can confirm this is also the case for the Lenovo x201s.  I have tried
> kernels 2.6.32-2, 2.6.33-1 and 2.6.34-rc4, they all exhibit the same problem.
> I would be happy to provide more info, please instruct what is needed.

Sorry, typo: that should be 2.6.34-rc2.
Comment 23 Bryn Hughes 2010-03-26 17:28:34 UTC
I notice the status is still 'NEEDINFO' - what other info can we provide?
Comment 24 Jerone Young 2010-04-04 21:23:59 UTC
We have been working with Lenovo. Lenovo is resolving the issues in their BIOS now. More info is in Launchpad, where we are tracking this issue:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/532374
Comment 25 Alex Chiang 2010-04-19 04:51:22 UTC
Bryn, Stefan, others who can reproduce this bug:

Please try booting with this kernel command line option and see if it helps:

acpi_sleep=sci_force_enable

Please report back your results. Thanks.
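
When reporting back, it helps to confirm the option actually reached the running kernel. Below is a rough sketch that checks a command line string (for example the contents of /proc/cmdline) for a parameter. The helper names are hypothetical; the comma handling reflects that acpi_sleep takes a comma-separated list of modes.

```python
def has_kernel_param(cmdline, name, value=None):
    """Return True if 'name' (or 'name=value') appears in a kernel
    command line string such as the contents of /proc/cmdline."""
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key != name:
            continue
        if value is None:
            return True
        # Options like acpi_sleep accept several comma-separated modes.
        if sep and value in val.split(","):
            return True
    return False


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel."""
    with open(path) as f:
        return f.read().strip()
```

For example, has_kernel_param(running_cmdline(), "acpi_sleep", "sci_force_enable") should be true after rebooting with the option above.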
Comment 26 Alex Chiang 2010-04-19 20:14:45 UTC
Created attachment 26053 [details]
add various lenovo machines to acpisleep_dmi_table

Please try this patch, which adds the Lenovo laptops to acpisleep_dmi_table.

We have some information from Lenovo BIOS engineers that indicates forcing OSPM to write to SCI_EN after resume is what Windows does.
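
For readers unfamiliar with how a DMI quirk table works, the idea can be modelled roughly as below. This is a sketch only: the entries, field names and substring matching are illustrative assumptions, and are NOT copied from the attached patch.

```python
# Hypothetical quirk table, modelled on the idea of a DMI match list.
# These entries are made-up examples, NOT the contents of attachment 26053.
EXAMPLE_QUIRKS = [
    {
        "ident": "Example ThinkPad T410",
        "match": {"sys_vendor": "LENOVO", "product_version": "ThinkPad T410"},
    },
    {
        "ident": "Example ThinkPad W510",
        "match": {"sys_vendor": "LENOVO", "product_version": "ThinkPad W510"},
    },
]


def find_quirk(dmi, quirks=EXAMPLE_QUIRKS):
    """Return the ident of the first entry whose match strings all occur
    as substrings of the corresponding DMI fields, or None."""
    for quirk in quirks:
        if all(sub in dmi.get(field, "") for field, sub in quirk["match"].items()):
            return quirk["ident"]
    return None
```

On a real system the field values could be read from /sys/class/dmi/id/sys_vendor and /sys/class/dmi/id/product_version; a machine that matches an entry would get the SCI_EN write on resume, and everything else would be left alone.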
Comment 27 Jerone Young 2010-04-20 06:50:53 UTC
@Alex
       For this issue it may be best not to have these in the kernel, since Lenovo is resolving the issue in their updated BIOS. I believe the table is mainly reserved for BIOSes that have no hope of changing.
Comment 28 Maxim Levitsky 2010-04-20 15:16:47 UTC
Sorry to hijack this thread, but I have a very similar issue on my Acer Aspire 5720.

It is well known that its BIOS hangs on the second resume from RAM.
The 'acpi_sleep=sci_force_enable' option doesn't help, and neither does the myriad of other things I tried (I am not a newbie; I have even written some kernel code).

So the only option left is to contact Acer's BIOS team.
Maybe you know which strings to pull to make that possible?
Comment 29 Matthew Garrett 2010-04-20 15:58:46 UTC
Jerone:

Given that the BIOS can't be updated under Linux, and given that there's no
indication to the users that there's a BIOS bug, it makes sense to just make
the software work. When hardware works fine under another OS then that
indicates that we need to make Linux mimic the behaviour of that OS to the
closest extent practical.
Comment 30 Jerone Young 2010-04-20 16:14:29 UTC
@Matthew
      Actually you can update the BIOS. For ThinkPads they provide a DOS-based ISO for updating the BIOS.

      Also, this just happened to be a case where things fell outside the ACPI specification by accident. Under Windows 7 they just force this ... but I'm not sure what the consequences are of doing this on other systems.

@Maxim
      I can ask some folks working for us who work with Acer. But no promises.
Comment 31 Matthew Garrett 2010-04-20 16:21:58 UTC
Jerone:

The X201 has no optical drive.
Comment 32 Jerone Young 2010-04-20 16:25:14 UTC
@Matthew
       You can use a USB CD-ROM drive. I just did this on an X201.
Comment 33 Matthew Garrett 2010-04-20 16:28:06 UTC
So people who don't own a USB optical drive get stuck with a broken OS? That's less than optimal.
Comment 34 Jerone Young 2010-04-20 16:38:30 UTC
@Matthew 
      Also, there is an issue with USB after suspend that we are looking to fix in the BIOS as well. So you will need to update anyway.

      If you have a problem with the BIOS update methods, it's best to take it up with Lenovo.
Comment 35 Matthew Garrett 2010-04-20 16:44:20 UTC
The USB bug appears to be worked around by unloading and reloading the hci drivers, which is an indication that we're doing something wrong in our resume path. We should fix the bug, not get vendors to release BIOS updates.
Comment 36 Jerone Young 2010-04-20 17:09:57 UTC
@Matthew
      Hmm .. I don't want to sidetrack this bug, but I tried this on a T410 and it did not help with the USB bug. I will try again.

      We are tracking and discussing it on Launchpad here:
https://bugs.launchpad.net/oem-priority/+bug/566149
Comment 37 Adam Williamson 2010-04-20 19:36:40 UTC
matthew: googling around suggests that grub4dos can actually boot ISO images directly, which may be a way to do the BIOS flash for people with no optical drive. It's hedged around, but sounds like this might be the kind of ISO for which it'd work:

http://diddy.boot-land.net/grub4dos/files/map.htm#hd32
Comment 38 Sandro Mani 2010-04-22 20:16:24 UTC
Just a note on updating: the best method actually seems to be http://www.thinkwiki.org/wiki/BIOS_update_without_optical_disk
Comment 39 Matthew Garrett 2010-04-22 20:27:42 UTC
Patch is posted to linux-acpi, but I don't seem to have sufficient privileges to alter the bug state.
Comment 40 Zhang Rui 2010-05-10 06:26:43 UTC
okay, patch available at https://patchwork.kernel.org/patch/94711/
Comment 41 Maxim Levitsky 2010-05-18 21:13:52 UTC
@Jerone Young: was there any success contacting the Acer folks?

This problem affects many people, and it is very frustrating.

(In fact I recently switched to the nouveau driver, which works almost perfectly here, and with a patch to skip the VT switch on suspend/resume, my system resumes from RAM and is ready to work instantly, seemingly even faster than in Windows.)

But it happens only once.... :-(
Comment 42 Maxim Levitsky 2010-05-18 21:16:41 UTC
Sorry for the typos above (I need some sleep), but I want to add that if I had an email address for the team that writes the BIOS, I could team up with other users who suffer from the same problem and send mail to that team together.
Comment 43 Bryn Hughes 2010-05-18 21:37:47 UTC
This issue is resolved both in the latest mainstream kernels and in the latest Lenovo BIOS for the affected machines.  Either fix will work - the newer kernel duplicates the behaviour of that "Other" OS when required, while the Lenovo BIOS now follows the ACPI specifications properly.
Comment 44 Maxim Levitsky 2010-05-18 22:00:01 UTC
@Bryn Hughes, I have an unrelated problem. My system is an Acer Aspire 5720, and it is reported at https://bugzilla.kernel.org/show_bug.cgi?id=13931.

I asked Jerone Young to see if he could contact Acer about that problem.
Sorry for the noise.
Comment 45 Len Brown 2010-05-20 03:04:12 UTC
The init_set_sci_en_on_resume DMI patches
mentioned in comment #4 (and others) shipped in 2.6.34.

this bug report is closed.

Note that we will try to clean up the workaround in 2.6.35
by doing away with the DMI lists.
Comment 46 Sérgio M Basto 2015-03-16 21:08:39 UTC
(In reply to Alex Chiang from comment #25)
> Bryn, Stefan, others who can reproduce this bug:
> 
> Please try booting with this kernel command line option and see if it helps:
> 
> acpi_sleep=sci_force_enable
> 
> Please report back your results. Thanks.

Hi, acpi_sleep=sci_force_enable fixes pm-suspend on my laptop, beautiful!

I was searching for a way to prevent suspend on lid close, because suspend gives me corrupted disk access after wakeup.

Someone pointed me to this bug. I saw some USB errors on wakeup that match the description, so I tested it and it works!
Comment 47 Sérgio M Basto 2015-03-16 21:11:24 UTC
Created attachment 170841 [details]
dmidecode of laptop that needs acpi_sleep=sci_force_enable

Many thanks!
Comment 48 Matthew Garrett 2015-03-16 21:19:47 UTC
Sergio,

The sci_force_enable option was removed from the kernel in 2010, so that's unlikely to be what's helping you.
Comment 49 Sérgio M Basto 2015-03-17 02:27:42 UTC
:/ On the second attempt it fails again.

BTW, this parameter is still in the documentation:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/Documentation/kernel-parameters.txt?id=refs/tags/v3.18.9#n359

sci_force_enable causes the kernel to set SCI_EN directly on resume from S1/S3 (which is against the ACPI spec, but some broken systems don't work without it).

We need to revert the documentation part of this patch:

https://lists.ubuntu.com/archives/kernel-team/2010-April/010243.html