Bug 80351

Summary: Suspend Failure on Lenovo Thinkpad Edge E540 (Model 20C6CTO1WW)
Product: Power Management Reporter: Jörn Horstmann (joern-bugs)
Component: Hibernation/Suspend Assignee: Zhang Rui (rui.zhang)
Status: CLOSED DOCUMENTED    
Severity: low CC: bodia.id, danyer, dudaerich, gbr, idfred, karsten-bugzilla-kernel.org, murat.asya, rui.zhang, tianyu.lan, tomasz89, wangkangping
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 3.17-rc1 Tree: Mainline
Regression: No
Attachments: Bug report form with hardware information
the acpidump out of Lenovo ThinkPad Edge 540 Model 20C60043TX on J9ET91WW (2.11 )bios
acpidump file for version 2.13
2.13 bios information
I got this kernel dmesg log when the system crashed. The log includes warning messages about ACPI
Loaded kernel modules on the system that is crashing due to this issue

Description Jörn Horstmann 2014-07-16 15:39:46 UTC
The laptop fails to suspend, meaning the fan stays on and resume is not
possible. The problem exists at least between kernel version 3.13 (Ubuntu
14.04 default) and 3.16-rc4; I did not test any earlier versions.

Suspend works if USB 3.0 is disabled in the BIOS.

I reported this bug initially on launchpad at
<https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1340376>.

Another report <https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1331077>
with similar hardware traced the problem to a BIOS update:

> I confirmed that this is definitely a BIOS issue. The last working BIOS is
> v1.61 [2], so the change was introduced by v2.07 [1]. Installing the latest
> v2.08 [3] does not fix this problem.
>
> So this is the changelog for v2.07 [1]:
> - (Fix) Fixed an issue that system will auto wakeup after shutdown when
>         battery capacity is under 15% on DC only.
> - (Fix) Fixed an issue related to APS.
> - (New) Added support for new GPU config.

BIOS revisions 2.09 and 2.11 also did not fix the problem. Suspending with
pm_trace did not point to any matching devices:

[    1.174760]   Magic number: 0:303:740
[    1.174762]   hash matches /home/apw/COD/linux/drivers/base/power/main.c:812

When setting pm_test to 'core' the system comes back up without problems.
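
For reference, the pm_trace result appears in the kernel log of the boot that follows the hang. A minimal sketch for pulling out the relevant lines, assuming the log text is piped in on stdin; the function name pm_trace_lines is made up for illustration and is not part of any kernel tooling:

```sh
#!/bin/sh
# pm_trace_lines: hypothetical helper (name is an assumption).
# Reads kernel log text on stdin and prints only the pm_trace lines:
# the "Magic number" line and any "hash matches" lines.
pm_trace_lines() {
    grep -E 'Magic number:|hash matches'
}

# Typical use after rebooting from a failed suspend:
#   dmesg | pm_trace_lines
#   cat /sys/power/pm_trace_dev_match
```

A "Magic number" line without a device match, as in the output above, suggests that the hang did not occur inside a driver's suspend callback.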
Comment 1 Jörn Horstmann 2014-07-16 15:43:23 UTC
Created attachment 143241 [details]
Bug report form with hardware information
Comment 2 Lan Tianyu 2014-07-17 01:38:22 UTC
Could you check suspend without xhci driver?
Comment 3 Jörn Horstmann 2014-07-17 09:06:24 UTC
It looks like this kernel config has xhci built in; I will build a kernel where it's a module and try again.

$ grep -i xhci /boot/config-3.16.0-031600rc4-generic 
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_XHCI_PLATFORM=m

I was testing again from recovery console with this script:

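#!/bin/sh
# Flush filesystem buffers before the risky suspend attempt
sync
# Keep the console active during suspend and print all log levels
echo N > /sys/module/printk/parameters/console_suspend
echo Y > /sys/module/printk/parameters/ignore_loglevel
echo 1 > /sys/power/pm_async
# Record the last suspend/resume step in the RTC for the next boot
echo 1 > /sys/power/pm_trace
# $1 selects the pm_test level, e.g. 'core' or 'none'
echo "$1" > /sys/power/pm_test
echo mem > /sys/power/state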

That way I could still see the log messages on the console. With parameter 'core' for pm_test the system resumed in 5 seconds; with 'none' the last messages were:

kvm: disabling virtualization on CPU7
smpboot: CPU 7 is now offline
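
Since 'core' works and 'none' hangs, the intermediate pm_test levels are not the interesting part here, but stepping through all of them is a common way to narrow down a suspend failure. A rough sketch under stated assumptions: the names try_level, PM_TEST_LEVELS and POWER_DIR are made up for illustration, and POWER_DIR exists only so the function can be pointed at a scratch directory for a dry run.

```sh
#!/bin/sh
# Hypothetical sketch: run a suspend attempt at a given pm_test level.
# POWER_DIR is an assumption for dry runs; on a real system it is /sys/power.
POWER_DIR=${POWER_DIR:-/sys/power}

try_level() {
    level=$1
    sync
    # Select how far the suspend sequence goes before it is rolled back
    echo "$level" > "$POWER_DIR/pm_test" || return 1
    # Start the (test) suspend to RAM
    echo mem > "$POWER_DIR/state"
}

# pm_test levels, ordered from the least to the most of the suspend path
# exercised; 'none' performs a real suspend.
PM_TEST_LEVELS="freezer devices platform processors core none"

# Usage (as root), one level at a time; reboot if a level hangs:
#   for l in $PM_TEST_LEVELS; do echo "testing $l"; try_level "$l" || break; done
```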
Comment 4 Jörn Horstmann 2014-07-17 14:50:52 UTC
I built a 3.16.0-rc5 kernel with CONFIG_USB_XHCI_HCD=m, then did a

rmmod xhci_hcd

before trying to suspend. The system still hangs at the same point as before.

The only difference I see is that the dmesg output after reboot no longer contains a 'hash matches' line.

$ dmesg | grep -C2 Magic
[    2.832383] ima: No TPM chip found, activating TPM-bypass!
[    2.832398] evm: HMAC attrs: 0x1
[    2.832977]   Magic number: 4:676:740
[    2.833066] rtc_cmos 00:02: setting system clock to 2000-01-19 15:43:49 UTC (948296629)
[    2.833229] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
Comment 5 Lan Tianyu 2014-07-18 02:38:02 UTC
Please comment out CONFIG_USB_XHCI_HCD in the config file so that xhci is not built into the kernel, then try again.

Could you cat /sys/power/pm_trace_dev_match when there is a magic number? It should show the device causing the failure of system suspend.
Comment 6 Jörn Horstmann 2014-07-18 09:52:04 UTC
Ok, I did that:

$ grep -i xhc .config
# CONFIG_USB_XHCI_HCD is not set

$ lsusb 
unable to initialize libusb: -99

$ cat /proc/version 
Linux version 3.16.0-rc5+ (jh@JH-EDGE-E540) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #17 SMP Fri Jul 18 11:06:28 CEST 2014

Same behaviour and magic number as in the previous comment:

$ dmesg | grep -C3 -i magic 
[    1.143904] AppArmor: AppArmor sha1 policy hashing enabled
[    1.143909] ima: No TPM chip found, activating TPM-bypass!
[    1.143929] evm: HMAC attrs: 0x1
[    1.144593]   Magic number: 4:676:740
[    1.144692] rtc_cmos 00:02: setting system clock to 2000-01-19 15:44:08 UTC (948296648)
[    1.144786] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[    1.144788] EDD information not available.

But no 'hash matches'

$ dmesg | grep -C3 -i matches

And nothing in pm_trace_dev_match

$ cat /sys/power/pm_trace_dev_match

So it seems usb/xhci isn't the culprit. Any other ideas how to debug this?
Comment 7 Jörn Horstmann 2014-07-18 21:53:53 UTC
Managed to enable acpi debugging and got some output:

[397.130987] hwvalid-0148 hw-validate-io-request: Address 0000000000001804 LastAddress 0000000000001805 Length 2
hwregs-0192 hw-read
Read 00000001 Width 16 from 0000000000001804 (SystemIO)
[397.131175] hwsleep-0119 hw-legacy-sleep Entering sleep state [S3]
[397.131258] hwvalid-0148 hw-validate-io-request Address 0000000000001804 LastAddress 0000000000001805 Length 2
hwregs-0243 hw-write
Wrote 00001401 width 16 to 0000000000001804 (SystemIO)
[397.131727] hwvalid-0148 hw-validate-io-request Address 0000000000001804 LastAddress 0000000000001805 Length 2

If I read drivers/acpi/acpica/hwsleep.c correctly, suspend is invoked by writing the bit masks for SLP_TYP and for SLP_EN in two separate steps, and the system is hanging right before the second write logs its success.
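
For illustration, the value in the 'Wrote 00001401' line can be decoded with the PM1 control register layout from the ACPI specification (SCI_EN in bit 0, SLP_TYP in bits 10-12, SLP_EN in bit 13). A minimal sketch; the function name decode_pm1_cnt is made up here, and the SLP_TYP value of 5 for S3 is simply what this log shows, since that value is platform-specific:

```sh
#!/bin/sh
# decode_pm1_cnt: hypothetical helper (name is an assumption).
# Splits a PM1 control register value into the fields relevant to sleep.
decode_pm1_cnt() {
    val=$(( $1 ))
    sci_en=$(( val & 1 ))            # bit 0: SCI_EN
    slp_typ=$(( (val >> 10) & 7 ))   # bits 10-12: SLP_TYP
    slp_en=$(( (val >> 13) & 1 ))    # bit 13: SLP_EN
    echo "SCI_EN=$sci_en SLP_TYP=$slp_typ SLP_EN=$slp_en"
}

decode_pm1_cnt 0x1401   # prints "SCI_EN=1 SLP_TYP=5 SLP_EN=0"
```

If this layout applies, the logged write carries only SLP_TYP, and the second write would be 0x3401 with SLP_EN added, which is consistent with the hang happening around the SLP_EN write.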
Comment 8 Jörn Horstmann 2014-08-18 22:13:38 UTC
The suspend problem still occurs with 3.17-rc1. I also tested older kernel versions, and the same problem occurs, for example, with 3.3.7 (version as of Ubuntu 12.04).

I can also confirm that a BIOS downgrade to 1.61 makes suspend and resume work, as noted in the first comment. I've got ACPI dumps of both versions; would these help in tracking down the problem?
Comment 9 Murat Ödünç 2014-08-22 14:38:40 UTC
I have a Lenovo ThinkPad e530 laptop too, and this bug affects me. My laptop model is 20C60043TX.

I have tried the J9ET91WW (2.11) BIOS with Ubuntu 14.04 LTS, and suspend is still affected by this bug.

While looking for a solution, I read that disabling USB 3.0 fixes it. That works for me.
Comment 10 Zhang Rui 2014-08-25 03:20:32 UTC
Is there any chance that you can try Windows on your laptop and check if suspend works with USB 3.0 enabled in the BIOS?
Please attach the acpidump output of your laptop.
Comment 11 Murat Ödünç 2014-08-25 07:58:35 UTC
Only Ubuntu 14.04 is installed on my laptop; Windows is not.

How do I get the acpidump output from my laptop? The 'acpidump' command is not available on Ubuntu 14.04.

I want to help fix this issue.
Comment 12 Zhang Rui 2014-08-26 02:53:59 UTC
Just apt-get install acpidump.
Comment 13 Murat Ödünç 2014-08-26 19:21:19 UTC
Created attachment 148411 [details]
the acpidump out of Lenovo ThinkPad Edge 540 Model 20C60043TX on J9ET91WW (2.11 )bios

This file was created on a Lenovo ThinkPad Edge 540 Model 20C60043TX. The BIOS version at the time was J9ET91WW (2.11).
Comment 14 Zhang Rui 2014-08-27 07:35:46 UTC
(In reply to Jörn Horstmann from comment #8)
> Suspend problem still occurs with 3.17-rc1, I tested older kernel versions
> and the same problem occurs for example with 3.3.7 (version as of Ubuntu
> 12.04).
> 
> I can also confirm that a bios downgrade to 1.61 makes suspend and resume
> work, as noted in the first comment.

Sorry, I missed this post.
I will close this bug, as it is a BIOS issue and there is nothing we can do in the Linux kernel to fix or work around it.
Comment 15 Murat Ödünç 2014-09-21 16:48:54 UTC
Created attachment 151171 [details]
acpidump file for version 2.13
Comment 16 Murat Ödünç 2014-09-21 16:50:00 UTC
Created attachment 151181 [details]
2.13 bios information
Comment 17 Murat Ödünç 2014-09-21 16:53:55 UTC
Created attachment 151191 [details]
I got this kernel dmesg log when the system crashed. The log includes warning messages about ACPI
Comment 18 Murat Ödünç 2014-09-21 16:55:50 UTC
Created attachment 151201 [details]
Loaded kernel modules on the system that is crashing due to this issue
Comment 19 Murat Ödünç 2014-09-21 16:56:36 UTC
Lenovo has published a new BIOS, version 2.13, and I have installed it on my machine. I want to report that this issue still persists. I want to share some information to help fix the issue.
Comment 20 Zhang Rui 2014-09-22 01:49:20 UTC
(In reply to Murat Ödünç from comment #19)
> Lenovo have been published new bios that 2.13 and I have installed  on my
> machine. I want to report that  this issue still go on..  I want to share 
> same information to fixing  the issue..

Well, I'd say it's difficult to root-cause this issue from the OS without help from a BIOS engineer, as we don't know what changes have been made or which of them may lead to the system crash.
So the best way is to raise this problem with the Lenovo BIOS guys, and I don't know how to do this...
Comment 21 Murat Ödünç 2014-09-23 15:52:56 UTC
Yes, it looks like you are right that Lenovo can fix this issue.

I have sent a couple of tweets about this issue to the official Lenovo account.

They directed me to the Lenovo Geeks Forum :D That is a joke!

Thank you for your attention, Zhang Rui
Comment 22 Murat Ödünç 2014-09-23 16:12:13 UTC
I checked my Twitter just now. A person who works at Lenovo has sent me a tweet.

From his tweet:
"@muratsplat @lenovo Murat, Would you send me an email with more details? markah@lenovo.com We'll look at what you have in bugzilla"
Comment 23 Dan Andresan 2014-10-03 10:22:53 UTC
(In reply to Murat Ödünç from comment #22)
> Now I checked my twitter just now,  A person who works Lenovo have been
> seended twitt to me .
> 
> form his tweet:
> "@muratsplat @lenovo Murat, Would you send me an email with more details?
> markah@lenovo.com We'll look at what you have in bugzilla"

Hi Murat, have you followed up with markah@lenovo.com? I have the same problem so I would like to contact him/her in case you abandoned this issue.
Comment 24 Murat Ödünç 2014-10-04 19:40:13 UTC
Hi Dan,

I haven't. I suggested that the Lenovo employee (markah@lenovo.com) follow this bug page, because there is a lot of information about the suspend issue on this page.

There has not been any action from Lenovo to fix the issue. I'm waiting for Lenovo to fix this issue.
Comment 25 Dan Andresan 2014-10-14 08:29:02 UTC
Murat,

I contacted markah@lenovo.com asking him to put me in contact with a developer from BIOS/UEFI team, but no answer at all. I offered to run all the tests required and try different beta BIOSes that the engineers would provide.

Then I googled a little bit and found this: http://download.lenovo.com/lenovo/eSupport%20Replacement%20Communication%2005.20.11.ppt which shows that markah@lenovo.com is Mark Hopkins, lead/boss/admin/"community strategist" of Lenovo Community initiative. http://www.lithium.com/why-lithium/customer-success/lenovo

All respect to Mark, but I think we would need someone with access to technical people ;)
Comment 26 Murat Ödünç 2014-10-16 11:45:37 UTC
Yes, I found the same information about Mark as you did.

I hope Lenovo fixes this issue.

It looks like there isn't anything we can do.
Comment 27 Dan Andresan 2014-10-30 19:43:36 UTC
Today Lenovo released BIOS 2.16 for E540/440. Quite a jump from 2.13.
The problem is NOT solved. The disabling of USB 3.0 in BIOS permits the suspend (as before). All in all, no change.
Comment 28 Murat Ödünç 2014-10-30 21:41:12 UTC
Thank you for the news.

But it looks like bad news :)

Lenovo has to do something to fix it.
Comment 29 Ivan Frederiks 2015-03-20 15:44:47 UTC
Lenovo says that they fixed this issue in BIOS 2.18: http://support.lenovo.com/us/en/products/laptops-and-netbooks/thinkpad-edge-laptops/thinkpad-edge-e540/downloads/DS037208
Comment 30 Dan Andresan 2015-03-20 16:37:48 UTC
Thank you Ivan for the notification.
Thank you Lenovo for fixing this bug.

I updated my BIOS (should we still call it BIOS, now that it's UEFI?) to 2.18.
I set the USB 3.0 mode to Enabled.

Suspended and resumed and cried with joy. It's fixed!

What I didn't test:

- set the USB 3.0 mode to Auto (don't know what it does, how is it different from Enabled)
- does it actually work at USB 3.0 speed?

I know that this bug is closed, but now it can be closed again ;)
Comment 31 Murat Ödünç 2015-03-20 19:59:52 UTC
Hi guys. This is big news you have given us :)

Thanks, Ivan, for the news.

I will try the new BIOS as soon as possible.

Lenovo shows again that it is a Linux lover!