Bug 106801

Summary: Lenovo B50-30 Laptop freezes on shutdown
Product: ACPI
Reporter: strk
Component: Power-Off
Assignee: acpi_power-off
Status: CLOSED UNREPRODUCIBLE
Severity: normal
CC: aaron.lu, diabloid, rui.zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.17-30-gdbcbe68 (first bad), 3.18.22, 3.18.0, 3.18.16, 3.19.0, 3.19.0-25-generic, 3.16.0-4-686-pae, 3.2.04.686-pae
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: lshw output

Description strk 2015-10-28 11:02:19 UTC
A software-requested shutdown fails to power off the machine as its last step. The problem is described here: http://ubuntuforums.org/showthread.php?t=2285917&page=2&p=13380825 and here (in Italian): http://www.tomshw.it/forum/linux-e-altri-sistemi-operativi/534783-lenovo-b50-30-shutdown-freeze.html

According to various reporters it's due to a bug in the machine BIOS (version 2.08) and fixed by upgrading it to a new one (version 2.11). But the BIOS upgrade can only be performed by running a Windows .exe file provided by Lenovo ( http://support.lenovo.com/us/en/products/laptops-and-netbooks/lenovo-b-series-laptops/lenovo-b50-30-notebook/downloads/DS104434 )

Empirically, I found that the kernel shipped with Trisquel 7.0 ( 3.13.0-39-lowlatency ) succeeds in making the machine shut down, so I'm trying to understand whether this is a regression in the kernel or something else.

If I may be of any help with further debugging the issue I'll keep the machine for that. Otherwise I might return it to the seller as "broken".
Comment 1 strk 2015-10-28 19:26:44 UTC
'3.13.0-66-lowlatency #108+7.0trisquel2' is also ok. I'm now trying to install a Trisquel version of a known-to-fail kernel, to see whether the failure is due to the configuration (or blobs).
Comment 2 strk 2015-10-28 20:01:26 UTC
I confirm the bug happens with '3.19.0-25-generic #26~14.04.01-Ubuntu', which is the version packaged in Trisquel, stripped of any blobs.

In this case I noticed this error at boot time:
ACPI PCC probe failed
Comment 3 strk 2015-10-28 21:54:25 UTC
Kernel '3.16.0-51-lowlatency' is also good, so the regression must have happened in 3.17, 3.18 or 3.19.
Comment 4 Aaron Lu 2015-10-29 02:52:03 UTC
To find the offending kernel commit, you will need to use git bisect:
https://www.kernel.org/pub/software/scm/git/docs/git-bisect-lk2009.html

First, you will need to find two adjacent kernel versions where the first works and the second doesn't, and then start the bisect from there.
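
For readers unfamiliar with the idea, the search that git bisect automates can be sketched in a few lines. This is only an illustration of the principle, not how git implements it: the commit labels `c0`..`c7` and the `is_bad` predicate are hypothetical stand-ins for "build this kernel, boot it, and check whether shutdown hangs".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good, commits[-1] is known bad, and every commit after the first bad
    one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


# Hypothetical history: c0 is the known-good end, c7 the known-bad end,
# and the (made-up) breakage was introduced in c5.
history = [f"c{i}" for i in range(8)]
print(first_bad(history, lambda c: int(c[1:]) >= 5))  # c5
```

Each round halves the remaining range, so a range of N commits needs roughly log2(N) builds and test boots. A single wrong good/bad verdict sends the search into the wrong half, so every verdict should come from the same test procedure.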
Comment 5 strk 2015-10-30 07:36:12 UTC
3.18.22 is still good
Comment 6 strk 2015-10-30 10:26:40 UTC
I'm taking it back, 3.18.22 isn't necessarily good (I might have built it incorrectly).
Relying only on prepackaged kernels from http://kernel.ubuntu.com/~kernel-ppa/mainline/ I found the latest GOOD to be 3.17.8 (as well as 3.17.0) and the first BAD to be 3.18.0 (as well as 3.18.22 and 3.18.23).
Comment 7 strk 2015-10-30 18:00:41 UTC
Bisect is in progress (but very very slow). So far:
3.17-19-g354f1db is GOOD
3.17-30-gdbcbe68 is BAD
Comment 8 strk 2015-10-31 09:31:37 UTC
Finally completed: 
dbcbe68bb76c4f8057160209859ecd7c75e86c30 is the first bad commit

Bisect log:
# bad: [f114040e3ea6e07372334ade75d1ee0775c355e1] Linux 3.18-rc1
# good: [bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9] Linux 3.17
git bisect start 'v3.18-rc1' 'v3.17' '--' 'drivers/acpi' 'arch/x86/kernel/acpi' 'include/acpi'
# good: [354f1dbe1905f8ab34ec5950277643a625b0c7f5] Merge branch 'acpi-video'
git bisect good 354f1dbe1905f8ab34ec5950277643a625b0c7f5
# bad: [dbcbe68bb76c4f8057160209859ecd7c75e86c30] Merge branches 'acpi-pnp' and 'acpi-blacklist'
git bisect bad dbcbe68bb76c4f8057160209859ecd7c75e86c30
# good: [871dd05c0520c2e4caf5516455fb08abc86cd703] Merge back earlier 'acpi-lpss' material for 3.18-rc1
git bisect good 871dd05c0520c2e4caf5516455fb08abc86cd703
# good: [a13f453140d542f9d5a0ee15601531c72e5401d7] Merge branch 'acpi-lpss'
git bisect good a13f453140d542f9d5a0ee15601531c72e5401d7
# good: [4990141496b82f91cb96b37100ac882ea5cee8b7] ACPI / PNP: remove Fujitsu device IDs from ACPI PNP ID list
git bisect good 4990141496b82f91cb96b37100ac882ea5cee8b7
# good: [8ee4104a681a3a30a495265825d8ebfe87d57d28] ACPI / blacklist: add Win8 OSI quirks for some Dell laptop models
git bisect good 8ee4104a681a3a30a495265825d8ebfe87d57d28
# first bad commit: [dbcbe68bb76c4f8057160209859ecd7c75e86c30] Merge branches 'acpi-pnp' and 'acpi-blacklist'
Comment 9 strk 2015-10-31 09:32:19 UTC
let me know how else I can be of help
Comment 10 strk 2015-10-31 11:05:37 UTC
The offending commit is a merge with 3 parents.
All parents are GOOD, the merge commit is BAD.
The smallest diff is against parent a13f453, and can be seen here:
http://strk.keybit.net/tmp/dbcbe68b-a13f453.diff
Comment 11 strk 2015-10-31 11:36:06 UTC
Created attachment 191691
lshw output
Comment 12 Zhang Rui 2015-11-03 07:56:59 UTC
does the problem still exist if you revert these two commits?
Comment 13 Zhang Rui 2015-11-03 08:02:17 UTC
(In reply to Zhang Rui from comment #12)
> does the problem still exist if you revert these two commits?

BTW, I don't think either of these two commits could cause any problem for your laptop. Please ignore my previous comment. :)
Comment 14 strk 2015-11-03 08:09:40 UTC
I will reverse-apply the patch linked in comment 10 and test again.
Comment 15 strk 2015-11-03 08:34:10 UTC
Actually, reverse-applying that patch takes me back to a13f453140d542f9d5a0ee15601531c72e5401d7 which I've already tested as being GOOD.
Comment 16 strk 2015-11-03 09:38:32 UTC
indeed I cannot confirm dbcbe68bb76c4f8057160209859ecd7c75e86c30 is bad either :(
I guess I'll have to restart bisecting
Comment 17 strk 2015-11-04 08:39:24 UTC
I've just found a bug in my testing procedure: lately I was testing with just "reboot", but I've now found that even 3.16.0-51-lowlatency fails with "halt", while it succeeds with "shutdown -h now" and "reboot".

Do these differences help in finding out more?
Comment 18 strk 2015-11-04 08:48:05 UTC
By contrast, the behaviour of 4.2.5-040205-lowlatency is:
 shutdown -h now: bad
 reboot: bad
 halt: bad
Comment 19 strk 2015-11-04 08:55:58 UTC
3.13.0-66-lowlatency behaves the same as 3.16.0-51-lowlatency:
 reboot: good
 shutdown -h now: good
 halt: bad
Comment 20 strk 2015-11-04 09:02:28 UTC
3.18.0-031800rc1-lowlatency behaves the same as 4.2.5-040205-lowlatency:
 shutdown -h now: bad
 reboot: bad
 halt: bad
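
The results reported in comments 17 to 20 can be collected into one table to make the pattern easier to see. The sketch below is only a convenience for reading the thread: the data is copied from those comments, while the helper names `regressed` and `never_good` are made up for illustration.

```python
from typing import Dict, List

Results = Dict[str, Dict[str, bool]]

# True = reported good, False = reported bad (comments 17 to 20).
RESULTS: Results = {
    "3.13.0-66-lowlatency": {"shutdown -h now": True, "reboot": True, "halt": False},
    "3.16.0-51-lowlatency": {"shutdown -h now": True, "reboot": True, "halt": False},
    "3.18.0-031800rc1-lowlatency": {"shutdown -h now": False, "reboot": False, "halt": False},
    "4.2.5-040205-lowlatency": {"shutdown -h now": False, "reboot": False, "halt": False},
}


def regressed(results: Results, good_kernel: str, bad_kernel: str) -> List[str]:
    """Commands reported good on good_kernel but bad on bad_kernel."""
    return sorted(
        cmd
        for cmd, ok in results[good_kernel].items()
        if ok and not results[bad_kernel][cmd]
    )


def never_good(results: Results) -> List[str]:
    """Commands reported bad on every kernel in the table."""
    commands = next(iter(results.values())).keys()
    return sorted(c for c in commands if not any(r[c] for r in results.values()))


print(regressed(RESULTS, "3.16.0-51-lowlatency", "3.18.0-031800rc1-lowlatency"))
print(never_good(RESULTS))
```

Only "shutdown -h now" and "reboot" change between the older and newer kernels, so those are the commands worth using as the good/bad test in any further bisection.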
Comment 21 Zhang Rui 2015-11-09 03:16:36 UTC
So it seems that halt is always bad on your system...

These look like two separate problems to me: one is the halt issue, the other is the shutdown/reboot issue.
Please first confirm whether the broken halt is a regression.
Comment 22 strk 2015-11-09 09:10:51 UTC
I actually could not find any system where "halt" powers off the machine, so I thought it isn't really meant to. Am I wrong?
Comment 23 Zhang Rui 2015-11-30 06:37:09 UTC
I think halt means the system stops, but power is not cut off.
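
At the kernel level this distinction shows up in the reboot(2) system call, which has separate commands for halting and for powering off. A minimal sketch listing the command values from the <linux/reboot.h> header follows; the `describe` helper is hypothetical, and which command a given userspace "halt" or "shutdown" binary ends up issuing depends on the init system and its options, so no such mapping is attempted here.

```python
# Command values from the Linux UAPI header <linux/reboot.h>.
LINUX_REBOOT_CMD_RESTART = 0x01234567
LINUX_REBOOT_CMD_HALT = 0xCDEF0123
LINUX_REBOOT_CMD_POWER_OFF = 0x4321FEDC

# Short descriptions paraphrased from the reboot(2) manual page.
DESCRIPTIONS = {
    LINUX_REBOOT_CMD_RESTART: "restart the system",
    LINUX_REBOOT_CMD_HALT: "halt the system; power is not removed",
    LINUX_REBOOT_CMD_POWER_OFF: "stop the system and remove power, if possible",
}


def describe(cmd: int) -> str:
    """Hypothetical helper: look up a human-readable description."""
    return DESCRIPTIONS.get(cmd, "unknown command")


for name, cmd in (
    ("RESTART", LINUX_REBOOT_CMD_RESTART),
    ("HALT", LINUX_REBOOT_CMD_HALT),
    ("POWER_OFF", LINUX_REBOOT_CMD_POWER_OFF),
):
    print(f"{name:<10}{cmd:#010x}  {describe(cmd)}")
```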
Comment 24 Powerman 2015-12-29 13:34:32 UTC
Halt is indeed not meant to cut power, AFAIK. 

This bug also appears on Lenovo B50-10 and the firmware upgrade (to v2.04) does not yet solve it. Downgrade to kernel linux-image-3.16.0-57-generic in Ubuntu Trusty solved it. In case someone wants to try the Lenovo Insyde UEFI upgrade instructions for Linux: https://forums.lenovo.com/t5/Welcome-FAQs-Knowledge-Base/How-to-flash-InsydeH2O-EFI-under-DOS-enviroment/ta-p/278406

Looking at the offending commit posted by strk@keybit.net (2015-10-31 11:05:37 UTC): is it possible that the Insyde UEFI misidentifies itself? Insyde firmware is in many cheap business laptops; I had the same problem with an HP 250 G3 (a UEFI update, performed on Linux, solved it).

Will try to isolate the problem and reconfigure-recompile some kernels if I get to it.
Comment 25 Zhang Rui 2015-12-31 01:14:30 UTC
strk@keybit.net,

Can you please restart your git bisect?
You just need to confirm if "shutdown -h now"/"reboot" work or not this time.
Comment 26 strk 2015-12-31 08:16:24 UTC
Zhang, I have settled for now on the latest working kernel (3.16.0-51-lowlatency) and have run out of time for this task. I guess I might get back to it after a possible future accidental upgrade...
Comment 27 Zhang Rui 2015-12-31 08:22:17 UTC
Well, it's a pity to drop your effort on this, since we have by now narrowed the problem down to between 3.16-51 and 3.18.0-031800rc1.
If you really don't have time for this, I will close this bug report for now.
Please feel free to re-open it at any time when you can restart your bisection.