Bug 187311 - thinkpad x60:Thermal shutdown
Summary: thinkpad x60:Thermal shutdown
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core
Hardware: Intel Linux
Importance: P1 normal
Assignee: Lv Zheng
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-11-08 18:35 UTC by Srinivas Pandruvada
Modified: 2016-11-24 09:33 UTC

See Also:
Kernel Version: 4.9-rc2
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
thermal zone dump (5.45 KB, application/octet-stream)
2016-11-08 18:35 UTC, Srinivas Pandruvada
thermal core debug log (5.04 KB, text/plain)
2016-11-08 18:47 UTC, Srinivas Pandruvada
dmesg with acpi_cpufreq debug and processor_perflib.c debug with ppc (134.01 KB, application/octet-stream)
2016-11-08 18:52 UTC, Srinivas Pandruvada
v4.8 ppc changed dynamically dmesg (1.43 KB, application/octet-stream)
2016-11-08 18:55 UTC, Srinivas Pandruvada
.config for (broken) v4.9-rc (111.60 KB, application/octet-stream)
2016-11-09 07:15 UTC, Pavel Machek
ACPIdump for X60 (270.59 KB, application/octet-stream)
2016-11-09 07:26 UTC, Pavel Machek
test script, with -t it should check for a bug. (5.58 KB, application/octet-stream)
2016-11-13 17:39 UTC, Pavel Machek
signature.asc (182 bytes, application/pgp-signature)
2016-11-16 07:53 UTC, Pavel Machek

Description Srinivas Pandruvada 2016-11-08 18:35:28 UTC
Created attachment 243951
thermal zone dump

The ThinkPad X60 dmesg shows:
[14049.733423] thinkpad_acpi: THERMAL EMERGENCY: a sensor reports something is extremely hot!

This eventually causes a thermal shutdown.

The ThinkPad X60 has regressed compared to v4.8-rc1, where the CPU was forced to run at a lower frequency under stress workloads such as a kernel compile.

With 4.8-rc1
.....................
The bios_limit is changed with high temperature

/sys/devices/system/cpu/cpu0/cpufreq/bios_limit:1000000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:1833000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:1000000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:1000000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:1000000

------------
4.9-rc2

The BIOS limit always stays high irrespective of temperature.

sudo watch cat /proc/acpi/ibm/thermal /sys/devices/system/cpu/cpu0/cpufreq/bios_limit /sys/devices/virtual/thermal/thermal_zone1/temp /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq

temperatures:   98 49 -128 85 28 -128 28 -128 49 58 -128 -128 -128 -128 -128 -128
1833000
95000
1833000
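
The comparison above can be automated. The following is a minimal sketch, not the test script attached to this bug: the function names are hypothetical, and it assumes that thinkpad_acpi reports -128 for a sensor slot that is not populated. It parses the "temperatures:" line and reports whether bios_limit sits below cpuinfo_max_freq, which is what the 4.8-rc1 readings show and the 4.9-rc2 readings do not.

```python
from pathlib import Path

# Assumption (not stated in this report): thinkpad_acpi prints -128 for a
# sensor slot that is not populated, so those readings are ignored.
ABSENT_SENSOR = -128


def parse_ibm_thermal(text):
    """Return the populated sensor readings (deg C) from /proc/acpi/ibm/thermal text."""
    _, _, values = text.partition("temperatures:")
    return [int(v) for v in values.split() if int(v) != ABSENT_SENSOR]


def read_khz(path):
    """Read one cpufreq sysfs attribute; cpufreq reports frequencies in kHz."""
    return int(Path(path).read_text().strip())


def bios_is_limiting(cpufreq_dir):
    """True when bios_limit is below cpuinfo_max_freq, i.e. firmware caps the CPU."""
    base = Path(cpufreq_dir)
    return read_khz(base / "bios_limit") < read_khz(base / "cpuinfo_max_freq")


if __name__ == "__main__":
    temps = parse_ibm_thermal(Path("/proc/acpi/ibm/thermal").read_text())
    limited = bios_is_limiting("/sys/devices/system/cpu/cpu0/cpufreq")
    print(f"hottest sensor: {max(temps)} C, BIOS limiting: {limited}")
```

With the 4.9-rc2 readings above, the hottest populated sensor is 98 and bios_limit equals cpuinfo_max_freq, so the sketch would report that the BIOS is not limiting the CPU.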
Comment 1 Srinivas Pandruvada 2016-11-08 18:37:20 UTC
This bug was entered on behalf of

Pavel Machek <pavel@ucw.cz>

to better track logs and analysis.
Comment 2 Srinivas Pandruvada 2016-11-08 18:47:52 UTC
Created attachment 243961
thermal core debug log

Thermal passive cooling is taking action at 92C, which may not be aggressive enough for this system to prevent thermal shutdown.
Comment 3 Srinivas Pandruvada 2016-11-08 18:52:53 UTC
Created attachment 243971
dmesg with acpi_cpufreq debug and processor_perflib.c debug with ppc

The PPC is never changed.
Comment 4 Srinivas Pandruvada 2016-11-08 18:55:39 UTC
Created attachment 243981
v4.8 ppc changed dynamically dmesg

In v4.8 _PPC is getting changed.
Comment 5 Srinivas Pandruvada 2016-11-08 19:01:29 UTC
Experimented on a Haswell system and simulated _PPC change via TDP change to make sure that 

"static void acpi_processor_notify(acpi_handle handle, u32 event, void *data)"

is called and _PPC change is notified via processor_perflib.c.

So this path is not broken: if the BIOS notifies via ACPI_PROCESSOR_NOTIFY_PERFORMANCE, _PPC will be read again and work correctly.

So something caused the ThinkPad X60 to stop sending _PPC change notifications.
Comment 6 Srinivas Pandruvada 2016-11-08 19:03:31 UTC
So we need to bisect to find out.
Meanwhile, please attach:

1. output of
# acpidump

2. kernel .config file
Comment 7 Pavel Machek 2016-11-09 07:07:39 UTC
Some additional data points:

4.8-final works ok.

4.9-rc4 is also broken.

Same problem can be seen on thinkpad T40p: bios_limit does not get updated properly.
Comment 8 Pavel Machek 2016-11-09 07:15:18 UTC
Created attachment 244031
.config for (broken) v4.9-rc
Comment 9 Pavel Machek 2016-11-09 07:26:18 UTC
Created attachment 244041
ACPIdump for X60

If you want acpidump from T40p, you can have it, too...
Comment 10 Chris Rankin 2016-11-09 12:09:46 UTC
Interesting - I have a T60p, and I am noticing that a 4.8.x kernel sometimes takes a ridiculous amount of time to boot. Specifically, the grub message about "Loading the initial ramdisk" appears and then I can end up waiting *minutes* for the next line of text. And then the next, etc.

This behaviour is intermittent and unpredictable, though. And I have no idea how to debug it. But when it happens, it does look like the CPU is running at clockwork speeds.
Comment 11 Pavel Machek 2016-11-09 12:18:32 UTC
Chris: if it is thermal, you should be able to monitor /proc/acpi/ibm/thermal and /sys/devices/.../cpu.../cpufreq/bios_limit and see the limiting. But thermal protection can "only" cause something like a factor-of-3 slowdown, and this stuff is not active while grub is loading the initial ramdisk. So you probably have something else... Perhaps best for a new bug report. ... and if I understand it correctly, that would be a grub bug, not a kernel one.
Comment 12 Pavel Machek 2016-11-13 17:39:12 UTC
Created attachment 244321
test script, with -t it should check for a bug.
Comment 13 Pavel Machek 2016-11-13 20:49:30 UTC
I tried to bisect on T40p, unfortunately half of kernels do not boot there :-(.

pavel@amd:/data/l/linux$ git bisect log
# bad: [1001354ca34179f3db924eb66672442a173147dc] Linux 4.9-rc1
# good: [c8d2bc9bc39ebea8437fd974fdbc21847bb897a3] Linux 4.8
git bisect start '1001354ca34179f3db924eb66672442a173147dc' 'c8d2bc9bc39ebea8437fd974fdbc21847bb897a3'
# good: [41844e36206be90cd4d962ea49b0abc3612a99d0] Merge tag 'staging-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect good 41844e36206be90cd4d962ea49b0abc3612a99d0
# skip: [6b5e09a748ad0a0b198d0e268c7e689044bfe48a] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect skip 6b5e09a748ad0a0b198d0e268c7e689044bfe48a
# good: [3fd386625679bd2adb94d2a3d25dd2fdd38b52e3] dmaengine: coh901318: use correct print specifiers
git bisect good 3fd386625679bd2adb94d2a3d25dd2fdd38b52e3
# good: [c3b809834db8b1a8891c7ff873a216eac119628d] [media] pulse8-cec: fix compiler warning
git bisect good c3b809834db8b1a8891c7ff873a216eac119628d
# skip: [30066ce675d3af350bc5a53858991c0b518dda00] Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect skip 30066ce675d3af350bc5a53858991c0b518dda00
# skip: [30066ce675d3af350bc5a53858991c0b518dda00] Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect skip 30066ce675d3af350bc5a53858991c0b518dda00
# good: [33a051a5fc72c78a6770cb4f49b8932ae3587de9] drm/i915/cmdparser: Remove stray intel_engine_cs *ring
git bisect good 33a051a5fc72c78a6770cb4f49b8932ae3587de9
# skip: [6763afe4b9f39142bda2a92d69e62fe85f67251c] Merge tag 'dlm-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
git bisect skip 6763afe4b9f39142bda2a92d69e62fe85f67251c
pavel@amd:/data/l/linux$
Comment 14 Pavel Machek 2016-11-13 21:28:58 UTC
Continuing bisect on x60.
Comment 15 Pavel Machek 2016-11-14 11:25:16 UTC
Hmmm. This xtensa commit can't possibly be responsible. I guess starting on one machine and continuing the bisect on a second one was not exactly a good idea.

pavel@amd:/data/l/linux$ git bisect log
# bad: [b26b5ef5ec7eab0e1d84c5b281e87b2f2a5e0586] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect start 'b26b5ef5ec7eab0e1d84c5b281e87b2f2a5e0586'
# good: [33a051a5fc72c78a6770cb4f49b8932ae3587de9] drm/i915/cmdparser: Remove stray intel_engine_cs *ring
git bisect good 33a051a5fc72c78a6770cb4f49b8932ae3587de9
# good: [c3b809834db8b1a8891c7ff873a216eac119628d] [media] pulse8-cec: fix compiler warning
git bisect good c3b809834db8b1a8891c7ff873a216eac119628d
# good: [3fd386625679bd2adb94d2a3d25dd2fdd38b52e3] dmaengine: coh901318: use correct print specifiers
git bisect good 3fd386625679bd2adb94d2a3d25dd2fdd38b52e3
# good: [41844e36206be90cd4d962ea49b0abc3612a99d0] Merge tag 'staging-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect good 41844e36206be90cd4d962ea49b0abc3612a99d0
# good: [c8d2bc9bc39ebea8437fd974fdbc21847bb897a3] Linux 4.8
git bisect good c8d2bc9bc39ebea8437fd974fdbc21847bb897a3
# bad: [fed41f7d039bad02f94cad9059e4b14cd81d13f2] Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
# bad: [677664895278267a80bda0e3b26821d60cdbebf5] nmi_backtrace: do a local dump_stack() instead of a self-NMI
git bisect bad 677664895278267a80bda0e3b26821d60cdbebf5
# bad: [021723e6c5a5e7b50eb68f9812418406de9860b2] Merge tag 'for-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
git bisect bad 021723e6c5a5e7b50eb68f9812418406de9860b2
# bad: [3940ee36a0565ea7fb848e3c798afe22efd0b90a] Merge tag 'for-linus-4.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
git bisect bad 3940ee36a0565ea7fb848e3c798afe22efd0b90a
# bad: [14986a34e1289424811443a524cdd9e1688c7913] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
git bisect bad 14986a34e1289424811443a524cdd9e1688c7913
# bad: [f84d9fa86820b3074a8c143444a6932c0c0fd019] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
git bisect bad f84d9fa86820b3074a8c143444a6932c0c0fd019
# bad: [c7f5d36a3cc26e5068f4444aa22c4579e5eac85f] Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
git bisect bad c7f5d36a3cc26e5068f4444aa22c4579e5eac85f
# good: [81da313f2cf3c3798d5795f09159bab05d8af32d] xtensa: add default memmap option to iss_defconfig
git bisect good 81da313f2cf3c3798d5795f09159bab05d8af32d
# good: [2a744007c332f9d604b95aaecb106596c52ab001] m68k: don't panic if no hardware FPU defined
git bisect good 2a744007c332f9d604b95aaecb106596c52ab001
# good: [742859adc721da65ff4e8b59412d73bd3d2a57fe] m68k: let clk_disable() return immediately if clk is NULL
git bisect good 742859adc721da65ff4e8b59412d73bd3d2a57fe
# good: [742859adc721da65ff4e8b59412d73bd3d2a57fe] m68k: let clk_disable() return immediately if clk is NULL
git bisect good 742859adc721da65ff4e8b59412d73bd3d2a57fe
# good: [a4c6be5ad1d0c7af0c5421b68a00b6406b28a325] xtensa: disable MMU initialization option on MMUv2 cores
git bisect good a4c6be5ad1d0c7af0c5421b68a00b6406b28a325
# bad: [a6930aaee06755d1bdcfd943fbf614e4d92bb0c7] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
git bisect bad a6930aaee06755d1bdcfd943fbf614e4d92bb0c7
# bad: [d8ea757b25ec82687c497fc90aa83f9bcea24b5b] Merge tag 'xtensa-20161005' of git://github.com/jcmvbkbc/linux-xtensa
Comment 16 Pavel Machek 2016-11-14 18:40:22 UTC
And we have plausible-looking bisection result:

pavel@amd:/data/l/linux$ git bisect log
# bad: [1001354ca34179f3db924eb66672442a173147dc] Linux 4.9-rc1
# good: [c8d2bc9bc39ebea8437fd974fdbc21847bb897a3] Linux 4.8
git bisect start '1001354ca34179f3db924eb66672442a173147dc' 'c8d2bc9bc39ebea8437fd974fdbc21847bb897a3'
# bad: [41844e36206be90cd4d962ea49b0abc3612a99d0] Merge tag 'staging-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect bad 41844e36206be90cd4d962ea49b0abc3612a99d0
# good: [b50afd203a5ef1998c18d6519ad2b2c546d6af22] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect good b50afd203a5ef1998c18d6519ad2b2c546d6af22
# good: [c86bdd474a0c7b644fff91e0db069040c6a39926] staging: rts5208: Remove unnecessary parentheses
git bisect good c86bdd474a0c7b644fff91e0db069040c6a39926
# bad: [77b0a4aa0732f1856aef85b8db085864e5971a14] Merge tag 'hwmon-for-linus-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
git bisect bad 77b0a4aa0732f1856aef85b8db085864e5971a14
# bad: [5e1b834b27fb2c27cde33a0752425f11d10c0b2d] Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 5e1b834b27fb2c27cde33a0752425f11d10c0b2d
# bad: [00bcf5cdd6c0e2e92ce3dd852ca68a3b779fa4ec] Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 00bcf5cdd6c0e2e92ce3dd852ca68a3b779fa4ec
# good: [72ec94560d7ee1d3a61d5904fd9a5bf68bf3b11a] Merge tag 'pm-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect good 72ec94560d7ee1d3a61d5904fd9a5bf68bf3b11a
# bad: [0137a337d7760c265a7d126964297e41ba9a1cb3] Merge branches 'acpi-wdat' and 'acpi-ec'
git bisect bad 0137a337d7760c265a7d126964297e41ba9a1cb3
# bad: [9274139f4e8bb0835a5a6224957e15b8e63693e4] Merge branch 'acpica'
git bisect bad 9274139f4e8bb0835a5a6224957e15b8e63693e4
# bad: [d2d48eae4639f7d74e9d91e7698f566bdbd8cdd6] ACPICA: EFI: Port acpidump to EDK2 environment
git bisect bad d2d48eae4639f7d74e9d91e7698f566bdbd8cdd6
# bad: [6ea8c546f3655a81f82672f24b66dad6095bdd07] ACPICA: FADT support cleanup
git bisect bad 6ea8c546f3655a81f82672f24b66dad6095bdd07
# good: [7c312ad1f28030c3a95b0de087bf52c45c16a0db] ACPICA: Disassembler: Add option to emit embedded External operators/opcodes
git bisect good 7c312ad1f28030c3a95b0de087bf52c45c16a0db
# good: [d8303ace36aaa001e1704acb2bd13dd4f08a0d67] ACPICA: iASL/Disassembler: Add a check for missing filename
git bisect good d8303ace36aaa001e1704acb2bd13dd4f08a0d67
# good: [2af52c2bd20c50e80b121e15cd50a579e364485a] ACPICA: Events: Introduce acpi_mask_gpe() to implement GPE masking mechanism
git bisect good 2af52c2bd20c50e80b121e15cd50a579e364485a
# first bad commit: [6ea8c546f3655a81f82672f24b66dad6095bdd07] ACPICA: FADT support cleanup
Comment 17 Srinivas Pandruvada 2016-11-14 18:53:03 UTC
Thanks for the bisect. Maybe Lv Zheng or Bob can comment.
Comment 18 Pavel Machek 2016-11-14 19:52:01 UTC
Well, we are quite late in the -rc series, and the patch is a cleanup -- it should not have changed anything. So I believe a revert is the right action at the moment.
Comment 19 Lv Zheng 2016-11-15 01:13:36 UTC
The regression seems to be caused by the deletion of ACPI_FADT_V2_SIZE:

     FADT V1 size: 0x074      ACPI 1.0
-    FADT V2 size: 0x084
     FADT V3 size: 0x0F4      ACPI 2.0
     FADT V4 size: 0x100      ACPI 3.0 and ACPI 4.0
     FADT V5 size: 0x10C      ACPI 5.0
     FADT V6 size: 0x114      ACPI 6.0

Should the following line:
	if (acpi_gbl_FADT.header.length <= ACPI_FADT_V3_SIZE) {
be changed to:
	if (acpi_gbl_FADT.header.length < ACPI_FADT_V3_SIZE) {

Please give this change a try.
Comment 20 Pavel Machek 2016-11-15 08:41:47 UTC
Seems to do the trick... but it still changes behaviour for header.length > 0x0f4 and < 0x100. Are you sure that's a good idea?
Comment 21 DE 2016-11-15 21:15:18 UTC
Hello guys, it appears that it is not a good idea to use laptops...
I also get ridiculous amount of time to reach the boot screen of Ubuntu on my AMD CQ57 laptop.

I have had huge performance issues on that laptop as well as crashes. In order to fix those, I had to check all CPU errata and decode some MSR values. My BIOS has not patched a specific erratum and I made a simple patch. This resulted in a rather large communication which is still in progress[http://marc.info/?t=147688796200001&r=1&w=2]. 

While trying to see what is wrong, I found that my CPU was in C2 mode most of the time whether it run on AC or battery! To get better performance I have to use processor.max_cstate=1 as well as apply my erratum patch. Unfortunately when I limit my laptop to C1, C1 option is vanished and no C state is left!

What is worse, crashes were observed on intel machines too [https://bugzilla.kernel.org/show_bug.cgi?id=109051]


Pavel can you run some commands and inform me here?

replace * with the CPU number

first get
cat /sys/bus/cpu/devices/cpu*/cpufreq/scaling_available_governors
then 
cat /sys/bus/cpu/devices/cpu*/cpufreq/scaling_governor

now get
ls /sys/devices/system/cpu/cpu*/cpuidle

if the above is not empty then for each state of each CPU run(replace % with state number from the above command)
cat /sys/devices/system/cpu/cpu*/cpuidle/state%/name
cat /sys/devices/system/cpu/cpu*/cpuidle/state%/usage

all those numbers will tell us how much time your CPUs spend on each C state.
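
The commands above can be collected per CPU as in the sketch below. This is an illustration only: the function name is hypothetical, and reading the `time` attribute is an addition that the commands above do not include. In the cpuidle sysfs interface, `usage` counts how many times a state was entered, while `time` holds the residency in microseconds, so both are read here.

```python
from pathlib import Path


def read_cpuidle(cpu_dir):
    """Map state directory name -> (name, usage, time_us) for one CPU directory."""
    states = {}
    for state in sorted(Path(cpu_dir, "cpuidle").glob("state[0-9]*")):
        name = (state / "name").read_text().strip()
        usage = int((state / "usage").read_text())   # number of entries into the state
        time_us = int((state / "time").read_text())  # residency in microseconds
        states[state.name] = (name, usage, time_us)
    return states


if __name__ == "__main__":
    for cpu in sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*")):
        for state, (name, usage, time_us) in read_cpuidle(cpu).items():
            print(f"{cpu.name} {state} {name}: entered {usage} times, {time_us} us")
```

A CPU directory without a cpuidle subdirectory simply yields an empty mapping.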
Comment 22 Lv Zheng 2016-11-16 02:37:57 UTC
(In reply to Pavel Machek from comment #20)
> Seems to do the trick... but it still changes behaviour for header.length >
> 0x0f4 and < 0x100. Are you sure that's a good idea?

Why would ">= V3_SIZE && < V4_SIZE" matter for a regression fix?

The bisected commit did only the following functional change:
-	if (acpi_gbl_FADT.header.length <= ACPI_FADT_V2_SIZE) {
+	if (acpi_gbl_FADT.header.length <= ACPI_FADT_V3_SIZE) {
As V2_SIZE actually doesn't exist in the world (according to Bob, it should only be used by IA64 test platforms), IMO, changing the following could be a sufficient regression fix.
-	if (acpi_gbl_FADT.header.length <= ACPI_FADT_V3_SIZE) {
+	if (acpi_gbl_FADT.header.length < ACPI_FADT_V3_SIZE) {
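
To make the three conditions easier to compare, here is a small sketch that models them outside the kernel. It is an illustration only, not ACPICA code: the function names are hypothetical, the size constants are the ones listed in comment 19, and the sketch says nothing about what the guarded branch does. It only enumerates the header lengths for which each newer condition disagrees with the original one.

```python
# FADT size constants in bytes, as listed in comment 19.
ACPI_FADT_V1_SIZE = 0x074
ACPI_FADT_V2_SIZE = 0x084
ACPI_FADT_V3_SIZE = 0x0F4


def original_check(length):
    """Condition before the cleanup: length <= ACPI_FADT_V2_SIZE."""
    return length <= ACPI_FADT_V2_SIZE


def regressed_check(length):
    """Condition after the bisected cleanup: length <= ACPI_FADT_V3_SIZE."""
    return length <= ACPI_FADT_V3_SIZE


def proposed_check(length):
    """Condition proposed in this thread: length < ACPI_FADT_V3_SIZE."""
    return length < ACPI_FADT_V3_SIZE


def lengths_where_changed(check, limit=0x120):
    """Header lengths (bytes) below `limit` where `check` disagrees with the original."""
    return [n for n in range(limit) if check(n) != original_check(n)]


if __name__ == "__main__":
    for name, check in (("regressed", regressed_check), ("proposed", proposed_check)):
        changed = lengths_where_changed(check)
        print(f"{name}: differs for {hex(changed[0])}..{hex(changed[-1])}")
```

Under this model the regressed condition differs from the original for lengths 0x85 through 0xF4, and the proposed one for 0x85 through 0xF3. The only length on which the proposed and regressed conditions disagree is exactly 0xF4, which is consistent with the change helping on the X60 if its FADT is V3-sized (an inference from this thread, not something checked against the acpidump).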
Comment 23 Lv Zheng 2016-11-16 02:41:00 UTC
(In reply to DE from comment #21)
> Hello guys, it appears that it is not a good idea to use laptops...
> I also get ridiculous amount of time to reach the boot screen of Ubuntu on
> my AMD CQ57 laptop.
> 
> I have had huge performance issues on that laptop as well as crashes. In
> order to fix those, I had to check all CPU errata and decode some MSR
> values. My BIOS has not patched a specific erratum and I made a simple
> patch. This resulted in a rather large communication which is still in
> progress[http://marc.info/?t=147688796200001&r=1&w=2]. 
> 
> While trying to see what is wrong, I found that my CPU was in C2 mode most
> of the time whether it run on AC or battery! To get better performance I
> have to use processor.max_cstate=1 as well as apply my erratum patch.
> Unfortunately when I limit my laptop to C1, C1 option is vanished and no C
> state is left!
> 
> What is worse, crashes were observed on intel machines too
> [https://bugzilla.kernel.org/show_bug.cgi?id=109051]
> 
> 
> Pavel can you run some commands and inform me here?
> 
> replace * with the CPU number
> 
> first get
> cat /sys/bus/cpu/devices/cpu*/cpufreq/scaling_available_governors
> then 
> cat /sys/bus/cpu/devices/cpu*/cpufreq/scaling_governor
> 
> now get
> ls /sys/devices/system/cpu/cpu*/cpuidle
> 
> if the above is not empty then for each state of each CPU run(replace % with
> state number from the above command)
> cat /sys/devices/system/cpu/cpu*/cpuidle/state%/name
> cat /sys/devices/system/cpu/cpu*/cpuidle/state%/usage
> 
> all those numbers will tell us how much time your CPUs spend on each C state.

This sounds like an entirely different issue to me...
Comment 24 Lv Zheng 2016-11-16 03:16:32 UTC
Let me attach the link of the bisected commit here for convenience:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e6cbd7f2
Comment 25 Lv Zheng 2016-11-16 03:18:05 UTC
Let me attach the link of the bisected commit here for convenience:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6ea8c546
Comment 26 Pavel Machek 2016-11-16 07:53:43 UTC
Created attachment 244731
signature.asc

> --- Comment #21 from DE <risc4all@yahoo.com> ---
> Hello guys, it appears that it is not a good idea to use laptops...

risc4all: can you start a separate bug, please? That's not related...
Comment 27 Lv Zheng 2016-11-17 07:53:57 UTC
Hi, 

Rafael has reverted the commit from the Linux upstream:
https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=e2174b0c24caca170ca61eda2ae49c9561ff8896

ACPICA upstream also decided to revert the commit.
So let's close the bug.

Thanks
Comment 28 Pavel Machek 2016-11-19 09:47:34 UTC
Hmm. Closed? Really?

Either support for FADT_v2 is needed, then I guess we should add a comment saying so, so this does not happen again.

Or support for FADT v2 is not needed (I believe that's the case here), then a fixed version of the cleaned-up patch should be submitted for v4.10. [Because otherwise I'm afraid someone will see a "cleanup opportunity" in future, and I'll be doing a bisect again.]
Comment 29 Lv Zheng 2016-11-21 07:06:59 UTC
(In reply to Pavel Machek from comment #28)
> Hmm. Closed? Really?
> 
> Either support for FADT_v2 is needed, then I guess we should add a comment
> saying so, so this does not happen again.
> 
> Or support for FADT v2 is not needed (I believe that's the case here), then
> fixed version of cleaned-up patch should be submitted for v4.10. [Because
> otherwise I'm afraid someone will see "cleanup oportunity" in future, and
> I'll be doing bisect again.]

From this case, I cannot conclude that FADT v2 is needed.
Maybe a better cleanup should be generated from the ACPICA upstream.
But that's up to the ACPICA upstream to determine I think.

So let me close from Linux point of view as the reversion is upstreamed:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e2174b0c24caca170ca61eda2ae49c9561ff8896

Thanks
Lv
Comment 30 Vito Caputo 2016-11-24 09:33:33 UTC
This also broke (and fixed) the ThinkPad x61s, thanks.
