Bug 109081

Summary: boot hang unless intel_idle.max_cstate=7 - Intel i7 6700HQ (Skylake-H)
Product: Power Management
Reporter: Ludovic Magerand (ludovic)
Component: intel_idle
Assignee: Len Brown (lenb)
Status: CLOSED CODE_FIX
Severity: blocking
CC: aaron.lu, al3jandro.ramirez, anthony.atomictech, dsborets, euwa, ewal, forprimetime, gaggery.tsai, hmh, kae, krzysztof+kernel, lev.lybin, lkernel, ludovic, lv.zheng, mail, marcin.bajor, martin.passard59, nightmare.quake, rashedabdeltawab, rui.zhang, saunders.52, sd, sjoerd, slepy12, smac0628, szg00000, ucelsanicin, wendy.wang, yu.c.chen
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 4.4-rc5
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Kernel configuration
Log file with acpi=off
Log file with nolapic
Output of lspci -vvnn
Output of dmidecode
ACPI tables
Cpuinfo with acpi=off
Cpuinfo with nolapic
Kernel Configuration for 4.4-rc5
dmesg with kernel 4.4-rc5 and acpi=off
dmesg with kernel 4.4-rc5 and nolapic
dmesg log
dmesg w/ acpi=off on GL552V
dmidecode w/ acpi=off on GL552V
lscpu w/ acpi=off on GL552V
cstate=7 & 4.4-rc8
dmesg_acpi.txt
debug patch to disable c8 + C9 on selected SKL-H systems

Description Ludovic Magerand 2015-12-08 14:42:54 UTC
Created attachment 196761 [details]
Kernel configuration

Got a new MSI laptop after my previous one died. It's a GE72 6QF and it comes with an Intel i7 6700HQ processor, which seems to cause trouble for the Linux kernel.

If I try to boot without any options, the kernel freezes nearly instantly: I just have time to see a few lines on the screen before everything goes dark. There is no way to use sysrq to reboot, so I have to do a hard power-off using the power button.

I managed to boot using either the acpi=off or the nolapic option; the corresponding logs are attached.

With acpi=off, all the cores of my processor seem to be correctly detected and usable, but I'm afraid that with power management off, cooling might not work correctly and it is therefore not safe to run like this.

With nolapic, obviously only one core of the processor is detected and usable, which makes the machine really slow, but everything else seems to work correctly.

I also tried various other options:
* acpi=ht
* acpi=strict 
* acpi=noirq
* pci=noacpi
* pnpacpi=off
* noapic
* lapic=notscdeadline
* acpi_osi=Linux
* acpi_os_name="Windows 2015" (the laptop comes with windows 10)
* acpi.power_nocheck=1
Most of them don't change anything. With some of them I got a few more lines before the kernel froze, but it never reached the point where the root filesystem was mounted, so I have no logs.

There are some options in the BIOS (which is up to date, I flashed it to the latest version) related to hyperthreading and power states. I tried playing with them but it doesn't seem to change anything, so I left them at their default values.

I had the same kind of problems with my two previous laptops (also from MSI) but was able to fix them quickly by editing the DSDT.
But that was 4 and 6 years ago, and now fixing this seems to be out of my reach (I can't even get the damn DSDT decompiled... some of the ACPI tables make iasl segfault).

Here I attach the kernel config, the log files, the output of lspci and dmidecode, and the ACPI tables.

Feel free to ask me to try other configurations or boot options, or to provide more logs or anything else; I will try to provide them as fast as possible (I need this machine to do my work, and I'm pretty inefficient with Windows).
Comment 1 Ludovic Magerand 2015-12-08 14:43:20 UTC
Created attachment 196771 [details]
Log file with acpi=off
Comment 2 Ludovic Magerand 2015-12-08 14:43:42 UTC
Created attachment 196781 [details]
Log file with nolapic
Comment 3 Ludovic Magerand 2015-12-08 14:44:05 UTC
Created attachment 196791 [details]
Output of lspci -vvnn
Comment 4 Ludovic Magerand 2015-12-08 14:44:25 UTC
Created attachment 196801 [details]
Output of dmidecode
Comment 5 Ludovic Magerand 2015-12-08 14:44:51 UTC
Created attachment 196811 [details]
ACPI tables
Comment 6 Ludovic Magerand 2015-12-08 14:45:18 UTC
Created attachment 196821 [details]
Cpuinfo with acpi=off
Comment 7 Ludovic Magerand 2015-12-08 14:45:36 UTC
Created attachment 196831 [details]
Cpuinfo with nolapic
Comment 8 Ludovic Magerand 2015-12-14 15:14:40 UTC
Created attachment 197351 [details]
Kernel Configuration for 4.4-rc5
Comment 9 Ludovic Magerand 2015-12-14 15:15:52 UTC
Created attachment 197361 [details]
dmesg with kernel 4.4-rc5 and acpi=off
Comment 10 Ludovic Magerand 2015-12-14 15:16:37 UTC
Created attachment 197371 [details]
dmesg with kernel 4.4-rc5 and nolapic
Comment 11 Ludovic Magerand 2015-12-14 16:22:53 UTC
I investigated this in more depth this weekend.

First, there was an EC firmware update on the MSI website that I missed when I flashed the BIOS, so I applied it, but it hasn't changed a single thing, not even the ACPI tables.

I also switched to the latest 4.4-rc5 kernel version and did a full review of the configuration. The new configuration is given in the previous attachments, along with the new dmesg for the boot options acpi=off and nolapic.

I tried to use the microcode update module of the kernel, but it seems that the latest Intel microcode data package does not contain the microcode for this processor. When running iucode_tool to generate an initrd file, I get:
iucode_tool -S --write-earlyfw=/boot/ucode.cpio /lib/firmware/intel-ucode/*
iucode_tool: system has processor(s) with signature 0x000506e3
iucode_tool: No valid microcodes were selected, nothing to do...
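
For reference, the signature printed by iucode_tool can be split into family, model and stepping. The sketch below assumes the standard CPUID leaf 1 EAX layout; the function name is only illustrative. The result is formatted as family-model-stepping in hex, which, as far as I know, is how the files under /lib/firmware/intel-ucode/ are named, so it shows which file name to look for.

```python
def decode_cpu_signature(sig):
    """Split a CPUID(1).EAX signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family only counts when the base family is 0xF.
    family = base_family
    if base_family == 0xF:
        family += ext_family

    # The extended model only counts for base family 0x6 or 0xF.
    model = base_model
    if base_family in (0x6, 0xF):
        model |= ext_model << 4

    return family, model, stepping


family, model, stepping = decode_cpu_signature(0x000506E3)
print("%02x-%02x-%02x" % (family, model, stepping))
```

For 0x000506e3 this prints 06-5e-03, i.e. family 6, model 0x5e, stepping 3.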

I have finally been able to decompile the ACPI tables, which allowed me to make a guess at what the problem might be.
There are two errors in the kernel log when using nolapic that seem big enough to me to cause real trouble.

The first error is:
[    0.000027] ACPI: Core revision 20150930
[    0.019261] ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20150930/dswload-210)
[    0.019268] ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20150930/psobject-227)
[    0.019294] ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp11) while loading table (20150930/tbxfload-193)
[    0.026871] ACPI Error: 1 table load failures, 9 successful (20150930/tbxfload-214)

The error itself appears to happen when parsing the SSDT5 table (xh_rvp11), which seems to be related to USB/XHCI.
It says that the device HS11 is not found, but when I decompiled the SSDT5 table, HS11 is resolved as an external DeviceObj, probably coming from the DSDT.
iasl has no problem compiling this file back without any errors.

I guess the error happens because in the DSDT, the HS11 Device is created inside an If at the root of the table (meaning not inside any Scope, Method, or anything else):
    If (LEqual (PCHV (), SPTH))
    {
        Scope (_SB.PCI0.XHC.RHUB)
        {
            Device (HS11)
            {
                Name (_ADR, 0x0B)  // _ADR: Address
                Device (CAM0)
By the way, it seems to be related to the laptop webcam (the webcam appears as Bus 001 Device 011 in lsusb), which can be deactivated by a switch and is deactivated at boot time.
The PCHV Method is also declared at the root of the table:
    Name (SPTH, One)
    Name (SPTL, 0x02)
    Method (PCHV, 0, NotSerialized)
    {
        If (LEqual (PCHS, One))
        {
            Return (SPTH) /* \SPTH */
        }

        If (LEqual (PCHS, 0x02))
        {
            Return (SPTL) /* \SPTL */
        }

        Return (Zero)
    }
PCHS appears in
    OperationRegion (PNVA, SystemMemory, PNVB, PNVL)
    Field (PNVA, AnyAcc, Lock, Preserve)
    {
        RCRV,   32, 
        PCHS,   16, 
        PCHG,   16, 
also at the root of the table.
If I understand correctly, this means that PCHS is some value that is read from memory and/or hardware?
So it might be possible that it is not yet initialized when the DSDT table is loading? Or, if it corresponds to the activation status of the webcam, the webcam might be deactivated and the value might be SPTL. Either way, the device would not be created when parsing the DSDT, which would result in the error later when parsing the SSDT5 table.
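
To make the suspected ordering problem concrete, here is a toy model in Python. It is only a sketch of the hypothesis above, not how the ACPI interpreter is actually implemented: the namespace is a plain set of path strings, the value of PCHS is an assumption, and the function and flag names are made up for the illustration.

```python
HS11 = "\\_SB.PCI0.XHC.RHUB.HS11"


def load_tables(execute_module_level_code):
    """Toy model: load the DSDT, then SSDT5, and return the lookup errors."""
    namespace = {"\\_SB.PCI0.XHC.RHUB"}  # declared unconditionally in the DSDT
    pchs = 1                              # assumption: PCHS == One, so PCHV() == SPTH
    errors = []

    # DSDT: HS11 is only created by the If block at the root of the table.
    def dsdt_root_if_block():
        if pchs == 1:
            namespace.add(HS11)

    deferred = []
    if execute_module_level_code:
        dsdt_root_if_block()
    else:
        # Hypothesis: the root-level If is not run while the table loads.
        deferred.append(dsdt_root_if_block)

    # SSDT5 (xh_rvp11): refers to HS11 as an External object.
    if HS11 not in namespace:
        errors.append("AE_NOT_FOUND: " + HS11)

    # Running the deferred block afterwards is too late for SSDT5.
    for block in deferred:
        block()
    return errors


print(load_tables(execute_module_level_code=True))
print(load_tables(execute_module_level_code=False))
```

In this model the lookup only fails when the root-level If block is skipped or postponed while the DSDT loads, which matches the AE_NOT_FOUND reported for SSDT5.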

For the second error, it might be the same kind of problem.
[    0.204210] ACPI : EC: EC description table is found, configuring boot EC
[    0.204224] ACPI : EC: EC started
[    0.213212] ACPI Error: [^^^PEG0.PEGP.EASP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
[    0.213218] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.EC._REG] (Node ffff8804730d2af0), AE_NOT_FOUND (20150930/psparse-542)
[    0.213231] ACPI : EC: Fail in evaluating the _REG object of EC device. Broken bios is suspected.
[    0.217187] ACPI Error: [^^^PEG0.PEGP.EASP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
[    0.217192] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.EC._REG] (Node ffff8804730d2af0), AE_NOT_FOUND (20150930/psparse-542)

Apparently the ECDT loads correctly, but activating it through the _REG method of the EC device defined in the DSDT fails because EASP is not found.
When I decompile the DSDT, using the SSDTs as external tables with the -e switch, ^^^PEG0.PEGP.EASP is resolved as an external _SB_.PCI0.PEG0.PEGP.EASP of type UnknownObj.
I think it comes from the SSDT6 table (SaSsdt), which seems to be related mostly to PCI and graphics devices.
Note that I can recompile neither the DSDT nor the SSDT6 table, because decompiling them leaves some unresolved external methods.

In the SSDT6 table, EASP is defined inside an If at the root of the table:
    If (CondRefOf (\_SB.PCI0.PEG0.PEGP))
    {
        Scope (\_SB.PCI0.PEG0.PEGP)
        {
            OperationRegion (PCIS, PCI_Config, Zero, 0x0100)
            Field (PCIS, AnyAcc, NoLock, Preserve)
            {
                PVID,   16, 
                PDID,   16, 
                Offset (0x88), 
                EASP,   2,
\_SB.PCI0.PEG0.PEGP is resolved as an external DeviceObj, which is defined in the DSDT, without any condition this time.

I have no idea whether my reasoning is correct, whether it might help solve the problem, or how.
But again, if you need more information I'm still available.
Comment 12 Aaron Lu 2015-12-16 02:41:33 UTC
Add Lv.

Lv,
Can you please take a look at this? There are some ACPI errors, but most of them shouldn't make the system unbootable, except perhaps the EC _REG failure where the ECDT is involved (but I'm not sure).
Comment 13 Denis 2015-12-17 00:00:32 UTC
I have the same problem on my new MSI GE62 6QF. When the boot process freezes I can see the following message on the screen:

EC: Fail in evaluating the _REG object of EC device. Broken bios is suspected.
Comment 14 Denis 2015-12-17 07:36:00 UTC
Created attachment 197541 [details]
dmesg log

I was able to run the Arch Linux live CD (with no freezes, and got a log) with the following BIOS settings:

SpeedStep: Enabled or Disabled
Boot mode: Legacy (only)

Linux kernel settings: defaults for the Arch Linux live CD

Hoping this will help to troubleshoot the error.
Comment 15 Ludovic Magerand 2015-12-17 21:43:32 UTC
Hi,

I tried playing with the BIOS options previously and it didn't change anything, but I couldn't remember whether I did it before or after I flashed the BIOS to the newest version.
The latest BIOS version exposes more options for the CPU. They are:
* SpeedStep (enabled by default)
* Virtualization (enabled by default)
* HyperThreading (enabled by default)
* C-states (enabled by default)
* VT-d (disabled by default)

So I reverted all the BIOS settings to default, removed FastBoot and put BootMode back to "UEFI with CSM" (I will need UEFI to boot Windows as long as I don't have the Linux kernel and nvidia drivers working), and configured UEFI to boot from the external hard drive if present.
Then I played with the 3 options that were most likely to have something to do with this bug (SpeedStep, HyperThreading and C-states).
The conclusion is that whatever SpeedStep and HyperThreading are set to, if C-states are enabled, the kernel freezes, and if they are disabled, I can boot either 4.4-rc5 or 4.3.3-gentoo without any acpi or lapic options.

If I boot with C-states disabled and both SpeedStep and HyperThreading enabled, the latter two seem to work perfectly: I have the 8 virtual cores detected and working, and the conservative cpufreq policy seems to work correctly. I monitored the CPU frequency during a kernel build and it was stepping from 800MHz to 2600MHz, with a lot of in-between values, separately on each of the virtual cores.
The kernel logs on 4.4-rc5 seem to be the same as previously, with the same ACPI errors still occurring.
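
For anyone who wants to repeat the frequency check, here is a minimal sketch of one way to sample the per-core frequency. It assumes the cpufreq sysfs interface is available and exposes scaling_cur_freq in kHz; the function name and its cpu_root parameter are only illustrative.

```python
import glob
import os


def read_cur_freqs(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: frequency_in_MHz} read from cpufreq's scaling_cur_freq."""
    freqs = {}
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in glob.glob(pattern):
        cpu = path.split(os.sep)[-3]  # e.g. "cpu0"
        with open(path) as f:
            freqs[cpu] = int(f.read().strip()) / 1000.0  # the file is in kHz
    return freqs
```

Calling read_cur_freqs() in a loop (for example once per second during a kernel build) should show each virtual core moving between its minimum and maximum frequency.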

In conclusion, I would say the freezes are most likely caused by the C-states. I can live without them for now, even if it would be better to have them working (this laptop's battery life is already quite short).
If you have some patches to test, I will try them, and if you need more information, I will provide it too.

Now I have to install the nvidia-drivers (hoping that the _REG error will not make this driver fail, as it is related to the graphics devices) to see if I can switch completely to Linux ^^
Comment 16 Aaron Lu 2015-12-18 01:52:09 UTC
One option may be worth a try: intel_idle.max_cstate=0. This disables the default intel_idle driver and falls back to the ACPI idle driver.
Comment 17 Ludovic Magerand 2015-12-18 17:14:13 UTC
Re-enabling the C-states in the BIOS and booting with the kernel parameter intel_idle.max_cstate=0 works perfectly as far as I can see.
So the bug causing the freezes definitely seems to be inside the Intel cpuidle module.
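
To double-check that the parameter was really applied, something like the sketch below can be used. It assumes the usual /proc/cmdline and /sys/devices/system/cpu/cpuidle/current_driver files are present; the helper names are made up for the example.

```python
def max_cstate_from_cmdline(cmdline):
    """Return the intel_idle.max_cstate value from a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("intel_idle.max_cstate="):
            value = int(token.split("=", 1)[1])  # keep the last one seen
    return value


def read_text(path):
    """Return the stripped content of a file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def report():
    """Print the requested max_cstate and the active cpuidle driver."""
    cmdline = read_text("/proc/cmdline") or ""
    driver = read_text("/sys/devices/system/cpu/cpuidle/current_driver")
    print("intel_idle.max_cstate on cmdline:", max_cstate_from_cmdline(cmdline))
    print("current cpuidle driver:", driver)
```

With intel_idle.max_cstate=0, report() would be expected to show a driver other than intel_idle (the ACPI idle driver if the fallback worked).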
Comment 18 Aaron Lu 2015-12-21 01:49:20 UTC
Thanks for the test, I'll move the bug to intel_idle.
Comment 19 Aaron Lu 2015-12-21 01:50:02 UTC
CC Yu, he might also want to take a look.
Comment 20 Andrew 2015-12-24 21:15:39 UTC
I have an MSI GS40 6QE (the same i7 6700HQ). Fortunately for me, Arch Linux ran for longer before the kernel froze, so I even thought the freeze was caused by running Xorg. I didn't even notice a kernel freeze without Xorg ( https://bbs.archlinux.org/viewtopic.php?pid=1585836 ). Here is my dmesg: https://gist.github.com/Deathangel908/8d654e7575314b3aabc3 and my dmidecode: https://gist.github.com/anonymous/5962155853d36ff40c7b
Comment 21 Henrique de Moraes Holschuh 2015-12-28 10:22:05 UTC
There's a chance the C-state issues in your Skylake systems are not caused by kernel bugs, but rather by faulty CPU microcode.

You need to run a very up-to-date kernel for Skylake -- often more up-to-date than what is available in stable/LTS distros -- as well as very up-to-date CPU microcode in the BIOS/UEFI -- often more recent than what is available from your system vendor!!

Up-to-date Skylake microcode will be revision 0x56 or higher at the moment.  You will notice Intel is *not* distributing any Skylake microcode updates on the public Linux microcode distribution yet, so it depends solely on your system BIOS/UEFI.

Linux seems to run fine most of the time with microcode 0x49 and newer, but this is in no way certain.  We know from reports that Windows 10 requires microcode newer than that to be able to run several software packages and to avoid crashing -- and that might apply to Linux just as well.

Still, this does *not* rule out the possibility of a Linux kernel bug, the same way it does not rule out a firmware bug, since the ACPI tables in that MSI laptop are not to be trusted.  It does mean MSI owes you a BIOS/UEFI update, based on the old microcode reported in /proc/cpuinfo, though.
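
A quick way to see which revision a system is running is to look at the "microcode" lines of /proc/cpuinfo. The sketch below is only an illustration: the function name is made up, and the 0x56 threshold is simply the value quoted above, not an authoritative minimum.

```python
SUGGESTED_MIN_REVISION = 0x56  # value quoted in this comment, may be outdated


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions (as ints) found in cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(int(value.strip(), 16))
    return revisions


def looks_outdated(cpuinfo_text):
    """True if any logical CPU reports a revision below the suggested minimum."""
    revisions = microcode_revisions(cpuinfo_text)
    return bool(revisions) and min(revisions) < SUGGESTED_MIN_REVISION
```

Typical use would be looks_outdated(open("/proc/cpuinfo").read()).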
Comment 22 Aaron Lu 2015-12-28 10:23:04 UTC
On vacation, expect no response from me, sorry for the inconvenience.
Comment 23 Johannes Larsen 2015-12-28 17:51:48 UTC
I am having the same problem on a late 2015 Dell XPS 13 (9350) that also has the Skylake chipset. I have the latest Dell BIOS update (1.1.7), resulting in microcode revision 0x5e (according to /proc/cpuinfo), and I am running Arch with more or less HEAD of torvalds (a881643, somewhere between 4.4-rc6 and 4.4-rc7).

The intel_idle.max_cstate=0 argument suppresses my problem, so I believe this is the same bug.

If updating to rc7, which is compiling as we speak, fixes the problem I will post an update; unless I say otherwise, presume the problem persists in rc7. I am happy to provide more information if anyone can think of something that can help resolve the problem.
Comment 24 Henrique de Moraes Holschuh 2015-12-28 19:38:52 UTC
(In reply to Johannes Larsen from comment #23)
> the skylake chipset. I have the latest Dell BIOS update (1.1.7) resulting in
> microcode revision 0x5e(according to /proc/cpuinfo), and I am running ARCH
> with more or less HEAD of torvalds (a881643, somewhere between 4.4-rc6 and
> 4.4-rc7).

Thanks for the report.  This likely means we do have an intel-idle issue, instead of a firmware or cpu microcode issue.
Comment 25 Chen Yu 2015-12-29 00:33:55 UTC
Hi, Wendy,
do we have an i7 6700HQ on hand? Thanks.
Yu
Comment 26 wendy.wang 2015-12-29 01:07:41 UTC
(In reply to Chen Yu from comment #25)
> Hi, Wendy,
> do we have a i7 6700hq in hand ? thanks.
> Yu

We have an SKL i7 6700 CPU, but it is installed on the reference platform board, not in a product from the market.
Comment 27 Cláudio Pereira 2015-12-29 04:25:02 UTC
The exact same behavior happens with an Asus GL552V, so the bug is not specific to the MSI GE72 6QF, it seems.
Same CPU, exact same problem with 4.4.
acpi=off is enough to boot.

There are some logs and information on the system here:
https://bbs.archlinux.org/viewtopic.php?id=206790
(Jump straight to the logs, since the beginning is just me complaining about the lack of HID drivers and describing how newer kernels don't work)

I'll try to attach the logs I get so you can tell whether it is the same bug or a different one. I'm sorry, but I am not competent enough to tell myself. Ask for any information you need :)
Comment 28 Cláudio Pereira 2015-12-29 04:26:51 UTC
Created attachment 198431 [details]
dmesg w/ acpi=off on GL552V
Comment 29 Cláudio Pereira 2015-12-29 04:38:43 UTC
Created attachment 198441 [details]
dmidecode w/ acpi=off on GL552V
Comment 30 Cláudio Pereira 2015-12-29 04:41:02 UTC
Created attachment 198451 [details]
lscpu w/ acpi=off on GL552V
Comment 31 Lv Zheng 2015-12-29 05:17:12 UTC
(In reply to Ludovic Magerand from comment #11)
> I investigated this more in depth this week-end.
> 
> First, there was a EC firmware update on MSI website that I missed when I
> flashed the bios, so I applied it, but it haven't change a single thing, not
> even the ACPI tables.
> 
> I also switched to the latest 4.4-rc5 kernel version and did a full review
> of the configuration, the new configuration is given in previous
> attachements, along with the new dmesg for boot options acpi=off and nolapic.
> 
> I tried to use the microcode update module of the kernel, but it seems that
> the latest intel microcode data package does not contain the microcode for
> this processor, when running iucode_tool to generate an initrd file, I get 
> iucode_tool -S --write-earlyfw=/boot/ucode.cpio /lib/firmware/intel-ucode/*
> iucode_tool: system has processor(s) with signature 0x000506e3
> iucode_tool: No valid microcodes were selected, nothing to do...
> 
> I have been able to finaly decompile the ACPI tables, which allowed me to
> make a guess on what the problem might be.
> There is two error in the kernel log when using nolapic that seems big
> enought to me to result in real troubles.
> 
> First error is
> [    0.000027] ACPI: Core revision 20150930
> [    0.019261] ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup
> failure, AE_NOT_FOUND (20150930/dswload-210)
> [    0.019268] ACPI Exception: AE_NOT_FOUND, During name lookup/catalog
> (20150930/psobject-227)
> [    0.019294] ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp11) while loading
> table (20150930/tbxfload-193)
> [    0.026871] ACPI Error: 1 table load failures, 9 successful
> (20150930/tbxfload-214)
> 
> The error itself appears to happen when parsing the SSDT5 table (xh_rvp11)
> which seems to be related to USB/XHCI.
> It says that the device HS11 is not found, but when I decompiled the SSDT5
> table, HS11 is resolved as an external DeviceObj, coming from the DSDT
> probably.
> iasl has no problem to compile this file back without any errors.
> 
> I guess the error happens because in the DSDT, the HS11 Device is created
> inside an If at the root of the table (meaning not in any Scope or Method,
> or nothing) :
>     If (LEqual (PCHV (), SPTH))
>     {

ACPICA upstream has a commit to handle this kind of module-level code.
I'm working to correct it and enable it for Linux.
Maybe you can wait a while and try again with 4.6 kernels.

>         Scope (_SB.PCI0.XHC.RHUB)
>         {
>             Device (HS11)
>             {
>                 Name (_ADR, 0x0B)  // _ADR: Address
>                 Device (CAM0)
> It seems to be related to the laptop webcam by the way (the webcam appears
> as Bus 001 Device 011 in lsusb), which can be deactived by a switch and is
> deactivated at boot time.
> The PCHV Method is declared at the root of the table also
>     Name (SPTH, One)
>     Name (SPTL, 0x02)
>     Method (PCHV, 0, NotSerialized)
>     {
>         If (LEqual (PCHS, One))
>         {
>             Return (SPTH) /* \SPTH */
>         }
> 
>         If (LEqual (PCHS, 0x02))
>         {
>             Return (SPTL) /* \SPTL */
>         }
> 
>         Return (Zero)
>     }
> PCHS appears in
>     OperationRegion (PNVA, SystemMemory, PNVB, PNVL)
>     Field (PNVA, AnyAcc, Lock, Preserve)
>     {
>         RCRV,   32, 
>         PCHS,   16, 
>         PCHG,   16, 
> also at the root of the table.
> If I get it correctly, it means that PCHS is some value that is read from
> the memory and/or hardware ?
> So it might be possible that it is not already initialized when the DSDT
> table is loading ? Or if it corresponds to the activation status of the
> webcam, it might be deactivated and being SPTL. It would make the device not
> created when parsing the DSDT, and result in the error later when parsing
> the SSDT5 table.

The current ACPICA AML interpreter won't execute the above "If" block before loading SSDT5.

> 
> For the second error, it might be the same kind of problem.
> [    0.204210] ACPI : EC: EC description table is found, configuring boot EC
> [    0.204224] ACPI : EC: EC started
> [    0.213212] ACPI Error: [^^^PEG0.PEGP.EASP] Namespace lookup failure,
> AE_NOT_FOUND (20150930/psargs-359)
> [    0.213218] ACPI Error: Method parse/execution failed
> [\_SB.PCI0.LPCB.EC._REG] (Node ffff8804730d2af0), AE_NOT_FOUND
> (20150930/psparse-542)
> [    0.213231] ACPI : EC: Fail in evaluating the _REG object of EC device.
> Broken bios is suspected.
> [    0.217187] ACPI Error: [^^^PEG0.PEGP.EASP] Namespace lookup failure,
> AE_NOT_FOUND (20150930/psargs-359)
> [    0.217192] ACPI Error: Method parse/execution failed
> [\_SB.PCI0.LPCB.EC._REG] (Node ffff8804730d2af0), AE_NOT_FOUND
> (20150930/psparse-542)
> 
> Apparently the ECDT is loading correctly and when activating it, using the
> _REG method of the EC device defined in the DSDT, it fails because EASP is
> not found.

For the ECDT, _REG is not required to be evaluated.
This is a bug in the current EC driver,
and I have a patch to correct it.
Again, you should wait for the 4.6 kernels and retry.

> When I decompile the DSDT, using the SSDTs as external tables with the -e
> switch, ^^^PEG0.PEGP.EASP is resolved as an external
> _SB_.PCI0.PEG0.PEGP.EASP of type UnknownObj.
> I think it comes from the SSDT6 table (SaSsdt), which seems to be related
> mostly to PCI and graphical devices.
> Note that I can't compile back neither the DSDT nor the SSDT6 table because
> when decompiling it there are some unresolved external methods.
> 
> In the SSDT6 table, EASP is defined inside an If at the root of the table
>     If (CondRefOf (\_SB.PCI0.PEG0.PEGP))
>     {
>         Scope (\_SB.PCI0.PEG0.PEGP)
>         {
>             OperationRegion (PCIS, PCI_Config, Zero, 0x0100)
>             Field (PCIS, AnyAcc, NoLock, Preserve)
>             {
>                 PVID,   16, 
>                 PDID,   16, 
>                 Offset (0x88), 
>                 EASP,   2,
> \_SB.PCI0.PEG0.PEGP is resolved as a external DeviceObj, which is defined in
> the DSDT, without any condition this time.

It's just because _REG is evaluated before executing this block.
I think it can be solved by the EC fix.

> 
> I have no idea if my reasoning is correct, and/or if it might help solve the
> problem, and how if so.
> But again, if you need more information I'm still available.

I'm not sure.
You can wait and try.
Hope the intel idle problem is just because of the SSDT5 loading failure.

Thanks
-Lv
Comment 32 Ludovic Magerand 2015-12-29 15:40:35 UTC
Well, as the same freezing problem seems to appear on other laptop models from other manufacturers, I now think that this bug is really related to the CPU, and not to the ACPI tables (or interpreter).
Especially since disabling the intel_idle driver fixes the freezing problem perfectly.
But it is good to know that the ACPI interpreter will be corrected, because I guess that the ACPI errors might cause other troubles (for example, Xorg segfaults when I try to use the nvidia driver and device).
Although waiting for 4.6 is a long way off; we have not even reached the first release of 4.4.

I don't think that the problem causing the freezes is in the CPU microcode, because if it were, I guess it would probably also cause trouble for the OS installed by default on the laptop, and that is not the case. The only thing that seems to have problems under that OS is the Intel integrated graphics, whose driver crashes frequently when using the nvidia device (for games or for CUDA computation).
But if Intel releases an update to this microcode, I will test it.
For now it seems I'm running version 0x39 according to cpuinfo, which comes from the latest BIOS provided by MSI, and the last time I checked, there was no update available from Intel.
Comment 33 Cláudio Pereira 2015-12-30 14:20:48 UTC
Ludovic, just one small detail. Would you mind picking a Mint 17 (or Ubuntu 14.04, I guess the result would be the same) live CD and trying to boot from it using only nouveau.modeset=0 and nothing else?
In my specific case it boots, so this may be a regression.

Arch with 4.2 (December ISO) also boots, this time with i915.preliminary_hw_support=1 (besides turning off modesetting for nouveau), and the same OS with 4.3 doesn't.

If this bug is a regression it might be easier to find.
Comment 34 Denis 2015-12-30 20:02:59 UTC
Hi,

Here are my test results, with intel_idle.max_cstate set as follows:

cstate | booting | wake up after sleep mode
0      | T       | T
1      | T       | F
2      | T       | F
3      | T       | T
4      | T       | T/F (sometimes doesn't wake up)
6      | T       | F

So I can use 0 or 3 for now.

Please let me know if you need more detailed information
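
When comparing results like these between machines, it may help to record which idle states the kernel actually registered for a CPU. The sketch below assumes the cpuidle sysfs layout (cpuidle/stateN directories with name, latency and disable files); the function name and its parameter are only illustrative.

```python
import glob
import os


def list_cstates(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return [(index, name, exit_latency_us, disabled)] for one CPU's idle states."""
    states = []
    for state_dir in glob.glob(os.path.join(cpu_dir, "cpuidle", "state[0-9]*")):

        def read(attr, base=state_dir):
            # Read one sysfs attribute of this idle state as stripped text.
            with open(os.path.join(base, attr)) as f:
                return f.read().strip()

        index = int(os.path.basename(state_dir)[len("state"):])
        states.append((index, read("name"), int(read("latency")), read("disable") == "1"))
    return sorted(states)
```

Running list_cstates() once per max_cstate value would show how many states are left for each row of the table.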
Comment 35 Johannes Larsen 2015-12-31 04:04:29 UTC
In my case the problem seems to be reproducible with intel_idle.max_cstate >= 3 and, unlike for Denis, whether I am booting normally, resuming from hibernation or resuming from suspend does not seem to make a difference.

A post [1] in a Dell forum thread about running Linux on the Dell 9350 laptop suggested adding i915 as an early-loaded module in the initramfs. I tried adding it, and with that change I am not able to reproduce the problem when intel_idle.max_cstate is unset (I have not tried setting it, but presumably that also works).

So I believe this problem might be caused by loading the i915 module when the CPU is idling, or maybe if it changes C-state during loading.

[1]: http://en.community.dell.com/techcenter/os-applications/f/4613/t/19659067?pi22229=8#20859687
Comment 36 Andrey 2016-01-03 15:56:57 UTC
I am facing a similar problem on my new MSI GE62 6QC (Skylake i7 6700HQ, Intel HM170).
The only difference is that when the kernel freezes, my screen stays ON.

When the system stops after being started with no extra options, the three lines on the screen are as follows:
[0.17xxxxx] ACPI; EC: Fail in evaluating the _REG object of EC device. Broken BIOS is suspected.
[4.66xxxxx] nouveau E[   PIBUS][0000:01:00.0] HUB0: 0x6013d4 0x00005700 (0x1f408200)
[4.66xxxxx] nouveau E[   PIBUS][0000:01:00.0] HUB0: 0x10ecc0 0xffffffff (0x1d40822c)


The intel_idle.max_cstate=0 directive takes the boot process much further, but even this does not bring my system up. At some point the normal boot messages are interrupted by a bunch of lines like this:

…
apparmor.service
[   83.635044] iwlwifi 0000:02:00.0: Unsupported splx structure
[  108.098261] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [plymouthd:231]
[  136.093973] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [plymouthd:231]
[  142.341017] INFO: rcu_sched self-detected stall on CPU { 5} (t=15000 jiffies g=1180 c=1179 q=0)
[  168.098261] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [plymouthd:231]
[  196.098261] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [plymouthd:231]
[  224.098261] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [plymouthd:231]
[  240.098052] INFO: task systemd:1 blocked for more than 120 seconds.
[  240.098656]       Tainted: G             L  4.2.0-16-generic #19-Ubuntu
[  240.099262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.099920] INFO: task kworker/0:1:78 blocked for more than 120 seconds.


Lines like these keep coming repeatedly; only the timestamp and CPU number change, and it seems this would go on forever.
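
If a log of such a boot can be captured (a photo transcription or a saved dmesg), the lockups can be tallied per CPU and task with something like the sketch below. The regular expression is an assumption based on the lines quoted above and is deliberately tolerant about spacing; the names are only illustrative.

```python
import re
from collections import Counter

# Matches e.g. "BUG: soft lockup - CPU#5 stuck for 22s! [plymouthd:231]"
SOFT_LOCKUP = re.compile(
    r"BUG:\s*soft lockup\s*-\s*CPU#(\d+) stuck for (\d+)s!\s*\[([^\]]+)\]"
)


def count_soft_lockups(log_text):
    """Return a Counter mapping (cpu, task) to the number of soft lockup reports."""
    counts = Counter()
    for match in SOFT_LOCKUP.finditer(log_text):
        cpu, _seconds, task = match.groups()
        counts[(int(cpu), task)] += 1
    return counts
```

A tally that stays on a single CPU and task, as in the lines above, points at one stuck thread rather than a general slowdown.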

Booting with acpi=off brings the system up, but since some devices (at least my touchpad) become inoperable with this option, I didn't run it for very long; I just grabbed dmesg, dmidecode, lspci and cpuinfo.

Should this help the investigation, I'll happily provide any required info or run kernels with options of interest, so please feel free to involve me.
Regards!
Comment 37 Chen Yu 2016-01-03 16:15:23 UTC
It seems that many people have this problem when trying to boot.
Can you guys help check whether the system boots with the following options appended (other command line options remaining unchanged)?

'init=/bin/bash nomodeset text'
Or do you have serial output?
Comment 38 Andrey 2016-01-03 16:29:48 UTC
None. Just 
[0.17xxxxx] ACPI; EC: Fail in evaluating the _REG object of EC device. Broken BIOS is suspected.
And a blinking cursor at the beginning of the next line.

If you meant a serial port: unfortunately no, it is not available on my laptop...
Comment 39 sdavid 2016-01-05 21:06:24 UTC
Hello,

I have an MSI GS40 6QE with a Skylake 6700HQ.
I was able to install and boot Debian jessie with the default 3.16 kernel.
I get "ACPI; EC: Fail in evaluating the _REG object of EC device. Broken BIOS is suspected." but boot and Xorg with nouveau started successfully.
I didn't notice other problems, but didn't stay long with this kernel as not all hardware is supported (LAN/Wi-Fi).

I tried jessie's backported 4.2 and 4.3 kernels and get the same ACPI error.
But those kernels are unusable because of
"NMI watchdog: BUG:soft lockup- CPU#5 stuck for 22s!"

acpi=off stops the lockup issue but also disables hardware (touchpad).

I'll keep going with 4.4 and can help if you need more hands to test kernel patches or gather information about this particular hardware.
Comment 40 Cláudio Pereira 2016-01-05 22:08:28 UTC
Dear engineers, can we have any more information on how you're progressing with this particular issue?

It severely cripples at least two (three if the Dell XPS has the same problem) of the best-selling laptops on the market today, all from different brands. The bug has the highest priority level (if it is P1) so I assume you're working on it. Did you already figured out anything?

I came here to post what I think to be more information, but Andrey and "small+kernel@pasglop.net" already said basically the same.

Here are some photographs (the first two) of the system with "udev.log-priority=debug nomodeset i915.modeset=0 debug ignore_loglevel earlyprintk=efi,keep log_buf_len=16M"
http://imgur.com/a/cYHC3

The photos after those two are for "nomodeset i915.modeset=0 ignore_loglevel earlyprintk=efi,keep" I think, yet I'm not certain.

Both cases loop like that forever, printing the same messages each time but in a different order.
Comment 41 Chen Yu 2016-01-06 03:35:36 UTC
(In reply to Cláudio Pereira from comment #40)
> Dear engineers, can we have any more information on how you're progressing
> with this particular issue?
> 
> It severely cripples at least two (three if the Dell XPS has the same
> problem) of the best-selling laptops on the market today, all from different
> brands. The bug has the highest priority level (if it is P1) so I assume
> you're working on it. Did you already figured out anything?
> 
> I came here to post what I think to be more information, but Andrey and
> "small+kernel@pasglop.net" already said basically the same.
> 
> Here are some photographs (the first two) of the system with
> "udev.log-priority=debug nomodeset i915.modeset=0 debug ignore_loglevel
> earlyprintk=efi,keep log_buf_len=16M"
> http://imgur.com/a/cYHC3
> 
> The photos after those two are for "nomodeset i915.modeset=0 ignore_loglevel
> earlyprintk=efi,keep" I think, yet I'm not certain.
> 
> Both cases loop like that forever saying the same every time but in
> different orders.

According to your first picture, it seems that CPU0 is blocked while initializing the clock source. Is that the first warning message that appears on the monitor? I want to make sure whether it is the first cause of this problem.
How about adding 'notsc' to your command line?
Besides, if you have time, can you please test whether comment 37 works for you? (You might need to recompile the kernel with USB 2.0/USB 3.0 built in.)
Comment 42 Andrey 2016-01-06 08:47:42 UTC
A short update on "init=/bin/bash" command line option.

With both options "intel_idle.max_cstate=0" and "init=/bin/bash" used at the same time and the "quiet splash" keywords removed, I was able not just to boot Ubuntu 15.10 but also to install it.

Sometimes the installed system hits a kernel panic on boot. A trailing slash in "init=/bin/bash/" helps against this...
Comment 43 Chen Yu 2016-01-06 08:57:35 UTC
(In reply to Andrey from comment #42)
> A short update on "init=/bin/bash" command line option.
> 
> With both options "intel.idle.max_cstate=0" and "init=/bin/bash" used at the
> same time and "quiet splash" keywords removed, I was able not just to boot
> Ubuntu 15.10 but also to install it.  
> 
> Sometimes the installed system encounters kernel panic on boot. A trailing
> slash in "init=/bin/bash/" helps against this...
Hi, do you mean that with the command line "init=/bin/bash/ nomodeset text" the system cannot boot, while with "intel_idle.max_cstate=0 init=/bin/bash/ nomodeset text" everything goes well?
I think we should first confirm whether it is related to graphics or actually caused by the cstate.
Comment 44 Andrey 2016-01-06 11:00:15 UTC
Chen Yu, exactly (I checked once again).

The complete initial boot options string in my case is:
"file=/cdrom/preseed/ubuntu-mate.seed boot=casper initrd=/casper/initrdlz quiet splash ---" 

1. Inserting "init=/bin/bash nomodeset text" before "---" leads to the well-known error "ACPI: EC: Fail in evaluating _REG object...", then boot stops. I get the same result with only the option "intel_idle.max_cstate=0".

2. Putting "intel_idle.max_cstate=0 init=/bin/bash/ nomodeset text" at the same place in the boot options string lets the system boot fine in my case.

Thank you!
Comment 45 Ludovic Magerand 2016-01-06 13:21:28 UTC
Sorry, I was on vacation for the last two weeks with only Wi-Fi available, and I hadn't installed the Wi-Fi tools, so I didn't work on this bug.
Moreover, as I needed a fully working Linux environment and access to my NVIDIA GPU to do my work, I installed VirtualBox on the default OS so I can use that OS to access the NVIDIA GPU and still have my Linux system to work with.

Anyway, I think we have 3 different bugs in all this:

* An ACPI bug that seems to affect mostly MSI laptops and prevents the EC from being initialized correctly (though it seems to work more or less correctly anyway). This bug should be fixed one day with some updates to ACPICA.

* A bug in the cstates on the Skylake architecture that causes complete system freezes; as a workaround, they can be disabled using "intel_idle.max_cstate=0".

* Many bugs in the i915 driver, including one with modesetting on the Skylake architecture. This one doesn't happen on my system, I guess because I'm running the latest 4.4 rc and compiled the kernel with the option to include preliminary hardware support (which does include some fixes to the modesetting code). There are still other bugs in this driver, as I have some segfault traces in the logs related to it, but they don't make the system freeze; I can run it for hours.

Andrey, I don't know which kernel version you are running, but I suggest you try enabling preliminary support in the i915 driver using the kernel option i915.preliminary_hw_support=1
Comment 46 Andrey 2016-01-06 15:25:14 UTC
Hi, Ludovic. The most recent kernel I found among Linux distributions was kernel 4.2, featured in the Ubuntu (MATE) distribution v.15.10.
If there are downloadable Linux distributions with kernels > 4.2, please let me know and I will try them.

Enabling support of i915 driver with "i915.preliminary_hw_support=1" option doesn't change anything, it seems...
Comment 47 Ludovic Magerand 2016-01-06 16:20:15 UTC
You can try the Archlinux livecd, it comes with a 4.3.3 kernel, as far as I know, this is one of the most up to date.

By the way, during lunch I realized I had misunderstood what you were trying to do in the last comments. I thought some people were having trouble with cstates disabled and that disabling modeset removed the problem.
But it seems you were trying to boot with cstates enabled and modeset disabled.

Therefore I did the following test: boot kernel 4.4-rc5 (with the config attached previously) with intel_idle.max_cstate from 1 to 8 (which is the size of the array skl_cstates in drivers/idle/intel_idle.c) and i915.modeset=0.
The result is as follows:
* cstate 1 to 7 were able to boot; in dmesg I have "max_cstate 7 reached" and no more errors about the i915 driver segfault in the kernel log
* cstate 8 caused a complete kernel freeze as previously

Since to me there are two different bugs involved here, I also tried booting with only intel_idle.max_cstate, from 8 down to 1.
As previously, cstate 8 again caused a complete kernel freeze, but cstate 7 to 1 were able to boot correctly, with the message "max_cstate N reached" in the kernel logs and the i915 driver segfault being back.

So I think there really is a problem in the Intel cstate code, it is not related to the i915 driver, and the problem is probably just with cstate 8 (named C10-SKL).

During all these tests, I just ran the kernel for a few minutes (long enough to check that /proc/cpuinfo was correct and what was in dmesg).
Tonight when I leave my office, I will boot with max_cstate=7 and see tomorrow whether it is still up.

I will also create a bug report for the i915 driver, as I now know that the bug seems to be related to modeset.
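
The indexing used in the test above can be sketched as follows. This is only an illustration, not the driver's code: the state names are the ones reported elsewhere in this thread (the sysfs "name" files and the dmesg messages), their ordering in the table is an assumption, and the helper names are hypothetical.

```python
# State names as they appear in this thread; the order is an assumption
# chosen to match "cstate 8 (named C10-SKL)" from the test above.
SKL_CSTATES = [
    "C1-SKL", "C1E-SKL", "C3-SKL", "C6-SKL",
    "C7s-SKL", "C8-SKL", "C9-SKL", "C10-SKL",
]


def allowed_states(max_cstate, table=SKL_CSTATES):
    """Return the states still usable with a 1-based limit of max_cstate."""
    if max_cstate < 0:
        raise ValueError("max_cstate must be non-negative")
    return table[:max_cstate]


def excluded_states(max_cstate, table=SKL_CSTATES):
    """Return the states that the limit removes."""
    if max_cstate < 0:
        raise ValueError("max_cstate must be non-negative")
    return table[max_cstate:]
```

Under these assumptions, a limit of 7 removes only the last entry, which is consistent with the freeze appearing only at 8.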
Comment 48 Cláudio Pereira 2016-01-06 22:16:26 UTC
(In reply to Chen Yu from comment #41)
> (In reply to Cláudio Pereira from comment #40)
> > Dear engineers, can we have any more information on how you're progressing
> > with this particular issue?
> > 
> > It severely cripples at least two (three if the Dell XPS has the same
> > problem) of the best-selling laptops on the market today, all from
> different
> > brands. The bug has the highest priority level (if it is P1) so I assume
> > you're working on it. Did you already figured out anything?
> > 
> > I came here to post what I think to be more information, but Andrey and
> > "small+kernel@pasglop.net" already said basically the same.
> > 
> > Here are some photographs (the first two) of the system with
> > "udev.log-priority=debug nomodeset i915.modeset=0 debug ignore_loglevel
> > earlyprintk=efi,keep log_buf_len=16M"
> > http://imgur.com/a/cYHC3
> > 
> > The photos after those two are for "nomodeset i915.modeset=0
> ignore_loglevel
> > earlyprintk=efi,keep" I think, yet I'm not certain.
> > 
> > Both cases loop like that forever saying the same every time but in
> > different orders.
> 
> According  to your first picture, it seems that CPU0 blocked at initializing
> the clock source. Is it the first warnning message appearing on the monitor?
> I want to make sure if it is the first cause for this problem.
> How about adding 'notsc' in your command line?
> besides, if you have time, can you please help test if #Comment 37 works for
> you(you might need to recompile the kernel with USB2.0/USB3.0 built-in.

'notsc' apparently does nothing.
The first error that appears with it is "NMI watchdog; Watchdog detected hard LOCKUP on cpu 0" 0.69 seconds after booting.
Yet I'm not sure if it is the same error that appeared without 'notsc'.

There's also a warning that I'm not sure is related:
"Using host bridge windows from ACPI: if necessary, use "pci=nocrs" and report a bug"
and
"[Firmware bug]: ACPI: BIOS _OSI(Linux) query ignored"

Neither comment 37 nor setting the cstate works. Only acpi=off has done the trick so far.

I tried to compile a kernel with USB built in, but unfortunately after it compiled, it had some trouble installing. I'm not really experienced building kernels, and unfortunately don't have the time to learn right now.

If you want I can give you access to this machine. I can't figure out what is going wrong but am desperate to have it working since I need it for college.
Anything you need just ask.
Comment 49 Susan 2016-01-07 16:34:55 UTC
I had the same issue. 

MSI GE72-6QD. 
Skylake 6700HQ
Dual Graphics - Nvidia and Intel
Mint 17.3 KDE
Ubuntu 14.04/15.04/15.10

I could not boot any kernel 4.3 or higher without a crash before being able to log in. 

I found elsewhere about the intel_idle fix for passing the kernel/boot flag. This worked for me. 

With the MSI bios update .107 they introduced the ability to disable "cstates". I am now using that instead of passing boot flags. 

Mint 17.3 KDE - Kernel 4.4-rc8 from the drm-intel-nightly branch (self compiled)
Everything boots as long as cstates are disabled in the bios. 

I also have the "broken _EC" error. That was introduced for me when I upgraded the bios to .105. Prior to that I had no issue with that error message. Though I was not able to disable cstates in the bios. 

I upgraded to MSI's .110 bios this morning and have yet to check if I can enable cstates or not.
Comment 50 Susan 2016-01-07 16:45:48 UTC
Update - I cannot boot with cstates enabled. Still. 

Microcode for MSI has been updated to 55; with the bios update. Previously it was 39.
Comment 51 Ludovic Magerand 2016-01-07 18:46:25 UTC
As promised, I tested the kernel with only intel_idle.max_cstate=7 yesterday. I ran 4.4-rc5 for about half an hour doing various things (upgrading the kernel to 4.4-rc8, updating the system, ... and also nothing).
Then I booted the new 4.4-rc8 kernel and worked for an hour trying to get Xorg working; I had no freezes and no watchdog CPU stall messages in the kernel logs.
I left the system up all night, and it was still up and running this morning.
Is there a way to check how much time the processor stayed in each cstate, just to be sure that the kernel actually uses them?

As a workaround for the freezes with a 4.4 kernel, I think using the kernel option intel_idle.max_cstate=7 is the best for now.
It might also work on 4.3 kernels (which are supposed to support the Skylake processor family), but I don't have any to test currently.

Susan, you should try the workaround I just mentioned; it will enable nearly all cstates, which is probably fine.
Comment 52 Susan 2016-01-07 19:29:16 UTC
Ludovic M. 

I followed your suggestion of trying to limit cstates to 7. This has worked so far. The kernel flag I had to pass though was "intel_idle.max_cstate=7"

So far so good on 4.4-rc8 (drm-intel-nightly).
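
Since the exact spelling of the flag matters, here is a minimal sketch of a check for it in a kernel command line. The function name is hypothetical; the only facts assumed are that the parameter is spelled "intel_idle.max_cstate" (as in the bug title) and that the kernel treats dashes and underscores in parameter names as equivalent.

```python
def find_max_cstate(cmdline):
    """Scan a kernel command line for intel_idle.max_cstate.

    Returns (value, suspicious): value is the integer limit if the
    parameter is present and well-formed, else None; suspicious lists
    tokens that look like a misspelling of the parameter name.
    """
    value = None
    suspicious = []
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        # Dashes and underscores are interchangeable in parameter names.
        key = key.replace("-", "_")
        if key == "intel_idle.max_cstate":
            if sep and val.isdigit():
                value = int(val)
        elif key.replace(".", "_") == "intel_idle_max_cstate":
            # Same letters but wrong separators, e.g. "intel.idle.max_cstate".
            suspicious.append(token)
    return value, suspicious


# Example use on a running system (reads the booted command line):
# print(find_max_cstate(open("/proc/cmdline").read()))
```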
Comment 53 Aaron Lu 2016-01-08 02:05:16 UTC
(In reply to Ludovic Magerand from comment #51)
> Is there a way to check how much time the processor stayed in each cstate,
> just to be sure that the kernel actually use them ?

/sys/devices/system/cpu/cpuX/cpuidle/stateY/time should tell how long the processor stayed in that idle state.

See Documentation/cpuidle/sysfs.txt for more information.
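
A minimal sketch of collecting those counters for one CPU, assuming the sysfs layout named above with "name", "time" and "usage" files in each stateY directory. The function name and the "root" parameter (which only exists so the code can be pointed at a copy of the tree) are hypothetical.

```python
from pathlib import Path


def read_cpuidle(cpu=0, root="/sys/devices/system/cpu"):
    """Return a list of (state_dir, name, time_us, usage) for one CPU.

    "time" is the total time spent in the state, in microseconds, and
    "usage" is the number of times the state was entered.
    """
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    rows = []
    # Sort numerically so state10 would come after state9.
    states = sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    for state in states:
        name = (state / "name").read_text().strip()
        time_us = int((state / "time").read_text())
        usage = int((state / "usage").read_text())
        rows.append((state.name, name, time_us, usage))
    return rows


# Example use on a running system:
# for row in read_cpuidle(0):
#     print(*row)
```

A state whose "time" stays at zero after the machine has been idle for a while was never entered.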
Comment 54 Ludovic Magerand 2016-01-08 21:08:18 UTC
OK, I tested on the Gentoo 4.3.3 kernel. After 10 minutes I ran 'cat /sys/devices/system/cpu/cpu?/cpuidle/state?/time' and there was a value in every one (the last one for each CPU being a bit higher, but as I didn't do anything stressful on the system, it seems legitimate that the CPU spent more time in the last cstate).

So all the cstates from 1 to 7 are working fine on both the 4.4 and 4.3.3 kernels. The problem is really just with the last cstate.

I think I can't do more to help until someone has a patch to test, so I guess it's up to you :)
Comment 55 Denis 2016-01-08 21:12:06 UTC
Hi, 
I can confirm that using
"intel_idle.max_cstate=7" on 4.4-rc8 (drm-intel-nightly) the system is bootable and wakes up after sleep.
But I see some weird warnings in syslog
....
WARNING: CPU: 3 PID: 893 at /home/kernel/COD/linux/drivers/gpu/drm/i915/intel_display.c:13896 intel_prepare_plane_fb+0x269/0x2d0 [i915]()
....

and still 
[    0.251309] ACPI Error: [^^^PEG0.PEGP.EASP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
[    0.251313] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.EC._REG] (Node ffff8804730edaf0), AE_NOT_FOUND (20150930/psparse-542)
[    0.251325] ACPI : EC: Fail in evaluating the _REG object of EC device. Broken bios is suspected.
[    0.283299] ACPI Error: [^^^PEG0.PEGP.EASP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
[    0.283302] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.EC._REG] (Node ffff8804730edaf0), AE_NOT_FOUND (20150930/psparse-542)
[    0.285500] ACPI: Executed 24 blocks of module-level executable AML code
[    0.291346] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
[    0.294647] ACPI: Dynamic OEM Table Load:
[    0.294652] ACPI: SSDT 0xFFFF880470808C00 0003CF (v02 PmRef  Cpu0Cst  00003001 INTL 20120913)
[    0.295490] ACPI: Dynamic OEM Table Load:
[    0.295495] ACPI: SSDT 0xFFFF880470C52800 0005EA (v02 PmRef  Cpu0Ist  00003000 INTL 20120913)
[    0.297357] ACPI: Dynamic OEM Table Load:
[    0.297362] ACPI: SSDT 0xFFFF880470C53000 0005AA (v02 PmRef  ApIst    00003000 INTL 20120913)
[    0.298366] ACPI: Dynamic OEM Table Load:
[    0.298369] ACPI: SSDT 0xFFFF880470FFA000 000119 (v02 PmRef  ApCst    00003000 INTL 20120913)
[    0.302766] ACPI: Interpreter enabled
[    0.302774] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20150930/hwxface-580)
[    0.302781] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20150930/hwxface-580)
[    0.302799] ACPI: (supports S0 S3 S4 S5)
[    0.302800] ACPI: Using IOAPIC for interrupt routing

I'm using discrete graphics for now. I tried Intel graphics but had some problems with sleep mode.
Comment 56 Denis 2016-01-08 21:13:44 UTC
Created attachment 199051 [details]
cstate=7 & 4.4-rc8
Comment 57 Sjoerd Furth 2016-01-08 22:17:53 UTC
*** Bug 110371 has been marked as a duplicate of this bug. ***
Comment 58 Rashed Abdel-Tawab 2016-01-10 08:09:00 UTC
Same problem on the Ghost Pro 6QE (I have the 4K version). After updating the BIOS and EC firmware earlier today, I can finally boot 4.x kernels. With Ubuntu 15.10, the Intel graphics stack, and the 20160109 drm-intel-nightly 4.4.0-994 kernel, setting the cstates parameter gets me a boot. However, if I do not set nouveau.modeset=0, I get a CPU lockup. Using the nvidia-355 drivers (tried 352 and 358 as well) gives me complete functionality of the NVIDIA card. However, it cannot switch to the Intel GPU for some reason, which I suspect is another bug in the i915 driver (it used to switch fine on Ubuntu 14.04 and the 3.19 kernel).
Comment 59 Cláudio Pereira 2016-01-10 23:01:06 UTC
I updated the BIOS of my GL552VW; it's at version 216 now, and I think it was at 210 before.
After doing so I gave the newer kernels another go.
Boot parameters: "rw intel_idle.max_cstate=7 nouveau.blacklist=1 acpi_osi=! acpi_backlight=native"

Still doesn't work on 4.3, BUT it does on 4.4. It all works just fine.
Of course that there are no proprietary drivers for 4.4 just yet (not that I like them, but nouveau has no HW accel in the 960M) so the discrete GPU isn't used, but the Intel APU totally works, suspend and resume included.

I'll keep an eye on this bug to know about the fix, but the workaround works.
What am I losing by limiting cstates to 7?
Comment 60 Rashed Abdel-Tawab 2016-01-11 07:53:57 UTC
(In reply to Cláudio Pereira from comment #59)
> Of course that there are no proprietary drivers for 4.4 just yet (not that I
> like them, but nouveau has no HW accel in the 960M) so the discrete GPU
> isn't used, but the Intel APU totally works, suspend and resume included.

I'm using the 4.4.4-994 drm-intel-nightly kernel and nvidia-355 (352, 358, and 361 work too) and NVIDIA HW acceleration is fine. However, on the Intel GPU it now cannot suspend or log out; the kernel just hangs when attempting that. I fixed my previous problem of it not switching to the Intel GPU by updating my GuC firmware to the latest version that Intel provides on 01.org
Comment 61 Ludovic Magerand 2016-01-11 12:08:44 UTC
(In reply to Cláudio Pereira from comment #59)
> After doing so I gave the newer kernels another go.
> Boot parameters: "rw intel_idle.max_cstate=7 nouveau.blacklist=1 acpi_osi=!
> acpi_backlight=native"

I don't think it's a good idea to boot with acpi_osi=! as it basically tells the acpi firmware that you don't have any OS installed, and many acpi firmware check which version of the OS is running to enable/disable some parts.
Have you tried removing it ?
 
> Still doesn't work on 4.3, BUT it does on 4.4. It all works just fine.
> Of course that there are no proprietary drivers for 4.4 just yet (not that I
> like them, but nouveau has no HW accel in the 960M) so the discrete GPU
> isn't used, but the Intel APU totally works, suspend and resume included.
> 
> I'll keep an eye on this bug to know about the fix, but the workaround works.
> What am I losing by limiting cstates to 7?

You won't lose much, just the deepest cstate. It means that when idle, your processor cores won't be put into the deepest power-saving state (which seems to be so deep that the kernel can't get out of it :D).
It will probably affect battery life a little, but less than using intel_idle.max_cstate=0, which reverts to the ACPI states, which are far less efficient at saving power.
Comment 62 Rashed Abdel-Tawab 2016-01-15 22:49:46 UTC
Update: issue is still present in both stable 4.4 and the drm-intel kernel from http://cgit.freedesktop.org/drm-intel
Comment 63 Lev Lybin 2016-01-19 17:11:06 UTC
The problem is still relevant.
Kernel: 4.4.0-3-ARCH
Intel i7 6700HQ microcode: CPU0 sig=0x506e3, pf=0x20, revision=0x39
MSI PE70 6QE
BIOS E1795IMS.10C 12/10/2015
GRUB_CMDLINE_LINUX_DEFAULT="intel_idle.max_cstate=7 acpi_osi=Linux acpi_backlight=native"
Brightness control is not working. 
Without intel_idle.max_cstate OS freezes.
Comment 64 Lev Lybin 2016-01-19 17:11:45 UTC
Created attachment 200461 [details]
dmesg_acpi.txt
Comment 65 Aaron Lu 2016-01-20 03:10:26 UTC
(In reply to Lev Lybin from comment #63)
> The problem still relevant. 
> Kernel: 4.4.0-3-ARCH
> Intel i7 6700HQ microcode: CPU0 sig=0x506e3, pf=0x20, revision=0x39
> MSI PE70 6QE
> BIOS E1795IMS.10C 12/10/2015
> GRUB_CMDLINE_LINUX_DEFAULT="intel_idle.max_cstate=7 acpi_osi=Linux
> acpi_backlight=native"
> Brightness control is not working. 

Backlight is another problem, please file a new bug for it and provide dmesg/acpidump there, thanks.
Comment 66 Susan 2016-01-20 04:24:38 UTC
(In reply to Lev Lybin from comment #63)
> The problem still relevant. 
> Kernel: 4.4.0-3-ARCH
> Intel i7 6700HQ microcode: CPU0 sig=0x506e3, pf=0x20, revision=0x39
> MSI PE70 6QE
> BIOS E1795IMS.10C 12/10/2015
> GRUB_CMDLINE_LINUX_DEFAULT="intel_idle.max_cstate=7 acpi_osi=Linux
> acpi_backlight=native"
> Brightness control is not working. 
> Without intel_idle.max_cstate OS freezes.

Same processor. Same laptop manufacturer. 

MSI GE72.

Only use the cstate flag. For me the OSI flag doesn't do anything and the backlight flag breaks brightness. When using only the cstate flag everything appears to work; including brightness.
Comment 67 Rashed Abdel-Tawab 2016-01-21 03:00:34 UTC
Yeah, the OSI and backlight flags are unnecessary. Your GRUB_CMDLINE_LINUX_DEFAULT should just be "intel_idle.max_cstate=7 nouveau.modeset=0". Also, you should update your BIOS from MSI since they updated the microcode to v49 or v55 (depending on your model). We should all be getting another microcode update now that Intel's patching the Skylake errata.
Comment 68 Lev Lybin 2016-01-21 07:20:48 UTC
(In reply to Rashed Abdel-Tawab from comment #67)
> Yeah, the OSI and backlights flags are unnecessary. Your
> GRUB_CMDLINE_LINUX_DEFAULT should just be "intel_idle.max_cstate=7
> nouveau.modeset=0". Also you should update your BIOS from MSI since they
> updated the microcode to v49 or v55 (depending on your model). We should all
> be getting another microcode update now that Intel's patching the Skylake
> errata.

I've replaced nvidia with nouveau and added nouveau.modeset=0; the backlight works fine. Thanks. I have the latest version of the BIOS (v39 microcode), waiting...
Comment 69 Rashed Abdel-Tawab 2016-01-21 07:28:21 UTC
Great. Has anyone else encountered the kernel panic when trying to close the X session while on the Intel GPU? I'll try to get a log for it tomorrow since I know it's useless reporting it without proper logs.
Comment 70 Lev Lybin 2016-01-21 08:56:49 UTC
I've just updated the microcode to v55. The problem is still present. Do you need any information, e.g. dmesg, acpidump, etc.?
Comment 71 Lev Lybin 2016-01-25 15:07:08 UTC
I have a new problem when using the intel driver: Skype video blinks blue during an incoming call. If I replace intel with modesetting, the backlight doesn't work, but there is no problem with video. Kernel 4.4
https://bugs.launchpad.net/ubuntu/+source/skype/+bug/1078068/comments/24
https://www.reddit.com/r/archlinux/comments/41ht3e/annoying_skylake_issue_plus_skype_issue/
Comment 72 Sjoerd Furth 2016-03-01 13:04:07 UTC
Is there a status on this issue? I am still facing issues with the current kernel...

In case you want logs from my machine (ASUS ROG GL552VW, i7-6700 HQ, GTX 960M) just let me know :)
Comment 73 Anthony 2016-03-01 14:13:05 UTC
(In reply to Sjoerd Furth from comment #72)
> Is there a status on this issue? I am still facing issues with the current
> kernel...
> 
> In case you want logs from my machine (ASUS ROG GL552VW, i7-6700 HQ, GTX
> 960M) just let me know :)

I am on an ROG  GL752VW, i7-6700, this issue seems resolved for me as of 4.5rc1.
Comment 74 Lev Lybin 2016-03-01 14:16:49 UTC
The problem is still present for me on 4.5rc4.
Comment 75 Sjoerd Furth 2016-03-01 17:26:10 UTC
The problem is still here in 4.5.0-rc6
Comment 76 Lv Zheng 2016-03-02 08:23:40 UTC
If you mean the problem in comment 11, it is fixed:
http://www.spinics.net/lists/linux-acpi/msg63550.html
But the series contains things that need more time to review, so you have to wait a bit longer.

Thanks and best regards
-Lv
Comment 77 Lev Lybin 2016-03-02 08:44:23 UTC
Linux PE70 4.5.0-rc6-mainline #1 SMP PREEMPT Tue Mar 1 22:41:17 ICT 2016 x86_64 GNU/Linux

Intel i7 6700HQ microcode: CPU0 sig=0x506e3, pf=0x20, revision=0x55
MSI PE70 6QE

Without intel_idle.max_cstate OS freezes. And I still get these messages:

[    0.021271] ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20160108/dswload-210)
[    0.029990] ACPI Error: 1 table load failures, 9 successful (20160108/tbxfload-215)
[    0.243388] ACPI Error: [^^^PEG0.PEGP.EASP] Namespace lookup failure, AE_NOT_FOUND (20160108/psargs-360)
[    0.243393] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.EC._REG] (Node ffff8804730e34b0), AE_NOT_FOUND (20160108/psparse-542)
[    3.841840] acpi_call: Cannot get handle: Error: AE_NOT_FOUND
[    3.851290] acpi_call: Cannot get handle: Error: AE_NOT_FOUND
Comment 78 Lev Lybin 2016-03-02 08:45:46 UTC
@Lv Zheng , thank you.
Comment 79 Sjoerd Furth 2016-03-03 19:00:04 UTC
(In reply to Lv Zheng from comment #76)
> If you mean the problem in comment 11, it is fixed:
> http://www.spinics.net/lists/linux-acpi/msg63550.html
> But the series contains things that need more time to review, so you have to
> wait a bit longer.
> 
> Thanks and best regards
> -Lv

Dear Lv,

First of all, thanks for your efforts. However, I am not quite sure which part will be fixed by that link.

Does it mean the boot flag intel_idle.max_cstate won't be needed anymore, or does it have something to do with the ACPI tables (or both)?

With kind regards,

Sjoerd Furth
Comment 80 Rashed Abdel-Tawab 2016-03-03 19:32:20 UTC
I believe Lv linked that ACPI bug as a fix for the MSI laptops that are presenting issues with ACPI.
Comment 81 Len Brown 2016-03-13 06:20:54 UTC
Let's focus this report on the boot failure that requires
"intel_idle.max_cstate=7" to work around.  Please file other
bug reports for issues not directly related to that failure.
Comment 82 Len Brown 2016-03-13 06:31:10 UTC
Created attachment 208761 [details]
debug patch to disable c8 + C9 on selected SKL-H systems

Please report if the attached patch allows your system to boot
with no "intel_idle.max_cstate=" (or acpi=off) cmdline workaround.

If it is working as intended, you should see something like this in dmesg:

dmesg | grep idle

intel_idle: MWAIT substates: 0x11142120
intel_idle: v0.4.1 model 0x5E
intel_idle: lapic_timer_reliable_states 0xffffffff
intel_idle: SGX present 0x29c6fbf
intel_idle: state C8-SKL is disabled
intel_idle: state C9-SKL is disabled

grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
should show that C8-SKL and C9-SKL are no longer present.

If your BIOS has a SETUP option to enable SGX and you enable it,
then you should be able to boot without this patch, and this patch
will print another line about SGX being enabled, but you will not
see the bit about C8-SKL and C9-SKL being disabled, and you should
see them in sysfs using the grep above.

If this patch fails to fix your boot issue,
please boot with "intel_idle.max_cstate=7"
and show the output from "dmesg | grep idle"
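
The dmesg check described above can be automated with a minimal sketch like the following. It assumes the "state ... is disabled" message format shown in the sample output, which comes from the debug patch and may differ in whatever is eventually merged. Both function names are hypothetical.

```python
import re

# Matches lines such as "intel_idle: state C8-SKL is disabled",
# with or without a leading dmesg timestamp.
DISABLED_RE = re.compile(r"intel_idle: state (\S+) is disabled")


def disabled_states(dmesg_text):
    """Return the idle-state names reported as disabled, in log order."""
    return DISABLED_RE.findall(dmesg_text)


def patch_took_effect(dmesg_text):
    """True if both C8-SKL and C9-SKL are reported as disabled."""
    return {"C8-SKL", "C9-SKL"} <= set(disabled_states(dmesg_text))
```

Feed it the text of "dmesg | grep idle"; if SGX is enabled in the BIOS, the description above says the two "disabled" lines will not appear, so a False result is expected in that case.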
Comment 83 sdavid 2016-03-13 23:19:18 UTC
Thank you Len Brown.

I'm successfully running a 4.5-rc7 kernel with your patch and without the "intel_idle.max_cstate=7" I needed before.
My hardware is MSI GS40 "phantom" 6QE with i7-6700HQ CPU @ 2.60GHz.


dmesg :

[    0.000000]Command line: BOOT_IMAGE=/vmlinuz-4.5.0-rc7-phantom root=/dev/mapper/pcsd-root ro text nomodeset
(...)
[    0.778740] intel_idle: MWAIT substates: 0x11142120
[    0.778741] intel_idle: v0.4.1 model 0x5E
[    0.778741] intel_idle: lapic_timer_reliable_states 0xffffffff
[    0.778743] intel_idle: SGX present 0x29c6fbf
[    0.778744] intel_idle: state C8-SKL is disabled
[    0.778745] intel_idle: state C9-SKL is disabled


grep . /sys/devices/system/cpu/cpu0/cpuidle/*/* :

/sys/devices/system/cpu/cpu0/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE
/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/name:POLL
/sys/devices/system/cpu/cpu0/cpuidle/state0/power:4294967295
/sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/time:7674964
/sys/devices/system/cpu/cpu0/cpuidle/state0/usage:3133
/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:MWAIT 0x00
/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
/sys/devices/system/cpu/cpu0/cpuidle/state1/name:C1-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state1/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
/sys/devices/system/cpu/cpu0/cpuidle/state1/time:6542990
/sys/devices/system/cpu/cpu0/cpuidle/state1/usage:18820
/sys/devices/system/cpu/cpu0/cpuidle/state2/desc:MWAIT 0x01
/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
/sys/devices/system/cpu/cpu0/cpuidle/state2/name:C1E-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state2/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
/sys/devices/system/cpu/cpu0/cpuidle/state2/time:12444721
/sys/devices/system/cpu/cpu0/cpuidle/state2/usage:22193
/sys/devices/system/cpu/cpu0/cpuidle/state3/desc:MWAIT 0x10
/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:70
/sys/devices/system/cpu/cpu0/cpuidle/state3/name:C3-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state3/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:100
/sys/devices/system/cpu/cpu0/cpuidle/state3/time:2071866
/sys/devices/system/cpu/cpu0/cpuidle/state3/usage:4613
/sys/devices/system/cpu/cpu0/cpuidle/state4/desc:MWAIT 0x20
/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state4/latency:85
/sys/devices/system/cpu/cpu0/cpuidle/state4/name:C6-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state4/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state4/residency:200
/sys/devices/system/cpu/cpu0/cpuidle/state4/time:76102469
/sys/devices/system/cpu/cpu0/cpuidle/state4/usage:68295
/sys/devices/system/cpu/cpu0/cpuidle/state5/desc:MWAIT 0x33
/sys/devices/system/cpu/cpu0/cpuidle/state5/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state5/latency:124
/sys/devices/system/cpu/cpu0/cpuidle/state5/name:C7s-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state5/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state5/residency:800
/sys/devices/system/cpu/cpu0/cpuidle/state5/time:235877869
/sys/devices/system/cpu/cpu0/cpuidle/state5/usage:113583
/sys/devices/system/cpu/cpu0/cpuidle/state6/desc:MWAIT 0x60
/sys/devices/system/cpu/cpu0/cpuidle/state6/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state6/latency:890
/sys/devices/system/cpu/cpu0/cpuidle/state6/name:C10-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state6/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state6/residency:5000
/sys/devices/system/cpu/cpu0/cpuidle/state6/time:488816024
/sys/devices/system/cpu/cpu0/cpuidle/state6/usage:29191


Thanks, and best regards.

sdavid
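
For anyone comparing pasted listings like the one above, here is a minimal sketch that parses the "grep ." output and computes each state's share of the recorded idle time. It assumes the "path:value" line format shown in this comment; the function names are hypothetical.

```python
import re

# One line of "grep . .../cpuidle/*/*" output: <path>/stateN/<attr>:<value>
LINE_RE = re.compile(r"^.*/cpuidle/(state\d+)/(\w+):(.*)$")


def parse_grep_output(text):
    """Return {state_dir: {attribute: value}} from the pasted listing."""
    states = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            state, attr, value = match.groups()
            states.setdefault(state, {})[attr] = value
    return states


def time_share(states):
    """Return {state_name: fraction of total 'time'} for the parsed states."""
    total = sum(int(s["time"]) for s in states.values())
    if total == 0:
        return {}
    return {s["name"]: int(s["time"]) / total for s in states.values()}
```

States that are missing from the listing (here C8-SKL and C9-SKL) simply do not appear in the result.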
Comment 84 Lev Lybin 2016-03-14 02:09:07 UTC
Works fine :) Thanks.
Comment 85 Cláudio Pereira 2016-03-15 00:59:51 UTC
Are we losing anything from our systems by using this patch? I mean, this essentially seems to disable features our processors have, am I wrong?
Comment #61 made me think this is all about power-saving features, and laptops with this processor are usually power-hungry machines, so every bit counts.

Also since this is a bugfix, will it get backported into currently supported mainstream kernels?
Eg. it would be a shame if the upcoming Ubuntu LTS had trouble dealing with this for having 4.4 instead of 4.6.
Comment 86 Lev Lybin 2016-03-15 04:02:47 UTC
Some information there, page 66: http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/desktop-6th-gen-core-family-datasheet-vol-1.pdf

As I understand it, C7-C10 are similar. We have C7 and C10. You can see this with: grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*

But I haven't fully understood why C8 and C9 are disabled. Is it a processor bug? Are they disabled using microcode?
Comment 87 sdavid 2016-03-15 15:14:14 UTC
(In reply to Lv Zheng from comment #76)
> If you mean the problem in comment 11, it is fixed:
> http://www.spinics.net/lists/linux-acpi/msg63550.html
> But the series contains things that need more time to review, so you have to
> wait a bit longer.
> 
> Thanks and best regards
> -Lv

About the ACPI error:
I understood it was a separate issue.
Is there a dedicated report for this issue that I could also follow?
I searched Bugzilla under the ACPI topics but didn't find any matching reports.
I can create one; it's just that I'm new here and I don't know what the rules are.

Thanks and best regards.

sdavid
Comment 88 Lev Lybin 2016-03-15 15:41:36 UTC
I think this problem is solved here (search the page for "Lv Zheng (7)"): http://lkml.iu.edu/hypermail/linux/kernel/1603.1/05278.html
Comment 89 sdavid 2016-03-15 22:37:51 UTC
(In reply to Lev Lybin from comment #88)
> I think this problem is solved here CTRL+F "Lv Zheng (7)":
> http://lkml.iu.edu/hypermail/linux/kernel/1603.1/05278.html

Thank you !
Comment 90 Len Brown 2016-03-16 05:51:40 UTC
Re: comment #85

Yes, when intel_idle disables C8 and C9, the OS loses the ability
to directly request those idle states.  However, we do this only
when C10 is enabled.  So the processor can still enter C10
(which saves more energy than C8,C9) and the processor can
still choose to "demote" those C10 requests to C8,C9
residency if it determines that is a better match for
the expected latency.

So I don't expect this workaround to have a measurable impact
except in academic scenarios.  Note that, by comparison, ACPI
mode generally exports C1/C7/C10 -- so even with C8, C9
removed from intel_idle, it offers more fine-grain C-state
selection than ACPI, which is what Windows uses...
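
The rule described above can be modelled in a few lines. This is only an illustrative Python sketch, not the kernel code (which is C and reads the hardware directly): the function name visible_cstates and its boolean parameters are assumptions, the state names are taken from the sysfs and dmesg output in this thread, and the SGX exception follows the description in comment #82.

```python
# State names as they appear in this thread's sysfs and dmesg output.
SKL_STATES = ["POLL", "C1-SKL", "C1E-SKL", "C3-SKL", "C6-SKL",
              "C7s-SKL", "C8-SKL", "C9-SKL", "C10-SKL"]

def visible_cstates(states, c10_enabled, sgx_enabled=False):
    """Model which C-states the OS can still request after the workaround.

    Hypothetical model: C8 and C9 are hidden only when C10 is enabled,
    and (per comment #82) are left alone when SGX is enabled in BIOS setup.
    """
    if not c10_enabled or sgx_enabled:
        return list(states)
    return [s for s in states if s not in ("C8-SKL", "C9-SKL")]

if __name__ == "__main__":
    print(visible_cstates(SKL_STATES, c10_enabled=True))
```

With C10 enabled and SGX disabled, the result matches the state list that patched systems report in sysfs: C10-SKL stays requestable, so the hardware can still demote to C8 or C9 residency on its own.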
Comment 91 Sjoerd Furth 2016-03-18 01:11:57 UTC
(In reply to Len Brown from comment #82)
> Created attachment 208761 [details]
> debug patch to disable c8 + C9 on selected SKL-H systems
> 
> Please report if the attached patch allows your system to boot
> with no "intel_idle.max_cstate=" (or acpi=off) cmdline workaround.
> 
> If it is working as intended, you should see something like this in dmesg:
> 
> dmesg | grep idle
> 
> intel_idle: MWAIT substates: 0x11142120
> intel_idle: v0.4.1 model 0x5E
> intel_idle: lapic_timer_reliable_states 0xffffffff
> intel_idle: SGX present 0x29c6fbf
> intel_idle: state C8-SKL is disabled
> intel_idle: state C9-SKL is disabled
> 
> grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
> should show that C8-SKL and C9-SKL are not longer present.
> 
> If your BIOS has a SETUP option to enable SGX and you enable it,
> then you should be able to boot without this patch, and this patch
> will print another line about SGX being enabled, but you will not
> see the bit about C8-SKL and C9-SKL being disabled, and you should
> see them in sysfs using the grep above.
> 
> If this patch fails to fix your boot issue,
> please boot with "intel_idle.max_cstate=7"
> and show the output from "dmesg | grep idle"

Dear Len,

Today I tested the patch on kernel 4.4.5 (Arch current stable). It is also working there.

[sjoerd@Sjoerd-Laptop-Arch-Linux ~]$ dmesg | grep idle

[    0.000000] Command line: \boot\vmlinuz-linux-custom root=/dev/sdb5 rw initrd=/boot/initramfs-linux-custom.img
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370452778343963 ns
[    0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635855245 ns
[    0.039644] process: using mwait in idle threads
[    0.209242] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370867519511994 ns
[    0.220851] cpuidle: using governor ladder
[    0.234201] cpuidle: using governor menu
[    0.390041] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.445249] intel_idle: MWAIT substates: 0x11142120
[    0.445250] intel_idle: v0.4.1 model 0x5E
[    0.445251] intel_idle: lapic_timer_reliable_states 0xffffffff
[    0.445252] intel_idle: SGX present 0x29c6fbf
[    0.445253] intel_idle: state C8-SKL is disabled
[    0.445254] intel_idle: state C9-SKL is disabled
[    1.435604] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x255cb5c6a11, max_idle_ns: 440795249002 ns

[sjoerd@Sjoerd-Laptop-Arch-Linux ~]$ grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
/sys/devices/system/cpu/cpu0/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE
/sys/devices/system/cpu/cpu0/cpuidle/state0/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/name:POLL
/sys/devices/system/cpu/cpu0/cpuidle/state0/power:4294967295
/sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/time:10148
/sys/devices/system/cpu/cpu0/cpuidle/state0/usage:74
/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:MWAIT 0x00
/sys/devices/system/cpu/cpu0/cpuidle/state1/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
/sys/devices/system/cpu/cpu0/cpuidle/state1/name:C1-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state1/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
/sys/devices/system/cpu/cpu0/cpuidle/state1/time:9223221
/sys/devices/system/cpu/cpu0/cpuidle/state1/usage:48074
/sys/devices/system/cpu/cpu0/cpuidle/state2/desc:MWAIT 0x01
/sys/devices/system/cpu/cpu0/cpuidle/state2/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
/sys/devices/system/cpu/cpu0/cpuidle/state2/name:C1E-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state2/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
/sys/devices/system/cpu/cpu0/cpuidle/state2/time:9280773
/sys/devices/system/cpu/cpu0/cpuidle/state2/usage:13186
/sys/devices/system/cpu/cpu0/cpuidle/state3/desc:MWAIT 0x10
/sys/devices/system/cpu/cpu0/cpuidle/state3/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:70
/sys/devices/system/cpu/cpu0/cpuidle/state3/name:C3-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state3/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:100
/sys/devices/system/cpu/cpu0/cpuidle/state3/time:720935
/sys/devices/system/cpu/cpu0/cpuidle/state3/usage:795
/sys/devices/system/cpu/cpu0/cpuidle/state4/desc:MWAIT 0x20
/sys/devices/system/cpu/cpu0/cpuidle/state4/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state4/latency:85
/sys/devices/system/cpu/cpu0/cpuidle/state4/name:C6-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state4/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state4/residency:200
/sys/devices/system/cpu/cpu0/cpuidle/state4/time:10644507
/sys/devices/system/cpu/cpu0/cpuidle/state4/usage:4403
/sys/devices/system/cpu/cpu0/cpuidle/state5/desc:MWAIT 0x33
/sys/devices/system/cpu/cpu0/cpuidle/state5/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state5/latency:124
/sys/devices/system/cpu/cpu0/cpuidle/state5/name:C7s-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state5/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state5/residency:800
/sys/devices/system/cpu/cpu0/cpuidle/state5/time:12803186
/sys/devices/system/cpu/cpu0/cpuidle/state5/usage:4540
/sys/devices/system/cpu/cpu0/cpuidle/state6/desc:MWAIT 0x60
/sys/devices/system/cpu/cpu0/cpuidle/state6/disable:0
/sys/devices/system/cpu/cpu0/cpuidle/state6/latency:890
/sys/devices/system/cpu/cpu0/cpuidle/state6/name:C10-SKL
/sys/devices/system/cpu/cpu0/cpuidle/state6/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state6/residency:5000
/sys/devices/system/cpu/cpu0/cpuidle/state6/time:49171424
/sys/devices/system/cpu/cpu0/cpuidle/state6/usage:2143


Thank you for your hard work!

With kind regards,

Sjoerd Furth
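
For testers who want to check the dmesg output automatically, a small Python sketch along these lines may help. It assumes the message format printed by the debug patch in this thread ("intel_idle: state ... is disabled"); the upstream commit may word things differently, and the function name disabled_states is made up for illustration.

```python
import re

def disabled_states(dmesg_text):
    """Return the C-state names that intel_idle reported as disabled.

    Assumes the "state <name> is disabled" lines printed by the debug
    patch attached to this bug.
    """
    pattern = re.compile(r"intel_idle: state (\S+) is disabled")
    return pattern.findall(dmesg_text)

# Sample lines copied from the dmesg output posted above.
SAMPLE = """\
[    0.445252] intel_idle: SGX present 0x29c6fbf
[    0.445253] intel_idle: state C8-SKL is disabled
[    0.445254] intel_idle: state C9-SKL is disabled
"""

if __name__ == "__main__":
    print(disabled_states(SAMPLE))  # ['C8-SKL', 'C9-SKL']
```

On a live system the input would be the text of "dmesg | grep idle" rather than the sample string.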
Comment 92 Len Brown 2016-03-28 17:32:42 UTC
fix shipped upstream in v4.6-rc1:

commit d70e28f57e14a481977436695b0c9ba165472431
Author: Len Brown <len.brown@intel.com>
Date:   Sun Mar 13 00:33:48 2016 -0500

    intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled


This patch will need to be applied where intel_idle has SKL support.
For the upstream kernel, that is Linux 4.3, 4.4, and 4.5.

closed.
Comment 93 Lv Zheng 2016-04-13 07:33:35 UTC
(In reply to sdavid from comment #87)
> (In reply to Lv Zheng from comment #76)
> > If you mean the problem in comment 11, it is fixed:
> > http://www.spinics.net/lists/linux-acpi/msg63550.html
> > But the series contains things that need more time to review, so you have
> to
> > wait a bit longer.
> > 
> > Thanks and best regards
> > -Lv
> 
> about the ACPI error :
> I understood it was a separated issue.
> Is there a dedicated report for this issue that I could also follow ?
> I searched bugzilla tracking under ACPI topics but didn't found any reports
> that match.
> I can create it, it's just I'm new here and I don't know what's are the
> rules.
> 
> Thanks and best regards.
> 
> sdavid

Thanks for the ping.
You can find the related fix on this bug entry:
https://bugzilla.kernel.org/show_bug.cgi?id=102421
Several of the fixes are upstreamed.

The issue is more serious than expected.
Although all we see are error logs,
the errors in fact indicate many issues related to ACPI subsystem initialization.
So you can have your cases tested there, or file another bug and assign it to me.

Thanks
-Lv
Comment 94 Zhang Rui 2016-05-16 06:48:47 UTC
*** Bug 112261 has been marked as a duplicate of this bug. ***
Comment 95 Ego 2016-08-04 12:01:04 UTC
Hello, I'm trying to install the system but keep getting the same errors:
iwlwifi 0000:02:00.0: Unsupported splx structure
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s!
...
The same happens with a preinstalled operating system and with microcode ucode-intel-20170714.1 installed. Even switching the system to Legacy mode makes no difference. What else can I try?
Comment 96 Ego 2016-08-04 12:07:42 UTC
specifications
msi gp-62-lp-466
Intel i7 6700HQ
bios version 2016-01-26