Bug 110451 - FADT address favor - Boot fails on HP 6715s
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Tables
Hardware: All Linux
Importance: P1 normal
Assignee: Lv Zheng
Reported: 2016-01-06 20:08 UTC by Olof Englund
Modified: 2022-04-25 11:49 UTC
CC List: 14 users
Kernel Version: 4.4.0-rc7
Regression: Yes

Attachments
acpi.dat from acpidump (327.01 KB, application/x-ns-proxy-autoconfig)
2016-01-12 21:35 UTC, Olof Englund
Output of sudo dmidecode > dmi.dat (6.76 KB, application/x-ns-proxy-autoconfig)
2016-01-13 19:03 UTC, Olof Englund
acpidump with newer bios (333.94 KB, application/x-ns-proxy-autoconfig)
2016-01-13 21:01 UTC, Olof Englund
dmidecode with newer BIOS (6.76 KB, application/x-ns-proxy-autoconfig)
2016-01-13 21:01 UTC, Olof Englund

Description Olof Englund 2016-01-06 20:08:20 UTC
Original bug report here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1529381

Result when trying to boot:

- Message about loading initrd image or somesuch.
- Screen goes black.
- Fan goes to full speed.
- HD light lit for about 15 seconds or so, then goes dark.

Found the latest working kernel version (in Ubuntu) to be 3.19.0-25.26. Bisected to 3.19.0-26.27 and found this:

"19114b5458510b757cd2801e64094e4062e4067f is the first bad commit
commit 19114b5458510b757cd2801e64094e4062e4067f
Author: Lv Zheng <lv.zheng@intel.com>
Date: Wed Jul 1 14:43:34 2015 +0800

    ACPICA: Tables: Enable default 64-bit FADT addresses favor

    BugLink: http://bugs.launchpad.net/bugs/1479048

    commit 0ea61381788a37d864f9841b0fe97d40f7058f3b upstream.

    ACPICA commit 4da56eeae0749dfe8491285c1e1fad48f6efafd8

    The following commit temporarily disables correct 64-bit FADT addresses
    favor during the period the root cause of the bug is not fixed:
     Commit: 85dbd5801f62b66e2aa7826aaefcaebead44c8a6
     ACPICA: Tables: Restore old behavor to favor 32-bit FADT addresses.

    With enough protections, this patch re-enables 64-bit FADT addresses by
    default. If regressions are reported against such change, this patch should
    be bisected and reverted.
    Note that 64-bit FACS favor and 64-bit firmware waking vector favor are
    excluded by this commit in order not to break OSPMs. Lv Zheng.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021
    Link: https://github.com/acpica/acpica/commit/4da56eea
    Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org>
    Signed-off-by: Lv Zheng <lv.zheng@intel.com>
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

:040000 040000 0b002dc7d6e5abe0ded978926361d88efd5a0f88 27c36ea102b1622ded7beecbca477f62fa33922a M include"
Comment 1 Aaron Lu 2016-01-08 02:19:04 UTC
Add the author.
Comment 2 Len Brown 2016-01-11 23:49:28 UTC
mark as regression
Comment 3 Lv Zheng 2016-01-12 00:23:51 UTC
Hi,

First, please upload full acpidump here.

Second, there are known issues related to the feature.
The issues have already been fixed in the upstream kernel.
Could you try to cherry pick the following 3 fixes from upstream kernel and try again?

1. http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8ec3f45
author Lv Zheng <lv.zheng@intel.com> 2015-08-25 02:29:01 (GMT) 
commit 8ec3f459073e67e5c6d78507dec693064b3040a2 (patch) 

ACPICA: Tables: Fix global table list issues by removing fixed table indexes
ACPICA commit c0b38b4c3982c2336ee92a2a14716107248bd941

The fixed table indexes leave holes in the global table list:
 1. One hole can be seen when there is only 1 FACS provided by the BIOS.
 2. Tow holes can be seen when it is a reduced hardware platform.
The holes do not break OSPMs but have broken ACPI debugger "tables"
command.

Also the "fixed table indexes" mechanism may make the descriptors of the
standard tables installed earlier than DSDT to be overwritten by the
descriptors of the fixed tables. For example, FACP disappears from the
global table list after DSDT is installed.

This patch fixes all above issues by removing the "fixed table indexes"
mechanism which is too complicated to be maintained in a regression safe
manner. After removal, the table loader will determine the indexes of the
fixed tables. Lv Zheng.

2. http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7484619

author Lv Zheng <lv.zheng@intel.com> 2015-08-25 02:29:08 (GMT) 
commit 7484619bff495c30e977dafe2ff735477bd569ff (patch) 

ACPICA: Tables: Cleanup to reduce FACS globals
ACPICA commit 3f42ba76e2a0453976d3108296d5f656fdf2bd6e

In this patch, FACS table mapping is also tuned a bit so that only the
selected FACS table will be mapped by the OSPM (mapped on demand) and the
FACS related global variables can be reduced. Lv Zheng.

3. http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=62fcce91

author Lv Zheng <lv.zheng@intel.com> 2015-10-14 05:53:57 (GMT) 
commit 62fcce91049a9681fc31d068ffcfaec8d168a857 (patch) 

ACPICA: Tables: Fix FADT dependency regression
Some logics actually relying on the existence of FADT, currently relies on
the number of loaded tables. This false dependency can easily trigger
regressions. One of them has been introduced by commit 8ec3f459073e
(ACPICA: Tables: Fix global table list issues by removing fixed table).

The commit changing the fixed table indexes results in the change of FADT
table index, originally, it was 3 (thus the installed table count should be
greater than 4), while currently it is 0 (and the installed table count may
be 3).

This patch fixes this regression by cleaning up the code. Lv Zheng.

Thanks and best regards
-Lv
Comment 4 Olof Englund 2016-01-12 21:35:46 UTC
Created attachment 199471 [details]
acpi.dat from acpidump

It will take me a bit to figure out how to test the cherrypicked patches. In the meantime, here's the results of acpidump.
Comment 5 Olof Englund 2016-01-12 21:46:57 UTC
Hi!

Is there a guide detailing how I should download the latest (?) Mainline, apply the patches and compile?

I found this:

git clone git://kernel.ubuntu.com/virgin/linux.git mainline
cd mainline
git checkout -b `cat ${MAINLINE}/COMMIT`
git am ${MAINLINE}/????-*

Here: https://wiki.ubuntu.com/Kernel/MainlineBuilds

Am I close?

Best regards,
Olof
Comment 6 Lv Zheng 2016-01-13 05:37:29 UTC
(In reply to Olof Englund from comment #4)
> Created attachment 199471 [details]
> acpi.dat from acpidump
> 
> It will take me a bit to figure out how to test the cherrypicked patches. In
> the meantime, here's the results of acpidump.

When decompiling the FADT, I got the following error:
ACPI BIOS Error (bug): 32/64X address mismatch in FADT/Pm2ControlBlock: 0x00008800/0x0000000000008100, using 32 (20130725/tbfadt-565)

[048h 0072   4]    PM2 Control Block Address : 00008800
[05Ah 0090   1]     PM2 Control Block Length : 01

[0C4h 0196  12]            PM2 Control Block : [Generic Address Structure]
[0C4h 0196   1]                     Space ID : 01 [SystemIO]
[0C5h 0197   1]                    Bit Width : 08
[0C6h 0198   1]                   Bit Offset : 00
[0C7h 0199   1]         Encoded Access Width : 00 [Undefined/Legacy]
[0C8h 0200   8]                      Address : 0000000000008100
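
To make the mismatch concrete, here is a minimal sketch in C of the two FADT fields involved. The struct follows the 12-byte Generic Address Structure layout shown in the dump above. The names gas, fadt_pm2, hp6715s_pm2 and pm2_mismatch are made up for this illustration and are not ACPICA identifiers, and the packed attribute assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Generic Address Structure, 12 bytes, as laid out in the dump above. */
struct gas {
	uint8_t  space_id;      /* 01 = SystemIO */
	uint8_t  bit_width;
	uint8_t  bit_offset;
	uint8_t  access_width;  /* 00 = undefined/legacy */
	uint64_t address;
} __attribute__((packed));      /* GCC/Clang extension (assumption) */

/* Only the two PM2 control block fields of the FADT (hypothetical type). */
struct fadt_pm2 {
	uint32_t   pm2_control_block;    /* legacy 32-bit field, offset 048h */
	struct gas x_pm2_control_block;  /* 64-bit GAS, offset 0C4h */
};

/* Values copied from the decoded FADT above. */
static const struct fadt_pm2 hp6715s_pm2 = {
	.pm2_control_block = 0x00008800,
	.x_pm2_control_block = {
		.space_id = 1,
		.bit_width = 8,
		.bit_offset = 0,
		.access_width = 0,
		.address = 0x0000000000008100ULL,
	},
};

/* True when both fields are populated but name different addresses. */
static bool pm2_mismatch(const struct fadt_pm2 *f)
{
	return f->pm2_control_block != 0 &&
	       f->x_pm2_control_block.address != 0 &&
	       f->x_pm2_control_block.address != f->pm2_control_block;
}
```

Calling pm2_mismatch(&hp6715s_pm2) reports a mismatch for this machine, because the legacy field says 0x8800 while the GAS says 0x8100.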

According to the Vista documentation, Windows favors the 64-bit GAS values.
Do you have any trouble running Vista/Win7/Win8 on this machine?

We simply don't know which address Windows actually prefers.
The current default behavior is required by the ACPI spec, so it cannot be reverted for now.
What we can offer is a quirk that makes Linux favor the 32-bit GAS values again on your platform until the issue is root caused.
Could you upload the dmidecode output here? You can obtain it by running:
# sudo dmidecode > dmi.dat

Thanks
-Lv
Comment 7 Lv Zheng 2016-01-13 05:45:12 UTC
(In reply to Olof Englund from comment #5)
> git clone git://kernel.ubuntu.com/virgin/linux.git mainline
> cd mainline
> git checkout -b `cat ${MAINLINE}/COMMIT`
> git am ${MAINLINE}/????-*
> 
> Here: https://wiki.ubuntu.com/Kernel/MainlineBuilds


I haven't tried that.
If you want to apply upstream kernel commits to an Ubuntu kernel, you might do:

# git clone git://kernel.ubuntu.com/ubuntu/ubuntu-<release>.git
(<release> should be the release you are testing.)
# cd ubuntu-<release>
# git remote add upstream git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
# git fetch upstream
# git cherry-pick 8ec3f45
# git cherry-pick 7484619
# git cherry-pick 62fcce9

Then build and boot the kernel.

Thanks and best regards
-Lv
Comment 8 Olof Englund 2016-01-13 19:03:00 UTC
Created attachment 199551 [details]
Output of sudo dmidecode > dmi.dat
Comment 9 Olof Englund 2016-01-13 19:22:43 UTC
Googling around I discovered there might be a BIOS update for this laptop model after all. HP does not offer it, for reasons unknown. But I'll try to track down a copy of it and update the BIOS (to F.05 or F.07, apparently).

As for Windows, I think this laptop was running Vista before I got my hands on it.
Comment 10 Olof Englund 2016-01-13 20:59:52 UTC
Found a newer BIOS and updated to it, but to my sad surprise it didn't solve the problem. Will attach newer acpidump and dmidecode and next look at getting those patches tested.

Thanks for your patience :)
Comment 11 Olof Englund 2016-01-13 21:01:13 UTC
Created attachment 199561 [details]
acpidump with newer bios
Comment 12 Olof Englund 2016-01-13 21:01:45 UTC
Created attachment 199571 [details]
dmidecode with newer BIOS
Comment 13 Lv Zheng 2016-01-14 05:49:09 UTC
The flag only affects whether Linux favors the 64-bit addresses in the FADT.

Loading the FADT you provided in attachment 199561, the only 32/64-bit difference I can see is in the PM2 control register.

Before writing the quirk, I think I should ask:
Is there any user of the PM2 control register in Linux?
And can that user stop Linux from booting?

Thanks and best regards
-Lv
Comment 14 Lv Zheng 2016-01-14 06:12:27 UTC
I just searched the dmesg from the boot log provided by the original reporter:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1529381/+attachment/4540775/+files/BootDmesg.txt

There is no error message complaining about the PM2 control register mismatch.
So I have to ask why this commit was picked out by the bisection.
The bisection result looks wrong.

Thanks and best regards
-Lv
Comment 15 Lv Zheng 2016-01-14 07:19:46 UTC
Hi,

I'm not able to find the commit in the following ubuntu git repos:
http://kernel.ubuntu.com/git/virgin/linux-stable.git/
http://kernel.ubuntu.com/git/virgin/linux.git/

If you mean this commit in the upstream:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0ea6138

Let me say something about this commit.
The commit only modifies a global variable.
The code affected by the modification is as follows:
1. If the 64-bit DSDT address differs from the 32-bit DSDT address, the flag decides which one is used. This is done in acpi_tb_select_address().
/* Address mismatch between 32-bit and 64-bit versions */

ACPI_BIOS_WARNING((AE_INFO,
		   "32/64X %s address mismatch in FADT: "
		   "0x%8.8X/0x%8.8X%8.8X, using %u-bit address",
		   register_name, address32,
		   ACPI_FORMAT_UINT64(address64),
		   acpi_gbl_use32_bit_fadt_addresses ? 32 :
		   64));

/* 32-bit address override */

if (acpi_gbl_use32_bit_fadt_addresses) {
	return ((u64)address32);
}


2. If the 64-bit PM/GPE register addresses differ from the 32-bit register addresses, the flag again decides which ones are used. This is done in acpi_tb_convert_fadt().
ACPI_BIOS_WARNING((AE_INFO,
		   "32/64X address mismatch in FADT/%s: "
		   "0x%8.8X/0x%8.8X%8.8X, using %u-bit address",
		   name, address32,
		   ACPI_FORMAT_UINT64(address64->address),
		   acpi_gbl_use32_bit_fadt_addresses ? 32 : 64));

if (acpi_gbl_use32_bit_fadt_addresses) {

	/* 32-bit address override */

	acpi_tb_init_generic_address(address64,
				     ACPI_ADR_SPACE_SYSTEM_IO,
				     *ACPI_ADD_PTR(u8, &acpi_gbl_FADT,
						   fadt_info_table[i].length),
				     (u64)address32, name, flags);
}

So if this code is executed, there should always be a log entry in the kernel dmesg, emitted by ACPI_BIOS_WARNING().
But I cannot see such an entry in the dmesg.
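
To spell out the reasoning, here is a simplified user-space sketch in C of the selection logic, modeled on the snippets quoted above. It is not the ACPICA code: use32_bit_fadt_addresses stands in for acpi_gbl_use32_bit_fadt_addresses, mismatch_warnings stands in for the dmesg entries, and the handling of a zero 32-bit or 64-bit address is an assumption added for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for acpi_gbl_use32_bit_fadt_addresses; false favors 64-bit. */
static bool use32_bit_fadt_addresses = false;

/* Counts emitted warnings, standing in for entries in the kernel log. */
static unsigned int mismatch_warnings;

/* Pick the address to use from a 32-bit/64-bit FADT field pair. */
static uint64_t select_fadt_address(uint32_t address32, uint64_t address64)
{
	/* Assumption: with no 64-bit value, fall back to the legacy field. */
	if (!address64)
		return address32;

	/* Assumption: nothing to disagree with, so no warning is needed. */
	if (!address32 || address64 == (uint64_t)address32)
		return address64;

	/* Mismatch: the warning is emitted whichever side wins. */
	mismatch_warnings++;

	return use32_bit_fadt_addresses ? (uint64_t)address32 : address64;
}
```

With the PM2 values from this machine, select_fadt_address(0x8800, 0x8100) picks 0x8100 when the flag is false and 0x8800 when it is true, and the warning counter goes up in both cases. That is why a boot that runs through this path should leave a mismatch message in dmesg.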

IMO, the bisection result is wrong.
I couldn't find the commit myself. If you still have the git tree, you could:
# git clone <the git tree that is reported to be wrong>
# git revert 19114b545
Then build and boot the kernel again to confirm whether this is the correct bisection result.

Thanks and best regards
-Lv
Comment 16 Olof Englund 2016-01-14 19:17:57 UTC
Oh dear, I really hope I haven't messed up the bisection. Unfortunately I deleted the old git tree when I started trying to compile a kernel with the cherry-picked patches, so I won't be able to revert. I will go through the process again, but it will take a couple of days since compiling the kernel takes 5-6 hours on that old laptop.

As for what dmesg says, it is from a working boot with the last (Ubuntu) kernel that I can boot.

I'm sorry I cannot be of help to you regarding the PM2 register. I am not a developer and I don't really understand working with git and how to debug the kernel. At best I can follow specific instructions but right now it seems I didn't succeed in that either :-p

I'll get back to this after doing a new bisect! The guide I have been following is here:

https://wiki.ubuntu.com/Kernel/KernelBisection

Thanks and best regards

Olof
Comment 17 Olof Englund 2016-01-17 19:49:40 UTC
I did a new bisect. It was a bit different this time, perhaps because I updated the BIOS.

Booting the first bisected kernel (half of the commits I guess?) the computer did not freeze at boot but the text was completely garbled into white sort of blocks. Xorg did not start. I noticed the machine was alive because I was able to switch consoles and do a soft reboot using ctrl+alt+del.

I marked the bisect as bad and continued through the process. Every kernel I booted gave the same result. I marked them all bad and arrived at something that cannot be the source of the problem:

"commit 14565d5968d4627cafcf3c4df4747550e075c205
Author: Luis Henriques <luis.henriques@canonical.com>
Date:   Mon Jul 27 17:37:21 2015 +0100

    UBUNTU: Start new release
    
    Ignore: yes"

I'm a bit bewildered. The bisect seemingly didn't catch the problem, but the original problem still remains with 3.19.0-26.27 and later.
Comment 18 Lv Zheng 2016-01-18 01:57:16 UTC
If you don't pin down the exact buggy behavior, any other buggy behavior can lead to a wrong bisection result.
You should probably ask the graphics developers to debug that issue, since it is about Xorg startup.

Thanks and best regards
-Lv
Comment 19 Lv Zheng 2016-01-18 02:05:41 UTC
And if there are many broken commits between the two endpoints, you are unlikely to get the culprit of your issue bisected.
Maintainers ask contributors to submit one non-broken patch per change, but broken commits are not always easy to detect: maintainers usually test a series as a whole, and there are cases where the whole series works while one patch in it is broken.

Thanks and best regards
-Lv
Comment 20 Gatis Ozols 2016-01-21 09:54:59 UTC
Hi, I have exactly the same problem with the same HP Compaq 6715s laptop.
Comment 21 Gatis Ozols 2016-01-21 11:34:52 UTC
Today I got help from a developer nicknamed "apw", who hangs out in the #ubuntu-kernel channel on the Freenode network. He produced a kernel based on 3.19.0-26.27 with just this commit removed, and it BOOTED SUCCESSFULLY!
Comment 22 Gatis Ozols 2016-01-21 11:38:17 UTC
Here's the link to the kernel with the commit removed:
http://people.canonical.com/~apw/lp1529381-vivid/
Comment 23 Gatis Ozols 2016-01-21 12:16:37 UTC
Here's our conversation about the kernel issue and possible cause in #ubuntu-kernel: 
http://irclogs.ubuntu.com/2016/01/21/%23ubuntu-kernel.html
Comment 24 Gatis Ozols 2016-01-21 17:21:21 UTC
An Ubuntu kernel developer, cking, helped me fix the problem.
He added a kernel parameter, "acpi_force_32bit_fadt_addr", to kernel 3.19.0-48 to make it bootable.
I downloaded his modified kernel from http://kernel.ubuntu.com/~cking/lp-1529381/ and added the parameter in /etc/default/grub like this:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash acpi_force_32bit_fadt_addr"

This kernel parameter helped my machine to boot without problems! :)
This fix should be implemented in mainstream kernel :)
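
For readers wondering what such a switch amounts to, here is a rough user-space sketch in C. It only illustrates the idea of a boot option flipping the 32-bit preference. The kernel has its own boot-parameter machinery, which this does not reproduce, and the names use32_bit_fadt_addresses, cmdline_has_param and apply_cmdline are made up for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for the ACPICA global that the parameter is meant to set. */
static bool use32_bit_fadt_addresses;

/* True if param appears as a whole space-separated word in cmdline. */
static bool cmdline_has_param(const char *cmdline, const char *param)
{
	size_t len = strlen(param);
	const char *p = cmdline;

	if (len == 0)
		return false;

	while ((p = strstr(p, param)) != NULL) {
		bool starts_word = (p == cmdline) || (p[-1] == ' ');
		bool ends_word = (p[len] == '\0') || (p[len] == ' ');

		if (starts_word && ends_word)
			return true;
		p += len;
	}
	return false;
}

/* Favor the 32-bit FADT addresses when the option is on the command line. */
static void apply_cmdline(const char *cmdline)
{
	if (cmdline_has_param(cmdline, "acpi_force_32bit_fadt_addr"))
		use32_bit_fadt_addresses = true;
}
```

Passing the GRUB_CMDLINE_LINUX_DEFAULT string shown above to apply_cmdline() sets the flag, while a command line without the option leaves the 64-bit default in place.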
Comment 25 Olof Englund 2016-01-23 07:00:32 UTC
Hi!

Sorry for being "away" so long. I did the bisect once more, defining total freeze as "bad" and every other malfunction as "good" and arrived at the same result as the first time.

I am very happy to see what has been happening with this bug since, and I also tested the kernel Mr. Colin Ian King made for us to test in the Ubuntu bug thread. It works like a charm, so it does seem like we found the culprit (HP :-p)
Comment 26 Lv Zheng 2016-01-26 00:20:49 UTC
OK.
Thanks for the information.
I'll help to prepare the fix for the upstream kernel.

Thanks and best regards
-Lv
Comment 27 Zhang Rui 2016-02-22 06:57:48 UTC
Patch from Colin has been applied by Rafael.
https://patchwork.kernel.org/patch/8083471/
Comment 28 Lv Zheng 2016-04-05 05:21:37 UTC
So let's close it.
Comment 29 teika kazura 2016-04-09 11:26:28 UTC
Thanks all for fixing this issue.

The patch is now merged into mainline (4.6-rc), but not into 4.1.21, which was released on 3 April.

It should be merged into 4.1 as well, since the 4.1 series is the latest longterm and the bug was introduced in 3.19.xx. Does anyone know whether that is possible?

# BTW, according to this Arch Linux forum post[1], this bug also affects the HP ProBook 4510s. I asked the OP to confirm whether the patch fixes it, but have had no response. Maybe we could add a DMI_MATCH so that manually adding a kernel boot option is not necessary.

[1] https://bbs.archlinux.org/viewtopic.php?pid=1594468#p1594468

Best regards.
Comment 30 teika kazura 2016-04-14 12:07:02 UTC
@Lv Zheng: I wonder if you could ask 4.1 and 4.4 maintainers to merge the patch?

After my last comment, 4.4.7 (longterm) and 4.5.1 were released, but neither were fixed.
Comment 31 Lv Zheng 2016-04-15 06:03:38 UTC
(In reply to teika kazura from comment #30)
> @Lv Zheng: I wonder if you could ask 4.1 and 4.4 maintainers to merge the
> patch?
> 
> After my last comment, 4.4.7 (longterm) and 4.5.1 were released, but neither
> were fixed.

I don't know how I could achieve this.

Normally we do this via the patch description.
A "Cc: <stable@vger.kernel.org> # kernel version" tag is added, where the kernel version is the earliest version the patch applies to.
The stable maintainers then notice the tag and pick the patch up as a candidate for the stable trees.

Possibly you or the author can ping the stable mailing list to make this happen.

Thanks
-Lv
Comment 32 Lv Zheng 2016-04-15 08:12:50 UTC
Let me clarify something here:

The PM2 control register is used in the Linux ACPI processor related drivers, but I still have doubts about the root cause of this issue.

Favoring the 64-bit registers looks like the behavior recommended by the spec and declared by Windows (since Vista). We have also received reports of real platforms that may face issues if Linux favors the 32-bit registers.

As for this platform, the HP Compaq 6715s (RU655ET#AK8), it seems it should be able to run recent Windows.

So I don't know if:
1. When Windows are running on this platform, the ACPI processor stuffs are not used, or
2. Windows actually never uses 64-bit registers.

Thanks
-Lv
Comment 33 Lv Zheng 2016-04-15 08:42:03 UTC
The support page of this platform from HP:
http://www.driverscape.com/manufacturers/hp/laptops-desktops/hp-compaq-6715s-%28ru655et-ak8%29/28633

Thanks
-Lv
Comment 34 Lv Zheng 2016-04-20 02:59:16 UTC
> So I don't know if:
> 1. When Windows are running on this platform, the ACPI processor stuffs are
> not used, or
> 2. Windows actually never uses 64-bit registers.

These uncertainties make me believe this is not stable material.
IMO, that is why the patch was not marked for stable when it was upstreamed: both the author and the maintainers believed it was not stable material.

Thanks
-Lv
