Bug 73911

Summary: acpi_tb_validate_xsdt causes early kernel panic - x86 AMI BIOS F2-F4
Product: ACPI Reporter: Bruce Chiarelli (mano155)
Component: ACPICA-Core Assignee: Lv Zheng (lv.zheng)
Status: CLOSED CODE_FIX    
Severity: normal CC: aaron.lu, jwboyer, ltwardus, lv.zheng, mano155, thomas
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 3.13.0-rc7 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: /proc/cpuinfo
dmesg -k -l emerg,alert,crit,err,warn,notice,info,debug
/proc/iomem
/proc/ioports
sudo lspci -vvv
ver_linux (From a working LTS kernel)
[PATCH] ACPICA: Tables: Revert an additional assumption that a valid XSDT must have one entry.
[PATCH] ACPICA: Tables: Revert an improvement for some failure cases around XSDT validation.
acpidump.txt (Gigabyte GA-Z87X-UD5H)
dmidecode.txt
acpidump: Bypass root table check when downloading a table from an address.
acpidump: Add support to force using RSDT.
[PATCH] x86/ACPI: Add quirk mechanism to stop using broken XSDT on some platforms.
RSDT from patched acpica.org acpidump
XSDT from patched acpica.org acpidump
acpidump -x -x
dmidecode output for Gigabyte B85-HD3
dmesg output after applying 132561
RSDT (Gigabyte B85-HD3)
XSDT (Gigabyte B85-HD3)
acpidump -x -x (Gigabyte B85-HD3)
dmesg after patching (Gigabyte B85-HD3)
acpidump: Add support to force using RSDT.
Tables: Skip NULL entries in RSDT and XSDT.
[PATCH] ACPICA: Tables: Skip NULL entries in RSDT and XSDT.
[PATCH] ACPICA: Tables: Skip NULL entries in RSDT and XSDT.
acpidump1.txt (acpidump), after 3rd patch (B85-HD3)
acpidump2.txt (acpidump -x), after 3rd patch (B85-HD3)
acpidump3.txt (acpidump -x -x), after 3rd patch (B85-HD3)
RSDT after 3rd patch (B85-HD3)
XSDT after 3rd patch (B85-HD3)
[PATCH] x86/ACPI: Add quirk mechanism to stop using broken XSDT on some platforms.
dmesg after applying 132731
dmesg after applying 132801 on a clean v3.14 tree
tbutils.o from plain v3.14
[PATCH] ACPICA: Tables: Fix a bad pointer issue in acpi_tb_parse_root_table().
Dmesg after applying 132891
tbutils.o from panicking kernel

Description Bruce Chiarelli 2014-04-12 22:52:18 UTC
Between 3.13 and 3.14, a change was made that made the kernel refuse to boot. I was able to see the tail end of a backtrace using earlyprintk=vga,keep, but it freezes hard and I can't scroll back or save it anywhere.

Using git-bisect, the first revision that fails is 671cc68dc61f029d44b43a681356078e02d8dab8.
Comment 1 Bruce Chiarelli 2014-04-12 22:52:41 UTC
Created attachment 131991 [details]
/proc/cpuinfo
Comment 2 Bruce Chiarelli 2014-04-12 22:53:18 UTC
Created attachment 132001 [details]
dmesg -k -l emerg,alert,crit,err,warn,notice,info,debug
Comment 3 Bruce Chiarelli 2014-04-12 22:53:35 UTC
Created attachment 132011 [details]
/proc/iomem
Comment 4 Bruce Chiarelli 2014-04-12 22:53:54 UTC
Created attachment 132021 [details]
/proc/ioports
Comment 5 Bruce Chiarelli 2014-04-12 22:54:35 UTC
Created attachment 132031 [details]
sudo lspci -vvv
Comment 6 Bruce Chiarelli 2014-04-12 22:55:13 UTC
Created attachment 132041 [details]
ver_linux (From a working LTS kernel)
Comment 7 Lv Zheng 2014-04-14 05:42:54 UTC
Could you provide the acpidump output from a kernel that boots?
Thanks in advance.
Comment 8 Lv Zheng 2014-04-14 06:00:58 UTC
Hi,

Please retrieve both the RSDT and the XSDT for us to examine.

Please download the acpica package from acpica.org and build acpixtract/acpidump/iasl.

Please use the following commands:
# sudo acpidump > acpidump.txt
# acpixtract -a ./acpidump.txt
# sudo iasl -d rsdp.dat
# sudo acpidump -a <RSDT Address> > my_rsdt.dat
# sudo acpidump -a <XSDT Address> > my_xsdt.dat

Note: you can find the "RSDT Address" and "XSDT Address" values in the decoded output of rsdp.dat, which should be rsdp.dsl.
For example, if rsdp.dsl contains the following entries:

...
[010h 0016     1]        RSDT Address: CAC25028
...
[018h 0024     1]        XSDT Address: 00000000CAC25078
...

You should:
# sudo acpidump -a 0xCAC25028 > my_rsdt.dat
# sudo acpidump -a 0xCAC25078 > my_xsdt.dat

Please upload acpidump.txt, my_rsdt.dat and my_xsdt.dat here.

Thanks in advance.
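
The address lookup described in this comment can be scripted. The following is a minimal sketch, assuming the field lines in rsdp.dsl look like the example entries above; the helper name root_table_addresses and the regular expression are illustrative and are not part of the acpica tools.

```python
import re

# Matches address fields in the decoded RSDP, for example:
#   [010h 0016     1]        RSDT Address: CAC25028
FIELD_RE = re.compile(r"\]\s*(RSDT|XSDT) Address\s*:\s*([0-9A-Fa-f]+)\s*$")


def root_table_addresses(dsl_text):
    """Return a dict mapping 'RSDT'/'XSDT' to the address found in the text."""
    found = {}
    for line in dsl_text.splitlines():
        match = FIELD_RE.search(line)
        if match:
            found[match.group(1)] = int(match.group(2), 16)
    return found


# Sample input copied from the example entries in this comment.
sample = """\
[010h 0016     1]        RSDT Address: CAC25028
[018h 0024     1]        XSDT Address: 00000000CAC25078
"""

for name, address in root_table_addresses(sample).items():
    # The commands in this comment pass the address with a 0x prefix.
    print(f"sudo acpidump -a 0x{address:X} > my_{name.lower()}.dat")
```

The script only prints the two acpidump command lines; it does not run them, so it can be checked without root access.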
Comment 9 Lv Zheng 2014-04-16 00:34:56 UTC
*** Bug 74131 has been marked as a duplicate of this bug. ***
Comment 10 Lv Zheng 2014-04-16 02:00:50 UTC
The backport assumes that no RSDT/XSDT can contain zero entries (it should contain at least the FADT).
This assumption is not in the original code, so let me post a patch to revert this logic.

The original Linux commit also has issues: it does not bypass the XSDT in the following cases:
1. The address of the XSDT cannot be mapped.
2. The "Length" field of the XSDT is wrong.
Let me also post a patch to restore this wrong code.

I'll ask you to test after that.

We still need the XSDT/RSDT to find the root cause.
So please follow comment 8 to provide that information for us.
Comment 11 Lv Zheng 2014-04-16 02:02:18 UTC
Created attachment 132411 [details]
[PATCH] ACPICA: Tables: Revert an additional assumption that a valid XSDT must have one entry.

The patch to revert the one-entry-required validation.
Comment 12 Lv Zheng 2014-04-16 02:02:51 UTC
Created attachment 132421 [details]
[PATCH] ACPICA: Tables: Revert an improvement for some failure cases around XSDT validation.

The patch to restore wrong code.
Comment 13 Lv Zheng 2014-04-16 02:04:57 UTC
Here are the test requests:

1. Apply attachment 132411 [details], do a build/boot test and post result here.
2. Apply attachment 132421 [details], do a build/boot test and post result here.
3. If both of the above tests fail, apply both attachment 132411 [details] and attachment 132421 [details], do a build/boot test and post result here.

Thanks in advance.
Comment 14 Bruce Chiarelli 2014-04-16 07:34:42 UTC
Created attachment 132441 [details]
acpidump.txt (Gigabyte GA-Z87X-UD5H)

This file is generated from the acpidump utility in the kernel sources. The one at acpica.org (both the current release and from git) did not work:



[sh0e@topd0g bugrep]$ sudo acpidump
Could not get ACPI tables, AE_BAD_ADDRESS
[sh0e@topd0g bugrep]$ sudo ../Letöltések/linux-stable/tools/power/acpi/acpidump > acpidump.txt
[sh0e@topd0g bugrep]$ acpixtract -a ./acpidump.txt 
<..Succeeds..>
[sh0e@topd0g bugrep]$ sudo iasl -d rsdp.dat 

Intel ACPI Component Architecture
ASL Optimizing Compiler version 20140325-64 [Apr 16 2014]
Copyright (c) 2000 - 2014 Intel Corporation

Loading Acpi table from file   rsdp.dat - Length 00000036 (000024)
Acpi Data Table [RSD ] decoded
Formatted output:  rsdp.dsl - 1046 bytes
[sh0e@topd0g bugrep]$ cat rsdp.dsl
<...>
[000h 0000   8]                    Signature : "RSD PTR "
[008h 0008   1]                     Checksum : 86
[009h 0009   6]                       Oem ID : "ALASKA"
[00Fh 0015   1]                     Revision : 02
[010h 0016   4]                 RSDT Address : BA1AB028
[014h 0020   4]                       Length : 00000024
[018h 0024   8]                 XSDT Address : 00000000BA1AB078
[020h 0032   1]            Extended Checksum : E0
[021h 0033   3]                     Reserved : 000000

Raw Table Data: Length 36 (0x24)

  0000: 52 53 44 20 50 54 52 20 86 41 4C 41 53 4B 41 02  RSD PTR .ALASKA.
  0010: 28 B0 1A BA 24 00 00 00 78 B0 1A BA 00 00 00 00  (...$...x.......
  0020: E0 00 00 00                                      ....
[sh0e@topd0g bugrep]$ sudo acpidump -a BA1AB028
BA1AB028: Could not convert to a physical address
[sh0e@topd0g bugrep]$ sudo acpidump -a 0xBA1AB028
Could not get table at 0x00000000BA1AB028, AE_BAD_ADDRESS  
[sh0e@topd0g bugrep]$ sudo ../Letöltések/linux-stable/tools/power/acpi/acpidump -a 0xBA1AB028
ACPI tables were not found. If you know location of RSD PTR table (from dmesg, etc), supply it with either --addr or -a option
[sh0e@topd0g bugrep]$ sudo ../Letöltések/linux-stable/tools/power/acpi/acpidump --addr 0xBA1AB028
ACPI tables were not found. If you know location of RSD PTR table (from dmesg, etc), supply it with either --addr or -a option
[sh0e@topd0g bugrep]$ dmesg | grep RSD
[    0.000000] ACPI: RSDP 00000000000f0490 000024 (v02 ALASKA)
[    0.000000] ACPI Warning: BIOS XSDT has NULL entry, using RSDT (20131115/tbutils-492)
[    0.000000] ACPI: RSDT 00000000ba1ab028 000048 (v01 ALASKA    A M I 01072009 MSFT 00010013)
[sh0e@topd0g bugrep]$ dmesg | grep XSDT
[    0.000000] ACPI Warning: BIOS XSDT has NULL entry, using RSDT (20131115/tbutils-492)
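
The raw RSDP bytes shown in the iasl output above can also be decoded by hand. This is a minimal sketch, assuming the ACPI 2.0+ RSDP layout (36 bytes, little-endian); the helper name dump_to_bytes and the regular expression are illustrative. It checks that the first 20 bytes and the full 36 bytes each sum to zero modulo 256, which is how the two RSDP checksums are defined.

```python
import re
import struct

# Raw table data copied from the iasl output in this comment.
RAW_DUMP = """\
  0000: 52 53 44 20 50 54 52 20 86 41 4C 41 53 4B 41 02  RSD PTR .ALASKA.
  0010: 28 B0 1A BA 24 00 00 00 78 B0 1A BA 00 00 00 00  (...$...x.......
  0020: E0 00 00 00                                      ....
"""

# Offset, colon, then up to 16 two-digit hex bytes; the ASCII column is ignored.
LINE_RE = re.compile(r"^\s*[0-9A-Fa-f]{4}: ((?:[0-9A-Fa-f]{2} ){1,16})")


def dump_to_bytes(text):
    """Convert 'offset: hex bytes  ascii' dump lines into a bytes object."""
    data = bytearray()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            data += bytes.fromhex(match.group(1).replace(" ", ""))
    return bytes(data)


rsdp = dump_to_bytes(RAW_DUMP)

# signature, checksum, OEM ID, revision, RSDT address, length,
# XSDT address, extended checksum, 3 reserved bytes
(signature, checksum, oem_id, revision, rsdt_address,
 length, xsdt_address, ext_checksum, _reserved) = struct.unpack(
    "<8sB6sBIIQB3s", rsdp)

print(f"RSDT address: 0x{rsdt_address:08X}")
print(f"XSDT address: 0x{xsdt_address:016X}")
print("ACPI 1.0 checksum valid:", sum(rsdp[:20]) % 256 == 0)
print("Extended checksum valid:", sum(rsdp[:length]) % 256 == 0)
```

For this dump both checksums are valid, so the RSDP itself is intact and the decoded addresses agree with the rsdp.dsl fields above.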
Comment 15 Bruce Chiarelli 2014-04-16 08:02:16 UTC
Neither of the patches fixes the problem for me, nor do they work together.
Comment 16 Spyros Stathopoulos 2014-04-16 09:42:30 UTC
Same situation as above. The stock acpidump produces the same error, probably because the kernel was not built with CONFIG_ACPI_PROCFS. The acpidump from the kernel tree works, and everything is pretty much as Bruce described (Gigabyte B85-HD3). I can upload my acpidump output as well, if required.

Neither patch works, standalone or together. The kernel panics and the tail of the call trace is:

? acpi_find_root_pointer+0x116/0x158
? early_idt_handlers+0x120/0x120
acpi_initialize_tables+0x57/0x59
acpi_table_init+0x1b/0x99
acpi_boot_table_init+0x1e/0x85
setup_arch+0x992/0x452
? early_idt_handlers+0x120/0x120
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x13e/0x14d

Code: 00 00 00 41 bd 04 00 00 00 e0 54 73 9f ff 48 c7 c2 74
81 48 89 c1 be 00 02 00 00 48 c7 c7 b0 c0 64 81 31 c0 e8 20
a6 9f ff <41> 8b 5c 24 10 be 24 00 00 00 48 89 df e8 64 23
bb ff 48 85 c0

RIP acpi_tb_parse_root_table+0x120/0x2d2
RSP
CR2
Comment 17 Lv Zheng 2014-04-17 04:06:14 UTC
Hi,

(In reply to Spyros Stathopoulos from comment #16)
> Same situation with above. Stock acpidump

What does stock acpidump mean?  This sounds strange.
Do you mean the acpica package downloaded from this URL:
https://acpica.org/downloads
It includes the newest acpidump, which can do what I described.
You should be able to obtain the RSDT/XSDT using the acpidump included in this package.

> produces the same error,
> probably because kernel was not built with CONFIG_ACPI_PROCFS.

Or possibly the same reason as the panic.
See my replies below.

There are 2 possible ways for acpidump to dump tables:
1. From /sys/firmware/acpi/tables, this only requires CONFIG_ACPI
   Command line "acpidump -c" will act in this way.
2. From /dev/mem, no CONFIG_ACPI is required.
   Command line "acpidump" will act in this way.
   However, it might dump dynamic tables from /sys/firmware/acpi/tables/dynamic.
   I don't know if this is the cause of the error.
   In this mode, acpidump will determine whether RSDT or XSDT should be used.
   So it faces the same problem as the kernel:
   If XSDT address was bad, acpidump would fail.

So no CONFIG_ACPI_PROCFS is required.
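
The RSDT/XSDT decision mentioned in point 2 can be modelled as follows. This is a simplified sketch, not the actual acpidump or kernel code: the function name choose_root_table, its parameters and the xsdt_is_usable callback are hypothetical, and the revision and non-zero address checks are assumptions about how such a selection is typically made.

```python
def choose_root_table(rsdp_revision, rsdt_address, xsdt_address,
                      xsdt_is_usable, force_rsdt=False):
    """Return ('XSDT', address) when the XSDT can be used, else ('RSDT', address).

    xsdt_is_usable is a caller-supplied check (hypothetical) that reports
    whether the table at the given address can be mapped and validated.
    """
    if (not force_rsdt
            and rsdp_revision >= 2
            and xsdt_address != 0
            and xsdt_is_usable(xsdt_address)):
        return ("XSDT", xsdt_address)
    return ("RSDT", rsdt_address)


# Addresses taken from the RSDP posted in comment 14.
RSDT = 0xBA1AB028
XSDT = 0xBA1AB078

# A firmware whose XSDT fails validation falls back to the RSDT.
print(choose_root_table(2, RSDT, XSDT, lambda address: False))
# Forcing the RSDT gives the same result without looking at the XSDT.
print(choose_root_table(2, RSDT, XSDT, lambda address: True, force_rsdt=True))
```

The force_rsdt argument plays the role of an override that skips the XSDT entirely; a tool without such an override, or without a working fallback, fails as soon as the XSDT check itself goes wrong.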

> acpidump from the
> kernel tree works and everything is pretty much the same as with what Bruce
> described (Gigabyte B85-HD3). I can upload my acpidump as well, if required.

It might not help, but please upload the acpidump output first.
We may also need RSDP information to check the alignment of the addresses.
The RSDP can only be dumped using recent acpica.org acpidump.

> Neither patches work, standalone or together.

The 2 patches should have restored all original code.
The result possibly means a good thing - the commit itself might be correct.
Problem might be caused by the platform.

> Kernel panics and the tail of the call trace is:
> 
> ? acpi_find_root_pointer+0x116/0x158
> ? early_idt_handlers+0x120/0x120
> acpi_initialize_tables+0x57/0x59
> acpi_table_init+0x1b/0x99
> acpi_boot_table_init+0x1e/0x85
> setup_arch+0x992/0x452
> ? early_idt_handlers+0x120/0x120
> x86_64_start_reservations+0x2a/0x2c
> x86_64_start_kernel+0x13e/0x14d
> 
> Code: 00 00 00 41 bd 04 00 00 00 e0 54 73 9f ff 48 c7 c2 74
> 81 48 89 c1 be 00 02 00 00 48 c7 c7 b0 c0 64 81 31 c0 e8 20
> a6 9f ff <41> 8b 5c 24 10 be 24 00 00 00 48 89 df e8 64 23
> bb ff 48 85 c0
> 
> RIP acpi_tb_parse_root_table+0x120/0x2d2
> RSP
> CR2

"RIP acpi_tb_parse_root_table+0x120/0x2d2" looks useful.
This belongs to acpi_tb_validate_xsdt.
I found this corresponds to "table->length".
It looks like a misalignment issue, or the XSDT address is not valid.
Please upload drivers/acpi/acpica/tbutils.o here; it would help.

I am starting to think that some ACPICA commits updated the kernel's RSDT force mechanism but did not correctly update the kernel quirks.
Let me check.
Comment 18 Lv Zheng 2014-04-17 04:14:59 UTC
Hi,

Please check if booting kernel with "acpi=rsdt" can fix this issue.
If it can, please upload dmidecode for the platforms that suffer from this issue here.

Thanks in advance.
Comment 19 Lv Zheng 2014-04-17 04:34:22 UTC
Hi

(In reply to Bruce Chiarelli from comment #14)
> Created attachment 132441 [details]
> acpidump.txt (Gigabyte GA-Z87X-UD5H)
> 
> This file is generated from the acpidump utility in the kernel sources. The
> one at acpica.org (both the current release and from git) did not work:
> 
> 
> 
> [sh0e@topd0g bugrep]$ sudo acpidump
> Could not get ACPI tables, AE_BAD_ADDRESS
> [sh0e@topd0g bugrep]$ sudo
> ../Letöltések/linux-stable/tools/power/acpi/acpidump > acpidump.txt
> [sh0e@topd0g bugrep]$ acpixtract -a ./acpidump.txt 
> <..Succeeds..>
> [sh0e@topd0g bugrep]$ sudo iasl -d rsdp.dat 
> 
> Intel ACPI Component Architecture
> ASL Optimizing Compiler version 20140325-64 [Apr 16 2014]
> Copyright (c) 2000 - 2014 Intel Corporation
> 
> Loading Acpi table from file   rsdp.dat - Length 00000036 (000024)
> Acpi Data Table [RSD ] decoded
> Formatted output:  rsdp.dsl - 1046 bytes
> [sh0e@topd0g bugrep]$ cat rsdp.dsl
> <...>
> [000h 0000   8]                    Signature : "RSD PTR "
> [008h 0008   1]                     Checksum : 86
> [009h 0009   6]                       Oem ID : "ALASKA"
> [00Fh 0015   1]                     Revision : 02
> [010h 0016   4]                 RSDT Address : BA1AB028
> [014h 0020   4]                       Length : 00000024
> [018h 0024   8]                 XSDT Address : 00000000BA1AB078
> [020h 0032   1]            Extended Checksum : E0
> [021h 0033   3]                     Reserved : 000000
> 
> Raw Table Data: Length 36 (0x24)
> 
>   0000: 52 53 44 20 50 54 52 20 86 41 4C 41 53 4B 41 02  RSD PTR .ALASKA.
>   0010: 28 B0 1A BA 24 00 00 00 78 B0 1A BA 00 00 00 00  (...$...x.......
>   0020: E0 00 00 00                                      ....
> [sh0e@topd0g bugrep]$ sudo acpidump -a BA1AB028
> BA1AB028: Could not convert to a physical address
> [sh0e@topd0g bugrep]$ sudo acpidump -a 0xBA1AB028
> Could not get table at 0x00000000BA1AB028, AE_BAD_ADDRESS  

This seems to be the output of acpica.org acpidump.
This failure means:
There are NULL entries in XSDT.

> [sh0e@topd0g bugrep]$ sudo
> ../Letöltések/linux-stable/tools/power/acpi/acpidump -a 0xBA1AB028
> ACPI tables were not found. If you know location of RSD PTR table (from
> dmesg, etc), supply it with either --addr or -a option
> [sh0e@topd0g bugrep]$ sudo
> ../Letöltések/linux-stable/tools/power/acpi/acpidump --addr 0xBA1AB028
> ACPI tables were not found. If you know location of RSD PTR table (from
> dmesg, etc), supply it with either --addr or -a option
> [sh0e@topd0g bugrep]$ dmesg | grep RSD
> [    0.000000] ACPI: RSDP 00000000000f0490 000024 (v02 ALASKA)
> [    0.000000] ACPI Warning: BIOS XSDT has NULL entry, using RSDT
> (20131115/tbutils-492)
> [    0.000000] ACPI: RSDT 00000000ba1ab028 000048 (v01 ALASKA    A M I
> 01072009 MSFT 00010013)
> [sh0e@topd0g bugrep]$ dmesg | grep XSDT
> [    0.000000] ACPI Warning: BIOS XSDT has NULL entry, using RSDT
> (20131115/tbutils-492)
Comment 20 Bruce Chiarelli 2014-04-17 04:51:56 UTC
Created attachment 132531 [details]
dmidecode.txt

acpi=rsdt does indeed work for me.
Comment 21 Lv Zheng 2014-04-17 06:57:55 UTC
Created attachment 132541 [details]
acpidump: Bypass root table check when downloading a table from an address.

ACPICA patch to allow "acpidump -a" to work on broken platforms.
Comment 22 Lv Zheng 2014-04-17 06:58:48 UTC
Created attachment 132551 [details]
acpidump: Add support to force using RSDT.

ACPICA patch to allow RSDT to be used for broken platforms.
Comment 23 Lv Zheng 2014-04-17 07:00:23 UTC
Created attachment 132561 [details]
[PATCH] x86/ACPI: Add quirk mechanism to stop using broken XSDT on some platforms.

The quirk mechanism and sample.
I need you guys to add more DMI matching information to this patch.
Comment 24 Lv Zheng 2014-04-17 07:11:50 UTC
Here are the test requests:

It would be better to have acpidump fixed, since it will be upstreamed to the Linux kernel next month.  So please:
1. Download acpica package;
2. Apply attachment 132541 [details] and attachment 132551 [details] on top of recent acpica;
3. Build acpidump and try the following commands:
   # sudo acpidump -x > acpidump1.txt
   # sudo acpidump -x -x > acpidump2.txt
   # sudo acpidump -a 0xBA1AB028 > rsdt.txt
   # sudo acpidump -a 0xBA1AB078 > xsdt.txt
4. Please report back if any of these commands failed.
5. Upload acpidump1.txt here, or if the command failed to generate acpidump1.txt, upload (rsdt.txt and xsdt.txt) here.

We should also use a quirk instead of XSDT validation.
I tried to implement the quirk mechanism.  So please:
1. Apply attachment 132561 [details] on top of recent linux;
2. Modify linux-acpica/arch/x86/kernel/acpi/boot.c, you should change B85-HD3 to Z87X-UD5H
3. Build and boot the kernel without acpi=rsdt specified.
4. Post dmesg here.  If it failed, please help me to debug and complete the patch, it should be easy.

I think we should have a correct matching rule for such platforms, possibly using the BIOS version:
	Vendor: American Megatrends Inc.
	Version: F3
I'm not sure; I need your feedback.

Thanks in advance.
Comment 25 Bruce Chiarelli 2014-04-17 07:16:16 UTC
Created attachment 132571 [details]
RSDT from patched acpica.org acpidump
Comment 26 Bruce Chiarelli 2014-04-17 07:17:22 UTC
Created attachment 132581 [details]
XSDT from patched acpica.org acpidump
Comment 27 Bruce Chiarelli 2014-04-17 07:18:06 UTC
Created attachment 132591 [details]
acpidump -x -x

acpidump -x fails with code 137
Comment 28 Spyros Stathopoulos 2014-04-17 07:46:31 UTC
Created attachment 132601 [details]
dmidecode output for Gigabyte B85-HD3

acpi=rsdt works for me as well, I've attached my dmidecode output from this booting kernel.
Comment 29 Bruce Chiarelli 2014-04-17 07:58:03 UTC
Created attachment 132611 [details]
dmesg output after applying 132561

It boots using this patch. 

Also, I get the same results when I change the quirk in boot.c to this:

    //  DMI_MATCH(DMI_SYS_VENDOR, "Gigabyte Technology Co., Ltd."),
    //	DMI_MATCH(DMI_PRODUCT_NAME, "Z87X-UD5H"),
    DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc." ),
    DMI_MATCH(DMI_BIOS_VERSION, "F3"),

I think this will probably catch more cases, but it's hard to say which bios version introduced the problem; at least versions F3 and F2 are affected. From the discussion on the ArchLinux bug report, the  firmware was apparently fixed somewhere between versions F4 and F6: https://bugs.archlinux.org/task/39811 . It is reported there that flashing one of the affected motherboards to F6 fixes the problem.
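
To illustrate how the two matching rules differ in breadth, here is a small model of the quirk lookup. It is a sketch only, assuming substring matching in the style of the kernel's DMI_MATCH; the table layout, the field names and the function needs_rsdt are hypothetical and do not reproduce the patch.

```python
# Each quirk entry lists DMI fields that must all match (substring match).
BIOS_QUIRKS = [
    {"bios_vendor": "American Megatrends Inc.", "bios_version": "F3"},
    {"bios_vendor": "American Megatrends Inc.", "bios_version": "F2"},
]

BOARD_QUIRKS = [
    {"sys_vendor": "Gigabyte Technology Co., Ltd.", "product_name": "Z87X-UD5H"},
]


def entry_matches(entry, system):
    """True when every field of the quirk entry occurs in the system's DMI data."""
    return all(wanted in system.get(field, "") for field, wanted in entry.items())


def needs_rsdt(system, quirks):
    """True when any quirk entry matches, meaning the XSDT should not be used."""
    return any(entry_matches(entry, system) for entry in quirks)


z87x_f3 = {
    "sys_vendor": "Gigabyte Technology Co., Ltd.",
    "product_name": "Z87X-UD5H",
    "bios_vendor": "American Megatrends Inc.",
    "bios_version": "F3",
}
# Same board after a firmware update to F6.
z87x_f6 = dict(z87x_f3, bios_version="F6")

print(needs_rsdt(z87x_f3, BIOS_QUIRKS), needs_rsdt(z87x_f3, BOARD_QUIRKS))
print(needs_rsdt(z87x_f6, BIOS_QUIRKS), needs_rsdt(z87x_f6, BOARD_QUIRKS))
```

In this model the BIOS-based rule stops matching once the firmware is updated, while the board-based rule keeps matching. With substring matching, a version string such as "F20" would also match the "F2" entry, so a rule like this may need an exact comparison.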
Comment 30 Lv Zheng 2014-04-17 08:06:53 UTC
(In reply to Bruce Chiarelli from comment #29)
> Created attachment 132611 [details]
> dmesg output after applying 132561
> 
> It boots using this patch. 
> 
> Also, I get the same results when I change the quirk in boot.c to this:
> 
>     //  DMI_MATCH(DMI_SYS_VENDOR, "Gigabyte Technology Co., Ltd."),
>     //        DMI_MATCH(DMI_PRODUCT_NAME, "Z87X-UD5H"),
>     DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc." ),
>     DMI_MATCH(DMI_BIOS_VERSION, "F3"),
> 
> I think this will probably catch more cases, but it's hard to say which bios
> version introduced the problem; at least versions F3 and F2 are affected.
> From the discussion on the ArchLinux bug report, the  firmware was
> apparently fixed somewhere between versions F4 and F6:
> https://bugs.archlinux.org/task/39811 . It is reported there that flashing
> one of the affected motherboards to F6 fixes the problem.

Thanks for the tests, I'll use this to compose the fix patch:
>     DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc." ),
>     DMI_MATCH(DMI_BIOS_VERSION, "F3"),
And
>     DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc." ),
>     DMI_MATCH(DMI_BIOS_VERSION, "F2"),
According to the link you provided. :-)

The bug has been resolved.
I'll continue to look at the XSDT validation code using the XSDT you've provided.
So let me leave it open for a while, so that if I can reproduce it I can find people here to test.

Thanks and best regards
Comment 31 Lv Zheng 2014-04-17 08:09:05 UTC
(In reply to Bruce Chiarelli from comment #27)
> Created attachment 132591 [details]
> acpidump -x -x
> 
> acpidump -x fails with code 137

This is expected. I believe it has the same cause as the kernel's failure to boot.

Thanks and best regards
-Lv
Comment 32 Spyros Stathopoulos 2014-04-17 08:20:03 UTC
Created attachment 132621 [details]
RSDT (Gigabyte B85-HD3)
Comment 33 Spyros Stathopoulos 2014-04-17 08:21:22 UTC
Created attachment 132631 [details]
XSDT (Gigabyte B85-HD3)
Comment 34 Spyros Stathopoulos 2014-04-17 08:22:02 UTC
Created attachment 132641 [details]
acpidump -x -x (Gigabyte B85-HD3)

acpidump -x fails as already reported, -x -x works
Comment 35 Spyros Stathopoulos 2014-04-17 08:23:39 UTC
I am also for the BIOS check mechanism. It seems to catch a broader array of cases. I am building the new kernel now, but from what I'm reading I expect it will work.
Comment 36 Spyros Stathopoulos 2014-04-17 08:48:24 UTC
Created attachment 132651 [details]
dmesg after patching (Gigabyte B85-HD3)

As expected the kernel boots now. Both the DMI_BIOS_VENDOR and DMI_SYS_VENDOR quirks work.
Comment 37 Lv Zheng 2014-04-17 12:46:30 UTC
(In reply to Bruce Chiarelli from comment #27)
> Created attachment 132591 [details]
> acpidump -x -x
> 
> acpidump -x fails with code 137

It seems there is only one NULL entry in it.
I'm curious whether acpidump -a 0xBA1B6D60 can dump the DSDT.
If it can, we should delete the XSDT validation code and instead skip NULL entries during table installation.
Could you give it a try?
Comment 38 Lv Zheng 2014-04-17 12:53:45 UTC
Created attachment 132681 [details]
acpidump: Add support to force using RSDT.

The updated RSDT force for acpidump.
Comment 39 Lv Zheng 2014-04-17 12:55:02 UTC
Created attachment 132691 [details]
Tables: Skip NULL entries in RSDT and XSDT.

Skip NULL entries of RSDT/XSDT in ACPICA.
Comment 40 Lv Zheng 2014-04-17 12:56:52 UTC
Created attachment 132701 [details]
[PATCH] ACPICA: Tables: Skip NULL entries in RSDT and XSDT.

Linuxized attachment 132691 [details].
Note this is generated for the latest acpica release.
For old kernels, you may need to rebase it, or delete the diff blocks for oslinuxtbl.c.
Comment 41 Lv Zheng 2014-04-17 12:59:46 UTC
(In reply to Spyros Stathopoulos from comment #36)
> Created attachment 132651 [details]
> dmesg after patching (Gigabyte B85-HD3)
> 
> As expected the kernel boots now. Both the DMI_BIOS_VENDOR and
> DMI_SYS_VENDOR quirks work.

Could you give attachment 132701 [details] a try?

1. Apply it on top of recent Linux (may require rebase work)
2. Build the kernel
3. Boot the kernel without acpi=rsdt

If we can dump DSDT by acpidump -a "the first entry in XSDT", the XSDT validation code in fact should be deleted.
Comment 42 Lv Zheng 2014-04-17 13:03:47 UTC
(In reply to Spyros Stathopoulos from comment #34)
> Created attachment 132641 [details]
> acpidump -x -x (Gigabyte B85-HD3)
> 
> acpidump -x fails as already reported, -x -x works

I also need tests for new acpidump.

Please:

1. Download acpica package;
2. Apply attachment 132541 [details], attachment 132681 [details], attachment 132691 [details] on top of recent acpica;
3. Build acpidump and try the following commands:
   # sudo acpidump > acpidump1.txt
   # sudo acpidump -x > acpidump2.txt
   # sudo acpidump -x -x > acpidump3.txt
   # sudo acpidump -a <RSDT Address> > rsdt.txt
   # sudo acpidump -a <XSDT Address> > xsdt.txt
   # sudo acpidump -a <First entry in XSDT> > dsdt.txt
4. Please report back if any of these commands failed.
Comment 43 Lv Zheng 2014-04-17 13:47:05 UTC
The first entry should be Fadt not Dsdt, sorry for the wrong statement.
Comment 44 Lv Zheng 2014-04-17 15:31:26 UTC
Created attachment 132731 [details]
[PATCH] ACPICA: Tables: Skip NULL entries in RSDT and XSDT.

v3.14 rebased result for attachment 132701 [details].
Comment 45 Łukasz Twarduś 2014-04-17 16:15:39 UTC
(In reply to Lv Zheng from comment #30)
> (In reply to Bruce Chiarelli from comment #29)
> > Created attachment 132611 [details]
> > dmesg output after applying 132561
> > 
> > It boots using this patch. 
> > 
> > Also, I get the same results when I change the quirk in boot.c to this:
> > 
> >     //  DMI_MATCH(DMI_SYS_VENDOR, "Gigabyte Technology Co., Ltd."),
> >     //      DMI_MATCH(DMI_PRODUCT_NAME, "Z87X-UD5H"),
> >     DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc." ),
> >     DMI_MATCH(DMI_BIOS_VERSION, "F3"),
> > 
> > I think this will probably catch more cases, but it's hard to say which
> bios
> > version introduced the problem; at least versions F3 and F2 are affected.
> > From the discussion on the ArchLinux bug report, the  firmware was
> > apparently fixed somewhere between versions F4 and F6:
> > https://bugs.archlinux.org/task/39811 . It is reported there that flashing
> > one of the affected motherboards to F6 fixes the problem.
> 
> Thanks for the tests, I'll use this to compose the fix patch:
> >     DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc." ),
> >     DMI_MATCH(DMI_BIOS_VERSION, "F3"),
> And
> >     DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc." ),
> >     DMI_MATCH(DMI_BIOS_VERSION, "F2"),
> According to the link you provided. :-)
> 
> The bug has been resolved.
> I'll continue to look at the XSDT validation code using the XSDT you've
> provided.
> So let me open it for a while so that if I can reproduce it I can find
> people here to test.
> 
> Thanks and best regards

i7-4770
Gigabyte Z87-HD3

I had BIOS F4; updating to F7 solved the issue. With F4 I was able to boot kernel 3.14+ only with acpi=off or acpi=rsdt.
Comment 46 Spyros Stathopoulos 2014-04-17 16:33:50 UTC
Created attachment 132741 [details]
acpidump1.txt (acpidump), after 3rd patch (B85-HD3)
Comment 47 Spyros Stathopoulos 2014-04-17 16:34:15 UTC
Created attachment 132751 [details]
acpidump2.txt (acpidump -x), after 3rd patch (B85-HD3)
Comment 48 Spyros Stathopoulos 2014-04-17 16:34:36 UTC
Created attachment 132761 [details]
acpidump3.txt (acpidump -x -x), after 3rd patch (B85-HD3)
Comment 49 Spyros Stathopoulos 2014-04-17 16:35:59 UTC
Created attachment 132771 [details]
RSDT after 3rd patch (B85-HD3)
Comment 50 Spyros Stathopoulos 2014-04-17 16:36:29 UTC
Created attachment 132781 [details]
XSDT after 3rd patch (B85-HD3)
Comment 51 Spyros Stathopoulos 2014-04-17 16:45:45 UTC
There was only one entry in xsdt.txt, with address 0xBA5ED078. Running acpidump again with the same address gave me the contents of xsdt.txt again. So the FADT would be the same as the XSDT. Could I be misunderstanding something?
Comment 52 Lv Zheng 2014-04-17 17:46:18 UTC
The decoding result of RSDT:

[000h 0000   4]                    Signature : "RSDT"    [Root System Description Table]
[004h 0004   4]                 Table Length : 00000048
[008h 0008   1]                     Revision : 01
[009h 0009   1]                     Checksum : 49
[00Ah 0010   6]                       Oem ID : "ALASKA"
[010h 0016   8]                 Oem Table ID : "A M I"
[018h 0024   4]                 Oem Revision : 01072009
[01Ch 0028   4]              Asl Compiler ID : "MSFT"
[020h 0032   4]        Asl Compiler Revision : 00010013

[024h 0036   4]       ACPI Table Address   0 : BA5ED0F0
[028h 0040   4]       ACPI Table Address   1 : BA5F8290
[02Ch 0044   4]       ACPI Table Address   2 : BA5F8308
[030h 0048   4]       ACPI Table Address   3 : BA5F8848
[034h 0052   4]       ACPI Table Address   4 : BA5F9320
[038h 0056   4]       ACPI Table Address   5 : BA5F9360
[03Ch 0060   4]       ACPI Table Address   6 : BA5F9398
[040h 0064   4]       ACPI Table Address   7 : BA5F9708
[044h 0068   4]       ACPI Table Address   8 : BA5FC9A8

The decoding result of XSDT:
[000h 0000   4]                    Signature : "XSDT"    [Extended System Description Table]
[004h 0004   4]                 Table Length : 00000074
[008h 0008   1]                     Revision : 01
[009h 0009   1]                     Checksum : 18
[00Ah 0010   6]                       Oem ID : "ALASKA"
[010h 0016   8]                 Oem Table ID : "A M I"
[018h 0024   4]                 Oem Revision : 01072009
[01Ch 0028   4]              Asl Compiler ID : "AMI "
[020h 0032   4]        Asl Compiler Revision : 00010013

[024h 0036   8]       ACPI Table Address   0 : 00000000BA5F8180
[02Ch 0044   8]       ACPI Table Address   1 : 00000000BA5F8290
[034h 0052   8]       ACPI Table Address   2 : 00000000BA5F8308
[03Ch 0060   8]       ACPI Table Address   3 : 00000000BA5F8848
[044h 0068   8]       ACPI Table Address   4 : 00000000BA5F9320
[04Ch 0076   8]       ACPI Table Address   5 : 00000000BA5F9360
[054h 0084   8]       ACPI Table Address   6 : 00000000BA5F9398
[05Ch 0092   8]       ACPI Table Address   7 : 00000000BA5F9708
[064h 0100   8]       ACPI Table Address   8 : 00000000BA5FC9A8
[06Ch 0108   8]       ACPI Table Address   9 : 0000000000000000

There are only 2 differences:
1. ACPI Table Address 9: This is a NULL entry. The original solution stops using the XSDT and falls back to the RSDT if it detects a NULL entry.
2. ACPI Table Address 0: The two tables point to different addresses, but both should point to a FADT.
acpidump1.txt contains the FADT in XSDT (the default behavior, RSDT is not forced):
FACP @ 0x00000000BA5F8180
  0000: 46 41 43 50 0C 01 00 00 05 4B 41 4C 41 53 4B 41  FACP.....KALASKA
  0010: 41 20 4D 20 49 00 00 00 09 20 07 01 41 4D 49 20  A M I.... ..AMI 
  0020: 13 00 01 00 80 F0 6C BA 78 D1 5E BA 01 01 09 00  ......l.x.^.....
  0030: B2 00 00 00 A0 A1 00 00 00 18 00 00 00 00 00 00  ................
  0040: 04 18 00 00 00 00 00 00 50 18 00 00 08 18 00 00  ........P.......
  0050: 20 18 00 00 00 00 00 00 04 02 01 04 10 00 00 00   ...............
  0060: 65 00 39 00 00 04 10 00 00 00 0D 00 32 11 00 00  e.9.........2...
  0070: A5 84 03 00 01 08 00 00 F9 0C 00 00 00 00 00 00  ................
  0080: 06 00 00 00 00 00 00 00 00 00 00 00 78 D1 5E BA  ............x.^.
  0090: 00 00 00 00 01 20 00 02 00 18 00 00 00 00 00 00  ..... ..........
  00A0: 01 00 00 02 00 00 00 00 00 00 00 00 01 10 00 02  ................
  00B0: 04 18 00 00 00 00 00 00 01 00 00 02 00 00 00 00  ................
  00C0: 00 00 00 00 01 08 00 01 50 18 00 00 00 00 00 00  ........P.......
  00D0: 01 20 00 03 08 18 00 00 00 00 00 00 01 80 00 01  . ..............
  00E0: 20 18 00 00 00 00 00 00 01 00 00 01 00 00 00 00   ...............
  00F0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0100: 00 00 00 00 00 00 00 00 00 00 00 00              ............
acpidump2.txt contains the FADT in RSDT (the behavior to force RSDT):
FACP @ 0x00000000BA5ED0F0
  0000: 46 41 43 50 84 00 00 00 02 A7 41 4C 41 53 4B 41  FACP......ALASKA
  0010: 41 20 4D 20 49 00 00 00 09 20 07 01 41 4D 49 20  A M I.... ..AMI 
  0020: 13 00 01 00 40 F0 6C BA 78 D1 5E BA 01 01 09 00  ....@.l.x.^.....
  0030: B2 00 00 00 A0 A1 00 00 00 18 00 00 00 00 00 00  ................
  0040: 04 18 00 00 00 00 00 00 50 18 00 00 08 18 00 00  ........P.......
  0050: 20 18 00 00 00 00 00 00 04 02 01 04 10 00 00 00   ...............
  0060: 65 00 E9 03 00 04 10 00 00 00 0D 00 32 11 00 00  e...........2...
  0070: A5 84 03 00 01 08 00 00 F9 0C 00 00 00 00 00 00  ................
  0080: 06 00 00 00                                      ....
So no need to do any acpidump test.

It seems both modes work!  What we need to do is ignore the NULL entry.
The solution is implemented in patch - attachment 132691 [details].
This should be better than the quirk solution.
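
The comparison above, and the idea of skipping the NULL entry, can be sketched as follows. This is an illustration only, not the code in attachment 132691: the helpers build_table and table_entries are hypothetical, and the table images are rebuilt from the decoded addresses above with every header field except the signature and length zeroed.

```python
import struct

HEADER_SIZE = 36  # size of the common ACPI table header


def build_table(signature, entry_size, addresses):
    """Build a minimal table image: signature, length, zeroed header rest, entries."""
    fmt = "<I" if entry_size == 4 else "<Q"
    body = b"".join(struct.pack(fmt, address) for address in addresses)
    length = HEADER_SIZE + len(body)
    return signature + struct.pack("<I", length) + bytes(HEADER_SIZE - 8) + body


def table_entries(table, entry_size):
    """Return the non-NULL addresses of an RSDT (4-byte) or XSDT (8-byte entries)."""
    (length,) = struct.unpack_from("<I", table, 4)
    fmt = "<I" if entry_size == 4 else "<Q"
    found = []
    for offset in range(HEADER_SIZE, length - entry_size + 1, entry_size):
        (address,) = struct.unpack_from(fmt, table, offset)
        if address == 0:
            continue  # skip the NULL entry instead of rejecting the whole table
        found.append(address)
    return found


# Entries 1..8 are identical in both decoded tables above.
COMMON = [0xBA5F8290, 0xBA5F8308, 0xBA5F8848, 0xBA5F9320,
          0xBA5F9360, 0xBA5F9398, 0xBA5F9708, 0xBA5FC9A8]
rsdt = build_table(b"RSDT", 4, [0xBA5ED0F0] + COMMON)
xsdt = build_table(b"XSDT", 8, [0xBA5F8180] + COMMON + [0])

rsdt_entries = table_entries(rsdt, 4)
xsdt_entries = table_entries(xsdt, 8)
differing = [i for i, (a, b) in enumerate(zip(rsdt_entries, xsdt_entries)) if a != b]

print(f"RSDT length 0x{len(rsdt):X}, XSDT length 0x{len(xsdt):X}")
print("usable XSDT entries:", len(xsdt_entries))
print("entries that differ:", differing)
```

The rebuilt images have the same lengths as the decoded tables (0x48 and 0x74), which corresponds to nine 4-byte entries and ten 8-byte entries after the 36-byte header. Once the trailing NULL entry is skipped, the two tables differ only in entry 0, the FADT pointer.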
Comment 53 Lv Zheng 2014-04-17 17:48:59 UTC
Here is the test request for my new solution; this is not a quirk solution.

1. Apply attachment 132731 [details] on top of v3.14 Linux;
2. Build the kernel;
3. Boot the kernel without acpi=rsdt;
4. Post the dmesg here.

Thanks in advance.
Comment 54 Lv Zheng 2014-04-17 18:00:49 UTC
Created attachment 132801 [details]
[PATCH] x86/ACPI: Add quirk mechanism to stop using broken XSDT on some platforms.

The final version of the quirk.
It is posted for those who want it.
Comment 55 Lv Zheng 2014-04-17 18:02:50 UTC
Here is the test request for the quirk solution.

1. Apply attachment 132801 [details] on top of v3.14 Linux;
2. Build the kernel;
3. Boot the kernel without acpi=rsdt;
4. Post the dmesg here.

Thanks in advance.
Comment 56 Lv Zheng 2014-04-17 18:12:53 UTC
I noticed that a better quirk might be to check whether the XSDT was compiled by an AMI compiler...
[01Ch 0028   4]              Asl Compiler ID : "AMI "
Such an XSDT seems to be generated by the AMI compiler, not MSFT.

Anyway, if comment 53 works, we really don't need any quirk.
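For reference, a check on the compiler ID could look roughly like the sketch below. This is only an illustration of reading the 4-byte field at offset 01Ch of a table header, not part of any attached patch; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_COMPILER_ID_OFFSET 0x1Cu /* "Asl Compiler ID" field offset */
#define SKETCH_COMPILER_ID_LEN    4u    /* 4 ASCII bytes, space padded */

/*
 * Return true when the table header's compiler ID equals id, which must
 * be exactly 4 characters (for example "AMI " with a trailing space).
 */
bool sketch_table_compiled_by(const uint8_t *table, size_t len, const char *id)
{
    assert(table != NULL && id != NULL);

    if (len < SKETCH_COMPILER_ID_OFFSET + SKETCH_COMPILER_ID_LEN)
        return false; /* buffer too short to contain the field */
    if (strlen(id) != SKETCH_COMPILER_ID_LEN)
        return false; /* caller passed an ID of the wrong width */

    return memcmp(table + SKETCH_COMPILER_ID_OFFSET, id,
                  SKETCH_COMPILER_ID_LEN) == 0;
}
```

A quirk built on this would call something like `sketch_table_compiled_by(xsdt, xsdt_len, "AMI ")` before deciding whether to trust the XSDT.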
Comment 57 Bruce Chiarelli 2014-04-17 19:30:23 UTC
Created attachment 132811 [details]
dmesg after applying 132731

The non-quirk patch works here. :)
Comment 58 Bruce Chiarelli 2014-04-17 19:44:47 UTC
Created attachment 132821 [details]
dmesg after applying 132801 on a clean v3.14 tree

... and the quirk patch works as well for me. Each time, the patches were applied against a clean v3.14 tree.
Comment 59 Lv Zheng 2014-04-17 21:52:13 UTC
The bug has been solved.  I'll contact maintainer.  :)
Comment 60 Spyros Stathopoulos 2014-04-17 22:07:18 UTC
Can confirm too!
Comment 61 Lv Zheng 2014-04-18 01:42:15 UTC
Hi, Bruce Chiarelli

Another request from my team:

I didn't reproduce the issue in my environment using the XSDT provided by you.
The invocation of acpi_tb_validate_xsdt using your XSDT returned AE_NULL_ENTRY and RSDT was selected after that.

So could you please provide the drivers/acpi/acpica/tbutils.o that corresponds to this panic log:

===
? acpi_find_root_pointer+0x116/0x158
? early_idt_handlers+0x120/0x120
acpi_initialize_tables+0x57/0x59
acpi_table_init+0x1b/0x99
acpi_boot_table_init+0x1e/0x85
setup_arch+0x992/0x452
? early_idt_handlers+0x120/0x120
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x13e/0x14d

Code: 00 00 00 41 bd 04 00 00 00 e0 54 73 9f ff 48 c7 c2 74
81 48 89 c1 be 00 02 00 00 48 c7 c7 b0 c0 64 81 31 c0 e8 20
a6 9f ff <41> 8b 5c 24 10 be 24 00 00 00 48 89 df e8 64 23
bb ff 48 85 c0

RIP acpi_tb_parse_root_table+0x120/0x2d2
RSP
CR2
===

This might help find other potential issues exposed by the bisected commit.
Comment 62 Bruce Chiarelli 2014-04-18 02:51:29 UTC
Created attachment 132871 [details]
tbutils.o from plain v3.14

That panic message was what Spyros Stathopoulos had reported; what I see is a bit different (I didn't have anything to capture it, so I typed this on another computer by sight. Beware.):


000
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff817b3c4d>] dump_stack+0x45/0x56
[    0.000000]  [<ffffffff81ce21a1>] early_idt_handler+0x81/0xa8
[    0.000000]  [<ffffffff812e03a5>] ? __const_udelay+0x15/0x30
[    0.000000]  [<ffffffff817afe3e>] panic+0x1b4/0x1bc
[    0.000000]  [<ffffffff81048aa0>] do_exit+0x990/0xaf0
[    0.000000]  [<ffffffff817bde15>] oops_end+0x85/0xc0
[    0.000000]  [<ffffffff817af8ae>] no_context+0x26b/0x278
[    0.000000]  [<ffffffff817af923>] __bad_area_nosemaphore+0x68/0x1bf
[    0.000000]  [<ffffffff817afa88>] bad_area_nosemaphore+0xe/0x10
[    0.000000]  [<ffffffff817c013c>] __do_page_fault+0x8c/0x510
[    0.000000]  [<ffffffff817b0682>] ? printk+0x4f/0x51
[    0.000000]  [<ffffffff817c05cc>] do_page_fault+0xc/0x10
[    0.000000]  [<ffffffff817bd3e2>] page_fault+0x22/0x30
[    0.000000]  [<ffffffff81d1a8ce>] ? acpi_tb_parse_root_table+0x123/0x2e1
[    0.000000]  [<ffffffff81d1adb9>] ? acpi_find_root_pointer+0x116/0x158
[    0.000000]  [<ffffffff81d1aae3>] acpi_initialize_tables+0x57/0x59
[    0.000000]  [<ffffffff81d18bdd>] acpi_table_init+0x1b/0x99
[    0.000000]  [<ffffffff81cedf7e>] acpi_boot_table_init+0x1e/0x85
[    0.000000]  [<ffffffff81ce60f8>] setup_arch+0x922/0xc33
[    0.000000]  [<ffffffff81ce2b1a>] start_kernel+0x85/0x3ca
[    0.000000]  [<ffffffff81ce25ad>] x86_64_start_reservations+0x2a/0x2c
[    0.000000]  [<ffffffff81ce26a5>] x86_64_start_kernel+0xf6/0xf9
[    0.000000] RIP 0x0
Comment 63 Lv Zheng 2014-04-18 04:24:06 UTC
(In reply to Bruce Chiarelli from comment #62)
> Created attachment 132871 [details]
> tbutils.o from plain v3.14
> 
> That panic message was what Spyros Stathopoulos had reported; what I see is
> a bit different (I didn't have anything to capture it, so I typed this on
> another computer by sight. Beware.):

Thanks for the information.
The cause of the crash is a pointer being dereferenced after its memory was unmapped...

Let me post the fix.
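As a generic illustration of this class of bug (not the actual patch), the user-space sketch below copies every field it needs out of a mapped structure before unmapping it, so nothing is read through the stale pointer afterwards. The struct, the `sketch_map`/`sketch_unmap` helpers (which stand in for real mapping calls using `malloc`/`free`), and the `xsdt_is_valid` flag are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the root pointer structure. */
struct sketch_rsdp {
    uint32_t rsdt_address;
    uint64_t xsdt_address;
};

/* Simulate mapping firmware memory by taking a private heap copy. */
static struct sketch_rsdp *sketch_map(const struct sketch_rsdp *firmware)
{
    struct sketch_rsdp *mapped = malloc(sizeof(*mapped));
    if (mapped != NULL)
        memcpy(mapped, firmware, sizeof(*mapped));
    return mapped;
}

/* Simulate unmapping; the pointer must not be used after this call. */
static void sketch_unmap(struct sketch_rsdp *mapped)
{
    free(mapped);
}

/*
 * Choose the root table address. Both addresses are read while the
 * mapping is still live, so the fallback to the RSDT after a failed
 * XSDT validation never touches unmapped memory.
 */
uint64_t sketch_pick_root_table(const struct sketch_rsdp *firmware,
                                int xsdt_is_valid)
{
    struct sketch_rsdp *rsdp = sketch_map(firmware);
    assert(rsdp != NULL);

    uint64_t xsdt = rsdp->xsdt_address; /* read before unmapping */
    uint32_t rsdt = rsdp->rsdt_address; /* read before unmapping */

    sketch_unmap(rsdp); /* rsdp is dead from here on */

    if (xsdt != 0 && xsdt_is_valid)
        return xsdt;
    return rsdt; /* fall back to the 32-bit RSDT address */
}
```

The buggy variant of this pattern would read `rsdp->rsdt_address` only in the fallback branch, after the unmap, which is harmless when the XSDT validates and faults when it does not.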
Comment 64 Lv Zheng 2014-04-18 04:25:46 UTC
Created attachment 132891 [details]
[PATCH] ACPICA: Tables: Fix a bad pointer issue in acpi_tb_parse_root_table().

This could fix the crash issue.
However, patch 132731 is still useful for improving ACPI, based on the results of the investigation.
Comment 65 Lv Zheng 2014-04-18 04:26:31 UTC
Here is the test request for the fix of the root cause.

1. Apply attachment 132891 [details] on top of v3.14 Linux;
2. Build the kernel;
3. Boot the kernel without acpi=rsdt;
4. Post the dmesg here.

Thanks in advance.
Comment 66 Bruce Chiarelli 2014-04-18 04:40:13 UTC
Created attachment 132901 [details]
Dmesg after applying 132891
Comment 67 Spyros Stathopoulos 2014-04-18 09:39:35 UTC
Created attachment 132921 [details]
tbutils.o from panicking kernel

Hope it helps!
Comment 68 Lv Zheng 2014-04-19 09:56:36 UTC
Hi,

All required patches are upstreamed to acpica/master branch:
https://github.com/acpica/acpica/commit/e32ea55

It includes:
1. attachment 132891 [details] - this should appear in 3.14 as stable material.
2. attachment 132681 [details]
3. attachment 132691 [details] - same as attachment 132731 [details], split into 2, so that if other platforms report a regression, we can go back to the old solution.

Sorry for the mistake I made in the bisected commit, and thanks for the information about the "AMI F2-F4" platforms reported here; it really improves acpica and Linux.

I'm going to close this bug.
Comment 69 Josh Boyer 2014-04-25 12:17:55 UTC
Did attachment 132891 [details] ever get sent upstream to Linus?  It seems nothing for this issue is queued for 3.14.y or in Linus' tree.
Comment 70 Lv Zheng 2014-04-28 04:36:26 UTC
(In reply to Josh Boyer from comment #69)
> Did attachment 132891 [details] ever get sent upstream to Linus?
> It seems nothing for this issue is queued for 3.14.y or in Linus' tree.

Hi,

It is being processed.
This patch will be merged into the upstream Linux kernel, tagged for stable, during the ACPICA 201404 release process.  The release is going to happen this week or next.

Thanks and best regards
-Lv