Bug 15665 - ACPI warnings, boot hangs on ssb load unless acpi=off
Summary: ACPI warnings, boot hangs on ssb load unless acpi=off
Status: CLOSED DOCUMENTED
Alias: None
Product: ACPI
Classification: Unclassified
Component: BIOS
Hardware: All Linux
Importance: P1 blocking
Assignee: acpi_bios
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-01 08:02 UTC by Philippe De Muyter
Modified: 2010-09-29 02:09 UTC
CC List: 4 users

See Also:
Kernel Version: 2.6.33-6-desktop (opensuse 11.3 milestone 4 installed kernel)
Subsystem:
Regression: No
Bisected commit-id:


Attachments
acpidump output (73.75 KB, application/x-compressed-tar)
2010-04-01 08:02 UTC, Philippe De Muyter
5 successive netconsole ignore_loglevel logs of failed boots without acpi=off (502.10 KB, text/x-log)
2010-04-01 08:10 UTC, Philippe De Muyter
successive netconsole ignore_loglevel logs of boot failures with various acpi options (115.86 KB, text/x-log)
2010-04-01 08:12 UTC, Philippe De Muyter
output of dmesg -s64000 from acpi=off boot (48.50 KB, text/x-log)
2010-04-01 16:59 UTC, Philippe De Muyter
log of various boot attempts with selected acpi related kernel parameters (916.56 KB, text/x-log)
2010-04-02 15:39 UTC, Philippe De Muyter
.config of working vmlinuz-2.6.33-6-default kernel (111.77 KB, text/plain)
2010-04-03 15:13 UTC, Philippe De Muyter
.config of unbootable vmlinuz-2.6.33-6-desktop kernel (108.79 KB, text/plain)
2010-04-03 15:18 UTC, Philippe De Muyter

Description Philippe De Muyter 2010-04-01 08:02:10 UTC
Created attachment 25794 [details]
acpidump output

Hardware: HP Pavilion dv6-1300sb laptop (Intel Pentium Dual-Core T4300 2.1 GHz, 4 GB RAM)

I installed openSUSE 11.3 Milestone 4 from its DVD on my HP Pavilion dv6-1300sb laptop.  The installation kernel boots and works flawlessly, but the installed kernel always crashes during boot, at random points, unless acpi=off is given, and it often (but not always) crashes during boot even with acpi=off.

I have attached acpidump output (taken after a successful boot with the installed kernel) and netconsole ignore_loglevel logs of successive attempts to boot without acpi=off, or with some ACPI options I tried after browsing Documentation/kernel-parameters.txt.
Comment 1 Philippe De Muyter 2010-04-01 08:10:00 UTC
Created attachment 25795 [details]
5 successive netconsole ignore_loglevel logs of failed boots without acpi=off
Comment 2 Philippe De Muyter 2010-04-01 08:12:51 UTC
Created attachment 25796 [details]
successive netconsole ignore_loglevel logs of boot failures with various acpi options
Comment 3 Philippe De Muyter 2010-04-01 16:59:17 UTC
Created attachment 25802 [details]
output of dmesg -s64000 from acpi=off boot
Comment 4 Philippe De Muyter 2010-04-02 15:39:01 UTC
Created attachment 25821 [details]
log of various boot attempts with selected acpi related kernel parameters

here are the results of boot attempts with

acpi=ht (seems to boot successfully sometimes)
acpi=rsdt (failure)
acpi=noirq (failure)
pci=noacpi (boots sometimes)

taken using netconsole added as module via initrd

I see some [cut here] markers in those logs, but as I do not have a log of a really successful boot, I don't know if they are relevant.
Comment 5 Andrew Morton 2010-04-02 21:45:13 UTC
From which kernel version is this a regression?  What is the latest kernel version which you know to not have this problem?

Thanks.
Comment 6 Thomas Renninger 2010-04-03 00:03:41 UTC
Some BIOS bugs/warnings:
  - DMAR Register Base Address reporting a zero address
  - ACPI Error (dswload-0802): [_T_1] and related _BQC and GBQC already exists
    error messages. This may not be a BIOS bug, I'd like you to test an
    overridden DSDT for that. Unfortunately to try this recompiling the kernel
    is currently needed.
    But this is for brightness control only, we could have a look after the
    machine is booting reliably.
  - ACPI Warning for \_TZ_.TZ01._CRT

Best you check for a BIOS update first.
Due to the broken DMAR table, I'd try to search BIOS options and turn off anything related to virtualization (after BIOS update did not help).
Comment 7 Thomas Renninger 2010-04-03 00:18:56 UTC
For the brightness issue:
ACPI Error (dswload-0802): [_T_1] and related _BQC and GBQC already exists
Could you open a bug here:
http://acpica.org/bugzilla
and attach the acpidump output and one log there, please.
I may know what is going wrong, and the acpica guys may be able to fix this in parallel. It looks like _T_0 and _T_1 are pre-defined internal variables, and defining them again with Name (_T_1, Zero) may cause this message. The calling functions GBQC and _BQC may then just throw the same message, pass the same error code upwards, and not be evaluated any further. That would be an issue to fix in acpica (you can also copy this text for a first analysis).
Please add me to CC of the new bug if you open one. Thanks.
Comment 8 Philippe De Muyter 2010-04-03 15:10:51 UTC
(In reply to comment #5)
> From which kernel version is this a regression?  What is the latest kernel
> version which you know to not have this problem?
> 
> Thanks.

Sorry, I thought it was a regression because the openSUSE DVD has one kernel that is used during installation and another one that is finally installed on the hard disk.
The one used for installation works perfectly; the one installed on the hard disk crashes randomly during boot.  They now appear to be compiled from the same kernel sources but with different configurations: one called default, used for installation, and the other called desktop, installed on the hard disk.

version of default kernel says : Linux version 2.6.33-6-default (geeko@buildhost) (gcc version 4.5.0 20100311 (experimental) [trunk revision 157384] (SUSE Linux) ) #1 SMP 2010-02-25 20:06:12 +0100

I attach both config files (obtained from /proc/config.gz)
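
A minimal sketch of how two such config files could be compared option by option. The function names and the file names in the usage note are hypothetical, chosen only for illustration; nothing here comes from the attachments themselves.

```python
import gzip


def parse_config(text):
    """Map CONFIG_* option names to their values.

    Options that are commented out ("# CONFIG_FOO is not set") or absent
    are simply missing from the result.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(a, b):
    """Return {option: (value_in_a, value_in_b)} for every option that differs."""
    changed = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            changed[name] = (a.get(name), b.get(name))
    return changed


def read_config(path):
    """Read a plain .config or a gzip-compressed one such as /proc/config.gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read()
```

Assuming the two files were saved as config-default and config-desktop (hypothetical names), diff_configs(parse_config(read_config("config-default")), parse_config(read_config("config-desktop"))) lists every option whose value differs, with None meaning the option is not set in that file.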
Comment 9 Philippe De Muyter 2010-04-03 15:13:49 UTC
Created attachment 25834 [details]
.config of working vmlinuz-2.6.33-6-default kernel
Comment 10 Philippe De Muyter 2010-04-03 15:18:17 UTC
Created attachment 25835 [details]
.config of unbootable vmlinuz-2.6.33-6-desktop kernel
Comment 11 Philippe De Muyter 2010-04-03 15:40:17 UTC
(In reply to comment #6)
> Some BIOS bugs/warnings:
>   - DMAR Register Base Address reporting a zero address
>   - ACPI Error (dswload-0802): [_T_1] and related _BQC and GBQC already
>   exists
>     error messages. This may not be a BIOS bug, I'd like you to test an
>     overridden DSDT for that. Unfortunately to try this recompiling the
>     kernel
>     is currently needed.
>     But this is for brightness control only, we could have a look after the
>     machine is booting reliably.
>   - ACPI Warning for \_TZ_.TZ01._CRT
> 
> Best you check for a BIOS update first.

This is already the latest BIOS (F41 A), which I downloaded from the HP web site.
When I bought this laptop it had F36, and Linux refused to boot just as it does now.

> Due to the broken DMAR table, I'd try to search BIOS options and turn off
> anything related to virtualization (after BIOS update did not help).

Do you mean BIOS options in BIOS setup or kernel parameters?
Comment 12 Robert Moore 2010-04-05 21:11:17 UTC
(In reply to comment #7)
> For the brightness issue:
> ACPI Error (dswload-0802): [_T_1] and related _BQC and GBQC already exists

This is a bug in the BIOS and will fail on both Windows and ACPICA. The _T_0 and _T_1 symbols are created as a result of using a Switch() ASL operator. The Microsoft ASL compiler emits these symbols near where the Switch is used. However, if the Switch is used in a loop (which it is in this case), the method will fail on the second pass through the loop.

Note for BIOS writers: The iASL compiler emits such temporary variables (_T_x) at the namespace root, so that the restriction of not using a Switch() within a While() is eliminated.
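
The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions, not ACPICA or Windows interpreter code: Scope, NameExistsError and run_method are hypothetical names, and the model only captures the rule that creating an already existing name in a scope is an error.

```python
class NameExistsError(Exception):
    """Stands in for the interpreter's 'already exists' error."""


class Scope:
    """A toy namespace scope: creating the same name twice is an error."""

    def __init__(self):
        self.names = {}

    def create(self, name, value):
        if name in self.names:
            raise NameExistsError(name)
        self.names[name] = value


def run_method(passes, declare_inside_loop):
    """Run a toy method whose loop body contains a Switch().

    declare_inside_loop=True  models the temporary being emitted next to
                              the Switch, so it is created on every pass.
    declare_inside_loop=False models the temporary being created once,
                              outside the loop, and only stored to later.
    Returns the number of completed passes.
    """
    scope = Scope()
    if not declare_inside_loop:
        scope.create("_T_1", 0)
    completed = 0
    for i in range(passes):
        if declare_inside_loop:
            scope.create("_T_1", 0)  # like Name (_T_1, Zero) executed again
        scope.names["_T_1"] = i      # store the Switch() operand
        completed += 1
    return completed
```

In this model run_method(2, True) raises NameExistsError on the second pass, while run_method(2, False) completes both passes, which mirrors the difference between the two placements of the temporaries.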
Comment 13 Len Brown 2010-04-06 02:22:05 UTC
There seem to be four apparently independent issues on this box.

1. DMAR error message and stack trace

Both the working and desktop kernel configs include CONFIG_DMAR.
This appears to be a BIOS bug, but should probably be filed
as a separate bug against PCI -- it is possible that the
stack trace is not useful here and perhaps should be removed
from the kernel.

2. [Firmware Bug]: Invalid critical threshold (_CRT:%)

This one is a BIOS bug.  It is due to Linux claiming support
for "Windows 2006".  The BIOS has invalid code for Vista,
and you can get rid of this message by booting with
acpi_osi="!Windows 2006" acpi_osi="!Windows 2009"
though living with the message is also fine.

3. ACPI warnings related to \_SB_.PCI0.PEGP.VGA_ (LCD and VGA)

This is a third BIOS bug, as explained by Bob in comment #12.
If the LCD brightness controls work on this box,
then just ignore these warnings.

4. ssb b43 driver issues -- apparently causing a hang.

This, the most serious issue, is likely the only regression,
and is likely a duplicate of bug 14716.
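
For item 2 above, a small sketch of how one might check which acpi_osi= overrides a given kernel command line carries. The helper name acpi_osi_overrides is hypothetical, and shlex is used only as an approximation of how the kernel handles double-quoted parameter values.

```python
import shlex


def acpi_osi_overrides(cmdline):
    """Return the value of every acpi_osi= parameter on a kernel command line."""
    values = []
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if key == "acpi_osi" and sep:
            values.append(value)
    return values
```

On a running system the string could be read from /proc/cmdline and passed to this function; an empty list means no acpi_osi override was given at boot.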

Finally, the difference between the install kernel and
the desktop kernel is indeed a mystery.  The configs are very similar,
mostly changing support between =y and =m.
The ACPI support is identical, and both include ssb.

In summary:
1. file a PCI bug
2. documented BIOS bug
3. documented BIOS bug
4. duplicate

This sighting is closed.
Comment 14 Philippe De Muyter 2010-04-20 16:10:06 UTC
I now have more test results from multiple boot attempts: with the openSUSE desktop config the kernel never boots, while with the openSUSE default config, and also with 2.6.34-rc4, it often boots, but not always.

I thus surmise it must be a race condition, but I don't know where.  How can I gather the most useful information using alt-sysrq?
Comment 15 Thomas Renninger 2010-04-21 08:53:22 UTC
This looks like a bug in the wireless driver; it is best to continue on bug #14716.
It may still turn out to be ACPI-related if the wireless driver works with acpi=off. Anyway, this seems to be a known issue and everything points to the wireless drivers, which is why Len closed this bug as "RESOLVED DOCUMENTED".
