Bug 11884 - No battery status information - HP EliteBook 2730p
Status: CLOSED CODE_FIX
Product: ACPI
Classification: Unclassified
Component: EC
Hardware: All Linux
Importance: P1 normal
Assigned To: Alexey Starikovskiy
Reported: 2008-10-28 23:01 UTC by Alex Shumakovitch
Modified: 2009-01-16 13:19 UTC (History)
Kernel Version: 2.6.27.4
Tree: Mainline
Regression: ---


Attachments
dmesg output (47.68 KB, text/plain)
2008-10-28 23:05 UTC, Alex Shumakovitch
lcpci -v output (11.89 KB, text/plain)
2008-10-28 23:06 UTC, Alex Shumakovitch
acpidump output (461.17 KB, text/plain)
2008-10-28 23:06 UTC, Alex Shumakovitch
GCC soft lockup error messages while compiling the kernel (20.71 KB, application/octet-stream)
2008-10-29 16:24 UTC, Alex Shumakovitch
try the custom DSDT (628.64 KB, patch)
2008-10-30 02:28 UTC, ykzhao
without additional parameters (107.26 KB, text/plain)
2008-10-30 04:57 UTC, Michael Haslgrübler
dmesg with parameter acpi_no_auto_ssdt (102.13 KB, text/plain)
2008-10-30 04:58 UTC, Michael Haslgrübler
dmesg on 2.6.27.4 kernel (71.13 KB, text/plain)
2008-10-30 08:23 UTC, Michael Haslgrübler
My dmesg for 2.6.27.4 with patched DSDT (47.96 KB, application/octet-stream)
2008-10-30 11:59 UTC, Alex Shumakovitch
Soft lockup during boot process with 2.6.28-rc2 (screen shot 1) (242.26 KB, image/jpeg)
2008-10-30 13:56 UTC, Alex Shumakovitch
Soft lockup during boot process with 2.6.28-rc2 (screen shot 2) (195.46 KB, image/jpeg)
2008-10-30 13:56 UTC, Alex Shumakovitch
Soft lockup during boot process with 2.6.28-rc2 (screen shot 3) (179.10 KB, image/jpeg)
2008-10-30 13:57 UTC, Alex Shumakovitch
Enable/disable GPE under spinlock (8.55 KB, patch)
2008-10-30 14:30 UTC, Alexey Starikovskiy
Configuration file for the 2.6.28-rc2 kernel (92.95 KB, application/octet-stream)
2008-10-30 16:14 UTC, Alex Shumakovitch
Soft lockup during boot process with 2.6.28-rc2 and patch from #25 applied (screen shot 4) (212.90 KB, image/jpeg)
2008-10-30 23:29 UTC, Alex Shumakovitch
Soft lockup during boot process with 2.6.28-rc2 and patch from #25 applied (screen shot 5) (244.80 KB, image/jpeg)
2008-10-30 23:30 UTC, Alex Shumakovitch
Soft lockup during boot process with 2.6.28-rc2 and patch from #25 applied (screen shot 6) (168.42 KB, image/jpeg)
2008-10-30 23:31 UTC, Alex Shumakovitch
patch: Remove EC space handler explicitly when failing in _REG object (1.42 KB, text/x-patch)
2008-11-26 00:36 UTC, ykzhao
debug patch to find out where PSWT fails (1.09 KB, patch)
2008-12-01 00:17 UTC, Lin Ming
dmesg of the vanilla 2.6.27.7 kernel with the "fan while AC is on" BIOS option disabled (46.08 KB, application/octet-stream)
2008-12-02 23:08 UTC, Alex Shumakovitch
dmesg of the 2.6.27.7 kernel with patch from #44 applied and with the "fan while AC is on" BIOS option disabled (46.08 KB, application/octet-stream)
2008-12-02 23:09 UTC, Alex Shumakovitch
dmesg of the 2.6.27.4 kernel with patch from #44 applied and with the "fan while AC is on" BIOS option enabled (49.67 KB, application/octet-stream)
2008-12-03 14:58 UTC, Alex Shumakovitch
dmesg of the 2.6.27.4 kernel with patch from #44 applied and with the "fan while AC is on" BIOS option disabled (46.16 KB, application/octet-stream)
2008-12-03 14:59 UTC, Alex Shumakovitch
debug patch (1.90 KB, patch)
2008-12-09 02:51 UTC, Lin Ming
dmesg of the 2.6.27.4 kernel with patches from #44 and #52 applied and with the "fan while AC is on" BIOS option enabled (59.48 KB, application/octet-stream)
2008-12-10 23:24 UTC, Alex Shumakovitch
dmesg of the 2.6.27.4 kernel with patches from #44 and #52 applied and with the "fan while AC is on" BIOS option disabled (54.77 KB, application/octet-stream)
2008-12-10 23:26 UTC, Alex Shumakovitch
dmesg of the 2.6.27.4 kernel with patches from #44 and #52 applied, custom DSDT from #57 and with the "fan while AC is on" BIOS option enabled (57.24 KB, application/octet-stream)
2008-12-11 17:04 UTC, Alex Shumakovitch
dmesg of the 2.6.27.7 kernel with patches from #44 and #52 applied, custom DSDT from #57 and with the "fan while AC is on" BIOS option enabled (57.41 KB, application/octet-stream)
2008-12-11 17:06 UTC, Alex Shumakovitch
limit workarounds for ASUS (1.19 KB, patch)
2008-12-15 09:36 UTC, Alexey Starikovskiy
patch vs 2.6.29-rc1 (1.09 KB, text/x-patch)
2009-01-16 11:01 UTC, Len Brown

Description Alex Shumakovitch 2008-10-28 23:01:36 UTC
Latest working kernel version:    N/A
Earliest failing kernel version:  2.6.27.4
Distribution:                     Debian Lenny (testing)
Hardware Environment:
  HP 2730p tablet PC laptop, Core2 Duo CPU L9400  @1.86GHz, 4GB RAM

Software Environment:
  Fresh install of Debian Lenny/testing with vanilla 2.6.27.4 kernel
  from www.kernel.org. The only change made is the intel_agp patch from
  http://intellinuxgraphics.org/download.html (the system hangs up 
  otherwise).

Problem Description:
  ACPI doesn't provide battery status. In fact, the directory
  /proc/acpi/battery is missing completely. dmesg shows multiple
  ACPI errors during the boot time of the type
      ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.BAT0._BIF] (Node f7446ba8), AE_TIME

  All kernels prior to 2.6.27 that I've tried boot with the acpi=off option
  only (with only one core recognized), so I don't know whether this
  problem existed in earlier versions. I found several bug reports about
  similar problems on other types of laptops, but they all were supposed
  to be fixed in 2.6.27 by now. By the way, temperature sensors at several
  zones and frequency scaling work.
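
  A quick way to see which ACPI methods are failing is to pull the method
  path and status code out of each error line. The following is a minimal
  Python sketch, assuming the dmesg output has been saved as text and that
  the lines follow the format quoted above; the name failed_methods is made
  up for this illustration.

```python
import re

# Matches lines such as:
#   ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.BAT0._BIF] (Node f7446ba8), AE_TIME
ACPI_FAIL = re.compile(
    r"ACPI Error \((?P<where>[^)]+)\): Method parse/execution failed "
    r"\[(?P<method>[^\]]+)\] \(Node (?P<node>[0-9a-fA-F]+)\), (?P<status>AE_\w+)"
)

def failed_methods(dmesg_text):
    """Return (method path, status) pairs for every failed ACPI method."""
    return [(m.group("method"), m.group("status"))
            for m in ACPI_FAIL.finditer(dmesg_text)]

# Sample line copied from the report above.
sample = (
    "ACPI Error (psparse-0530): Method parse/execution failed "
    "[\\_SB_.BAT0._BIF] (Node f7446ba8), AE_TIME\n"
)
print(failed_methods(sample))
```

  Lines carrying a kernel timestamp prefix are matched as well, since the
  pattern is searched anywhere in the text.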
  
Steps to reproduce:
  Boot the laptop. I'm going to attach the outputs of dmesg, lspci, and 
  acpidump to this bug report for more details. I'm not sure which other
  information might be useful.

Thank you,

   --- Alex.
Comment 1 Alex Shumakovitch 2008-10-28 23:05:23 UTC
Created attachment 18487 [details]
dmesg output
Comment 2 Alex Shumakovitch 2008-10-28 23:06:26 UTC
Created attachment 18488 [details]
lcpci -v output
Comment 3 Alex Shumakovitch 2008-10-28 23:06:57 UTC
Created attachment 18489 [details]
acpidump output
Comment 4 Alexey Starikovskiy 2008-10-29 02:36:00 UTC
Alex,
Could you please check if 2.6.28-rc2 works any better? It looks like ACPI interpreter does not handle your DSDT correctly, leaving EC uninitialized.
Comment 5 Alex Shumakovitch 2008-10-29 09:49:12 UTC
(In reply to comment #4)
> Alex,
> Could you please check if 2.6.28-rc2 works any better? It looks like ACPI
> interpreter does not handle your DSDT correctly, leaving EC uninitialized.
2.6.28-rc2 freezes on boot after 
   [    1.295918] ACPI: Thermal Zone [DTSZ] (48 C)
message, that is, right before EC initialization, and produces
"BUG: soft lockup" errors approximately every minute. This is exactly the
behaviour that I've had with kernels < 2.6.27.

By the way, kernel compilation resulted in 
  Oct 29 11:53:29 helix kernel: [ 3272.768498] BUG: soft lockup - CPU#1 stuck for 61s! [gcc:3212]
after approximately one hour. I don't know whether this is related to my ACPI
troubles :-(

  --- Alex.
Comment 6 Robert Moore 2008-10-29 10:25:12 UTC
I don't see anything obviously wrong with the DSDT. It loads here correctly, and the EC _REG method works without error. Of course, this is not running on the actual hardware, so the behavior could be different.
Comment 7 Alexey Starikovskiy 2008-10-29 13:27:43 UTC
Yes, this sounds familiar -- there was a bug #11418 sounding very similar...
"hpet=disable" helped there...
Comment 8 Alex Shumakovitch 2008-10-29 15:24:12 UTC
I've tried to boot 2.6.28-rc2 with "hpet=disable", but this
didn't make a difference. In any case, the patch resolving
bug 11418 is already incorporated into the kernel, so it 
doesn't seem to be related.

On a side note, this laptop is listed as "SuSE Linux Enterprise
Desktop 10 Certified" on the HP's web page. I've tried to boot
it with OpenSuSE 11.0, but the live CD froze as well. I don't
really know whether there is any difference between "enterprise"
and "open" kernels and am a bit reluctant to give all my personal
information just to download the trial version of the
Enterprise Desktop. Does it make sense to try?

Thanks,

  --- Alex.
Comment 9 Alexey Starikovskiy 2008-10-29 15:34:55 UTC
Probably it will be the same -- HP is known to be easy with Linux certification on notebooks.
Did you try nohz=off and highres=off too? 
You could enable kernel debugging of locks in kernel config -- it may give us a clue on what is wrong...
Comment 10 Alex Shumakovitch 2008-10-29 16:21:22 UTC
nohz=off and highres=off make no difference for 2.6.28-rc2
Which options in the kernel should I enable to debug locks?
I believe I have most of them, since there is plenty of output
after the soft lock occurs. Unfortunately, I have no way to
capture it during the boot, since the laptop lacks a serial port.

With the 2.6.27.4 kernel, I now suspect that the problem is
related to video, since all lockups during kernel compilation
that I've seen (a couple already) occurred when the screen saver
was trying to kick in. Does it do anything ACPI-related?
I've just compiled the kernel two times in a row from the text VC
without problems.

Anyway, I'll attach the complete error message produced after
a gcc soft lockup in a second.

Thanks,

   --- Alex.

Comment 11 Alex Shumakovitch 2008-10-29 16:24:39 UTC
Created attachment 18506 [details]
GCC soft lockup error messages while compiling the kernel

Please let me know if this is the kind of output that you
are looking for. I'm not good at kernel debugging, sorry.
Comment 12 Michael Haslgrübler 2008-10-29 23:35:20 UTC
I can confirm this problem as I have the same system regarding software and hardware.
Comment 13 Alexey Starikovskiy 2008-10-30 00:36:08 UTC
Alex,
it's always possible to make a screenshot with any digital camera, there are plenty of them around these days...

Michael,
There are two problems here, probably related -- one is soft lockups, other is absent ACPI information. Which one bothers you?
Comment 14 Michael Haslgrübler 2008-10-30 00:53:21 UTC
Apparently both. However, I didn't use the patch mentioned in your first post. I also tried out the Release Candidate of Ubuntu 8.10, which features the 2.6.27 kernel; there I get permanent soft lockups on both CPUs, which leaves the system unbootable. This seems quite strange to me. One wild guess is that this problem could be related to the architecture, because for my Debian system I use amd64, whereas when trying Ubuntu I used i386.
Comment 15 ykzhao 2008-10-30 02:28:29 UTC
Created attachment 18515 [details]
try the custom DSDT

It seems that this is an obvious BIOS bug.
   PU1T is defined as follows:
   >Name (PU1T, Package (0x01)
        {
            Package (0x08)
            {
                0x00,
                0x0B,
                0x1F,
                0x29,
                0x33,
                0x3D,
                0x51,
                0x51
            }
        })
     But it is accessed by "DerefOf (Index (PU1T, 0x01))" in PSWT, which is called by PRIT; index 0x01 is past the end of this one-element package. In that case the OS reports the following error messages and the EC device can't be initialized correctly.
      >ACPI Error (dswstate-0097): Result stack is empty! State=f74a5e00 [20080609]
      >[    0.248422] ACPI Exception (dsutils-0645): AE_AML_NO_RETURN_VALUE, Missing or null operand [20080609]
      >[    0.248578] ACPI Exception (dsutils-0762): AE_AML_NO_RETURN_VALUE, While creating Arg 0 [20080609]

     Could you please try the custom DSDT and see whether the problem still exists? In the custom DSDT the error in PU1T is corrected.
     How to use a custom DSDT is described at
     http://www.lesswatts.org/projects/acpi/faq.php
     Thanks.
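
     To make the indexing concern concrete, here is a small Python sketch. It only models the shape of PU1T as a nested list and treats an out-of-range index as "no value"; what the real AML interpreter does in this case is not modeled, and the helper name deref_index is an assumption made for this illustration. Later comments in this thread trace the failure to a different object (CUZO), so this shows the concern raised here, not the final diagnosis.

```python
# Toy model of the package from the DSDT: one outer element holding eight values.
PU1T = [[0x00, 0x0B, 0x1F, 0x29, 0x33, 0x3D, 0x51, 0x51]]

def deref_index(package, index):
    """Model DerefOf (Index (package, index)); None stands in for an AML error."""
    if 0 <= index < len(package):
        return package[index]
    return None

print(deref_index(PU1T, 0x00))   # the inner eight-element package
print(deref_index(PU1T, 0x01))   # out of range for a one-element package
```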
Comment 16 Michael Haslgrübler 2008-10-30 04:57:06 UTC
Created attachment 18517 [details]
without additional parameters
Comment 17 Michael Haslgrübler 2008-10-30 04:58:02 UTC
Created attachment 18518 [details]
dmesg with parameter acpi_no_auto_ssdt
Comment 18 Michael Haslgrübler 2008-10-30 05:02:03 UTC
In both cases, with and without acpi_no_auto_ssdt, the folder /proc/acpi/battery/
is now available; however, the parse exception still occurs. Additionally, with acpi_no_auto_ssdt the CPU frequency scaling stopped working.

I tried with 2.6.27.2 kernel because I had it locally available. I will try it again later with 2.6.27.4.


Comment 19 Michael Haslgrübler 2008-10-30 08:23:46 UTC
Created attachment 18523 [details]
dmesg on 2.6.27.4 kernel

Now I tried out the custom DSDT with the 2.6.27.4 kernel; however, there is no improvement over 2.6.27.2.
Comment 20 Alex Shumakovitch 2008-10-30 11:59:39 UTC
Created attachment 18524 [details]
My dmesg for 2.6.27.4 with patched DSDT

Not much difference for me either. And I don't even have /proc/acpi/battery
directory.

I'm now going to try to compile 2.6.28-rc2 with this patch to see whether
it helps there.
Comment 21 Alexey Starikovskiy 2008-10-30 12:42:33 UTC
(In reply to comment #15)
Actually, access to PU1T should be prohibited by CUZO[] being set to all 0xFF in RETD, called from _SB._INI. So, the call to _SB._INI _must_ be the first function to be called. If any other _INI function is called before it -- it will fail miserably.

We could try to pre-init CUZO with {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF} instead of {} and see what happens...
Comment 22 Alex Shumakovitch 2008-10-30 13:56:29 UTC
Created attachment 18525 [details]
Soft lockup during boot process with 2.6.28-rc2 (screen shot 1)

OK, I've tried 2.6.28-rc2 with the DSDT patch. No difference.
I'm going to attach a couple of screen shots of the BUG: soft lockup
messages. The problem, of course, is that the complete error
message doesn't fit into one screen, even when I try vga=0x305
boot option.
Comment 23 Alex Shumakovitch 2008-10-30 13:56:54 UTC
Created attachment 18526 [details]
Soft lockup during boot process with 2.6.28-rc2 (screen shot 2)
Comment 24 Alex Shumakovitch 2008-10-30 13:57:11 UTC
Created attachment 18527 [details]
Soft lockup during boot process with 2.6.28-rc2 (screen shot 3)
Comment 25 Alexey Starikovskiy 2008-10-30 14:30:26 UTC
Created attachment 18528 [details]
Enable/disable GPE under spinlock

Thanks for the screenshots. Do you have Kernel hacking -> Lock debugging: prove locking correctness turned on? If not, could you try to run with it?
Could you please check this patch?
Also, it is possible to enable the DEBUG mode of the EC driver by uncommenting #define DEBUG at the very beginning of drivers/acpi/ec.c
Comment 26 Alex Shumakovitch 2008-10-30 15:07:17 UTC
This patch is for 2.6.28-rc2, right? 3 chunks were rejected for 
2.6.27.4.
Comment 27 Alexey Starikovskiy 2008-10-30 15:21:29 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> This patch is for 2.6.28-rc2, right? 3 chunks were rejected for 
> 2.6.27.4.
right. EC driver is quite different in rc2.

Comment 28 Alex Shumakovitch 2008-10-30 16:14:10 UTC
Created attachment 18531 [details]
Configuration file for the 2.6.28-rc2 kernel

OK, now the booting just stopped after the first "ACPI: Thermal Zone"
message with no output at all over the next 10 minutes. I'm attaching
my kernel config file for you to check that all right options are
selected. I will now try to compile 2.6.27.4 with "Lock debugging:
prove locking correctness" activated and DEBUG in ec.c uncommented.
Comment 29 Alex Shumakovitch 2008-10-30 16:55:38 UTC
Well, these changes (kernel option and DEBUG in ec.c) kill my 
2.6.27.4 kernel as well. At the same place. Very strange.
Comment 30 Alex Shumakovitch 2008-10-30 23:29:27 UTC
Created attachment 18544 [details]
Soft lockup during boot process with 2.6.28-rc2 and patch from #25 applied (screen shot 4)

I've now built 2.6.28-rc2 with patch from #25, but without additional 
kernel debugging options. The screen shots are attached (sorry
for the quality --- the lighting was really bad this time)

What is interesting though is that now it appears to be only 
one type of soft lockups, while in the past there were two
distinct ones with visually different error messages (compare
#23 and #24) that appeared one after another with irregular
intervals. That's the only difference that I could notice.
Comment 31 Alex Shumakovitch 2008-10-30 23:30:36 UTC
Created attachment 18545 [details]
Soft lockup during boot process with 2.6.28-rc2 and patch from #25 applied (screen shot 5)
Comment 32 Alex Shumakovitch 2008-10-30 23:31:12 UTC
Created attachment 18546 [details]
Soft lockup during boot process with 2.6.28-rc2 and patch from #25 applied (screen shot 6)

Time to sleep in this part of the world now ;-)
Comment 33 Max Berger 2008-10-31 06:15:21 UTC
Just wanted to add: I can confirm this bug on my machine (also 2730p). Flashing the BIOS to F04 (F02 is the original version) did not change anything. Maybe you can report the fixed DSDT (once it works) back to HP for a future BIOS update?
Comment 34 Alexey Starikovskiy 2008-10-31 10:47:55 UTC
Please look here: http://anholt.livejournal.com/40006.html
Comment 35 Alex Shumakovitch 2008-10-31 11:12:02 UTC
Wow! This DOES solve all the problems that I was experiencing.
I did play with BIOS options, but it never occurred to me that
this one might be at fault. Simply amazing.

What will be the proper course of action now? Is it possible
to make a workaround in the kernel for this option (it's
actually quite a useful one), or should one simply turn it off
and not bother?
 
Thanks a lot for the help.

   --- Alex.
Comment 36 Michael Haslgrübler 2008-10-31 12:18:19 UTC
Alex which kernel and patch(es) did you use? For 2.6.27.4 and the custom DSDT the issue still exists.
Comment 37 Alex Shumakovitch 2008-10-31 13:21:58 UTC
I first tried 2.6.28-rc2 and when everything worked (with lots of
seemingly harmless debugging messages from ec.c) switched back to
2.6.27.4 with the custom DSDT and no other patches (except the 
intel-agp one, of course). I have the F04 version of BIOS though.
Could this be the culprit?

I've compiled the kernel twice in a row to test the fan throttling
(worked great) without any side effects and tested suspend/resume
as well.

The only thing that doesn't work so far is adjusting of the screen
brightness (I swear it worked at the beginning, but I don't remember
with which kernel) and built-in speakers and mics. 
Comment 38 Michael Haslgrübler 2008-10-31 14:44:46 UTC
I got it working with 2.6.28-rc2. I still have the old BIOS version. I think the intel-agp patch is doing the trick with 2.6.27.4. I haven't used it so far.
Comment 39 Max Berger 2008-11-02 04:09:31 UTC
I am trying to collect all information about the 2730p at:

http://www.linlap.com/wiki/HP+EliteBook+2730P

Please add your knowledge.
Comment 40 Zhang Rui 2008-11-17 00:19:12 UTC
(In reply to comment #37)
> I first tried 2.6.28-rc2 and when everything worked 
so the problem can not be reproduced in 2.6.28-rc2, right?

> (with lots of
> seemingly harmless debugging messages from ec.c)
this is another problem, please attach the dmesg output

> 
> The only thing that doesn't work so far is adjusting of the screen
> brightness (I swear it worked at the beginning, but I don't remember
> with which kernel) and built-in speakers and mics. 
> 
you can open another bug report.
and attach the output of "grep . /sys/firmware/acpi/interrupts/*" both before and after pressing the power button.
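
Comparing the two snapshots by eye is error-prone, so a small script can list the counters that changed. This is a rough Python sketch, assuming each line of the grep output has the form "path: count ..."; the sample file names and columns below are illustrative and may not match what your kernel prints.

```python
import re

def parse_counters(grep_output):
    """Map each file name to the first integer on its line of `grep .` output."""
    counters = {}
    for line in grep_output.splitlines():
        m = re.match(r"(?P<path>[^:]+):\s*(?P<count>\d+)", line)
        if m:
            name = m.group("path").rsplit("/", 1)[-1]
            counters[name] = int(m.group("count"))
    return counters

def changed(before, after):
    """Return {name: increase} for counters that differ between two snapshots."""
    b, a = parse_counters(before), parse_counters(after)
    return {k: a[k] - b.get(k, 0) for k in a if a[k] != b.get(k, 0)}

# Hypothetical snapshots taken before and after pressing the power button.
before = (
    "/sys/firmware/acpi/interrupts/ff_pwr_btn:       0   enabled\n"
    "/sys/firmware/acpi/interrupts/gpe16:      12   enabled\n"
)
after = (
    "/sys/firmware/acpi/interrupts/ff_pwr_btn:       1   enabled\n"
    "/sys/firmware/acpi/interrupts/gpe16:      12   enabled\n"
)
print(changed(before, after))
```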


Comment 41 ykzhao 2008-11-25 23:52:23 UTC
It seems that this issue is related to the EC.
The log shows an error in evaluating the _REG object of the EC device, so the EC device can't be initialized correctly. 
    >[    0.781695] ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.PCI0.LPCB.EC0_.ECRI] (Node ffff88013ba6b9d0), AE_AML_NO_RETURN_VALUE
    >[    0.781975] ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.PCI0.LPCB.EC0_._REG] (Node ffff88013ba6b9b0), AE_AML_NO_RETURN_VALUE

    As the _REG object can't be executed successfully, the EC GPE handler is uninstalled. But the ECRG flag is still set and the EC space handler isn't uninstalled, so the EC internal registers will still be accessed from AML code. (The ECRG flag is set in the _REG object and indicates that the EC operation region is already accessible.)
     >if (ACPI_FAILURE(status)) {
     >           acpi_remove_gpe_handler(NULL, ec->gpe, &acpi_ec_gpe_handler);
     >           return -ENODEV;
     >  }

     In that case, as the EC device is not initialized correctly, the following warnings are reported.
     >[   11.260018] ACPI: EC: acpi_ec_wait timeout, status = 0xff, event = "b1=0"
     >[   11.260082] ACPI: EC: input buffer is not empty, aborting transaction
     >[   11.260147] ACPI Exception (evregion-0419): AE_TIME, Returned by Handler for [EmbeddedControl] [20080609]
     >[   11.260314] ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.PCI0.LPCB.EC0_.BTIF] (Node ffff88013ba6db10), AE_TIME
     >[   11.260616] ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.BTIF] (Node ffff88013ba72f10), AE_TIME 
     Thanks.   
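
     The asymmetry described above can be shown with a toy model. The Python sketch below is not the kernel code; the class ToyEc and its flag names are invented for this illustration. It only shows that removing the GPE handler alone on failure leaves the space handler behind, while removing both leaves nothing installed.

```python
class ToyEc:
    """Toy model only; names follow the comment, not the real kernel structures."""

    def __init__(self):
        self.handlers = set()

    def install_handlers(self, reg_succeeds, remove_space_handler_on_failure):
        # Both handlers get installed before _REG is evaluated.
        self.handlers = {"gpe", "space"}
        if not reg_succeeds:
            self.handlers.discard("gpe")          # what the quoted code does
            if remove_space_handler_on_failure:
                self.handlers.discard("space")    # the extra cleanup step
            return False
        return True

leaky = ToyEc()
leaky.install_handlers(reg_succeeds=False, remove_space_handler_on_failure=False)
print(sorted(leaky.handlers))    # the space handler is left behind

clean = ToyEc()
clean.install_handlers(reg_succeeds=False, remove_space_handler_on_failure=True)
print(sorted(clean.handlers))    # nothing left installed
```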
Comment 42 Zhang Rui 2008-11-25 23:56:19 UTC
EC is not properly initialized because of the buggy _REG.
re-assign to EC category.
Comment 43 ykzhao 2008-11-26 00:36:32 UTC
Created attachment 19029 [details]
patch: Remove EC space handler explicitly when failing in _REG object

Could you please try the attached debug patch and see whether the following message still exists?
    >ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.BAT0._BIF] (Node f7446ba8), AE_TIME

    Please enable "CONFIG_ACPI_PROCFS_POWER" in kernel configuration.
    
    Thanks.
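
    One way to double-check the option before rebuilding is to scan the kernel .config. The following is a minimal Python sketch, assuming the usual "OPTION=y" and "# OPTION is not set" line conventions; the helper name option_enabled and the sample text are made up for this illustration.

```python
def option_enabled(config_text, option):
    """Return True if a kernel .config sets `option` to y or m."""
    for line in config_text.splitlines():
        if line.strip() in (f"{option}=y", f"{option}=m"):
            return True
    return False

# Hypothetical excerpt of a .config in which the option is switched off.
sample_config = "CONFIG_ACPI=y\n# CONFIG_ACPI_PROCFS_POWER is not set\n"
print(option_enabled(sample_config, "CONFIG_ACPI_PROCFS_POWER"))
```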
Comment 44 Lin Ming 2008-12-01 00:17:58 UTC
Created attachment 19087 [details]
debug patch to find out where PSWT fails

>ACPI Error (dswstate-0097): Result stack is empty! State=f74a5e00 [20080609]
>ACPI Exception (dsutils-0645): AE_AML_NO_RETURN_VALUE, Missing or null operand
>ACPI Exception (dsutils-0762): AE_AML_NO_RETURN_VALUE, While creating Arg 0
>ACPI Error (psparse-0530): Method parse/execution failed [\_TZ_.PSWT] (Node f743b978), AE_AML_NO_RETURN_VALUE

Alex, let's try to find out where PSWT fails (I think it's the key point of this bug)

Would you please help to test the attached debug patch?
And then attach the dmesg with this patch applied.

Thanks
Comment 45 Alex Shumakovitch 2008-12-01 17:32:11 UTC
Yakui and Lin,

I didn't have a chance to try your patches yet. Sorry.
I will have a go at them tomorrow. Which kernel do I have
to apply them to, by the way? 2.6.27.7? 

  --- Alex.
Comment 46 Lin Ming 2008-12-01 22:53:49 UTC
> I will have a go at them tomorrow. Which kernel do I have
> to apply them to, by the way? 2.6.27.7? 

2.6.27.7 is OK
Comment 47 Alex Shumakovitch 2008-12-02 23:08:17 UTC
Created attachment 19118 [details]
dmesg of the vanilla 2.6.27.7 kernel with the "fan while AC is on" BIOS option disabled

As it turns out, 2.6.27.7 does _not_ boot with the "fan while AC is on" BIOS option enabled (soft lockup). Booting the kernel with this option disabled doesn't produce any remarkable output. No additional debug messages from the patch in #44 were spotted. I'm sure that the code was compiled in because of this:
   sudo strings /proc/kcore | grep 1188
   Bug 11884 debug begin
   Bug 11884 debug end

Anyway, dmesg's for the vanilla and patched kernels are attached.
I'll try to apply this patch to 2.6.27.4 tomorrow. This is the latest
version that I know of that boots with that BIOS option enabled.

Thanks,

   --- Alex.
Comment 48 Alex Shumakovitch 2008-12-02 23:09:22 UTC
Created attachment 19119 [details]
dmesg of the 2.6.27.7 kernel with patch from #44 applied and with the "fan while AC is on" BIOS option disabled
Comment 49 Alex Shumakovitch 2008-12-03 14:58:47 UTC
Created attachment 19133 [details]
dmesg of the 2.6.27.4 kernel with patch from #44 applied and with the "fan while AC is on" BIOS option enabled

OK, here is the output for the 2.6.27.4 kernel. What next?

  --- Alex.
Comment 50 Alex Shumakovitch 2008-12-03 14:59:27 UTC
Created attachment 19134 [details]
dmesg of the 2.6.27.4 kernel with patch from #44 applied and with the "fan while AC is on" BIOS option disabled
Comment 51 Lin Ming 2008-12-08 23:24:06 UTC
(In reply to comment #49)
> Created an attachment (id=19133) [details]
> dmesg of the 2.6.27.4 kernel with patch from #44 applied and with the "fan
> while AC is on" BIOS option enabled
> 
> OK, here is the output for the 2.6.27.4 kernel. What next?
> 
>   --- Alex.
> 

Thanks Alex. Sorry for the delay, I just came back from vacation. 

Method(PSWT)
{
    ...
    //PSWT fails here
    Store (DerefOf (Index (CUZO, 0x00)), Local0)
    ...
}


Comment 52 Lin Ming 2008-12-09 02:51:14 UTC
Created attachment 19222 [details]
debug patch

Store (DerefOf (Index (CUZO, 0x00)), Local0)

The DerefOf op pushes an object
   --> something goes wrong here: the object is popped before the Store op is executed
The Store op pops the object

Alex, please apply this debug patch to see who pops the object
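
The push/pop hypothesis above can be pictured with a toy result stack. This Python sketch is only a model of the idea, not of the ACPICA interpreter; ResultStack, store_deref, and the stray_pop switch are names invented here. With the stray pop enabled, the final pop hits an empty stack, which is the situation the "Result stack is empty!" message describes.

```python
class ResultStackEmpty(Exception):
    """Stand-in for the "Result stack is empty!" error seen in the log."""

class ResultStack:
    def __init__(self):
        self.items = []

    def push(self, obj):
        self.items.append(obj)

    def pop(self):
        if not self.items:
            raise ResultStackEmpty("Result stack is empty!")
        return self.items.pop()

def store_deref(stack, value, stray_pop=False):
    """Model Store (DerefOf (...), Local0) with an optional premature pop."""
    stack.push(value)          # DerefOf pushes its result
    if stray_pop:
        stack.pop()            # something consumes the result too early
    return stack.pop()         # Store pops its operand

print(store_deref(ResultStack(), 0xFF))

try:
    store_deref(ResultStack(), 0xFF, stray_pop=True)
    outcome = "ok"
except ResultStackEmpty as exc:
    outcome = str(exc)
print(outcome)
```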
Comment 53 Robert Moore 2008-12-10 13:14:14 UTC
I believe what is happening here is that the CUZO Package object has not been initialized and the DerefOf is failing because of this.

Looking at the DSDT, CUZO is initialized in the RETD method. This method is called from two places: _SB_._INI and HWAK.

So it would appear that \_SB_.PCI0.LPCB.EC0_._REG is being called before _SB_._INI is called, and CUZO is uninitialized.

A quick workaround would be to statically initialize CUZO:

    Name (CUZO, Package (0x06) {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF})

The real fix will be to figure out why the Embedded Controller _REG method is being called before the _INI methods are run. (Also, a better error code in this case would be appropriate.)

Here is my test code that reproduces the problem. If MAIN is run first, it will fail. If INI is run before MAIN, MAIN will not fail.

DefinitionBlock ("", "DSDT", 1, "Intel", "Test", 1)
{
    Name (CUZO, Package (0x06) {})

    Method (MAIN, 0, NotSerialized)
    {
        Store (DerefOf (Index (CUZO, 0x00)), Local0)
        Return ()
    }

    Method (INI)
    {
        Store (0x00, Local0)
        While (LLess (Local0, 0x06))
        {
            Store (0xFF, Index (CUZO, Local0))
            Increment (Local0)
        }
    }
}
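
The same ordering dependency can be mimicked outside ACPI. The Python sketch below is a loose analogue of the ASL test above, not an exact translation: None stands in for an uninitialized package element, and AmlError is a made-up exception standing in for the interpreter error.

```python
class AmlError(Exception):
    """Stand-in for an interpreter failure such as AE_AML_NO_RETURN_VALUE."""

# Name (CUZO, Package (0x06) {}): six slots, none of them initialized.
CUZO = [None] * 6

def ini():
    """Model of INI: fill every slot with 0xFF."""
    for i in range(len(CUZO)):
        CUZO[i] = 0xFF

def main():
    """Model of MAIN: DerefOf (Index (CUZO, 0x00)) needs an initialized slot."""
    value = CUZO[0]
    if value is None:
        raise AmlError("element 0 of CUZO is uninitialized")
    return value

# Running MAIN before INI fails.
try:
    main()
    failed_before_ini = False
except AmlError:
    failed_before_ini = True
print(failed_before_ini)

# After INI has run, MAIN succeeds.
ini()
print(hex(main()))
```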

Comment 54 Lin Ming 2008-12-10 18:05:28 UTC
(In reply to comment #53)
> A quick workaround would be to statically initialize CUZO:
> 
>     Name (CUZO, Package (0x06) {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF})
> 

Alex, would you please help to test this quick workaround?
See http://www.lesswatts.org/projects/acpi/overridingDSDT.php for info about custom DSDT.
Comment 55 Alex Shumakovitch 2008-12-10 23:24:43 UTC
Created attachment 19245 [details]
dmesg of the 2.6.27.4 kernel with patches from #44 and #52 applied and with the "fan while AC is on" BIOS option enabled

There was some debugging output with the BIOS option both enabled
and disabled (next attachment) this time.

Concerning the "quick workaround", where in my DSDT.dsl file do
I have to add this line? Sorry, I'm just afraid to mess things up.

   --- Alex.
Comment 56 Alex Shumakovitch 2008-12-10 23:26:07 UTC
Created attachment 19246 [details]
dmesg of the 2.6.27.4 kernel with patches from #44 and #52 applied and with the "fan while AC is on" BIOS option disabled
Comment 57 Lin Ming 2008-12-11 01:44:01 UTC
(In reply to comment #55)
> Concerning the "quick workaround", where in my DSDT.dsl file do
> I have to add this line? Sorry, I'm just afraid to mess things up.

Apply the patch below to your DSDT.dsl

--- orig.DSDT.dsl       2008-12-11 09:35:04.000000000 +0800
+++ DSDT.dsl    2008-12-11 09:35:32.000000000 +0800
@@ -875,7 +875,7 @@ DefinitionBlock ("DSDT.aml", "DSDT", 1, 
         Name (OSTH, 0x00)
         Name (LARE, Package (0x06) {})
         Name (LARP, Package (0x06) {})
-        Name (CUZO, Package (0x06) {})
+        Name (CUZO, Package (0x06) {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF})
         Mutex (THER, 0x00)
         Name (THSC, 0x3D)
         Name (THOS, 0x00)
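
If the line numbers in your disassembled DSDT.dsl differ and the patch does not apply, the same one-line change can be made by plain text substitution. This is a minimal Python sketch of that idea, assuming the empty CUZO declaration appears exactly once in the file; the function name patch_dsl and the sample text are made up for this illustration.

```python
OLD = "Name (CUZO, Package (0x06) {})"
NEW = "Name (CUZO, Package (0x06) {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF})"

def patch_dsl(text):
    """Replace the empty CUZO declaration; refuse unless it occurs exactly once."""
    if text.count(OLD) != 1:
        raise ValueError("expected exactly one empty CUZO declaration")
    return text.replace(OLD, NEW)

# Small excerpt modeled on the context lines of the patch above.
sample = (
    "        Name (LARP, Package (0x06) {})\n"
    "        Name (CUZO, Package (0x06) {})\n"
    "        Mutex (THER, 0x00)\n"
)
print(patch_dsl(sample))
```

The neighbouring LARE and LARP declarations are left untouched because the match includes the CUZO name.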

Comment 58 Alex Shumakovitch 2008-12-11 17:04:50 UTC
Created attachment 19257 [details]
dmesg of the 2.6.27.4 kernel with patches from #44 and #52 applied, custom DSDT from #57 and with the "fan while AC is on" BIOS option enabled

This "quick workaround" seems to be working!!! I managed to boot both 2.6.27.4 and 2.6.27.7 (dmesg in the next message) kernels without error messages.

So what should be the "right solution" then? Patch DSDT? Disable BIOS option?
Bug HP to fix their BIOS?

Thanks a lot!!

  --- Alex.
Comment 59 Alex Shumakovitch 2008-12-11 17:06:26 UTC
Created attachment 19258 [details]
dmesg of the 2.6.27.7 kernel with patches from #44 and #52 applied, custom DSDT from #57 and with the "fan while AC is on" BIOS option enabled
Comment 60 Lin Ming 2008-12-11 18:37:08 UTC
(In reply to comment #58)
> So what should be the "right solution" then? Patch DSDT? Disable BIOS option?
> Bug HP to fix their BIOS?

As Bob mentioned at #53, 
"The real fix will be to figure out why the Embedded Controller _REG method is
being called before the _INI methods are run."

Thanks for the test.
Comment 61 Lin Ming 2008-12-11 19:48:28 UTC
void __init acpi_early_init(void)
{
         .....
         status = acpi_ec_ecdt_probe();
         /* Ignore result. Not having an ECDT is not fatal. */
 
         status = acpi_initialize_objects(ACPI_FULL_INITIALIZATION);
         .....
}

acpi_ec_ecdt_probe -> ec_install_handlers -> acpi_install_address_space_handler
-> acpi_ev_execute_reg_methods -> _REG method is called

acpi_initialize_objects -> acpi_ns_initialize_devices
-> acpi_ns_init_one_device -> _INI method is called

This is why the _REG method is called before the _INI method.
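
The ordering can be reduced to a tiny trace. In the Python sketch below the function names carry a _model suffix to make clear that they are stand-ins written for this illustration, not the kernel functions themselves; each one just records which ACPI method its real counterpart would end up evaluating.

```python
def ecdt_probe_model(trace):
    # acpi_ec_ecdt_probe -> ... -> acpi_ev_execute_reg_methods
    trace.append("_REG")

def initialize_objects_model(trace):
    # acpi_initialize_objects -> ... -> acpi_ns_init_one_device
    trace.append("_INI")

def acpi_early_init_model(trace):
    """Toy call order taken from the comment above."""
    ecdt_probe_model(trace)
    initialize_objects_model(trace)

trace = []
acpi_early_init_model(trace)
print(trace)
```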
Comment 62 Alexey Starikovskiy 2008-12-15 09:36:26 UTC
Created attachment 19310 [details]
limit workarounds for ASUS

Please check if this patch works without modified DSDT.
Comment 63 Alex Shumakovitch 2008-12-16 01:01:32 UTC
Yes, this patch does solve the problem (at least, for 2.6.27.7). Amazing!
Do I understand correctly that the bug in ec.c was that despite the comment
"We really need to limit this workaround, the only ASUS, which needs it ...",
the check was never performed? I wonder how this can be related to the fan
being always on or off.

Anyway, thanks a lot!!

   --- Alex.

Comment 64 Alexey Starikovskiy 2008-12-16 02:41:26 UTC
At the time this comment was written, I was trying to limit early EC registration by checking for the presence of EC._INI -- which is quite rare. Now your machine has it too, but does not want early registration -- thus we need to add one more check, "ASUS only". As for the relation to the fan: if we fail to properly init the EC driver and device, it may decide to keep the fan in the state set by the BIOS, or something safe... you never know what a BIOS engineer thinks :)
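
The combined check can be pictured as a simple predicate. The Python sketch below is only an illustration of the rule described above; it is not the code from the attached patch, and the function name and vendor strings are assumptions.

```python
def wants_early_ec_registration(vendor, ec_has_ini):
    """Toy predicate: register the EC early only on ASUS machines with EC._INI."""
    return ec_has_ini and "ASUS" in vendor.upper()

# Hypothetical vendor strings, used only to exercise the predicate.
print(wants_early_ec_registration("ASUSTeK Computer Inc.", True))   # ASUS with EC._INI
print(wants_early_ec_registration("Hewlett-Packard", True))         # EC._INI, but not ASUS
```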
Comment 65 Len Brown 2009-01-16 11:01:54 UTC
Created attachment 19835 [details]
patch vs 2.6.29-rc1

refreshed patch applied to acpi tree
Comment 66 Len Brown 2009-01-16 13:19:20 UTC
patch in comment #65 shipped in linux-2.6.29-rc2
closed
