Bug 8246

Summary: 32/64X address mismatch in "Gpe0Block" - IBM Thinkpad R51e
Product: ACPI
Reporter: Boris Petersen (transacid)
Component: Config-Interrupts
Assignee: Thomas Renninger (trenn)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: acpi-bugzilla, bjg, chris, email, fernandoph, jonnylamb, lakostis, lenb, thesilverhornet_try_gmail, trenn, vjensen
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.20
Subsystem:
Regression: No
Bisected commit-id:
Attachments: My kernel .config
Output of dmesg
debug patch vs 2.6.21-rc5
Thinkpad R51e A8043 2.6.23-rc5 (w/noapic) acpidump
Thinkpad R51e A8043 2.6.23-rc5 (w/noapic) lspci -vv
Thinkpad R51e A8043 2.6.23-rc5 (w/noapic) dmesg
2.6.23-rc6 SMP i686 (cmdline: ro ec_intr=0 noapic) acpidump (Version 20071116)
try the patch that force to use RSDT in case of 32/64X address mismatch
Thinkpad R51e A8043 2.6.25-rc6 -- dmesg
Fix up length fiddling of event block in xFADT addresses
Thinkpad R51e 1843-6NG -- dmesg
kernel 2.6.27-wl w/ patch -- dmesg
quick'n dirty hack to force EC pollmode

Description Boris Petersen 2007-03-21 06:17:21 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.19.7
Distribution:gentoo / debian
Hardware Environment: IBM Thinkpad R51e
Problem Description: Since kernel 2.6.20 the acpi modules take an unacceptable
time to load. I took the exact same configuration as I had in my last (2.6.19.7)
kernel. Here are the results I get:
root@warbird ~ # uname -r
2.6.20.3

root@warbird ~ # time modprobe battery

real 2m35.414s
user 0m0.002s
sys 0m0.003s
root@warbird ~ # time modprobe ibm-acpi

real 0m51.626s
user 0m0.002s
sys 0m0.001s
root@warbird ~ # time acpi
     Battery 1: charged, 100%

real 1m23.016s
user 0m0.000s
sys 0m0.002s 

Steps to reproduce: Build a 2.6.20 kernel on an IBM Thinkpad R51e and load acpi
modules.
Comment 1 Boris Petersen 2007-03-21 06:19:16 UTC
Created attachment 10893
My kernel .config
Comment 2 Boris Petersen 2007-03-21 06:21:20 UTC
Created attachment 10895
Output of dmesg
Comment 3 Len Brown 2007-03-21 18:47:09 UTC
are battery and ibm_acpi the only modules that now take longer to load?
any difference if you boot with ec_intr=0?

any difference with 2.6.21-rc4?
Comment 4 Boris Petersen 2007-03-22 10:31:09 UTC
Ok I tested 2.6.20.3 with ec_intr=0 and it works fine.
I also tried 2.6.21-rc4 without ec_intr=0 and it's still the same:
root@warbird ~ # uname -r
2.6.21-rc4
root@warbird ~ # time modprobe battery

real	1m5.262s
user	0m0.001s
sys	0m0.003s
Comment 5 Len Brown 2007-03-25 21:02:17 UTC
Created attachment 10943
debug patch vs 2.6.21-rc5

This debug patch vs 2.6.21-rc5 reduces ACPI_EC_DELAY from 500 to 50 --
as it was in 2.6.19 and before.

Please try it out and report if it reduces the battery modprobe time.
Comment 6 Len Brown 2007-03-25 21:12:18 UTC
This may be an ACPI interrupt issue, rather than an EC issue --
as the EC timing out could also be due to a lost interrupt.

Also, dmesg shows an unusual override, which is probably the ACPI SCI:
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 21 low level)

So we should check that ACPI interrupts are working on this box

Try provoking some like so. First note the current counts:
# cat /proc/interrupts
Then stop acpid and watch the event file:
# /etc/init.d/acpid stop
# cat /proc/acpi/event
and press the power button a bunch of times.
You should see some event strings come out of /proc/acpi/event,
and another look at /proc/interrupts should show an interrupt
for each button press on the "acpi" line.

Please paste the resulting /proc/interrupts 

Also, if this fails, it would be good to go back and see
if it also fails using older kernels.
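
The before/after comparison described in this comment can be sketched in Python. This is only an illustration under assumptions: `acpi_irq_count` is a hypothetical helper, not an existing tool, and it assumes the /proc/interrupts layout pasted elsewhere in this thread (a CPU header line, then "IRQ: counts chip devices").

```python
def acpi_irq_count(interrupts_text, label="acpi"):
    """Sum the per-CPU counts of the line whose device list contains `label`.

    Returns None when no such line exists.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return None
    ncpus = len(lines[0].split())  # header looks like "CPU0 CPU1 ..."
    for line in lines[1:]:
        _irq, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = fields[:ncpus]
        if not counts or not all(c.isdigit() for c in counts):
            continue
        # Device names are comma separated, e.g. "ehci_hcd:usb1, ohci_hcd:usb2".
        devices = [f.rstrip(",") for f in fields[ncpus:]]
        if label in devices:
            return sum(int(c) for c in counts)
    return None


def acpi_irqs_between(before_text, after_text):
    """Number of ACPI interrupts seen between two /proc/interrupts snapshots."""
    before = acpi_irq_count(before_text)
    after = acpi_irq_count(after_text)
    if before is None or after is None:
        return None
    return after - before
```

Take one snapshot before pressing the power button and one after; a difference of zero means no ACPI interrupt (SCI) was delivered.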
Comment 7 Boris Petersen 2007-04-16 08:02:02 UTC
Sorry for the long delay; I didn't find the time. OK, it looks like it doesn't
even work on 2.6.19:
'cat /proc/acpi/event' doesn't do anything while pressing the power button.

root@warbird ~ # cat  /proc/interrupts
           CPU0       
  0:  760051515  local-APIC-edge-fasteio   timer
  1:     186150   IO-APIC-edge      i8042
  7:          0   IO-APIC-edge      parport0
  8:          2   IO-APIC-edge      rtc
  9:       1495   IO-APIC-fasteoi   ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3
 12:    1111338   IO-APIC-edge      i8042
 14:    2223791   IO-APIC-edge      ide0
 15:     362565   IO-APIC-edge      ide1
 17:    9077672   IO-APIC-fasteoi   eth0
 18:   29817122   IO-APIC-fasteoi   ATI IXP
 19:          0   IO-APIC-fasteoi   wifi0
 21:          1   IO-APIC-fasteoi   acpi
NMI:          0 
LOC:  660470933 
ERR:          0
MIS:          0
Comment 8 Leonard Tracy 2007-05-03 02:30:10 UTC
I tried using the patch on 2.6.20 Ubuntu and 2.6.21 vanilla.  The modules loaded
faster; however, still no ACPI events were being reported.  I tried doing
'cat /proc/acpi/event' and 'tail -f /var/log/acpi', but nothing showed any signs of
life.  I tried closing the lid, pressing the power button, pressing the Fn keys.

           CPU0       
  0:     401785  local-APIC-edge-fasteio   timer
  1:       4657   IO-APIC-edge      i8042
  7:          0   IO-APIC-edge      parport0
  8:          3   IO-APIC-edge      rtc
  9:          2   IO-APIC-fasteoi   ohci_hcd:usb1, ohci_hcd:usb2, ehci_hcd:usb3
 12:      57354   IO-APIC-edge      i8042
 14:      14891   IO-APIC-edge      ide0
 15:      13968   IO-APIC-edge      ide1
 16:          1   IO-APIC-fasteoi   yenta
 17:        648   IO-APIC-fasteoi   ATI IXP Modem, ATI IXP
 18:       8094   IO-APIC-fasteoi   eth0
 21:          1   IO-APIC-fasteoi   acpi
NMI:          0 
LOC:     401206 
ERR:          0
MIS:          0

This is on a Thinkpad R51e Type 1844-DJU
Comment 9 Boris Petersen 2007-05-09 07:01:44 UTC
I am now running 2.6.21 and the problem is still there. But for me ec_intr=0 is
a good compromise. The whole boot process is faster. 
Comment 10 Leonard Tracy 2007-05-09 18:58:14 UTC
The fact that ACPI interrupts aren't being detected is a problem though.  Can't
suspend by closing the lid or hitting the power button.  Should this be moved to
a new bug?
Comment 11 Len Brown 2007-07-20 10:54:33 UTC
> Ok it looks like it doesn't even work on 2.6.19

Is there any previous Linux kernel release or
kernel or system configuration where
ACPI interrupts work on the R51e?

How about if you boot with "noapic"?

Please attach the output from acpidump and lspci -vv
Comment 12 Chris Lamb 2007-09-10 08:07:39 UTC
Created attachment 12773
Thinkpad R51e A8043 2.6.23-rc5 (w/noapic) acpidump
Comment 13 Chris Lamb 2007-09-10 08:10:30 UTC
Created attachment 12774
Thinkpad R51e A8043 2.6.23-rc5 (w/noapic) lspci -vv
Comment 14 Chris Lamb 2007-09-10 08:11:39 UTC
Created attachment 12775
Thinkpad R51e A8043 2.6.23-rc5 (w/noapic) dmesg
Comment 15 Chris Lamb 2007-09-10 08:14:17 UTC
I'm seeing the same problem, which isn't fixed with noapic: Appending ec_intr=0 does indeed shorten the time taken to load the modules, but ACPI events are not being detected. (/proc/acpi/button/lid/LID/state is correct however)

I have attached lamby-acpidump.txt, lamby-lspcivv.txt and lamby-demsg.txt. My machine is an IBM Thinkpad R51e A8043, running 2.6.23-rc5 (w/noapic).

I saw the same problem in 2.6.18, and possibly earlier. I am willing to perform git-fu if that would be helpful.
Comment 16 Hornet 2007-10-06 17:38:39 UTC
I have the same issues; adding irqpoll to the boot options fixes the boot time, but also fills kern.log, syslog and messages with about half a meg each of "Oct  6 09:53:29 Hades kernel: [ 5312.660000] ACPI Error (evgpe-0711): No handler or method for GPE[ F], disabling event [20060707]" _every second_, not sure if that's of any help to you.

According to this thread ( https://bugs.launchpad.net/ubuntu/+source/casper/+bug/107516 ) the issue isn't present with kernel 2.6.17-11-generic, however I haven't tested this myself.

Hope that helps. :)  thesilverhornet - I use gmail, the .com version

Fairly new to Linux and using the machine for work, so can't risk testing anything potentially dubious, but I'm happy to run any safe tests that may be needed. :)
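
To see which GPEs are behind a log flood like the one quoted in this comment, the messages can be tallied per GPE number. A minimal Python sketch; the pattern is modelled only on the "No handler or method for GPE[..]" lines quoted in this thread, and `count_unhandled_gpes` is a hypothetical helper name.

```python
import re
from collections import Counter

# The GPE number is printed in hex and may be space padded, e.g. "GPE[ F]".
GPE_RE = re.compile(r"No handler or method for GPE\[\s*([0-9A-Fa-f]+)\]")


def count_unhandled_gpes(log_text):
    """Map GPE number -> number of 'No handler or method' messages for it."""
    return Counter(int(m.group(1), 16) for m in GPE_RE.finditer(log_text))
```

Feeding it the contents of kern.log shows whether a single GPE (such as 0xF here) is firing repeatedly or whether the whole block is affected.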
Comment 17 Mark Doughty 2007-10-14 11:35:37 UTC
I have the same issue with my R40e thinkpad.  I can confirm that it worked ok with 2.6.17 (loading the acpi modules was fine).  I think I remember seeing an irqpoll error during boot up though.  I've still got a 2.6.17 kernel on my thinkpad so I can check.

What useful logs etc should I provide ?
Comment 18 Zhang Rui 2007-11-18 23:54:50 UTC
to Chris:
It seems that this has been fixed in the latest kernel release.
Could you please give it a try?

to Hornet,
It seems that the problem on your laptop is different.
Could you have a look at bug #6217 and try the patch in the last comments?
And if it doesn't help, please open a new bug and attach your dmesg and acpidump.

to Boris,
Does the problem still exist in the latest kernel tree?
Comment 19 Fu Michael 2007-12-09 21:50:50 UTC
(In reply to comment #17)
> I have the same issue with my R40e thinkpad.  I can confirm that it worked ok
> with 2.6.17 (loading the acpi modules was fine).  I think I remember seeing
> an
> irqpoll error during boot up though.  I've still got a 2.6.17 kernel on my
> thinkpad so I can check.
> 
> What useful logs etc should I provide ?
> 
Mark, if you still see this on the latest kernel, would you please open a new bug? You are using a different platform from the others. Thanks.
Comment 20 Len Brown 2008-01-08 20:04:38 UTC
Chris,
After reproducing the lack of acpi interrupts on the R51e
using 2.6.24-rc...

Please grab the acpidump output again using the latest version of
pmtools here: http://www.lesswatts.org/patches/linux_acpi/
The reason is that the newest version will grab the RSDT
in addition to the XSDT, and I want to see if any of this:

ACPI: FACP 2BEE1900, 00F4 (r3 IBM    TP-78        1580 IBM         1)
ACPI Error (tbfadt-0453): 32/64X address mismatch in "Gpe0Block": [00008020] [0000000000008028], using 64X [20070126]

or this:

ACPI Error (evgpe-0705): No handler or method for GPE[ 0], disabling event [20070126]
ACPI Error (evgpe-0705): No handler or method for GPE[ 1], disabling event [20070126]
ACPI Error (evgpe-0705): No handler or method for GPE[ 2], disabling event [20070126]

...
ACPI Error (evgpe-0705): No handler or method for GPE[1F], disabling event [20070126]

might be caused by us using the wrong tables.

If they are, then you'll be able to apply the debug patch
http://bugzilla.kernel.org/attachment.cgi?id=12464&action=view
from bug 8630 and boot with "rsdp_forced" to run with the
RSDT tables instead of the XSDT tables.
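
For anyone checking their own dmesg for the first of the two messages above, the 32-bit and 64-bit values can be pulled out with a small Python sketch. The pattern is modelled only on the tbfadt line quoted in this thread; other kernel versions may word the message differently, and `find_fadt_mismatches` is a hypothetical helper name.

```python
import re

# Matches e.g.:
#   32/64X address mismatch in "Gpe0Block": [00008020] [0000000000008028]
# \s+ between the two brackets tolerates a line wrap in pasted logs.
MISMATCH_RE = re.compile(
    r'32/64X address mismatch in "(?P<field>[^"]+)":\s+'
    r"\[(?P<addr32>[0-9A-Fa-f]+)\]\s+\[(?P<addr64>[0-9A-Fa-f]+)\]"
)


def find_fadt_mismatches(dmesg_text):
    """Return a list of (field, addr32, addr64) for every mismatch reported."""
    return [
        (m.group("field"), int(m.group("addr32"), 16), int(m.group("addr64"), 16))
        for m in MISMATCH_RE.finditer(dmesg_text)
    ]
```

An empty result means the kernel did not report a mismatch in that log.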
Comment 21 Chris Lamb 2008-01-08 23:52:22 UTC
Created attachment 14376
2.6.23-rc6 SMP i686 (cmdline: ro ec_intr=0 noapic) acpidump (Version 20071116)
Comment 22 Chris Lamb 2008-01-08 23:54:38 UTC
I have attached another acpidump output using pmtools-20071116.tar.gz. (It's the same Thinkpad R51e A8043.)
Comment 23 Zhang Rui 2008-01-23 00:21:39 UTC
The RSDT and XSDT point to different FADT tables.
For the FADT referenced by the XSDT,
[050h 080  4]           GPE0 Block Address : 00008020
and
[0DCh 220 12]                   GPE0 Block : <Generic Address Structure>
...
[0E0h 224  8]                      Address : 0000000000008028
And in the FADT gotten from the RSDT:
[050h 080  4]           GPE0 Block Address : 00008020

IMO, the right address of GPE0 should be 8020 rather than 8028.

Chris,
I'm almost sure this is the root cause of the problem on your laptop.
So please do the test in comment #20 and attach the dmesg output.
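
The two fields compared in this comment can be read straight from a raw FADT image. A minimal Python sketch, assuming the offsets shown in the decoded dump above (4-byte GPE0 Block Address at 0x50; 8-byte address at 0xE0 inside the Generic Address Structure at 0xDC) and little-endian encoding. `gpe0_addresses` and `build_fake_fadt` are hypothetical helpers, and the fake tables stand in for real acpidump output.

```python
import struct

GPE0_BLK_OFFSET = 0x50          # 4-byte legacy GPE0 block address
X_GPE0_BLK_ADDR_OFFSET = 0xE0   # 8-byte address inside the GAS at 0xDC


def gpe0_addresses(fadt):
    """Return (addr32, addr64); addr64 is None if the table is too short
    to contain the extended field (as with the shorter RSDT-side FADT)."""
    (addr32,) = struct.unpack_from("<I", fadt, GPE0_BLK_OFFSET)
    addr64 = None
    if len(fadt) >= X_GPE0_BLK_ADDR_OFFSET + 8:
        (addr64,) = struct.unpack_from("<Q", fadt, X_GPE0_BLK_ADDR_OFFSET)
    return addr32, addr64


def build_fake_fadt(length, addr32, addr64=None):
    """Build a zero-filled table of `length` bytes with only the GPE0 fields set."""
    table = bytearray(length)
    struct.pack_into("<I", table, GPE0_BLK_OFFSET, addr32)
    if addr64 is not None:
        struct.pack_into("<Q", table, X_GPE0_BLK_ADDR_OFFSET, addr64)
    return bytes(table)


# Lengths taken from the dmesg lines in this thread: 0xF4 (XSDT side), 0x81 (RSDT side).
xsdt_fadt = build_fake_fadt(0xF4, 0x8020, 0x8028)
rsdt_fadt = build_fake_fadt(0x81, 0x8020)
print([hex(a) for a in gpe0_addresses(xsdt_fadt)])
print(gpe0_addresses(rsdt_fadt))
```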
Comment 24 Chris Lamb 2008-01-24 16:00:26 UTC
Applying the patch against 2.6.24 and booting acpi=rsdp_forced works for me: the Fn+foo keys and closing the lid now issue ACPI events, Fn wakes up from suspend, etc. Many thanks.

Could this be made to work without requiring a kernel parameter?
Comment 25 Zhang Rui 2008-02-17 18:21:43 UTC
Len, the patch
http://bugzilla.kernel.org/attachment.cgi?id=12464&action=view
fixes the problem for Chris.
Is there any chance for this patch to go upstream?
Comment 26 ykzhao 2008-02-19 23:54:40 UTC
Created attachment 14910
try the patch that force to use RSDT in case of 32/64X address mismatch

Hi, Chris
   Will you please try the attached patch and see whether it fixes the bug?
   No kernel parameter is required.
   Thanks.
Comment 27 Chris Lamb 2008-03-02 06:14:52 UTC
Hi Yakui,
Alas your patch (id=14910) did not fix the problem for me. dmesg contains:

ACPI: RSDP 000F6C80, 0024 (r2 IBM   )
ACPI: XSDT 2BEE1879, 0054 (r1 IBM    TP-78        1580  LTP        0)
ACPI: FACP 2BEE1900, 00F4 (r3 IBM    TP-78        1580 IBM         1)
ACPI Error (tbfadt-0475): 32/64X address mismatch in "Gpe0Block": [00008020] [0000000000008028], using 64X [20070126]
ACPI: DSDT 2BEE1AE7, 92BB (r1 IBM    TP-78        1580 MSFT  100000E)
ACPI: FACS 2BEEC000, 0040
ACPI: RSDP 000F6C80, 0024 (r2 IBM   )
ACPI: RSDT 2BEE183D, 003C (r1 IBM    TP-78        1580  LTP        0)
ACPI: FACP 2BEE1A00, 0081 (r2 IBM    TP-78        1580 IBM         1)
ACPI: DSDT 2BEE1AE7, 92BB (r1 IBM    TP-78        1580 MSFT  100000E)
ACPI: FACS 2BEEC000, 0040
ACPI: SSDT 2BEE1A81, 0033 (r1 IBM    TP-78        1580 MSFT  100000E)
ACPI: ECDT 2BEEADA2, 0052 (r1 IBM    TP-78        1580 IBM         1)
ACPI: APIC 2BEEADF4, 005A (r1 IBM    TP-78        1580 IBM         1)
ACPI: MCFG 2BEEAE4E, 003E (r1 IBM    TP-78        1580 IBM         1)
ACPI: BOOT 2BEEAFD8, 0028 (r1 IBM    TP-78        1580  LTP        1)

..but I am receiving any ACPI events on lid, suspend button, etc.
Comment 28 ykzhao 2008-03-02 17:17:12 UTC
Hi, Chris
    Thanks for the test.
    The patch in comment #26 won't suppress the warning message. When it detects the error, the RSDT table is automatically used. It has the same purpose as the patch in:
http://bugzilla.kernel.org/attachment.cgi?id=12464&action=view.
The only difference is that it is unnecessary to add the boot parameter.
>ACPI: RSDP 000F6C80, 0024 (r2 IBM   )
>ACPI: XSDT 2BEE1879, 0054 (r1 IBM    TP-78        1580  LTP        0)
>ACPI: FACP 2BEE1900, 00F4 (r3 IBM    TP-78        1580 IBM         1)
>ACPI Error (tbfadt-0475): 32/64X address mismatch in "Gpe0Block": [00008020]
>[0000000000008028], using 64X [20070126]
Comment 29 Chris Lamb 2008-03-02 17:51:16 UTC
Sorry, I should have made it more clear:

 * The dmesg output was not meant or used as a testcase.
 * I am *not* receiving any ACPI events on lid, suspend button, etc. with the patch from comment 26.
Comment 30 ykzhao 2008-03-09 17:39:44 UTC
Hi, Chris
    Will you please enable the ACPI debug function in the kernel configuration and attach the full dmesg? (Please apply the patch in comment #26.)
    Thanks.
Comment 31 Chris Lamb 2008-03-22 10:28:08 UTC
Created attachment 15398
Thinkpad R51e A8043 2.6.25-rc6 -- dmesg
Comment 32 Chris Lamb 2008-03-22 10:33:57 UTC
I've applied the patch in comment #26 and enabled ACPI_DEBUG - the result is in attachment #15398.
Curiously, ACPI events for the lid and the "sleep" button are now functioning. Does this mean I messed up last time I tested?
Thanks,
 
Comment 33 Zhang Rui 2008-03-23 23:22:57 UTC

(In reply to comment #32)
>  I've applied the patch in comment #26 and enabled ACPI_DEBUG - the result is
> in attachment #15398.
>  Curiously, ACPI events for the lid and the "sleep" button are now
>  functioning.
> Does this mean I messed up last time I tested?
Probably. We spent some time reviewing the patch but couldn't see why it failed to work for you.
So please make sure that the patch in comment #26 works correctly so that we can start to push it upstream.
Comment 34 Chris Lamb 2008-03-24 15:18:03 UTC
Hi,

I've just tested the patch in comment #26 again and, for the most part, it works. :)

My reservations are that the sleep button (ACPI "ibm/hotkey", Fn-F4 key) only triggers once per bootup/resume, whilst all the other keys will respond repeatedly.

More problematic, the machine will now fail to resume from suspend unless the Fn button is hit repeatedly (interrupt problem?). It will also fail to suspend after resuming unless the key is pressed. Attempting to resume past this fails, with no response.

These effects are a regression from a kernel without this patch (and with the ec_intr=0 command line).

Chris
Comment 35 Jonny Lamb 2008-04-16 06:39:32 UTC
I just tested the patch in comment #26 with a 2.6.24.2 kernel *without* the ec_intr=0 kernel command, on a ThinkPad R51e.

For me, it works fine with a couple of little niggles:

* The sleep button (Fn-F4 key) or hibernate button (Fn-F12) only works once per bootup/resume, as Chris highlighted. This, of course, isn't a problem if you're actually using it to suspend the machine as you'll only be able to press it once per bootup/resume! But if it's mapped to do something else, then it can be fairly annoying. There are many other key combinations on the keyboard though, so this doesn't bother me at all.

* When suspended, pressing the Fn key will unsuspend the machine. This also doesn't actually affect me though.

Other than that, everything seems to work. Suspending has been flawless and apart from the addition of ACPI events and the above niggles, the computer appears to act as it did before.
Comment 36 Jonny Lamb 2008-04-16 06:46:59 UTC
P.S. I think this should go upstream. I like this patch a lot.
Comment 37 Boris Petersen 2008-04-17 03:33:43 UTC
Is this already in the acpi git tree? I couldn't find the commit.
Comment 38 Zhang Rui 2008-04-17 18:18:59 UTC
No, it's not.
Yakui, please push Len to merge this patch.
Comment 39 Boris Petersen 2008-04-19 09:07:47 UTC
OK, I can confirm this patch works. Finally, after using 2.6.22 for ages, I can update. Thank you all for your effort.
Comment 40 Thomas Renninger 2008-04-22 08:03:53 UTC
Can everybody affected please have a look at, or attach, dmidecode output?
Is there one with a BIOS version 1SET*WW?
For these machines C-states are disabled/blacklisted; the patch from Yakui could fix that and C-states might work. It would be great if someone has such a machine and is willing to give it a try.

Affected people should see something like this in dmesg:
IBM ThinkPad R40e detected - limiting to C* max_cstate. Override with processor.max_cstate=9
With Yakui's patch can you do:
processor.max_cstate=9
and you then get a working machine and:
cat /proc/acpi/processor/*/power
shows more than C-state?
Comment 41 Thomas Renninger 2008-04-22 08:05:59 UTC
> shows more than C-state?
shows more than one, but three working C-states?
Comment 42 Thomas Renninger 2008-04-22 09:18:48 UTC
From googling, it seems the only ThinkPad showing the C-state freeze with a 1SET* BIOS is the R40e. The R40 already has a 1PET BIOS...
Mark Doughty has an R40e...
Mark, I am not sure whether this really is related, can you please check whether you get working C-states with this patch by also adding:
processor.max_cstate=9

I wonder why I didn't get any bugs against 10.3 (2.6.22)..., this would have been noticed if this is since 2.6.19.7 ...
What kind of change could have introduced this, it worked before?
Comment 43 Thomas Renninger 2008-05-01 12:14:09 UTC
Created attachment 16003
Fix up length fiddling of event block in xFADT addresses
Comment 44 Thomas Renninger 2008-05-01 12:24:56 UTC
Can someone try the patch from comment #43, pls.
If this one works, it's the way to go.

While Yakui's patch is nice from a generic point of view, it violates the spec and has the potential to break other machines.
Yakui, if the patch from comment #43 should not work, it would be great if you can push a force_rsdt variable or similar inside ACPICA, so that machines can be dmi blacklisted.
Hmm, I am still anxious that we ignore the real bug then. IMO it looks like a bug slipped in between 2.6.19.7 and 2.6.20 and I expect XSDT already was used (according to the specs) in 2.6.19 and before already?
Comment 45 Mark Doughty 2008-05-03 05:00:45 UTC
Thomas,

Sorry for taking a while to get back to you.  I've tried the patch from comment #26 on my Thinkpad R40e.

It works well and solves the original ACPI problem, and when I add processor.max_cstate=9 to the kernel options it now gives me 3 available processor states. (It took me a while to realise that I had compiled processor.ko as a module, which was why it didn't work.)

I'll try your patch tonight and let you know whether that works too.

Thanks for working on this everyone.  It's great to finally have Linux working on my laptop properly with all the acpi buttons and everything.

Mark
Comment 46 Mark Doughty 2008-05-03 10:25:57 UTC
Thomas,

I've tried your patch. It works from the point of view of loading the modules and reading the battery state from /proc/ but none of the acpi buttons (volume, mute, suspend, etc) work.

These worked with Yakui's patch.  Also, if I add processor.max_cstate=9 as a boot option then the system won't boot.  I tried it a couple of times but got a message about the CPU temperature being 91C and then it shut down.

Let me know if you need any logs etc.

Cheers

Mark
Comment 47 Robert Moore 2008-05-19 15:06:04 UTC
>>ACPI Error (tbfadt-0475): 32/64X address mismatch in "Gpe0Block": [00008020] [0000000000008028], using 64X [20070126]

Would we not need to use the RSDT and the "other" FADT if instead of using the 64-bit address in the error condition above, we used the 32-bit address instead?
Comment 48 Vitus Jensen 2008-06-17 00:52:56 UTC
Created attachment 16522
Thinkpad R51e 1843-6NG -- dmesg
Comment 49 Vitus Jensen 2008-06-17 00:55:50 UTC
I've applied the patch from comment #26 to 2.6.24-gentoo-r8 on my Thinkpad R51e and while ACPI events now work (hurray!  I can't remember if or when this worked before :-) there are two problems on my machine related to that patch:

1) in the first minutes (until KDE is up, NFS mounted) the R51e is prone to hang hard.  The disk LED is always on, sometimes WLAN LED is on and there is no reaction to keyboard or mouse input.  This happens even when working on the console (Ctrl-Alt-F1 and doing the mount there).  Removing battery (see #2) won't help.

2) sometimes the communication with the battery is lost.  I've seen an empty symbol in klaptop and experienced a kind of emergency suspend (needing reset) after a "battery full" dialog.

See previous comment for dmesg from the patched kernel.  You will notice that I run "noapic": interrupts from wlan, sound and possibly others stop when using the apic (on most recent kernels; 2.6.15 was fine).

By[t]e,
   Vitus
Comment 50 Fernando Hauscarriaga 2008-08-24 06:42:16 UTC
Hello, 
      I've applied the patch from comment #26 on 2.6.25.4, works great! Thanks a lot!!

Regards,
                             Fernando
Comment 51 Jonny Lamb 2008-09-30 09:16:47 UTC
I've been using the patch in comment #26 on kernels for about six months now and I can confirm it works perfectly. I recommend it gets sent upstream as soon as possible.

Thanks,
Comment 52 Vitus Jensen 2008-10-02 01:35:22 UTC
Some time after my last comment I switched to the wireless-testing kernel (from git) to use the ath5k driver.  I settled on a 2.6.27-rc3 which worked great.
And yesterday I applied the patch from #26 and had that complete lockup while booting.  Last night I did a git-pull and got 2.6.27-rc8 which already had that patch applied.  It lasted just long enough to boot into KDE, do a git-apply and understand the result.  Closing emacs locked the machine hard.

Thinkpad R51e, type 1843-6NG.  That's a machine with ATI IXP chipset, tuned with a Pentium-M instead of Celeron-M (the type no probably refers to Celeron-M but they were sold with Pentium-M, too).

Hard lockup, always HDU LED lit, no keyboard, no mouse, no ping from the outside.  So no switching to the console and no logs :(   I kept it in this condition and perhaps there will be a timeout tonight.  No idea what I can do to assist in looking into the problem.  Please advise.

Vitus
Comment 53 ykzhao 2008-10-03 05:56:30 UTC
I will refresh the patch in comment #26 and resend it to the ACPI mailing list again.
Thanks.
Comment 54 Vitus Jensen 2008-10-04 00:49:45 UTC
Created attachment 18149
kernel 2.6.27-wl w/ patch -- dmesg 

Sorry, I was wrong with the "already applied patch" :-(

Did the patching myself, 2.6.27-wl locks up as usual.  The attachment is dmesg with the patch from comment #26.  The interesting part, not in the logs w/o the patch is:

"ACPI: EC: non-query interrupt received, switching to interrupt mode"

As ec_intr=0 is no longer in 2.6.27 I did a quick hack to force polling mode.  Now ACPI events work as enabled through comment #26 but the machine doesn't lock up.  The polling mode has to be re-forced after suspend-to-ram.
Comment 55 Vitus Jensen 2008-10-04 00:53:06 UTC
Created attachment 18150
quick'n dirty hack to force EC pollmode

works for me
Comment 56 Thomas Renninger 2008-10-06 03:08:03 UTC
The problem is that patch from comment #26 is wrong.
It just always takes the 32 bit address (if you choose the 32 bit address whenever 32 and 64 are not equal, you effectively always take the 32 bit address...).
Lenovo verified:
  - XP is taking 32 bit addresses
  - Vista is taking 64 bit addresses

Lenovo also verified that some laptops are buggy; these are a handful which support XP and gained Vista support at a late stage.
The affected machines, the R51e and R40e, will not get another BIOS update (the latter is known to be able to do C-states again with the right addresses; C-states are currently blacklisted for this machine).

It cannot be chosen at run-time whether to take 32 or 64 bit addresses based on _OSI (whether machine supports Vista or XP), because the IO addresses from FADT are touched before the DSDT is parsed.

Lenovo also verified that this is a bug on some rare machines which got Vista support late (it's also a bug on Vista, but it seems not to show up there).
THE RIGHT THING TO DO is to blacklist these machines. It's a BIOS bug on some rare machines and those have to be blacklisted.

I can send my patchset again (touches a little bit ACPICA sources) if Len considers the problem "enough debugged" and is finally ok with blacklisting.

The patches are in OpenSuSE for quite a while now and can be considered as well tested.
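
The argument in this comment, that falling back to the 32 bit address on every mismatch is the same as always using the 32 bit address, can be checked with a few lines of Python. This models only the argument as stated here, not the actual code of the patch in comment #26; both function names are hypothetical.

```python
def fallback_on_mismatch(addr32, addr64):
    """Use the 64-bit value, except fall back to the 32-bit one on a mismatch."""
    if addr32 != addr64:
        return addr32
    return addr64  # equal to addr32 in this branch


def always_32bit(addr32, addr64):
    """Ignore the 64-bit value entirely."""
    return addr32


# When the two values agree, returning addr64 is returning addr32; when they
# differ, addr32 is returned explicitly. Either way the result is addr32.
for a32, a64 in [(0x8020, 0x8028), (0x8020, 0x8020), (0, 0x8028), (0x8020, 0)]:
    assert fallback_on_mismatch(a32, a64) == always_32bit(a32, a64)
```

So the 64-bit field never influences the outcome under such a policy, which is why it cannot help on firmware where only the 64-bit field is valid.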
Comment 57 Len Brown 2008-10-16 23:52:23 UTC
Thomas, in light of your hints from Lenovo;
please include the patches SuSE is using here
so we can consider them for upstream.

thanks,
-Len
Comment 58 Thomas Renninger 2008-10-19 15:04:06 UTC
I saw the last comment after sending the patches to the list.
Hope sending the patches to the acpi list is sufficient.
Comment 59 Vitus Jensen 2008-10-27 22:09:20 UTC
Just as an info: I had to disable acpi wakeup sources by using "acpitool -w 1".  Otherwise the R51e would only stay in suspend once after boot and would return immediately on every further attempt.  Wakeup via Fn is now disabled but using power button for this purpose works.

2.6.27-wl, Thinkpad R51e, type 1843-6NG

Please mail if you want me to provide further information, I'm content with the current state.
Comment 60 Len Brown 2008-11-24 18:51:40 UTC
I don't like the acpi_gbl_force_rsdt w/ DMI blacklist approach.
We need to be smarter about when to disqualify garbled tables
so that we don't have to maintain a blacklist.
Comment 61 ykzhao 2008-11-26 19:06:01 UTC
Hi, Len
    It is a burden to maintain the DMI blacklist. But IMO it is reasonable to add a boot option for using the RSDT.
    From the discussion it seems that the RSDT is always used on Windows XP. In that case it will use the 32X addresses (for example: GPE block address, RSDT, FACS address). But it seems that the XSDT is used on Windows Vista if both RSDT and XSDT exist.
   For Linux it is difficult to cover the different cases. Maybe it is reasonable to add a boot option for using the RSDT. When the 32X/64X addresses mismatch, the user can be prompted to try the RSDT boot option. Maybe the system will work well after the RSDT is used instead of the XSDT.

   For example: on some boxes there are two FACS tables. One is obtained from the XSDT and the other from the RSDT. If the FACS obtained from the XSDT is used, the system reboots instead of resuming. But if the RSDT is used, the system resumes correctly. In fact there is a 32/64X address mismatch for the FACS in the FADT obtained from the XSDT, and this root cause was identified by using the RSDT.

   In such a case, when the 32/64X addresses mismatch, a warning message is reported and the user can be prompted to try the RSDT boot option.

   Thanks.
   
   
Comment 62 Thomas Renninger 2008-11-27 11:05:58 UTC
Yeah, the time FADT provided IO addresses are touched is very early.
No assumptions about a possible supported OS or similar can be made.

If it is assumed that every X86 machine supports a Windows OS, then it would be safe to unconditionally move to the RSDT provided table.

But I very much doubt that.
There are probably dozens of projects (e.g. coreboot, they provide their own BIOS and provide their own FADT values) which do not support a Windows OS on their machines/BIOSes.

In this case it is very likely that the 64-bit value is correct and the 32-bit value is uninitialized and contains garbage (which is spec conform!).

Pushing these guys who very much contribute to the Linux/BSD/Solaris/... community into an unsupportable state must not happen. Therefore I'd say you can never do an unconditional (without at least providing a boot param) switch to 32-bit values.
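
The constraint argued for here (keep the 64-bit value as the default, and only use the 32-bit value when explicitly asked to) can be sketched in Python. This is a hypothetical model, not ACPICA code: `select_register_address` and its `prefer_32bit` flag are invented for illustration, with the flag standing in for a boot parameter or blacklist entry, and the default assumes the reading of the spec given in this thread, that a populated 64-bit field takes precedence.

```python
def select_register_address(addr32, addr64, prefer_32bit=False):
    """Pick the register block address to use.

    prefer_32bit stands in for an explicit opt-in (boot parameter or
    blacklist match); it is never enabled implicitly.
    """
    if prefer_32bit and addr32:
        return addr32
    if addr64:          # populated 64-bit field wins by default
        return addr64
    return addr32       # no usable 64-bit value (zero or absent)


# R51e-style firmware: the 32-bit value is the working one, so opt in.
print(hex(select_register_address(0x8020, 0x8028, prefer_32bit=True)))

# Firmware that only fills in the 64-bit field and leaves garbage in the
# 32-bit one: the default keeps working, an unconditional switch would not.
print(hex(select_register_address(0xDEADBEEF, 0x8028)))
```

Garbage in the 32-bit field cannot be distinguished from a valid address by inspection, which is why the switch has to be an explicit choice rather than a heuristic.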
Comment 63 ykzhao 2009-01-04 22:29:38 UTC
Hi, Thomas
    With the help of KVM it is now confirmed that only the RSDT is used on Windows XP, even when both RSDT and XSDT exist. In that case the FADT obtained from the RSDT is used, which contains the 32X addresses.
    But on Windows Vista the XSDT is used when both exist. For Linux it is difficult to cover the two cases.
    So my proposal is that the user can be prompted to try the boot option when the 32X/64X addresses mismatch and the system doesn't work well. If the system works well with the XSDT, there is no need for the user to try the boot option.
    How about it?
     Thanks.
    
Comment 64 Thomas Renninger 2009-01-07 11:47:21 UTC
> the user can be prompted
This does not work that early. You also want to have this working automatically on every machine where you know which one to take (32/64 bit addresses). Printing out 32X/64X address mismatches when a sanity check fails and pointing to a "try acpi=rsdt boot param" hint is certainly a good idea.
I do not see another solution than the boot param and blacklist approach. Len might have another idea (see comment #60).
Regarding the time I wasted on this, I won't post on this bug anymore. I proposed a safe and sane solution, and the issue is debugged to the root cause.

> But on windows vista the XSDT will be used when both exist
But still the 32 bit addresses I expect?
Comment 65 ykzhao 2009-01-07 16:56:23 UTC
This is tested on vista by using KVM. On the vista the XSDT will be used instead of RSDT when both exist.
   In fact there is a presentation , which introduces ACPI on vista. You can google it.
  
Thanks.
   
    
Comment 66 Vitus Jensen 2009-02-08 16:50:28 UTC
(In reply to comment #55)
> Created an attachment (id=18150)
> quick'n dirty hack to force EC pollmode

On kernel 2.6.29-rc3-wl acpi=rsdt is available, which allows me to use ACPI on my Thinkpad R51e, but I still had to apply the patch from #55 to disable EC interrupt mode.  There is code to detect GPE storms in that module but it doesn't help in this case.
Comment 67 ykzhao 2009-02-22 19:22:11 UTC
Hi, Rui
    How about closing this bug, as this issue can be resolved by adding the boot option "acpi=rsdt"?
    Thanks.
Comment 68 Len Brown 2009-04-07 02:58:10 UTC
FYI

shipped in 2.6.30 merge window (2.6.29-git14)

commit 67dc092187626ac55a60877485f78bc291cbfa81
Author: Thomas Renninger <trenn@suse.de>
Date:   Thu Apr 2 14:11:20 2009 +0200

    ACPI: Remove R40e c-state blacklist

    The recent ACPICA patch
    (ACPICA: FADT: Favor 32-bit register addresses for compatibility)
    makes machine to use the right FADT HW addresses
    and C-states now work fine.