Bug 13041

Summary: OQO 01+ doesn't power off since 2.6.24
Product: ACPI Reporter: Jamie Lentin (jm)
Component: Power-OffAssignee: Zhang Rui (rui.zhang)
Status: CLOSED CODE_FIX    
Severity: normal CC: acpi-bugzilla, lenb, shaohua.li, yakui.zhao
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.28.2+ Subsystem:
Regression: Yes Bisected commit-id:
Attachments: Patch: delete the _GTS/_BFS object
patch vs 2.6.30-rc2
Correct comments to match kernel options
patch vs 2.6.30-rc4

Description Jamie Lentin 2009-04-07 20:56:33 UTC
Since 2.6.24, my OQO 01+ will not enter S5.  There are no abnormal messages, however the device sits in the halted state.  Weirdly, S3 & S4 are fine.  Using git bisect, the commit that caused S5 to stop working is:-

http://git.kernel.org/?p=linux/kernel/git/aegl/linux-2.6.git;a=commit;h=c95d47a868f35cd47643d116a3c680cdaa954df8

ACPI: Separate invocations of _GTS and _BFS from _PTS and _WAK

In particular, the call to GTS in acpi_enter_sleep_state is upsetting something.

I have tried the following:-
* Moving the GTS method call to the beginning of acpi_enter_sleep_state(), this has no effect.
* Removing the GTS method call entirely results in working S5---obviously I won't be submitting this patch but keeps me happy for now.

Any suggestions as to what to try next for a better fix?

Basic debug info can be found here:-

acpidump:  http://jamie.lentin.co.uk/hardware/oqo_01plus/acpidump.txt
dmidecode: http://jamie.lentin.co.uk/hardware/oqo_01plus/dmidecode.txt
dmesg (2.6.28.2): http://jamie.lentin.co.uk/hardware/oqo_01plus/dmesg.txt

Thanks for any help
Comment 1 ykzhao 2009-04-08 05:46:50 UTC
From the acpidump there exists the _GTS object and it will be evaluated before accessing the PM1A control register. And the _PTS object is defined as the following:
   >Method (_GTS, 1, NotSerialized)
    {
        Store (Add (Arg0, 0x10), \_SB.PCI0.EC0.PINF)
    }
    The internal EC register will be accessed.
    
    While evaluating the _PTS object, the EC GPE is disabled and timer interrupt is also disabled. It seems that EC access can't be finished.
    
    Will you please double check whether the S4 can work? In fact the _GTS object is also evaluated in course of S4.
    Thanks.
Comment 2 Jamie Lentin 2009-04-08 09:59:57 UTC
I've used S4 on this device ~since 2.6.28 came out, *very* occasionally it has failed to turn off (it's only happened twice I think), in general it's fine.  Before 2.6.28 the hibernation process was unstable, but I think it could still turn off.
Comment 3 ykzhao 2009-04-09 06:03:28 UTC
It is very weird. In fact when hibernate is used, the shutdown flowchart is the same as that using poweroff. And the function of kernel_power_off is called, in which the S5 is entered.
    Thanks.
Comment 4 ykzhao 2009-04-10 06:28:24 UTC
Hi, Jamie
    Thanks for your work. 
    Now this issue can be reproduced on my own laptop if the custom DSDT is used. In the custom DSDT the _GTS object is added, in which the internal EC register is accessed. 
    And if the platform mode is selected for the hibernate, S4 can work well. But if the shutdown mode is selected for the hibernate, it can't work well.
   
    At the same time if the EC access is removed in the _GTS object, the S4/S5 can work well. 

    thanks.
Comment 5 ykzhao 2009-04-13 02:32:14 UTC
Created attachment 20954 [details]
Patch: delete the _GTS/_BFS object

The attached is the patch, in which the _GTS/_BFS object is deleted. Of course we can also avoid the EC access in course of suspend/resume/shutdown.
Comment 6 Zhang Rui 2009-04-13 02:45:21 UTC
A question:
is this the fix you proposed for this bug?
shouldn't we switch the EC to poll mode rather than deleting the _GTS/_BFS support?
Comment 7 ykzhao 2009-04-13 07:12:33 UTC
hi, Jamie
    Thanks for your work.
    With your help we get the root cause about this issue. On this box there exists the _GTS/_BFS object, in which the EC internal register is accessed. And this issue is related with the incorrect EC working mode. 
    In course of suspend/resume/shutdown(S5) the polling mode should be used. At the same time the udelay should be used instead of msleep as the timer interrupt is disabled. But there is no shutdown function for ACPI EC driver. In such case the EC interrupt mode is still used in course of shutdown. It causes that the EC access can't be finished.(As the timer interrupt is disabled, the function of wait_event_timeout will become useless). 

    In my test the box can be shutdown correctly if the polling mode is used in the shutdown function. BTW: this is tested on the 2.6.28.3 kernel.And the udelay is used when the polling mode is used.  And I also do such a test on the 2.6.30-rc1 kernel and find that the S5 can't work well even when the polling mode is used in the ACPI EC shutdown function. 
    
    And we find that the _GTS/_BFS object won't be evaluated on windows XP/Vista even when it exists. So it will be appropriate that the _GTS/_BFS object should also be deleted on Linux.
Comment 8 ykzhao 2009-04-13 07:27:05 UTC
It is easier to delete the _GTS/_BFS object.

And if we switch the EC to poll mode, the problem still exists on the latest upstream kernel. In the latest upstream kernel the udelay is replaced by the msleep when EC works in polling mode. And this issue is changed several times back and forth. 
   It seems that the EC working mode will be changed by the following RFC patch:
   >http://marc.info/?l=linux-acpi&m=123888812925650&w=2
      [RFC] [PATCH] ACPI: EC: Merge poll and interrupt modes

   In fact another solution is that Ec should be switched to the polling mode in course of suspend/resume/shutdown. At the same time udelay should be used.
   Thanks.
Comment 9 Len Brown 2009-04-18 03:38:17 UTC
Created attachment 21041 [details]
patch vs 2.6.30-rc2

Please test this patch.
It disables (but does not delete) _GTS and _BFS support.
It should print some lines in your dmesg when it finds _GTS or _BFS
on your system, asking you to try acpi.gts=1 and acpi.bfs=1.
If you try them, it should return you to the original (failing) behaviour.
Comment 10 Jamie Lentin 2009-04-27 21:37:35 UTC
Created attachment 21146 [details]
Correct comments to match kernel options

The options added are part of the "hwsleep" module, not "acpi", so specifying "acpi.gts=1" results in an unknown kernel option error.  The above patch corrects the printk output to match what the code recognises.

Bar this, the patch works exactly as you'd expect.  The methods are detected and the prink output shown, "hwsleep.gts=1" will break shutdown, "hwsleep.bfs=1" has no effect (since this works fine).

(sorry for the late reply)

Thanks for all your help,
Comment 11 Len Brown 2009-04-30 06:09:10 UTC
Thanks for testing.

> hwsleep.gts=1

Rats.  Originally I had acpi.gts working, but I moved the param
down into the acpica code to make the workaround more localized,
but didn't consider that I'd left the acpi.* namespace...
I don't like leaving it this way, b/c hwsleep is a totally
arbitrary name, Probably I should update the Makefile to make it acpica.gts=1...
or maybe the upcoming ACPICA release has a new name we should adopt...
Comment 12 Len Brown 2009-05-07 19:39:16 UTC
Created attachment 21262 [details]
patch vs 2.6.30-rc4

This patch updates the ACPICA Makefile so that any modparams
in the ACPICA code will be of the form "acpi.*"

So after this, "hwsleep.gts=1" should become "acpi.gts=1",
as it was originally intended by the previous patch.
Comment 13 Jamie Lentin 2009-05-09 19:49:11 UTC
> So after this, "hwsleep.gts=1" should become "acpi.gts=1",
> as it was originally intended by the previous patch.

Works for me. This way makes far more sense, I only submitted a change to the comments since I didn't know how to fix the option prefix.

Thanks again.
Comment 14 Len Brown 2009-05-18 16:29:54 UTC
patch in comment #12 shipped in 2.6.30-rc6-git1

closed