Bug 7961 - Kernel 2.6.17 disables power button following poweroff, Tyan S2466N motherboard
Status: REJECTED UNREPRODUCIBLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Off
Hardware: i386 Linux
Importance: P2 high
Assignee: acpi_power-off
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-02-07 12:14 UTC by David Mathog
Modified: 2007-12-23 17:32 UTC

See Also:
Kernel Version: 2.6.17
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
/var/log/dmesg (9.23 KB, text/plain)
2007-02-07 12:16 UTC, David Mathog
/proc/acpi/wakeup (86 bytes, text/plain)
2007-02-07 12:18 UTC, David Mathog
output of lsmod (1.81 KB, text/plain)
2007-02-07 12:18 UTC, David Mathog
kernel config file for 2.6.19.3 (72.60 KB, text/plain)
2007-02-07 12:19 UTC, David Mathog
output of acpidump, 2.6.19.3 kernel (38.52 KB, text/plain)
2007-02-07 12:21 UTC, David Mathog
output of: iasl -d dsdt (83.34 KB, text/plain)
2007-02-07 12:26 UTC, David Mathog
Always set WAKE capability of buttons (735 bytes, patch)
2007-02-08 04:13 UTC, Alexey Starikovskiy

Description David Mathog 2007-02-07 12:14:21 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.8.1
Distribution:
Hardware Environment: 
Tyan S2466N-4M mobo
Athlon MP 2200+
BIOS 4.06
Software Environment: 
Mandrake 10.1 (no problem observed)
Mandriva 2007 (problem observed)
Problem Description:

Following a "poweroff" on this system with the 2.6.8.1
kernel the front panel power button could be used to restart the system.

With kernel 2.6.17-8mdv (from Mandriva 2007) and vanilla kernel
2.6.19.3 this no longer works.  "poweroff" takes the system down but
the front panel power switch is ignored.  To restart, the power cord must
be unplugged, left out for 20 seconds, and then reattached.
At that point the power switch once again becomes active.

Note, "button" is loaded as a module.

Steps to reproduce:

1.  Install a kernel 2.6.17 or higher on a Tyan S2466N-4M system.
2.  Start the OS to either multi-user or failsafe mode
3.  "poweroff"
4.  The power button will not restart the system.
Comment 1 David Mathog 2007-02-07 12:16:20 UTC
Created attachment 10335
/var/log/dmesg

/var/log/dmesg with vanilla kernel 2.6.19.3
Comment 2 David Mathog 2007-02-07 12:18:01 UTC
Created attachment 10336
/proc/acpi/wakeup

Notice, no PWRB or anything like that.
Comment 3 David Mathog 2007-02-07 12:18:40 UTC
Created attachment 10337
output of lsmod

output of lsmod, notice "button" is present
Comment 4 David Mathog 2007-02-07 12:19:30 UTC
Created attachment 10338
kernel config file for 2.6.19.3
Comment 5 David Mathog 2007-02-07 12:21:27 UTC
Created attachment 10339
output of acpidump, 2.6.19.3 kernel
Comment 6 David Mathog 2007-02-07 12:26:17 UTC
Created attachment 10340
output of: iasl -d dsdt

Note:

iasl -tc dsdt.dsl

gives a warning that _WAK does not return a value.  However other
Mandriva 2007 systems with different hardware also show that warning
and the power button works for them.  There are also "method local variable
is not initialized" errors associated with an apparent form of noop in 
five lines that contain only:

  Store ( Local0, Local0)

These are the only iasl compiler warnings or errors.
Comment 7 David Mathog 2007-02-07 12:46:01 UTC
Note there are no acpi options in the kernel start up.  The various
machines that work or don't work all have some variation on this lilo.conf entry
(partitions change, everything else the same)

image=/boot/vmlinuz
        label="linux"
        root=/dev/hda3
        initrd=/boot/initrd.img
        append="resume=/dev/hda2"
Comment 8 Alexey Starikovskiy 2007-02-07 12:54:43 UTC
Ok, you seem to have two power buttons:
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (FF) [SLPF]
ACPI: Power Button (CM) [PWRB]
Are you able to switch machine off with them?
Comment 9 David Mathog 2007-02-07 13:02:12 UTC
Not reliably.  I've tried it a couple of times and once it did something,
the other times nothing. The one time it had an effect it removed
the power immediately (like kicking the plug out).
Comment 10 David Mathog 2007-02-07 13:03:30 UTC
Note, the front panel has only two switches:

power
reset

I have no idea why it thinks there are three buttons.  There is a keyboard
and a mouse currently plugged in.  Perhaps that is confusing it?
Comment 11 Alexey Starikovskiy 2007-02-07 13:10:26 UTC
You probably have a sleep button connector somewhere on the mainboard...
The power button often appears twice, in FF and CM flavours.
Did you try to tweak the BIOS variables? Loading defaults, etc.?
Comment 12 David Mathog 2007-02-07 13:22:44 UTC
There was nothing obvious in the BIOS that affects the behavior of the
power button.  In any case, I have not touched the BIOS settings on this node
compared to the other nodes.  So some kernel change is correlated with the
problem.  (I'm not going to say that it is the problem since it may have
just exposed an existing BIOS issue.)
Comment 13 David Mathog 2007-02-07 13:31:02 UTC
Also, why don't those power buttons, or at least one of those power buttons,
end up in /proc/acpi/wakeup???  I tried 

echo PWRB >/proc/acpi/wakeup
echo PWRF >/proc/acpi/wakeup

and it didn't add them.
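
For anyone repeating this check, a small helper along the following lines can report whether a device name appears in the wakeup list. It is only a sketch: `wakeup_has_device` is a made-up name, and it assumes the device name is the first whitespace-separated field on each line of /proc/acpi/wakeup. As far as I know, writing a name to that file toggles the enabled state of a device that is already listed rather than adding a new entry, which would be consistent with the echo commands above appearing to do nothing.

```shell
# Sketch: report whether a device name is listed in an ACPI wakeup table.
# Assumption: the name is the first whitespace-separated field of a line.
# wakeup_has_device is a hypothetical helper, not an existing tool.
wakeup_has_device() {
    dev=$1
    file=${2:-/proc/acpi/wakeup}
    # Return 2 when the table cannot be read at all.
    [ -r "$file" ] || return 2
    # Exit 0 if any line's first field equals the name, 1 otherwise.
    awk -v dev="$dev" '$1 == dev { found = 1 } END { exit (found ? 0 : 1) }' "$file"
}
```

It could be called as `wakeup_has_device PWRB && echo listed`; the second argument exists only so that a saved copy of the file (such as the attachment) can be checked instead of the live one.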

The relevant section of /var/log/dmesg is:

ACPI Exception (acpi_processor-0681): AE_NOT_FOUND, Processor Device is not
present [20060707]
ACPI: Getting cpuindex for acpiid 0x1
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (FF) [SLPF]
ACPI: Power Button (CM) [PWRB]
Using specific hotkey driver
ibm_acpi: ec object not found

Does one of the lines above or below the Button lines give any clue as to why
they might not be working?

Comment 14 Alexey Starikovskiy 2007-02-07 14:05:25 UTC
From ACPI spec: 4.7.2.2.1.1 Fixed Power Button
...
While the system is in the G1 or G2 global states (S1, S2, S3, S4 or S5 states),
any further power button press after the button press that transitioned the
system into the sleeping state unconditionally sets the power button status bit
and wakes the system, regardless of the value of the power button enable bit.
OSPM responds by clearing the power button status bit and waking the system.

This means that the FF button does not need to be in the wakeup devices list to
wake the machine from a sleep/off state.
Comment 15 David Mathog 2007-02-07 14:31:50 UTC
Ok, except it doesn't shut down reliably from the front panel button either.

The OS in these recent kernels must now be setting (or forgetting to set)
some bit somewhere when it does "poweroff" which somehow renders that "off"
state different from the "off" state the system is in following initial
application of power to the system.

Is there some easy way to determine what that bit is?

Could that bit possibly be set explicitly in the shutdown sequence
with some other program?

Basically I just need a solution, even if it's a user space solution.  Having to
physically unplug and replug numerous machines after poweroff is not an 
acceptable situation.  I've had to resort to similar hacks in the past, for
instance to get WOL working with various cards.  Fixing it in the kernel would
be great, but getting something working soon is my immediate goal.

Thanks
Comment 16 Alexey Starikovskiy 2007-02-08 04:13:58 UTC
Created attachment 10341
Always set WAKE capability of buttons

Please check if this patch works for you.
Comment 17 David Mathog 2007-02-08 12:56:11 UTC
Well that was a fun morning.

Built the patched version, installed it, and nothing changed.

At that point I figured it probably wasn't the kernel but something else,
so I proceeded to turn off one service after another in chkconfig and
reboot.  Turning off acpi and acpid didn't do anything, but turning
off "haldaemon" allowed the power button to work following a "poweroff",
assuming the system would go down all the way, which it often wouldn't do.
It would lock up in "Sending all processes the TERM signal", or
"Sending all processes the KILL signal", both of which are calls to
killall5.  In one instance it rebooted and then crashed in the BIOS,
something I've never seen before.  I was logged in on the console (no
X11) at the time all of these strange "won't shut down normally" issues 
arose.  In the end I used chkconfig to disable all of the following:

acpi, acpid, harddrake, haldaemon, wltool, messagebus, mandi

and now it seems to poweroff and reboot reliably from an rsh from another node.
I can't really say which of these is responsible for the ill behavior; it
might well be an interaction between more than one of them.

Also for the record, for unknown reasons the module asus_acpi was loading.
This is a Tyan motherboard and there's just no reason to load that.  It didn't
load on a Gigabyte motherboard also running Mandriva 2007.  Since I couldn't
see what was loading this module, I stopped it from loading by removing
asus_acpi.ko from the /lib/modules directory tree.  I made this change very
early and can't say for sure if it helped or not.

Finally, even with nothing in /etc/rc.d/init.d explicitly loading acpi modules,
button, fan, etc. are still loading.  Perhaps kacpid is doing that?

Thanks for your help.
Comment 18 Alexey Starikovskiy 2007-02-08 22:30:20 UTC
kacpid is a kernel thread and is not able to load any module.
In other distributions, such as Ubuntu or SuSE, ACPI modules are loaded by
either the acpid init script or the powersave init script. They have external
config files in /etc, so grep /etc/init.d/* will not help :)
Comment 19 David Mathog 2007-02-09 08:02:05 UTC
Good hint. /etc/inittab has an entry:

  si::sysinit:/etc/rc.d/rc.sysinit

and rc.sysinit has a loop which attempts to load every module it finds
in kernel/drivers/acpi:

# Initialize ACPI bits
if [ -d /proc/acpi ]; then
    for module in /lib/modules/$unamer/kernel/drivers/acpi/* ; do
        module=${module##*/}
        module=${module%.ko}
        modprobe $module >/dev/null 2>&1
    done
fi

I think I'll try taking out the extraneous drivers (ibm_acpi.ko and
toshiba_acpi.ko), and give acpi/acpid another shot.  It could be that the
failed loading of one of these other modules sets the stage for the meltdown
that occurs later in haldaemon etc.  Come to think of it, since the only ACPI
functions this system uses involve poweroff/shutdown, are any
of the modules other than button required?  Maybe processor.ko for athcool?
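
An alternative to deleting .ko files would be to filter the rc.sysinit loop through an allow-list. The following is a rough sketch only: `ACPI_ALLOWED` and `list_allowed_acpi_modules` are hypothetical names, and the choice of "button processor" is an assumption based on the question above, not a tested minimum set.

```shell
# Sketch: list only allow-listed ACPI modules found in a module directory.
# ACPI_ALLOWED and list_allowed_acpi_modules are hypothetical names.
ACPI_ALLOWED="button processor"

list_allowed_acpi_modules() {
    dir=$1
    for path in "$dir"/*.ko; do
        [ -e "$path" ] || continue        # directory had no .ko files
        module=${path##*/}                # strip the directory part
        module=${module%.ko}              # strip the .ko suffix
        # Print the name only if it appears as a whole word in the list.
        case " $ACPI_ALLOWED " in
            *" $module "*) echo "$module" ;;
        esac
    done
}
```

The loop in rc.sysinit could then run modprobe over the output of `list_allowed_acpi_modules` for the kernel's drivers/acpi directory instead of over every file in it. The function itself only prints names, so it can be tried safely on any directory.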
Comment 20 David Mathog 2007-02-09 11:58:49 UTC
This is so giving me a headache.  I reduced the modules that can load to just
button, container, processor, thermal, and video.  I also turned off console
redirection in the BIOS (this allows one to do BIOS operations on a serial line;
it isn't needed anymore since there is now access to the back of the nodes).

Anyway, if button is loaded once (by rc.sysinit) then following poweroff
the front panel switch works.  However doing this:

  service acpi start

or this

  rmmod button
  modprobe button

or even just this alone (sometimes, other times it did nothing)

  modprobe button

seems to disable the front panel switch after "poweroff".

Just to make life interesting the system is also really touchy about how
it is shut down.  From a remote node "rsh nodename poweroff" generally works.
However, log in on the console and it may lock up in killall5 on a shutdown
(usually) or a poweroff (sometimes).  This seems not to have anything to do with athcool
since that has been turned off by that point.  I have no idea how the system
can get stuck in "killall5 -15" or "killall5 -9", but it does.  These are
in /etc/rc.d/init.d/halt.  I've waited as long as 10 minutes and it never
comes out of killall5 once it sticks.  Maybe that's a Mandriva problem though.

Too many variables...

Comment 21 David Mathog 2007-02-12 14:48:18 UTC
Status: the poweroff/shutdown issues were Mandriva 2007 related problems.  Short
story: transition to state 0 or 6 was not running the sge init script
because that script was not making a /var/lock/subsys/sge file.  So
even though the KXX links were present in /etc/rc.d/rc0.d and
/etc/rc.d/rc6.d, they were ignored by the /etc/rc.d/rc script.
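
The lock-file convention described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual sge script or the Mandriva rc script: the function names are hypothetical, and `LOCKDIR` is made overridable only so the idea can be tried outside /var/lock/subsys.

```shell
# Sketch of the subsys lock-file convention; all names are hypothetical.
LOCKDIR=${LOCKDIR:-/var/lock/subsys}

service_start() {
    name=$1
    # ... start the daemon here ...
    touch "$LOCKDIR/$name"        # marks the service as running
}

service_stop() {
    name=$1
    # ... stop the daemon here ...
    rm -f "$LOCKDIR/$name"
}

# Mirrors the check described above: a K link is only worth running
# when the matching lock file exists.
should_run_kill_script() {
    [ -f "$LOCKDIR/$1" ]
}
```

A start action that never creates the lock file leaves `should_run_kill_script` false at shutdown, which matches the symptom of the K links being skipped.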

After resolving that issue I tested acpi once more.  It is still the case
that, following a boot with the acpi and acpid services disabled, the system
can be shut down and the front panel button works.  The "button" module is
loaded by /etc/rc.d/rc.sysinit because it is present in
/lib/modules/KERNELNAME/kernel/drivers/acpi.  However, subsequently starting the
acpi service causes the front panel button to NOT work following a subsequent
poweroff.  If acpid is started then the front panel button may be used to power
off the system; however, after doing so, it cannot be powered back up from the
front panel again, presumably because the acpi service is needed by the acpid
service, and the former breaks the front panel button on poweroff.  Simply
unloading button, ac, etc. and then reloading button is enough to disable the
front panel button following poweroff.

So there are apparently issues with the loading/unloading/reloading of the
button module on this system.  If it is loaded only once though, then it will
enable the front panel power switch to be used to restart the system following
"poweroff".
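
Given that observation, one possible user-space workaround is to have any script that wants the module check /proc/modules first and skip modprobe when "button" is already present. The sketch below is untested on this hardware; `module_is_loaded` and `load_module_once` are hypothetical helper names, and it does nothing about an explicit rmmod followed by a reload.

```shell
# Sketch: only call modprobe when the module is not already loaded.
# module_is_loaded and load_module_once are hypothetical helper names.
module_is_loaded() {
    name=$1
    list=${2:-/proc/modules}
    # The first field of each /proc/modules line is the module name.
    awk -v name="$name" '$1 == name { found = 1 } END { exit (found ? 0 : 1) }' "$list"
}

load_module_once() {
    if module_is_loaded "$1"; then
        echo "$1 already loaded, leaving it alone"
    else
        modprobe "$1"
    fi
}
```

For example, `load_module_once button` in place of a bare `modprobe button` would leave an already loaded module untouched.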
Comment 22 Zhang Rui 2007-09-17 00:52:22 UTC
Is this still a problem in the latest kernel release?
Comment 23 Fu Michael 2007-11-12 18:18:47 UTC
David, any response to Rui's inquiry?
Comment 24 David Mathog 2007-11-18 11:52:21 UTC
I have not upgraded the kernel on these machines so I can't answer the question.
Comment 25 Fu Michael 2007-12-23 17:32:23 UTC
David, we can't see this issue on other platforms. I'll mark this bug as unreproducible for now, as we have to count on you to test and give an update. Please reopen if you hit this issue again on a newer kernel. Thanks.