Most recent kernel where this bug did *NOT* occur: 2.6.8.1
Distribution: Mandrake 10.1 / Mandriva 2007
Hardware Environment: Tyan S2466N-4M motherboard, Athlon MP 2200+, BIOS 4.06
Software Environment:
Mandrake 10.1 (no problem observed)
Mandriva 2007 (problem observed)

Problem Description:
After a "poweroff" on this system with the 2.6.8.1 kernel, the front panel power button could be used to restart the system. With kernel 2.6.17-8mdv (from Mandriva 2007) and vanilla kernel 2.6.19.3 this no longer works: "poweroff" takes the system down, but the front panel power switch is then ignored. To boot again, the line cord must be unplugged, 20 seconds allowed to pass, and the cord reattached. At that point the power switch becomes active again. Note that "button" is loaded as a module.

Steps to reproduce:
1. Install kernel 2.6.17 or higher on a Tyan S2466N-4M system.
2. Start the OS in either multi-user or failsafe mode.
3. Run "poweroff".
4. The power button will not restart the system.
Created attachment 10335 [details] /var/log/dmesg /var/log/dmesg with vanilla kernel 2.6.19.3
Created attachment 10336 [details] /proc/acpi/wakeup /proc/acpi/wakeup Notice, no PWRB or anything like that.
Created attachment 10337 [details] output of lsmod output of lsmod, notice "button" is present
Created attachment 10338 [details] kernel config file for 2.6.19.3
Created attachment 10339 [details] output of acpidump, 2.6.19.3 kernel
Created attachment 10340 [details] output of: iasl -d dsdt

Note: iasl -tc dsdt.dsl gives a warning that _WAK does not return a value. However, other Mandriva 2007 systems with different hardware also show that warning, and the power button works for them. There are also "method local variable is not initialized" errors associated with an apparent no-op in five lines that contain only:

    Store (Local0, Local0)

These are the only iasl compiler warnings or errors.
Note that there are no ACPI options on the kernel command line. The various machines that work or don't work all have some variation on this lilo.conf entry (partitions change, everything else is the same):

    image=/boot/vmlinuz
        label="linux"
        root=/dev/hda3
        initrd=/boot/initrd.img
        append="resume=/dev/hda2"
Ok, you seem to have two power buttons:

    ACPI: Power Button (FF) [PWRF]
    ACPI: Sleep Button (FF) [SLPF]
    ACPI: Power Button (CM) [PWRB]

Are you able to switch the machine off with them?
Not reliably. I've tried it a couple of times; once it did something, the other times nothing. The one time it had an effect, it removed the power immediately (like kicking the plug out).
Note, the front panel has only two switches:

    power
    reset

I have no idea why it thinks there are three buttons. There is a keyboard and a mouse currently plugged in. Perhaps that is confusing it?
You probably have a sleep button connector somewhere on the mainboard... The power button often appears twice, in FF and CM flavours. Did you try to tweak the BIOS settings? Loading defaults, etc.?
There was nothing obvious in the BIOS that affects the behavior of the power button. In any case, I have not touched the BIOS settings on this node compared to the other nodes. So some kernel change is correlated with the problem. (I'm not going to say that it is the problem since it may have just exposed an existing BIOS issue.)
Also, why don't those power buttons, or at least one of those power buttons, end up in /proc/acpi/wakeup??? I tried

    echo PWRB >/proc/acpi/wakeup
    echo PWRF >/proc/acpi/wakeup

and it didn't add them. The relevant section of /var/log/dmesg is:

    ACPI Exception (acpi_processor-0681): AE_NOT_FOUND, Processor Device is not present [20060707]
    ACPI: Getting cpuindex for acpiid 0x1
    ACPI: Power Button (FF) [PWRF]
    ACPI: Sleep Button (FF) [SLPF]
    ACPI: Power Button (CM) [PWRB]
    Using specific hotkey driver
    ibm_acpi: ec object not found

Does one of the lines above or below the Button lines give any clue as to why they might not be working?
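For anyone repeating this check, a small helper along these lines can report whether a given device name appears in the wakeup list. It is only a sketch: it assumes the device name is the first whitespace-separated field on each line of /proc/acpi/wakeup, and the function name in_wakeup_list and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether an ACPI device name is listed in the wakeup table.
# Assumption: the device name is the first whitespace-separated field of
# each line. in_wakeup_list is a hypothetical helper name; the second
# argument defaults to /proc/acpi/wakeup and exists mainly for testing.
in_wakeup_list() {
    dev=$1
    file=${2:-/proc/acpi/wakeup}
    # awk exits 0 if any line has the device in column one, 1 otherwise.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$file"
}
```

It could be used as: for b in PWRF SLPF PWRB; do in_wakeup_list "$b" && echo "$b listed" || echo "$b not listed"; done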
From the ACPI spec:

4.7.2.2.1.1 Fixed Power Button
...
While the system is in the G1 or G2 global states (S1, S2, S3, S4 or S5 states), any further power button press after the button press that transitioned the system into the sleeping state unconditionally sets the power button status bit and wakes the system, regardless of the value of the power button enable bit. OSPM responds by clearing the power button status bit and waking the system.

This means that the FF button does not need to be in the wakeup device list to wake the machine from a sleep/off state.
Ok, except it doesn't shut down reliably from the front panel button either. The OS in these recent kernels must now be setting (or forgetting to set) some bit somewhere when it does "poweroff" which somehow renders that "off" state different from the "off" state the system is in following initial application of power to the system. Is there some easy way to determine what that bit is? Could that bit possibly be set explicitly in the shutdown sequence with some other program? Basically I just need a solution, even if it's a user space solution. Having to physically unplug and replug numerous machines after poweroff is not an acceptable situation. I've had to resort to similar hacks in the past, for instance to get WOL working with various cards. Fixing it in the kernel would be great, but getting something working soon is my immediate goal. Thanks
Created attachment 10341 [details] Always set WAKE capability of buttons Please check if this patch works for you.
Well, that was a fun morning. I built the patched version, installed it, and nothing changed. At that point I figured it probably wasn't the kernel but something else, so I proceeded to turn off one service after another with chkconfig and reboot. Turning off acpi and acpid didn't do anything, but turning off "haldaemon" allowed the power button to work following a "poweroff". That is, assuming the system would go down all the way, which it often wouldn't do. It would lock up in "Sending all processes the TERM signal" or "Sending all processes the KILL signal", both of which are calls to killall5. In one instance it rebooted and then crashed in the BIOS, something I've never seen before. I was logged in on the console (no X11) at the time all of these strange "won't shut down normally" issues arose.

In the end I used chkconfig to disable all of the following:

    acpi, acpid, harddrake, haldaemon, wltool, messagebus, mandi

and now it seems to poweroff and reboot reliably via rsh from another node. I can't really say which of these is responsible for the bad behavior; it might well be an interaction between more than one of them.

Also, for the record, for unknown reasons the module asus_acpi was loading. This is a Tyan motherboard and there's just no reason to load that. It didn't load on a Gigabyte motherboard also running Mandriva 2007. Since I couldn't see what was loading this module, I removed asus_acpi.ko from the /lib/modules directory tree to stop it from loading. I made this change very early and can't say for sure whether it helped or not.

Finally, even with nothing in /etc/rc.d/init.d explicitly loading ACPI modules, button, fan, etc. are still loading. Perhaps kacpid is doing that?

Thanks for your help.
kacpid is a kernel thread and is not able to load any module. In other distributions such as Ubuntu or SuSE, ACPI modules are loaded by either the acpid init script or the powersave init script. They have external config files in /etc, so grep /etc/init.d/* will not help :)
Good hint. /etc/inittab has an entry:

    si::sysinit:/etc/rc.d/rc.sysinit

and rc.sysinit has a loop which attempts to load every module it finds in kernel/drivers/acpi:

    # Initialize ACPI bits
    if [ -d /proc/acpi ]; then
        for module in /lib/modules/$unamer/kernel/drivers/acpi/* ; do
            module=${module##*/}
            module=${module%.ko}
            modprobe $module >/dev/null 2>&1
        done
    fi

I think I'll try taking out the extraneous drivers (ibm_acpi.ko and toshiba_acpi.ko) and give acpi/acpid another shot. It could be that the failed loading of one of these other modules sets the stage for the meltdown that occurs later in haldaemon etc.

Come to think of it, since the only ACPI functions this system uses involve poweroff/shutdown, are any of the modules other than button required? Maybe processor.ko for athcool?
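Instead of deleting .ko files from the module tree, the same effect could probably be had by loading only an explicit allowlist. The following is a rough sketch of that idea, not the stock rc.sysinit code: the names acpi_modules_to_load, load_acpi_allowlist, ACPI_MODULE_DIR and MODPROBE are all hypothetical, and it assumes the modules are plain .ko files as in the loop above.

```shell
#!/bin/sh
# Sketch: load only an explicit allowlist of ACPI modules instead of
# everything found under kernel/drivers/acpi.
# Assumption: all function and variable names below are hypothetical and
# introduced for illustration; they are not part of rc.sysinit.
ACPI_MODULE_DIR=${ACPI_MODULE_DIR:-/lib/modules/$(uname -r)/kernel/drivers/acpi}
MODPROBE=${MODPROBE:-modprobe}

# Print the name of each module file in ACPI_MODULE_DIR that is on the
# allowlist given as arguments (module names without the .ko suffix).
acpi_modules_to_load() {
    for path in "$ACPI_MODULE_DIR"/*.ko; do
        [ -e "$path" ] || continue      # directory empty or missing
        name=${path##*/}
        name=${name%.ko}
        for allowed in "$@"; do
            if [ "$name" = "$allowed" ]; then
                echo "$name"
                break
            fi
        done
    done
}

# Load the allowed modules, but only when ACPI is active.
load_acpi_allowlist() {
    [ -d /proc/acpi ] || return 0
    for name in $(acpi_modules_to_load "$@"); do
        "$MODPROBE" "$name" >/dev/null 2>&1
    done
}
```

With this in place, something like "load_acpi_allowlist button processor" would skip ibm_acpi, toshiba_acpi and asus_acpi even if their files are still installed.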
This is so giving me a headache. I reduced the modules that can load to just button, container, processor, thermal, and video. I also turned off console redirection in the BIOS (this allows one to do BIOS operations over a serial line; it isn't needed anymore since there is now access to the back of the nodes).

Anyway, if button is loaded once (by rc.sysinit), then following poweroff the front panel switch works. However, doing this:

    service acpi start

or this:

    rmmod button
    modprobe button

or even just this alone (sometimes; other times it did nothing):

    modprobe button

seems to disable the front panel switch after "poweroff".

Just to make life interesting, the system is also really touchy about how it is shut down. From a remote node, "rsh nodename poweroff" generally works. However, log in on the console and it may lock up in killall5 on a shutdown (usually) or a poweroff (sometimes). This seems not to have anything to do with athcool, since that has been turned off by that point. I have no idea how the system can get stuck in "killall5 -15" or "killall5 -9", but it does. These are in /etc/rc.d/init.d/halt. I've waited as long as 10 minutes and it never comes out of killall5 once it sticks. Maybe that's a Mandriva problem though. Too many variables...
Status: the poweroff/shutdown issues were Mandriva 2007 related problems. Short story: the transition to runlevel 0 or 6 was not running the sge init script because that script was not creating a /var/lock/subsys/sge file. So even though the KXX links were present in /etc/rc.d/rc0.d and /etc/rc.d/rc6.d, they were ignored by the /etc/rc.d/rc script.

After resolving that issue I tested acpi once more. It is still the case that, after booting with the acpi and acpid services disabled, the system can be shut down and the front panel button works. The "button" module is loaded by /etc/rc.d/rc.sysinit because it is present in /lib/modules/KERNELNAME/kernel/drivers/acpi. However, subsequently starting the acpi service causes the front panel button to NOT work following a subsequent poweroff. If acpid is started, then the front panel button may be used to power off the system; however, after doing so, it cannot be powered back up from the front panel, presumably because the acpi service is needed by the acpid service, and the former breaks the front panel button on poweroff. Simply unloading button, ac, etc. and then reloading button is enough to disable the front panel button following poweroff.

So there are apparently issues with the loading/unloading/reloading of the button module on this system. If it is loaded only once, though, then the front panel power switch can be used to restart the system following "poweroff".
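For reference, the lock-file convention that tripped up the sge script can be sketched roughly as follows. This is an illustration only, not the actual sge init script or the real /etc/rc.d/rc: LOCKDIR, start_service, stop_service and rc_would_stop are hypothetical names, and the real daemon start/stop commands are left out.

```shell
#!/bin/sh
# Sketch of the /var/lock/subsys convention: the rc script only runs a
# service's K link at runlevel 0 or 6 if the service's lock file exists.
# Assumption: all names here are hypothetical; LOCKDIR is overridable so
# the sketch can be tried without touching /var/lock/subsys.
LOCKDIR=${LOCKDIR:-/var/lock/subsys}

start_service() {
    # ... start the real daemon here ...
    touch "$LOCKDIR/$1"      # mark the service as running
}

stop_service() {
    # ... stop the real daemon here ...
    rm -f "$LOCKDIR/$1"      # clear the marker again
}

# Mimics the check made before a K link is run: the stop script is
# skipped unless the lock file exists.
rc_would_stop() {
    [ -f "$LOCKDIR/$1" ]
}
```

An init script whose start action never creates the lock file behaves like start_service without the touch: rc_would_stop stays false, so the stop action is silently skipped at shutdown.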
Is this still a problem in the latest kernel release?
David, any response to Rui's inquiry?
I have not upgraded the kernel on these machines so I can't answer the question.
David, we can't see this issue on other platforms. I'll mark this bug as unreproducible for now, since we have to count on you to test and give us an update. Please reopen it if you hit this issue again on a newer kernel. Thanks.