Hardware: Fujitsu-Siemens V5505 notebook
Kernel version: 2.6.30, 2.6.31-rc3
Symptom: After executing 'shutdown -h', the system does not power on normally the next time it is started. Instead, after about 2 seconds of LED/disk spin-up activity the system stops completely. After about 2 more seconds the start-up process runs normally, with the BIOS splash screen, grub startup and the system booting as usual.
If I disconnect the power supply and the battery after 'shutdown -h', the issue disappears.
This also breaks suspend to ram.
Bisecting found this:
9710794383ee5008d67f1a6613a4717bf6de47bc is first bad commit
Author: Arjan van de Ven <email@example.com>
Date: Sun Mar 15 11:11:44 2009 -0700
async: remove the temporary (2.6.29) "async is off by default" code
Now that everyone has been able to test the async code (and it's being used
in the Moblin betas by default), we can enable it by default.
The various fixes needed have gone into 2.6.29 already.
[With an important bugfix from Stefan Richter]
Signed-off-by: Arjan van de Ven <firstname.lastname@example.org>
Reverting this patch on top of 2.6.31-rc3 fixes the issue for me.
Can you be more specific about where things get stuck?
(Make sure to also remove "quiet" from the kernel command line if it is present; it hides all the useful messages.)
Sorry. The symptoms are a bit difficult to explain. The issue is actually visible before kernel boots. Or even before BIOS splash screen appears.
If I shutdown a system running a kernel with this patch (with 'async') the system will shutdown normally but then later when I want to start the computer it won't start normally. I press the power button, the LEDs will turn on and the disk will spin up for about two seconds. Then the LEDs turn off and the disk spins down (as if the computer turns itself off). Then the LEDs turn on again and the computer starts normally.
The same cycle happens when I resume from suspend to ram. But this actually breaks the resume.
If I shutdown the computer and disconnect power supply and battery it will start up normally.
So it seems that 'async' is causing something in the kernel to leave a little mess in...BIOS maybe?
I don't see any errors from kernel either during startup or shutdown. I don't use 'quiet'.
> --- Comment #2 from Rafal Kaczynski <email@example.com> 2009-07-18 19:18:43
> Sorry. The symptoms are a bit difficult to explain. The issue is actually
> visible before kernel boots. Or even before BIOS splash screen appears.
> If I shutdown a system running a kernel with this patch (with 'async') the
> system will shutdown normally but then later when I want to start the computer
> it won't start normally. I press the power button, the LEDs will turn on and
> the disk will spin up for about two seconds. Then the LEDs turn off and the
> disk spins down (as if the computer turns itself off). Then the LEDs turn on
> again and the computer starts normally.
Now the problem with the bisect is that it pointed out the commit that flipped a default,
not which patch actually introduced a problem. In order to chase this down, if you
are up for this, you need to add the kernel parameter to enable this even for earlier kernels,
and redo the bisect. You can start with the git ID that introduces the flip of the switch,
and bisect back to 2.6.29 release....
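A rough sketch of how such a re-bisect could be driven is below. It is only an illustration, not something posted in this report: the range (v2.6.28 good, v2.6.29 bad) and the `fastboot` parameter are taken from the reporter's later comments, and the helper names `bisect_setup` and `bisect_mark` are made up here. It runs dry by default and only prints the git commands.

```shell
#!/bin/sh
# Dry run by default: each git command is printed instead of executed.
# Set GIT=git and run inside a kernel tree to perform the bisect for real.
GIT="${GIT:-echo git}"

# Start a bisect between a known-bad ($1) and a known-good ($2) revision.
bisect_setup() {
    $GIT bisect start "$1" "$2"
}

# Record the result of one test boot: "normal" or "abnormal" power-on.
bisect_mark() {
    case "$1" in
        normal)   $GIT bisect good ;;
        abnormal) $GIT bisect bad ;;
        *)        echo "usage: bisect_mark normal|abnormal" >&2; return 1 ;;
    esac
}

bisect_setup v2.6.29 v2.6.28
# For every revision git checks out: build it, boot it with the async code
# enabled on the kernel command line, run 'shutdown -h', power on again,
# then call bisect_mark with what was observed.
```

The important part is that every test boot has the async code enabled; otherwise the bisect only finds the commit that flipped the default again.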
So I've started to bisect between 2.6.28 and 2.6.29-rc1. Bisect told me:
67acd8b4b7a3f1b183ae358e1dfdb8a80e170736 is first bad commit
async: don't do the initcall stuff post boot
bootchart: improve output based on Dave Jones' feedback
async: make the final inode deletion an asynchronous event
fastboot: Make libata initialization even more async
fastboot: make the libata port scan asynchronous
fastboot: make scsi probes asynchronous
async: Asynchronous function calls to speed up kernel boot
I feel a bit 'git-lost'. Should I try to revert any particular patch of this merge?
Do you mean that the suspend is affected by the async fastboot? How does this happen?
Do you mean that the second boot is abnormal after the box is powered off using "shutdown -h"? That would mean the second boot remembers what happened in the first boot, which is hard to understand.
Will you please double check it again? Please confirm whether the issue still exists if you wait for some time after shutting down the box.
Yes, fastboot makes the system boot abnormally and breaks resume after suspend to RAM.
Let me try to define the issue this way:
A - normal boot sequence
1) power LED turns on, hdd spins up
2) BIOS splashscreen
3) kernel boot
B - abnormal boot sequence after shutdown of a kernel with 'fastboot' (kernel does not show any errors during shutdown, battery is installed)
1) power LED turns on, hdd spins up
2) power LED turns off, hdd spins down (no activity for about two seconds)
3) power LED turns on, hdd spins up
4) BIOS splashscreen
5) kernel boot
Kernel v2.6.30 or 2.6.31-rc3 - issue exists - case B
If battery is disconnected after shutdown: no issue - case A
Kernel v2.6.31-rc3 with commit 9710794383ee5008d67f1a6613a4717bf6de47bc reverted - no issue - case A
Kernel v2.6.29 (no parameters)- no issue - case A
Kernel v2.6.29 (fastboot kernel parameter) - issue exists - case B
While bisecting between v2.6.28 and v2.6.29 with fastboot parameter - bisect told me the first bad commit is 67acd8b4b7a3f1b183ae358e1dfdb8a80e170736.
If I shut down in the evening and keep the battery attached, it will still boot abnormally (case B) in the morning.
I'll be glad to test any patch... just please don't ask me to go through 13 bisects again ;-)
I've tried to look at the original fastboot merge. I thought I could try to disable particular areas (scsi, libata...) but this is too complex for me.
Thanks for the response.
Can suspend/resume work well if the commit is reverted?
If so, will you please enable "CONFIG_PM_DEBUG" in kernel configuration and do the following test?
a. kill the process using /proc/acpi/event
b. echo freezer > /sys/power/pm_test
c. echo mem > /sys/power/state ; dmesg > dmesg_freezer
d. please wait for five seconds and see whether it can be resumed
Please echo "devices/platform/core/cpu" > /sys/power/pm_test and do the above test.
Yes, in every case where I can boot normally after shutdown I can also resume from suspend to RAM. In every case where I get the power LEDs on-off-on at startup it also does not resume.
I've switched to v2.6.29 because of the fastboot parameter there.
I couldn't echo "devices/platform/core/cpu" > /sys/power/pm_test. Echo to pm_test would only work if I specify a single value.
So I've run a series of tests for each value separately with fastboot parameter and without it. I'm attaching the results in pm_debug_dmesgs.tar.gz and also config file used.
In every case where something was echoed to pm_test I could resume without problem (although the system didn't suspend fully; the LEDs were still on, but I guess this is expected when pm_test is used). Of course, with fastboot and pm_test set to none it wouldn't resume.
Hope it helps...
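For anyone repeating this, the per-level series can be scripted roughly as below. This is a sketch under assumptions, not the exact commands that were run: the level list is what I recall from the kernel's pm_test documentation (`cat /sys/power/pm_test` shows what a given kernel actually accepts), and `run` / `pm_test_series` are made-up helper names. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: run the suspend test once per pm_test level, one value per write,
# since pm_test accepts only a single level at a time.
# DRY_RUN=1 (the default) only prints the commands; use DRY_RUN=0 as root to run them.
DRY_RUN="${DRY_RUN:-1}"

# Assumed level names; check 'cat /sys/power/pm_test' on the kernel under test.
PM_TEST_LEVELS="freezer devices platform processors core"

# Print the command in dry-run mode, otherwise execute it through sh.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        sh -c "$*"
    fi
}

pm_test_series() {
    for level in $PM_TEST_LEVELS; do
        run "echo $level > /sys/power/pm_test"
        run "echo mem > /sys/power/state"
        run "dmesg > dmesg_$level"
    done
    # Restore normal suspend behaviour when the series is finished.
    run "echo none > /sys/power/pm_test"
}

pm_test_series
```

Each iteration leaves one dmesg file per level, which matches the layout of the attached tarball only loosely; the file names here are illustrative.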
Created attachment 22445
pm debug dmesgs
Created attachment 22446
Created attachment 22447
Created attachment 22464
disable the async probe for multiple ATA ports in one ATA controller
Will you please try the debug patch and see whether the issue still exists?
(In reply to comment #12)
> Created an attachment (id=22464)
> disable the async probe for multiple ATA ports in one ATA controller
> Will you please try the debug patch and see whether the issue still exists?
This patch helps a little bit. With the patched kernel I still get LEDs on-off-on on startup after shutdown. But I can now resume from suspend-to-RAM about 50% of the time. Sometimes it resumes normally, sometimes it goes into this LEDs on-off-on again. Even if it resumes normally, on the second suspend attempt it will not resume.
Checked 2.6.30-rc7. No difference.
ok i'm a completely clueless bystander, but to me it looks as if the system shutdown is borked somehow, as if when you turn your system back on, the bios is still continuing to shut down... maybe a race between two 'shut the machine off' mechanisms that got uncovered by running (something) async?
a) bios gets asked to shut down and needs a while (now async)
b) parallel some other shut-off mechanism gets called which is very fast (something like an acpi-low-powerstate fallback?)
if you now press the powerbutton to turn the machine on, the bios is still in progress of shutting down via a)
but this is pure speculation.
anyway, i added some cc's...
hm, i can't add firstname.lastname@example.org ..
maybe you could try another reboot mechanism for your machine. (boot with reboot=pci or reboot=acpi or reboot=force on the grub kernel cmdline)
/* reboot=b[ios] | s[mp] | t[riple] | k[bd] | e[fi] [, [w]arm | [c]old] | p[ci]
   warm   Don't set the cold reboot flag
   cold   Set the cold reboot flag
   bios   Reboot by jumping through the BIOS (only for X86_32)
   smp    Reboot by executing reset on BSP or other CPU (only for X86_32)
   triple Force a triple fault (init)
   kbd    Use the keyboard controller. cold reset (default)
   acpi   Use the RESET_REG in the FADT
   efi    Use efi reset_system runtime service
   pci    Use the so-called "PCI reset register", CF9
   force  Avoid anything that could hang.
 */
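When cycling through several reboot= values it is easy to lose track of what a given boot was started with. A small check like the one below can confirm the parameters before a result is written down. It is only a sketch: `cmdline_param` is a made-up helper, and it simply reads /proc/cmdline.

```shell
#!/bin/sh
# Print the value of NAME=... from a kernel command line; print "set" for a
# bare flag such as fastboot, and "absent" when the parameter is missing.
cmdline_param() {
    name="$1"
    cmdline="$2"
    for word in $cmdline; do
        case "$word" in
            "$name"=*) echo "${word#*=}"; return 0 ;;
            "$name")   echo "set"; return 0 ;;
        esac
    done
    echo "absent"
}

# Report the parameters that matter for this bug on the running kernel.
current="$(cat /proc/cmdline 2>/dev/null)"
echo "fastboot: $(cmdline_param fastboot "$current")"
echo "reboot:   $(cmdline_param reboot "$current")"
echo "quiet:    $(cmdline_param quiet "$current")"
```

Running it right after boot and saving the output next to the dmesg makes it clear which combination each test result belongs to.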
Thanks for your comments and ideas!
I've tested on 2.6.29 with fastboot. Tried both the S3 suspend and the shutdown scenarios with reboot=pci, reboot=acpi, reboot=force and reboot=bios.
There is no difference.
Tried 2.6.32-rc5 - the issue remains.
Does this issue still persist?
still an issue with 2.6.37?
I'm closing this as unreproducible for now.