Bug 11240
Summary: | 2.6.26.1 boot hang unless "nolapic_timer" - acer ferrari 1100, AMD X2, SB600 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Rus (harbour) |
Component: | Other | Assignee: | ykzhao (yakui.zhao) |
Status: | REJECTED INVALID | ||
Severity: | blocking | CC: | acpi-bugzilla, akpm, bunk, harbour |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.26.1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
acpidump (text version)
acpidump (binary version)
2.6.26.1 hang screenshot
acpidump diff (BIOS 1.06 -> 1.07)
dmesg 2.6.25.10
2.6.26.1 hang screenshot with new 1.07 BIOS
2.6.26.1 hang screenshot with acpi.debug_layer=0x00010000 acpi.debug_level=0x17
2.6.26.1 hang screenshot with acpi.debug_layer=0x00010000 acpi.debug_level=0x17 and initcall_debug
2.6.25.10 dmesg with acpi.debug_layer=0x00010000 acpi.debug_level=0x17
2.6.27-rc2 dmesg
try the debug patch on the kernel of 2.6.25.10
2.6.26.1 booted with nosmp kernel option
2.6.26.1 booted with noapic maxcpus=1
2.6.26.1 dmesg booted with nolapic_timer |
Description
Rus
2008-08-03 04:01:56 UTC
I can supply any additional info. The mobo has no serial port, so I can submit only photos or network console logs.

Will you please boot the system with acpi disabled and attach the output of acpidump? It seems that this is a regression. Will you please use git-bisect to identify which commit causes this regression? Will you please capture the picture when the system hangs in the boot phase? Thanks.

With acpi=off the 2.6.26.1 kernel doesn't find the SATA disk, so the boot fails. I've attached acpidumps made under the working 2.6.25.10 kernel. A picture (sorry for the quality) is attached too - it hangs at:
.... calling acpi_scan_init+0x0/0xed ....
I can't bisect, but I can insert any debug output in acpi_scan_init or patch the code to get the needed info.

Created attachment 17070 [details]
acpidump (text version)
Created attachment 17071 [details]
acpidump (binary version)
Created attachment 17072 [details]
2.6.26.1 hang screenshot
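For readers unfamiliar with what the git-bisect request above involves, here is a minimal sketch of the underlying idea: a binary search for the first bad commit between a known-good and a known-bad kernel. It assumes a linear, oldest-to-newest list of commits, which is a simplification of real kernel history, and the commit names and the `is_bad` test are hypothetical stand-ins for "build this commit, boot it, see whether it hangs".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, that every commit after
    the first bad one is also bad, and that the last commit is known bad.
    """
    lo, hi = 0, len(commits) - 1  # hi always points at a known-bad commit
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the first bad commit is mid or earlier
        else:
            lo = mid + 1    # the first bad commit is after mid
    return commits[lo]


# Hypothetical history: c0 stands in for 2.6.25.10 (boots),
# c7 for 2.6.26.1 (hangs); the regression is pretended to start at c5.
history = [f"c{i}" for i in range(8)]
culprit = first_bad(history, lambda c: int(c[1:]) >= 5)
print(culprit)
```

Each step costs one kernel build and one boot test, so the number of boots grows with the logarithm of the number of commits in the range rather than with the range itself.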
Oh, I have found that Acer released a new version (1.07) of the BIOS for this mobo on 29 July - currently flashing - will post results.

Will you please attach the output of dmesg on the kernel of 2.6.25.10? Thanks.

With the new BIOS 2.6.26.1 still hangs in the same place. acpidump.diff for the new BIOS is attached. dmesg is attached too.

Created attachment 17073 [details]
acpidump diff (BIOS 1.06 -> 1.07)
Created attachment 17074 [details]
dmesg 2.6.25.10
Noticed after flashing the new BIOS and a power-off/power-on cycle that the hang in 2.6.26.1 is occurring in another place. Screenshot attached.

Created attachment 17079 [details]
2.6.26.1 hang screenshot with new 1.07 BIOS
Will you please enable CONFIG_ACPI_DEBUG in the kernel configuration and boot the system with the option of "acpi.debug_layer=0x00010000 acpi.debug_level=0x17"? Please test it on the kernels of 2.6.25.10 and 2.6.26.1. If the system hangs, please attach the screenshot. If not, please attach the output of dmesg. Note that you had better use the old BIOS. Thanks.

It is not possible on this brain-dead laptop to downgrade the BIOS ;) But 2.6.25.10 is working ok on both the new and the old BIOS. Compiling the kernel now, will post results shortly ...

Created attachment 17097 [details]
2.6.26.1 hang screenshot with acpi.debug_layer=0x00010000 acpi.debug_level=0x17
Created attachment 17098 [details]
2.6.26.1 hang screenshot with acpi.debug_layer=0x00010000 acpi.debug_level=0x17 and initcall_debug
Created attachment 17099 [details]
2.6.25.10 dmesg with acpi.debug_layer=0x00010000 acpi.debug_level=0x17
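Both acpi.debug_layer and acpi.debug_level are bitmasks, so the values requested above select a set of individual bits. The sketch below only shows which bit positions 0x00010000 and 0x17 select; what each bit means is defined by the ACPI headers of the kernel version in use and is not reproduced here. The helper name `set_bits` is made up for this illustration.

```python
from typing import List


def set_bits(mask: int) -> List[int]:
    """Return the positions of the bits set in mask, lowest first."""
    return [pos for pos in range(mask.bit_length()) if (mask >> pos) & 1]


# The two values used in this report.
for name, value in (("acpi.debug_layer", 0x00010000),
                    ("acpi.debug_level", 0x17)):
    print(f"{name}={value:#010x} -> bits {set_bits(value)}")
```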
Tell me if you need netconsole logs of the 2.6.26.1 hangs with debug output; maybe the screenshots are not too informative.

Forget about netconsole, it is unlikely it will work.

Today I've compiled the new shiny 2.6.27-rc2 - it booted perfectly (excluding a small ahci bug). dmesg is attached. Now I don't need 2.6.26.x ;), so the bug may be closed as a rare case (?)

Created attachment 17103 [details]
2.6.27-rc2 dmesg
Thanks for this update.

I've reopened this because as far as I can tell the regression is still present in 2.6.26.x. harbour@sfinx.od.ua is unable to make progress with http://bugzilla.kernel.org/show_bug.cgi?id=11262 because 2.6.26 doesn't work.

Do we know which patch fixed 2.6.27.x?

Hi, Rus
From the 2.6.25.10 dmesg we can get the following message:
> Clockevents: could not switch to one-shot mode:<6>Clockevents:
> could not switch to one-shot mode: lapic is not functional.
> Could not switch to high resolution mode on CPU 1
> lapic is not functional.
> Could not switch to high resolution mode on CPU 0
Will you please confirm whether a different .config file is used on the different kernels? Please add the boot option of "nohz=off highres=off" on the 2.6.26.x kernel and see whether the system still hangs.
Thanks.

2.6.25/26 can't enable high resolution timers with Turion X2 C1E, as local APIC timers stop in the C1E state; please see my invalid bug report http://bugzilla.kernel.org/show_bug.cgi?id=10986. Highres started working on this hardware only from the 2.6.27 kernel. Anyway, I've tested 2.6.26.1 with "nohz=off highres=off" - it freezes in the same place. As the new kernels >= 2.6.27* work, I propose to close this bug as a rare hardware one (BIOS made for Vista only).

Forgot to say - the config for all (2.6.25/26/27) kernels is the same.

Hi, Rus
Thanks for the reminder and test. But from the screenshot in comment #17 it seems that the system doesn't hang in acpi_scan_init any more. Instead it hangs in the function of genl_init. Will you please double check it?
Will you please use git-bisect to find which commit causes the regression between 2.6.25.10 and 2.6.26.1? Although the system can work well on the kernel of 2.6.27-rc2, maybe it will be better to find the root cause. Appreciate your efforts.
Thanks.

Yes, it depends on the laptop BIOS version:
BIOS v1.06 - kernel was hanging in acpi_scan_init
BIOS v1.07 - kernel hangs in genl_init
Sorry, I can't bisect 2.6.26.1.
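The clockevents messages quoted from the 2.6.25.10 dmesg can be picked out of a saved log mechanically. This is a rough sketch, assuming the log has been saved as plain text; the message strings are copied from the quote in this report, while the function name and the sample string are made up for illustration.

```python
import re

# Message fragments taken from the dmesg quoted in this report.
ONESHOT = re.compile(r"could not switch to one-shot mode")
LAPIC = re.compile(r"lapic is not functional")
HIGHRES = re.compile(r"Could not switch to high resolution mode on CPU (\d+)")


def scan_dmesg(text: str) -> dict:
    """Count the lapic/clockevents complaints in a dmesg dump."""
    return {
        "oneshot_failures": len(ONESHOT.findall(text)),
        "lapic_not_functional": len(LAPIC.findall(text)),
        "cpus_without_highres": sorted(int(n) for n in HIGHRES.findall(text)),
    }


# Sample built from the quoted lines; the two CPUs print at the same time,
# so their messages (and a "<6>" log-level prefix) are interleaved.
SAMPLE = (
    "Clockevents: could not switch to one-shot mode:<6>Clockevents: "
    "could not switch to one-shot mode: lapic is not functional.\n"
    "Could not switch to high resolution mode on CPU 1\n"
    " lapic is not functional.\n"
    "Could not switch to high resolution mode on CPU 0\n"
)
print(scan_dmesg(SAMPLE))
```

Searching for fragments rather than whole lines is deliberate here, since the interleaved output means a full-line match would miss one of the two CPUs.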
Hi, Rus
Thanks for the reply. But I am still confused about the hangs in acpi_scan_init (BIOS v1.06). After checking the change log of acpi_scan_init between 2.6.25.10 and 2.6.26.1, I see that only a very few commits were merged. The difference in the function of acpi_scan_init between 2.6.25.10 and 2.6.26.1 is that the _PSW/_DSW control method will be called in the boot phase.
Will you please revert the following commit on the 2.6.26.1 kernel and see whether the problem still exists? (If possible, please try it on the 1.06 BIOS)
> commit 729b2bdbfa19dd9be98dbd49caf2773b3271cc24
> Author: Zhao Yakui <yakui.zhao@intel.com>
> Date: Wed Mar 19 13:26:54 2008 +0800
> ACPI : Disable the device's ability to wake the sleeping system in the boot phase
Thanks.

Created attachment 17254 [details]
try the debug patch on the kernel of 2.6.25.10
Hi, Rus
Maybe it is not very easy to revert the commit 729b2bdbfa19dd9be98dbd49caf2773b3271cc24.
Will you please try the attached debug patch on the kernel of 2.6.25.10 and see whether the system hangs in the function of acpi_scan_init?
In BIOS v1.07: The commit is already included in the 2.6.27-rc2/2.6.26.1 kernel. The system can be booted normally on the kernel of 2.6.27-rc2. Although the system hangs on 2.6.26.1 kernel, it hangs in the function of genl_init instead of acpi_scan_init.
So it would be great if you could test it on BIOS v1.06.
thanks.
I've flashed back the v1.06 BIOS, applied the attached patch to 2.6.25.10 and successfully booted it - no hangs. Double checked 2.6.26.1 - it still hangs, but now in genl_init _only_, even on v1.06! Seems like flashing the new BIOS changed something in the laptop that can't be reverted by flashing the old one.

Found that 2.6.26.1 boots with the nosmp kernel boot option. Dmesg attached.

Created attachment 17265 [details]
2.6.26.1 booted with nosmp kernel option
Hi, Rus
Thanks for the test. It seems that the system will hang in the function of genl_init instead of acpi_scan_init. It is very interesting that the 2.6.26.1 kernel can be booted with the option of "nosmp". Will you please try the following options on the 2.6.26.1 kernel?
a. noapic maxcpus=1
b. processor.max_cstate=1 (the processor driver in drivers/acpi/ should be built into the kernel)
c. nolapic_timer
Thanks.

Not exactly - I've got a single hang again in acpi_scan_init, but hangs in genl_init occur more often.
a) hung in ide_scan_pcibus, screenshot attached
b) hung in genl_init as usual
c) booted ok, dmesg attached

Created attachment 17286 [details]
2.6.26.1 booted with noapic maxcpus=1
Created attachment 17287 [details]
2.6.26.1 dmesg booted with nolapic_timer
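To keep track of which boot options were tried, here is a small sketch that parses a kernel command line and looks it up against the outcomes reported so far in this thread for 2.6.26.1 on this laptop. The outcome table is taken from the comments above; the function names and the example command line (including `root=/dev/sda1`) are hypothetical. On a running system the command line could be read from /proc/cmdline.

```python
from typing import Dict, List, Optional, Tuple


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {option: value}.

    Bare flags such as nolapic_timer map to None. Quoted values that
    contain spaces are not handled in this sketch.
    """
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Outcomes reported in this thread for 2.6.26.1 on the Ferrari 1100.
REPORTED = {
    "nosmp": "booted",
    "nohz=off highres=off": "hung in the same place",
    "noapic maxcpus=1": "hung in ide_scan_pcibus",
    "processor.max_cstate=1": "hung in genl_init",
    "nolapic_timer": "booted",
}


def reported_outcomes(cmdline: str) -> List[Tuple[str, str]]:
    """Return the (options, outcome) rows whose options all appear in cmdline."""
    active = parse_cmdline(cmdline)
    hits = []
    for combo, outcome in REPORTED.items():
        wanted = parse_cmdline(combo)
        if all(k in active and active[k] == v for k, v in wanted.items()):
            hits.append((combo, outcome))
    return hits


# Hypothetical command line for illustration.
print(reported_outcomes("root=/dev/sda1 ro nolapic_timer"))
```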
Does "idle=poll" allow the 2.6.26 system to boot w/o hanging? (if yes, then this is surely an issue with the lapic timer workaround)

Yes, "idle=poll" allows a normal 2.6.26.1 boot too.

Hi, Len
What you said is right. It seems that the system can be booted normally if the boot option of "nolapic_timer" is added. Maybe this is an issue related to the lapic timer workaround. As there exists such an issue, the C-state is also affected. How can we solve this problem? Is it appropriate to add a DMI check to disable the lapic timer on such laptops? Or should the max C-state be limited to C1?
Thanks.

Hi, Rus
Will you please try the latest vanilla kernel (2.6.27-rc5) and see whether the system can work well? Please don't add the boot option of "nolapic_timer" or "processor.max_cstate=".
Thanks.

As said, kernels starting from 2.6.27 work ok on this system. I'm running all recent rc's as they appear.

In fact this bug is related to the local APIC timer (in the 2.6.26.1 kernel). The boot option of "nolapic_timer" can make the system work well. As the issue is already fixed in the kernel 2.6.27, the bug will be marked as resolved.

The reason this bug report is not closed is that we have not identified why 2.6.26.1 broke, and why 2.6.27 works. In the meantime, 2.6.26.6 has been released. Please test it, and if it works, we don't care about 2.6.26.1 any more and we can close this report.

Sorry, I can't test 2.6.26.x on this hardware any more. |