Latest working kernel version: none
Earliest failing kernel version: 2.6.24
Distribution: Ubuntu 8.04 / 8.10, 64 bit and 32 bit
Hardware Environment: Laptop LG E500 V.APSCG, Intel Core2Duo T8100 @2.1GHz, 2GB RAM, ATI Mobility Radeon
Software Environment: Currently Ubuntu 8.10 "Intrepid" 64 bit (same problem with other distributions and also with 32 bit)

Problem Description: The boot process crashes at a very early stage (before the root file system is mounted). I figured out that the problem vanishes as soon as I compile a kernel with the ACPI_PROCESSOR option turned off. Furthermore, the system also boots with ACPI_PROCESSOR enabled as soon as I turn off dual core support in the AMI BIOS, i.e. boot on a single core. I tried to fix a DSDT table issue by using iasl, but this did not help. I am not sure whether it is a pure kernel problem or a mixed kernel/DSDT or SSDT table problem. Since the crash occurs that early I cannot provide much helpful information. I could send you a photo of the screen in the crashed state and of course also the complete ACPI table information extracted with acpidump.

Steps to reproduce: Take an LG E500 V.APSCG and simply insert a live CD of Ubuntu 8.04/8.10 - or another distribution. Note: I did not try older distributions. The oldest kernel I tried was 2.6.24.
Will you please attach the output of acpidump? It would be great if you could also attach a screenshot of the system when it crashes. Will you please try the latest kernel (for example 2.6.26 or 2.6.27) and see whether the problem still exists? Will you please also try the following boot options with ACPI_PROCESSOR enabled and see whether the problem still exists? (Of course the processor driver should be compiled as built-in.)
a. idle=poll
b. processor.max_cstate=1
Created attachment 18771 [details] screen shot of system in crashed state as requested you can find here a screen shot of the system in crashed state.
Created attachment 18772 [details] acpidump output (binary) as requested here you find the binary output of acpidump: acpidump --binary
Hi yakui zhao,

I tried the boot options:

a) idle=poll indeed lets the system boot in dual core mode. Nevertheless it seems that significant parts of the ACPI power and thermal management do not work correctly.
b) When I use processor.max_cstate=1 the system still hangs. Nevertheless it hangs at a different stage with a different screen output. But again it is at a very early stage. If you wish I can also attach a screen shot of this hanging state.

A comment on option a) idle=poll: As I said, the system boots when I use this option. Nevertheless when I look into /proc/acpi/processor/P001/info I see this:

    pascal@laptignio:/proc/acpi/processor/P001$ cat info
    processor id: 0
    acpi id: 1
    bus mastering control: yes
    power management: no
    throttling control: no
    limit interface: no

Furthermore, in /proc/acpi/thermal_zone/THRM I see this:

    pascal@laptignio:/proc/acpi/thermal_zone/THRM$ cat cooling_mode
    <setting not supported>
    pascal@laptignio:/proc/acpi/thermal_zone/THRM$ cat trip_points
    critical (S5): 100 C

--> It seems to me that some essential information from the system's ACPI tables (I guess from the SSDTs) is not processed successfully.
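Editorial note: the check done by eye on the info file in the comment above can be sketched in a few lines of Python. This is only an illustration: the function name disabled_features is hypothetical, and the sample text is copied from the comment rather than read from /proc on a live system.

```python
# Sample copied from the comment above. On a live system this text would
# come from /proc/acpi/processor/P001/info (path as given in the report).
SAMPLE_INFO = """\
processor id: 0
acpi id: 1
bus mastering control: yes
power management: no
throttling control: no
limit interface: no
"""


def disabled_features(info_text):
    """Return the names of all entries whose value is 'no'.

    Hypothetical helper for illustration; it only splits each line at the
    first colon and compares the stripped value.
    """
    disabled = []
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip() == "no":
            disabled.append(key.strip())
    return disabled


if __name__ == "__main__":
    # Lists the features the kernel reports as unavailable with idle=poll.
    print(disabled_features(SAMPLE_INFO))
```

With the sample above, the three entries reported as "no" (power management, throttling control, limit interface) are the ones that point at the processor's ACPI information not being used.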
Sorry, I forgot to mention that I am using kernel 2.6.27.4.
Will you please attach the output of acpidump? (Not the binary format.) In addition, the output of the following commands is also required:
acpidump --addr 0x7ffce190 --length 0x1FA -o cpu0ist
acpidump --addr 0x7ffce420 --length 0x594 -o cpu0cst
Thanks.
Will you please try the boot option of "idle=nomwait" and see whether the system can be booted? Will you please attach the screenshot when the system hangs with the boot option of "processor.max_cstate=1"? Thanks.
As there is no working kernel, clear the regression flag.
Created attachment 18775 [details] screen shot of system in crashed state - processor.maxcstate=1
Created attachment 18776 [details] screen shot of system in crashed state - idle=nomwait
Created attachment 18777 [details] acpidump as requested (1) acpidump --addr 0x7ffce190 --length 0x1FA -o cpu0ist
Created attachment 18778 [details] acpidump as requested (2) acpidump --addr 0x7ffce420 --length 0x594 -o cpu0cst
Created attachment 18779 [details] output of dmesg after booting with idle=poll
Hi, the boot attempt with idle=nomwait failed. I attached a screen shot of this test. In addition I attached the requested screen shot of the processor.maxcstate=1 test. Furthermore I attached:
1) the two requested acpidump outputs
2) the output of dmesg - maybe this contains helpful information.
Greetings, Pascal
Just for grins, can you verify that booting with "thermal.off=1" makes no difference (or build with CONFIG_THERMAL=n and CONFIG_ACPI_THERMAL=n)? Also, can you verify that booting with just "maxcpus=1" works fine?
Hi, here are the test results:
1) Tested boot with thermal.off=1. Result: The system crashes (the screen looks the same as if I omit this option).
2) Tested boot with maxcpus=1. Result: The system boots.
Best regards, Pascal
Hi, I've just seen that the state of this bug is still NEEDINFO. Is there anything more you need from me at this stage? best regards, Pascal
Please attach the output of acpidump. Sorry that I did not update the bug status in time.
Hi, can you give me more detail about what additional output I should generate with acpidump? I have already attached the two acpidump outputs you have requested in your comment #6 (see below). If you need additional acpidump output, could you please specify the respective command line arguments for me? Thanks in advance, Pascal
Hi, Pascal
You attached two files in comments #11 and #12. The two files were obtained with the following commands:
acpidump --addr 0x7ffce190 --length 0x1FA -o cpu0ist
acpidump --addr 0x7ffce420 --length 0x594 -o cpu0cst
In fact we also expect the full output of acpidump, obtained with the following command:
./acpidump > acpidump.out
Sorry that I did not describe it very clearly. Thanks.
Created attachment 19003 [details] acpidump.out Output of the following command: acpidump > acpidump.out
Hi, Pascal
Thanks for the info. From the acpidump it seems that C2/C3 is supported on an MP system:
>C2 works on MP system : 1
But from the problem description the system can't be booted in MP mode if C-states are enabled. More serious is that the MP system still can't be booted even when only the C1 state is enabled. Will you please try the following boot options and see whether the system can be booted?
a. nolapic
b. nolapic_timer
c. idle=halt
Will you please also attach the output of /proc/cpuinfo? Thanks.
Hi, Venki Do you have time to look at this issue? thanks.
Hi, here we go with the latest experiment results: a) boot option: nolapic Result: System boots but detects and uses only one processor core. b) boot option: nolapic_timer Result: System crashes as without any boot option. c) boot option: idle=halt Result: System boots in dual core mode. Greetings, Pascal
Created attachment 19015 [details] content of /proc/cpuinfo The file contains the content of /proc/cpuinfo after booting the system with idle=halt
Looks like some nohz and idle interaction bug. Does "nohz=off" also make the system boot? And how about "highres=off"? Thanks.
Hi, here are the test results: 1) nohz=off --> The system boots using both CPU cores. 2) highres=off --> The system crashes in early boot. The crash screen looks slightly different with this boot option. I will attach a screen shot. Greetings Pascal
Created attachment 19097 [details] screen shot of system in crashed state - highres=off
Hi, Pascal
Sorry for the late response. Will you please try the boot option "nolapic_timer" on the latest kernel (2.6.29-rc2/rc3) and see whether the box can be booted?
From the problem description it seems that the box still can't be booted in the following case:
a. processor.max_cstate=1. In this case only C1 is used, and the local APIC timer is used when the CPU enters C1. It seems that the system can't exit the C1 state through the local APIC timer interrupt. (In fact the difference between "idle=poll" and "processor.max_cstate=1" is that with "idle=poll" polling is used when the CPU is idle.)
From the test in comment #27 it seems that the box can be booted with the boot option "nohz=off". In that case the system works in tick mode. After the processor module is loaded, the HPET/PIT timer is used instead of the local APIC timer, which means that the local APIC timer won't be used again. Maybe this issue is related to the local APIC timer. (The "nolapic_timer" option is not available for the 64-bit platform on kernel 2.6.27-7.)
Thanks.
Hi, Pascal Do you have an opportunity to try the boot option of "nolapic_timer" on the latest kernel and see whether the box can be booted normally? Will you please also try the boot option of "idle=nomwait processor.max_cstate=1" and see whether the box can be booted normally? Hi, Rui As there is no response for more than one month, the bug will be rejected. If the problem still exists, please try the boot option as suggested and then reopen the bug. Thanks.
ping Pascal.
Hi all, sorry for my late reply. I was (and am) very busy since I became a daddy - my wife and I got our first child, Paul. There was no time for any kernel testing in the meantime, but I learned how to change diapers quickly ;-) I hope I can run the tests you mentioned above. My biggest problem is that I am not too deep into testing kernels. Is there some sort of "howto" that I could consult to learn how to install rc kernels (which I do not get as a complete Ubuntu package)? Best regards, Pascal
Congrats, Pascal. :) Now you just need to try kernel 2.6.29. You can get the source code tarball linux-2.6.29.tar.bz2 at http://www.kernel.org/pub/linux/kernel/v2.6/
Hi, thank you :-) Finally I installed 2.6.29 and tested these options: 1) nolapic_timer 2) idle=nomwait processor.max_cstate=1 In both cases the system boots successfully. Without option 1) or 2) the kernel 2.6.29 also hangs as the older ones did. I hope this information is helpful for you. Greetings, Pascal
Hi, Pascal
Thanks for the test. Now it is very clear that the issue is related to the local APIC timer: if the boot option "nolapic_timer" is added, the system can be booted successfully. Do you mean that the box can be booted only when both boot options "idle=nomwait" and "processor.max_cstate=2" are added together? How about using only "idle=nomwait" or only "processor.max_cstate=1"? Thanks.
Hi, sorry for the confusion. I thought I was expected to test the two parameters explicitly in combination. Here we go with the individual tests: 1) idle=nomwait - system does *not* boot. 2) processor.max_cstate=1 - system does boot successfully. Just for me to understand: The problem we investigate here is only there when "core multi processing" is enabled in the BIOS. When I switch it off (single core mode) the system boots without any trouble. Is it nevertheless possible that it is related to the local APIC timer? Best regards, Pascal
Hi, I just saw that the status is still NEEDINFO. Did I miss anything? Do I need to do another test? Best regards, Pascal
Hi, Pascal
Sorry for the slow response. From comment #36 it seems that the box can be booted with "processor.max_cstate=1". But from comment #9 it seems that the box can't be booted with the same boot option. These two results are contradictory. Will you please double check it? It would be great if you could also double check the boot option "idle=halt". Thanks.
Hi, sorry for the confusion. Maybe I made a mistake in the test referred to in comment #9. As you can see in my comment I typed "maxcstate" instead of "max_cstate". Maybe I made the same mistake in the boot option... ok here we go with the double check: processor.max_cstate=1 - system boots fine idle=halt - system boots fine So the information I delivered in comment #9 is wrong. Best regards Pascal P.S.: added 2.6.28 and 2.6.29 to field "Kernel Version"
Hi, Pascal
Sorry for the late response, and thanks again for the confirmation. From the tests in comments #39/34 it seems that the box can be booted with the following options:
a. nolapic_timer
b. processor.max_cstate=1 (the processor can be woken up from C1 by the local APIC timer interrupt).
In fact when the boot option "nolapic_timer" is used, the local APIC timer is replaced by the broadcast timer, and the CPU can be woken up from C1 by the broadcast timer. But unfortunately it can't be woken up from C2/C3 by the broadcast timer. Will you please try the following boot options and see whether the box can be booted?
1. hpet=disable
2. acpi_skip_timer_override
Thanks.
Hi, here we go with the test results: 1) hpet=disable --> system boot fails. 2) acpi_skip_timer_override --> system boot fails. 3) hpet=disable + acpi_skip_timer_override --> system boot fails. The system does not boot with any of the parameters. I also tried the combination of both but no success. Best regards, Pascal
Hi, Pascal
Thanks for the confirmation. It seems that the box can be booted normally only when just C1 is used or in single-core mode. If the CPU can enter C2/C3, it can't be woken up from C2/C3 by the broadcast timer. I have no idea why the box can't be woken up from C2/C3 by the broadcast timer. Will you please attach the output of dmidecode, so that your box can be restricted to C1 only?
Hi, Venki
Any idea about this issue? Thanks.
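Editorial note: the boot-option results reported in this thread can be collected in one place, which makes the conclusion above easier to check. The sketch below is only a summary aid, not part of the original report. The names BootTest, RESULTS and booting_options are hypothetical, and the assignment of each test to "pre-2.6.29" or "2.6.29" is inferred from the order of the comments, since the kernel version is not stated for every test. The early processor.max_cstate=1 failure is left out because the reporter later retracted it as a probable typo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootTest:
    phase: str      # "pre-2.6.29" or "2.6.29" (inferred from thread order)
    options: str    # kernel command line options used for the test
    boots: bool     # True if the reporter said the system booted
    note: str = ""  # remark taken from the reporter's comment


PRE = "pre-2.6.29"
V29 = "2.6.29"

# Results as reported by Pascal in the comments above.
RESULTS = [
    BootTest(PRE, "idle=poll", True, "ACPI power/thermal info incomplete"),
    BootTest(PRE, "idle=nomwait", False),
    BootTest(PRE, "thermal.off=1", False),
    BootTest(PRE, "maxcpus=1", True),
    BootTest(PRE, "nolapic", True, "only one core detected"),
    BootTest(PRE, "nolapic_timer", False),
    BootTest(PRE, "idle=halt", True, "dual core"),
    BootTest(PRE, "nohz=off", True, "both cores"),
    BootTest(PRE, "highres=off", False),
    BootTest(V29, "(none)", False),
    BootTest(V29, "nolapic_timer", True),
    BootTest(V29, "idle=nomwait processor.max_cstate=1", True),
    BootTest(V29, "idle=nomwait", False),
    BootTest(V29, "processor.max_cstate=1", True),
    BootTest(V29, "idle=halt", True),
    BootTest(V29, "hpet=disable", False),
    BootTest(V29, "acpi_skip_timer_override", False),
    BootTest(V29, "hpet=disable acpi_skip_timer_override", False),
]


def booting_options(phase):
    """Return the option strings that let the system boot in the given phase."""
    return [t.options for t in RESULTS if t.phase == phase and t.boots]


if __name__ == "__main__":
    for phase in (PRE, V29):
        print(phase, "boots with:", ", ".join(booting_options(phase)))
```

Read this way, every option that boots either avoids deep C-states (idle=poll, idle=halt, processor.max_cstate=1), changes the timer setup (nohz=off, nolapic_timer on 2.6.29), or limits the system to one CPU (maxcpus=1, nolapic).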
Hi, sorry for the long time without any new information. I have news to report: the system works fine with SMP and C-states enabled when I use kernel 2.6.31. I tried the recently released Ubuntu 9.10 distribution and the kernel works out of the box with standard settings. It seems that the CPU power management (hibernate, ...) still does not work correctly. Nevertheless the system is running in SMP mode and I do not have to turn off ACPI. Best regards, Pascal
Thanks for the report. It is good news that the system works well on the latest upstream kernel. It would be great if we could identify the commit which fixes this issue. Anyway, the issue is fixed.
Hi, Rui
Can you close this bug now? Thanks.