Bug 11785
Summary: | boot hang - unless "nolapic" - Asus M3N, Asus M300N | ||
---|---|---|---|
Product: | ACPI | Reporter: | Tony White (tonywhite100) |
Component: | Config-Other | Assignee: | acpi_acpica-core (acpi_acpica-core) |
Status: | CLOSED DUPLICATE | ||
Severity: | high | CC: | acpi-bugzilla |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.23.1, 2.6.25.5, 2.6.27.1 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | Output from acpidump in a text file; cpuinfo; dmidecode output; acpi=debug dmesg; info log; warnings log; Newest bios version dmidecode output | ||
Description
Tony White
2008-10-18 19:17:43 UTC
Will you please attach the output of acpidump? Will you please capture the screenshot when the system hangs without the "nolapic" boot option? Thanks.

Created attachment 18388 [details]
Output from acpidump in a text file
As requested, here's the output from acpidump in a text file.
Unfortunately, I can't screenshot anything useful. From grub I get a black screen with no text and the system hangs. The system stays that way until the power button is held down to turn the system off. I have attached the output of acpidump as requested. If you can provide me with some sort of guide to the information you need, I can build a debug kernel and boot it if that would provide data that would enable you to fix this. Please advise. Thank you for responding to my bug report, Tony.

I should also mention I have now compiled 2.6.27.2 and this problem is the same.

Can you please try some old kernels, like 2.6.26/2.6.24...? We had a similar issue on an Asus laptop several years ago, and the issue should already be fixed.

I've been experiencing this problem for over a year on the same machine without actually knowing how to report it to you guys. I have had the same problem with kernel-2.6.25.5 in openSUSE 11 and also kernel-2.6.23.1 in Fedora 8. It's not something I have been checking for, because once you set nolapic in grub, the option gets carried over when a new kernel is installed. The system experiences this bug every time I boot without nolapic, without fail, with every kernel that I have booted on it and using every distribution I have tried (a few). Would you like any more info? I'd really like to help sort this, if possible, please.

Please paste the output from
$ cat /proc/cpuinfo
(I expect it will show the apic bit set.)

This is a uni-processor box that has no IOAPIC. I'm curious why you're running a CONFIG_SMP kernel on it. It is possible that during its day, only uni-processor kernels would be installed on such a box (though today, some distros run SMP on everything).

It used to be that Linux enabled the lapic by default, even if the BIOS disabled it. We fixed a lot of systems when we simply did what the BIOS told us to do and ignored a disabled LAPIC. Maybe this system has a BIOS that should disable the lapic, but doesn't?
Please attach the output from dmidecode.

detect_init_APIC() should be smart enough, and I see no evidence of "Local APIC disabled by BIOS -- reenabling." in your dmesg...

BTW, have you tried "nolapic_timer"? This would use the lapic, but without its timer capability.

apic=debug may also give us a clue.

Thanks Len, I will do as you have requested, but I have uncovered something pretty horrible in all this mess, which is pretty frustrating because I want Linux running on this machine. I went to the Asus website and found that there were bios updates to be had. Great, I thought, let's try that. Flashed the bios with the most recent rom. Great. Booted my copy of Mandriva 2009. Massive slowdown. It took about 5 minutes to get to X. Vanilla 2.6.27.3, and everything was slow, particularly udev. Re-installed and tried again; same thing, slow, right from boot. Both with and without nolapic resulted in a huge lag during anything. Tried other distributions, tried with and without nolapic. Still the same. I tried loading the bios' default settings also. Still the same. Basically, the current bios version for this machine and its previous revision both cause the kernel to run very slowly. I tried the version before current and it performed in the same way. The original bios version is not slow and just needs nolapic or it will not boot. I was prudent enough to dump the bios beforehand and have now restored the original rom to the bios, which works perfectly as before but needs nolapic. So you guys need to know that this machine, the m3n with its latest bios version installed, is not working properly at all. It will boot without nolapic but it will run really slowly. I don't know what to do with it. If it wasn't such a useful tool, it would have gone out of the window.

Len, you are right that this machine has one processor with one core. If local apic is to do with smp, then is it therefore correct that local apic is not required? If so, yes, the version of the bios that runs with nolapic would seem to need it switched off, or not turned on by the kernel, to avoid a crash. As far as CONFIG_SMP goes, it's a make oldconfig on the kernel that ships with Mandriva 2009, so I guess they turn lots of stuff on, and I don't have a great idea what to turn on, so I went for what was working already. I'll add attachments and stick with the old version of the bios that needs nolapic for now. Do you think that this is something I should talk to Asus about? It's an old(ish) laptop, so I doubt that they will be that interested, because Windows runs fine with all the bios revisions. Could this possibly be anything to do with a certain Mr Gates and a rubbish secret acpi implementation that is deliberately designed to obstruct Linux, I wonder? What would need to be done to attempt to fix the latest bios revision slowdown?

Created attachment 18521 [details]
cpuinfo
cat /proc/cpuinfo
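For anyone repeating the check requested above, here is a minimal Python sketch of looking for the apic bit in cpuinfo output. The function name cpu_has_flag is made up for illustration; it only assumes the usual "flags : ..." line layout of /proc/cpuinfo on x86.

```python
def cpu_has_flag(cpuinfo_text: str, flag: str = "apic") -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists `flag`.

    Hypothetical helper for illustration; it assumes the x86 layout where
    CPU feature bits appear on a line of the form 'flags : fpu vme ...'.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Compare whole tokens so that e.g. 'x2apic' does not count as 'apic'.
            if flag in value.split():
                return True
    return False
```

On a running Linux system it could be called as cpu_has_flag(open("/proc/cpuinfo").read()). Matching whole tokens rather than substrings is deliberate, since several other flag names contain "apic".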
Created attachment 18522 [details]
dmidecode output
nolapic_timer resulted in the same problem. No boot. acpi=debug returned nothing; screen still blank. Should I create another bug report about the updated bios?

noapic lets the system boot but then it freezes. The last line before it freezes is: Initializing Device/Processor/Thermal objects by executing _INI methods...

Will you please add the boot option of "apic=debug nolapic" and attach the output of dmesg? Thanks.

OK, quick(ish) update: I have been able to build a kernel that will boot and run smoothly with both bios versions. The only thing I have yet to resolve is an alsa issue, which is not relevant to this bug at all and which I may be able to solve with an alsa recompile. I have been testing with linux-2.6.27.4 and have had many failures with this machine, but right now I am in a workable environment using a build that seems good.

Firstly, the first bios version, which is the original bios version that came preloaded onto the machine and which this bug is all about. That bios version will only boot and not hang if I do not enable "Local APIC Support on Uniprocessors" in a build. However, I do not think that this setting is required at all. The kernel runs very smoothly without it set and does not fail to boot. I know the kernel should not fail to boot with this setting turned on, but I would like to know if it is needed at all. I thought lapic was smp related? It seems a strange default option to me, but I know little more than that it does not work on this m3n.

Now, having created a working build, I then attempted to solve the problem of the massive slowdown with the latest bios revision installed. With Local APIC Support on Uniprocessors built into the build, the system boots and runs very slowly using this bios revision, so I disabled it. When I turn on highmem support (4gb), the system hangs at udev a bit and then freezes at udev events. That's maybe not acpi, but we don't know yet. I should not need to use highmem. The system has 1gb, but the kernel spews out a message in the dmesg to turn it on. It's not going on in its current incarnation. It doesn't play well here.

So to sum up so far, I have a working build without Local APIC Support on Uniprocessors and Highmem Support built into the kernel. Please note that I have been using pure vanilla and not patch kludging anything. To follow this bug up with some more useful and meaningful data, I will build two kernels with full debug support, boot with all the debug turned on and attach the dmesg.log(s) here: one that works and another one with both highmem (4gb) + Local APIC Support on Uniprocessors built in. I will test both bios versions, so 4 dmesg.log attachments. Maybe then it might be clear what's happening here.

OK. Here's, hopefully, some meaningful output. This data is against 2.6.27.4 now. I still have the same problem. The logs posted here are against the original bios that shipped with the machine, not the updated bios version. The latest bios version provides even more complications, and I will add a new report for that against 2.6.27.4 because I believe it to be slightly different from this issue. I hope that this information proves useful in some way. Please see attached.

Created attachment 18669 [details]
acpi=debug dmesg
acpi=debug dmesg
Created attachment 18670 [details]
info log
info log
Created attachment 18671 [details]
warnings log
warnings log
These messages from comment #18 suggest your slow-down may be related to MTRR bogosity:

Nov 3 21:40:22 localhost kernel: WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 7MB of RAM.
Nov 3 21:40:22 localhost kernel: ------------[ cut here ]------------
Nov 3 21:40:22 localhost kernel: WARNING: at arch/x86/kernel/cpu/mtrr/main.c:1558 mtrr_trim_uncached_memory+0x366/0x381()
Nov 3 21:40:22 localhost kernel: Modules linked in:
Nov 3 21:40:22 localhost kernel: Pid: 0, comm: swapper Not tainted 2.6.27-desktop586-0.rc8.2mnb #1
Nov 3 21:40:22 localhost kernel: [<c0382b32>] ? printk+0x18/0x1e
Nov 3 21:40:22 localhost kernel: [<c0131084>] warn_on_slowpath+0x54/0x80
Nov 3 21:40:22 localhost kernel: [<c0385323>] ? _spin_unlock_irqrestore+0x23/0x40
Nov 3 21:40:22 localhost kernel: [<c0131919>] ? release_console_sem+0x1b9/0x1d0
Nov 3 21:40:22 localhost kernel: [<c0131cf8>] ? vprintk+0x188/0x3f0
Nov 3 21:40:22 localhost kernel: [<c04c36f0>] ? e820_update_range_map+0x1bb/0x236
Nov 3 21:40:22 localhost kernel: [<c04c6daf>] mtrr_trim_uncached_memory+0x366/0x381
Nov 3 21:40:22 localhost kernel: [<c04c1746>] setup_arch+0x50c/0xaf6
Nov 3 21:40:22 localhost kernel: [<c0131919>] ? release_console_sem+0x1b9/0x1d0
Nov 3 21:40:22 localhost kernel: [<c014aa5a>] ? down_trylock+0x2a/0x40
Nov 3 21:40:22 localhost kernel: [<c0131cf8>] ? vprintk+0x188/0x3f0
Nov 3 21:40:22 localhost kernel: [<c04c405f>] ? __reserve_early+0x98/0x149
Nov 3 21:40:22 localhost kernel: [<c04ba4d8>] start_kernel+0x63/0x354
Nov 3 21:40:22 localhost kernel: [<c04ba10a>] ? reserve_ebda_region+0x69/0x7f
Nov 3 21:40:22 localhost kernel: [<c04ba099>] __init_begin+0x99/0xa1

re: nolapic needed
At this point, it seems that adding a DMI entry to disable the lapic on this box is the way to go. We'll need the dmidecode from the 2nd BIOS as well as the one above.

"WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 7MB of RAM." doesn't appear anymore with the bios update, so it's Asus' problem and not the kernel's, but for anyone with this old bios version, it might be nice if it did just work. I'm a little more concerned about the new bugs found with the new bios though, as I have stopped using this old bios version because it is outdated by several years, but it is the bios that the machine was shipped with. dmidecode from the newer bios attached, as requested.

Created attachment 18874 [details]
Newest bios version dmidecode output
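A DMI entry of the kind suggested above would normally be keyed on identification strings such as the system vendor and product name, which is presumably why dmidecode output from both BIOS versions was requested. The Python sketch below pulls those fields out of dmidecode text so the two dumps can be compared. The helper name parse_dmi_section is made up, and it assumes the common dmidecode layout of an unindented section title followed by indented "Key: value" lines.

```python
def parse_dmi_section(dmidecode_text: str, section: str) -> dict:
    """Collect 'Key: value' fields from one named section of dmidecode output.

    Hypothetical helper; assumes sections start with an unindented title line
    (e.g. 'System Information') and end at the next blank or unindented line.
    """
    fields = {}
    in_section = False
    for line in dmidecode_text.splitlines():
        if not line.strip():
            in_section = False          # a blank line closes the current section
            continue
        if not line[0].isspace():
            # Unindented lines are handle headers or section titles.
            in_section = (line.strip() == section)
            continue
        if in_section and ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields
```

Running it on both dumps with section set to "System Information" or "BIOS Information" and comparing the resulting dictionaries would show which strings stay the same across BIOS revisions and which change with the update.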
I believe this bug goes back to this one, and is caused by a memory caching problem: http://bugzilla.kernel.org/show_bug.cgi?id=6139

Thanks Mark, yes! That is exactly it. Exactly the same problem: lagging at boot and needing highmem turned off in the kernel configuration. So the memory caching problem is in the acpi bios and it's not the kernel's fault? I will try specifying mem= to the kernel to see if that makes a difference.

Hi Tony, any update?

I can't test this old bios version any more. It's the old one, but I assume it's the same memory problem. Please see: http://bugzilla.kernel.org/show_bug.cgi?id=11953

*** This bug has been marked as a duplicate of bug 11953 ***

Tony, to answer your question... No, the Local APIC isn't extremely useful on a uni-processor. It does provide an additional timer, and that timer is efficient, but that timer tends to stop when the system is idle, making its use somewhat problematic on the old uniprocessor laptops.

As you tested that "nolapic_timer" wasn't able to replace "nolapic", the issue with the original BIOS was not related to that.

One other thing that would have been useful to try on that old BIOS would be nmi_watchdog=0, just in case your config had enabled it.

But I see from bug 11953 that you are now building and booting with LAPIC support on the latest BIOS, so this bug can be closed.
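Since nolapic tends to get carried over silently from one kernel install to the next, it can help to confirm which of the options discussed in this report a given boot actually used. The Python sketch below does that for a kernel command line string. The APIC_OPTIONS tuple and the function name are illustrative only, and the list covers just the options named in this thread.

```python
# Boot options mentioned in this report; not an exhaustive list.
APIC_OPTIONS = ("nolapic", "nolapic_timer", "noapic", "apic=debug", "nmi_watchdog=0")


def apic_options_in(cmdline: str) -> list:
    """Return the options from APIC_OPTIONS present in a kernel command line.

    Hypothetical helper; matches whole whitespace-separated tokens, so
    'nolapic_timer' is not mistaken for 'nolapic'.
    """
    tokens = cmdline.split()
    return [opt for opt in APIC_OPTIONS if opt in tokens]
```

On a running Linux system the command line can be read from /proc/cmdline, for example apic_options_in(open("/proc/cmdline").read()).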