Latest working kernel version: none
Earliest failing kernel version: since the tickless kernel feature
Distribution: Fedora, openSUSE, Ubuntu
Hardware Environment: Fujitsu Siemens Amilo Xi 2428, Intel Core2Duo T8100, 4GB RAM, Intel PM965 (Crestline-PM) + ICH8M, nVidia GF 8600 GS (G86M)
Software Environment: All
Problem Description: The notebook only works with nohz=off; otherwise I get random freezes, or short freezes that last until I move the mouse or type something. It seems to have something to do with the tickless kernel feature. If you search the forums you will find a lot of others having this problem on several notebooks. I get this error with both 32bit and 64bit kernel versions.
Will you please attach the output of acpidump?
Will you please try each of the following boot options and see whether the system works well?
a. processor.max_cstate=1 (the processor driver should be built into the kernel)
b. idle=poll
c. idle=nomwait
d. nolapic_timer
Thanks.
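When trying these options one at a time, it can help to confirm that the running kernel actually picked up the option. A minimal sketch, assuming a POSIX shell; the helper name has_boot_option is made up for illustration, and the sample command line is only an example string:

```shell
#!/bin/sh
# has_boot_option OPTION [CMDLINE]
# Succeeds if OPTION appears as a whole word on the kernel command line.
# CMDLINE defaults to the contents of /proc/cmdline; passing it explicitly
# (a hypothetical convenience) allows a dry run without rebooting.
has_boot_option() {
    opt=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # Pad with spaces so that only complete words match.
    case " $cmdline " in
        *" $opt "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with an explicit command line string:
has_boot_option nolapic_timer "ro quiet nolapic_timer" && echo "option is set"
```

Called with only the option name, the helper reads /proc/cmdline of the running system.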
Hello, I wasn't able to get the output of acpidump yet; I have to look up how to get it. I tried all the options and it seems that all of them made things a little better. While testing I had no freezes while booting and starting some basic operations. I tried each of the options twice and all seemed to work for the moment. Of course I have to do a long-term test to see how stable this is. This will take some time, and I will try to contact a few people with the same problem to test the same options. If one of the options above fixes the problem over a longer period without nohz=off, I will post again. Thanks and regards.
Thanks for the test. It seems that the system can be booted when the boot options mentioned in comment #1 are used. Please use the acpidump tool to get the output of acpidump. The latest dump tool can be found at http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/ Thanks.
A tickless kernel bug? It doesn't look like an ACPI bug from the current description. Re-assigning to the timers-other category.
Exactly what kernel version is this reported against? (uname -a) Do you still see it with 2.6.27.stable? How about with 2.6.28-rc?
Created attachment 19158 [details] Kernel failure message with nohz=off
Well, nothing new from me. I tested the new Fedora 10 live CD to see if some things have changed in kernel 2.6.27. Without nohz=off I got a lot of freezes, but even with it I got this message after a while (see attachment above).
Something new from me: On Fedora (2.6.27.7-134.fc10.i686) nohz=off causes freezes and very slow PC performance. Very strange, because before it solved my problems. I'm now testing nolapic_timer and it seems to work fine (so far). If new problems occur or this turns out to be a permanent fix, I will post again.
nolapic_timer really seems to fix it on 2.6.27.7-134.fc10.i686. Now I wonder what exactly this option does, and why nohz=off doesn't work anymore?
nolapic_timer does not fix it. It just disables code paths which expose the problem. Can you please upload complete boot logs for a boot with and without nolapic_timer on the kernel command line ? Thanks, tglx
hello, do you mean the log file which can be found in /var/log/boot.log ? because both seem to contain the same information, even with nolapic_timer. everything is "ok" there.
> hello, do you mean the log file which can be found in /var/log/boot.log ?
> because both seem to contain the same information, even with nolapic_timer.
> everything is "ok" there.
After boot, run:
# dmesg >boot.txt
and upload the file.
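Two such dmesg captures are easier to compare once the leading timestamps are stripped, since those differ on every boot. A rough sketch, assuming the usual "[ seconds.micros] " dmesg prefix; the function names are invented for illustration and the file names in the usage comment are only examples:

```shell
#!/bin/sh
# strip_dmesg_timestamps: drop the leading "[    0.123456] " prefix
# from each line read on stdin.
strip_dmesg_timestamps() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //'
}

# diff_boot_logs FILE_A FILE_B: unified diff of two dmesg captures
# with timestamps removed. Returns diff's exit status.
diff_boot_logs() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    strip_dmesg_timestamps < "$1" > "$tmp_a"
    strip_dmesg_timestamps < "$2" > "$tmp_b"
    diff -u "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return $status
}

# Usage (file names are examples):
#   diff_boot_logs boot-with.txt boot-without.txt
```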
Created attachment 19456 [details] Boot-Log with nolapic_timer
Created attachment 19457 [details] Boot-Log without nolapic_timer
Pay87, I believe I am seeing something similar on my hardware --- also a T8100-based notebook with the same chipset as yours, as far back as kernel 2.6.24 on Fedora 8. I filed under bug 12390; please let me know if my symptoms match yours and I'll mark my bug as a duplicate of this one.
*** Bug 12390 has been marked as a duplicate of this bug. ***
Hello, I think I see the same thing on the same processor and chipset. I reported it in the Red Hat Bugzilla[1] back in April, where I found that processor.max_cstate=1 seemed to stop the issue (I'd have to check again); someone else said it went away at processor.max_cstate=2. Anyone want any more logs/dumps? ;) 1. https://bugzilla.redhat.com/show_bug.cgi?id=443155
Ok, I checked the boot logs. The interesting differences:
--- boot-with.txt 2009-01-14 08:48:44.000000000 +0100
+++ boot-without.txt 2009-01-14 08:48:53.000000000 +0100
+Marking TSC unstable due to TSC halts in idle
+Clocksource tsc unstable (delta = -94756607 ns)
+CE: hpet increasing min_delta_ns to 15000 nsec
+CE: hpet increasing min_delta_ns to 22500 nsec
+CE: hpet increasing min_delta_ns to 33750 nsec
That means, that we don't go into deep power states when nolapic_timer is on the kernel command line.
Please try the following steps:
1) boot w/o the nolapic_timer option and wait until the freezes start to happen. When the freezes become longer, run dmesg >log.txt and upload the file.
2) boot with and w/o nolapic_timer option and provide the output of
# cat /proc/timer_list
and
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
and
# cat /proc/acpi/processor/CPU0/power
for each
3) boot with "hpet=disable" on the kernel command line
4) boot with "clocksource=acpi_pm" on the kernel command line
5) If possible can you try 2.6.28 ?
Thanks, tglx
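The three files requested in step 2 can be gathered in one go for each boot configuration. A minimal sketch, not an official tool: the function name collect_timer_info and its optional ROOT argument are assumptions made for illustration, /proc/timer_list usually needs root, and the /proc/acpi/processor path may not exist on every kernel, so unreadable files are only noted rather than treated as errors:

```shell
#!/bin/sh
# collect_timer_info OUTDIR [ROOT]
# Copies the timer/clocksource/C-state files into OUTDIR, one file each.
# ROOT (hypothetical, defaults to the real filesystem root) lets the
# function be tried against a fake directory tree.
collect_timer_info() {
    outdir=$1
    root=${2-}
    mkdir -p "$outdir" || return 1
    for f in \
        proc/timer_list \
        sys/devices/system/clocksource/clocksource0/current_clocksource \
        proc/acpi/processor/CPU0/power
    do
        src="$root/$f"
        # Turn the path into a flat file name, e.g. proc_timer_list.txt
        dst="$outdir/$(echo "$f" | tr '/' '_').txt"
        if [ -r "$src" ]; then
            cat "$src" > "$dst"
        else
            echo "not readable: $src" > "$dst"
        fi
    done
}

# Usage (directory names are examples), once per boot configuration:
#   collect_timer_info ./with-nolapic_timer
#   collect_timer_info ./without-nolapic_timer
```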
Created attachment 19794 [details] Output from normal boot plus points 2--4 in comment #18
tglx, I haven't uploaded a dmesg because there's nothing in it for me between freezes. In my case, they show up after around 10 seconds and I've observed them to last between 30--60 seconds, or until a hardware interrupt comes along (so, keypress, attach USB device, network activity, etc.). In order to reproduce this, I have to disable all networking, plus unplug all external and internal USB devices. Processes coming out of sleep don't seem to wake things up.
What I'm attaching comes from my default config, plus points 2--4 above. Rather than overload the attachment list, it's all in timer.tar.bz2 --- the filenames should be self-explanatory. Hope this is helpful!
> What I'm attaching comes from my default config, plus points 2--4
> above.
And the freezes happened in all 4 scenarios ? Thanks, tglx
> And the freezes happened in all 4 scenarios ?
Just noticed, that on 2.6.27 when you run a 64bit kernel you need "noapictimer" instead of "nolapic_timer" :( Thanks, tglx
Created attachment 19799 [details] dmesg from James Ettle's notebook (2.6.27.10-169.fc10, normal cmdline params)
Sorry, tglx, I forgot... Yes, the "pauses" happen in all four cases; this is with kernel-2.6.27.10-169 from Fedora 10 (2.6.28 has some other "resource sanity check" bug so I'm leaving that one alone for the time being). I'll get the noapictimer results for you soon. I've decided to attach dmesg for this kernel (normal boot options) since it might have some useful info for you anyway.
I tried noapictimer on 2.6.27. *As far as observed*, the bug did not manifest. The default clocksource was tsc, which is normally marked unstable; this upset a number of multimedia applications. I'll upload a new archive obsoleting the old one with the results using this boot option.
Created attachment 19819 [details] Info req'd in comment #18
> I tried noapictimer on 2.6.27. *As far as observed*, the bug did not
> manifest. The default clocksource was tsc, which is normally marked
> unstable;
Hmm. The power log says that the system is permanently in the C0 state. The TSC is not marked unstable in that case.
> this upset a number of multimedia applications.
Can you add "clocksource=acpi_pm" as well ? Are the multimedia apps happier then ?
Thanks, tglx
Another test would be to add "idle=nomwait" (no other options) to the kernel command line. Thanks, tglx
The bug still happens with "idle=nomwait" (as in Comment #26). Using "noapictimer clocksource=acpi_pm", I didn't see the processor entering anything below C1; I'm not sure the MM apps were *completely* happy, either --- I think PulseAudio in particular likes hpet.
I confirm this with a Clevo M720R mainboard, T9300 processor, PM965/GM965/GL960, 4G RAM, kernel 2.6.27-11, different distributions (Ubuntu, SUSE, both 32bit and 64bit). One additional observation: after the freezes the system time is delayed by 5 min or multiples of this. With nohz=off, the bug doesn't occur. idle=nomwait not tested yet. I hope someone finds a solution. Thanks, Arne
(In reply to comment #28)
> One additional observation, after the freezes the system time is delayed by 5
> min or multiples of this.
Just to add to the confusion, I've NOT seen any clockskew on my M720R...
Just tested with kernel 2.6.27-14, 64bit: with nohz=off there are no freezes and no clockskews; without nohz=off the problem persists. Thanks, Arne
Any more thoughts on this? Running with nohz seems to make the latest "glitchless" PulseAudio rather, er, glitchy... this is on 2.6.29.1-54.fc11.x86_64.
I use nolapic_timer because nohz=off caused some random crashes on my system.
Hi, I note this bug is still NEEDINFO. Please let me know what extra information is required and I'll try and provide it. Thanks!
No different with kernel-2.6.30-0.91.rc7.git1.fc12.x86_64.
Anyone else notice an improvement between 2.6.29 and 2.6.30.4, which I'm testing now? .29 had it severely, the system was basically unusable without nohz=off or continual keypresses; however .30.4 isn't perfect so I'm not going to cry "fixedforme" just yet...
Addendum: It still happens in 2.6.30.4, but it's somewhat rarer.
Still exhibited by 2.6.31-series kernels.
Just for reference, there's another bug (bug #14280) that is an Amilo Pro 2030, which seems to have an ACPI PM timer that changes speed when NO_HZ is enabled. May be related to this issue. Might need some sort of pciquirk that disables NOHZ on these boxes?
My notebook currently defaults to hpet as its clocksource (.31); a few kernels ago (.29? .30?) it used tsc; neither made a difference to the problem. It's as if the interrupts for whatever was supposed to be waking the machine up either weren't being received --- or weren't being sent in the first place. The trouble is that nohz=off prevents the machine from reaching the higher C-states. If anyone can point me to a deep diagnostic test to find out precisely what's (not) going on, I'd be quite willing to try it.
No improvement for me with 2.6.32; if anything, it's become worse.
I should add, my notebook seems to experience the freezes even with nohz=off.
Could this be the same as/related to bug 11166?
I have the same issue on my Lenovo IdeaPad S12. I have tried both the hpet and acpi_pm clocksources, on both kernel 2.6.32 (debian trunk) and 2.6.34rc2. I started investigating because the system time was wrong. See this for other details: http://forums.debian.net/viewtopic.php?f=5&t=50634
Basically, the system time lost 6 hours overnight. Time is correct using nohz=off, but I get some extra 500 wakeups per second. My kernel is compiled with CONFIG_HZ_250=y, which (I think) explains the number of 500. powertop reports:
Top causes for wakeups:
  81,2% (500,4) <kernel core> : hrtimer_start_range_ns (tick_sched_timer)
and also
Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 4,0%)       1,60 Ghz     0,0%
polling           8,2ms (96,0%)       1333 Mhz     0,0%
C1 mwait          0,0ms ( 0,0%)       1067 Mhz     0,0%
C2 mwait          0,0ms ( 0,0%)        800 Mhz   100,0%
C4 mwait          0,0ms ( 0,0%)
I don't get the polling part, but never using the low C-levels is not a good solution. I expect the battery life would be much longer if I started using the low power modes of the processor :-)
That was my two pence, since this bug is still marked as NEEDINFO. I would like to see it fixed, so what other info is needed?
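One way to read the 500 figure, assuming the periodic tick fires CONFIG_HZ times per second on each logical CPU and that the machine exposes two logical CPUs (the cpuinfo posted later in the thread is said to cover two processors). The function name is invented for illustration:

```shell
#!/bin/sh
# expected_tick_wakeups HZ NCPUS
# Periodic timer ticks per second summed over all logical CPUs,
# under the assumption of one tick stream per CPU (nohz=off).
expected_tick_wakeups() {
    echo $(( $1 * $2 ))
}

expected_tick_wakeups 250 2   # prints 500
```

That is close to the 500,4 wakeups per second powertop attributes to tick_sched_timer above.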
I am seeing the very same problem on a Fujitsu-Siemens Amilo Pi 2540 using the stock kernels in Fedora 12 as well as the SystemRescueCd live distribution (kernel versions 2.6.31 as well as 2.6.32). The problem goes away if I boot with the processor.max_cstate=1 option. On the other hand changing clock source doesn't seem to fix the problem. I can post more information if needed (hardware listing, dumps, etc...).
I have tried the processor.max_cstate=1 parameter. Three observations:
1) I don't get the systematic 500 wakeups per second. This is expected since I now use dynamic ticks.
2) powertop doesn't show the C states anymore. It says "< Detailed C-state information is not available.>" I would have expected to have both C0 and C1 shown.
3) Even with dynamic ticks, the time is correct and I get no random freezes.
In conclusion, processor.max_cstate=1 seems to work, but still, being able to use the lower C-states would be nice :-)
Cpu info, if relevant:
leon:~# cat /proc/cpuinfo
[snip]
model name : Intel(R) Atom(TM) CPU N270 @ 1.60GHz
[/snip]
This is for both processors.
Reading through some recent entries on the kernel mailing list about another notebook observed to do this, I tried adding the command-line option pci=nomsi and this seems to work. No strange pauses. (Doesn't help with bug 12788, but it means the machine now seems to be able to use C3 without nodding off.)
I have tried the pci=nomsi command-line option on my machine, and while it significantly reduces the number of pauses, it doesn't eliminate them entirely. My machine still freezes from time to time, though it now takes several seconds for this to happen; without the option it happens pretty much all the time.
Anyone tried adding acpi_skip_timer_override to the kernel cmdline? I have a *suspicion* this fixes things on mine, but I'm still investigating.
(In reply to comment #48)
> Anyone tried adding acpi_skip_timer_override to the kernel cmdline? I have a
> *suspicion* this fixes things on mine, but I'm still investigating.
On my machine (Amilo Pi 2540) using this flag greatly improves the situation but doesn't remove the pauses entirely. I'm using Fedora 14 (x86), kernel 2.6.35.
I switched from processor.max_cstate=1 to the suggested acpi_skip_timer_override. I changed a couple of days after James Ettle's comment, and have been using it since. It works. I don't have random freezes, and the C states work too. I have an issue with too many wake-ups-from-idle, but that might be something else (see below). Good suggestion, thanks.
PS. running debian testing with custom compiled kernel 2.6.34
--- powertop output, should it be relevant ---
leon:~# powertop -d
PowerTOP 1.11 (C) 2007, 2008 Intel Corporation
Collecting data for 15 seconds
Your CPU supports the following C-states : C1 C2 C4
Your BIOS reports the following C-states : C1 C2 C4
Cn                Avg residency
C0 (cpu running)        ( 5,9%)
C0                0,0ms ( 0,0%)
C1 mwait          9,7ms (47,5%)
C2 mwait          0,9ms (29,2%)
C4 mwait          0,4ms (17,3%)
P-states (frequencies)
  1,60 Ghz     2,1%
  1333 Mhz     0,1%
  1067 Mhz     0,1%
   800 Mhz    97,8%
Wakeups-from-idle per second : 859,0 interval: 15,0s
no ACPI power usage estimate available
Top causes for wakeups:
  38,7% ( 89,5) <kernel core> : hrtimer_start_range_ns (tick_sched_timer)
  33,2% ( 76,9) java : hrtimer_start_range_ns (hrtimer_wakeup)
(In reply to comment #50)
> I have an issue with too many wake-ups-from-idle, but that might be something
> else (see below).
I see this too, many more wake-ups with the timer_override option (around 2500 wups).
I am still getting freezes when booting. They don't always happen at the same point, but mostly when configuring the network. As soon as I touch the touchpad, booting resumes.
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.36-2.slh.3-aptosid-686 root=UUID=ab6a290e-a341-4776-9846-fd8787b9d3ad ro acpi_skip_timer_override quiet
The first lines on the screen when booting are:
Jan 5 20:49:48 xtrema kernel: [ 0.010999] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
Jan 5 20:49:48 xtrema kernel: [ 0.010999] ...trying to set up timer (IRQ0) through the 8259A ...
Jan 5 20:49:48 xtrema kernel: [ 0.010999] ..... (found apic 0 pin 0) ...
Jan 5 20:49:48 xtrema kernel: [ 0.021803] ....... works.
Does this have anything to do with it? Thomas
I have the exact same 4 lines in dmesg, and I don't experience the random freezes anymore, so most likely the answer is no. See my comment #50 above for system details.
Ok, now I have tried the latest kernel:
Jan 6 23:27:58 xtrema kernel: [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.37-0.slh.1-aptosid-686 root=UUID=ab6a290e-a341-4776-9846-fd8787b9d3ad ro acpi_skip_timer_override quiet
Still about the same. It hangs twice and resumes as soon as I hit the touchpad. How come acpi_skip_timer_override doesn't work for my box? We are all talking about the same laptop, right?
I work with a Lenovo S12 with an Intel Mobile 945GME graphics card and 2gb of memory. CPU is reported to be two Intel(R) Atom(TM) CPU N270 @ 1.60GHz
Another update: if I boot my machine with acpi_skip_timer_override the pauses last only until X starts. Once X has started the machine doesn't pause any more. However, with this option on I noticed the following error message at boot:
[ 0.012999] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
I will try the noapic option to see if the problem goes away. Still on Fedora 14 BTW, kernel 2.6.35.13-92.fc14.i686.PAE.
This bug is still present in kernel 2.6.39.2. It's also still marked NEEDINFO; what specific further information is required at this time?
James: First, thanks for your diligence here, and sorry this issue has gone on for so long. Could you provide a brief summary of which boot options resolve the issue (against the 2.6.39 kernel)? From the logs above it seems there is some uncertainty as to how well "nolapic_timer" and "acpi_skip_timer_override" help. I suspect we are going to have to quirk the specific system to try to address this, as it seems the board in your laptop acts oddly enough and I'm not sure we have a good method to detect the problem without causing issues on other systems. Could you also provide the output of dmidecode so we have the right machine id to wire the quirk up to?
I looked at it again, and the issue seems to have gone. This is the conclusion after one day. I run a custom compiled kernel, but I think the important part is that I now use version 2.6.37. I can supply the .config, if anyone is interested.
# dmesg | grep Kernel
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.37.6 root=UUID=2b09e3eb-7445-4a4b-9af7-6dd00c036061 ro no_console_suspend
and powertop shows me all the C-states and processor speeds. I have around 100-150 wakeups per second, divided among my applications, so that looks normal (as opposed to the 500+ wakeups/sec that I reported earlier). I have not tested with 2.6.39, since it breaks hibernation, but that is (most likely) unrelated.
(In reply to comment #58)
> James: First thanks for your diligence here and sorry this issue has gone on
> for so long.
Likewise, I apologise for the delay (thankfully this Bugzilla has now resumed normal service!). I have attached the machine's dmidecode below. I'm currently testing acpi_skip_timer_override on kernel 3.1.6 (Fedora build). This *seems* to resolve the pausing issue and now no longer introduces excessive wakeups on this machine. However, I'd like to test this for a few more days; if it doesn't work out, I'll be back to max_cstate=1.
Created attachment 72125 [details] dmidecode from a Clevo M720R
Using the acpi_skip_timer_override seems to work for me too on kernel 3.1.9 (Fedora build). I will attach the dmidecode data from my laptop too, in case it could be useful.
Created attachment 72133 [details] dmidecode from a Fujitsu-Siemens Amilo Pi 2540
(In reply to comment #60)
> this machine. However, I'd like to test this for a few more days; if it
> doesn't work out, I'll be back to max_cstate=1.
Yes, I spoke too soon. Only processor.max_cstate=1 consistently stops the freezing on mine at this stage.
I just reinstalled my laptop with the latest debian testing, and the problem remains. The acpi_skip_timer_override option is still necessary. Looking at powertop, hrtimer_wakeups are at 20-25/s. I guess that is fairly normal; it is not comparable to the 500 wakeups/s that I reported in comment 43 above.
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.1.0-1-686-pae root=/dev/mapper/leon-root ro acpi_skip_timer_override
Have we done parallel investigations? https://bugzilla.novell.com/show_bug.cgi?id=579932 If you think we are experiencing the same problem, I think my fellow bughunters would like to join forces with you.
My test of ancient distros (as interpreted by Nik Swiridow at openSUSE) indicated that random hangs appeared with the introduction of dyn-ticks: https://bugzilla.novell.com/show_bug.cgi?id=579932#c82
In other words, no bug was ever introduced per se; dyn-ticks just never worked (at least on my laptop). Well, it works for the vast majority of timer interrupts, it's just that some go unnoticed.
Note that some kernels hang vastly easier than others, making this very hard to test. In my tests, I used a short-sleeping task (rt-benchmark) to provoke hangs (audio playback is also effective). I can reliably hang some kernels (e.g. 2.6.35) quicker than humanly observable (instant indefinite kernel hang without root = LOL), whereas with 2.6.32, you would never notice anything wrong during normal desktop usage (but it eventually hung a few hundred seconds on the bench). Linux 3.1 feels kinda-good, but not like 2.6.32.
My laptop:
  Multicom Compal JFL92+
  Intel Core 2 Duo T8100
  Phoenix BIOS v1.16
Elmar Stellnberger is having random hangs with this:
  Fujitsu Siemens Amilo Xi 2550
  Intel Core 2 Duo T9300
  Phoenix BIOS v1.15
Aaron Burgemeister only has hangs during boot, maybe related to bug 15289. His hardware:
  HP Pavilion dv6700 Notebook PC
  AMD Turion(tm) 64 X2 Mobile Technology TL-60
  Hewlett-Packard BIOS version F.25 (released 2007-11-29)
What BIOSes do people have here? Has anyone tried updating their BIOS? (I didn't manage it because installing DOS was too difficult)
I'm on the latest BIOS from Fujitsu (1.15c) but it didn't solve the problem (and the "MP-BIOS bug: 8254 timer not connected to IO-APIC" message still appears, though I don't know if the BIOS bug is the cause of the freezes). acpi_skip_timer_override mostly solves it but I can still run into the odd freeze. The only robust way to prevent the freezes is to use either nohz=off or processor.max_cstate=1. Now on kernel 3.3.6-3.fc16.i686.PAE (Fedora 16).
Created attachment 73388 [details] dmidecode from a Multicom Compal JFL92+
As requested, I have included my BIOS info from dmidecode below. As I said in #65 above, it is still necessary for me to use acpi_skip_timer_override. I recently checked Lenovo's homepage, and they did not have a BIOS update available. Just for completeness, I still get a kernel error saying "MP-BIOS bug: 8254 timer not connected to IO-APIC". Since I am not into the details of how the kernel handles timers, I don't know what kind of info might be relevant. Please suggest stuff to post.
---
BIOS Information
  Vendor: LENOVO
  Version: 19CN21WW
  Release Date: 07/17/2009
  Address: 0xE71C0
  Runtime Size: 101952 bytes
  ROM Size: 1024 kB
  Characteristics:
    PCI is supported
    PC Card (PCMCIA) is supported
    PNP is supported
    BIOS is upgradeable
    BIOS shadowing is allowed
    ESCD support is available
    Boot from CD is supported
    ACPI is supported
    USB legacy is supported
    BIOS boot specification is supported
    Targeted content distribution is supported
  BIOS Revision: 1.12
  Firmware Revision: 3.30
Confirmed for kernel 3.7.6-1.2-desktop (FS Amilo Xi-2550).
Note that this is NOT a BIOS issue. "... C2 is the 2nd idle state. The external I/O Controller Hub blocks interrupts to the processor. And so on with C3, C4, etc. I'll discuss this further down in this paper. By the way, there is nothing preventing the OS from busy waiting in its idle state, and thus keeping the processor in C0, as did older operating systems. ... " http://software.intel.com/en-us/blogs/2008/03/27/update-c-states-c-states-and-even-more-c-states/
... you will need to busy wait with at least one core on any Intel Core 2 Duo system if there are pending timers. "C1 is the first idle state. The clock running to the processor is gated, i.e. the clock is prevented from reaching the core, effectively shutting it down in an operational sense. " ... or perhaps use the APIC timer to wake up at a coarser granularity.
Consistent with others here, I independently concluded that:
* processor.max_cstate=1 works
* processor.max_cstate=2 does not
(In reply to comment #72)
> C2 is the 2nd idle state. The external I/O Controller Hub blocks
> interrupts to the processor.
Nice finding, Elmar! HPET uses interrupts (according to Wikipedia), so based on this info, HPET should not work in C2. But then I don't get why other timers would work in C2 either… Except we know there must be different levels of interrupts or something, since the kind of interrupt coming from user interaction works.
I doubt busy waiting (C0) is necessary; experience says C1 works. I would say:
– At least one core of an Intel Core 2 Duo needs to be in CC1 or CC0 whenever HPET is the only timer with a pending interrupt. Otherwise, the processor sleeps indefinitely.
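To see which C-states a CPU actually enters under a given max_cstate setting, the cpuidle sysfs entries can be listed. A small sketch, assuming the kernel exposes /sys/devices/system/cpu/cpu0/cpuidle; the function name list_cstates and its optional directory argument are made up for illustration:

```shell
#!/bin/sh
# list_cstates [CPUIDLE_DIR]
# Prints "<state name> <usage count>" for each idle state of one CPU.
# CPUIDLE_DIR defaults to cpu0's cpuidle directory; the override is a
# hypothetical convenience for trying the function on a copy of the tree.
list_cstates() {
    base=${1-/sys/devices/system/cpu/cpu0/cpuidle}
    for d in "$base"/state*; do
        # Skip entries without readable name/usage files
        # (also covers the case where the glob matched nothing).
        [ -r "$d/name" ] && [ -r "$d/usage" ] || continue
        printf '%s %s\n' "$(cat "$d/name")" "$(cat "$d/usage")"
    done
}

# Usage: run once, wait a while, run again, and compare the counts.
#   list_cstates
```

If the count for a deeper state keeps growing while the freezes occur, that would fit the observation that limiting the machine to C1 avoids them.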
I believe I've also been encountering the same problem with dynticks. I've got a Gigabyte AMD motherboard GA-MA78GPM with (if I'm reading the manual right) an AMD 780G chipset. I recently lost my old kernel compilation history in a hard drive failure but I remember struggling with dynticks and another related feature (maybe having to do with ACPI, but I can't swear to it now). Dynticks has never worked for me since the feature was released. Since I compile my own kernels I just disabled both of them and get on with my life. Now recently it became an issue for me because (due to the aforementioned hard drive failure) I was booted into a rescue CD that had a kernel with dynticks enabled. In order to get stuff done I had to constantly move the mouse or tap the Shift key to generate interrupts to get things to actually finish. I see some kernel parameters to try at boot time from this thread. In the next few days I'll give them a try and see what I can find, if I can confirm that I'm seeing the very same issue here. I see this bug has been quiet for over a year. I'm hoping we can finally get this thing fixed, since now it appears that it might affect a lot of unsuspecting users who don't compile their own kernels. As I said, I compile my own kernels so I can try configurations and test patches and stuff. I've become motivated to get this thing squashed. :D
This may be old news, but net searching brought me back around to bug 13053. Apparently that bug was fixed for the OP with a BIOS update.
As for me, I have experienced this bug on multiple machines, all of them running the latest BIOS. However, I have run out of time and resources and can no longer continue my testing effort on this bug. Nonetheless, it is possible to live with the bug when certain command line options are used. This bug has a long history. You may also find some interesting material at: https://bugzilla.novell.com/show_bug.cgi?id=579932.