Bug 15804

Summary: BUG: soft lockup - CPU#0 stuck
Product: Process Management
Reporter: Tony Mugan (tmugan)
Component: Other
Assignee: process_other
Status: CLOSED DOCUMENTED
Severity: high
CC: akpm, alan
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.34-020634rc4-generic
Subsystem:
Regression: No
Bisected commit-id:
Attachments: kern.log

Description Tony Mugan 2010-04-18 02:13:00 UTC
I have been testing Ubuntu 10.04 Lucid with all the latest updates and have periodically experienced a soft lockup.  This happens on multiple Ubuntu-released kernels.

Ubuntu recommends trying their packaged "mainline" kernels, so I have installed a couple to see whether the issue is related to particular kernel releases.

I am using 2.6.33-020633-generic now, which has not locked up after 30 minutes.

I had been using the later version 2.6.34-020634rc4-generic which caused lockups within about 5 minutes of a restart or cold boot. When locked up, the screen freezes and caps-lock will not register. If I wait for a few minutes, the lockup will end and everything starts to work again.

I am using an Asus M2V-MX motherboard.

I previously reported these issues, seen while running the Ubuntu kernels, on Ubuntu Launchpad. The reports below have diagnostic files attached.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/565202

https://bugs.launchpad.net/ubuntu/+source/linux-backports-modules-2.6.32/+bug/560676

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/560493

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/559781

I am happy to help with testing this, just let me know what you would like me to attempt.
Comment 1 Tony Mugan 2010-04-18 04:22:48 UTC
After leaving the machine idle for a couple of hours, I started to use it again and within 15 minutes had a soft lockup for about a minute with Ubuntu-packaged mainline kernel 2.6.33-020633-generic.

I am using only Evolution, Firefox and Terminator (Terminal replacement).
Comment 2 Andrew Morton 2010-04-19 22:20:52 UTC
Please add a copy of the kernel's soft-lockup diagnostic messages to this report.
Comment 3 Tony Mugan 2010-04-20 11:50:48 UTC
Andrew, 

Would be glad to.
Can you direct me to any documentation on how to get that?

Regards,

Tony.
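
One possible way to collect these messages, sketched under assumptions: the watchdog's headline line has the form quoted in this report's summary ("BUG: soft lockup - CPU#0 stuck"), and the kernel log is a plain-text file such as the attached kern.log. The helper name `extract_soft_lockups` and the default of 20 context lines are hypothetical choices for illustration, not part of any kernel tooling.

```python
import re

# Headline pattern taken from this report's summary line.
SOFT_LOCKUP = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck")


def extract_soft_lockups(lines, context=20):
    """Return one list of lines per soft-lockup headline found.

    Each list holds the headline plus up to `context` following lines,
    which is where the register dump and call trace normally appear.
    """
    lines = list(lines)
    reports = []
    for index, line in enumerate(lines):
        if SOFT_LOCKUP.search(line):
            reports.append(lines[index:index + context + 1])
    return reports


def print_soft_lockups(path, context=20):
    """Print every soft-lockup report found in the log file at `path`."""
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(path, errors="replace") as handle:
        for report in extract_soft_lockups(handle, context):
            print("".join(report), end="")
            print("-" * 60)
```

Calling `print_soft_lockups("/var/log/kern.log")` (the path is an assumption about where the distribution writes the kernel log) would print each report so it can be pasted or attached here.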
Comment 4 Tony Mugan 2010-04-20 12:06:29 UTC
Created attachment 26060
kern.log
Comment 5 Tony Mugan 2010-04-20 12:07:33 UTC
I noticed an odd line at startup in the attached kern.log. I am not sure whether it is relevant:

[    0.356714] pci 0000:00:02.0: address space collision: [mem 0xd0000000-0xdfffffff 64bit pref] conflicts with GART [mem 0xd0000000-0xd7ffffff]
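
The arithmetic behind that message can be checked with a short sketch. It only compares the two address ranges printed in the log line; the function name `overlap` is hypothetical, and this says nothing about whether the collision is related to the lockups.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Return the shared (start, end) of two inclusive ranges, or None."""
    # Inclusive ranges overlap when each one starts no later than the other ends.
    if a_start <= b_end and b_start <= a_end:
        return max(a_start, b_start), min(a_end, b_end)
    return None


# Ranges copied from the kern.log line above.
pci_window = (0xD0000000, 0xDFFFFFFF)  # pci 0000:00:02.0, 64bit pref
gart = (0xD0000000, 0xD7FFFFFF)        # GART aperture

shared = overlap(*pci_window, *gart)
shared_mib = (shared[1] - shared[0] + 1) // (1024 * 1024)
print(f"overlap: {shared[0]:#x}-{shared[1]:#x} ({shared_mib} MiB)")
```

With these values the GART range lies entirely inside the lower half of the 256 MiB PCI window, so the overlap is the full 128 MiB GART range.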
Comment 6 Tony Mugan 2010-04-20 18:10:01 UTC
I have recently replaced my motherboard going from an ASUS M2V-MX SE (which was running very well with Ubuntu 10.04 Lucid in the Alpha stages) to its predecessor ASUS M2V-MX.
This replacement motherboard has suffered periodic CPU soft lockups, noticeably so since Ubuntu kernel 2.6.32-21-generic.

I have now been running the following kernel (uname -a) for about 30 minutes without a lockup.
On the later kernel (or the "mainline" kernels up to vmlinuz-2.6.34-020634rc5-generic), this would certainly have locked up within 10 minutes of usage.

Linux orac 2.6.32-20-generic #30-Ubuntu SMP Mon Apr 12 15:20:57 UTC 2010 x86_64 GNU/Linux

I am keen to help with diagnosing this and am happy to learn what is required to assist in debugging the issue.
Comment 7 Tony Mugan 2010-04-26 12:15:04 UTC
I have disabled APIC and ACPI in the BIOS and have not had any lockups since.

The lockups were so regular that I am sure this is what was causing the issue.

If anyone wants more information, let me know.
Comment 8 Tony Mugan 2010-05-02 12:14:47 UTC
Several days later and still no lockups.  The BIOS changes I made have definitely resolved the issue.

Is there any benefit in further investigation on this?