Latest working kernel version: 2.6.22.19
Earliest failing kernel version: 2.6.25 (from debian packages)
Distribution: Debian GNU/Linux etch+testing+unstable
Hardware Environment: Lenovo Thinkpad (Z61m 9453-A11, Core 2 Duo, Intel ICH7 chipset, ATI/AMD Radeon Mobility X1400 graphics)
Software Environment:
Problem Description: System freezes hard after a random amount of time

Steps to reproduce:
* Start system under any kernel from the 2.6.25 series or higher (I had similar freezes with anything after 2.6.22.19, but I haven't verified that those were related)
* Wait

The crash is apparently unrelated to any installed software. It happens with or without DRM modules loaded, with or without X running, with or without user interaction. This is the same as http://kerneltrap.org/node/16521 (which I will reproduce here).

The freezes result in a complete lockup of the system. No output is generated on the console, in syslog, or in messages.

* Magic SysRq is inoperable.
* I tried a lot of options in kernel hacking, including lock debugging. That only shortened the time to freeze. NMI watchdog produces output.
* I built a minimal kernel with all but the essential drivers disabled, so I rule out issues with sound, network, PCCard, DRI/DRM, and others.
* It happens with a stock Debian kernel (2.6.25, built for 486 arch) as well as with custom-built kernels.
* I tried building with both GCC 4.3 and 4.2.
* The systems run perfectly fine with older kernels (2.6.21, 2.6.22 series), as well as Windows. memtest86+ doesn't find any issues.
* "noacpi" is not an option since the laptop won't even boot with that. I tried disabling stuff like MSI(-X), IRQ balancing, tick-free kernel, all to no avail.
* 2.6.26.2 runs fine on a non-SMP AMD system. Affected systems are dual-core Intels. Setting the "nosmp" option doesn't help.

I have talked to someone else who is stuck on 2.6.21 kernels due to mystery freezes as well. The codepath in the stack trace below also comes up in a lot of reports, so maybe this should even be blocking until it is resolved.

The output below is largely the same on all tested kernels, except for precise offsets.

--- NMI watchdog output (stock 2.6.26.2 kernel) ---
Pid: 0, comm: swapper Not tainted (2.6.26.2-debug #2)
EIP: 0060:[<c0117210>] EFLAGS: 00000097 CPU: 0
EIP is at hpet_rtc_interrupt+0x2e0/0x320
EAX: 00000000 EBX: 00000002 ECX: 00000046 EDX: 00000002
ESI: ffffc8ab EDI: c03f1edc EBP: c03f1ee8 ESP: c03f1e9c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c03f0000 task=c03c9300 task.ti=c03f0000)
Stack: 03aa5b2e 00000000 f7bc7c00 f8800128 00000000 a61408d3 0061fd6e 00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       f7b87f80 00000000 00000000 c03f1f00 c0159d81 00000000 c03e7080 f7b87f80
Call Trace:
 [<c0159d81>] ? handle_IRQ_event+0x31/0x60
 [<c015af65>] ? handle_edge_irq+0xb5/0x150
 [<c0106c50>] ? do_IRQ+0x40/0x80
 [<c0104783>] ? common_interrupt+0x23/0x28
 [<c013007b>] ? del_timer_sync+0x1b/0x20
 [<f8858058>] ? acpi_idle_enter_bm+0x2c2/0x344 [processor]
 [<c013f6c6>] ? pm_qos_requirement+0x26/0x30
 [<c0298891>] ? cpuidle_idle_call+0x81/0xc0
 [<c0298810>] ? cpuidle_idle_call+0x0/0xc0
 [<c0102c82>] ? cpu_idle+0x62/0xe0
 [<c0319f6e>] ? rest_init+0x4e/0x60
 =======================
Code: 80 8d 04 46 89 45 d8 89 f8 83 e7 0f c1 f8 04 8d 04 80 8d 04 47 89 45 dc 8b 45 cc 48 89 45 e0 e9 70 fd ff ff 8d b4 26 00 00 00 00 <f3> 90 a1 80 6b 3e c0 29 f0 83 f8 04 76 f2 e9 d2 fe ff ff 90 8d
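
For anyone who wants to check whether their own NMI watchdog output shows the same codepath, here is a rough Python sketch that reduces a trace to its symbol names and ignores addresses and offsets, since those differ between builds. The helper names `parse_frames` and `same_codepath` are made up for this illustration, and the regular expression assumes the 2.6.2x "[<addr>] ? symbol+0xoff/0xsize [module]" frame format shown above; other kernel versions may print frames differently.

```python
import re

# Assumed frame format, as in the trace above:
#   [<c0159d81>] ? handle_IRQ_event+0x31/0x60
#   [<f8858058>] ? acpi_idle_enter_bm+0x2c2/0x344 [processor]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?"          # address, optional "?" marker
    r"(?P<sym>[A-Za-z0-9_.]+)"               # symbol name
    r"\+0x[0-9a-f]+/0x[0-9a-f]+"             # offset/size (ignored)
    r"(?:[ \t]+\[(?P<mod>\w+)\])?"           # optional module name
)

def parse_frames(text):
    """Return a list of (symbol, module-or-None) for each frame found."""
    return [(m.group("sym"), m.group("mod")) for m in FRAME_RE.finditer(text)]

def same_codepath(trace_a, trace_b):
    """True if both traces list the same symbols in the same order."""
    return parse_frames(trace_a) == parse_frames(trace_b)
```

This only compares the "Call Trace:" frames; the "EIP is at ..." line has no bracketed address and is not matched, so it would need to be compared separately.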
Created attachment 17434 [details] Configuration of a kernel that freezes
Created attachment 17435 [details] acpidump output (hexdump)
Seems to be a lot better in 2.6.27-rc5. The same computer has been running with the "crash config" for almost 48 hours now, with extended periods of both high load and idle time.
I believe we may have stumbled across the same bug. The last version that worked on one of my systems is 2.6.22. 2.6.24 locks up hard frequently, and 2.6.27, which I have been testing, also locks up hard. My hardware is a Dell Inspiron 2600 running Ubuntu 8.04.1. I have reported this as an Ubuntu bug, but I also tested a 2.6.27-rc6 kernel compiled from git and the problem still exists. One of the Ubuntu developers recommended that I run a git bisect to find the regression, so I am starting that process.
This looks like the same problem as bug #11142; see the link in comment #11 there for a fix.
*** This bug has been marked as a duplicate of bug 11142 ***