Bug 11422 - Kernel freezes hard on Intel systems
Summary: Kernel freezes hard on Intel systems
Status: CLOSED DUPLICATE of bug 11142
Alias: None
Product: Power Management
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: power-management_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-25 04:27 UTC by Nicos Gollan
Modified: 2011-03-03 01:10 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.27-rc4
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Configuration of a kernel that freezes (62.74 KB, text/plain)
2008-08-25 04:28 UTC, Nicos Gollan
acpidump output (hexdump) (255.79 KB, text/plain)
2008-08-25 04:30 UTC, Nicos Gollan

Description Nicos Gollan 2008-08-25 04:27:11 UTC
Latest working kernel version: 2.6.22.19
Earliest failing kernel version: 2.6.25 (from debian packages)
Distribution: Debian GNU/Linux etch+testing+unstable
Hardware Environment: Lenovo Thinkpad (Z61m 9453-A11, Core 2 Duo,
  Intel ICH7 chipset, ATI/AMD Radeon Mobility X1400 graphics)
Software Environment:
Problem Description: System freezes hard after a random amount of time

Steps to reproduce:
 * Boot the system with any kernel from the 2.6.25 series or later (I had similar freezes with everything after 2.6.22.19, but I haven't verified that those were related)
 * Wait

The crash is apparently unrelated to any installed software. It happens with or without DRM modules loaded, with or without X running, with or without user interaction.

This is the same as http://kerneltrap.org/node/16521 (which I will reproduce here).

The freezes result in a complete lockup of the system. No output is generated on the console, in syslog, or in messages.

    * Magic SysRq is inoperable.
    * I tried a lot of options in kernel hacking, including lock debugging. That only shortened the time to freeze. The NMI watchdog does produce output.
    * I built a minimal kernel with all but the essential drivers disabled, so I can rule out issues with sound, network, PCCard, DRI/DRM, and others.
    * It happens with a stock Debian kernel (2.6.25, built for 486 arch) as well as with custom-built kernels.
    * I tried building with both GCC 4.3 and 4.2.
    * The systems run perfectly fine with older kernels (2.6.21, 2.6.22 series), as well as Windows. memtest86+ doesn't find any issues.
    * "noacpi" is not an option since the laptop won't even boot with that. I tried disabling stuff like MSI(-X), IRQ balancing, tick-free kernel, all to no avail.
    * 2.6.26.2 runs fine on a non-SMP AMD system. Affected systems are dual-core Intels. Setting the "nosmp" option doesn't help.

I have talked to someone else who is stuck on 2.6.21 kernels due to mystery freezes as well. The code path in the stack trace below also comes up in a lot of reports, so maybe this should even be a blocker until it is resolved.

The output below is largely the same on all tested kernels, except for precise offsets.

--- NMI watchdog output (stock 2.6.26.2 kernel) ---
Pid: 0, comm: swapper Not tainted (2.6.26.2-debug #2)
EIP: 0060:[<c0117210>] EFLAGS: 00000097 CPU: 0
EIP is at hpet_rtc_interrupt+0x2e0/0x320
EAX: 00000000 EBX: 00000002 ECX: 00000046 EDX: 00000002
ESI: ffffc8ab EDI: c03f1edc EBP: c03f1ee8 ESP: c03f1e9c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c03f0000 task=c03c9300 task.ti=c03f0000)
Stack: 03aa5b2e 00000000 f7bc7c00 f8800128 00000000 a61408d3 0061fd6e 00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
       f7b87f80 00000000 00000000 c03f1f00 c0159d81 00000000 c03e7080 f7b87f80
Call Trace:
 [<c0159d81>] ? handle_IRQ_event+0x31/0x60
 [<c015af65>] ? handle_edge_irq+0xb5/0x150
 [<c0106c50>] ? do_IRQ+0x40/0x80
 [<c0104783>] ? common_interrupt+0x23/0x28
 [<c013007b>] ? del_timer_sync+0x1b/0x20
 [<f8858058>] ? acpi_idle_enter_bm+0x2c2/0x344 [processor]
 [<c013f6c6>] ? pm_qos_requirement+0x26/0x30
 [<c0298891>] ? cpuidle_idle_call+0x81/0xc0
 [<c0298810>] ? cpuidle_idle_call+0x0/0xc0
 [<c0102c82>] ? cpu_idle+0x62/0xe0
 [<c0319f6e>] ? rest_init+0x4e/0x60
 =======================
Code: 80 8d 04 46 89 45 d8 89 f8 83 e7 0f c1 f8 04 8d 04 80 8d 04 47 89 45 dc 8b 45 cc 48 89 45 e0 e9 70 fd ff ff 8d b4 26 00 00 00 00 <f3> 90 a1 80 6b 3e c0 29 f0 83 f8 04 76 f2 e9 d2 fe ff ff 90 8d
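A note for anyone reading the trace: the byte in angle brackets on the Code: line marks the instruction at EIP. The following is a minimal Python sketch, not an existing kernel tool (the helper name `faulting_bytes` is made up for illustration), that extracts the bytes from that marker onward and checks two things that can be read directly from the hex dump: the instruction at EIP is `f3 90` (x86 PAUSE, i.e. "rep; nop"), and the short conditional jump `76 f2` twelve bytes later points back at it.

```python
import re

# Tail of the "Code:" line from the trace above, starting a few bytes
# before the <xx> marker.
CODE_LINE = ("Code: 8d b4 26 00 00 00 00 <f3> 90 a1 80 6b 3e c0 29 f0 "
             "83 f8 04 76 f2 e9 d2 fe ff ff 90 8d")

def faulting_bytes(code_line):
    """Return the bytes from the <xx>-marked byte (the one at EIP) onward."""
    tokens = code_line.split()[1:]  # drop the "Code:" label
    for i, tok in enumerate(tokens):
        if re.fullmatch(r"<[0-9a-f]{2}>", tok):
            tail = [tok.strip("<>")] + tokens[i + 1:]
            return bytes(int(t, 16) for t in tail)
    raise ValueError("no <xx> marker found")

code = faulting_bytes(CODE_LINE)

# f3 90 is the x86 PAUSE instruction ("rep; nop").
is_pause = code[:2] == b"\xf3\x90"

# Bytes 12-13 are 76 f2: a short JBE whose signed 8-bit displacement is
# relative to the next instruction, which starts at offset 14.
jbe_target = 14 + int.from_bytes(code[13:14], "big", signed=True)

print(is_pause, jbe_target)  # True 0
```

A jump target of 0 means the branch lands back on the PAUSE, so EIP sits inside a short busy-wait loop within hpet_rtc_interrupt. That would be consistent with the hard lockup, although the dump alone does not say why the loop condition never becomes false.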
Comment 1 Nicos Gollan 2008-08-25 04:28:52 UTC
Created attachment 17434
Configuration of a kernel that freezes
Comment 2 Nicos Gollan 2008-08-25 04:30:13 UTC
Created attachment 17435
acpidump output (hexdump)
Comment 3 Nicos Gollan 2008-09-04 04:32:41 UTC
Seems to be a lot better in 2.6.27-rc5. The same computer has been running with the "crash config" for almost 48 hours now, with extended periods of both high load and idle time.
Comment 4 Ryan Novosielski 2008-09-13 23:16:05 UTC
I believe we may have stumbled across the same bug. The last version that worked on one of my systems is 2.6.22. 2.6.24 locks up hard frequently, and 2.6.27, which I have been testing, also locks up hard.

My hardware is a Dell Inspiron 2600 running Ubuntu 8.04.1. I have reported this as an Ubuntu bug, but I also tested a 2.6.27-rc6 kernel compiled from git, and the problem still exists.

It has been recommended by one of the Ubuntu developers that I proceed with git bisects to attempt to find the regression, so I am starting that process.
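For anyone following along, bisection is a binary search over the commits between a known-good and a known-bad kernel. Below is a rough Python sketch of the idea, not of git's actual implementation. The commit list, the `HYPOTHETICAL_CULPRIT` value, and the `locks_up` predicate are all made up for illustration; the real test is building and booting each candidate and waiting to see whether it freezes.

```python
def first_bad(candidates, is_bad):
    """Binary search for the first bad entry.

    Assumes candidates are ordered oldest to newest, the first one is
    known good, the last one is known bad, and everything after the
    first bad entry is also bad.
    """
    lo, hi = 0, len(candidates) - 1  # lo: known good, hi: known bad
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1  # one kernel build-and-test per step
        if is_bad(candidates[mid]):
            hi = mid
        else:
            lo = mid
    return candidates[hi], steps

# Hypothetical history: integers stand in for commits between two tags.
commits = list(range(1000))
HYPOTHETICAL_CULPRIT = 613  # made up; the real culprit is not known here

def locks_up(commit):
    return commit >= HYPOTHETICAL_CULPRIT

culprit, builds = first_bad(commits, locks_up)
print(culprit, builds)
```

About ten test builds are enough to search a thousand commits, which is what makes bisecting practical even when each test means waiting for a random freeze. The catch with an intermittent lockup is that a "good" verdict is only as reliable as the time spent waiting.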
Comment 5 Thomas Jarosch 2008-09-14 15:39:21 UTC
This looks like the same problem as bug #11142; see the link in comment #11 there for a fix.
Comment 6 Rafael J. Wysocki 2008-09-14 16:31:21 UTC

*** This bug has been marked as a duplicate of bug 11142 ***
