Distribution: Gentoo
Hardware Environment: x86_64 2-core CPU, Asus A8V-E SE mainboard
Software Environment: glibc 2.6.1, gcc 4.1.2, gkrellm 2.3.0

Problem Description:
This has only happened once so far, but seems sufficiently odd to be worth mentioning. I have been running 2.6.24.3 for 8 days. I arrived home from work to discover that gkrellm showed 1 CPU pegged at 100%. I looked at top and discovered that the process using it was gkrellm itself. Attempting to kill gkrellm caused much weirdness to happen. Before the machine froze completely I got into a shell and saw the error below. After resetting I checked the system log and saw that this message had been repeating every 11 seconds for at least the last 6 hours.

Since this occurred while the machine was unattended, all I can think to do is disable the log rotator and start gkrellm again. What information would be useful?

BUG: soft lockup - CPU#0 stuck for 11s! [gkrellm2:7621]
CPU 0:
Modules linked in: w83627ehf hwmon_vid eeprom ipt_TOS iptable_nat nf_nat nf_conntrack_ipv4 snd_seq_midi snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq analog ns558 joydev fuse parport_pc snd_mpu401 parport snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm k8temp snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd uhci_hcd i2c_viapro i2c_core sg tulip
Pid: 7621, comm: gkrellm2 Not tainted 2.6.24.3-md #1
RIP: 0010:[<ffffffff804556fe>] [<ffffffff804556fe>] acpi_ns_delete_namespace_by_owner+0x45/0xde
RSP: 0018:ffff810017c0fcb8 EFLAGS: 00000246
RAX: ffff81007df60550 RBX: ffff81007df60570 RCX: 0000000000000000
RDX: ffff81007df60550 RSI: ffffffff80c06480 RDI: ffff81007df60570
RBP: ffff810010076040 R08: ffff81007688ec00 R09: 0000000000000000
R10: 0000000000000001 R11: ffffc20000310a08 R12: ffff8100010241e0
R13: ffff810010076040 R14: ffff810017c0e000 R15: ffff8100807f7000
FS: 0000000040800950(0063) GS:ffffffff807b0000(0000) knlGS:00000000f7d5d6d0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaab1000 CR3: 000000006e48a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff804556fe>] acpi_ns_delete_namespace_by_owner+0x45/0xde
 [<ffffffff8044927b>] acpi_ds_terminate_control_method+0x75/0xc8
 [<ffffffff804575fc>] acpi_ps_parse_aml+0x190/0x285
 [<ffffffff8045894c>] acpi_ps_execute_method+0x135/0x1e1
 [<ffffffff80455880>] acpi_ns_evaluate+0xa4/0x100
 [<ffffffff8045547d>] acpi_evaluate_object+0x133/0x1de
 [<ffffffff80447285>] acpi_evaluate_integer+0x8c/0xc7
 [<ffffffff8029cd4c>] kmem_cache_alloc+0xbc/0x120
 [<ffffffff80469948>] acpi_thermal_get_temperature+0x2e/0x3b
 [<ffffffff8046a869>] acpi_thermal_temp_seq_show+0x1d/0x63
 [<ffffffff802c00bc>] seq_read+0x8c/0x300
 [<ffffffff802c0030>] seq_read+0x0/0x300
 [<ffffffff802e1643>] proc_reg_read+0x83/0xd0
 [<ffffffff802a3225>] vfs_read+0xc5/0x160
 [<ffffffff802a3703>] sys_read+0x53/0x90
 [<ffffffff8020bc3e>] system_call+0x7e/0x83
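Since the message repeated in the system log for hours, one way to summarize it before attaching logs is to count the soft lockup reports per CPU and process. The following is only a rough sketch: the function name `summarize_soft_lockups` is hypothetical, and the regular expression assumes the log lines contain the message exactly in the format quoted above.

```python
import re
from collections import Counter

# Matches the message format quoted in the report, e.g.
# "BUG: soft lockup - CPU#0 stuck for 11s! [gkrellm2:7621]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(lines):
    """Count soft lockup reports per (cpu, comm, pid) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            key = (int(match["cpu"]), match["comm"], int(match["pid"]))
            counts[key] += 1
    return counts
```

To use it, pass an open log file (for example whatever file the local syslog daemon writes kernel messages to; the path varies by setup) and print the resulting counts.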
Please attach the dmesg output and an acpidump file.
Created attachment 15754: output of acpidump
Created attachment 15755: output of dmesg
Created attachment 15778: patch against 2.6.24
If the problem is reproducible, please try this patch. Note that this is not the final patch.
Can you please try the latest kernel, which has a lot of ACPI updates?
Hi Michael, any updates on this?
Apologies for going so long without an update. I have not been able to duplicate this with either 2.6.24.3 or 2.6.25.4 and am starting to think it could be bad hardware. I occasionally see "missed interrupt" for hda or hdb, which are CD drives. In the hope that there is only one underlying problem, I booted with "hda=none hdb=none" and have been running for 27 days without any kernel error messages. However, the time between crashes seems to be between 3 and 5 weeks, so this is not yet conclusive.
As the problem cannot be reproduced, I am rejecting this bug.