Bug 10439 - soft lockup when reading proc file
Summary: soft lockup when reading proc file
Status: REJECTED UNREPRODUCIBLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-10 18:02 UTC by Michael Davidsaver
Modified: 2008-08-28 01:24 UTC
CC List: 0 users

See Also:
Kernel Version: 2.6.24.3
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
output of acpidump (140.88 KB, text/plain)
2008-04-14 18:03 UTC, Michael Davidsaver
output of dmesg (29.28 KB, text/plain)
2008-04-14 18:03 UTC, Michael Davidsaver
patch against 2.6.24 (8.22 KB, patch)
2008-04-16 21:14 UTC, Lin Ming

Description Michael Davidsaver 2008-04-10 18:02:03 UTC
Distribution: Gentoo 
Hardware Environment: x86_64 2 core cpu, Asus A8V-E SE mainboard
Software Environment: glibc 2.6.1, gcc 4.1.2, gkrellm 2.3.0
Problem Description:

This has only happened once so far, but seems sufficiently odd to be worth mentioning.

I have been running 2.6.24.3 for 8 days.

I arrived home from work to discover that gkrellm showed one CPU pegged at 100%.
I looked at top and discovered that the process responsible was gkrellm itself.
Attempting to kill gkrellm caused much weirdness to happen.
Before the machine froze completely I got into a shell and saw the error below.
After resetting I checked the system log and saw that this message had been repeating every 11 seconds for at least the last 6 hours.
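
A minimal sketch of how the repeated message could be counted in a saved copy of the system log. It assumes the message format quoted in this report; the function name and the log path in the usage comment are hypothetical and not part of the original report.

```python
import re
from collections import Counter

# Matches the soft lockup line format quoted in this report, e.g.
#   BUG: soft lockup - CPU#0 stuck for 11s! [gkrellm2:7621]
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(lines):
    """Count soft lockup reports per (cpu, comm, pid) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            key = (int(match.group("cpu")), match.group("comm"), int(match.group("pid")))
            counts[key] += 1
    return counts


# Hypothetical usage (the log location varies by distribution):
#   with open("/var/log/messages", errors="replace") as log:
#       for key, n in summarize_soft_lockups(log).items():
#           print(key, n)
```

A single (cpu, comm, pid) key with a large count would be consistent with one task staying stuck the whole time, rather than several separate lockups.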

Since this occurred while the machine was unattended, all I can think to do is disable the log rotator and start gkrellm again.

What information would be useful?


BUG: soft lockup - CPU#0 stuck for 11s! [gkrellm2:7621]
CPU 0:
Modules linked in: w83627ehf hwmon_vid eeprom ipt_TOS iptable_nat nf_nat nf_conntrack_ipv4 snd_seq_midi snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq analog ns558 joydev fuse parport_pc snd_mpu401 parport snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm k8temp snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd uhci_hcd i2c_viapro i2c_core sg tulip
Pid: 7621, comm: gkrellm2 Not tainted 2.6.24.3-md #1
RIP: 0010:[<ffffffff804556fe>]  [<ffffffff804556fe>] acpi_ns_delete_namespace_by_owner+0x45/0xde
RSP: 0018:ffff810017c0fcb8  EFLAGS: 00000246
RAX: ffff81007df60550 RBX: ffff81007df60570 RCX: 0000000000000000
RDX: ffff81007df60550 RSI: ffffffff80c06480 RDI: ffff81007df60570
RBP: ffff810010076040 R08: ffff81007688ec00 R09: 0000000000000000
R10: 0000000000000001 R11: ffffc20000310a08 R12: ffff8100010241e0
R13: ffff810010076040 R14: ffff810017c0e000 R15: ffff8100807f7000
FS:  0000000040800950(0063) GS:ffffffff807b0000(0000) knlGS:00000000f7d5d6d0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaab1000 CR3: 000000006e48a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
 [<ffffffff804556fe>] acpi_ns_delete_namespace_by_owner+0x45/0xde
 [<ffffffff8044927b>] acpi_ds_terminate_control_method+0x75/0xc8
 [<ffffffff804575fc>] acpi_ps_parse_aml+0x190/0x285
 [<ffffffff8045894c>] acpi_ps_execute_method+0x135/0x1e1
 [<ffffffff80455880>] acpi_ns_evaluate+0xa4/0x100
 [<ffffffff8045547d>] acpi_evaluate_object+0x133/0x1de
 [<ffffffff80447285>] acpi_evaluate_integer+0x8c/0xc7
 [<ffffffff8029cd4c>] kmem_cache_alloc+0xbc/0x120
 [<ffffffff80469948>] acpi_thermal_get_temperature+0x2e/0x3b
 [<ffffffff8046a869>] acpi_thermal_temp_seq_show+0x1d/0x63
 [<ffffffff802c00bc>] seq_read+0x8c/0x300
 [<ffffffff802c0030>] seq_read+0x0/0x300
 [<ffffffff802e1643>] proc_reg_read+0x83/0xd0
 [<ffffffff802a3225>] vfs_read+0xc5/0x160
 [<ffffffff802a3703>] sys_read+0x53/0x90
 [<ffffffff8020bc3e>] system_call+0x7e/0x83
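
The trace above lists the innermost frame first. As a rough illustration, the sketch below parses trace lines of this form and prints the call chain from the outermost caller inward, which makes the path from the read system call through the proc file to the ACPI thermal code easier to follow. The function names `parse_call_trace` and `call_chain` are hypothetical helpers, and the regular expression only assumes the `[<address>] symbol+0xoffset/0xsize` layout seen in this report.

```python
import re

# One frame looks like:  [<ffffffff802c00bc>] seq_read+0x8c/0x300
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_call_trace(text):
    """Return (symbol, offset, size) tuples in the order they appear in the trace."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                (match.group("sym"), int(match.group("off"), 16), int(match.group("size"), 16))
            )
    return frames


def call_chain(frames):
    """Return symbols with the outermost caller first, dropping consecutive repeats."""
    chain = []
    for sym, _off, _size in reversed(frames):
        if not chain or chain[-1] != sym:
            chain.append(sym)
    return chain


# Hypothetical usage with the trace text saved in a string `trace_text`:
#   print(" -> ".join(call_chain(parse_call_trace(trace_text))))
```

Entries in a kernel stack dump of this era are not guaranteed to be real callers (stale addresses left on the stack can appear), so the printed chain is a reading aid rather than a verified backtrace.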
Comment 1 Lin Ming 2008-04-13 21:58:05 UTC
please attach dmesg & acpidump file
Comment 2 Michael Davidsaver 2008-04-14 18:03:20 UTC
Created attachment 15754
output of acpidump
Comment 3 Michael Davidsaver 2008-04-14 18:03:54 UTC
Created attachment 15755
output of dmesg
Comment 4 Lin Ming 2008-04-16 21:14:07 UTC
Created attachment 15778
patch against 2.6.24

If the problem is reproducible, please try this patch.

Note: this is not the final patch.
Comment 5 Shaohua 2008-05-28 00:02:20 UTC
Can you please try the latest kernel, which has a lot of ACPI updates?
Comment 6 Zhang Rui 2008-07-14 20:32:29 UTC
Hi Michael,
any updates on this?
Comment 7 Michael Davidsaver 2008-07-19 12:08:55 UTC
Apologies for going so long without an update.  I have not been able to duplicate this with either 2.6.24.3 or 2.6.25.4 and am starting to think it could be bad hardware.

I occasionally see "missed interrupt" for hda or hdb, which are CD drives.  In the hope that there is only one underlying problem, I booted with "hda=none hdb=none" and have been running for 27 days without any kernel error messages.  However, the time between crashes seems to be between 3 and 5 weeks.
Comment 8 Zhang Rui 2008-08-28 01:24:43 UTC
As the problem cannot be reproduced, I am rejecting this bug.
