Bug 43047 - Need help: soft lockup for siglock in linux kernel 2.6.24.7
Summary: Need help: soft lockup for siglock in linux kernel 2.6.24.7
Status: RESOLVED DUPLICATE of bug 43056
Alias: None
Product: Process Management
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: process_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-04-05 08:01 UTC by m_joshi@hotmail.com
Modified: 2012-04-14 09:50 UTC
CC List: 1 user

See Also:
Kernel Version: 2.6.24.7
Subsystem:
Regression: No
Bisected commit-id:



Description m_joshi@hotmail.com 2012-04-05 08:01:05 UTC
Need help: has anyone seen this before, and are there any known solutions or patches?
Running Linux 2.6.37 on x86 multiprocessor, multicore hardware. The following messages print continuously on the console. Because the console is flooded with them, no other activity is possible. This happens on only one or two machines; the rest, running the same kernel, are fine. It looks like a process is trying to take the siglock of another process, and that siglock is somehow either invalid or lost.

BUG: soft lockup - CPU#0 stuck for 11s! [pidof:17798]
CPU 0:
Modules linked in: bpctl_mod jnet_igb jnet e1000
Pid: 17798, comm: pidof Tainted: G D 2.6.24.7.JNET #1
RIP: 0010:[<ffffffff80467c92>] [<ffffffff80467c92>] _spin_lock_irqsave+0x12/0x30
RSP: 0018:ffff810100905b90 EFLAGS: 00000286
RAX: 0000000000000282 RBX: ffff81042d0e1f88 RCX: ffff810100905e67
RDX: 0000000000000000 RSI: ffff810100905e70 RDI: ffff81042d0e1f88
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000074 R12: 000000000000001a
R13: ffff810100905b08 R14: ffff810100905d00 R15: 0000003000000030
FS: 00002b01fd4686e0(0000) GS:ffffffff8056f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000065e4a8 CR3: 00000002eb542000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff8024731a>] lock_task_sighand+0x3a/0x80
[<ffffffff802e13ce>] do_task_stat+0xee/0xb60
[<ffffffff802ddc9f>] pid_revalidate+0x3f/0x110
[<ffffffff802a77bf>] do_lookup+0x8f/0x210
[<ffffffff802b2b36>] dput+0xa6/0x140
[<ffffffff802a9d98>] __link_path_walk+0xc28/0xdf0
[<ffffffff802dce23>] task_dumpable+0x23/0x40
[<ffffffff802b7057>] mntput_no_expire+0x27/0xb0
[<ffffffff802a9fe1>] link_path_walk+0x81/0x100
[<ffffffff802a2bbb>] sys_readlinkat+0x8b/0xc0
[<ffffffff8028c455>] vma_adjust+0x145/0x4c0
[<ffffffff8027e6f1>] __alloc_pages+0x61/0x3c0
[<ffffffff802def4a>] proc_info_read+0xba/0x100
[<ffffffff8029fe55>] vfs_read+0xc5/0x160
[<ffffffff802a0333>] sys_read+0x53/0x90
[<ffffffff8020c35e>] system_call+0x7e/0x83
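
The top two frames of the trace show lock_task_sighand() waiting in _spin_lock_irqsave(). As a reading aid only, here is a minimal userspace sketch of the retry loop that lock_task_sighand() is built around, assuming the 2.6-era shape of that function (load the task's sighand pointer, take its siglock, recheck the pointer). It is not kernel code: the model_* names, the bounded spin, and the stuck flag are hypothetical and exist only so the model terminates and can report the condition.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for sighand_struct and task_struct. */
struct model_sighand {
    atomic_flag siglock;                    /* models the siglock spinlock */
};

struct model_task {
    struct model_sighand *_Atomic sighand;  /* NULL once the task is gone */
};

/* Bounded spin so the model terminates. The kernel spinlock has no bound,
 * so a lock that is never released keeps the CPU spinning until the
 * soft lockup watchdog reports it. */
static bool spin_lock_bounded(atomic_flag *lock, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (!atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
            return true;                    /* lock was free, now held by us */
    }
    return false;                           /* gave up: lock never became free */
}

static void spin_unlock_model(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Returns the locked sighand, or NULL if the task has none or the spin
 * bound was hit. *stuck is set to true only in the second case. */
static struct model_sighand *lock_task_sighand_model(struct model_task *tsk,
                                                     unsigned long max_spins,
                                                     bool *stuck)
{
    assert(tsk != NULL && stuck != NULL);
    *stuck = false;
    for (;;) {
        struct model_sighand *sh = atomic_load(&tsk->sighand);
        if (sh == NULL)
            return NULL;                    /* task already released its sighand */
        if (!spin_lock_bounded(&sh->siglock, max_spins)) {
            *stuck = true;                  /* analogue of the soft lockup */
            return NULL;
        }
        if (sh == atomic_load(&tsk->sighand))
            return sh;                      /* pointer unchanged: lock is valid */
        spin_unlock_model(&sh->siglock);    /* sighand changed under us: retry */
    }
}
```

In this model, calling lock_task_sighand_model() twice on the same task without an intervening spin_unlock_model() makes the second call hit the spin bound and set *stuck. That corresponds to a siglock that is held and never released; a siglock pointer into freed or corrupted memory could produce the same symptom, and the trace alone does not distinguish the two.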
Comment 1 m_joshi@hotmail.com 2012-04-05 08:32:23 UTC
Sorry, the kernel is 2.6.24.7.
Comment 2 m_joshi@hotmail.com 2012-04-05 09:23:33 UTC
Kernel 2.6.24.7
Need help: has anyone seen this before, and are there any known solutions or patches?
Running Linux 2.6.24.7 on x86 multiprocessor, multicore hardware. The following
messages print continuously on the console. Because the console is flooded
with them, no other activity is possible. This happens on only one or two
machines; the rest, running the same kernel, are fine. It looks like a process
is trying to take the siglock of another process, and that siglock is somehow
either invalid or lost.

BUG: soft lockup - CPU#0 stuck for 11s! [pidof:17798]
CPU 0:
Modules linked in: bpctl_mod jnet_igb jnet e1000
Pid: 17798, comm: pidof Tainted: G D 2.6.24.7.JNET #1
RIP: 0010:[<ffffffff80467c92>] [<ffffffff80467c92>] _spin_lock_irqsave+0x12/0x30
RSP: 0018:ffff810100905b90 EFLAGS: 00000286
RAX: 0000000000000282 RBX: ffff81042d0e1f88 RCX: ffff810100905e67
RDX: 0000000000000000 RSI: ffff810100905e70 RDI: ffff81042d0e1f88
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000074 R12: 000000000000001a
R13: ffff810100905b08 R14: ffff810100905d00 R15: 0000003000000030
FS: 00002b01fd4686e0(0000) GS:ffffffff8056f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000065e4a8 CR3: 00000002eb542000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff8024731a>] lock_task_sighand+0x3a/0x80
[<ffffffff802e13ce>] do_task_stat+0xee/0xb60
[<ffffffff802ddc9f>] pid_revalidate+0x3f/0x110
[<ffffffff802a77bf>] do_lookup+0x8f/0x210
[<ffffffff802b2b36>] dput+0xa6/0x140
[<ffffffff802a9d98>] __link_path_walk+0xc28/0xdf0
[<ffffffff802dce23>] task_dumpable+0x23/0x40
[<ffffffff802b7057>] mntput_no_expire+0x27/0xb0
[<ffffffff802a9fe1>] link_path_walk+0x81/0x100
[<ffffffff802a2bbb>] sys_readlinkat+0x8b/0xc0
[<ffffffff8028c455>] vma_adjust+0x145/0x4c0
[<ffffffff8027e6f1>] __alloc_pages+0x61/0x3c0
[<ffffffff802def4a>] proc_info_read+0xba/0x100
[<ffffffff8029fe55>] vfs_read+0xc5/0x160
[<ffffffff802a0333>] sys_read+0x53/0x90
[<ffffffff8020c35e>] system_call+0x7e/0x83
Comment 3 m_joshi@hotmail.com 2012-04-14 09:50:09 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=43056

*** This bug has been marked as a duplicate of bug 43056 ***
