Bug 11551 - Semi-repeatable hard lockup on 2.6.27-rc6
Status: CLOSED UNREPRODUCIBLE
Product: Platform Specific/Hardware
Classification: Unclassified
Component: i386
Hardware: All Linux
Importance: P1 normal
Assigned To: platform_x86_64@kernel-bugs.osdl.org
Depends on:
Blocks: Regressions-2.6.26
Reported: 2008-09-12 11:20 UTC by Rafael J. Wysocki
Modified: 2008-10-11 13:49 UTC

See Also:
Kernel Version: 2.6.27-rc6
Tree: Mainline
Regression: Yes


Description Rafael J. Wysocki 2008-09-12 11:20:45 UTC
Subject    : Semi-repeatable hard lockup on 2.6.27-rc6
Submitter  : "Steven Noonan" <steven@uplinklabs.net>
Date       : 2008-09-10 18:07
References : http://marc.info/?l=linux-kernel&m=122107007407994&w=4

This entry is being used for tracking a regression from 2.6.26.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Steven Noonan 2008-09-12 19:16:22 UTC
Actually, this affects an x86 machine, not x86_64.
Comment 2 Steven Noonan 2008-09-12 19:36:13 UTC
Also, apologies to whoever posted this: http://www.gossamer-threads.com/lists/linux/kernel/972192#972192

For some reason, my email isn't receiving some of the linux-kernel messages.

The machine is an HP dv5178us. Intel Core Duo 1.66GHz, 2GB RAM, 200GB hard drive. I'm running Linux 2.6.27-rc6 i686 on there.
Comment 3 Rafael J. Wysocki 2008-09-26 16:01:40 UTC
On Sunday, 21 of September 2008, Steven Noonan wrote:
> On Sun, Sep 21, 2008 at 11:54 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.26.  Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=11551
> > Subject         : Semi-repeatable hard lockup on 2.6.27-rc6
> > Submitter       : Steven Noonan <steven@uplinklabs.net>
> > Date            : 2008-09-10 18:07 (12 days old)
> > References      : http://marc.info/?l=linux-kernel&m=122107007407994&w=4
> >
> >
> 
> The machine with these symptoms was sent in for service on Friday. I
> suspect there may have been dodgy hardware involved on this one. I
> think this bug should be closed for the time being. Once I get the
> machine back, I'll reopen the bug if I can still reproduce it.

Comment 4 Steven Noonan 2008-10-09 21:13:49 UTC
I got the machine back from repairs and booted the kernel I was running before sending it in. The following showed up in dmesg. I don't know whether it has been fixed yet, but it would explain why the boot locked up at 'Waiting for udev events...' before:

[   47.087247] =======================================================
[   47.087806] [ INFO: possible circular locking dependency detected ]
[   47.088133] 2.6.27-rc6-tip-00275-g44c7698 #1
[   47.088417] -------------------------------------------------------
[   47.088744] udevd/1202 is trying to acquire lock:
[   47.088749]  (&inode->inotify_mutex){--..}, at: [<c01a8aa3>] inotify_inode_queue_event+0x3e/0xb6
[   47.088763] 
[   47.088765] but task is already holding lock:
[   47.088769]  (&mm->mmap_sem){----}, at: [<c017380b>] sys_munmap+0x20/0x3c
[   47.088779] 
[   47.088781] which lock already depends on the new lock.
[   47.088783] 
[   47.088785] 
[   47.088787] the existing dependency chain (in reverse order) is:
[   47.088790] 
[   47.088792] -> #3 (&mm->mmap_sem){----}:
[   47.088797]        [<c0147e56>] validate_chain+0x839/0xaf5
[   47.088806]        [<c014873a>] __lock_acquire+0x628/0x6c0
[   47.088813]        [<c014881a>] lock_acquire+0x48/0x64
[   47.088819]        [<c0170901>] might_fault+0x51/0x71
[   47.088825]        [<c02b1924>] copy_to_user+0x2d/0x41
[   47.088832]        [<c01a99f5>] inotify_read+0x108/0x173
[   47.088838]        [<c0184442>] vfs_read+0x8f/0x10b
[   47.088845]        [<c018475c>] sys_read+0x40/0x65
[   47.088851]        [<c0102f5d>] sysenter_do_call+0x12/0x35
[   47.088857]        [<ffffffff>] 0xffffffff
[   47.088869] 
[   47.088870] -> #2 (&dev->ev_mutex){--..}:
[   47.088875]        [<c0147e56>] validate_chain+0x839/0xaf5
[   47.088882]        [<c014873a>] __lock_acquire+0x628/0x6c0
[   47.088889]        [<c014881a>] lock_acquire+0x48/0x64
[   47.088895]        [<c0471cdf>] mutex_lock_nested+0xcd/0x24f
[   47.088903]        [<c01a9793>] inotify_dev_queue_event+0x25/0x12b
[   47.088910]        [<c01a8aec>] inotify_inode_queue_event+0x87/0xb6
[   47.088916]        [<c01a90c6>] inotify_dentry_parent_queue_event+0x69/0x83
[   47.088923]        [<c0184afd>] __fput+0x5c/0x140
[   47.088929]        [<c0184e46>] fput+0x1c/0x1e
[   47.088934]        [<c018266a>] filp_close+0x55/0x5f
[   47.088941]        [<c0183716>] sys_close+0x6d/0xa6
[   47.088947]        [<c0102f5d>] sysenter_do_call+0x12/0x35
[   47.088953]        [<ffffffff>] 0xffffffff
[   47.088959] 
[   47.088961] -> #1 (&ih->mutex){--..}:
[   47.088966]        [<c0147e56>] validate_chain+0x839/0xaf5
[   47.088974]        [<c014873a>] __lock_acquire+0x628/0x6c0
[   47.088980]        [<c014881a>] lock_acquire+0x48/0x64
[   47.088986]        [<c0471cdf>] mutex_lock_nested+0xcd/0x24f
[   47.088993]        [<c01a88d6>] inotify_find_update_watch+0x4d/0x8e
[   47.088999]        [<c01a9b23>] sys_inotify_add_watch+0xc3/0x164
[   47.089006]        [<c0102f5d>] sysenter_do_call+0x12/0x35
[   47.089011]        [<ffffffff>] 0xffffffff
[   47.089029] 
[   47.089031] -> #0 (&inode->inotify_mutex){--..}:
[   47.089036]        [<c0147b98>] validate_chain+0x57b/0xaf5
[   47.089043]        [<c014873a>] __lock_acquire+0x628/0x6c0
[   47.089049]        [<c014881a>] lock_acquire+0x48/0x64
[   47.089055]        [<c0471cdf>] mutex_lock_nested+0xcd/0x24f
[   47.089062]        [<c01a8aa3>] inotify_inode_queue_event+0x3e/0xb6
[   47.089068]        [<c01a90c6>] inotify_dentry_parent_queue_event+0x69/0x83
[   47.089075]        [<c0184afd>] __fput+0x5c/0x140
[   47.089081]        [<c0184e46>] fput+0x1c/0x1e
[   47.089087]        [<c0171f7d>] remove_vma+0x2d/0x4c
[   47.089093]        [<c017299c>] do_munmap+0x191/0x1ab
[   47.089099]        [<c0173818>] sys_munmap+0x2d/0x3c
[   47.089106]        [<c0102f5d>] sysenter_do_call+0x12/0x35
[   47.089111]        [<ffffffff>] 0xffffffff
[   47.089118] 
[   47.089119] other info that might help us debug this:
[   47.089121] 
[   47.089125] 1 lock held by udevd/1202:
[   47.089128]  #0:  (&mm->mmap_sem){----}, at: [<c017380b>] sys_munmap+0x20/0x3c
[   47.089137] 
[   47.089138] stack backtrace:
[   47.089143] Pid: 1202, comm: udevd Not tainted 2.6.27-rc6-tip-00275-g44c7698 #1
[   47.089148]  [<c0147612>] print_circular_bug_tail+0xa1/0xac
[   47.089157]  [<c0147b98>] validate_chain+0x57b/0xaf5
[   47.089164]  [<c0145473>] ? save_trace+0x37/0x88
[   47.089171]  [<c014873a>] __lock_acquire+0x628/0x6c0
[   47.089179]  [<c014881a>] lock_acquire+0x48/0x64
[   47.089185]  [<c01a8aa3>] ? inotify_inode_queue_event+0x3e/0xb6
[   47.089192]  [<c0471cdf>] mutex_lock_nested+0xcd/0x24f
[   47.089199]  [<c01a8aa3>] ? inotify_inode_queue_event+0x3e/0xb6
[   47.089206]  [<c01a8aa3>] ? inotify_inode_queue_event+0x3e/0xb6
[   47.089214]  [<c01a8aa3>] inotify_inode_queue_event+0x3e/0xb6
[   47.089220]  [<c01a90b0>] ? inotify_dentry_parent_queue_event+0x53/0x83
[   47.089228]  [<c01a90c6>] inotify_dentry_parent_queue_event+0x69/0x83
[   47.089235]  [<c0184afd>] __fput+0x5c/0x140
[   47.089241]  [<c0184e46>] fput+0x1c/0x1e
[   47.089246]  [<c0171f7d>] remove_vma+0x2d/0x4c
[   47.089253]  [<c017299c>] do_munmap+0x191/0x1ab
[   47.089259]  [<c0173818>] sys_munmap+0x2d/0x3c
[   47.089266]  [<c0102f5d>] sysenter_do_call+0x12/0x35
[   47.089273]  =======================
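For readers following the report above, here is a minimal sketch of the kind of check lockdep is describing. It assumes that chain entries #0 to #3 can be read as an acquisition order (inode->inotify_mutex, then ih->mutex, then dev->ev_mutex, then mm->mmap_sem) and that the new dependency is udevd taking inode->inotify_mutex while holding mm->mmap_sem. The function names are hypothetical and this is not the kernel's implementation, only an illustration of why the new acquisition closes a cycle.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (lock already held, lock acquired next)

# Ordering read from chain entries #0..#3 of the report (an assumption:
# the report lists the chain in reverse order of acquisition).
EXISTING_EDGES: List[Edge] = [
    ("inode->inotify_mutex", "ih->mutex"),
    ("ih->mutex", "dev->ev_mutex"),
    ("dev->ev_mutex", "mm->mmap_sem"),
]

# The acquisition that triggered the warning: sys_munmap holds mmap_sem,
# then fput ends up in inotify_inode_queue_event, which takes inotify_mutex.
NEW_EDGE: Edge = ("mm->mmap_sem", "inode->inotify_mutex")


def build_graph(edges: List[Edge]) -> Dict[str, List[str]]:
    """Turn (held, acquired) pairs into an adjacency list."""
    graph: Dict[str, List[str]] = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)
    return graph


def find_path(graph: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Depth-first search; returns one path from start to goal, or None."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def check_new_edge(edges: List[Edge], new_edge: Edge) -> Optional[List[str]]:
    """Return the lock cycle that new_edge would close, or None if it is safe.

    Adding held -> wanted is circular when wanted already reaches held
    through the existing ordering.
    """
    held, wanted = new_edge
    path = find_path(build_graph(edges), wanted, held)
    return None if path is None else path + [wanted]


if __name__ == "__main__":
    cycle = check_new_edge(EXISTING_EDGES, NEW_EDGE)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Under these assumptions the check reports a cycle that starts and ends at inode->inotify_mutex, which matches the "which lock already depends on the new lock" line in the report. Two tasks taking these locks from opposite ends of the chain could each wait on the other.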
Comment 5 Ingo Molnar 2008-10-11 03:51:34 UTC
> [   47.087247] =======================================================
> [   47.087806] [ INFO: possible circular locking dependency detected ]
> [   47.088133] 2.6.27-rc6-tip-00275-g44c7698 #1
> [   47.088417] -------------------------------------------------------
> [   47.088744] udevd/1202 is trying to acquire lock:
> [   47.088749]  (&inode->inotify_mutex){--..}, at: [<c01a8aa3>] inotify_inode_queue_event+0x3e/0xb6

This is known and should be fixed in v2.6.27 and the latest tip/master.
Could you please try it and check whether the message went away? Thanks,

	Ingo

Comment 6 Steven Noonan 2008-10-11 13:49:42 UTC
Ah, good. It's fixed then.

I've been running 2.6.27 on several branches, and it seems fine so far.
