Bug 9543 - RTNL: assertion failed at net/ipv6/addrconf.c (2164)/RTNL: assertion failed at net/ipv4/devinet.c (1055)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assignee: Jeff Garzik
Blocks: 9243
Reported: 2007-12-11 03:20 UTC by Krzysztof Oledzki
Modified: 2008-02-12 09:49 UTC (History)
Kernel Version: 2.6.24-rc4-git7
Tree: Mainline
Regression: Yes


Description Krzysztof Oledzki 2007-12-11 03:20:48 UTC
Most recent kernel where this bug did not occur: 2.6.23
Distribution: Gentoo

Problem Description:
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
RTNL: assertion failed at net/ipv6/addrconf.c (2164)
Pid: 9, comm: events/0 Not tainted 2.6.24-rc4-git7 #1
 [<78402cfb>] addrconf_notify+0x5b4/0x7b7
 [<7812203a>] finish_task_switch+0x0/0x8c
 [<781346ff>] worker_thread+0x0/0x85
 [<78438e23>] schedule+0x545/0x55f
 [<781408d1>] print_lock_contention_bug+0x11/0xd2
 [<783bfa72>] rt_run_flush+0x43/0x8b
 [<783bfa93>] rt_run_flush+0x64/0x8b
 [<7813ac54>] notifier_call_chain+0x2a/0x52
 [<7813ac9e>] raw_notifier_call_chain+0x17/0x1a
 [<783a3471>] netdev_state_change+0x18/0x29
 [<783ac6a9>] __linkwatch_run_queue+0x150/0x17e
 [<783ac6f4>] linkwatch_event+0x1d/0x22
 [<78133cdf>] run_workqueue+0xdb/0x1b6
 [<78133c8b>] run_workqueue+0x87/0x1b6
 [<783ac6d7>] linkwatch_event+0x0/0x22
 [<781346ff>] worker_thread+0x0/0x85
 [<78134778>] worker_thread+0x79/0x85
 [<781371ad>] autoremove_wake_function+0x0/0x35
 [<781370f6>] kthread+0x38/0x5e
 [<781370be>] kthread+0x0/0x5e
 [<78104baf>] kernel_thread_helper+0x7/0x10
 =======================
RTNL: assertion failed at net/ipv6/addrconf.c (1610)
Pid: 9, comm: events/0 Not tainted 2.6.24-rc4-git7 #1
 [<78402290>] addrconf_add_dev+0x36/0x59
 [<78402d27>] addrconf_notify+0x5e0/0x7b7
 [<7812203a>] finish_task_switch+0x0/0x8c
 [<781346ff>] worker_thread+0x0/0x85
 [<78438e23>] schedule+0x545/0x55f
 [<781408d1>] print_lock_contention_bug+0x11/0xd2
 [<783bfa72>] rt_run_flush+0x43/0x8b
 [<783bfa93>] rt_run_flush+0x64/0x8b
 [<7813ac54>] notifier_call_chain+0x2a/0x52
 [<7813ac9e>] raw_notifier_call_chain+0x17/0x1a
 [<783a3471>] netdev_state_change+0x18/0x29
 [<783ac6a9>] __linkwatch_run_queue+0x150/0x17e
 [<783ac6f4>] linkwatch_event+0x1d/0x22
 [<78133cdf>] run_workqueue+0xdb/0x1b6
 [<78133c8b>] run_workqueue+0x87/0x1b6
 [<783ac6d7>] linkwatch_event+0x0/0x22
 [<781346ff>] worker_thread+0x0/0x85
 [<78134778>] worker_thread+0x79/0x85
 [<781371ad>] autoremove_wake_function+0x0/0x35
 [<781370f6>] kthread+0x38/0x5e
 [<781370be>] kthread+0x0/0x5e
 [<78104baf>] kernel_thread_helper+0x7/0x10
 =======================
RTNL: assertion failed at net/ipv6/addrconf.c (414)
Pid: 9, comm: events/0 Not tainted 2.6.24-rc4-git7 #1
 [<78402229>] ipv6_find_idev+0x36/0x67
 [<78402297>] addrconf_add_dev+0x3d/0x59
 [<78402d27>] addrconf_notify+0x5e0/0x7b7
 [<7812203a>] finish_task_switch+0x0/0x8c
 [<781346ff>] worker_thread+0x0/0x85
 [<78438e23>] schedule+0x545/0x55f
 [<781408d1>] print_lock_contention_bug+0x11/0xd2
 [<783bfa72>] rt_run_flush+0x43/0x8b
 [<783bfa93>] rt_run_flush+0x64/0x8b
 [<7813ac54>] notifier_call_chain+0x2a/0x52
 [<7813ac9e>] raw_notifier_call_chain+0x17/0x1a
 [<783a3471>] netdev_state_change+0x18/0x29
 [<783ac6a9>] __linkwatch_run_queue+0x150/0x17e
 [<783ac6f4>] linkwatch_event+0x1d/0x22
 [<78133cdf>] run_workqueue+0xdb/0x1b6
 [<78133c8b>] run_workqueue+0x87/0x1b6
 [<783ac6d7>] linkwatch_event+0x0/0x22
 [<781346ff>] worker_thread+0x0/0x85
 [<78134778>] worker_thread+0x79/0x85
 [<781371ad>] autoremove_wake_function+0x0/0x35
 [<781370f6>] kthread+0x38/0x5e
 [<781370be>] kthread+0x0/0x5e
 [<78104baf>] kernel_thread_helper+0x7/0x10
 =======================
RTNL: assertion failed at net/ipv4/devinet.c (1055)
Pid: 9, comm: events/0 Not tainted 2.6.24-rc4-git7 #1
 [<783e3a93>] inetdev_event+0x56/0x465
 [<7813ac54>] notifier_call_chain+0x2a/0x52
 [<7813ac9e>] raw_notifier_call_chain+0x17/0x1a
 [<783a3471>] netdev_state_change+0x18/0x29
 [<783ac6a9>] __linkwatch_run_queue+0x150/0x17e
 [<783ac6f4>] linkwatch_event+0x1d/0x22
 [<78133cdf>] run_workqueue+0xdb/0x1b6
 [<78133c8b>] run_workqueue+0x87/0x1b6
 [<783ac6d7>] linkwatch_event+0x0/0x22
 [<781346ff>] worker_thread+0x0/0x85
 [<78134778>] worker_thread+0x79/0x85
 [<781371ad>] autoremove_wake_function+0x0/0x35
 [<781370f6>] kthread+0x38/0x5e
 [<781370be>] kthread+0x0/0x5e
 [<78104baf>] kernel_thread_helper+0x7/0x10
 =======================
bond0: no IPv6 routers present

Steps to reproduce:
 This happens when the system starts up.
Comment 1 Anonymous Emailer 2007-12-11 03:46:49 UTC
Reply-To: akpm@linux-foundation.org

On Tue, 11 Dec 2007 03:20:48 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9543
> 
>            Summary: RTNL: assertion failed at net/ipv6/addrconf.c
>                     (2164)/RTNL: assertion failed at net/ipv4/devinet.c
>                     (1055)
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.24-rc4-git7
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: jgarzik@pobox.com
>         ReportedBy: olel@ans.pl
> 
> 
> Most recent kernel where this bug did not occur: 2.6.23
> Distribution: Gentoo
> 
> Problem Description:
> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> RTNL: assertion failed at net/ipv6/addrconf.c (2164)
> Pid: 9, comm: events/0 Not tainted 2.6.24-rc4-git7 #1
>  [<78402cfb>] addrconf_notify+0x5b4/0x7b7
>  [<7812203a>] finish_task_switch+0x0/0x8c
>  [<781346ff>] worker_thread+0x0/0x85
>  [<78438e23>] schedule+0x545/0x55f
>  [<781408d1>] print_lock_contention_bug+0x11/0xd2
>  [<783bfa72>] rt_run_flush+0x43/0x8b
>  [<783bfa93>] rt_run_flush+0x64/0x8b
>  [<7813ac54>] notifier_call_chain+0x2a/0x52
>  [<7813ac9e>] raw_notifier_call_chain+0x17/0x1a
>  [<783a3471>] netdev_state_change+0x18/0x29
>  [<783ac6a9>] __linkwatch_run_queue+0x150/0x17e
>  [<783ac6f4>] linkwatch_event+0x1d/0x22
>  [<78133cdf>] run_workqueue+0xdb/0x1b6
>  [<78133c8b>] run_workqueue+0x87/0x1b6
>  [<783ac6d7>] linkwatch_event+0x0/0x22
>  [<781346ff>] worker_thread+0x0/0x85
>  [<78134778>] worker_thread+0x79/0x85
>  [<781371ad>] autoremove_wake_function+0x0/0x35
>  [<781370f6>] kthread+0x38/0x5e
>  [<781370be>] kthread+0x0/0x5e
>  [<78104baf>] kernel_thread_helper+0x7/0x10
>  =======================
> RTNL: assertion failed at net/ipv6/addrconf.c (1610)


Hopefully this is due to the bug you reported in bug #9542.

Does this patch fix both issues?


From: Andrew Morton <akpm@linux-foundation.org>

Remove stray rtnl_unlock().

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9542

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Krzysztof Oledzki <olel@ans.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/net/bonding/bond_sysfs.c |    2 --
 1 file changed, 2 deletions(-)

diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix drivers/net/bonding/bond_sysfs.c
--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
+++ a/drivers/net/bonding/bond_sysfs.c
@@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
 out:
 	write_unlock_bh(&bond->lock);
 
-	rtnl_unlock();
-
 	return count;
 }
 static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR, bonding_show_primary, bonding_store_primary);
_
Comment 2 Krzysztof Oledzki 2007-12-11 03:51:11 UTC
Will test ASAP, but I would like to wait about 2 hours to check whether 2.6.24-rc4-git7 solves bug #9182. It seems to fix the dirty memory leak now, but I need to be 100% sure.
Comment 3 Krzysztof Oledzki 2007-12-11 07:04:44 UTC

On Tue, 11 Dec 2007, Andrew Morton wrote:

> On Tue, 11 Dec 2007 03:20:48 -0800 (PST) bugme-daemon@bugzilla.kernel.org
> wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=9543
>>
>>            Summary: RTNL: assertion failed at net/ipv6/addrconf.c
>>                     (2164)/RTNL: assertion failed at net/ipv4/devinet.c
>>                     (1055)
>>            Product: Drivers
>>            Version: 2.5
>>      KernelVersion: 2.6.24-rc4-git7
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: Network
>>         AssignedTo: jgarzik@pobox.com
>>         ReportedBy: olel@ans.pl
>>
>>
>> Most recent kernel where this bug did not occur: 2.6.23
>> Distribution: Gentoo
>>
>> Problem Description:
>> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>> RTNL: assertion failed at net/ipv6/addrconf.c (2164)
>> Pid: 9, comm: events/0 Not tainted 2.6.24-rc4-git7 #1
>>  [<78402cfb>] addrconf_notify+0x5b4/0x7b7
>>  [<7812203a>] finish_task_switch+0x0/0x8c
>>  [<781346ff>] worker_thread+0x0/0x85
>>  [<78438e23>] schedule+0x545/0x55f
>>  [<781408d1>] print_lock_contention_bug+0x11/0xd2
>>  [<783bfa72>] rt_run_flush+0x43/0x8b
>>  [<783bfa93>] rt_run_flush+0x64/0x8b
>>  [<7813ac54>] notifier_call_chain+0x2a/0x52
>>  [<7813ac9e>] raw_notifier_call_chain+0x17/0x1a
>>  [<783a3471>] netdev_state_change+0x18/0x29
>>  [<783ac6a9>] __linkwatch_run_queue+0x150/0x17e
>>  [<783ac6f4>] linkwatch_event+0x1d/0x22
>>  [<78133cdf>] run_workqueue+0xdb/0x1b6
>>  [<78133c8b>] run_workqueue+0x87/0x1b6
>>  [<783ac6d7>] linkwatch_event+0x0/0x22
>>  [<781346ff>] worker_thread+0x0/0x85
>>  [<78134778>] worker_thread+0x79/0x85
>>  [<781371ad>] autoremove_wake_function+0x0/0x35
>>  [<781370f6>] kthread+0x38/0x5e
>>  [<781370be>] kthread+0x0/0x5e
>>  [<78104baf>] kernel_thread_helper+0x7/0x10
>>  =======================
>> RTNL: assertion failed at net/ipv6/addrconf.c (1610)
>
>
> Hopefully this is due to the bug you reported in bug #9542.
>
> Does this patch fix both issues?

Unfortunately not. I just updated bugzilla.

Best regards,

 				Krzysztof Oledzki
Comment 4 Anonymous Emailer 2007-12-11 12:30:45 UTC
Reply-To: akpm@linux-foundation.org

On Tue, 11 Dec 2007 16:04:27 +0100 (CET) Krzysztof Oledzki <olel@ans.pl> wrote:

> >>  [<783bfa72>] rt_run_flush+0x43/0x8b
> >>  [<783bfa93>] rt_run_flush+0x64/0x8b
> >>  [<7813ac54>] notifier_call_chain+0x2a/0x52
> >>  [<7813ac9e>] raw_notifier_call_chain+0x17/0x1a
> >>  [<783a3471>] netdev_state_change+0x18/0x29
> >>  [<783ac6a9>] __linkwatch_run_queue+0x150/0x17e
> >>  [<783ac6f4>] linkwatch_event+0x1d/0x22
> >>  [<78133cdf>] run_workqueue+0xdb/0x1b6
> >>  [<78133c8b>] run_workqueue+0x87/0x1b6
> >>  [<783ac6d7>] linkwatch_event+0x0/0x22
> >>  [<781346ff>] worker_thread+0x0/0x85
> >>  [<78134778>] worker_thread+0x79/0x85
> >>  [<781371ad>] autoremove_wake_function+0x0/0x35
> >>  [<781370f6>] kthread+0x38/0x5e
> >>  [<781370be>] kthread+0x0/0x5e
> >>  [<78104baf>] kernel_thread_helper+0x7/0x10
> >>  =======================
> >> RTNL: assertion failed at net/ipv6/addrconf.c (1610)
> >
> >
> > Hopefully this is due to the bug you reported in bug #9542.
> >
> > Does this patch fix both issues?
> 
> Unfortunately not. I just updated bugzilla.

argh.  Please conduct these discussions via emailed reply-to-all, not via
the bugzilla web interface.

I think my patch _did_ fix http://bugzilla.kernel.org/show_bug.cgi?id=9542
But then you hit a new bonding locking bug.

Rafael, please track 9542 and 9543 as post-2.6.23 regressions, thanks.
Comment 5 Herbert Xu 2007-12-12 05:53:06 UTC
Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Remove stray rtnl_unlock().
> 
> Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9542

Andrew, please cc Jay Vosburgh <fubar@us.ibm.com> on bonding
issues.

> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> drivers/net/bonding/bond_sysfs.c
> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> +++ a/drivers/net/bonding/bond_sysfs.c
> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
> out:
>        write_unlock_bh(&bond->lock);
> 
> -       rtnl_unlock();
> -

Looking at the changeset that added this perhaps the intention
is to hold the lock? If so we should add an rtnl_lock to the start
of the function.

Thanks,
Comment 6 Jay Vosburgh 2007-12-12 09:47:12 UTC
Herbert Xu <herbert@gondor.apana.org.au> wrote:

>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>> drivers/net/bonding/bond_sysfs.c
>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>> +++ a/drivers/net/bonding/bond_sysfs.c
>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>> out:
>>        write_unlock_bh(&bond->lock);
>> 
>> -       rtnl_unlock();
>> -
>
>Looking at the changeset that added this perhaps the intention
>is to hold the lock? If so we should add an rtnl_lock to the start
>of the function.

	Yes, this function needs to hold locks, and more than just
what's there now.  I believe the following should be correct; I haven't
tested it, though (I'm supposedly on vacation right now).

	The following change should be correct for the
bonding_store_primary case discussed in this thread, and also corrects
the bonding_store_active case which performs similar functions.

	The bond_change_active_slave and bond_select_active_slave
functions both require rtnl, bond->lock for read and curr_slave_lock for
write_bh, and no other locks.  This is so that the lower level
mode-specific functions can release locks down to just rtnl in order to
call, e.g., dev_set_mac_address with the locks it expects (rtnl only).

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 11b76b3..28a2d80 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
 	struct slave *slave;
 	struct bonding *bond = to_bond(d);
 
-	write_lock_bh(&bond->lock);
+	rtnl_lock();
+	read_lock(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
+
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
 		       ": %s: Unable to set primary slave; %s is in mode %d\n",
@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
 		}
 	}
 out:
-	write_unlock_bh(&bond->lock);
-
+	write_unlock_bh(&bond->curr_slave_lock);
+	read_unlock(&bond->lock);
 	rtnl_unlock();
 
 	return count;
@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
 	struct bonding *bond = to_bond(d);
 
 	rtnl_lock();
-	write_lock_bh(&bond->lock);
+	read_lock(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
 
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
 		}
 	}
 out:
-	write_unlock_bh(&bond->lock);
+	write_unlock_bh(&bond->curr_slave_lock);
+	read_unlock(&bond->lock);
 	rtnl_unlock();
 
 	return count;
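
The locking rules described in this comment (rtnl outermost, then bond->lock for read, then curr_slave_lock for write, and every unlock matched by an earlier lock) can be illustrated with a small userspace model. This is only a sketch and not kernel code: the `model_*` names, the rank enum, and the boolean bookkeeping are hypothetical stand-ins, and none of them exist in the bonding driver.

```c
/* Userspace model of the lock nesting discussed above; NOT kernel code.
 * All identifiers are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdbool.h>

/* Lower rank must be taken first: rtnl, then bond->lock, then
 * bond->curr_slave_lock. */
enum model_lock {
	MODEL_RTNL = 0,
	MODEL_BOND_LOCK = 1,
	MODEL_CURR_SLAVE_LOCK = 2,
	MODEL_NLOCKS = 3
};

struct lock_state {
	bool held[MODEL_NLOCKS];
};

/* Take lock l. Fails if l is already held or if any lock that should
 * nest inside it (higher rank) is already held. */
static bool model_acquire(struct lock_state *s, enum model_lock l)
{
	for (int i = l; i < MODEL_NLOCKS; i++)
		if (s->held[i])
			return false;
	s->held[l] = true;
	return true;
}

/* Drop lock l. Fails if l is not held, which models a stray unlock
 * such as an rtnl_unlock() without a preceding rtnl_lock(). */
static bool model_release(struct lock_state *s, enum model_lock l)
{
	if (!s->held[l])
		return false;
	s->held[l] = false;
	return true;
}

/* Model of the patched store path: lock outermost first, unlock in
 * reverse order. Returns true if every step was legal. */
static bool model_store_primary(struct lock_state *s)
{
	bool ok = model_acquire(s, MODEL_RTNL) &&
		  model_acquire(s, MODEL_BOND_LOCK) &&
		  model_acquire(s, MODEL_CURR_SLAVE_LOCK);

	/* ... the primary slave would be updated here ... */

	return ok &&
	       model_release(s, MODEL_CURR_SLAVE_LOCK) &&
	       model_release(s, MODEL_BOND_LOCK) &&
	       model_release(s, MODEL_RTNL);
}
```

Calling model_store_primary() on a zero-initialised struct lock_state returns true and leaves no lock held. A following model_release(&s, MODEL_RTNL) returns false, which corresponds to the unmatched rtnl_unlock() in the unpatched bonding_store_primary.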
Comment 7 Anonymous Emailer 2007-12-12 11:07:16 UTC
Reply-To: andy@greyhouse.net

On Wed, Dec 12, 2007 at 09:46:55AM -0800, Jay Vosburgh wrote:
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> >> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> drivers/net/bonding/bond_sysfs.c
> >> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >> +++ a/drivers/net/bonding/bond_sysfs.c
> >> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
> >> out:
> >>        write_unlock_bh(&bond->lock);
> >> 
> >> -       rtnl_unlock();
> >> -
> >
> >Looking at the changeset that added this perhaps the intention
> >is to hold the lock? If so we should add an rtnl_lock to the start
> >of the function.
> 
>       Yes, this function needs to hold locks, and more than just
> what's there now.  I believe the following should be correct; I haven't
> tested it, though (I'm supposedly on vacation right now).
> 
>       The following change should be correct for the
> bonding_store_primary case discussed in this thread, and also corrects
> the bonding_store_active case which performs similar functions.
> 
>       The bond_change_active_slave and bond_select_active_slave
> functions both require rtnl, bond->lock for read and curr_slave_lock for
> write_bh, and no other locks.  This is so that the lower level
> mode-specific functions can release locks down to just rtnl in order to
> call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

This looks good to me as well....

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Comment 8 Krzysztof Oledzki 2007-12-14 07:55:20 UTC

On Wed, 12 Dec 2007, bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9543
>
>
>
>
>
> ------- Comment #6 from fubar@us.ibm.com  2007-12-12 09:47 -------
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
>>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>> drivers/net/bonding/bond_sysfs.c
>>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>> +++ a/drivers/net/bonding/bond_sysfs.c
>>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>>> out:
>>>        write_unlock_bh(&bond->lock);
>>>
>>> -       rtnl_unlock();
>>> -
>>
>> Looking at the changeset that added this perhaps the intention
>> is to hold the lock? If so we should add an rtnl_lock to the start
>> of the function.
>
>        Yes, this function needs to hold locks, and more than just
> what's there now.  I believe the following should be correct; I haven't
> tested it, though (I'm supposedly on vacation right now).
>
>        The following change should be correct for the
> bonding_store_primary case discussed in this thread, and also corrects
> the bonding_store_active case which performs similar functions.
>
>        The bond_change_active_slave and bond_select_active_slave
> functions both require rtnl, bond->lock for read and curr_slave_lock for
> write_bh, and no other locks.  This is so that the lower level
> mode-specific functions can release locks down to just rtnl in order to
> call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
>
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>
> diff --git a/drivers/net/bonding/bond_sysfs.c
> b/drivers/net/bonding/bond_sysfs.c
> index 11b76b3..28a2d80 100644
> --- a/drivers/net/bonding/bond_sysfs.c
> +++ b/drivers/net/bonding/bond_sysfs.c
> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
>        struct slave *slave;
>        struct bonding *bond = to_bond(d);
>
> -       write_lock_bh(&bond->lock);
> +       rtnl_lock();
> +       read_lock(&bond->lock);
> +       write_lock_bh(&bond->curr_slave_lock);
> +
>        if (!USES_PRIMARY(bond->params.mode)) {
>                printk(KERN_INFO DRV_NAME
>                       ": %s: Unable to set primary slave; %s is in mode
>                       %d\n",
> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
>                }
>        }
> out:
> -       write_unlock_bh(&bond->lock);
> -
> +       write_unlock_bh(&bond->curr_slave_lock);
> +       read_unlock(&bond->lock);
>        rtnl_unlock();
>
>        return count;
> @@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device
> *d,
>        struct bonding *bond = to_bond(d);
>
>        rtnl_lock();
> -       write_lock_bh(&bond->lock);
> +       read_lock(&bond->lock);
> +       write_lock_bh(&bond->curr_slave_lock);
>
>        if (!USES_PRIMARY(bond->params.mode)) {
>                printk(KERN_INFO DRV_NAME
> @@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device
> *d,
>                }
>        }
> out:
> -       write_unlock_bh(&bond->lock);
> +       write_unlock_bh(&bond->curr_slave_lock);
> +       read_unlock(&bond->lock);
>        rtnl_unlock();
>
>        return count;
>

Any chance of an undamaged patch so I can test it? This one has tabs 
converted into spaces and long lines split. :(

Best regards,

 					Krzysztof Oledzki
Comment 9 Krzysztof Oledzki 2007-12-14 08:18:00 UTC

On Wed, 12 Dec 2007, Jay Vosburgh wrote:

> Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
>>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>> drivers/net/bonding/bond_sysfs.c
>>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>> +++ a/drivers/net/bonding/bond_sysfs.c
>>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>>> out:
>>>        write_unlock_bh(&bond->lock);
>>>
>>> -       rtnl_unlock();
>>> -
>>
>> Looking at the changeset that added this perhaps the intention
>> is to hold the lock? If so we should add an rtnl_lock to the start
>> of the function.
>
>       Yes, this function needs to hold locks, and more than just
> what's there now.  I believe the following should be correct; I haven't
> tested it, though (I'm supposedly on vacation right now).
>
>       The following change should be correct for the
> bonding_store_primary case discussed in this thread, and also corrects
> the bonding_store_active case which performs similar functions.
>
>       The bond_change_active_slave and bond_select_active_slave
> functions both require rtnl, bond->lock for read and curr_slave_lock for
> write_bh, and no other locks.  This is so that the lower level
> mode-specific functions can release locks down to just rtnl in order to
> call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
>
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>
> diff --git a/drivers/net/bonding/bond_sysfs.c
> b/drivers/net/bonding/bond_sysfs.c
> index 11b76b3..28a2d80 100644
> --- a/drivers/net/bonding/bond_sysfs.c
> +++ b/drivers/net/bonding/bond_sysfs.c
> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
>       struct slave *slave;
>       struct bonding *bond = to_bond(d);
>
> -     write_lock_bh(&bond->lock);
> +     rtnl_lock();
> +     read_lock(&bond->lock);
> +     write_lock_bh(&bond->curr_slave_lock);
> +
>       if (!USES_PRIMARY(bond->params.mode)) {
>               printk(KERN_INFO DRV_NAME
>                      ": %s: Unable to set primary slave; %s is in mode %d\n",
> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
>               }
>       }
> out:
> -     write_unlock_bh(&bond->lock);
> -
> +     write_unlock_bh(&bond->curr_slave_lock);
> +     read_unlock(&bond->lock);
>       rtnl_unlock();
>
>       return count;
> @@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device
> *d,
>       struct bonding *bond = to_bond(d);
>
>       rtnl_lock();
> -     write_lock_bh(&bond->lock);
> +     read_lock(&bond->lock);
> +     write_lock_bh(&bond->curr_slave_lock);
>
>       if (!USES_PRIMARY(bond->params.mode)) {
>               printk(KERN_INFO DRV_NAME
> @@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device
> *d,
>               }
>       }
> out:
> -     write_unlock_bh(&bond->lock);
> +     write_unlock_bh(&bond->curr_slave_lock);
> +     read_unlock(&bond->lock);
>       rtnl_unlock();
>
>       return count;

Vanilla 2.6.24-rc5 plus this patch:

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.24-rc5 #1
---------------------------------------------------------
events/0/9 just changed the state of lock:
  (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
but this lock took another, soft-read-irq-unsafe lock in the past:
  (&bond->lock){-.--}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
4 locks held by events/0/9:
  #0:  (events){--..}, at: [<c0133c57>] run_workqueue+0x87/0x1b6
  #1:  ((linkwatch_work).work){--..}, at: [<c0133c57>] run_workqueue+0x87/0x1b6
  #2:  (rtnl_mutex){--..}, at: [<c03abd50>] linkwatch_event+0x5/0x22
  #3:  (&ndev->lock){-.-+}, at: [<c0411b61>] mld_ifc_timer_expire+0x17/0x1fb

the first lock's dependencies:
-> (&mc->mca_lock){-+..} ops: 10 {
    initial-use  at:
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014289c>] __lock_acquire+0x4ba/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0439d62>] _spin_lock_bh+0x3b/0x64
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c0401834>] ipv6_add_dev+0x21c/0x24b
                         [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5b40>] addrconf_init+0x13/0x193
                         [<c0199f63>] proc_net_fops_create+0x10/0x21
                         [<c0419b38>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c5a20>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    in-softirq-W at:
                         [<c0142822>] __lock_acquire+0x440/0xc07
                         [<c013e0f3>] clockevents_program_event+0xe0/0xee
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c0439d62>] _spin_lock_bh+0x3b/0x64
                         [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                         [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c012df52>] run_timer_softirq+0xfa/0x15d
                         [<c012a8a6>] __do_softirq+0x56/0xdb
                         [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                         [<c012a8b8>] __do_softirq+0x68/0xdb
                         [<c012a961>] do_softirq+0x36/0x51
                         [<c012ae4a>] local_bh_enable_ip+0xad/0xed
                         [<c03bf107>] rt_run_flush+0x64/0x8b
                         [<c03e9296>] fib_netdev_event+0x61/0x65
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a2ae5>] netdev_state_change+0x18/0x29
                         [<c03abd1d>] __linkwatch_run_queue+0x150/0x17e
                         [<c03abd68>] linkwatch_event+0x1d/0x22
                         [<c0133cab>] run_workqueue+0xdb/0x1b6
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c03abd4b>] linkwatch_event+0x0/0x22
                         [<c01346cb>] worker_thread+0x0/0x85
                         [<c0134744>] worker_thread+0x79/0x85
                         [<c0137179>] autoremove_wake_function+0x0/0x35
                         [<c01370c2>] kthread+0x38/0x5e
                         [<c013708a>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c01417ee>] find_usage_backwards+0xbb/0xe2
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014286a>] __lock_acquire+0x488/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0439d62>] _spin_lock_bh+0x3b/0x64
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c0401834>] ipv6_add_dev+0x21c/0x24b
                         [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5b40>] addrconf_init+0x13/0x193
                         [<c0199f63>] proc_net_fops_create+0x10/0x21
                         [<c0419b38>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c5a20>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c087e2d8>] __key.30798+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0412114>] igmp6_group_added+0x56/0x11d
    [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] igmp6_mc_seq_start+0xde/0x138
    [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
    [<c0401834>] ipv6_add_dev+0x21c/0x24b
    [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c0401e17>] addrconf_notify+0x60/0x7b7
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0141dac>] mark_held_locks+0x39/0x53
    [<c0439066>] mutex_lock_nested+0x286/0x2ac
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f88>] register_netdevice_notifier+0x49/0x126
    [<c05c5bda>] addrconf_init+0xad/0x193
    [<c05c5b48>] addrconf_init+0x1b/0x193
    [<c05c5a20>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff

  -> (&bonding_netdev_xmit_lock_key){-...} ops: 6 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c0877804>] bonding_netdev_xmit_lock_key+0x0/0x8
   -> (&bond->lock){-.--} ops: 99 {
      initial-use  at:
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c014289c>] __lock_acquire+0x4ba/0xc07
                             [<c0141dac>] mark_held_locks+0x39/0x53
                             [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c0439eee>] _read_lock_bh+0x3b/0x64
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03aaa26>] rtnetlink_event+0x3f/0x42
                             [<c013ac20>] notifier_call_chain+0x2a/0x52
                             [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a31de>] register_netdevice+0x2a7/0x2e7
                             [<c02ed862>] bond_create+0x1f2/0x26a
                             [<c05bedcd>] bonding_init+0x761/0x7ea
                             [<c05be635>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-W at:
                             [<c014286a>] __lock_acquire+0x488/0xc07
                             [<c012093c>] try_to_wake_up+0x2ce/0x2d8
                             [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c0439e25>] _write_lock_bh+0x3b/0x64
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c03a14db>] __dev_set_rx_mode+0x7b/0x7d
                             [<c03a1675>] dev_set_rx_mode+0x23/0x36
                             [<c03a3d50>] dev_open+0x5e/0x77
                             [<c03a2a1f>] dev_change_flags+0x9d/0x14b
                             [<c03a1823>] __dev_get_by_name+0x68/0x73
                             [<c03e3850>] devinet_ioctl+0x22b/0x536
                             [<c03a3b45>] dev_ioctl+0x46f/0x5b7
                             [<c0399c78>] sock_ioctl+0x167/0x18b
                             [<c0399b11>] sock_ioctl+0x0/0x18b
                             [<c01725f7>] do_ioctl+0x1f/0x62
                             [<c0172867>] vfs_ioctl+0x22d/0x23f
                             [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                             [<c01728ac>] sys_ioctl+0x33/0x4b
                             [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                             [<ffffffff>] 0xffffffff
      softirq-on-R at:
                             [<c0141986>] mark_lock+0x64/0x451
                             [<c013575e>] __kernel_text_address+0x5/0xe
                             [<c0104ee2>] dump_trace+0x83/0x8d
                             [<c0142889>] __lock_acquire+0x4a7/0xc07
                             [<c013fc76>] save_trace+0x37/0x89
                             [<c0133c57>] run_workqueue+0x87/0x1b6
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c0439f4d>] _read_lock+0x36/0x5f
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c0133cab>] run_workqueue+0xdb/0x1b6
                             [<c0133c57>] run_workqueue+0x87/0x1b6
                             [<c02eee44>] bond_mii_monitor+0x0/0x85
                             [<c01346cb>] worker_thread+0x0/0x85
                             [<c0134744>] worker_thread+0x79/0x85
                             [<c0137179>] autoremove_wake_function+0x0/0x35
                             [<c01370c2>] kthread+0x38/0x5e
                             [<c013708a>] kthread+0x0/0x5e
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-R at:
                             [<c013fe0a>] get_lock_stats+0xd/0x2e
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c0142844>] __lock_acquire+0x462/0xc07
                             [<c0141dac>] mark_held_locks+0x39/0x53
                             [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c0439eee>] _read_lock_bh+0x3b/0x64
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03aaa26>] rtnetlink_event+0x3f/0x42
                             [<c013ac20>] notifier_call_chain+0x2a/0x52
                             [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a31de>] register_netdevice+0x2a7/0x2e7
                             [<c02ed862>] bond_create+0x1f2/0x26a
                             [<c05bedcd>] bonding_init+0x761/0x7ea
                             [<c05be635>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
    }
    ... key      at: [<c08777d0>] __key.32969+0x0/0x8
    -> (_xmit_ETHER){-...} ops: 8 {
       initial-use  at:
                               [<c014289c>] __lock_acquire+0x4ba/0xc07
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0439d62>] _spin_lock_bh+0x3b/0x64
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0412114>] igmp6_group_added+0x56/0x11d
                               [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                               [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c0401834>] ipv6_add_dev+0x21c/0x24b
                               [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c0401e17>] addrconf_notify+0x60/0x7b7
                               [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                               [<c0141dac>] mark_held_locks+0x39/0x53
                               [<c0439066>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                               [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                               [<c05c5bda>] addrconf_init+0xad/0x193
                               [<c05c5b48>] addrconf_init+0x1b/0x193
                               [<c05c5a20>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       hardirq-on-W at:
                               [<c0141986>] mark_lock+0x64/0x451
                               [<c014286a>] __lock_acquire+0x488/0xc07
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0439d62>] _spin_lock_bh+0x3b/0x64
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0412114>] igmp6_group_added+0x56/0x11d
                               [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                               [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c0401834>] ipv6_add_dev+0x21c/0x24b
                               [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c0401e17>] addrconf_notify+0x60/0x7b7
                               [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                               [<c0141dac>] mark_held_locks+0x39/0x53
                               [<c0439066>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                               [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                               [<c05c5bda>] addrconf_init+0xad/0x193
                               [<c05c5b48>] addrconf_init+0x1b/0x193
                               [<c05c5a20>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
    ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c02ee492>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7c3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee9af>] bond_select_active_slave+0x95/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

    -> (lweventlist_lock){.+..} ops: 10 {
       initial-use  at:
                               [<c0141986>] mark_lock+0x64/0x451
                               [<c014289c>] __lock_acquire+0x4ba/0xc07
                               [<c02e365c>] e1000_read_phy_reg+0x1c7/0x1d3
                               [<c02e348b>] e1000_write_phy_reg+0xb9/0xc3
                               [<c024a7de>] delay_tsc+0x25/0x3b
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                               [<c02e1c43>] e1000_probe+0xad1/0xbe8
                               [<c0257f3f>] pci_device_probe+0x36/0x57
                               [<c02d0e5f>] driver_probe_device+0xe1/0x15f
                               [<c043a2f9>] _spin_unlock+0x25/0x3b
                               [<c04375b2>] klist_next+0x58/0x6d
                               [<c02d0f6f>] __driver_attach+0x0/0x7f
                               [<c02d0fb8>] __driver_attach+0x49/0x7f
                               [<c02d0403>] bus_for_each_dev+0x36/0x58
                               [<c02d0cb7>] driver_attach+0x16/0x18
                               [<c02d0f6f>] __driver_attach+0x0/0x7f
                               [<c02d06fa>] bus_add_driver+0x6d/0x18d
                               [<c0258089>] __pci_register_driver+0x53/0x7f
                               [<c05be635>] e1000_init_module+0x45/0x7c
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       in-softirq-W at:
                               [<c011d20a>] __wake_up_common+0x32/0x5c
                               [<c0142822>] __lock_acquire+0x440/0xc07
                               [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c02dff01>] e1000_watchdog+0x0/0x5c9
                               [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                               [<c03aeb42>] netif_carrier_on+0x16/0x27
                               [<c02e0156>] e1000_watchdog+0x255/0x5c9
                               [<c02dff01>] e1000_watchdog+0x0/0x5c9
                               [<c012df52>] run_timer_softirq+0xfa/0x15d
                               [<c012a8a6>] __do_softirq+0x56/0xdb
                               [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                               [<c012a8b8>] __do_softirq+0x68/0xdb
                               [<c012a961>] do_softirq+0x36/0x51
                               [<c012ab07>] irq_exit+0x43/0x4e
                               [<c0114122>] smp_apic_timer_interrupt+0x74/0x80
                               [<c0104a01>] apic_timer_interrupt+0x29/0x38
                               [<c0104a0b>] apic_timer_interrupt+0x33/0x38
                               [<c01600d8>] sys_swapon+0x29c/0x9aa
                               [<c01021a6>] mwait_idle_with_hints+0x3b/0x3f
                               [<c0102447>] mwait_idle+0x0/0xf
                               [<c0102581>] cpu_idle+0x99/0xc6
                               [<c05a98c7>] start_kernel+0x2c7/0x2cf
                               [<c05a90e0>] unknown_bootoption+0x0/0x195
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
    ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abbbb>] linkwatch_fire_event+0x25/0x37
    [<c03aeb42>] netif_carrier_on+0x16/0x27
    [<c02ede2c>] bond_set_carrier+0x31/0x55
    [<c02ee9b6>] bond_select_active_slave+0x9c/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

   ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c0143062>] lock_acquire+0x79/0x93
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c0439e25>] _write_lock_bh+0x3b/0x64
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c013fe35>] put_lock_stats+0xa/0x1e
    [<c03a14db>] __dev_set_rx_mode+0x7b/0x7d
    [<c03a1675>] dev_set_rx_mode+0x23/0x36
    [<c03a3d50>] dev_open+0x5e/0x77
    [<c03a2a1f>] dev_change_flags+0x9d/0x14b
    [<c03a1823>] __dev_get_by_name+0x68/0x73
    [<c03e3850>] devinet_ioctl+0x22b/0x536
    [<c03a3b45>] dev_ioctl+0x46f/0x5b7
    [<c0399c78>] sock_ioctl+0x167/0x18b
    [<c0399b11>] sock_ioctl+0x0/0x18b
    [<c01725f7>] do_ioctl+0x1f/0x62
    [<c0172867>] vfs_ioctl+0x22d/0x23f
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c01728ac>] sys_ioctl+0x33/0x4b
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0412114>] igmp6_group_added+0x56/0x11d
    [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] igmp6_mc_seq_start+0xde/0x138
    [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
    [<c0401834>] ipv6_add_dev+0x21c/0x24b
    [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c0401e17>] addrconf_notify+0x60/0x7b7
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0141dac>] mark_held_locks+0x39/0x53
    [<c0439066>] mutex_lock_nested+0x286/0x2ac
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f88>] register_netdevice_notifier+0x49/0x126
    [<c05c5bda>] addrconf_init+0xad/0x193
    [<c05c5b48>] addrconf_init+0x1b/0x193
    [<c05c5a20>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff


the second lock's dependencies:
-> (&bond->lock){-.--} ops: 99 {
    initial-use  at:
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c014289c>] __lock_acquire+0x4ba/0xc07
                         [<c0141dac>] mark_held_locks+0x39/0x53
                         [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c0439eee>] _read_lock_bh+0x3b/0x64
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03aaa26>] rtnetlink_event+0x3f/0x42
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a31de>] register_netdevice+0x2a7/0x2e7
                         [<c02ed862>] bond_create+0x1f2/0x26a
                         [<c05bedcd>] bonding_init+0x761/0x7ea
                         [<c05be635>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c014286a>] __lock_acquire+0x488/0xc07
                         [<c012093c>] try_to_wake_up+0x2ce/0x2d8
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c0439e25>] _write_lock_bh+0x3b/0x64
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c03a14db>] __dev_set_rx_mode+0x7b/0x7d
                         [<c03a1675>] dev_set_rx_mode+0x23/0x36
                         [<c03a3d50>] dev_open+0x5e/0x77
                         [<c03a2a1f>] dev_change_flags+0x9d/0x14b
                         [<c03a1823>] __dev_get_by_name+0x68/0x73
                         [<c03e3850>] devinet_ioctl+0x22b/0x536
                         [<c03a3b45>] dev_ioctl+0x46f/0x5b7
                         [<c0399c78>] sock_ioctl+0x167/0x18b
                         [<c0399b11>] sock_ioctl+0x0/0x18b
                         [<c01725f7>] do_ioctl+0x1f/0x62
                         [<c0172867>] vfs_ioctl+0x22d/0x23f
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c01728ac>] sys_ioctl+0x33/0x4b
                         [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                         [<ffffffff>] 0xffffffff
    softirq-on-R at:
                         [<c0141986>] mark_lock+0x64/0x451
                         [<c013575e>] __kernel_text_address+0x5/0xe
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c0142889>] __lock_acquire+0x4a7/0xc07
                         [<c013fc76>] save_trace+0x37/0x89
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c0439f4d>] _read_lock+0x36/0x5f
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c0133cab>] run_workqueue+0xdb/0x1b6
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c02eee44>] bond_mii_monitor+0x0/0x85
                         [<c01346cb>] worker_thread+0x0/0x85
                         [<c0134744>] worker_thread+0x79/0x85
                         [<c0137179>] autoremove_wake_function+0x0/0x35
                         [<c01370c2>] kthread+0x38/0x5e
                         [<c013708a>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-R at:
                         [<c013fe0a>] get_lock_stats+0xd/0x2e
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c0142844>] __lock_acquire+0x462/0xc07
                         [<c0141dac>] mark_held_locks+0x39/0x53
                         [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c0439eee>] _read_lock_bh+0x3b/0x64
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03aaa26>] rtnetlink_event+0x3f/0x42
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a31de>] register_netdevice+0x2a7/0x2e7
                         [<c02ed862>] bond_create+0x1f2/0x26a
                         [<c05bedcd>] bonding_init+0x761/0x7ea
                         [<c05be635>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c08777d0>] __key.32969+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c02ee492>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7c3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee9af>] bond_select_active_slave+0x95/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  -> (lweventlist_lock){.+..} ops: 10 {
     initial-use  at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c02e365c>] e1000_read_phy_reg+0x1c7/0x1d3
                           [<c02e348b>] e1000_write_phy_reg+0xb9/0xc3
                           [<c024a7de>] delay_tsc+0x25/0x3b
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                           [<c02e1c43>] e1000_probe+0xad1/0xbe8
                           [<c0257f3f>] pci_device_probe+0x36/0x57
                           [<c02d0e5f>] driver_probe_device+0xe1/0x15f
                           [<c043a2f9>] _spin_unlock+0x25/0x3b
                           [<c04375b2>] klist_next+0x58/0x6d
                           [<c02d0f6f>] __driver_attach+0x0/0x7f
                           [<c02d0fb8>] __driver_attach+0x49/0x7f
                           [<c02d0403>] bus_for_each_dev+0x36/0x58
                           [<c02d0cb7>] driver_attach+0x16/0x18
                           [<c02d0f6f>] __driver_attach+0x0/0x7f
                           [<c02d06fa>] bus_add_driver+0x6d/0x18d
                           [<c0258089>] __pci_register_driver+0x53/0x7f
                           [<c05be635>] e1000_init_module+0x45/0x7c
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     in-softirq-W at:
                           [<c011d20a>] __wake_up_common+0x32/0x5c
                           [<c0142822>] __lock_acquire+0x440/0xc07
                           [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c02dff01>] e1000_watchdog+0x0/0x5c9
                           [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                           [<c03aeb42>] netif_carrier_on+0x16/0x27
                           [<c02e0156>] e1000_watchdog+0x255/0x5c9
                           [<c02dff01>] e1000_watchdog+0x0/0x5c9
                           [<c012df52>] run_timer_softirq+0xfa/0x15d
                           [<c012a8a6>] __do_softirq+0x56/0xdb
                           [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                           [<c012a8b8>] __do_softirq+0x68/0xdb
                           [<c012a961>] do_softirq+0x36/0x51
                           [<c012ab07>] irq_exit+0x43/0x4e
                           [<c0114122>] smp_apic_timer_interrupt+0x74/0x80
                           [<c0104a01>] apic_timer_interrupt+0x29/0x38
                           [<c0104a0b>] apic_timer_interrupt+0x33/0x38
                           [<c01600d8>] sys_swapon+0x29c/0x9aa
                           [<c01021a6>] mwait_idle_with_hints+0x3b/0x3f
                           [<c0102447>] mwait_idle+0x0/0xf
                           [<c0102581>] cpu_idle+0x99/0xc6
                           [<c05a98c7>] start_kernel+0x2c7/0x2cf
                           [<c05a90e0>] unknown_bootoption+0x0/0x195
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abbbb>] linkwatch_fire_event+0x25/0x37
    [<c03aeb42>] netif_carrier_on+0x16/0x27
    [<c02ede2c>] bond_set_carrier+0x31/0x55
    [<c02ee9b6>] bond_select_active_slave+0x9c/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff


stack backtrace:
Pid: 9, comm: events/0 Not tainted 2.6.24-rc5 #1
  [<c0140b38>] print_irq_inversion_bug+0x108/0x112
  [<c014191d>] check_usage_forwards+0x3c/0x41
  [<c0141b09>] mark_lock+0x1e7/0x451
  [<c0142822>] __lock_acquire+0x440/0xc07
  [<c013e0f3>] clockevents_program_event+0xe0/0xee
  [<c0143062>] lock_acquire+0x79/0x93
  [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
  [<c0439d62>] _spin_lock_bh+0x3b/0x64
  [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
  [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
  [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
  [<c012df52>] run_timer_softirq+0xfa/0x15d
  [<c012a8a6>] __do_softirq+0x56/0xdb
  [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
  [<c012a8b8>] __do_softirq+0x68/0xdb
  [<c012a961>] do_softirq+0x36/0x51
  [<c012ae4a>] local_bh_enable_ip+0xad/0xed
  [<c03bf107>] rt_run_flush+0x64/0x8b
  [<c03e9296>] fib_netdev_event+0x61/0x65
  [<c013ac20>] notifier_call_chain+0x2a/0x52
  [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
  [<c03a2ae5>] netdev_state_change+0x18/0x29
  [<c03abd1d>] __linkwatch_run_queue+0x150/0x17e
  [<c03abd68>] linkwatch_event+0x1d/0x22
  [<c0133cab>] run_workqueue+0xdb/0x1b6
  [<c0133c57>] run_workqueue+0x87/0x1b6
  [<c03abd4b>] linkwatch_event+0x0/0x22
  [<c01346cb>] worker_thread+0x0/0x85
  [<c0134744>] worker_thread+0x79/0x85
  [<c0137179>] autoremove_wake_function+0x0/0x35
  [<c01370c2>] kthread+0x38/0x5e
  [<c013708a>] kthread+0x0/0x5e
  [<c0104baf>] kernel_thread_helper+0x7/0x10
  =======================



Best regards,

 				Krzysztof Ol
Comment 10 Anonymous Emailer 2007-12-14 10:26:54 UTC
Reply-To: andy@greyhouse.net

On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Wed, 12 Dec 2007, Jay Vosburgh wrote:
> 
> >Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >
> >>>diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix drivers/net/bonding/bond_sysfs.c
> >>>--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >>>+++ a/drivers/net/bonding/bond_sysfs.c
> >>>@@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
> >>>out:
> >>>       write_unlock_bh(&bond->lock);
> >>>
> >>>-       rtnl_unlock();
> >>>-
> >>
> >>Looking at the changeset that added this perhaps the intention
> >>is to hold the lock? If so we should add an rtnl_lock to the start
> >>of the function.
> >
> >     Yes, this function needs to hold locks, and more than just
> >what's there now.  I believe the following should be correct; I haven't
> >tested it, though (I'm supposedly on vacation right now).
> >
> >     The following change should be correct for the
> >bonding_store_primary case discussed in this thread, and also corrects
> >the bonding_store_active case which performs similar functions.
> >
> >     The bond_change_active_slave and bond_select_active_slave
> >functions both require rtnl, bond->lock for read and curr_slave_lock for
> >write_bh, and no other locks.  This is so that the lower level
> >mode-specific functions can release locks down to just rtnl in order to
> >call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
> >
> >Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
> >
> >diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
> >index 11b76b3..28a2d80 100644
> >--- a/drivers/net/bonding/bond_sysfs.c
> >+++ b/drivers/net/bonding/bond_sysfs.c
> >@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
> >     struct slave *slave;
> >     struct bonding *bond = to_bond(d);
> >
> >-    write_lock_bh(&bond->lock);
> >+    rtnl_lock();
> >+    read_lock(&bond->lock);
> >+    write_lock_bh(&bond->curr_slave_lock);
> >+
> >     if (!USES_PRIMARY(bond->params.mode)) {
> >             printk(KERN_INFO DRV_NAME
> >                    ": %s: Unable to set primary slave; %s is in mode %d\n",
> >@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
> >             }
> >     }
> >out:
> >-    write_unlock_bh(&bond->lock);
> >-
> >+    write_unlock_bh(&bond->curr_slave_lock);
> >+    read_unlock(&bond->lock);
> >     rtnl_unlock();
> >
> >     return count;
> >@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
> >     struct bonding *bond = to_bond(d);
> >
> >     rtnl_lock();
> >-    write_lock_bh(&bond->lock);
> >+    read_lock(&bond->lock);
> >+    write_lock_bh(&bond->curr_slave_lock);
> >
> >     if (!USES_PRIMARY(bond->params.mode)) {
> >             printk(KERN_INFO DRV_NAME
> >@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
> >             }
> >     }
> >out:
> >-    write_unlock_bh(&bond->lock);
> >+    write_unlock_bh(&bond->curr_slave_lock);
> >+    read_unlock(&bond->lock);
> >     rtnl_unlock();
> >
> >     return count;
> 
> Vanilla 2.6.24-rc5 plus this patch:
> 
> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.24-rc5 #1
> ---------------------------------------------------------
> events/0/9 just changed the state of lock:
>  (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
> but this lock took another, soft-read-irq-unsafe lock in the past:
>  (&bond->lock){-.--}
> 
> and interrupts could create inverse lock ordering between them.
> 
> 
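
For anyone reading along who is not used to this kind of lockdep report: the rule being enforced is that a lock class which is ever taken from softirq context must not depend, directly or through other locks, on a class that has ever been held with softirqs enabled. The toy model below only illustrates that rule and is not how lockdep is actually implemented. The names `struct lock_graph`, `check_softirq_inversion()` and `example_graph()` are made up for this sketch, and the single mca_lock to bond->lock edge is a simplification of the dependency chain in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOCKS 8

/* Hypothetical, much reduced view of what is tracked per lock class. */
struct lock_class {
    const char *name;
    bool used_in_softirq;   /* acquired at least once from softirq context */
    bool softirq_unsafe;    /* held at least once with softirqs enabled */
};

struct lock_graph {
    struct lock_class cls[MAX_LOCKS];
    bool edge[MAX_LOCKS][MAX_LOCKS]; /* edge[a][b]: b was taken while a was held */
    size_t count;
};

/* Depth-first search for a softirq-unsafe class reachable from `from`. */
static int find_unsafe(const struct lock_graph *g, size_t from, bool *seen)
{
    seen[from] = true;
    for (size_t to = 0; to < g->count; to++) {
        if (!g->edge[from][to] || seen[to])
            continue;
        if (g->cls[to].softirq_unsafe)
            return (int)to;
        int hit = find_unsafe(g, to, seen);
        if (hit >= 0)
            return hit;
    }
    return -1;
}

/* Index of the unsafe class that makes `lock` an inversion risk, or -1. */
int check_softirq_inversion(const struct lock_graph *g, size_t lock)
{
    bool seen[MAX_LOCKS] = { false };

    assert(lock < g->count && g->count <= MAX_LOCKS);
    if (!g->cls[lock].used_in_softirq)
        return -1;              /* never taken in softirq: rule does not apply */
    return find_unsafe(g, lock, seen);
}

/* Two-class example loosely modelled on the report (assumed, simplified). */
struct lock_graph example_graph(bool every_site_disables_bh)
{
    struct lock_graph g = { .count = 2 };

    g.cls[0] = (struct lock_class){ "&mc->mca_lock", true, false };
    g.cls[1] = (struct lock_class){ "&bond->lock", false, !every_site_disables_bh };
    g.edge[0][1] = true;        /* bond->lock taken while mca_lock was held */
    return g;
}
```

In this model the warning goes away only when no call site anywhere holds bond->lock with softirqs enabled; a single plain read_lock() is enough to mark the whole class.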

Grrr, I should have seen that -- sorry.  Try your luck with this instead:

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 11b76b3..0694254 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
 	struct slave *slave;
 	struct bonding *bond = to_bond(d);
 
-	write_lock_bh(&bond->lock);
+	rtnl_lock();
+	read_lock_bh(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
+
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
 		       ": %s: Unable to set primary slave; %s is in mode %d\n",
@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
 		}
 	}
 out:
-	write_unlock_bh(&bond->lock);
-
+	write_unlock_bh(&bond->curr_slave_lock);
+	read_unlock_bh(&bond->lock);
 	rtnl_unlock();
 
 	return count;
@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
 	struct bonding *bond = to_bond(d);
 
 	rtnl_lock();
-	write_lock_bh(&bond->lock);
+	read_lock_bh(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
 
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
 		}
 	}
 out:
-	write_unlock_bh(&bond->lock);
+	write_unlock_bh(&bond->curr_slave_lock);
+	read_unlock_bh(&bond->lock);
 	rtnl_unlock();
 
 	return count;
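
The nesting both versions of the patch aim for (rtnl outermost, then bond->lock for read, then curr_slave_lock for write, released in reverse) can be written down as a small user-space order checker. This is only a sketch under assumptions: the ranks, `struct held_locks` and the `rank_*()` helpers are hypothetical and do not exist in the bonding driver, and nothing here models the _bh part.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: a lower rank must be taken before a higher one. */
enum lock_rank { RANK_RTNL, RANK_BOND_LOCK, RANK_CURR_SLAVE_LOCK, RANK_COUNT };

struct held_locks {
    bool held[RANK_COUNT];
};

/* Take a lock; refuse if it, or any higher-ranked lock, is already held. */
bool rank_acquire(struct held_locks *h, enum lock_rank r)
{
    assert(r < RANK_COUNT);
    for (int i = r; i < RANK_COUNT; i++)
        if (h->held[i])
            return false;
    h->held[r] = true;
    return true;
}

/* Release a lock; refuse if it is not held or a higher-ranked lock still is. */
bool rank_release(struct held_locks *h, enum lock_rank r)
{
    assert(r < RANK_COUNT);
    if (!h->held[r])
        return false;
    for (int i = r + 1; i < RANK_COUNT; i++)
        if (h->held[i])
            return false;
    h->held[r] = false;
    return true;
}

/*
 * Same order as the patched bonding_store_primary():
 *   rtnl_lock(); read lock on bond->lock; write lock on curr_slave_lock;
 *   ... then the three unlocks in reverse.
 * On a failed step the locks taken so far stay marked as held.
 */
bool store_primary_sequence(struct held_locks *h)
{
    if (!rank_acquire(h, RANK_RTNL) ||
        !rank_acquire(h, RANK_BOND_LOCK) ||
        !rank_acquire(h, RANK_CURR_SLAVE_LOCK))
        return false;

    /* the primary/active slave would be changed here */

    return rank_release(h, RANK_CURR_SLAVE_LOCK) &&
           rank_release(h, RANK_BOND_LOCK) &&
           rank_release(h, RANK_RTNL);
}
```

The checker is stricter than the kernel in that it also insists on reverse-order release; that matches what the patch does but is a choice made for this sketch.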
Comment 11 Krzysztof Oledzki 2007-12-14 10:58:01 UTC

On Fri, 14 Dec 2007, Andy Gospodarek wrote:

> On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
>>
>>
>> On Wed, 12 Dec 2007, Jay Vosburgh wrote:
>>
>>> Herbert Xu <herbert@gondor.apana.org.au> wrote:
>>>
>>>>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix drivers/net/bonding/bond_sysfs.c
>>>>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>>>> +++ a/drivers/net/bonding/bond_sysfs.c
>>>>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>>>>> out:
>>>>>       write_unlock_bh(&bond->lock);
>>>>>
>>>>> -       rtnl_unlock();
>>>>> -
>>>>
>>>> Looking at the changeset that added this perhaps the intention
>>>> is to hold the lock? If so we should add an rtnl_lock to the start
>>>> of the function.
>>>
>>>     Yes, this function needs to hold locks, and more than just
>>> what's there now.  I believe the following should be correct; I haven't
>>> tested it, though (I'm supposedly on vacation right now).
>>>
>>>     The following change should be correct for the
>>> bonding_store_primary case discussed in this thread, and also corrects
>>> the bonding_store_active case which performs similar functions.
>>>
>>>     The bond_change_active_slave and bond_select_active_slave
>>> functions both require rtnl, bond->lock for read and curr_slave_lock for
>>> write_bh, and no other locks.  This is so that the lower level
>>> mode-specific functions can release locks down to just rtnl in order to
>>> call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
>>>
>>> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>>>
>>> diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>>> index 11b76b3..28a2d80 100644
>>> --- a/drivers/net/bonding/bond_sysfs.c
>>> +++ b/drivers/net/bonding/bond_sysfs.c
>>> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
>>>     struct slave *slave;
>>>     struct bonding *bond = to_bond(d);
>>>
>>> -   write_lock_bh(&bond->lock);
>>> +   rtnl_lock();
>>> +   read_lock(&bond->lock);
>>> +   write_lock_bh(&bond->curr_slave_lock);
>>> +
>>>     if (!USES_PRIMARY(bond->params.mode)) {
>>>             printk(KERN_INFO DRV_NAME
>>>                    ": %s: Unable to set primary slave; %s is in mode %d\n",
>>> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
>>>             }
>>>     }
>>> out:
>>> -   write_unlock_bh(&bond->lock);
>>> -
>>> +   write_unlock_bh(&bond->curr_slave_lock);
>>> +   read_unlock(&bond->lock);
>>>     rtnl_unlock();
>>>
>>>     return count;
>>> @@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
>>>     struct bonding *bond = to_bond(d);
>>>
>>>     rtnl_lock();
>>> -   write_lock_bh(&bond->lock);
>>> +   read_lock(&bond->lock);
>>> +   write_lock_bh(&bond->curr_slave_lock);
>>>
>>>     if (!USES_PRIMARY(bond->params.mode)) {
>>>             printk(KERN_INFO DRV_NAME
>>> @@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
>>>             }
>>>     }
>>> out:
>>> -   write_unlock_bh(&bond->lock);
>>> +   write_unlock_bh(&bond->curr_slave_lock);
>>> +   read_unlock(&bond->lock);
>>>     rtnl_unlock();
>>>
>>>     return count;
>>
>> Vanilla 2.6.24-rc5 plus this patch:
>>
>> =========================================================
>> [ INFO: possible irq lock inversion dependency detected ]
>> 2.6.24-rc5 #1
>> ---------------------------------------------------------
>> events/0/9 just changed the state of lock:
>>  (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
>> but this lock took another, soft-read-irq-unsafe lock in the past:
>>  (&bond->lock){-.--}
>>
>> and interrupts could create inverse lock ordering between them.
>>
>>
>
> Grrr, I should have seen that -- sorry.  Try your luck with this instead:
<CUT>

No luck.

bonding: bond0: setting mode to active-backup (1).
bonding: bond0: Setting MII monitoring interval to 100.
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave eth0.
e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
bonding: bond0: making interface eth0 the new active one.
bonding: bond0: first active interface up!
bonding: bond0: enslaving eth0 as an active interface with an up link.
bonding: bond0: Adding slave eth1.
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.24-rc5 #1
---------------------------------------------------------
events/0/9 just changed the state of lock:
  (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
but this lock took another, soft-read-irq-unsafe lock in the past:
  (&bond->lock){-.--}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
4 locks held by events/0/9:
  #0:  (events){--..}, at: [<c0133c57>] run_workqueue+0x87/0x1b6
  #1:  ((linkwatch_work).work){--..}, at: [<c0133c57>] run_workqueue+0x87/0x1b6
  #2:  (rtnl_mutex){--..}, at: [<c03abd50>] linkwatch_event+0x5/0x22
  #3:  (&ndev->lock){-.-+}, at: [<c0411b61>] mld_ifc_timer_expire+0x17/0x1fb

the first lock's dependencies:
-> (&mc->mca_lock){-+..} ops: 10 {
    initial-use  at:
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014289c>] __lock_acquire+0x4ba/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0439d62>] _spin_lock_bh+0x3b/0x64
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c0401834>] ipv6_add_dev+0x21c/0x24b
                         [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5b40>] addrconf_init+0x13/0x193
                         [<c0199f63>] proc_net_fops_create+0x10/0x21
                         [<c0419b38>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c5a20>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    in-softirq-W at:
                         [<c0142822>] __lock_acquire+0x440/0xc07
                         [<c013fe0a>] get_lock_stats+0xd/0x2e
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c0439d62>] _spin_lock_bh+0x3b/0x64
                         [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                         [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c012df52>] run_timer_softirq+0xfa/0x15d
                         [<c012a8a6>] __do_softirq+0x56/0xdb
                         [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                         [<c012a8b8>] __do_softirq+0x68/0xdb
                         [<c012a961>] do_softirq+0x36/0x51
                         [<c012ae4a>] local_bh_enable_ip+0xad/0xed
                         [<c03bf107>] rt_run_flush+0x64/0x8b
                         [<c03e9296>] fib_netdev_event+0x61/0x65
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a2ae5>] netdev_state_change+0x18/0x29
                         [<c03abd1d>] __linkwatch_run_queue+0x150/0x17e
                         [<c03abd68>] linkwatch_event+0x1d/0x22
                         [<c0133cab>] run_workqueue+0xdb/0x1b6
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c03abd4b>] linkwatch_event+0x0/0x22
                         [<c01346cb>] worker_thread+0x0/0x85
                         [<c0134744>] worker_thread+0x79/0x85
                         [<c0137179>] autoremove_wake_function+0x0/0x35
                         [<c01370c2>] kthread+0x38/0x5e
                         [<c013708a>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c01417ee>] find_usage_backwards+0xbb/0xe2
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014286a>] __lock_acquire+0x488/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0439d62>] _spin_lock_bh+0x3b/0x64
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c04120d6>] igmp6_group_added+0x18/0x11d
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c0412452>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c0401834>] ipv6_add_dev+0x21c/0x24b
                         [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5b40>] addrconf_init+0x13/0x193
                         [<c0199f63>] proc_net_fops_create+0x10/0x21
                         [<c0419b38>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c5a20>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c087e2d8>] __key.30798+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0412114>] igmp6_group_added+0x56/0x11d
    [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] igmp6_mc_seq_start+0xde/0x138
    [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
    [<c0401834>] ipv6_add_dev+0x21c/0x24b
    [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c0401e17>] addrconf_notify+0x60/0x7b7
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0141dac>] mark_held_locks+0x39/0x53
    [<c0439066>] mutex_lock_nested+0x286/0x2ac
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f88>] register_netdevice_notifier+0x49/0x126
    [<c05c5bda>] addrconf_init+0xad/0x193
    [<c05c5b48>] addrconf_init+0x1b/0x193
    [<c05c5a20>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff

  -> (&bonding_netdev_xmit_lock_key){-...} ops: 6 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c0877804>] bonding_netdev_xmit_lock_key+0x0/0x8
   -> (&bond->lock){-.--} ops: 98 {
      initial-use  at:
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c014289c>] __lock_acquire+0x4ba/0xc07
                             [<c0141dac>] mark_held_locks+0x39/0x53
                             [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c0439eee>] _read_lock_bh+0x3b/0x64
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03aaa26>] rtnetlink_event+0x3f/0x42
                             [<c013ac20>] notifier_call_chain+0x2a/0x52
                             [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a31de>] register_netdevice+0x2a7/0x2e7
                             [<c02ed862>] bond_create+0x1f2/0x26a
                             [<c05bedcd>] bonding_init+0x761/0x7ea
                             [<c05be635>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-W at:
                             [<c014286a>] __lock_acquire+0x488/0xc07
                             [<c012093c>] try_to_wake_up+0x2ce/0x2d8
                             [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c0439e25>] _write_lock_bh+0x3b/0x64
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c03a14db>] __dev_set_rx_mode+0x7b/0x7d
                             [<c03a1675>] dev_set_rx_mode+0x23/0x36
                             [<c03a3d50>] dev_open+0x5e/0x77
                             [<c03a2a1f>] dev_change_flags+0x9d/0x14b
                             [<c03a1823>] __dev_get_by_name+0x68/0x73
                             [<c03e3850>] devinet_ioctl+0x22b/0x536
                             [<c03a3b45>] dev_ioctl+0x46f/0x5b7
                             [<c0399c78>] sock_ioctl+0x167/0x18b
                             [<c0399b11>] sock_ioctl+0x0/0x18b
                             [<c01725f7>] do_ioctl+0x1f/0x62
                             [<c0172867>] vfs_ioctl+0x22d/0x23f
                             [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                             [<c01728ac>] sys_ioctl+0x33/0x4b
                             [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                             [<ffffffff>] 0xffffffff
      softirq-on-R at:
                             [<c0141986>] mark_lock+0x64/0x451
                             [<c013575e>] __kernel_text_address+0x5/0xe
                             [<c0104ee2>] dump_trace+0x83/0x8d
                             [<c0142889>] __lock_acquire+0x4a7/0xc07
                             [<c013fc76>] save_trace+0x37/0x89
                             [<c0133c57>] run_workqueue+0x87/0x1b6
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c0439f4d>] _read_lock+0x36/0x5f
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c0133cab>] run_workqueue+0xdb/0x1b6
                             [<c0133c57>] run_workqueue+0x87/0x1b6
                             [<c02eee44>] bond_mii_monitor+0x0/0x85
                             [<c01346cb>] worker_thread+0x0/0x85
                             [<c0134744>] worker_thread+0x79/0x85
                             [<c0137179>] autoremove_wake_function+0x0/0x35
                             [<c01370c2>] kthread+0x38/0x5e
                             [<c013708a>] kthread+0x0/0x5e
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-R at:
                             [<c013fe0a>] get_lock_stats+0xd/0x2e
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c0142844>] __lock_acquire+0x462/0xc07
                             [<c0141dac>] mark_held_locks+0x39/0x53
                             [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c0439eee>] _read_lock_bh+0x3b/0x64
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03aaa26>] rtnetlink_event+0x3f/0x42
                             [<c013ac20>] notifier_call_chain+0x2a/0x52
                             [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a31de>] register_netdevice+0x2a7/0x2e7
                             [<c02ed862>] bond_create+0x1f2/0x26a
                             [<c05bedcd>] bonding_init+0x761/0x7ea
                             [<c05be635>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
    }
    ... key      at: [<c08777d0>] __key.32969+0x0/0x8
    -> (_xmit_ETHER){-...} ops: 8 {
       initial-use  at:
                               [<c014289c>] __lock_acquire+0x4ba/0xc07
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0439d62>] _spin_lock_bh+0x3b/0x64
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0412114>] igmp6_group_added+0x56/0x11d
                               [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                               [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c0401834>] ipv6_add_dev+0x21c/0x24b
                               [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c0401e17>] addrconf_notify+0x60/0x7b7
                               [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                               [<c0141dac>] mark_held_locks+0x39/0x53
                               [<c0439066>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                               [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                               [<c05c5bda>] addrconf_init+0xad/0x193
                               [<c05c5b48>] addrconf_init+0x1b/0x193
                               [<c05c5a20>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       hardirq-on-W at:
                               [<c0141986>] mark_lock+0x64/0x451
                               [<c014286a>] __lock_acquire+0x488/0xc07
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0439d62>] _spin_lock_bh+0x3b/0x64
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                               [<c0412114>] igmp6_group_added+0x56/0x11d
                               [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                               [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c0401834>] ipv6_add_dev+0x21c/0x24b
                               [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c0401e17>] addrconf_notify+0x60/0x7b7
                               [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                               [<c0141dac>] mark_held_locks+0x39/0x53
                               [<c0439066>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                               [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                               [<c05c5bda>] addrconf_init+0xad/0x193
                               [<c05c5b48>] addrconf_init+0x1b/0x193
                               [<c05c5a20>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
    ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c02ee492>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7c3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee9af>] bond_select_active_slave+0x95/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

    -> (lweventlist_lock){.+..} ops: 10 {
       initial-use  at:
                               [<c0141986>] mark_lock+0x64/0x451
                               [<c014289c>] __lock_acquire+0x4ba/0xc07
                               [<c02e365c>] e1000_read_phy_reg+0x1c7/0x1d3
                               [<c02e348b>] e1000_write_phy_reg+0xb9/0xc3
                               [<c024a7de>] delay_tsc+0x25/0x3b
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                               [<c02e1c43>] e1000_probe+0xad1/0xbe8
                               [<c0257f3f>] pci_device_probe+0x36/0x57
                               [<c02d0e5f>] driver_probe_device+0xe1/0x15f
                               [<c043a2f9>] _spin_unlock+0x25/0x3b
                               [<c04375b2>] klist_next+0x58/0x6d
                               [<c02d0f6f>] __driver_attach+0x0/0x7f
                               [<c02d0fb8>] __driver_attach+0x49/0x7f
                               [<c02d0403>] bus_for_each_dev+0x36/0x58
                               [<c02d0cb7>] driver_attach+0x16/0x18
                               [<c02d0f6f>] __driver_attach+0x0/0x7f
                               [<c02d06fa>] bus_add_driver+0x6d/0x18d
                               [<c0258089>] __pci_register_driver+0x53/0x7f
                               [<c05be635>] e1000_init_module+0x45/0x7c
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       in-softirq-W at:
                               [<c011d20a>] __wake_up_common+0x32/0x5c
                               [<c0142822>] __lock_acquire+0x440/0xc07
                               [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c02dff01>] e1000_watchdog+0x0/0x5c9
                               [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                               [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                               [<c03aeb42>] netif_carrier_on+0x16/0x27
                               [<c02e0156>] e1000_watchdog+0x255/0x5c9
                               [<c02dff01>] e1000_watchdog+0x0/0x5c9
                               [<c012df52>] run_timer_softirq+0xfa/0x15d
                               [<c012a8a6>] __do_softirq+0x56/0xdb
                               [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                               [<c012a8b8>] __do_softirq+0x68/0xdb
                               [<c012a961>] do_softirq+0x36/0x51
                               [<c012ab07>] irq_exit+0x43/0x4e
                               [<c0114122>] smp_apic_timer_interrupt+0x74/0x80
                               [<c0104a01>] apic_timer_interrupt+0x29/0x38
                               [<c0104a0b>] apic_timer_interrupt+0x33/0x38
                               [<c01600d8>] sys_swapon+0x29c/0x9aa
                               [<c01021a6>] mwait_idle_with_hints+0x3b/0x3f
                               [<c0102447>] mwait_idle+0x0/0xf
                               [<c0102581>] cpu_idle+0x99/0xc6
                               [<c05a98c7>] start_kernel+0x2c7/0x2cf
                               [<c05a90e0>] unknown_bootoption+0x0/0x195
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
    ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abbbb>] linkwatch_fire_event+0x25/0x37
    [<c03aeb42>] netif_carrier_on+0x16/0x27
    [<c02ede2c>] bond_set_carrier+0x31/0x55
    [<c02ee9b6>] bond_select_active_slave+0x9c/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

   ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c0143062>] lock_acquire+0x79/0x93
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c0439e25>] _write_lock_bh+0x3b/0x64
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c013fe35>] put_lock_stats+0xa/0x1e
    [<c03a14db>] __dev_set_rx_mode+0x7b/0x7d
    [<c03a1675>] dev_set_rx_mode+0x23/0x36
    [<c03a3d50>] dev_open+0x5e/0x77
    [<c03a2a1f>] dev_change_flags+0x9d/0x14b
    [<c03a1823>] __dev_get_by_name+0x68/0x73
    [<c03e3850>] devinet_ioctl+0x22b/0x536
    [<c03a3b45>] dev_ioctl+0x46f/0x5b7
    [<c0399c78>] sock_ioctl+0x167/0x18b
    [<c0399b11>] sock_ioctl+0x0/0x18b
    [<c01725f7>] do_ioctl+0x1f/0x62
    [<c0172867>] vfs_ioctl+0x22d/0x23f
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c01728ac>] sys_ioctl+0x33/0x4b
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0412114>] igmp6_group_added+0x56/0x11d
    [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] igmp6_mc_seq_start+0xde/0x138
    [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
    [<c0401834>] ipv6_add_dev+0x21c/0x24b
    [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c0401e17>] addrconf_notify+0x60/0x7b7
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0141dac>] mark_held_locks+0x39/0x53
    [<c0439066>] mutex_lock_nested+0x286/0x2ac
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
    [<c03a3f88>] register_netdevice_notifier+0x49/0x126
    [<c05c5bda>] addrconf_init+0xad/0x193
    [<c05c5b48>] addrconf_init+0x1b/0x193
    [<c05c5a20>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff


the second lock's dependencies:
-> (&bond->lock){-.--} ops: 98 {
    initial-use  at:
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c014289c>] __lock_acquire+0x4ba/0xc07
                         [<c0141dac>] mark_held_locks+0x39/0x53
                         [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c0439eee>] _read_lock_bh+0x3b/0x64
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03aaa26>] rtnetlink_event+0x3f/0x42
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a31de>] register_netdevice+0x2a7/0x2e7
                         [<c02ed862>] bond_create+0x1f2/0x26a
                         [<c05bedcd>] bonding_init+0x761/0x7ea
                         [<c05be635>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c014286a>] __lock_acquire+0x488/0xc07
                         [<c012093c>] try_to_wake_up+0x2ce/0x2d8
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c0439e25>] _write_lock_bh+0x3b/0x64
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c03a14db>] __dev_set_rx_mode+0x7b/0x7d
                         [<c03a1675>] dev_set_rx_mode+0x23/0x36
                         [<c03a3d50>] dev_open+0x5e/0x77
                         [<c03a2a1f>] dev_change_flags+0x9d/0x14b
                         [<c03a1823>] __dev_get_by_name+0x68/0x73
                         [<c03e3850>] devinet_ioctl+0x22b/0x536
                         [<c03a3b45>] dev_ioctl+0x46f/0x5b7
                         [<c0399c78>] sock_ioctl+0x167/0x18b
                         [<c0399b11>] sock_ioctl+0x0/0x18b
                         [<c01725f7>] do_ioctl+0x1f/0x62
                         [<c0172867>] vfs_ioctl+0x22d/0x23f
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c01728ac>] sys_ioctl+0x33/0x4b
                         [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                         [<ffffffff>] 0xffffffff
    softirq-on-R at:
                         [<c0141986>] mark_lock+0x64/0x451
                         [<c013575e>] __kernel_text_address+0x5/0xe
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c0142889>] __lock_acquire+0x4a7/0xc07
                         [<c013fc76>] save_trace+0x37/0x89
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c0439f4d>] _read_lock+0x36/0x5f
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c0133cab>] run_workqueue+0xdb/0x1b6
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c02eee44>] bond_mii_monitor+0x0/0x85
                         [<c01346cb>] worker_thread+0x0/0x85
                         [<c0134744>] worker_thread+0x79/0x85
                         [<c0137179>] autoremove_wake_function+0x0/0x35
                         [<c01370c2>] kthread+0x38/0x5e
                         [<c013708a>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-R at:
                         [<c013fe0a>] get_lock_stats+0xd/0x2e
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c0142844>] __lock_acquire+0x462/0xc07
                         [<c0141dac>] mark_held_locks+0x39/0x53
                         [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c0439eee>] _read_lock_bh+0x3b/0x64
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c03aa3e7>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03aa965>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03aaa26>] rtnetlink_event+0x3f/0x42
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a31de>] register_netdevice+0x2a7/0x2e7
                         [<c02ed862>] bond_create+0x1f2/0x26a
                         [<c05bedcd>] bonding_init+0x761/0x7ea
                         [<c05be635>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c08777d0>] __key.32969+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0439d62>] _spin_lock_bh+0x3b/0x64
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c03a5aa1>] dev_mc_add+0x1a/0x6a
                           [<c0412114>] igmp6_group_added+0x56/0x11d
                           [<c04124a8>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0xde/0x138
                           [<c04124dd>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412205>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0401834>] ipv6_add_dev+0x21c/0x24b
                           [<c040b07d>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401e17>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c0439066>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f4d>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f88>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c0439d62>] _spin_lock_bh+0x3b/0x64
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c03a5aa1>] dev_mc_add+0x1a/0x6a
    [<c02ee492>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7c3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee9af>] bond_select_active_slave+0x95/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  -> (lweventlist_lock){.+..} ops: 10 {
     initial-use  at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c02e365c>] e1000_read_phy_reg+0x1c7/0x1d3
                           [<c02e348b>] e1000_write_phy_reg+0xb9/0xc3
                           [<c024a7de>] delay_tsc+0x25/0x3b
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                           [<c02e1c43>] e1000_probe+0xad1/0xbe8
                           [<c0257f3f>] pci_device_probe+0x36/0x57
                           [<c02d0e5f>] driver_probe_device+0xe1/0x15f
                           [<c043a2f9>] _spin_unlock+0x25/0x3b
                           [<c04375b2>] klist_next+0x58/0x6d
                           [<c02d0f6f>] __driver_attach+0x0/0x7f
                           [<c02d0fb8>] __driver_attach+0x49/0x7f
                           [<c02d0403>] bus_for_each_dev+0x36/0x58
                           [<c02d0cb7>] driver_attach+0x16/0x18
                           [<c02d0f6f>] __driver_attach+0x0/0x7f
                           [<c02d06fa>] bus_add_driver+0x6d/0x18d
                           [<c0258089>] __pci_register_driver+0x53/0x7f
                           [<c05be635>] e1000_init_module+0x45/0x7c
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     in-softirq-W at:
                           [<c011d20a>] __wake_up_common+0x32/0x5c
                           [<c0142822>] __lock_acquire+0x440/0xc07
                           [<c043a3c5>] _spin_unlock_irqrestore+0x40/0x58
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c02dff01>] e1000_watchdog+0x0/0x5c9
                           [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abaf7>] linkwatch_add_event+0xd/0x2c
                           [<c03abbbb>] linkwatch_fire_event+0x25/0x37
                           [<c03aeb42>] netif_carrier_on+0x16/0x27
                           [<c02e0156>] e1000_watchdog+0x255/0x5c9
                           [<c02dff01>] e1000_watchdog+0x0/0x5c9
                           [<c012df52>] run_timer_softirq+0xfa/0x15d
                           [<c012a8a6>] __do_softirq+0x56/0xdb
                           [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                           [<c012a8b8>] __do_softirq+0x68/0xdb
                           [<c012a961>] do_softirq+0x36/0x51
                           [<c012ab07>] irq_exit+0x43/0x4e
                           [<c0114122>] smp_apic_timer_interrupt+0x74/0x80
                           [<c0104a01>] apic_timer_interrupt+0x29/0x38
                           [<c0104a0b>] apic_timer_interrupt+0x33/0x38
                           [<c01600d8>] sys_swapon+0x29c/0x9aa
                           [<c01021a6>] mwait_idle_with_hints+0x3b/0x3f
                           [<c0102447>] mwait_idle+0x0/0xf
                           [<c0102581>] cpu_idle+0x99/0xc6
                           [<c05a98c7>] start_kernel+0x2c7/0x2cf
                           [<c05a90e0>] unknown_bootoption+0x0/0x195
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c043a07f>] _spin_lock_irqsave+0x3f/0x6c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abaf7>] linkwatch_add_event+0xd/0x2c
    [<c03abbbb>] linkwatch_fire_event+0x25/0x37
    [<c03aeb42>] netif_carrier_on+0x16/0x27
    [<c02ede2c>] bond_set_carrier+0x31/0x55
    [<c02ee9b6>] bond_select_active_slave+0x9c/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c0439084>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f6197>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fe9>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff


stack backtrace:
Pid: 9, comm: events/0 Not tainted 2.6.24-rc5 #1
  [<c0140b38>] print_irq_inversion_bug+0x108/0x112
  [<c014191d>] check_usage_forwards+0x3c/0x41
  [<c0141b09>] mark_lock+0x1e7/0x451
  [<c0142822>] __lock_acquire+0x440/0xc07
  [<c013fe0a>] get_lock_stats+0xd/0x2e
  [<c013fe35>] put_lock_stats+0xa/0x1e
  [<c0143062>] lock_acquire+0x79/0x93
  [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
  [<c0439d62>] _spin_lock_bh+0x3b/0x64
  [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
  [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
  [<c0411b4a>] mld_ifc_timer_expire+0x0/0x1fb
  [<c012df52>] run_timer_softirq+0xfa/0x15d
  [<c012a8a6>] __do_softirq+0x56/0xdb
  [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
  [<c012a8b8>] __do_softirq+0x68/0xdb
  [<c012a961>] do_softirq+0x36/0x51
  [<c012ae4a>] local_bh_enable_ip+0xad/0xed
  [<c03bf107>] rt_run_flush+0x64/0x8b
  [<c03e9296>] fib_netdev_event+0x61/0x65
  [<c013ac20>] notifier_call_chain+0x2a/0x52
  [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
  [<c03a2ae5>] netdev_state_change+0x18/0x29
  [<c03abd1d>] __linkwatch_run_queue+0x150/0x17e
  [<c03abd68>] linkwatch_event+0x1d/0x22
  [<c0133cab>] run_workqueue+0xdb/0x1b6
  [<c0133c57>] run_workqueue+0x87/0x1b6
  [<c03abd4b>] linkwatch_event+0x0/0x22
  [<c01346cb>] worker_thread+0x0/0x85
  [<c0134744>] worker_thread+0x79/0x85
  [<c0137179>] autoremove_wake_function+0x0/0x35
  [<c01370c2>] kthread+0x38/0x5e
  [<c013708a>] kthread+0x0/0x5e
  [<c0104baf>] kernel_thread_helper+0x7/0x10
  =======================
bonding: bond0: enslaving eth1 as a backup interface with a down link.
bonding: bond0: Setting eth0 as primary slave.
bond0: no IPv6 routers present

Best regards,

 				Krzysztof Ol
Comment 12 Anonymous Emailer 2007-12-14 14:03:53 UTC
Reply-To: andy@greyhouse.net

On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
> 
> >On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
> >>
> >>
> >>On Wed, 12 Dec 2007, Jay Vosburgh wrote:
> >>
> >>>Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >>>
> >>>>>diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >>>>>drivers/net/bonding/bond_sysfs.c
> >>>>>--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >>>>>+++ a/drivers/net/bonding/bond_sysfs.c
> >>>>>@@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
> >>>>>out:
> >>>>>      write_unlock_bh(&bond->lock);
> >>>>>
> >>>>>-       rtnl_unlock();
> >>>>>-
> >>>>
> >>>>Looking at the changeset that added this perhaps the intention
> >>>>is to hold the lock? If so we should add an rtnl_lock to the start
> >>>>of the function.
> >>>
> >>>   Yes, this function needs to hold locks, and more than just
> >>>what's there now.  I believe the following should be correct; I haven't
> >>>tested it, though (I'm supposedly on vacation right now).
> >>>
> >>>   The following change should be correct for the
> >>>bonding_store_primary case discussed in this thread, and also corrects
> >>>the bonding_store_active case which performs similar functions.
> >>>
> >>>   The bond_change_active_slave and bond_select_active_slave
> >>>functions both require rtnl, bond->lock for read and curr_slave_lock for
> >>>write_bh, and no other locks.  This is so that the lower level
> >>>mode-specific functions can release locks down to just rtnl in order to
> >>>call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
> >>>
> >>>Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
> >>>
> >>>diff --git a/drivers/net/bonding/bond_sysfs.c
> >>>b/drivers/net/bonding/bond_sysfs.c
> >>>index 11b76b3..28a2d80 100644
> >>>--- a/drivers/net/bonding/bond_sysfs.c
> >>>+++ b/drivers/net/bonding/bond_sysfs.c
> >>>@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
> >>>*d,
> >>>   struct slave *slave;
> >>>   struct bonding *bond = to_bond(d);
> >>>
> >>>-  write_lock_bh(&bond->lock);
> >>>+  rtnl_lock();
> >>>+  read_lock(&bond->lock);
> >>>+  write_lock_bh(&bond->curr_slave_lock);
> >>>+
> >>>   if (!USES_PRIMARY(bond->params.mode)) {
> >>>           printk(KERN_INFO DRV_NAME
> >>>                  ": %s: Unable to set primary slave; %s is in mode
> >>>                  %d\n",
> >>>@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
> >>>*d,
> >>>           }
> >>>   }
> >>>out:
> >>>-  write_unlock_bh(&bond->lock);
> >>>-
> >>>+  write_unlock_bh(&bond->curr_slave_lock);
> >>>+  read_unlock(&bond->lock);
> >>>   rtnl_unlock();
> >>>
> >>>   return count;
> >>>@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct
> >>>device *d,
> >>>   struct bonding *bond = to_bond(d);
> >>>
> >>>   rtnl_lock();
> >>>-  write_lock_bh(&bond->lock);
> >>>+  read_lock(&bond->lock);
> >>>+  write_lock_bh(&bond->curr_slave_lock);
> >>>
> >>>   if (!USES_PRIMARY(bond->params.mode)) {
> >>>           printk(KERN_INFO DRV_NAME
> >>>@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct
> >>>device *d,
> >>>           }
> >>>   }
> >>>out:
> >>>-  write_unlock_bh(&bond->lock);
> >>>+  write_unlock_bh(&bond->curr_slave_lock);
> >>>+  read_unlock(&bond->lock);
> >>>   rtnl_unlock();
> >>>
> >>>   return count;
> >>
> >>Vanilla 2.6.24-rc5 plus this patch:
> >>
> >>=========================================================
> >>[ INFO: possible irq lock inversion dependency detected ]
> >>2.6.24-rc5 #1
> >>---------------------------------------------------------
> >>events/0/9 just changed the state of lock:
> >> (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
> >>but this lock took another, soft-read-irq-unsafe lock in the past:
> >> (&bond->lock){-.--}
> >>
> >>and interrupts could create inverse lock ordering between them.
> >>
> >>
> >
> >Grrr, I should have seen that -- sorry.  Try your luck with this instead:
> <CUT>
> 
> No luck.
> 
> bonding: bond0: setting mode to active-backup (1).
> bonding: bond0: Setting MII monitoring interval to 100.
> ADDRCONF(NETDEV_UP): bond0: link is not ready
> bonding: bond0: Adding slave eth0.
> e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> bonding: bond0: making interface eth0 the new active one.
> bonding: bond0: first active interface up!
> bonding: bond0: enslaving eth0 as an active interface with an up link.
> bonding: bond0: Adding slave eth1.
> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready

<SNIP>

> bonding: bond0: enslaving eth1 as a backup interface with a down link.
> bonding: bond0: Setting eth0 as primary slave.
> bond0: no IPv6 routers present
> 
 
Based on the console log, I'm guessing your initialization scripts use
sysfs to set eth0 as the primary interface for bond0?  Can you confirm?

If you did somehow use sysfs to set the primary device as eth0, I'm
guessing you never see this issue without that line or without this
patch.  Please confirm this as well.

Thanks,

-andy
Comment 13 Krzysztof Oledzki 2007-12-14 14:11:30 UTC

On Fri, 14 Dec 2007, Andy Gospodarek wrote:

> On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote:
>>
>>
>> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
>>
>>> On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
>>>>
>>>>
>>>> On Wed, 12 Dec 2007, Jay Vosburgh wrote:
>>>>
>>>>> Herbert Xu <herbert@gondor.apana.org.au> wrote:
>>>>>
>>>>>>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>>>>>> drivers/net/bonding/bond_sysfs.c
>>>>>>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>>>>>> +++ a/drivers/net/bonding/bond_sysfs.c
>>>>>>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>>>>>>> out:
>>>>>>>      write_unlock_bh(&bond->lock);
>>>>>>>
>>>>>>> -       rtnl_unlock();
>>>>>>> -
>>>>>>
>>>>>> Looking at the changeset that added this perhaps the intention
>>>>>> is to hold the lock? If so we should add an rtnl_lock to the start
>>>>>> of the function.
>>>>>
>>>>>   Yes, this function needs to hold locks, and more than just
>>>>> what's there now.  I believe the following should be correct; I haven't
>>>>> tested it, though (I'm supposedly on vacation right now).
>>>>>
>>>>>   The following change should be correct for the
>>>>> bonding_store_primary case discussed in this thread, and also corrects
>>>>> the bonding_store_active case which performs similar functions.
>>>>>
>>>>>   The bond_change_active_slave and bond_select_active_slave
>>>>> functions both require rtnl, bond->lock for read and curr_slave_lock for
>>>>> write_bh, and no other locks.  This is so that the lower level
>>>>> mode-specific functions can release locks down to just rtnl in order to
>>>>> call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
>>>>>
>>>>> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>>>>>
>>>>> diff --git a/drivers/net/bonding/bond_sysfs.c
>>>>> b/drivers/net/bonding/bond_sysfs.c
>>>>> index 11b76b3..28a2d80 100644
>>>>> --- a/drivers/net/bonding/bond_sysfs.c
>>>>> +++ b/drivers/net/bonding/bond_sysfs.c
>>>>> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
>>>>> *d,
>>>>>   struct slave *slave;
>>>>>   struct bonding *bond = to_bond(d);
>>>>>
>>>>> - write_lock_bh(&bond->lock);
>>>>> + rtnl_lock();
>>>>> + read_lock(&bond->lock);
>>>>> + write_lock_bh(&bond->curr_slave_lock);
>>>>> +
>>>>>   if (!USES_PRIMARY(bond->params.mode)) {
>>>>>           printk(KERN_INFO DRV_NAME
>>>>>                  ": %s: Unable to set primary slave; %s is in mode
>>>>>                  %d\n",
>>>>> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
>>>>> *d,
>>>>>           }
>>>>>   }
>>>>> out:
>>>>> - write_unlock_bh(&bond->lock);
>>>>> -
>>>>> + write_unlock_bh(&bond->curr_slave_lock);
>>>>> + read_unlock(&bond->lock);
>>>>>   rtnl_unlock();
>>>>>
>>>>>   return count;
>>>>> @@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct
>>>>> device *d,
>>>>>   struct bonding *bond = to_bond(d);
>>>>>
>>>>>   rtnl_lock();
>>>>> - write_lock_bh(&bond->lock);
>>>>> + read_lock(&bond->lock);
>>>>> + write_lock_bh(&bond->curr_slave_lock);
>>>>>
>>>>>   if (!USES_PRIMARY(bond->params.mode)) {
>>>>>           printk(KERN_INFO DRV_NAME
>>>>> @@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct
>>>>> device *d,
>>>>>           }
>>>>>   }
>>>>> out:
>>>>> - write_unlock_bh(&bond->lock);
>>>>> + write_unlock_bh(&bond->curr_slave_lock);
>>>>> + read_unlock(&bond->lock);
>>>>>   rtnl_unlock();
>>>>>
>>>>>   return count;
>>>>
>>>> Vanilla 2.6.24-rc5 plus this patch:
>>>>
>>>> =========================================================
>>>> [ INFO: possible irq lock inversion dependency detected ]
>>>> 2.6.24-rc5 #1
>>>> ---------------------------------------------------------
>>>> events/0/9 just changed the state of lock:
>>>> (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
>>>> but this lock took another, soft-read-irq-unsafe lock in the past:
>>>> (&bond->lock){-.--}
>>>>
>>>> and interrupts could create inverse lock ordering between them.
>>>>
>>>>
>>>
>>> Grrr, I should have seen that -- sorry.  Try your luck with this instead:
>> <CUT>
>>
>> No luck.
>>
>> bonding: bond0: setting mode to active-backup (1).
>> bonding: bond0: Setting MII monitoring interval to 100.
>> ADDRCONF(NETDEV_UP): bond0: link is not ready
>> bonding: bond0: Adding slave eth0.
>> e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>> bonding: bond0: making interface eth0 the new active one.
>> bonding: bond0: first active interface up!
>> bonding: bond0: enslaving eth0 as an active interface with an up link.
>> bonding: bond0: Adding slave eth1.
>> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>
> <SNIP>
>
>> bonding: bond0: enslaving eth1 as a backup interface with a down link.
>> bonding: bond0: Setting eth0 as primary slave.
>> bond0: no IPv6 routers present
>>
>
> Based on the console log, I'm guessing your initialization scripts use
> sysfs to set eth0 as the primary interface for bond0?  Can you confirm?

Yep, that's correct:

postup() {
         if [[ ${IFACE} == "bond0" ]] ; then
                 echo -n +eth0 > /sys/class/net/${IFACE}/bonding/slaves
                 echo -n +eth1 > /sys/class/net/${IFACE}/bonding/slaves
                 echo -n  eth0 > /sys/class/net/${IFACE}/bonding/primary
         fi
}
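
For reference, the same three sysfs writes can be sketched in C. This is a minimal sketch, not part of the reporter's setup: the helper names bond_attr_dir, sysfs_write_attr and bond_postup are hypothetical, and only the paths and values come from the script above. Judging by the traces and the patch in this thread, writes to "slaves" end up in bonding_store_slaves and the write to "primary" in bonding_store_primary.

```c
#include <stdio.h>
#include <string.h>

/* Build "/sys/class/net/<iface>/bonding" into buf.
 * Returns 0 on success, -1 if the result would not fit. */
int bond_attr_dir(char *buf, size_t len, const char *iface)
{
    int n = snprintf(buf, len, "/sys/class/net/%s/bonding", iface);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write value without a trailing newline (like `echo -n`) to <dir>/<attr>.
 * Returns 0 on success, -1 on any error. */
int sysfs_write_attr(const char *dir, const char *attr, const char *value)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int rc = (fputs(value, f) == EOF) ? -1 : 0;
    /* With buffered stdio the data is written at flush/close,
     * so a failing write is reported here. */
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}

/* Mirrors the three steps of postup() for one bond interface. */
int bond_postup(const char *iface)
{
    char dir[256];
    if (bond_attr_dir(dir, sizeof(dir), iface) != 0)
        return -1;
    if (sysfs_write_attr(dir, "slaves", "+eth0") != 0)
        return -1;
    if (sysfs_write_attr(dir, "slaves", "+eth1") != 0)
        return -1;
    return sysfs_write_attr(dir, "primary", "eth0");
}
```

Calling bond_postup("bond0") as root would repeat the sequence from the script; dropping the final "primary" write is the variation Andy asks about.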

> If you did somehow use sysfs to set the primary device as eth0, I'm
> guessing you never see this issue without that line or without this
> patch.  Please confirm this as well.

Without this patch I get a different error, of course. Unfortunately, checking
what happens without using sysfs to set eth0 as the primary device will have
to wait, as I cannot test it before Monday.

Best regards,

 				Krzysztof Ol
Comment 14 Anonymous Emailer 2007-12-14 14:27:14 UTC
Reply-To: andy@greyhouse.net

On Fri, Dec 14, 2007 at 11:11:15PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
> 
> >On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote:
> >>
> >>
> >>On Fri, 14 Dec 2007, Andy Gospodarek wrote:
> >>
> >>>On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
> >>>>
> >>>>
> >>>>On Wed, 12 Dec 2007, Jay Vosburgh wrote:
> >>>>
> >>>>>Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >>>>>
> >>>>>>>diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >>>>>>>drivers/net/bonding/bond_sysfs.c
> >>>>>>>--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >>>>>>>+++ a/drivers/net/bonding/bond_sysfs.c
> >>>>>>>@@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
> >>>>>>>out:
> >>>>>>>     write_unlock_bh(&bond->lock);
> >>>>>>>
> >>>>>>>-       rtnl_unlock();
> >>>>>>>-
> >>>>>>
> >>>>>>Looking at the changeset that added this perhaps the intention
> >>>>>>is to hold the lock? If so we should add an rtnl_lock to the start
> >>>>>>of the function.
> >>>>>
> >>>>> Yes, this function needs to hold locks, and more than just
> >>>>>what's there now.  I believe the following should be correct; I haven't
> >>>>>tested it, though (I'm supposedly on vacation right now).
> >>>>>
> >>>>> The following change should be correct for the
> >>>>>bonding_store_primary case discussed in this thread, and also corrects
> >>>>>the bonding_store_active case which performs similar functions.
> >>>>>
> >>>>> The bond_change_active_slave and bond_select_active_slave
> >>>>>functions both require rtnl, bond->lock for read and curr_slave_lock 
> >>>>>for
> >>>>>write_bh, and no other locks.  This is so that the lower level
> >>>>>mode-specific functions can release locks down to just rtnl in order to
> >>>>>call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
> >>>>>
> >>>>>Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
> >>>>>
> >>>>>diff --git a/drivers/net/bonding/bond_sysfs.c
> >>>>>b/drivers/net/bonding/bond_sysfs.c
> >>>>>index 11b76b3..28a2d80 100644
> >>>>>--- a/drivers/net/bonding/bond_sysfs.c
> >>>>>+++ b/drivers/net/bonding/bond_sysfs.c
> >>>>>@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct 
> >>>>>device
> >>>>>*d,
> >>>>> struct slave *slave;
> >>>>> struct bonding *bond = to_bond(d);
> >>>>>
> >>>>>-        write_lock_bh(&bond->lock);
> >>>>>+        rtnl_lock();
> >>>>>+        read_lock(&bond->lock);
> >>>>>+        write_lock_bh(&bond->curr_slave_lock);
> >>>>>+
> >>>>> if (!USES_PRIMARY(bond->params.mode)) {
> >>>>>         printk(KERN_INFO DRV_NAME
> >>>>>                ": %s: Unable to set primary slave; %s is in mode
> >>>>>                %d\n",
> >>>>>@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct 
> >>>>>device
> >>>>>*d,
> >>>>>         }
> >>>>> }
> >>>>>out:
> >>>>>-        write_unlock_bh(&bond->lock);
> >>>>>-
> >>>>>+        write_unlock_bh(&bond->curr_slave_lock);
> >>>>>+        read_unlock(&bond->lock);
> >>>>> rtnl_unlock();
> >>>>>
> >>>>> return count;
> >>>>>@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct
> >>>>>device *d,
> >>>>> struct bonding *bond = to_bond(d);
> >>>>>
> >>>>> rtnl_lock();
> >>>>>-        write_lock_bh(&bond->lock);
> >>>>>+        read_lock(&bond->lock);
> >>>>>+        write_lock_bh(&bond->curr_slave_lock);
> >>>>>
> >>>>> if (!USES_PRIMARY(bond->params.mode)) {
> >>>>>         printk(KERN_INFO DRV_NAME
> >>>>>@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct
> >>>>>device *d,
> >>>>>         }
> >>>>> }
> >>>>>out:
> >>>>>-        write_unlock_bh(&bond->lock);
> >>>>>+        write_unlock_bh(&bond->curr_slave_lock);
> >>>>>+        read_unlock(&bond->lock);
> >>>>> rtnl_unlock();
> >>>>>
> >>>>> return count;
> >>>>
> >>>>Vanilla 2.6.24-rc5 plus this patch:
> >>>>
> >>>>=========================================================
> >>>>[ INFO: possible irq lock inversion dependency detected ]
> >>>>2.6.24-rc5 #1
> >>>>---------------------------------------------------------
> >>>>events/0/9 just changed the state of lock:
> >>>>(&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
> >>>>but this lock took another, soft-read-irq-unsafe lock in the past:
> >>>>(&bond->lock){-.--}
> >>>>
> >>>>and interrupts could create inverse lock ordering between them.
> >>>>
> >>>>
> >>>
> >>>Grrr, I should have seen that -- sorry.  Try your luck with this instead:
> >><CUT>
> >>
> >>No luck.
> >>
> >>bonding: bond0: setting mode to active-backup (1).
> >>bonding: bond0: Setting MII monitoring interval to 100.
> >>ADDRCONF(NETDEV_UP): bond0: link is not ready
> >>bonding: bond0: Adding slave eth0.
> >>e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> >>bonding: bond0: making interface eth0 the new active one.
> >>bonding: bond0: first active interface up!
> >>bonding: bond0: enslaving eth0 as an active interface with an up link.
> >>bonding: bond0: Adding slave eth1.
> >>ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> >
> ><SNIP>
> >
> >>bonding: bond0: enslaving eth1 as a backup interface with a down link.
> >>bonding: bond0: Setting eth0 as primary slave.
> >>bond0: no IPv6 routers present
> >>
> >
> >Based on the console log, I'm guessing your initialization scripts use
> >sysfs to set eth0 as the primary interface for bond0?  Can you confirm?
> 
> Yep, that's correct:
> 
> postup() {
>         if [[ ${IFACE} == "bond0" ]] ; then
>                 echo -n +eth0 > /sys/class/net/${IFACE}/bonding/slaves
>                 echo -n +eth1 > /sys/class/net/${IFACE}/bonding/slaves
>                 echo -n  eth0 > /sys/class/net/${IFACE}/bonding/primary
>         fi
> }
> 

Good. Thanks for the confirmation.

> >If you did somehow use sysfs to set the primary device as eth0, I'm
> >guessing you never see this issue without that line or without this
> >patch.  Please confirm this as well.
> 
> Without this patch I get another error, of course. Unfortunately checking 

:-/


> what happens without using sysfs to set the primary device as eth0, have 
> to wait, as I am not able to test it before Monday.
> 

No problem, thanks for the feedback and quick response.
Comment 15 Anonymous Emailer 2007-12-14 14:47:34 UTC
Reply-To: andy@greyhouse.net

On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
> 
> >On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
> >>
> >>
> >>On Wed, 12 Dec 2007, Jay Vosburgh wrote:
> >>
> >>>Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >>>
> >>>>>diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >>>>>drivers/net/bonding/bond_sysfs.c
> >>>>>--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> >>>>>+++ a/drivers/net/bonding/bond_sysfs.c
> >>>>>@@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
> >>>>>out:
> >>>>>      write_unlock_bh(&bond->lock);
> >>>>>
> >>>>>-       rtnl_unlock();
> >>>>>-
> >>>>
> >>>>Looking at the changeset that added this perhaps the intention
> >>>>is to hold the lock? If so we should add an rtnl_lock to the start
> >>>>of the function.
> >>>
> >>>   Yes, this function needs to hold locks, and more than just
> >>>what's there now.  I believe the following should be correct; I haven't
> >>>tested it, though (I'm supposedly on vacation right now).
> >>>
> >>>   The following change should be correct for the
> >>>bonding_store_primary case discussed in this thread, and also corrects
> >>>the bonding_store_active case which performs similar functions.
> >>>
> >>>   The bond_change_active_slave and bond_select_active_slave
> >>>functions both require rtnl, bond->lock for read and curr_slave_lock for
> >>>write_bh, and no other locks.  This is so that the lower level
> >>>mode-specific functions can release locks down to just rtnl in order to
> >>>call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
> >>>
> >>>Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
> >>>
> >>>diff --git a/drivers/net/bonding/bond_sysfs.c
> >>>b/drivers/net/bonding/bond_sysfs.c
> >>>index 11b76b3..28a2d80 100644
> >>>--- a/drivers/net/bonding/bond_sysfs.c
> >>>+++ b/drivers/net/bonding/bond_sysfs.c
> >>>@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
> >>>*d,
> >>>   struct slave *slave;
> >>>   struct bonding *bond = to_bond(d);
> >>>
> >>>-  write_lock_bh(&bond->lock);
> >>>+  rtnl_lock();
> >>>+  read_lock(&bond->lock);
> >>>+  write_lock_bh(&bond->curr_slave_lock);
> >>>+
> >>>   if (!USES_PRIMARY(bond->params.mode)) {
> >>>           printk(KERN_INFO DRV_NAME
> >>>                  ": %s: Unable to set primary slave; %s is in mode
> >>>                  %d\n",
> >>>@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
> >>>*d,
> >>>           }
> >>>   }
> >>>out:
> >>>-  write_unlock_bh(&bond->lock);
> >>>-
> >>>+  write_unlock_bh(&bond->curr_slave_lock);
> >>>+  read_unlock(&bond->lock);
> >>>   rtnl_unlock();
> >>>
> >>>   return count;
> >>>@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct
> >>>device *d,
> >>>   struct bonding *bond = to_bond(d);
> >>>
> >>>   rtnl_lock();
> >>>-  write_lock_bh(&bond->lock);
> >>>+  read_lock(&bond->lock);
> >>>+  write_lock_bh(&bond->curr_slave_lock);
> >>>
> >>>   if (!USES_PRIMARY(bond->params.mode)) {
> >>>           printk(KERN_INFO DRV_NAME
> >>>@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct
> >>>device *d,
> >>>           }
> >>>   }
> >>>out:
> >>>-  write_unlock_bh(&bond->lock);
> >>>+  write_unlock_bh(&bond->curr_slave_lock);
> >>>+  read_unlock(&bond->lock);
> >>>   rtnl_unlock();
> >>>
> >>>   return count;
> >>
> >>Vanilla 2.6.24-rc5 plus this patch:
> >>
> >>=========================================================
> >>[ INFO: possible irq lock inversion dependency detected ]
> >>2.6.24-rc5 #1
> >>---------------------------------------------------------
> >>events/0/9 just changed the state of lock:
> >> (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
> >>but this lock took another, soft-read-irq-unsafe lock in the past:
> >> (&bond->lock){-.--}
> >>
> >>and interrupts could create inverse lock ordering between them.
> >>
> >>
> >
> >Grrr, I should have seen that -- sorry.  Try your luck with this instead:
> <CUT>
> 
> No luck.
> 


I'm guessing if we go back to using a write-lock for bond->lock this
will go back to working again, but I'm not totally convinced since there
are plenty of places where we used a read-lock with it.


diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 11b76b3..635b857 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
 	struct slave *slave;
 	struct bonding *bond = to_bond(d);
 
+	rtnl_lock();
 	write_lock_bh(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
+
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
 		       ": %s: Unable to set primary slave; %s is in mode %d\n",
@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
 		}
 	}
 out:
+	write_unlock_bh(&bond->curr_slave_lock);
 	write_unlock_bh(&bond->lock);
-
 	rtnl_unlock();
 
 	return count;
@@ -1191,6 +1194,7 @@ static ssize_t bonding_store_active_slave(struct device *d,
 
 	rtnl_lock();
 	write_lock_bh(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
 
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
@@ -1247,6 +1251,7 @@ static ssize_t bonding_store_active_slave(struct device *d,
 		}
 	}
 out:
+	write_unlock_bh(&bond->curr_slave_lock);
 	write_unlock_bh(&bond->lock);
 	rtnl_unlock();
 
Comment 16 Herbert Xu 2007-12-14 20:11:40 UTC
On Fri, Dec 14, 2007 at 05:47:22PM -0500, Andy Gospodarek wrote:
>
> I'm guessing if we go back to using a write-lock for bond->lock this
> will go back to working again, but I'm not totally convinced since there
> are plenty of places where we used a read-lock with it.

Sorry I forgot to cc you earlier Andy.

But to fix this you need to make sure that all read locks on bond->lock
in process context disable BH.  This is because at least one write
lock can be taken from BH context.

You don't need to turn the read locks into write locks however.

This is also something that we can undo once the set_multicast
interface has been fixed to not take the tx lock.
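
To illustrate the rule above, here is a minimal userspace sketch, not kernel
code. It assumes a single CPU and a toy rwlock; every toy_* name is
hypothetical and only mirrors the roles of read_lock(), read_lock_bh() and a
softirq that wants the write lock. BH-disable nesting is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model of the hazard. All names are hypothetical;
 * none of these are kernel APIs. */
struct toy_cpu {
	bool bh_disabled;   /* true while bottom halves (softirqs) are masked */
};

struct toy_rwlock {
	int readers;        /* number of read holders */
	bool writer;        /* true while held for write */
};

/* Plain read lock: softirqs may still run on this CPU while it is held. */
void toy_read_lock(struct toy_cpu *cpu, struct toy_rwlock *lock)
{
	(void)cpu;
	lock->readers++;
}

void toy_read_unlock(struct toy_cpu *cpu, struct toy_rwlock *lock)
{
	(void)cpu;
	assert(lock->readers > 0);
	lock->readers--;
}

/* BH-safe variant: mask softirqs first, then take the lock. */
void toy_read_lock_bh(struct toy_cpu *cpu, struct toy_rwlock *lock)
{
	cpu->bh_disabled = true;
	lock->readers++;
}

void toy_read_unlock_bh(struct toy_cpu *cpu, struct toy_rwlock *lock)
{
	assert(lock->readers > 0);
	lock->readers--;
	cpu->bh_disabled = false;
}

/*
 * Would a softirq that wants the write lock spin forever if it fired
 * right now on this CPU? It can only run while BH is enabled, and it
 * cannot get the lock while the context it interrupted still holds it.
 */
bool toy_softirq_writer_would_deadlock(const struct toy_cpu *cpu,
				       const struct toy_rwlock *lock)
{
	if (cpu->bh_disabled)
		return false;   /* softirq is deferred, so no self-deadlock */
	return lock->readers > 0 || lock->writer;
}
```

In this model, taking the lock with toy_read_lock() leaves a window in which
the softirq writer can interrupt the reader and never make progress, while
toy_read_lock_bh() closes that window without needing a write lock.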

Cheers,
Comment 17 Anonymous Emailer 2007-12-15 07:09:35 UTC
Reply-To: andy@greyhouse.net

On Dec 14, 2007 11:10 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Dec 14, 2007 at 05:47:22PM -0500, Andy Gospodarek wrote:
> >
> > I'm guessing if we go back to using a write-lock for bond->lock this
> > will go back to working again, but I'm not totally convinced since there
> > are plenty of places where we used a read-lock with it.
>
> Sorry I forgot to cc you earlier Andy.
>
> But to fix this you need to make sure that all read locks on bond->lock
> in process context disable BH.  This is because at least one write
> lock can be taken from BH context.

I agree with you completely, Herbert, which is why I was surprised that
my first patch apparently did not resolve the issue.  I felt it should
have....

> You don't need to turn the read locks into write locks however.
>
> This is also something that we can undo once the set_multicast
> interface has been fixed to not take the tx lock.
>
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
>
Comment 18 Herbert Xu 2007-12-15 18:28:24 UTC
Andy Gospodarek <andy@greyhouse.net> wrote:
>
> I agree with you completely, Herbert, which is why I was surprised that
> my first patch apparently did not resolve the issue.  I felt it should
> have....

Did it change all occurrences of read_lock(&bond->lock) to
read_lock_bh? If so I better look at the lockdep output again.

Cheers,
Comment 19 Anonymous Emailer 2007-12-15 19:17:42 UTC
Reply-To: andy@greyhouse.net

On Dec 15, 2007 9:27 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> >
> > I agree with you completely, Herbert, which is why I was surprised that
> > my first patch apparently did not resolve the issue.  I felt it should
> > have....
>
> Did it change all occurrences of read_lock(&bond->lock) to
> read_lock_bh? If so I better look at the lockdep output again.
>

Not all of them in the bonding code, but both of the ones in the small patch.
Comment 20 Herbert Xu 2007-12-15 19:23:41 UTC
On Sat, Dec 15, 2007 at 10:17:35PM -0500, Andy Gospodarek wrote:
>
> Not all of them in the bonding code, but both of the ones in the small patch.

OK, we need to change all of the ones that may be called from
process context with BH on.

Cheers,
Comment 21 Krzysztof Oledzki 2007-12-18 11:52:26 UTC

On Fri, 14 Dec 2007, Andy Gospodarek wrote:

> On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote:
>>
>>
>> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
>>
>>> On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
>>>>
>>>>
>>>> On Wed, 12 Dec 2007, Jay Vosburgh wrote:
>>>>
>>>>> Herbert Xu <herbert@gondor.apana.org.au> wrote:
>>>>>
>>>>>>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>>>>>> drivers/net/bonding/bond_sysfs.c
>>>>>>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>>>>>> +++ a/drivers/net/bonding/bond_sysfs.c
>>>>>>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>>>>>>> out:
>>>>>>>      write_unlock_bh(&bond->lock);
>>>>>>>
>>>>>>> -       rtnl_unlock();
>>>>>>> -
>>>>>>
>>>>>> Looking at the changeset that added this perhaps the intention
>>>>>> is to hold the lock? If so we should add an rtnl_lock to the start
>>>>>> of the function.
>>>>>
>>>>>   Yes, this function needs to hold locks, and more than just
>>>>> what's there now.  I believe the following should be correct; I haven't
>>>>> tested it, though (I'm supposedly on vacation right now).
>>>>>
>>>>>   The following change should be correct for the
>>>>> bonding_store_primary case discussed in this thread, and also corrects
>>>>> the bonding_store_active case which performs similar functions.
>>>>>
>>>>>   The bond_change_active_slave and bond_select_active_slave
>>>>> functions both require rtnl, bond->lock for read and curr_slave_lock for
>>>>> write_bh, and no other locks.  This is so that the lower level
>>>>> mode-specific functions can release locks down to just rtnl in order to
>>>>> call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
>>>>>
>>>>> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>>>>>
>>>>> diff --git a/drivers/net/bonding/bond_sysfs.c
>>>>> b/drivers/net/bonding/bond_sysfs.c
>>>>> index 11b76b3..28a2d80 100644
>>>>> --- a/drivers/net/bonding/bond_sysfs.c
>>>>> +++ b/drivers/net/bonding/bond_sysfs.c
>>>>> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
>>>>> *d,
>>>>>   struct slave *slave;
>>>>>   struct bonding *bond = to_bond(d);
>>>>>
>>>>> - write_lock_bh(&bond->lock);
>>>>> + rtnl_lock();
>>>>> + read_lock(&bond->lock);
>>>>> + write_lock_bh(&bond->curr_slave_lock);
>>>>> +
>>>>>   if (!USES_PRIMARY(bond->params.mode)) {
>>>>>           printk(KERN_INFO DRV_NAME
>>>>>                  ": %s: Unable to set primary slave; %s is in mode
>>>>>                  %d\n",
>>>>> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
>>>>> *d,
>>>>>           }
>>>>>   }
>>>>> out:
>>>>> - write_unlock_bh(&bond->lock);
>>>>> -
>>>>> + write_unlock_bh(&bond->curr_slave_lock);
>>>>> + read_unlock(&bond->lock);
>>>>>   rtnl_unlock();
>>>>>
>>>>>   return count;
>>>>> @@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct
>>>>> device *d,
>>>>>   struct bonding *bond = to_bond(d);
>>>>>
>>>>>   rtnl_lock();
>>>>> - write_lock_bh(&bond->lock);
>>>>> + read_lock(&bond->lock);
>>>>> + write_lock_bh(&bond->curr_slave_lock);
>>>>>
>>>>>   if (!USES_PRIMARY(bond->params.mode)) {
>>>>>           printk(KERN_INFO DRV_NAME
>>>>> @@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct
>>>>> device *d,
>>>>>           }
>>>>>   }
>>>>> out:
>>>>> - write_unlock_bh(&bond->lock);
>>>>> + write_unlock_bh(&bond->curr_slave_lock);
>>>>> + read_unlock(&bond->lock);
>>>>>   rtnl_unlock();
>>>>>
>>>>>   return count;
>>>>
>>>> Vanilla 2.6.24-rc5 plus this patch:
>>>>
>>>> =========================================================
>>>> [ INFO: possible irq lock inversion dependency detected ]
>>>> 2.6.24-rc5 #1
>>>> ---------------------------------------------------------
>>>> events/0/9 just changed the state of lock:
>>>> (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
>>>> but this lock took another, soft-read-irq-unsafe lock in the past:
>>>> (&bond->lock){-.--}
>>>>
>>>> and interrupts could create inverse lock ordering between them.
>>>>
>>>>
>>>
>>> Grrr, I should have seen that -- sorry.  Try your luck with this instead:
>> <CUT>
>>
>> No luck.
>>
>> bonding: bond0: setting mode to active-backup (1).
>> bonding: bond0: Setting MII monitoring interval to 100.
>> ADDRCONF(NETDEV_UP): bond0: link is not ready
>> bonding: bond0: Adding slave eth0.
>> e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX/TX
>> bonding: bond0: making interface eth0 the new active one.
>> bonding: bond0: first active interface up!
>> bonding: bond0: enslaving eth0 as an active interface with an up link.
>> bonding: bond0: Adding slave eth1.
>> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>
> <SNIP>
>
>> bonding: bond0: enslaving eth1 as a backup interface with a down link.
>> bonding: bond0: Setting eth0 as primary slave.
>> bond0: no IPv6 routers present
>>
>
> Based on the console log, I'm guessing your initialization scripts use
> sysfs to set eth0 as the primary interface for bond0?  Can you confirm?
>
> If you did somehow use sysfs to set the primary device as eth0, I'm
> guessing you never see this issue without that line or without this
> patch.  Please confirm this as well.

Unpatched 2.6.24-rc5 with "echo -n eth0 > /sys/class/net/${IFACE}/bonding/primary"
removed, still has this issue. I also removed:
#               echo -n 1 > /sys/class/net/${IFACE}/bonding/mode
#               echo -n 100 > /sys/class/net/${IFACE}/bonding/miimon
but this did not help:

bonding: bond0: setting mode to active-backup (1).
bonding: bond0: Setting MII monitoring interval to 100.
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave eth0.
e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
bonding: bond0: making interface eth0 the new active one.
bonding: bond0: first active interface up!
bonding: bond0: enslaving eth0 as an active interface with an up link.
bonding: bond0: Adding slave eth1.
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.24-rc5 #1
---------------------------------------------------------
events/0/9 just changed the state of lock:
  (&mc->mca_lock){-+..}, at: [<c0411c52>] mld_ifc_timer_expire+0x130/0x1fb
but this lock took another, soft-read-irq-unsafe lock in the past:
  (&bond->lock){-.--}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
4 locks held by events/0/9:
  #0:  (events){--..}, at: [<c0133c57>] run_workqueue+0x87/0x1b6
  #1:  ((linkwatch_work).work){--..}, at: [<c0133c57>] run_workqueue+0x87/0x1b6
  #2:  (rtnl_mutex){--..}, at: [<c03abd28>] linkwatch_event+0x5/0x22
  #3:  (&ndev->lock){-.-+}, at: [<c0411b39>] mld_ifc_timer_expire+0x17/0x1fb

the first lock's dependencies:
-> (&mc->mca_lock){-+..} ops: 10 {
    initial-use  at:
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014289c>] __lock_acquire+0x4ba/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c041242a>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c04120ae>] igmp6_group_added+0x18/0x11d
                         [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                         [<c04120ae>] igmp6_group_added+0x18/0x11d
                         [<c04120ae>] igmp6_group_added+0x18/0x11d
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c041242a>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c040180c>] ipv6_add_dev+0x21c/0x24b
                         [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5b40>] addrconf_init+0x13/0x193
                         [<c0199f63>] proc_net_fops_create+0x10/0x21
                         [<c0419b10>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c5a20>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    in-softirq-W at:
                         [<c0142822>] __lock_acquire+0x440/0xc07
                         [<c011d686>] update_curr+0x52/0xc4
                         [<c013f175>] tick_sched_timer+0x129/0x165
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c0411c52>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411b22>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                         [<c0411c52>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411c52>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c0411b22>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                         [<c0411b22>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c012df52>] run_timer_softirq+0xfa/0x15d
                         [<c012a8a6>] __do_softirq+0x56/0xdb
                         [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                         [<c012a8b8>] __do_softirq+0x68/0xdb
                         [<c012a961>] do_softirq+0x36/0x51
                         [<c012ae4a>] local_bh_enable_ip+0xad/0xed
                         [<c03bf0df>] rt_run_flush+0x64/0x8b
                         [<c03e926e>] fib_netdev_event+0x61/0x65
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a2abd>] netdev_state_change+0x18/0x29
                         [<c03abcf5>] __linkwatch_run_queue+0x150/0x17e
                         [<c03abd40>] linkwatch_event+0x1d/0x22
                         [<c0133cab>] run_workqueue+0xdb/0x1b6
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c03abd23>] linkwatch_event+0x0/0x22
                         [<c01346cb>] worker_thread+0x0/0x85
                         [<c0134744>] worker_thread+0x79/0x85
                         [<c0137179>] autoremove_wake_function+0x0/0x35
                         [<c01370c2>] kthread+0x38/0x5e
                         [<c013708a>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c01417ee>] find_usage_backwards+0xbb/0xe2
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014286a>] __lock_acquire+0x488/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c041242a>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c04120ae>] igmp6_group_added+0x18/0x11d
                         [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                         [<c04120ae>] igmp6_group_added+0x18/0x11d
                         [<c04120ae>] igmp6_group_added+0x18/0x11d
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c041242a>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c040180c>] ipv6_add_dev+0x21c/0x24b
                         [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5b40>] addrconf_init+0x13/0x193
                         [<c0199f63>] proc_net_fops_create+0x10/0x21
                         [<c0419b10>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c5a20>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c087e2d8>] __key.30798+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c04120ec>] igmp6_group_added+0x56/0x11d
                           [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                           [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c040180c>] ipv6_add_dev+0x21c/0x24b
                           [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401def>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c043903e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c04120ec>] igmp6_group_added+0x56/0x11d
                           [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                           [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c040180c>] ipv6_add_dev+0x21c/0x24b
                           [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401def>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c043903e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c0439d3a>] _spin_lock_bh+0x3b/0x64
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c04120ec>] igmp6_group_added+0x56/0x11d
    [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] igmp6_mc_seq_start+0x106/0x138
    [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
    [<c040180c>] ipv6_add_dev+0x21c/0x24b
    [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c0401def>] addrconf_notify+0x60/0x7b7
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0141dac>] mark_held_locks+0x39/0x53
    [<c043903e>] mutex_lock_nested+0x286/0x2ac
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a3f25>] register_netdevice_notifier+0xe/0x126
    [<c03a3f25>] register_netdevice_notifier+0xe/0x126
    [<c03a3f60>] register_netdevice_notifier+0x49/0x126
    [<c05c5bda>] addrconf_init+0xad/0x193
    [<c05c5b48>] addrconf_init+0x1b/0x193
    [<c05c5a20>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff

  -> (&bonding_netdev_xmit_lock_key){-...} ops: 6 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c04120ec>] igmp6_group_added+0x56/0x11d
                           [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                           [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c040180c>] ipv6_add_dev+0x21c/0x24b
                           [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401def>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c043903e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c04120ec>] igmp6_group_added+0x56/0x11d
                           [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                           [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c040180c>] ipv6_add_dev+0x21c/0x24b
                           [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401def>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c043903e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c0877804>] bonding_netdev_xmit_lock_key+0x0/0x8
   -> (&bond->lock){-.--} ops: 99 {
      initial-use  at:
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c014289c>] __lock_acquire+0x4ba/0xc07
                             [<c0141dac>] mark_held_locks+0x39/0x53
                             [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c0439ec6>] _read_lock_bh+0x3b/0x64
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c03aa3bf>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03aa93d>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03aa9fe>] rtnetlink_event+0x3f/0x42
                             [<c013ac20>] notifier_call_chain+0x2a/0x52
                             [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a31b6>] register_netdevice+0x2a7/0x2e7
                             [<c02ed862>] bond_create+0x1f2/0x26a
                             [<c05bedcd>] bonding_init+0x761/0x7ea
                             [<c05be635>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-W at:
                             [<c014286a>] __lock_acquire+0x488/0xc07
                             [<c0141dac>] mark_held_locks+0x39/0x53
                             [<c043a3d5>] _spin_unlock_irq+0x20/0x41
                             [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                             [<c043a3e0>] _spin_unlock_irq+0x2b/0x41
                             [<c012208a>] finish_task_switch+0x50/0x8c
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c0439dfd>] _write_lock_bh+0x3b/0x64
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c03a14b3>] __dev_set_rx_mode+0x7b/0x7d
                             [<c03a164d>] dev_set_rx_mode+0x23/0x36
                             [<c03a3d28>] dev_open+0x5e/0x77
                             [<c03a29f7>] dev_change_flags+0x9d/0x14b
                             [<c03a17fb>] __dev_get_by_name+0x68/0x73
                             [<c03e3828>] devinet_ioctl+0x22b/0x536
                             [<c03a3b1d>] dev_ioctl+0x46f/0x5b7
                             [<c0399c50>] sock_ioctl+0x167/0x18b
                             [<c0399ae9>] sock_ioctl+0x0/0x18b
                             [<c01725f7>] do_ioctl+0x1f/0x62
                             [<c0172867>] vfs_ioctl+0x22d/0x23f
                             [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                             [<c01728ac>] sys_ioctl+0x33/0x4b
                             [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                             [<ffffffff>] 0xffffffff
      softirq-on-R at:
                             [<c0141986>] mark_lock+0x64/0x451
                             [<c013575e>] __kernel_text_address+0x5/0xe
                             [<c0104ee2>] dump_trace+0x83/0x8d
                             [<c0142889>] __lock_acquire+0x4a7/0xc07
                             [<c013fc76>] save_trace+0x37/0x89
                             [<c0133c57>] run_workqueue+0x87/0x1b6
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c0439f25>] _read_lock+0x36/0x5f
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c02eee5d>] bond_mii_monitor+0x19/0x85
                             [<c0133cab>] run_workqueue+0xdb/0x1b6
                             [<c0133c57>] run_workqueue+0x87/0x1b6
                             [<c02eee44>] bond_mii_monitor+0x0/0x85
                             [<c01346cb>] worker_thread+0x0/0x85
                             [<c0134744>] worker_thread+0x79/0x85
                             [<c0137179>] autoremove_wake_function+0x0/0x35
                             [<c01370c2>] kthread+0x38/0x5e
                             [<c013708a>] kthread+0x0/0x5e
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-R at:
                             [<c013fe0a>] get_lock_stats+0xd/0x2e
                             [<c013fe35>] put_lock_stats+0xa/0x1e
                             [<c0142844>] __lock_acquire+0x462/0xc07
                             [<c0141dac>] mark_held_locks+0x39/0x53
                             [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143062>] lock_acquire+0x79/0x93
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c0439ec6>] _read_lock_bh+0x3b/0x64
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c02edcc1>] bond_get_stats+0x28/0xd0
                             [<c03aa3bf>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03aa93d>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03aa9fe>] rtnetlink_event+0x3f/0x42
                             [<c013ac20>] notifier_call_chain+0x2a/0x52
                             [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a31b6>] register_netdevice+0x2a7/0x2e7
                             [<c02ed862>] bond_create+0x1f2/0x26a
                             [<c05bedcd>] bonding_init+0x761/0x7ea
                             [<c05be635>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
    }
    ... key      at: [<c08777d0>] __key.32969+0x0/0x8
    -> (_xmit_ETHER){-...} ops: 8 {
       initial-use  at:
                               [<c014289c>] __lock_acquire+0x4ba/0xc07
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03a5a79>] dev_mc_add+0x1a/0x6a
                               [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                               [<c03a5a79>] dev_mc_add+0x1a/0x6a
                               [<c03a5a79>] dev_mc_add+0x1a/0x6a
                               [<c04120ec>] igmp6_group_added+0x56/0x11d
                               [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                               [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c040180c>] ipv6_add_dev+0x21c/0x24b
                               [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c0401def>] addrconf_notify+0x60/0x7b7
                               [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                               [<c0141dac>] mark_held_locks+0x39/0x53
                               [<c043903e>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                               [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                               [<c05c5bda>] addrconf_init+0xad/0x193
                               [<c05c5b48>] addrconf_init+0x1b/0x193
                               [<c05c5a20>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       hardirq-on-W at:
                               [<c0141986>] mark_lock+0x64/0x451
                               [<c014286a>] __lock_acquire+0x488/0xc07
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03a5a79>] dev_mc_add+0x1a/0x6a
                               [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                               [<c03a5a79>] dev_mc_add+0x1a/0x6a
                               [<c03a5a79>] dev_mc_add+0x1a/0x6a
                               [<c04120ec>] igmp6_group_added+0x56/0x11d
                               [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                               [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c040180c>] ipv6_add_dev+0x21c/0x24b
                               [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c0401def>] addrconf_notify+0x60/0x7b7
                               [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                               [<c0141dac>] mark_held_locks+0x39/0x53
                               [<c043903e>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                               [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                               [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                               [<c05c5bda>] addrconf_init+0xad/0x193
                               [<c05c5b48>] addrconf_init+0x1b/0x193
                               [<c05c5a20>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
    ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c0439d3a>] _spin_lock_bh+0x3b/0x64
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c02ee492>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7c3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee9af>] bond_select_active_slave+0x95/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f616d>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fbf>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

    -> (lweventlist_lock){.+..} ops: 10 {
       initial-use  at:
                               [<c0141986>] mark_lock+0x64/0x451
                               [<c014289c>] __lock_acquire+0x4ba/0xc07
                               [<c02e365c>] e1000_read_phy_reg+0x1c7/0x1d3
                               [<c02e348b>] e1000_write_phy_reg+0xb9/0xc3
                               [<c024a7de>] delay_tsc+0x25/0x3b
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03abacf>] linkwatch_add_event+0xd/0x2c
                               [<c043a057>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03abacf>] linkwatch_add_event+0xd/0x2c
                               [<c03abacf>] linkwatch_add_event+0xd/0x2c
                               [<c03abb93>] linkwatch_fire_event+0x25/0x37
                               [<c02e1c43>] e1000_probe+0xad1/0xbe8
                               [<c0257f3f>] pci_device_probe+0x36/0x57
                               [<c02d0e5f>] driver_probe_device+0xe1/0x15f
                               [<c043a2d1>] _spin_unlock+0x25/0x3b
                               [<c043758a>] klist_next+0x58/0x6d
                               [<c02d0f6f>] __driver_attach+0x0/0x7f
                               [<c02d0fb8>] __driver_attach+0x49/0x7f
                               [<c02d0403>] bus_for_each_dev+0x36/0x58
                               [<c02d0cb7>] driver_attach+0x16/0x18
                               [<c02d0f6f>] __driver_attach+0x0/0x7f
                               [<c02d06fa>] bus_add_driver+0x6d/0x18d
                               [<c0258089>] __pci_register_driver+0x53/0x7f
                               [<c05be635>] e1000_init_module+0x45/0x7c
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       in-softirq-W at:
                               [<c011d20a>] __wake_up_common+0x32/0x5c
                               [<c0142822>] __lock_acquire+0x440/0xc07
                               [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
                               [<c0143062>] lock_acquire+0x79/0x93
                               [<c03abacf>] linkwatch_add_event+0xd/0x2c
                               [<c02dff01>] e1000_watchdog+0x0/0x5c9
                               [<c043a057>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03abacf>] linkwatch_add_event+0xd/0x2c
                               [<c03abacf>] linkwatch_add_event+0xd/0x2c
                               [<c03abb93>] linkwatch_fire_event+0x25/0x37
                               [<c03aeb1a>] netif_carrier_on+0x16/0x27
                               [<c02e0156>] e1000_watchdog+0x255/0x5c9
                               [<c02dff01>] e1000_watchdog+0x0/0x5c9
                               [<c012df52>] run_timer_softirq+0xfa/0x15d
                               [<c012a8a6>] __do_softirq+0x56/0xdb
                               [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                               [<c012a8b8>] __do_softirq+0x68/0xdb
                               [<c012a961>] do_softirq+0x36/0x51
                               [<c012ae4a>] local_bh_enable_ip+0xad/0xed
                               [<c03bf0df>] rt_run_flush+0x64/0x8b
                               [<c03e7924>] ip_mc_inc_group+0x184/0x1c1
                               [<c03e79a2>] ip_mc_up+0x41/0x59
                               [<c03e32dc>] inetdev_event+0x257/0x465
                               [<c03aa8da>] rtnl_notify+0x3a/0x40
                               [<c03aa997>] rtmsg_ifinfo+0xb7/0xdf
                               [<c013ac20>] notifier_call_chain+0x2a/0x52
                               [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                               [<c03a3d3b>] dev_open+0x71/0x77
                               [<c02ef626>] bond_enslave+0x30f/0x884
                               [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                               [<c02f6161>] bonding_store_slaves+0x1a2/0x2fb
                               [<c02f616d>] bonding_store_slaves+0x1ae/0x2fb
                               [<c02f5fbf>] bonding_store_slaves+0x0/0x2fb
                               [<c02ce8d7>] dev_attr_store+0x27/0x2c
                               [<c019bcb9>] sysfs_write_file+0xad/0xe0
                               [<c019bc0c>] sysfs_write_file+0x0/0xe0
                               [<c0168ddc>] vfs_write+0x8a/0x10c
                               [<c0118566>] do_page_fault+0x0/0x54a
                               [<c0169361>] sys_write+0x41/0x67
                               [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
    ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c043a057>] _spin_lock_irqsave+0x3f/0x6c
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c03abb93>] linkwatch_fire_event+0x25/0x37
    [<c03aeb1a>] netif_carrier_on+0x16/0x27
    [<c02ede2c>] bond_set_carrier+0x31/0x55
    [<c02ee9b6>] bond_select_active_slave+0x9c/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f616d>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fbf>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

   ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c0141dac>] mark_held_locks+0x39/0x53
    [<c0143062>] lock_acquire+0x79/0x93
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c0439dfd>] _write_lock_bh+0x3b/0x64
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c02eda75>] bond_set_multicast_list+0x1d/0x241
    [<c013fe35>] put_lock_stats+0xa/0x1e
    [<c03a14b3>] __dev_set_rx_mode+0x7b/0x7d
    [<c03a164d>] dev_set_rx_mode+0x23/0x36
    [<c03a3d28>] dev_open+0x5e/0x77
    [<c03a29f7>] dev_change_flags+0x9d/0x14b
    [<c03a17fb>] __dev_get_by_name+0x68/0x73
    [<c03e3828>] devinet_ioctl+0x22b/0x536
    [<c03a3b1d>] dev_ioctl+0x46f/0x5b7
    [<c0399c50>] sock_ioctl+0x167/0x18b
    [<c0399ae9>] sock_ioctl+0x0/0x18b
    [<c01725f7>] do_ioctl+0x1f/0x62
    [<c0172867>] vfs_ioctl+0x22d/0x23f
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c01728ac>] sys_ioctl+0x33/0x4b
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c0439d3a>] _spin_lock_bh+0x3b/0x64
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c04120ec>] igmp6_group_added+0x56/0x11d
    [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] igmp6_mc_seq_start+0x106/0x138
    [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
    [<c040180c>] ipv6_add_dev+0x21c/0x24b
    [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c0401def>] addrconf_notify+0x60/0x7b7
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0141dac>] mark_held_locks+0x39/0x53
    [<c043903e>] mutex_lock_nested+0x286/0x2ac
    [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
    [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a3f25>] register_netdevice_notifier+0xe/0x126
    [<c03a3f25>] register_netdevice_notifier+0xe/0x126
    [<c03a3f60>] register_netdevice_notifier+0x49/0x126
    [<c05c5bda>] addrconf_init+0xad/0x193
    [<c05c5b48>] addrconf_init+0x1b/0x193
    [<c05c5a20>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff


the second lock's dependencies:
-> (&bond->lock){-.--} ops: 99 {
    initial-use  at:
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c014289c>] __lock_acquire+0x4ba/0xc07
                         [<c0141dac>] mark_held_locks+0x39/0x53
                         [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c0439ec6>] _read_lock_bh+0x3b/0x64
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c03aa3bf>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03aa93d>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03aa9fe>] rtnetlink_event+0x3f/0x42
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a31b6>] register_netdevice+0x2a7/0x2e7
                         [<c02ed862>] bond_create+0x1f2/0x26a
                         [<c05bedcd>] bonding_init+0x761/0x7ea
                         [<c05be635>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c014286a>] __lock_acquire+0x488/0xc07
                         [<c0141dac>] mark_held_locks+0x39/0x53
                         [<c043a3d5>] _spin_unlock_irq+0x20/0x41
                         [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                         [<c043a3e0>] _spin_unlock_irq+0x2b/0x41
                         [<c012208a>] finish_task_switch+0x50/0x8c
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c0439dfd>] _write_lock_bh+0x3b/0x64
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c02eda75>] bond_set_multicast_list+0x1d/0x241
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c03a14b3>] __dev_set_rx_mode+0x7b/0x7d
                         [<c03a164d>] dev_set_rx_mode+0x23/0x36
                         [<c03a3d28>] dev_open+0x5e/0x77
                         [<c03a29f7>] dev_change_flags+0x9d/0x14b
                         [<c03a17fb>] __dev_get_by_name+0x68/0x73
                         [<c03e3828>] devinet_ioctl+0x22b/0x536
                         [<c03a3b1d>] dev_ioctl+0x46f/0x5b7
                         [<c0399c50>] sock_ioctl+0x167/0x18b
                         [<c0399ae9>] sock_ioctl+0x0/0x18b
                         [<c01725f7>] do_ioctl+0x1f/0x62
                         [<c0172867>] vfs_ioctl+0x22d/0x23f
                         [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                         [<c01728ac>] sys_ioctl+0x33/0x4b
                         [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                         [<ffffffff>] 0xffffffff
    softirq-on-R at:
                         [<c0141986>] mark_lock+0x64/0x451
                         [<c013575e>] __kernel_text_address+0x5/0xe
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c0142889>] __lock_acquire+0x4a7/0xc07
                         [<c013fc76>] save_trace+0x37/0x89
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c0439f25>] _read_lock+0x36/0x5f
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c02eee5d>] bond_mii_monitor+0x19/0x85
                         [<c0133cab>] run_workqueue+0xdb/0x1b6
                         [<c0133c57>] run_workqueue+0x87/0x1b6
                         [<c02eee44>] bond_mii_monitor+0x0/0x85
                         [<c01346cb>] worker_thread+0x0/0x85
                         [<c0134744>] worker_thread+0x79/0x85
                         [<c0137179>] autoremove_wake_function+0x0/0x35
                         [<c01370c2>] kthread+0x38/0x5e
                         [<c013708a>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-R at:
                         [<c013fe0a>] get_lock_stats+0xd/0x2e
                         [<c013fe35>] put_lock_stats+0xa/0x1e
                         [<c0142844>] __lock_acquire+0x462/0xc07
                         [<c0141dac>] mark_held_locks+0x39/0x53
                         [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143062>] lock_acquire+0x79/0x93
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c0439ec6>] _read_lock_bh+0x3b/0x64
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c02edcc1>] bond_get_stats+0x28/0xd0
                         [<c03aa3bf>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03aa93d>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03aa9fe>] rtnetlink_event+0x3f/0x42
                         [<c013ac20>] notifier_call_chain+0x2a/0x52
                         [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a31b6>] register_netdevice+0x2a7/0x2e7
                         [<c02ed862>] bond_create+0x1f2/0x26a
                         [<c05bedcd>] bonding_init+0x761/0x7ea
                         [<c05be635>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c08777d0>] __key.32969+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c04120ec>] igmp6_group_added+0x56/0x11d
                           [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                           [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c040180c>] ipv6_add_dev+0x21c/0x24b
                           [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401def>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c043903e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014286a>] __lock_acquire+0x488/0xc07
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c0439d3a>] _spin_lock_bh+0x3b/0x64
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c03a5a79>] dev_mc_add+0x1a/0x6a
                           [<c04120ec>] igmp6_group_added+0x56/0x11d
                           [<c0412480>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] igmp6_mc_seq_start+0x106/0x138
                           [<c04124b5>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c04121dd>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c040180c>] ipv6_add_dev+0x21c/0x24b
                           [<c040b055>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c0401def>] addrconf_notify+0x60/0x7b7
                           [<c0142fa1>] __lock_acquire+0xbbf/0xc07
                           [<c0141dac>] mark_held_locks+0x39/0x53
                           [<c043903e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f9f>] trace_hardirqs_on+0x122/0x14c
                           [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f25>] register_netdevice_notifier+0xe/0x126
                           [<c03a3f60>] register_netdevice_notifier+0x49/0x126
                           [<c05c5bda>] addrconf_init+0xad/0x193
                           [<c05c5b48>] addrconf_init+0x1b/0x193
                           [<c05c5a20>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c0439d3a>] _spin_lock_bh+0x3b/0x64
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c03a5a79>] dev_mc_add+0x1a/0x6a
    [<c02ee492>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7c3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee9af>] bond_select_active_slave+0x95/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f616d>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fbf>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  -> (lweventlist_lock){.+..} ops: 10 {
     initial-use  at:
                           [<c0141986>] mark_lock+0x64/0x451
                           [<c014289c>] __lock_acquire+0x4ba/0xc07
                           [<c02e365c>] e1000_read_phy_reg+0x1c7/0x1d3
                           [<c02e348b>] e1000_write_phy_reg+0xb9/0xc3
                           [<c024a7de>] delay_tsc+0x25/0x3b
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03abacf>] linkwatch_add_event+0xd/0x2c
                           [<c043a057>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03abacf>] linkwatch_add_event+0xd/0x2c
                           [<c03abacf>] linkwatch_add_event+0xd/0x2c
                           [<c03abb93>] linkwatch_fire_event+0x25/0x37
                           [<c02e1c43>] e1000_probe+0xad1/0xbe8
                           [<c0257f3f>] pci_device_probe+0x36/0x57
                           [<c02d0e5f>] driver_probe_device+0xe1/0x15f
                           [<c043a2d1>] _spin_unlock+0x25/0x3b
                           [<c043758a>] klist_next+0x58/0x6d
                           [<c02d0f6f>] __driver_attach+0x0/0x7f
                           [<c02d0fb8>] __driver_attach+0x49/0x7f
                           [<c02d0403>] bus_for_each_dev+0x36/0x58
                           [<c02d0cb7>] driver_attach+0x16/0x18
                           [<c02d0f6f>] __driver_attach+0x0/0x7f
                           [<c02d06fa>] bus_add_driver+0x6d/0x18d
                           [<c0258089>] __pci_register_driver+0x53/0x7f
                           [<c05be635>] e1000_init_module+0x45/0x7c
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     in-softirq-W at:
                           [<c011d20a>] __wake_up_common+0x32/0x5c
                           [<c0142822>] __lock_acquire+0x440/0xc07
                           [<c043a39d>] _spin_unlock_irqrestore+0x40/0x58
                           [<c0143062>] lock_acquire+0x79/0x93
                           [<c03abacf>] linkwatch_add_event+0xd/0x2c
                           [<c02dff01>] e1000_watchdog+0x0/0x5c9
                           [<c043a057>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03abacf>] linkwatch_add_event+0xd/0x2c
                           [<c03abacf>] linkwatch_add_event+0xd/0x2c
                           [<c03abb93>] linkwatch_fire_event+0x25/0x37
                           [<c03aeb1a>] netif_carrier_on+0x16/0x27
                           [<c02e0156>] e1000_watchdog+0x255/0x5c9
                           [<c02dff01>] e1000_watchdog+0x0/0x5c9
                           [<c012df52>] run_timer_softirq+0xfa/0x15d
                           [<c012a8a6>] __do_softirq+0x56/0xdb
                           [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
                           [<c012a8b8>] __do_softirq+0x68/0xdb
                           [<c012a961>] do_softirq+0x36/0x51
                           [<c012ae4a>] local_bh_enable_ip+0xad/0xed
                           [<c03bf0df>] rt_run_flush+0x64/0x8b
                           [<c03e7924>] ip_mc_inc_group+0x184/0x1c1
                           [<c03e79a2>] ip_mc_up+0x41/0x59
                           [<c03e32dc>] inetdev_event+0x257/0x465
                           [<c03aa8da>] rtnl_notify+0x3a/0x40
                           [<c03aa997>] rtmsg_ifinfo+0xb7/0xdf
                           [<c013ac20>] notifier_call_chain+0x2a/0x52
                           [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
                           [<c03a3d3b>] dev_open+0x71/0x77
                           [<c02ef626>] bond_enslave+0x30f/0x884
                           [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c02f6161>] bonding_store_slaves+0x1a2/0x2fb
                           [<c02f616d>] bonding_store_slaves+0x1ae/0x2fb
                           [<c02f5fbf>] bonding_store_slaves+0x0/0x2fb
                           [<c02ce8d7>] dev_attr_store+0x27/0x2c
                           [<c019bcb9>] sysfs_write_file+0xad/0xe0
                           [<c019bc0c>] sysfs_write_file+0x0/0xe0
                           [<c0168ddc>] vfs_write+0x8a/0x10c
                           [<c0118566>] do_page_fault+0x0/0x54a
                           [<c0169361>] sys_write+0x41/0x67
                           [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
  ... acquired at:
    [<c0142dff>] __lock_acquire+0xa1d/0xc07
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c0142fa1>] __lock_acquire+0xbbf/0xc07
    [<c0143062>] lock_acquire+0x79/0x93
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c043a057>] _spin_lock_irqsave+0x3f/0x6c
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c03abacf>] linkwatch_add_event+0xd/0x2c
    [<c03abb93>] linkwatch_fire_event+0x25/0x37
    [<c03aeb1a>] netif_carrier_on+0x16/0x27
    [<c02ede2c>] bond_set_carrier+0x31/0x55
    [<c02ee9b6>] bond_select_active_slave+0x9c/0xcd
    [<c02ed22b>] bond_compute_features+0x45/0x84
    [<c02ef9be>] bond_enslave+0x6a7/0x884
    [<c043905c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f616d>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f5fbf>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8d7>] dev_attr_store+0x27/0x2c
    [<c019bcb9>] sysfs_write_file+0xad/0xe0
    [<c019bc0c>] sysfs_write_file+0x0/0xe0
    [<c0168ddc>] vfs_write+0x8a/0x10c
    [<c0118566>] do_page_fault+0x0/0x54a
    [<c0169361>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff


stack backtrace:
Pid: 9, comm: events/0 Not tainted 2.6.24-rc5 #1
  [<c0140b38>] print_irq_inversion_bug+0x108/0x112
  [<c014191d>] check_usage_forwards+0x3c/0x41
  [<c0141b09>] mark_lock+0x1e7/0x451
  [<c0142822>] __lock_acquire+0x440/0xc07
  [<c011d686>] update_curr+0x52/0xc4
  [<c013f175>] tick_sched_timer+0x129/0x165
  [<c0143062>] lock_acquire+0x79/0x93
  [<c0411c52>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411b22>] mld_ifc_timer_expire+0x0/0x1fb
  [<c0439d3a>] _spin_lock_bh+0x3b/0x64
  [<c0411c52>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411c52>] mld_ifc_timer_expire+0x130/0x1fb
  [<c0411b22>] mld_ifc_timer_expire+0x0/0x1fb
  [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
  [<c0411b22>] mld_ifc_timer_expire+0x0/0x1fb
  [<c012df52>] run_timer_softirq+0xfa/0x15d
  [<c012a8a6>] __do_softirq+0x56/0xdb
  [<c0141f89>] trace_hardirqs_on+0x10c/0x14c
  [<c012a8b8>] __do_softirq+0x68/0xdb
  [<c012a961>] do_softirq+0x36/0x51
  [<c012ae4a>] local_bh_enable_ip+0xad/0xed
  [<c03bf0df>] rt_run_flush+0x64/0x8b
  [<c03e926e>] fib_netdev_event+0x61/0x65
  [<c013ac20>] notifier_call_chain+0x2a/0x52
  [<c013ac6a>] raw_notifier_call_chain+0x17/0x1a
  [<c03a2abd>] netdev_state_change+0x18/0x29
  [<c03abcf5>] __linkwatch_run_queue+0x150/0x17e
  [<c03abd40>] linkwatch_event+0x1d/0x22
  [<c0133cab>] run_workqueue+0xdb/0x1b6
  [<c0133c57>] run_workqueue+0x87/0x1b6
  [<c03abd23>] linkwatch_event+0x0/0x22
  [<c01346cb>] worker_thread+0x0/0x85
  [<c0134744>] worker_thread+0x79/0x85
  [<c0137179>] autoremove_wake_function+0x0/0x35
  [<c01370c2>] kthread+0x38/0x5e
  [<c013708a>] kthread+0x0/0x5e
  [<c0104baf>] kernel_thread_helper+0x7/0x10
  =======================
bonding: bond0: enslaving eth1 as a backup interface with a down link.
bond0: no IPv6 routers present



Best regards,

 				Krzysztof Olędzki
Comment 22 Krzysztof Oledzki 2007-12-18 11:53:47 UTC

On Fri, 14 Dec 2007, Andy Gospodarek wrote:

> On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote:
>>
>>
>> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
>>
>>> On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
>>>>
>>>>
>>>> On Wed, 12 Dec 2007, Jay Vosburgh wrote:
>>>>
>>>>> Herbert Xu <herbert@gondor.apana.org.au> wrote:
>>>>>
>>>>>>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>>>>>> drivers/net/bonding/bond_sysfs.c
>>>>>>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>>>>>>> +++ a/drivers/net/bonding/bond_sysfs.c
>>>>>>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>>>>>>> out:
>>>>>>>      write_unlock_bh(&bond->lock);
>>>>>>>
>>>>>>> -       rtnl_unlock();
>>>>>>> -
>>>>>>
>>>>>> Looking at the changeset that added this perhaps the intention
>>>>>> is to hold the lock? If so we should add an rtnl_lock to the start
>>>>>> of the function.
>>>>>
>>>>>   Yes, this function needs to hold locks, and more than just
>>>>> what's there now.  I believe the following should be correct; I haven't
>>>>> tested it, though (I'm supposedly on vacation right now).
>>>>>
>>>>>   The following change should be correct for the
>>>>> bonding_store_primary case discussed in this thread, and also corrects
>>>>> the bonding_store_active case which performs similar functions.
>>>>>
>>>>>   The bond_change_active_slave and bond_select_active_slave
>>>>> functions both require rtnl, bond->lock for read and curr_slave_lock for
>>>>> write_bh, and no other locks.  This is so that the lower level
>>>>> mode-specific functions can release locks down to just rtnl in order to
>>>>> call, e.g., dev_set_mac_address with the locks it expects (rtnl only).
>>>>>
>>>>> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>>>>>
>>>>> diff --git a/drivers/net/bonding/bond_sysfs.c
>>>>> b/drivers/net/bonding/bond_sysfs.c
>>>>> index 11b76b3..28a2d80 100644
>>>>> --- a/drivers/net/bonding/bond_sysfs.c
>>>>> +++ b/drivers/net/bonding/bond_sysfs.c
>>>>> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
>>>>> *d,
>>>>>   struct slave *slave;
>>>>>   struct bonding *bond = to_bond(d);
>>>>>
>>>>> - write_lock_bh(&bond->lock);
>>>>> + rtnl_lock();
>>>>> + read_lock(&bond->lock);
>>>>> + write_lock_bh(&bond->curr_slave_lock);
>>>>> +
>>>>>   if (!USES_PRIMARY(bond->params.mode)) {
>>>>>           printk(KERN_INFO DRV_NAME
>>>>>                  ": %s: Unable to set primary slave; %s is in mode
>>>>>                  %d\n",
>>>>> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
>>>>> *d,
>>>>>           }
>>>>>   }
>>>>> out:
>>>>> - write_unlock_bh(&bond->lock);
>>>>> -
>>>>> + write_unlock_bh(&bond->curr_slave_lock);
>>>>> + read_unlock(&bond->lock);
>>>>>   rtnl_unlock();
>>>>>
>>>>>   return count;
>>>>> @@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct
>>>>> device *d,
>>>>>   struct bonding *bond = to_bond(d);
>>>>>
>>>>>   rtnl_lock();
>>>>> - write_lock_bh(&bond->lock);
>>>>> + read_lock(&bond->lock);
>>>>> + write_lock_bh(&bond->curr_slave_lock);
>>>>>
>>>>>   if (!USES_PRIMARY(bond->params.mode)) {
>>>>>           printk(KERN_INFO DRV_NAME
>>>>> @@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct
>>>>> device *d,
>>>>>           }
>>>>>   }
>>>>> out:
>>>>> - write_unlock_bh(&bond->lock);
>>>>> + write_unlock_bh(&bond->curr_slave_lock);
>>>>> + read_unlock(&bond->lock);
>>>>>   rtnl_unlock();
>>>>>
>>>>>   return count;
>>>>
>>>> Vanilla 2.6.24-rc5 plus this patch:
>>>>
>>>> =========================================================
>>>> [ INFO: possible irq lock inversion dependency detected ]
>>>> 2.6.24-rc5 #1
>>>> ---------------------------------------------------------
>>>> events/0/9 just changed the state of lock:
>>>> (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb
>>>> but this lock took another, soft-read-irq-unsafe lock in the past:
>>>> (&bond->lock){-.--}
>>>>
>>>> and interrupts could create inverse lock ordering between them.
>>>>
>>>>
>>>
>>> Grrr, I should have seen that -- sorry.  Try your luck with this instead:
>> <CUT>
>>
>> No luck.
>>
>
>
> I'm guessing if we go back to using a write-lock for bond->lock this
> will go back to working again, but I'm not totally convinced since there
> are plenty of places where we used a read-lock with it.

Should I check this patch or rather, based on a future discussion, wait 
for another version?

>
> diff --git a/drivers/net/bonding/bond_sysfs.c
> b/drivers/net/bonding/bond_sysfs.c
> index 11b76b3..635b857 100644
> --- a/drivers/net/bonding/bond_sysfs.c
> +++ b/drivers/net/bonding/bond_sysfs.c
> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
>       struct slave *slave;
>       struct bonding *bond = to_bond(d);
>
> +     rtnl_lock();
>       write_lock_bh(&bond->lock);
> +     write_lock_bh(&bond->curr_slave_lock);
> +
>       if (!USES_PRIMARY(bond->params.mode)) {
>               printk(KERN_INFO DRV_NAME
>                      ": %s: Unable to set primary slave; %s is in mode %d\n",
> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
>               }
>       }
> out:
> +     write_unlock_bh(&bond->curr_slave_lock);
>       write_unlock_bh(&bond->lock);
> -
>       rtnl_unlock();
>
>       return count;
> @@ -1191,6 +1194,7 @@ static ssize_t bonding_store_active_slave(struct device
> *d,
>
>       rtnl_lock();
>       write_lock_bh(&bond->lock);
> +     write_lock_bh(&bond->curr_slave_lock);
>
>       if (!USES_PRIMARY(bond->params.mode)) {
>               printk(KERN_INFO DRV_NAME
> @@ -1247,6 +1251,7 @@ static ssize_t bonding_store_active_slave(struct device
> *d,
>               }
>       }
> out:
> +     write_unlock_bh(&bond->curr_slave_lock);
>       write_unlock_bh(&bond->lock);
>       rtnl_unlock();
>


Best regards,

 					Krzysztof Olędzki
Comment 23 Anonymous Emailer 2007-12-19 06:42:24 UTC
Reply-To: andy@greyhouse.net

On Tue, Dec 18, 2007 at 08:53:39PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
> 
> ><CUT>
> >
> >I'm guessing if we go back to using a write-lock for bond->lock this
> >will go back to working again, but I'm not totally convinced since there
> >are plenty of places where we used a read-lock with it.
> 
> Should I check this patch or rather, based on a future discussion, wait 
> for another version?
> 
> >
> >diff --git a/drivers/net/bonding/bond_sysfs.c 
> >b/drivers/net/bonding/bond_sysfs.c
> >index 11b76b3..635b857 100644
> >--- a/drivers/net/bonding/bond_sysfs.c
> >+++ b/drivers/net/bonding/bond_sysfs.c
> >@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device 
> >*d,
> >     struct slave *slave;
> >     struct bonding *bond = to_bond(d);
> >
> >+    rtnl_lock();
> >     write_lock_bh(&bond->lock);
> >+    write_lock_bh(&bond->curr_slave_lock);
> >+
> >     if (!USES_PRIMARY(bond->params.mode)) {
> >             printk(KERN_INFO DRV_NAME
> >                    ": %s: Unable to set primary slave; %s is in mode 
> >                    %d\n",
> >@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device 
> >*d,
> >             }
> >     }
> >out:
> >+    write_unlock_bh(&bond->curr_slave_lock);
> >     write_unlock_bh(&bond->lock);
> >-
> >     rtnl_unlock();
> >
> >     return count;
> >@@ -1191,6 +1194,7 @@ static ssize_t bonding_store_active_slave(struct 
> >device *d,
> >
> >     rtnl_lock();
> >     write_lock_bh(&bond->lock);
> >+    write_lock_bh(&bond->curr_slave_lock);
> >
> >     if (!USES_PRIMARY(bond->params.mode)) {
> >             printk(KERN_INFO DRV_NAME
> >@@ -1247,6 +1251,7 @@ static ssize_t bonding_store_active_slave(struct 
> >device *d,
> >             }
> >     }
> >out:
> >+    write_unlock_bh(&bond->curr_slave_lock);
> >     write_unlock_bh(&bond->lock);
> >     rtnl_unlock();
> >
> 
> 
> Best regards,
> 
>                                       Krzysztof Olędzki

For now, I prefer Jay's original patch -- with the read_locks (rather
than read/write_lock_bh) and the added rtnl_lock.  There is still a
lockdep issue that we need to sort out, but this patch is needed first.
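
The ordering that Jay's patch sets up in the two sysfs store handlers (rtnl first, then bond->lock for read, then curr_slave_lock for write, released in reverse) can be sketched as a small userspace model. This is only an illustration, not kernel code: `lock_tracker`, `tracker_acquire`, `tracker_release` and `model_store_handler` are hypothetical names invented here, and the model only checks the order of operations; it does no real locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lock levels, outermost first, in the order the patched handlers take them. */
enum lock_level {
    LOCK_RTNL = 0,             /* stands in for rtnl_lock() */
    LOCK_BOND_READ = 1,        /* stands in for read_lock(&bond->lock) */
    LOCK_CURR_SLAVE_WRITE = 2  /* stands in for write_lock_bh(&bond->curr_slave_lock) */
};

/* Hypothetical tracker: depth is both the number of locks held and the
 * next level that may legally be acquired. */
struct lock_tracker {
    int depth;
};

/* Accept an acquisition only if it is the next level in the order. */
bool tracker_acquire(struct lock_tracker *t, enum lock_level level)
{
    assert(t != NULL);
    if ((int)level != t->depth)
        return false;
    t->depth++;
    return true;
}

/* Accept a release only if it is the innermost lock currently held. */
bool tracker_release(struct lock_tracker *t, enum lock_level level)
{
    assert(t != NULL);
    if (t->depth == 0 || (int)level != t->depth - 1)
        return false;
    t->depth--;
    return true;
}

/* Replays the acquire/release sequence of the patched sysfs store handlers. */
bool model_store_handler(struct lock_tracker *t)
{
    bool ok = tracker_acquire(t, LOCK_RTNL) &&
              tracker_acquire(t, LOCK_BOND_READ) &&
              tracker_acquire(t, LOCK_CURR_SLAVE_WRITE);

    /* In the real handler, bond_select_active_slave() would run here. */

    ok = ok && tracker_release(t, LOCK_CURR_SLAVE_WRITE) &&
         tracker_release(t, LOCK_BOND_READ) &&
         tracker_release(t, LOCK_RTNL);
    return ok && t->depth == 0;
}
```

Calling `model_store_handler()` on a zero-initialized tracker succeeds, while taking `LOCK_BOND_READ` before `LOCK_RTNL` is rejected. That mirrors the original problem in bonding_store_primary, which called rtnl_unlock() without a matching rtnl_lock().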
Comment 24 Rafael J. Wysocki 2008-01-01 12:26:20 UTC
Any chance to push the patch upstream for 2.6.24?
Comment 25 Krzysztof Oledzki 2008-01-07 09:58:09 UTC

On Wed, 19 Dec 2007, Andy Gospodarek wrote:

> On Tue, Dec 18, 2007 at 08:53:39PM +0100, Krzysztof Oledzki wrote:
>>
>>
>> On Fri, 14 Dec 2007, Andy Gospodarek wrote:
>>
>>> <CUT>
>>>
>>> I'm guessing if we go back to using a write-lock for bond->lock this
>>> will go back to working again, but I'm not totally convinced since there
>>> are plenty of places where we used a read-lock with it.
>>
>> Should I check this patch or rather, based on a future discussion, wait
>> for another version?
>>
>>>
>>> diff --git a/drivers/net/bonding/bond_sysfs.c
>>> b/drivers/net/bonding/bond_sysfs.c
>>> index 11b76b3..635b857 100644
>>> --- a/drivers/net/bonding/bond_sysfs.c
>>> +++ b/drivers/net/bonding/bond_sysfs.c
>>> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
>>> *d,
>>>     struct slave *slave;
>>>     struct bonding *bond = to_bond(d);
>>>
>>> +   rtnl_lock();
>>>     write_lock_bh(&bond->lock);
>>> +   write_lock_bh(&bond->curr_slave_lock);
>>> +
>>>     if (!USES_PRIMARY(bond->params.mode)) {
>>>             printk(KERN_INFO DRV_NAME
>>>                    ": %s: Unable to set primary slave; %s is in mode
>>>                    %d\n",
>>> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
>>> *d,
>>>             }
>>>     }
>>> out:
>>> +   write_unlock_bh(&bond->curr_slave_lock);
>>>     write_unlock_bh(&bond->lock);
>>> -
>>>     rtnl_unlock();
>>>
>>>     return count;
>>> @@ -1191,6 +1194,7 @@ static ssize_t bonding_store_active_slave(struct
>>> device *d,
>>>
>>>     rtnl_lock();
>>>     write_lock_bh(&bond->lock);
>>> +   write_lock_bh(&bond->curr_slave_lock);
>>>
>>>     if (!USES_PRIMARY(bond->params.mode)) {
>>>             printk(KERN_INFO DRV_NAME
>>> @@ -1247,6 +1251,7 @@ static ssize_t bonding_store_active_slave(struct
>>> device *d,
>>>             }
>>>     }
>>> out:
>>> +   write_unlock_bh(&bond->curr_slave_lock);
>>>     write_unlock_bh(&bond->lock);
>>>     rtnl_unlock();
>>>
>>
>>
>> Best regards,
>>
>>                                      Krzysztof Olędzki
Comment 26 Anonymous Emailer 2008-01-07 12:26:46 UTC
Reply-To: andy@greyhouse.net

On Mon, Jan 07, 2008 at 06:57:25PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Wed, 19 Dec 2007, Andy Gospodarek wrote:
> 
> >On Tue, Dec 18, 2007 at 08:53:39PM +0100, Krzysztof Oledzki wrote:
> >>
> >>
> >>On Fri, 14 Dec 2007, Andy Gospodarek wrote:
> >>
> >>><CUT>
> >>>
> >>>I'm guessing if we go back to using a write-lock for bond->lock this
> >>>will go back to working again, but I'm not totally convinced since there
> >>>are plenty of places where we used a read-lock with it.
> >>
> >>Should I check this patch or rather, based on a future discussion, wait
> >>for another version?
> >>
> >>>
> >>>diff --git a/drivers/net/bonding/bond_sysfs.c
> >>>b/drivers/net/bonding/bond_sysfs.c
> >>>index 11b76b3..635b857 100644
> >>>--- a/drivers/net/bonding/bond_sysfs.c
> >>>+++ b/drivers/net/bonding/bond_sysfs.c
> >>>@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
> >>>*d,
> >>>   struct slave *slave;
> >>>   struct bonding *bond = to_bond(d);
> >>>
> >>>+  rtnl_lock();
> >>>   write_lock_bh(&bond->lock);
> >>>+  write_lock_bh(&bond->curr_slave_lock);
> >>>+
> >>>   if (!USES_PRIMARY(bond->params.mode)) {
> >>>           printk(KERN_INFO DRV_NAME
> >>>                  ": %s: Unable to set primary slave; %s is in mode
> >>>                  %d\n",
> >>>@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
> >>>*d,
> >>>           }
> >>>   }
> >>>out:
> >>>+  write_unlock_bh(&bond->curr_slave_lock);
> >>>   write_unlock_bh(&bond->lock);
> >>>-
> >>>   rtnl_unlock();
> >>>
> >>>   return count;
> >>>@@ -1191,6 +1194,7 @@ static ssize_t bonding_store_active_slave(struct
> >>>device *d,
> >>>
> >>>   rtnl_lock();
> >>>   write_lock_bh(&bond->lock);
> >>>+  write_lock_bh(&bond->curr_slave_lock);
> >>>
> >>>   if (!USES_PRIMARY(bond->params.mode)) {
> >>>           printk(KERN_INFO DRV_NAME
> >>>@@ -1247,6 +1251,7 @@ static ssize_t bonding_store_active_slave(struct
> >>>device *d,
> >>>           }
> >>>   }
> >>>out:
> >>>+  write_unlock_bh(&bond->curr_slave_lock);
> >>>   write_unlock_bh(&bond->lock);
> >>>   rtnl_unlock();
> >>>
> >>
> >>
> >>Best regards,
> >>
> >>                                    Krzysztof Olędzki
> >
> >For now, I prefer Jay's original patch -- with the read_locks (rather
> >than read/write_lock_bh) and the added rtnl_lock.  There is still a
> >lockdep issue that we need to sort-out, but this patch is needed first.
> 
> This bug has not been fixed yet as it still exists in 2.6.24-rc7. Any 
> chances to cure it before 2.6.24-final?
> 
> Best regards,
> 
>                               Krzysztof Olędzki

Krzysztof,

I doubt the lockdep issue will be fixed, but the patch Jay posted and I
acked needs to be included in 2.6.24.

I played around with the locking when setting the multicast list and I
can make the lockdep issue go away, but I need to be sure that it's OK
to switch it to a read-lock from a write-lock (and I don't really think
it is).

-andy
Comment 27 Jay Vosburgh 2008-01-07 12:40:38 UTC
Andy Gospodarek <andy@greyhouse.net> wrote:

>On Mon, Jan 07, 2008 at 06:57:25PM +0100, Krzysztof Oledzki wrote:
>> 
>> 
>> On Wed, 19 Dec 2007, Andy Gospodarek wrote:
>> 
>> >On Tue, Dec 18, 2007 at 08:53:39PM +0100, Krzysztof Oledzki wrote:
>> >>
>> >>
>> >>On Fri, 14 Dec 2007, Andy Gospodarek wrote:
>> >>
>> >>>On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote:
>> >>>>
>> >>>>
>> >>>>On Fri, 14 Dec 2007, Andy Gospodarek wrote:
>> >>>>
>> >>>>>On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>>On Wed, 12 Dec 2007, Jay Vosburgh wrote:
>> >>>>>>
>> >>>>>>>Herbert Xu <herbert@gondor.apana.org.au> wrote:
>> >>>>>>>
>> >>>>>>>>>diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>> >>>>>>>>>drivers/net/bonding/bond_sysfs.c
>> >>>>>>>>>--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
>> >>>>>>>>>+++ a/drivers/net/bonding/bond_sysfs.c
>> >>>>>>>>>@@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
>> >>>>>>>>>out:
>> >>>>>>>>>    write_unlock_bh(&bond->lock);
>> >>>>>>>>>
>> >>>>>>>>>-       rtnl_unlock();
>> >>>>>>>>>-
>> >>>>>>>>
>> >>>>>>>>Looking at the changeset that added this perhaps the intention
>> >>>>>>>>is to hold the lock? If so we should add an rtnl_lock to the start
>> >>>>>>>>of the function.
>> >>>>>>>
>> >>>>>>>      Yes, this function needs to hold locks, and more than just
>> >>>>>>>what's there now.  I believe the following should be correct; I 
>> >>>>>>>haven't
>> >>>>>>>tested it, though (I'm supposedly on vacation right now).
>> >>>>>>>
>> >>>>>>>      The following change should be correct for the
>> >>>>>>>bonding_store_primary case discussed in this thread, and also 
>> >>>>>>>corrects
>> >>>>>>>the bonding_store_active case which performs similar functions.
>> >>>>>>>
>> >>>>>>>      The bond_change_active_slave and bond_select_active_slave
>> >>>>>>>functions both require rtnl, bond->lock for read and curr_slave_lock
>> >>>>>>>for
>> >>>>>>>write_bh, and no other locks.  This is so that the lower level
>> >>>>>>>mode-specific functions can release locks down to just rtnl in order 
>> >>>>>>>to
>> >>>>>>>call, e.g., dev_set_mac_address with the locks it expects (rtnl 
>> >>>>>>>only).
>> >>>>>>>
>> >>>>>>>Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
>> >>>>>>>
>> >>>>>>>diff --git a/drivers/net/bonding/bond_sysfs.c
>> >>>>>>>b/drivers/net/bonding/bond_sysfs.c
>> >>>>>>>index 11b76b3..28a2d80 100644
>> >>>>>>>--- a/drivers/net/bonding/bond_sysfs.c
>> >>>>>>>+++ b/drivers/net/bonding/bond_sysfs.c
>> >>>>>>>@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct
>> >>>>>>>device
>> >>>>>>>*d,
>> >>>>>>>      struct slave *slave;
>> >>>>>>>      struct bonding *bond = to_bond(d);
>> >>>>>>>
>> >>>>>>>-     write_lock_bh(&bond->lock);
>> >>>>>>>+     rtnl_lock();
>> >>>>>>>+     read_lock(&bond->lock);
>> >>>>>>>+     write_lock_bh(&bond->curr_slave_lock);
>> >>>>>>>+
>> >>>>>>>      if (!USES_PRIMARY(bond->params.mode)) {
>> >>>>>>>              printk(KERN_INFO DRV_NAME
>> >>>>>>>                     ": %s: Unable to set primary slave; %s is in 
>> >>>>>>>                     mode
>> >>>>>>>                     %d\n",
>> >>>>>>>@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct
>> >>>>>>>device
>> >>>>>>>*d,
>> >>>>>>>              }
>> >>>>>>>      }
>> >>>>>>>out:
>> >>>>>>>-     write_unlock_bh(&bond->lock);
>> >>>>>>>-
>> >>>>>>>+     write_unlock_bh(&bond->curr_slave_lock);
>> >>>>>>>+     read_unlock(&bond->lock);
>> >>>>>>>      rtnl_unlock();
>> >>>>>>>
>> >>>>>>>      return count;
>> >>>>>>>@@ -1190,7 +1193,8 @@ static ssize_t 
>> >>>>>>>bonding_store_active_slave(struct
>> >>>>>>>device *d,
>> >>>>>>>      struct bonding *bond = to_bond(d);
>> >>>>>>>
>> >>>>>>>      rtnl_lock();
>> >>>>>>>-     write_lock_bh(&bond->lock);
>> >>>>>>>+     read_lock(&bond->lock);
>> >>>>>>>+     write_lock_bh(&bond->curr_slave_lock);
>> >>>>>>>
>> >>>>>>>      if (!USES_PRIMARY(bond->params.mode)) {
>> >>>>>>>              printk(KERN_INFO DRV_NAME
>> >>>>>>>@@ -1247,7 +1251,8 @@ static ssize_t 
>> >>>>>>>bonding_store_active_slave(struct
>> >>>>>>>device *d,
>> >>>>>>>              }
>> >>>>>>>      }
>> >>>>>>>out:
>> >>>>>>>-     write_unlock_bh(&bond->lock);
>> >>>>>>>+     write_unlock_bh(&bond->curr_slave_lock);
>> >>>>>>>+     read_unlock(&bond->lock);
>> >>>>>>>      rtnl_unlock();
>> >>>>>>>
>> >>>>>>>      return count;
>> >>>>>>
>> >>>>>>Vanilla 2.6.24-rc5 plus this patch:
>> >>>>>>
>> >>>>>>=========================================================
>> >>>>>>[ INFO: possible irq lock inversion dependency detected ]
>> >>>>>>2.6.24-rc5 #1
>> >>>>>>---------------------------------------------------------
>> >>>>>>events/0/9 just changed the state of lock:
>> >>>>>>(&mc->mca_lock){-+..}, at: [<c0411c7a>] 
>> >>>>>>mld_ifc_timer_expire+0x130/0x1fb
>> >>>>>>but this lock took another, soft-read-irq-unsafe lock in the past:
>> >>>>>>(&bond->lock){-.--}
>> >>>>>>
>> >>>>>>and interrupts could create inverse lock ordering between them.
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>Grrr, I should have seen that -- sorry.  Try your luck with this 
>> >>>>>instead:
>> >>>><CUT>
>> >>>>
>> >>>>No luck.
>> >>>>
>> >>>
>> >>>
>> >>>I'm guessing if we go back to using a write-lock for bond->lock this
>> >>>will go back to working again, but I'm not totally convinced since there
>> >>>are plenty of places where we used a read-lock with it.
>> >>
>> >>Should I check this patch or rather, based on a future discussion, wait
>> >>for another version?
>> >>
>> >>>
>> >>>diff --git a/drivers/net/bonding/bond_sysfs.c
>> >>>b/drivers/net/bonding/bond_sysfs.c
>> >>>index 11b76b3..635b857 100644
>> >>>--- a/drivers/net/bonding/bond_sysfs.c
>> >>>+++ b/drivers/net/bonding/bond_sysfs.c
>> >>>@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device
>> >>>*d,
>> >>>  struct slave *slave;
>> >>>  struct bonding *bond = to_bond(d);
>> >>>
>> >>>+ rtnl_lock();
>> >>>  write_lock_bh(&bond->lock);
>> >>>+ write_lock_bh(&bond->curr_slave_lock);
>> >>>+
>> >>>  if (!USES_PRIMARY(bond->params.mode)) {
>> >>>          printk(KERN_INFO DRV_NAME
>> >>>                 ": %s: Unable to set primary slave; %s is in mode
>> >>>                 %d\n",
>> >>>@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device
>> >>>*d,
>> >>>          }
>> >>>  }
>> >>>out:
>> >>>+ write_unlock_bh(&bond->curr_slave_lock);
>> >>>  write_unlock_bh(&bond->lock);
>> >>>-
>> >>>  rtnl_unlock();
>> >>>
>> >>>  return count;
>> >>>@@ -1191,6 +1194,7 @@ static ssize_t bonding_store_active_slave(struct
>> >>>device *d,
>> >>>
>> >>>  rtnl_lock();
>> >>>  write_lock_bh(&bond->lock);
>> >>>+ write_lock_bh(&bond->curr_slave_lock);
>> >>>
>> >>>  if (!USES_PRIMARY(bond->params.mode)) {
>> >>>          printk(KERN_INFO DRV_NAME
>> >>>@@ -1247,6 +1251,7 @@ static ssize_t bonding_store_active_slave(struct
>> >>>device *d,
>> >>>          }
>> >>>  }
>> >>>out:
>> >>>+ write_unlock_bh(&bond->curr_slave_lock);
>> >>>  write_unlock_bh(&bond->lock);
>> >>>  rtnl_unlock();
>> >>>
>> >>
>> >>
>> >>Best regards,
>> >>
>> >>                                   Krzysztof Olędzki
>> >
>> >For now, I prefer Jay's original patch -- with the read_locks (rather
>> >than read/write_lock_bh) and the added rtnl_lock.  There is still a
>> >lockdep issue that we need to sort-out, but this patch is needed first.
>> 
>> This bug has not been fixed yet as it still exists in 2.6.24-rc7. Any 
>> chances to cure it before 2.6.24-final?
>> 
>> Best regards,
>> 
>>                              Krzysztof Olędzki
>
>Krzysztof,
>
>I doubt the lockdep issue will be fixed, but the patch Jay posted and I
>acked needs to be included in 2.6.24.

	I'm (finally) back from vacation and am working on the lock
problem right now; there are a couple of other changes that need to go
in (in addition to what was posted previously).  One is a spurious RTNL
warning, the other is a similar 'wrong lock' type of problem that arises
during module unload.

	I should have a patch set for this posted in a couple of hours.

>I played around with the locking when setting the multicast list and I
>can make the lockdep issue go away, but I need to be sure that it's OK
>to switch it to a read-lock from a write-lock (and I don't really think
>it is).

	I haven't looked at the lockdep problem yet.  If you want to be
brave and post your working patch for the lockdep thing, I might be able
to crush your hopes that it's ok.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
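
The locking rule quoted above (bond_change_active_slave and bond_select_active_slave need rtnl, then bond->lock for read, then curr_slave_lock for write_bh) can be illustrated with a small userspace model. This is only a sketch: it is not kernel code and takes no real locks. The names lock_rank, held_locks, try_acquire, release and may_change_active_slave are hypothetical and appear neither in the bonding driver nor in the patches in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks modelling the order described in the thread:
 * rtnl first, then bond->lock, then bond->curr_slave_lock. */
enum lock_rank {
	RANK_RTNL = 0,
	RANK_BOND_LOCK = 1,
	RANK_CURR_SLAVE = 2,
	RANK_COUNT
};

/* Which of the modelled locks the caller currently holds. */
struct held_locks {
	bool held[RANK_COUNT];
};

/* Record an acquisition. Refuse it if the same lock or any
 * higher-ranked lock is already held, i.e. if the order is violated. */
static bool try_acquire(struct held_locks *h, enum lock_rank rank)
{
	for (int r = rank; r < RANK_COUNT; r++)
		if (h->held[r])
			return false;
	h->held[rank] = true;
	return true;
}

/* Record a release; the lock must have been held. */
static void release(struct held_locks *h, enum lock_rank rank)
{
	assert(h->held[rank]);
	h->held[rank] = false;
}

/* Precondition modelled on the description above: all three locks
 * must be held before the active slave may be changed. */
static bool may_change_active_slave(const struct held_locks *h)
{
	return h->held[RANK_RTNL] &&
	       h->held[RANK_BOND_LOCK] &&
	       h->held[RANK_CURR_SLAVE];
}
```

In this model, taking the three locks in the order rtnl, bond->lock, curr_slave_lock succeeds and satisfies the precondition, while taking curr_slave_lock first and then rtnl is rejected. The model checks ordering only; it does not distinguish read from write acquisition or the _bh variants, which is what the lockdep report in this thread is about.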
Comment 28 Krzysztof Oledzki 2008-02-12 09:49:20 UTC
The bug no longer occurs in 2.6.24-final. AFAIK it was fixed by:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=e934dd7862e7f613b2ce9730d548a0a70913c8f7
