Bug 11572 - udp: unbalanced unlock bug in udp_queue_rcv_skb with ipsec
Summary: udp: unbalanced unlock bug in udp_queue_rcv_skb with ipsec
Status: CLOSED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4 (show other bugs)
Hardware: All Linux
: P1 high
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-15 07:04 UTC by Thomas Jarosch
Modified: 2008-09-15 13:32 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.26.4
Subsystem:
Regression: ---
Bisected commit-id:


Attachments

Description Thomas Jarosch 2008-09-15 07:04:22 UTC
Hello together,

after upgrading our production firewall to kernel 2.6.26.4, it freezed numerous times out of the blue. First problem was isolated and now fixed in bug #11142, but the box still freezed every two days. Yesterday night it crashed during setup of ipsec connections, so I've built a test box setting up over 25 ipsec connections to invalid peers. This will issue the following backtrace:

====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
pluto/2943 is trying to release lock (slock-AF_INET) at:
[<c02b660f>] udp_queue_rcv_skb+0xa3/0x210
but there are no more locks to release!

other info that might help us debug this:
no locks held by pluto/2943.

stack backtrace:
Pid: 2943, comm: pluto Not tainted 2.6.26-2.i2nsmp #2
 [<c013ebb9>] print_unlock_inbalance_bug+0xd7/0xe1
 [<c02c73a5>] ? xfrm_sk_policy_lookup+0x44/0x4b
 [<c02c7d7b>] ? __xfrm_policy_check+0x19b/0x4a2
 [<c02b660f>] ? udp_queue_rcv_skb+0xa3/0x210
 [<c013ed95>] lock_release+0xad/0x15c
 [<c02e1056>] _spin_unlock+0x16/0x20
 [<c02b660f>] udp_queue_rcv_skb+0xa3/0x210
 [<c0279ce0>] release_sock+0x5e/0xab
 [<c02b75b6>] udp_sendmsg+0x421/0x4d5
 [<c02bc186>] inet_sendmsg+0x3b/0x48
 [<c02781c0>] sock_sendmsg+0xca/0xe1
 [<c0133471>] ? autoremove_wake_function+0x0/0x33
 [<c02789f7>] sys_sendto+0xa4/0xc3
 [<c02e0fe3>] ? _spin_unlock_irq+0x22/0x2f
 [<c013d99a>] ? trace_hardirqs_on+0xe9/0x10a
 [<c02e0fe3>] ? _spin_unlock_irq+0x22/0x2f
 [<c0279102>] sys_socketcall+0xfb/0x189
 [<c0102ffa>] syscall_call+0x7/0xb
 =======================

And a bit later:

BUG: soft lockup - CPU#0 stuck for 61s! [pluto:2943]
Modules linked in: authenc xfrm4_mode_tunnel deflate zlib_deflate twofish twofish_common camellia serpent blowfish des_generic cbc aes_i586 aes_generic xcbc sha256_generic crypto_null crypto_blkcid

Pid: 2943, comm: pluto Not tainted (2.6.26-2.i2nsmp #2)
EIP: 0060:[<c01cb395>] EFLAGS: 00000282 CPU: 0
EIP is at delay_tsc+0x20/0x59
EAX: 535984b0 EBX: 00000000 ECX: 03301000 EDX: 000001d6
ESI: 00000000 EDI: f682e6a4 EBP: f6947cc0 ESP: f6947c94
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 0016aea0 CR3: 37bd5000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c01cb3e6>] __delay+0x9/0xb
 [<c01d8c0b>] _raw_spin_lock+0xa7/0x118
 [<c02e10ef>] _spin_lock+0x20/0x28
 [<c02b6623>] ? udp_queue_rcv_skb+0xb7/0x210
 [<c02b6623>] udp_queue_rcv_skb+0xb7/0x210
 [<c0279ce0>] release_sock+0x5e/0xab
 [<c02b75b6>] udp_sendmsg+0x421/0x4d5
 [<c02bc186>] inet_sendmsg+0x3b/0x48
 [<c02781c0>] sock_sendmsg+0xca/0xe1
 [<c0133471>] ? autoremove_wake_function+0x0/0x33
 [<c02789f7>] sys_sendto+0xa4/0xc3
 [<c02e0fe3>] ? _spin_unlock_irq+0x22/0x2f
 [<c013d99a>] ? trace_hardirqs_on+0xe9/0x10a
 [<c02e0fe3>] ? _spin_unlock_irq+0x22/0x2f
 [<c0279102>] sys_socketcall+0xfb/0x189
 [<c0102ffa>] syscall_call+0x7/0xb


The locking issue seems to be caused by 

commit d01885e51bf577745ba935c70b55fe1405d6b632
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Wed Aug 27 22:40:04 2008 -0700

    udp: Drop socket lock for encapsulated packets
    
    [ Upstream commit d97106ea52aa57e63ff40d04479016836bbb5a4e ]
    
    The socket lock is there to protect the normal UDP receive path.
    Encapsulation UDP sockets don't need that protection.  In fact
    the locking is deadly for them as they may contain another UDP
    packet within, possibly with the same addresses.
    
    Also the nested bit was copied from TCP.  TCP needs it because
    of accept(2) spawning sockets.  This simply doesn't apply to UDP
    so I've removed it.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

of kernel 2.6.24.4. The test box freezes after around 30s, so I can easily test any patch and give feedback.

Cheers,
Thomas
Comment 1 Thomas Jarosch 2008-09-15 07:07:07 UTC
The last version number "2.6.24.4" should read "2.6.26.4" ;-)
Comment 2 Herbert Xu 2008-09-15 11:34:28 UTC
On Mon, Sep 15, 2008 at 07:04:24AM -0700, bugme-daemon@bugzilla.kernel.org wrote:
> 
> after upgrading our production firewall to kernel 2.6.26.4, it freezed
> numerous
> times out of the blue. First problem was isolated and now fixed in bug
> #11142,
> but the box still freezed every two days. Yesterday night it crashed during
> setup of ipsec connections, so I've built a test box setting up over 25 ipsec
> connections to invalid peers. This will issue the following backtrace:
> 
> ====================================
> [ BUG: bad unlock balance detected! ]
> -------------------------------------
> pluto/2943 is trying to release lock (slock-AF_INET) at:
> [<c02b660f>] udp_queue_rcv_skb+0xa3/0x210
> but there are no more locks to release!
> 
> other info that might help us debug this:
> no locks held by pluto/2943.
> 
> stack backtrace:
> Pid: 2943, comm: pluto Not tainted 2.6.26-2.i2nsmp #2
>  [<c013ebb9>] print_unlock_inbalance_bug+0xd7/0xe1
>  [<c02c73a5>] ? xfrm_sk_policy_lookup+0x44/0x4b
>  [<c02c7d7b>] ? __xfrm_policy_check+0x19b/0x4a2
>  [<c02b660f>] ? udp_queue_rcv_skb+0xa3/0x210
>  [<c013ed95>] lock_release+0xad/0x15c
>  [<c02e1056>] _spin_unlock+0x16/0x20
>  [<c02b660f>] udp_queue_rcv_skb+0xa3/0x210
>  [<c0279ce0>] release_sock+0x5e/0xab

Sorry my previous fix is simply bogus.  The real problem is that
we added the socket lock in the wrong place.  It should be around
the socket queueing only.

udp: Fix rcv socket locking

The previous patch in response to the recursive locking on IPsec
reception is broken as it tries to drop the BH socket lock while in
user context.

This patch fixes it by shrinking the section protected by the
socket lock to sock_queue_rcv_skb only.  The only reason we added
the lock is for the accounting which happens in that function.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8e42fbb..57e26fa 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -951,6 +951,27 @@ int udp_disconnect(struct sock *sk, int flags)
 	return 0;
 }
 
+static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
+{
+	int is_udplite = IS_UDPLITE(sk);
+	int rc;
+
+	if ((rc = sock_queue_rcv_skb(sk, skb)) < 0) {
+		/* Note that an ENOMEM error is charged twice */
+		if (rc == -ENOMEM)
+			UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_RCVBUFERRORS,
+					 is_udplite);
+		goto drop;
+	}
+
+	return 0;
+
+drop:
+	UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
+	kfree_skb(skb);
+	return -1;
+}
+
 /* returns:
  *  -1: error
  *   0: success
@@ -989,9 +1010,7 @@ int udp_queue_rcv_skb(struct sock * sk, struct sk_buff *skb)
 		    up->encap_rcv != NULL) {
 			int ret;
 
-			bh_unlock_sock(sk);
 			ret = (*up->encap_rcv)(sk, skb);
-			bh_lock_sock(sk);
 			if (ret <= 0) {
 				UDP_INC_STATS_BH(sock_net(sk),
 						 UDP_MIB_INDATAGRAMS,
@@ -1044,17 +1063,16 @@ int udp_queue_rcv_skb(struct sock * sk, struct sk_buff *skb)
 			goto drop;
 	}
 
-	if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) {
-		/* Note that an ENOMEM error is charged twice */
-		if (rc == -ENOMEM) {
-			UDP_INC_STATS_BH(sock_net(sk),
-					UDP_MIB_RCVBUFERRORS, is_udplite);
-			atomic_inc(&sk->sk_drops);
-		}
-		goto drop;
-	}
+	rc = 0;
 
-	return 0;
+	bh_lock_sock(sk);
+	if (!sock_owned_by_user(sk))
+		rc = __udp_queue_rcv_skb(sk, skb);
+	else
+		sk_add_backlog(sk, skb);
+	bh_unlock_sock(sk);
+
+	return rc;
 
 drop:
 	UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
@@ -1092,15 +1110,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 				skb1 = skb_clone(skb, GFP_ATOMIC);
 
 			if (skb1) {
-				int ret = 0;
-
-				bh_lock_sock(sk);
-				if (!sock_owned_by_user(sk))
-					ret = udp_queue_rcv_skb(sk, skb1);
-				else
-					sk_add_backlog(sk, skb1);
-				bh_unlock_sock(sk);
-
+				int ret = udp_queue_rcv_skb(sk, skb1);
 				if (ret > 0)
 					/* we should probably re-process instead
 					 * of dropping packets here. */
@@ -1195,13 +1205,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 			uh->dest, inet_iif(skb), udptable);
 
 	if (sk != NULL) {
-		int ret = 0;
-		bh_lock_sock(sk);
-		if (!sock_owned_by_user(sk))
-			ret = udp_queue_rcv_skb(sk, skb);
-		else
-			sk_add_backlog(sk, skb);
-		bh_unlock_sock(sk);
+		int ret = udp_queue_rcv_skb(sk, skb);
 		sock_put(sk);
 
 		/* a return value > 0 means to resubmit the input, but
@@ -1494,7 +1498,7 @@ struct proto udp_prot = {
 	.sendmsg	   = udp_sendmsg,
 	.recvmsg	   = udp_recvmsg,
 	.sendpage	   = udp_sendpage,
-	.backlog_rcv	   = udp_queue_rcv_skb,
+	.backlog_rcv	   = __udp_queue_rcv_skb,
 	.hash		   = udp_lib_hash,
 	.unhash		   = udp_lib_unhash,
 	.get_port	   = udp_v4_get_port,

Cheers,
Comment 3 Thomas Jarosch 2008-09-15 11:43:25 UTC
Thanks, I will test your new version right away and report back. 

I manually reverted the locking fix from 2.6.26.4 and my stress-test of 25 ipsec tunnels is running for hours now. Let's see what happens with the new patch :-)
Comment 4 David S. Miller 2008-09-15 11:51:05 UTC
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 15 Sep 2008 11:34:20 -0700

> udp: Fix rcv socket locking
> 
> The previous patch in response to the recursive locking on IPsec
> reception is broken as it tries to drop the BH socket lock while in
> user context.
> 
> This patch fixes it by shrinking the section protected by the
> socket lock to sock_queue_rcv_skb only.  The only reason we added
> the lock is for the accounting which happens in that function.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, I'll have to queue this up to -stable as well.

Thanks!
Comment 5 Thomas Jarosch 2008-09-15 13:32:46 UTC
                              ,-._.-._.-._.-._.-.
                              `-.             ,-'
 .----------------------.       |             |
|                        |      |             |
| Thanks, the new patch  |      |             |
| is working perfectly!  |      |             |
|                        |     ,';".________.-.
| I'll put it on the     |     ;';_'         )]
| production box now.    |    ;             `-|
|                        `.    `T-            |
 `----------------------._ \    |             |
                          `-;   |             |
                                |..________..-|
                               /\/ |________..|
                          ,'`./  >,(           |
                          \_.-|_/,-/   ii  |   |
                           `."' `-/  .-"""||    |
                            /`^"-;   |    ||____|
                           /     /   `.__/  | ||
                                /           | ||
                                            | ||

Note You need to log in before you can comment on or make changes to this bug.