Bug 11571

Summary: u32_classify Kernel Panic
Product: Networking
Reporter: m0sia (m0sia)
Component: Other
Assignee: Arnaldo Carvalho de Melo (acme)
Status: RESOLVED CODE_FIX
Severity: high
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.26.5
Subsystem:
Regression: No
Bisected commit-id:
Attachments: 2.6.18-6-686 Kernel Panic
2.6.25.6 Kernel Panic

Description m0sia 2008-09-15 03:35:15 UTC
Distribution: Debian

Problem Description:
Kernel panic
[<c023f2d8>]    dev_queue_xmit+0x175/0x2a1
[<c0243861>]    neigh_resolve_output+0x1f8/0x21a
[<c025a784>]    ip_finish_output+0x1d7/0x200
[<c025aa2f>]    ip_output+0x6f/0x81
[<c0258218>]    ip_forward_finish+0x2c/0x2e
[<c0257223>]    ip_rcv_finish+0x263/0x27f
[<c023cc62>]    netif_receive_skb+0x2c1/0x32b
[<f886f26d>]    e1000_clean_rx_irq+0x395/0x46f [e1000]
[<f886f5f7>]    e1000_clean+0x52/0x1db [e1000]
[<c013e8e4>]    net_rx_action+0x8a/0x153
[<c0128bfa>]    __do_softirq+0x5d/0xc1
[<c0128c98>]    do_softirq+0x32/0x36
[<c0185cb8>]    do_IRQ+0x52/0x66
[<c01887fa>]    mwait_idle+0x8/0x32
[<c018418b>]    common_interrupt+0x23/0x28
[<c01887fa>]    mwait_idle+0xB/0x32
[<c0188829>]    mwait_idle+0x2f/0x32
[<c0182545>]    cpu_idle+0x88/0x9c


Code: 0c 8b 80 90 00 00 00 c7 44 24 14 00 00 00 00 c7 44 24 18 00 00 00 00 89 44 
24 18 8b 54 24 0c 8b 74 aa 18 85 f6 0f 84 a0 01 00 00 <8b> 46 38 83 00 01 83 50 
04 00 8b 4c 24 04 8b 46 38 23 81 88 00
EIP: [<f8bf3670>] u32_classify+0x41/0x23f [cls_u32] SS:ESP 8868:f746fd44
Kernel panic - not syncing: Fatal exception in interrupt

Steps to reproduce:

tc qdisc add dev eth1 root handle 1: htb 

tc class add dev eth1 parent 1: classid 1:1 htb rate 3600Kbit
tc class add dev eth1 parent 1:1 classid 1:11 htb rate 2800Kbit prio 0
tc class add dev eth1 parent 1:1 classid 1:15 htb rate 100Kbit ceil 2800Kbit prio 0
tc class add dev eth1 parent 1:1 classid 1:19 htb rate 100Kbit ceil 500Kbit prio 2


N from 10 to 2000
tc class add dev eth1 parent 1:{11,15,19} classid 1:$N htb rate 1Kbit ceil {$SPEED}Kbit
tc filter add dev eth1 parent 1: protocol ip pref $N u32 match ip dst $IP flowid 1:$N

Everything worked with N smaller than 2000. The problem first appeared with kernel 2.6.18-6-686 and is still present in 2.6.26.5
Comment 1 m0sia 2008-09-15 04:34:18 UTC
Created attachment 17782
2.6.18-6-686 Kernel Panic
Comment 2 m0sia 2008-09-15 04:34:50 UTC
Created attachment 17783
2.6.25.6 Kernel Panic
Comment 3 Anonymous Emailer 2008-09-16 09:20:11 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 15 Sep 2008 03:35:16 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11571
> 
>            Summary: u32_classify Kernel Panic
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.26.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: m0sia@plotinka.ru
> 
> 
> Distribution: Debian
> 
> Problem Description:
> Kernel panic
> [<c023f2d8>]    dev_queue_xmit+0x175/0x2a1
> [<c0243861>]    neigh_resolve_output+0x1f8/0x21a
> [<c025a784>]    ip_finish_output+0x1d7/0x200
> [<c025aa2f>]    ip_output+0x6f/0x81
> [<c0258218>]    ip_forward_finish+0x2c/0x2e
> [<c0257223>]    ip_rcv_finish+0x263/0x27f
> [<c023cc62>]    netif_receive_skb+0x2c1/0x32b
> [<f886f26d>]    e1000_clean_rx_irq+0x395/0x46f [e1000]
> [<f886f5f7>]    e1000_clean+0x52/0x1db [e1000]
> [<c013e8e4>]    net_rx_action+0x8a/0x153
> [<c0128bfa>]    __do_softirq+0x5d/0xc1
> [<c0128c98>]    do_softirq+0x32/0x36
> [<c0185cb8>]    do_IRQ+0x52/0x66
> [<c01887fa>]    mwait_idle+0x8/0x32
> [<c018418b>]    common_interrupt+0x23/0x28
> [<c01887fa>]    mwait_idle+0xB/0x32
> [<c0188829>]    mwait_idle+0x2f/0x32
> [<c0182545>]    cpu_idle+0x88/0x9c
> 
> 
> Code: 0c 8b 80 90 00 00 00 c7 44 24 14 00 00 00 00 c7 44 24 18 00 00 00 00 89 44
> 24 18 8b 54 24 0c 8b 74 aa 18 85 f6 0f 84 a0 01 00 00 <8b> 46 38 83 00 01 83 50
> 04 00 8b 4c 24 04 8b 46 38 23 81 88 00
> EIP: [<f8bf3670>] u32_classify+0x41/0x23f [cls_u32] SS:ESP 8868:f746fd44
> Kernel panic - not syncing: Fatal exception in interrupt
> 
> Steps to reproduce:
> 
> tc qdisc add dev eth1 root handle 1: htb 
> 
> tc class add dev eth1 parent 1: classid 1:1 htb rate 3600Kbit
> tc class add dev eth1 parent 1:1 classid 1:11 htb rate 2800Kbit prio 0
> tc class add dev eth1 parent 1:1 classid 1:15 htb rate 100Kbit ceil 2800Kbit prio 0
> tc class add dev eth1 parent 1:1 classid 1:19 htb rate 100Kbit ceil 500Kbit prio 2
> 
> 
> N from 10 to 2000
> tc class add dev eth1 parent 1:{11,15,19} classid 1:$N htb rate 1Kbit ceil {$SPEED}Kbit
> tc filter add dev eth1 parent 1: protocol ip pref $N u32 match ip dst $IP flowid 1:$N
> 
> Everything worked with N smaller than 2000. The problem first appeared with
> kernel 2.6.18-6-686 and is still present in 2.6.26.5
Comment 4 Jarek Poplawski 2008-09-17 12:38:34 UTC
Andrew Morton wrote, On 09/16/2008 06:15 PM:

> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Mon, 15 Sep 2008 03:35:16 -0700 (PDT) bugme-daemon@bugzilla.kernel.org
> wrote:
> 
>> http://bugzilla.kernel.org/show_bug.cgi?id=11571
>>
>>            Summary: u32_classify Kernel Panic
...

Could you add more details:
- .config
- gzipped cls_u32.o (compiled with CONFIG_DEBUG_INFO on)
- the first part of the OOPS, if possible
- the exact tc commands from your script

Does this happen while creating or deleting something and is this
easy to reproduce?

Thanks,
Jarek P.
Comment 5 Jarek Poplawski 2008-09-17 15:07:02 UTC
On Wed, Sep 17, 2008 at 09:38:32PM +0200, Jarek Poplawski wrote:
...
> >> http://bugzilla.kernel.org/show_bug.cgi?id=11571
> Does this happen while creating or deleting something and is this
> easy to reproduce?

If there happens to be any deleting going on, please try this patch.

Jarek P.

---

 net/sched/cls_u32.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 246f906..9912ad5 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -433,7 +433,9 @@ static int u32_delete(struct tcf_proto *tp, unsigned long arg)
 
 	if (ht->refcnt == 1) {
 		ht->refcnt--;
+		tcf_tree_lock(tp);
 		u32_destroy_hnode(tp, ht);
+		tcf_tree_unlock(tp);
 	} else {
 		return -EBUSY;
 	}
Comment 6 m0sia 2008-09-18 00:07:35 UTC
Jarek Poplawski wrote:
> Andrew Morton wrote, On 09/16/2008 06:15 PM:
>
>> (switched to email.  Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Mon, 15 Sep 2008 03:35:16 -0700 (PDT) bugme-daemon@bugzilla.kernel.org
>> wrote:
>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=11571
>>>
>>>            Summary: u32_classify Kernel Panic
> ...
>
> Could you add more details:
> - .config
> - gzipped cls_u32.o (compiled with CONFIG_DEBUG_INFO on)
> - the first part of the OOPS, if possible
> - the exact tc commands from your script
>
> Does this happen while creating or deleting something and is this
> easy to reproduce?
>
> Thanks,
> Jarek P.

It happens on a production server, so I can't experiment with this bug there.
Because of it, I'm now using the iptables MARK target and the fw mark filter
instead. I think it happens when filters are deleted or added (a script does
this automatically when a new user is added or a user's speed changes).
I'll try this patch on a test server.
Comment 7 Jarek Poplawski 2008-09-18 00:54:31 UTC
On Thu, Sep 18, 2008 at 01:05:28PM +0600, m0sia wrote:
...
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=11571
>>>>
>>>>            Summary: u32_classify Kernel Panic
...
> It happens on a production server, so I can't experiment with this bug there.
> Because of it, I'm now using the iptables MARK target and the fw mark filter
> instead. I think it happens when filters are deleted or added (a script does
> this automatically when a new user is added or a user's speed changes).
> I'll try this patch on a test server.

OK, no hurry. BTW, it looks like some traffic is needed in a qdisc
while its filters are modified to trigger this. Probably, turning off
CONFIG_CLS_U32_PERF can make this less visible. On the other hand,
turning on memory debugging: CONFIG_DEBUG_SLAB or CONFIG_SLUB_DEBUG_ON
should be helpful here.

Jarek P.
Comment 8 Jarek Poplawski 2009-01-05 05:53:26 UTC
> On Thu, Sep 18, 2008 at 01:05:28PM +0600, m0sia wrote:
> ...
> >>>> http://bugzilla.kernel.org/show_bug.cgi?id=11571
...
(take 2)

It seems it could be hard to test whether this patch fixes this bug,
but IMHO it's quite probable that it does, and the patch is needed anyway.

Jarek P.

----------------->
pkt_sched: cls_u32: Fix locking in u32_change()

New nodes are inserted in u32_change() under rtnl_lock() with wmb(),
so without tcf_tree_lock() like in other classifiers (e.g. cls_fw).
This isn't enough without rmb() on the read side, but on the other
hand adding such barriers doesn't give any savings, so the lock is
added instead.

Reported-by: m0sia <m0sia@plotinka.ru>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 net/sched/cls_u32.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 05d1780..07372f6 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -638,8 +638,9 @@ static int u32_change(struct tcf_proto *tp, unsigned long base, u32 handle,
 				break;
 
 		n->next = *ins;
-		wmb();
+		tcf_tree_lock(tp);
 		*ins = n;
+		tcf_tree_unlock(tp);
 
 		*arg = (unsigned long)n;
 		return 0;
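
For readers who want to experiment with the idea outside the kernel, here is a rough userspace sketch of the locking pattern the changelog above describes. It is not the kernel code: struct node, struct list, cfg_lock, tree_lock, list_init(), list_insert() and list_lookup() are all hypothetical names, with one pthread mutex standing in for rtnl_lock() and another for tcf_tree_lock(tp). It also assumes that the reader walks the list while holding the same tree lock, which is what makes the lock sufficient without any explicit wmb()/rmb() pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a filter node; not the kernel's struct tc_u_knode. */
struct node {
	unsigned int handle;
	struct node *next;
};

struct list {
	pthread_mutex_t cfg_lock;   /* assumption: plays the role of rtnl_lock() */
	pthread_mutex_t tree_lock;  /* assumption: plays the role of tcf_tree_lock(tp) */
	struct node *head;
};

void list_init(struct list *l)
{
	pthread_mutex_init(&l->cfg_lock, NULL);
	pthread_mutex_init(&l->tree_lock, NULL);
	l->head = NULL;
}

/*
 * Writer: writers are serialized by cfg_lock, so the search for the
 * insert position can run without tree_lock. Only the single store
 * that makes n reachable is done under tree_lock.
 */
void list_insert(struct list *l, struct node *n)
{
	struct node **ins;

	assert(n != NULL);
	pthread_mutex_lock(&l->cfg_lock);
	for (ins = &l->head; *ins; ins = &(*ins)->next)
		if (n->handle < (*ins)->handle)
			break;

	n->next = *ins;                 /* finish initializing n first */
	pthread_mutex_lock(&l->tree_lock);
	*ins = n;                       /* publish: readers may see n from here on */
	pthread_mutex_unlock(&l->tree_lock);
	pthread_mutex_unlock(&l->cfg_lock);
}

/*
 * Reader: walks the list only while holding tree_lock, so it sees a
 * node either fully linked in or not at all.
 */
struct node *list_lookup(struct list *l, unsigned int handle)
{
	struct node *n;

	pthread_mutex_lock(&l->tree_lock);
	for (n = l->head; n; n = n->next)
		if (n->handle == handle)
			break;
	pthread_mutex_unlock(&l->tree_lock);
	return n;
}
```

The sketch never frees nodes, so returning a pointer after dropping tree_lock is safe here; with deletion in the picture the reader would have to finish using the node before unlocking. Building it with something like `cc -pthread` should be enough.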
Comment 9 David S. Miller 2009-01-05 18:15:21 UTC
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Mon, 5 Jan 2009 13:52:45 +0000

> pkt_sched: cls_u32: Fix locking in u32_change()
> 
> New nodes are inserted in u32_change() under rtnl_lock() with wmb(),
> so without tcf_tree_lock() like in other classifiers (e.g. cls_fw).
> This isn't enough without rmb() on the read side, but on the other
> hand adding such barriers doesn't give any savings, so the lock is
> added instead.
> 
> Reported-by: m0sia <m0sia@plotinka.ru>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Applied and queued up for -stable, thanks Jarek.
Comment 10 Jarek Poplawski 2009-03-12 11:15:40 UTC
For the record: it was acknowledged here:
http://bugzilla.kernel.org/show_bug.cgi?id=12858#c3
that the last patch really fixed the bug, so this report can be closed.

Jarek P.