Bug 8895 (fib6_clean)
Summary: | An ioctl to delete an ipv6 tunnel leads to a kernel panic | |
---|---|---|---
Product: | Networking | Reporter: | Vincent Perrier (clowncoder)
Component: | IPV6 | Assignee: | Hideaki YOSHIFUJI (yoshfuji)
Status: | RESOLVED CODE_FIX | |
Severity: | normal | CC: | alan, clowncoder, dassanjib.in, davem, protasnb, qmiao, zhangwf
Priority: | P1 | |
Hardware: | All | |
OS: | Linux | |
Kernel Version: | 2.6.22.3 and also 2.6.21.5 | Subsystem: |
Regression: | No | Bisected commit-id: |
Attachments: | patch of the printk and dump of traces | |
Description
Vincent Perrier
2007-08-16 12:31:09 UTC
Reply-To: akpm@linux-foundation.org

On Thu, 16 Aug 2007 12:24:05 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8895
>
> Summary: An ioctl to delete an ipv6 tunnel leads to a kernel panic
> Product: Networking
> Version: 2.5
> KernelVersion: 2.6.22.3 and also 2.6.21.5
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: IPV6
> AssignedTo: yoshfuji@linux-ipv6.org
> ReportedBy: clowncoder@clownix.net
>
>
> Most recent kernel where this bug did not occur: ?
> Distribution: lfs and fedora
> Hardware Environment: user mode linux and vmware
> Software Environment: an evolution of mip6d (ip mobility daemon)
> Problem Description: The mip6d HA was modified to make a redundancy evolution,
> when an HA is interrupted, the other takes over, this leads to some
> creation/deletion of routes and tunnels.
> Note: The HA ip address known by the mobile (MR) stays the same, the slave HA
> takes it with an override neighbor advertisement message. So the tunnel between
> the mobile router and the HA(s) keeps the same end addresses.
> The problem occurs when a Ctrl C is done on the master HA, the slave takes over
> but sometimes, the master gets a kernel panic.
>
> Here is the dump of the master:
>
> ICMPv6 NA: someone advertises our address on eth1!
> Slab corruption: ip6_dst_cache start=0867ed00, len=224
> Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
> Last user: [<08157c46>](dst_destroy+0x79/0xad)
> 0a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6c 6b 6b 6b
> Prev obj: start=0867ec08, len=224
> Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
> Last user: [<08157b05>](dst_alloc+0x26/0x62)
> 000: 00 00 00 00 00 00 00 00 00 00 00 00 40 41 6f 08
> 010: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
> Next obj: start=0867edf8, len=224
> Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
> Last user: [<08157b05>](dst_alloc+0x26/0x62)
> 000: 00 00 00 00 00 00 00 00 00 00 00 00 60 41 99 0b
> 010: 00 00 ff ff 00 00 00 00 7d df ff ff 00 00 00 00
> BUG: failure at net/ipv6/ip6_fib.c:1151/fib6_del_route()!
> Kernel panic - not syncing: BUG!
>
> EIP: 0073:[<080e10b4>] CPU: 0 Not tainted ESP: 007b:bf6d0398 EFLAGS: 00000246
> Not tainted
> EAX: ffffffda EBX: 00000006 ECX: 000089f2 EDX: bf6d0428
> ESI: 00000000 EDI: 0815c150 EBP: bf6d0458 DS: 007b ES: 007b
> 08a37ae4: [<0806ba80>] show_regs+0xb4/0xb9
> 08a37b10: [<0805a044>] panic_exit+0x25/0x3f
> 08a37b24: [<0807b088>] notifier_call_chain+0x21/0x46
> 08a37b44: [<0807b123>] __atomic_notifier_call_chain+0x17/0x19
> 08a37b60: [<0807b13a>] atomic_notifier_call_chain+0x15/0x17
> 08a37b7c: [<0806fff6>] panic+0x52/0xdd
> 08a37b9c: [<081bb8d2>] fib6_del_route+0x112/0x175
> 08a37bc0: [<081bb9c6>] fib6_del+0x91/0xcc
> 08a37bdc: [<081bbba8>] fib6_clean_node+0x26/0x73
> 08a37bf4: [<081bba8a>] fib6_walk_continue+0x89/0x11f
> 08a37c04: [<081bbb57>] fib6_walk+0x37/0x62
> 08a37c18: [<081bbc23>] fib6_clean_tree+0x2e/0x31
> 08a37c4c: [<081bbc83>] fib6_prune_clones+0x15/0x1a
> 08a37c64: [<081bb9de>] fib6_del+0xa9/0xcc
> 08a37c7c: [<081bbba8>] fib6_clean_node+0x26/0x73
> 08a37c94: [<081bba8a>] fib6_walk_continue+0x89/0x11f
> 08a37ca4: [<081bbb57>] fib6_walk+0x37/0x62
> 08a37cb8: [<081bbc23>] fib6_clean_tree+0x2e/0x31
> 08a37cec: [<081bbc51>] fib6_clean_all+0x2b/0x48
> 08a37d10: [<081b9d15>] rt6_ifdown+0x12/0x17
> 08a37d24: [<081b56e3>] addrconf_ifdown+0x54/0x275
> 08a37d40: [<081b562d>] addrconf_notify+0x18a/0x1ec
> 08a37d5c: [<0807b088>] notifier_call_chain+0x21/0x46
> 08a37d7c: [<0807b257>] __raw_notifier_call_chain+0x17/0x19
> 08a37d98: [<0807b26e>] raw_notifier_call_chain+0x15/0x17
> 08a37db4: [<08153c18>] dev_close+0x5e/0x68
> 08a37dcc: [<0815619e>] unregister_netdevice+0xb7/0x1bc
> 08a37ddc: [<081d75d7>] ip6_tnl_ioctl+0x1a9/0x1d2
> 08a37e34: [<0815578c>] dev_ifsioc+0x3b9/0x3d9
> 08a37e54: [<08155a71>] dev_ioctl+0x2c5/0x300
> 08a37e9c: [<0814b435>] sock_ioctl+0x230/0x243
> 08a37ebc: [<080b0801>] do_ioctl+0x21/0x5a
> 08a37ed8: [<080b0ba8>] vfs_ioctl+0x1ec/0x209
> 08a37f00: [<080b0bf3>] sys_ioctl+0x2e/0x4b
> 08a37f28: [<0805a7ae>] handle_syscall+0x86/0xa0
> 08a37f74: [<08068d00>] handle_trap+0xd8/0xe1
> 08a37f90: [<080690f3>] userspace+0x138/0x180
> 08a37fdc: [<0805a4d1>] fork_handler+0x74/0x7c
> 08a37ffc: [<a55a5a5a>] 0xa55a5a5a
>
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xb7e58761 in abort () from /lib/tls/i686/cmov/libc.so.6
> (gdb)
>
>
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xb7e58761 in abort () from /lib/tls/i686/cmov/libc.so.6
> (gdb) bt
> #0 0xb7e58761 in abort () from /lib/tls/i686/cmov/libc.so.6
> #1 0x080676df in os_dump_core () at arch/um/os-Linux/util.c:109
> #2 0x0805a05a in panic_exit (self=0x825d674, unused1=0, unused2=0x8277ee0) at arch/um/kernel/um_arch.c:477
> #3 0x0807b088 in notifier_call_chain (nl=0x8277ec0, val=0, v=0x8277ee0, nr_to_call=-2, nr_calls=0x0) at kernel/sys.c:163
> #4 0x0807b123 in __atomic_notifier_call_chain (nh=0x8277ec0, val=0, v=0x8277ee0, nr_to_call=-1, nr_calls=0x0) at kernel/sys.c:256
> #5 0x0807b13a in atomic_notifier_call_chain (nh=0x8277ec0, val=0, v=0x8277ee0) at kernel/sys.c:266
> #6 0x0806fff6 in panic (fmt=0x8217b25 "BUG!") at kernel/panic.c:99
> #7 0x081bb8d2 in fib6_del_route (fn=0x0, rtp=0x8abd568, info=0x0) at net/ipv6/ip6_fib.c:1151
> #8 0x081bb9c6 in fib6_del (rt=0x867ed00, info=0x0) at net/ipv6/ip6_fib.c:1193
> #9 0x081bbba8 in fib6_clean_node (w=0x8a37c20) at net/ipv6/ip6_fib.c:1322
> #10 0x081bba8a in fib6_walk_continue (w=0x8a37c20) at net/ipv6/ip6_fib.c:1264
> #11 0x081bbb57 in fib6_walk (w=0x8a37c20) at net/ipv6/ip6_fib.c:1306
> #12 0x081bbc23 in fib6_clean_tree (root=0x8abd440, func=0x81bbc88 <fib6_prune_clone>, prune=1, arg=0x867edf8) at net/ipv6/ip6_fib.c:1360
> #13 0x081bbc83 in fib6_prune_clones (fn=0x8abd440, rt=0x867edf8) at net/ipv6/ip6_fib.c:1394
> #14 0x081bb9de in fib6_del (rt=0x867edf8, info=0x0) at net/ipv6/ip6_fib.c:1184
> #15 0x081bbba8 in fib6_clean_node (w=0x8a37cc0) at net/ipv6/ip6_fib.c:1322
> #16 0x081bba8a in fib6_walk_continue (w=0x8a37cc0) at net/ipv6/ip6_fib.c:1264
> #17 0x081bbb57 in fib6_walk (w=0x8a37cc0) at net/ipv6/ip6_fib.c:1306
> #18 0x081bbc23 in fib6_clean_tree (root=0x8272dac, func=0x81b9ce2 <fib6_ifdown>, prune=0, arg=0xb994160) at net/ipv6/ip6_fib.c:1360
> #19 0x081bbc51 in fib6_clean_all (func=0x81b9ce2 <fib6_ifdown>, prune=0, arg=0xb994160) at net/ipv6/ip6_fib.c:1372
> #20 0x081b9d15 in rt6_ifdown (dev=0xb994160) at net/ipv6/route.c:1944
> #21 0x081b56e3 in addrconf_ifdown (dev=0xb994160, how=0) at net/ipv6/addrconf.c:2400
> #22 0x081b562d in addrconf_notify (this=0x82721c4, event=2, data=0xb994160) at net/ipv6/addrconf.c:2358
> #23 0x0807b088 in notifier_call_chain (nl=0x8283e94, val=2, v=0xb994160, nr_to_call=-10, nr_calls=0x0) at kernel/sys.c:163
> #24 0x0807b257 in __raw_notifier_call_chain (nh=0x8283e94, val=2, v=0xb994160, nr_to_call=-1, nr_calls=0x0) at kernel/sys.c:451
> #25 0x0807b26e in raw_notifier_call_chain (nh=0x8283e94, val=2, v=0xb994160) at kernel/sys.c:459
> #26 0x08153c18 in dev_close (dev=0xb994160) at net/core/dev.c:1015
> #27 0x0815619e in unregister_netdevice (dev=0xb994160) at net/core/dev.c:3451
> #28 0x081d75d7 in ip6_tnl_ioctl (dev=0xb994160, ifr=0x8a37e6c, cmd=35314) at net/ipv6/ip6_tunnel.c:1266
> #29 0x0815578c in dev_ifsioc (ifr=0x8a37e6c, cmd=35314) at net/core/dev.c:2816
> #30 0x08155a71 in dev_ioctl (cmd=35314, arg=0xbf6d0428) at net/core/dev.c:2995
> #31 0x0814b435 in sock_ioctl (file=0x832a348, cmd=35314, arg=3211592744) at net/socket.c:909
> #32 0x080b0801 in do_ioctl (filp=0x16, cmd=35314, arg=3211592744) at fs/ioctl.c:30
> #33 0x080b0ba8 in vfs_ioctl (filp=0x832a348, fd=6, cmd=6, arg=3211592744) at fs/ioctl.c:159
> #34 0x080b0bf3 in sys_ioctl (fd=6, cmd=35314, arg=3211592744) at fs/ioctl.c:179
> #35 0x0805a7ae in handle_syscall (r=0x867a894) at arch/um/kernel/skas/syscall.c:38
> #36 0x08068d00 in handle_trap (pid=10640, regs=0x867a894, local_using_sysemu=2) at arch/um/os-Linux/skas/process.c:173
> #37 0x080690f3 in userspace (regs=0x867a894) at arch/um/os-Linux/skas/process.c:330
> #38 0x0805a4d1 in fork_handler () at arch/um/kernel/skas/process.c:96
> #39 0xa55a5a5a in ?? ()
> (gdb)
>
>
>
> Steps to reproduce:
>
>

I did another test: modification of file ip6_fib.c:

static void fib6_del_route(struct fib6_node *fn, struct rt6_info **rtp,
                           struct nl_info *info)
{
        . . .
        printk("0 ATOMIC %d\n", atomic_read(&rt->rt6i_ref));
        if (atomic_read(&rt->rt6i_ref) != 1) {
                printk("1 ATOMIC %d\n", atomic_read(&rt->rt6i_ref));
                /* This route is used as dummy address holder in some split
                 * nodes. It is not leaked, but it still holds other resources,
                 * which must be released in time. So, scan ascendant nodes
                 * and replace dummy references to this route with references
                 * to still alive ones.
                 */
                while (fn) {
                        if (!(fn->fn_flags&RTN_RTINFO) && fn->leaf == rt) {
                                fn->leaf = fib6_find_prefix(fn);
                                atomic_inc(&fn->leaf->rt6i_ref);
                                rt6_release(rt);
                        }
                        fn = fn->parent;
                }
                /* No more references are possible at this point. */
                if (atomic_read(&rt->rt6i_ref) != 1)
                        printk("2 ATOMIC %d", atomic_read(&rt->rt6i_ref));
        . . .

Result in the console:

. . .
0 ATOMIC 1
0 ATOMIC 1
0 ATOMIC 1
0 ATOMIC 1
Slab corruption: ip6_dst_cache start=08506160, len=224
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<08157c46>](dst_destroy+0x79/0xad)
0a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6c 6b 6b 6b
Prev obj: start=08506068, len=224
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<08157b05>](dst_alloc+0x26/0x62)
000: 00 00 00 00 00 00 00 00 00 00 00 00 40 b4 26 08
010: 00 00 ff ff 01 00 00 00 00 00 00 00 00 00 00 00
Next obj: start=08506258, len=224
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<08157c46>](dst_destroy+0x79/0xad)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
0 ATOMIC 1
0 ATOMIC 1
0 ATOMIC 1
0 ATOMIC 1802201963
1 ATOMIC 1802201963
2 ATOMIC 1802201963

Program received signal SIGSEGV, Segmentation fault.
rt6_fill_node (skb=0x8b1e7e8, rt=0x8506160, dst=0x0, src=0x0, iif=0, type=25, pid=0, seq=0, prefix=0, flags=0) at net/ipv6/route.c:2145
2145 table = rt->rt6i_table->tb6_id;
(gdb) quit

We have also encountered this problem, which happens randomly, and it also exists on 2.6.18. Our log is somewhat different, but we believe the cause is the same.

kernel BUG in fib6_del_route at /sw/release/dev/autorel/platform/linux/net/ipv6/ip6_fib.c:865!
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT
NIP: C01C59E0 LR: C01C588C CTR: C01C64EC
REGS: cc5299f0 TRAP: 0700 Tainted: P (2.6.18-ctc8247)
MSR: 00029032 <EE,ME,IR,DR> CR: 44008824 XER: 00000000
TASK = cfbc47b0[700] 'nsm' THREAD: cc528000
GPR00: 00000002 CC529AA0 CFBC47B0 CFAC9380 C0222614 C022252C 0000040B 00000008
GPR08: C0252910 00000004 00000102 C0252910 24000822 102B6330 0FFFF000 00000000
GPR16: 00000001 FFFFFFFF 00000000 7F8207D0 C0250000 C84C7E30 C8626080 00000000
GPR24: C0290000 C0250000 CC528000 C0252910 C02A8C90 CDFEEDA0 CDFEEE44 00000000
NIP [C01C59E0] fib6_del+0x214/0x610
LR [C01C588C] fib6_del+0xc0/0x610
Call Trace:
[CC529AA0] [C01C588C] fib6_del+0xc0/0x610 (unreliable)
[CC529AE0] [C01C32E8] ip6_route_del+0x12c/0x1d4
[CC529B10] [C01C42C0] inet6_rtm_delroute+0x50/0x6c
[CC529B90] [C0166320] rtnetlink_rcv_msg+0x194/0x250
[CC529BC0] [C016C924] netlink_run_queue+0xd8/0x17c
[CC529BF0] [C0166418] rtnetlink_rcv+0x3c/0x68
[CC529C10] [C016BD20] netlink_data_ready+0x70/0xcc
[CC529C20] [C016ADAC] netlink_sendskb+0x34/0x88
[CC529C40] [C016BC34] netlink_sendmsg+0x270/0x2ec
[CC529CB0] [C014D504] sock_sendmsg+0xac/0xf4
[CC529DB0] [C014F36C] sys_sendmsg+0x1d0/0x26c
[CC529F00] [C014F7F4] sys_socketcall+0x1d8/0x1dc
[CC529F40] [C00042A0] ret_from_syscall+0x0/0x38
Instruction dump:
70090004 40820014 801f0010 7fe3fb78 7f80e800 419e0074 83ff0000 2f9f0000
409effdc 801d00a4 2f800001 419e0008 <0fe00000> 7ec5b378 7ea6ab78 38600019
Kernel panic - not syncing: Aiee, killing interrupt handler!
<0>Rebooting in 1 seconds..

Do you use a vanilla kernel? Is there any trivial way to reproduce this?

The bug is caused by a race condition between deleting an ipv6 address (process context) and the dad timer (softirq context). The execution sequence is as follows:

1. (process context) add ipv6 address => new ifp => new ifp->rt => start dad timer
   ifp->rt is not inserted into the fib6 tree at this point; it will be inserted by addrconf_dad_completed()
2. (process context) delete ipv6 address => dst_free(ifp->rt) => ifp->rt->u.dst is queued on dst_garbage_list
   addrconf.c:__ipv6_ifa_notify():3564: dst_free(&ifp->rt->u.dst)
3. (softirq context) dad timer expired => addrconf_dad_completed() inserts ifp->rt into the fib6 tree
4. (softirq context) dst_gc timer expired => dst_run_gc() frees ifp->rt->u.dst
5. (process context) shutdown interface => fib6_clean_tree() => fib6_walk() => accesses the already freed rt6_info

Solution to fix the bug: delete the dad timer before deleting the ipv6 address, instead of after it (addrconf.c:ipv6_del_addr()).

Hello,
I tried deleting the dad timer before deleting the address; it is not this bug. I have put in lots of printks to see more, and here is what I found. I sometimes had the following call tree:

ip6_route_add
        ...
        fib6_add
                ...
                if (fn->leaf == NULL) {
                        fn->leaf = rt;
                        atomic_inc(&rt->rt6i_ref);
                }
                ...
                err = fib6_add_rt2node(fn, rt, info);
                ... (Here err was not null) ...
                if (err) {
                        ...
                        dst_free(&rt->u.dst);
                        ...
                }

It seems that this dst_free is not good at this point: it decrements rt6i_ref, but the address is still used. I do not understand exactly what happens; the following patch hides my problem, but certainly does not solve it.
diff -Naur linux-2.6.22.5/net/ipv6/ip6_fib.c clownix_linux-2.6.22.5/net/ipv6/ip6_fib.c
--- linux-2.6.22.5/net/ipv6/ip6_fib.c 2007-08-23 01:23:54.000000000 +0200
+++ clownix_linux-2.6.22.5/net/ipv6/ip6_fib.c 2007-08-29 13:10:35.000000000 +0200
@@ -696,6 +696,7 @@
 {
         struct fib6_node *fn, *pn = NULL;
         int err = -ENOMEM;
+        int bug_8895_clownix_provisional_workaround = 0;
         fn = fib6_add_1(root, &rt->rt6i_dst.addr, sizeof(struct in6_addr),
                         rt->rt6i_dst.plen, offsetof(struct rt6_info, rt6i_dst));
@@ -760,6 +761,7 @@
                 }
                 if (fn->leaf == NULL) {
+                        bug_8895_clownix_provisional_workaround = 1;
                         fn->leaf = rt;
                         atomic_inc(&rt->rt6i_ref);
                 }
@@ -793,7 +795,8 @@
                         atomic_inc(&pn->leaf->rt6i_ref);
                 }
 #endif
-                dst_free(&rt->u.dst);
+                if (!bug_8895_clownix_provisional_workaround)
+                        dst_free(&rt->u.dst);
         }
         return err;

My kernel version is 2.6.18 and CONFIG_IPV6_SUBTREES is not enabled. Can you explain or find out why fib6_add_rt2node returns an error (-EEXIST)?

Your bug is not the same as mine. I had a kernel crash approximately every 5 Ctrl-C of the user software, and with my patch (which does not correct the problem in depth) I can do a Ctrl-C every 10 seconds all day. The EEXIST error can be caused by mistakes in the user software or anything else, I don't know. But I went through the following error path:

for (iter = fn->leaf; iter; iter=iter->u.dst.rt6_next) {
        /*
         * Search for duplicates
         */
        if (iter->rt6i_metric == rt->rt6i_metric) {
                /*
                 * Same priority level
                 */
                if (iter->rt6i_dev == rt->rt6i_dev &&
                    iter->rt6i_idev == rt->rt6i_idev &&
                    ipv6_addr_equal(&iter->rt6i_gateway,
                                    &rt->rt6i_gateway)) {
                        if (!(iter->rt6i_flags&RTF_EXPIRES))
THIS IS WHERE I RETURNED ---------->    return -EEXIST;
                        iter->rt6i_expires = rt->rt6i_expires;
                        if (!(rt->rt6i_flags&RTF_EXPIRES)) {
                                iter->rt6i_flags &= ~RTF_EXPIRES;
                                iter->rt6i_expires = 0;
                        }
                        return -EEXIST;
                }
        }

fib6_add
        ...
        if (fn->leaf == NULL) {
                fn->leaf = rt;                  <--**-- rt is assigned to fn->leaf
                atomic_inc(&rt->rt6i_ref);
        }
        ...
        err = fib6_add_rt2node(fn, rt, info);   <-**- return -EEXIST
        ... (Here err was not null) ...
        if (err) {
                ...
                dst_free(&rt->u.dst);           <--**-- Actually rt is still in tree (fn->leaf = rt /* see above */)
                ...
        }

Yes, that is also what I think, but I have also tried setting fn->leaf to NULL and that did not work, because there are lots of other things to do to delete rt from the tree. So the kernel experts will have to find a solution to clean fn->leaf in case of an error in fib6_add_rt2node.

What happens afterwards: in my case, a call to ip_route_output (triggered by a message output) increments rt6i_ref again and the leaf lives its normal life, but the crash occurs long after that. The rt6i_ref is one too low, so the address is freed while there is still one use of it, and then the 0x6b6b6b values appear.

I still have not seen any bad effects caused by my simple patch, so everything is fine for me.

Thank you for the other bug; I think I may have seen it too, but I am not sure.

Created attachment 12639
patch of the printk and dump of traces
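A note for readers decoding the printk output above: the value 1802201963 printed for rt6i_ref is 0x6b6b6b6b, that is, four copies of the 0x6b byte that the slab debugger writes over freed objects (POISON_FREE in include/linux/poison.h). The small userspace sketch below only illustrates that check; the helper name looks_like_freed_slab is made up for this example and is not a kernel function.

```c
#include <stdint.h>
#include <string.h>

/* Byte written over freed slab objects when slab debugging is enabled
 * (POISON_FREE in include/linux/poison.h). */
#define POISON_FREE 0x6b

/* Hypothetical helper, for illustration only: returns 1 if every byte
 * of a 32-bit value is the free-poison byte, 0 otherwise. */
static int looks_like_freed_slab(uint32_t value)
{
    unsigned char bytes[sizeof(value)];
    size_t i;

    memcpy(bytes, &value, sizeof(value));
    for (i = 0; i < sizeof(bytes); i++) {
        if (bytes[i] != POISON_FREE)
            return 0;
    }
    return 1;
}
```

Calling looks_like_freed_slab(1802201963u) returns 1, while a sane reference count such as 1 returns 0. The single 6c byte in the "0a0:" line of the slab dump would be consistent with a poisoned 32-bit field having been incremented once after the object was freed, but that is an inference from the dump and not something stated in this thread.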
Why is this bug not corrected? It is old and completely clear: in file ip6_fib.c, line 796 of the vanilla kernel 2.6.23.11, the dst_free can cause a kernel crash, as qmiao wrote:

fib6_add
        ...
        if (fn->leaf == NULL) {
                fn->leaf = rt;                  <--**-- rt is assigned to fn->leaf
                atomic_inc(&rt->rt6i_ref);
        }
        ...
        err = fib6_add_rt2node(fn, rt, info);   <-**- return -EEXIST
        ... (Here err was not null) ...
        if (err) {
                ...
                dst_free(&rt->u.dst);           <--**-- Actually rt is still in tree (fn->leaf = rt /* see above */)
                ...
        }

It looks like the code in question is still there. I will forward this to netdev. (pinging DaveM)

Date: Fri Apr 18 01:46:19 2008 -0700

[IPV6]: Fix dangling references on error in fib6_add().

Fixes bugzilla #8895

Hi Alan,
It would be a great favour if you could point me to the fix patch for this issue.

No idea - we fixed it a decade ago
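For anyone reading this thread later, the dangling fn->leaf problem discussed above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct route, struct node and all model_* functions are hypothetical stand-ins, not the kernel's code, and the unwind variant is not the upstream fix (as noted earlier in the thread, simply clearing fn->leaf was not enough in the real tree).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for struct rt6_info and struct fib6_node. */
struct route {
    int tree_refs;  /* models rt6i_ref: references held by tree nodes */
    int freed;      /* set once the model has "freed" the route */
};

struct node {
    struct route *leaf;
};

/* Models dst_free(); assumption: the route is given up regardless of
 * how many tree references still point at it. */
static void model_dst_free(struct route *rt)
{
    rt->freed = 1;
}

/* Error path as quoted in the thread. insert_err stands in for the
 * return value of fib6_add_rt2node(). */
static int model_add_as_reported(struct node *fn, struct route *rt,
                                 int insert_err)
{
    assert(fn != NULL && rt != NULL);
    if (fn->leaf == NULL) {
        fn->leaf = rt;
        rt->tree_refs++;
    }
    if (insert_err) {
        model_dst_free(rt);     /* fn->leaf may still point at rt */
        return insert_err;
    }
    return 0;
}

/* Illustrative variant: undo the placeholder leaf before freeing. */
static int model_add_with_unwind(struct node *fn, struct route *rt,
                                 int insert_err)
{
    int took_leaf = 0;

    assert(fn != NULL && rt != NULL);
    if (fn->leaf == NULL) {
        fn->leaf = rt;
        rt->tree_refs++;
        took_leaf = 1;
    }
    if (insert_err) {
        if (took_leaf) {
            fn->leaf = NULL;    /* drop the reference taken above */
            rt->tree_refs--;
        }
        model_dst_free(rt);
        return insert_err;
    }
    return 0;
}

/* Returns 1 if the node still points at a route the model has freed. */
static int leaf_is_dangling(const struct node *fn)
{
    return fn->leaf != NULL && fn->leaf->freed;
}
```

With an empty node and insert_err set to -EEXIST, model_add_as_reported() leaves leaf_is_dangling() returning 1, which is the situation the slab poison traces point to; model_add_with_unwind() leaves it returning 0 with tree_refs back at 0. The point of the model is only the invariant: any reference taken before the failing insert has to be dropped before the route is freed.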