Bug 70681

Summary: broadcast gre causes oops
Product: Networking Reporter: Andreas Steinmetz (ast)
Component: IPV4 Assignee: Stephen Hemminger (stephen)
Status: NEW ---    
Severity: normal CC: alex.zeffertt, lucien.xin
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.13.2 Subsystem:
Regression: No Bisected commit-id:

Description Andreas Steinmetz 2014-02-16 22:17:34 UTC
I was trying to use broadcast (ahem, multicast) GRE. This repeatably results in an Oops and a kernel panic:

htpc2 ~ # ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 4074
        inet 10.1.9.61  netmask 255.255.255.0  broadcast 10.1.9.255
        inet6 2001:a60:10b3:c00:201:c0ff:fe13:db43  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::201:c0ff:fe13:db43  prefixlen 64  scopeid 0x20<link>
        inet6 fdf2:e35b:1a0e:2c28:201:c0ff:fe13:db43  prefixlen 64  scopeid 0x0<global>
        inet6 fdf2:e35b:1a0e:2c28::61  prefixlen 64  scopeid 0x0<global>
        ether 00:01:c0:13:db:43  txqueuelen 1000  (Ethernet)
        RX packets 1943144  bytes 403477948 (384.7 MiB)
        RX errors 0  dropped 65686  overruns 0  frame 0
        TX packets 2135358  bytes 367947113 (350.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20  memory 0xe0700000-e0720000  

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 264059  bytes 16600869 (15.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 264059  bytes 16600869 (15.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

htpc2 ~ # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.1.9.1        0.0.0.0         UG    2      0        0 eth0
10.1.9.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
127.0.0.0       127.0.0.1       255.0.0.0       UG    0      0        0 lo
224.0.0.0       0.0.0.0         240.0.0.0       U     10     0        0 eth0
htpc2 ~ # ip tunnel add test mode gre local 10.1.9.61 remote 224.66.66.66 ttl 16
htpc2 ~ # ip addr add 10.0.0.1/24 dev test
htpc2 ~ # ip link set test up



This results instantly in the following Oops (from /dev/pstore):

Oops#1 Part1
<4>R13: ffffffff816569c0 R14: ffff88043e250008 R15: ffffffff8166fc40
<4>FS:  0000000000000000(0000) GS:ffff88043e240000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>CR2: 00000000000000a2 CR3: 0000000410f41000 CR4: 00000000001407a0
<4>Stack:
<4> ffff880417fb2840 ffff88042c88a1c0 ffff88043e243dcc ffff8800a1644800
<4> ffffffff816569c0 ffffffffa016b33b 0000000000000012 ffff88042c88a1c0
<4> 0000000000000000 ffffffff816569c0 ffff88042d674000 ffffffffa015c4ac
<4>Call Trace:
<4> <IRQ> 
<4> [<ffffffffa016b33b>] ? ipgre_rcv+0xb4/0xc5 [ip_gre]
<4> [<ffffffffa015c4ac>] ? gre_cisco_rcv+0x3b/0x89 [gre]
<4> [<ffffffffa015c10b>] ? gre_rcv+0x66/0x8e [gre]
<4> [<ffffffff81342222>] ? ip_local_deliver_finish+0x92/0xfc
<4> [<ffffffff81313374>] ? __netif_receive_skb_core+0x612/0x6a5
<4> [<ffffffff813134e8>] ? process_backlog+0x8a/0x140
<4> [<ffffffff81313825>] ? net_rx_action+0xa5/0x1e4
<4> [<ffffffff81040f2e>] ? __do_softirq+0xf1/0x26d
<4> [<ffffffff8104128a>] ? irq_exit+0x35/0x7a
<4> [<ffffffff81020895>] ? smp_apic_timer_interrupt+0x3b/0x46
<4> [<ffffffff813ea60a>] ? apic_timer_interrupt+0x6a/0x70
<4> <EOI> 
<4> [<ffffffff812ee4dd>] ? cpuidle_enter_state+0x43/0xa6
<4> [<ffffffff812ee4d6>] ? cpuidle_enter_state+0x3c/0xa6
<4> [<ffffffff812ee649>] ? cpuidle_idle_call+0x109/0x1e3
<4> [<ffffffff81009b4d>] ? arch_cpu_idle+0x7/0x1a
<4> [<ffffffff8106f2ec>] ? cpu_startup_entry+0x133/0x206
<4>Code: 89 f3 50 44 0f b7 a6 ae 00 00 00 4c 03 a6 c0 00 00 00 41 8b 44 24 10 25 f0 00 00 00 3d e0 00 00 00 75 2c 48 8b 46 58 48 83 e0 fe <80> b8 a2 00 00 00 00 0f 84 53 03 00 00 48 8b 47 18 48 ff 80 48 
<1>RIP  [<ffffffff8137954f>] ip_tunnel_rcv+0x35/0x3d7
<4> RSP <ffff88043e243d68>
<4>CR2: 00000000000000a2
<4>---[ end trace 6a1568a07dacad07 ]---
Oops#1 Part2
<6>gre: GRE over IPv4 demultiplexor driver
<6>ip_gre: GRE over IPv4 tunneling driver
<1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000a2
<1>IP: [<ffffffff8137954f>] ip_tunnel_rcv+0x35/0x3d7
<4>PGD 410f42067 PUD 410f43067 PMD 0 
<4>Oops: 0000 [#1] PREEMPT SMP 
<4>Modules linked in: ip_gre gre bnep autofs4 nls_iso8859_15 nls_cp850 vfat fat configfs uinput snd_aloop snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi snd_seq_device fuse tun hid_topseed hid_generic iwldvm led_class mac80211 usbhid cdc_acm snd_hda_codec_hdmi coretemp snd_hda_codec_realtek pcspkr iwlwifi cfg80211 btusb i915 8250_pci snd_hda_intel i2c_algo_bit intel_agp snd_hda_codec i2c_i801 intel_gtt r8169 snd_pcm drm_kms_helper snd_page_alloc rtc_cmos mii iTCO_wdt drm snd_timer snd 8250 agpgart soundcore serial_core bluetooth uhci_hcd ehci_pci ehci_hcd xhci_hcd usb_storage
<4>CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.2-gentoo-htpc2 #1
<4>Hardware name: CompuLab Intense-PC/Intense-PC, BIOS CR_2.2.0.400 X64 12/12/2013
<4>task: ffff88043c0fd9a0 ti: ffff88043c0fe000 task.ti: ffff88043c0fe000
<4>RIP: 0010:[<ffffffff8137954f>]  [<ffffffff8137954f>] ip_tunnel_rcv+0x35/0x3d7
<4>RSP: 0018:ffff88043e243d68  EFLAGS: 00010246
<4>RAX: 0000000000000000 RBX: ffff88042c88a1c0 RCX: 0000000000000001
<4>RDX: ffff88043e243dcc RSI: ffff88042c88a1c0 RDI: ffff880417fb2840
<4>RBP: ffff880417fb2840 R08: 0000000000000000 R09: 00000000424242e0
<4>R10: ffff8800a1644a08 R11: ffff8800a1ca9680 R12: ffff880410eae054


The kernel then panics but I don't seem to get the panic reliably written to /dev/pstore so I can't add it here.
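
As a reading aid for the trace above: the bytes in the Code: line just before the faulting instruction appear to decode as "and $0xf0,%eax; cmp $0xe0,%eax" (a multicast test on the destination address; R09 holds 0x424242e0, i.e. 224.66.66.66 loaded little-endian), followed by a load from 0x58(%rsi), a mask with ~1, and a byte compare at offset 0xa2 of the result. With RAX = 0 that read lands at address 0xa2, which matches CR2. The C sketch below reproduces both steps in user space. The helper names are hypothetical, and reading offset 0x58 as skb->_skb_refdst and 0xa2 as a field of the route entry is an assumption inferred from the dump and from the patch description in Comment 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Same test as the "and 0xf0 / cmp 0xe0" pair in the Code: line:
 * an IPv4 address (network byte order) is multicast if it lies in
 * 224.0.0.0/4. */
static bool is_multicast_be(uint32_t addr_be)
{
    return (addr_be & htonl(0xf0000000u)) == htonl(0xe0000000u);
}

/* Hypothetical stand-in for the "and $-2" step: the stored dst value
 * carries a flag in bit 0, which is masked off to get the pointer. */
static uintptr_t dst_pointer(uintptr_t refdst)
{
    return refdst & ~(uintptr_t)1;
}

/* Address touched when reading a field at `offset` behind that pointer.
 * A dropped (zero) dst makes this equal to the bare offset. */
static uintptr_t faulting_address(uintptr_t refdst, uintptr_t offset)
{
    return dst_pointer(refdst) + offset;
}

/* Checks the values seen in this report. */
static void check_report_values(void)
{
    assert(is_multicast_be(inet_addr("224.66.66.66")));  /* tunnel remote */
    assert(!is_multicast_be(inet_addr("10.1.9.61")));    /* tunnel local  */
    assert(faulting_address(0, 0xa2) == 0xa2);           /* CR2 in the oops */
}
```

Calling check_report_values() from a small main() should pass silently; the point is only that a zero dst plus the 0xa2 offset gives the CR2 value reported.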
Comment 1 Xin Long 2014-03-12 04:26:33 UTC
From b4ddc591e46a884e77092788ec25c36e42ac3304 Mon Sep 17 00:00:00 2001
From: Xin Long <lucien.xin@gmail.com>
Date: Mon, 3 Mar 2014 20:04:33 +0800
Subject: [PATCH] ip_tunnel:multicast process cause panic due to
 skb->_skb_refdst NULL pointer

When ip_tunnel processes multicast packets, it may check whether the packet is
a looped-back packet through 'rt_is_output_route(skb_rtable(skb))' in
ip_tunnel_rcv(), but before that, skb->_skb_refdst has already been dropped in
iptunnel_pull_header(), which leads to a panic.

fix the bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/ipv4/ip_tunnel_core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 6156f4e..88b08aa 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -108,7 +108,6 @@ int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto)
 	nf_reset(skb);
 	secpath_reset(skb);
 	skb_clear_hash_if_not_l4(skb);
-	skb_dst_drop(skb);
 	skb->vlan_tci = 0;
 	skb_set_queue_mapping(skb, 0);
 	skb->pkt_type = PACKET_HOST;
-- 
1.8.3.1
Comment 2 Alex Zeffertt 2014-03-14 12:40:08 UTC
I've been seeing exactly the same kernel oops on my Ubuntu 13.10 system.  I can reproduce the crash by creating multiple LXC containers (each of which has a bridge of gretap interfaces) and then forcibly destroying the containers.

I tried applying the above patch (to linux-source-3.11.0 version 3.11.0-18.32) but now I get a crash when the containers (and therefore the gretap interfaces) are being created.

Apologies if I am supposed to be using a different kernel!

Here is the new oops:

[   15.448092] BUG: unable to handle kernel paging request at fffffffc
[   15.448958] IP: [<c15c190d>] ipv6_rcv+0x13d/0x500
[   15.449524] *pdpt = 0000000001a1a001 *pde = 0000000001a21067 *pte = 0000000000000000 
[   15.450455] Oops: 0000 [#1] SMP 
[   15.450906] Modules linked in: ebt_mark_m ebtable_filter ip_gre gre ip_tunnel dummy macvlan overlayfs xt_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp ip6table_filter ip6_tables iptable_filter ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables bridge stp llc nfsd auth_rpcgss nfs_acl nfs lockd dm_multipath sunrpc scsi_dh psmouse microcode fscache virtio_balloon serio_raw lp parport ext2 floppy
[   15.453411] CPU: 1 PID: 1 Comm: init Not tainted 3.11.10.4 #1
[   15.453411] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   15.453411] task: f6058000 ti: f60f0000 task.ti: f6036000
[   15.453411] EIP: 0060:[<c15c190d>] EFLAGS: 00010286 CPU: 1
[   15.453411] EIP is at ipv6_rcv+0x13d/0x500
[   15.453411] EAX: fffffffc EBX: eb4ed3c0 ECX: 00000000 EDX: eb4ed3f0
[   15.453411] ESI: eb486200 EDI: 00000018 EBP: f60f1ef8 ESP: f60f1ecc
[   15.453411]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   15.453411] CR0: 80050033 CR2: fffffffc CR3: 36253000 CR4: 000006f0
[   15.453411] Stack:
[   15.453411]  eb4ed3c0 f60f1ef0 c156f5d2 00000000 00000024 ec7b5800 00000001 f3a04380
[   15.453411]  c1937340 eb4ed3c0 c1935e74 f60f1f30 c1544e33 ec7b5800 80000000 00000076
[   15.453411]  00000000 c1937340 ec7b5800 c1935e88 eb4ed3c0 c1935e88 eb4ed3c0 eb664410
[   15.453411] Call Trace:
[   15.453411]  [<c156f5d2>] ? ip_rcv_finish+0x62/0x320
[   15.453411]  [<c1544e33>] __netif_receive_skb_core+0x4a3/0x630
[   15.453411]  [<c1544fd6>] __netif_receive_skb+0x16/0x60
[   15.453411]  [<c154503f>] netif_receive_skb+0x1f/0x80
[   15.453411]  [<c1545817>] napi_gro_receive+0x67/0x90
[   15.453411]  [<f8681aff>] gro_cell_poll+0x5f/0xa0 [ip_tunnel]
[   15.453411]  [<c15452a2>] net_rx_action+0xa2/0x180
[   15.453411]  [<c1057531>] __do_softirq+0xc1/0x1d0
[   15.453411]  [<c1057470>] ? remote_softirq_receive+0xb0/0xb0
[   15.453411]  <IRQ> 
[   15.453411]  [<c10577a5>] ? irq_exit+0x95/0xa0
[   15.453411]  [<c1617758>] ? smp_apic_timer_interrupt+0x38/0x50
[   15.453411]  [<c16100dc>] ? apic_timer_interrupt+0x34/0x3c
[   15.453411] Code: f2 01 c2 f6 c1 02 74 09 31 ff 83 c2 02 66 89 7a fe 83 e1 01 74 03 c6 02 00 8b 43 48 83 e0 fe 0f 84 66 01 00 00 8b 80 c4 00 00 00 <8b> 00 8b 80 80 00 00 00 8b 53 4c 89 43 18 89 d0 2b 43 50 83 f8
[   15.477172] device ext1 entered promiscuous mode

[   15.453411] EIP: [<c15c190d>] ipv6_rcv+0x13d/0x500 SS:ESP 0068:f60f1ecc
[   15.453411] CR2: 00000000fffffffc
[   15.453411] ---[ end trace c7339aadbfd8dab1 ]---
[   15.453411] Kernel panic - not syncing: Fatal exception in interrupt
Comment 3 Alex Zeffertt 2014-03-14 14:53:11 UTC
I've decided that my bug is actually different and so I've opened a new ticket (https://bugzilla.kernel.org/show_bug.cgi?id=72081).  However, it's still the case that the patch above caused my system to crash.

Regards,
Comment 4 Xin Long 2014-04-02 01:44:56 UTC
(In reply to Alex Zeffertt from comment #2)
> 
> [   15.453411] EIP: [<c15c190d>] ipv6_rcv+0x13d/0x500 SS:ESP 0068:f60f1ecc
> [   15.453411] CR2: 00000000fffffffc
> [   15.453411] ---[ end trace c7339aadbfd8dab1 ]---
> [   15.453411] Kernel panic - not syncing: Fatal exception in interrupt

Hi Alex, that patch does in fact cause this panic. A new patch may fix it properly:


Commit 10ddceb22ba (ip_tunnel:multicast process cause panic due
to skb->_skb_refdst NULL pointer) removed the dst-drop call from
ip-tunnel-recv.

The following commit reintroduces dst-drop and fixes the original bug by
checking for a looped-back packet before releasing the dst.
Original bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681

CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
 net/ipv4/gre_demux.c      |    8 ++++++++
 net/ipv4/ip_tunnel.c      |    3 ---
 net/ipv4/ip_tunnel_core.c |    1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/gre_demux.c b/net/ipv4/gre_demux.c
index 1863422f..250be74 100644
--- a/net/ipv4/gre_demux.c
+++ b/net/ipv4/gre_demux.c
@@ -182,6 +182,14 @@ static int gre_cisco_rcv(struct sk_buff *skb)
        int i;
        bool csum_err = false;

+#ifdef CONFIG_NET_IPGRE_BROADCAST
+       if (ipv4_is_multicast(ip_hdr(skb)->daddr)) {
+               /* Looped back packet, drop it! */
+               if (rt_is_output_route(skb_rtable(skb)))
+                       goto drop;
+       }
+#endif
+
        if (parse_gre_header(skb, &tpi, &csum_err) < 0)
                goto drop;

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 78a89e6..a82a22d 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -416,9 +416,6 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,

 #ifdef CONFIG_NET_IPGRE_BROADCAST
        if (ipv4_is_multicast(iph->daddr)) {
-               /* Looped back packet, drop it! */
-               if (rt_is_output_route(skb_rtable(skb)))
-                       goto drop;
                tunnel->dev->stats.multicast++;
                skb->pkt_type = PACKET_BROADCAST;
        }
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 6f847dd..8d69626 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -108,6 +108,7 @@ int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto)
        nf_reset(skb);
        secpath_reset(skb);
        skb_clear_hash_if_not_l4(skb);
+       skb_dst_drop(skb);
        skb->vlan_tci = 0;
        skb_set_queue_mapping(skb, 0);
        skb->pkt_type = PACKET_HOST;
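
The ordering that this second patch describes (test for a looped-back multicast packet first, release the dst afterwards) can be modelled in user space. This is only a sketch under stated assumptions: the struct and function names are hypothetical, the model reduces the route to a single is_input flag, and it does not reproduce the real kernel data structures or call chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced models of a route entry and a packet. */
struct model_route {
    bool is_input;              /* false models an output (locally sent) route */
};

struct model_skb {
    struct model_route *dst;    /* models the dst attached to the packet */
    bool multicast;             /* destination address is multicast */
};

enum model_verdict { MODEL_DELIVER, MODEL_DROP_LOOPED };

/* Models the dst release done while pulling the tunnel header. */
static void model_pull_header(struct model_skb *skb)
{
    skb->dst = NULL;
}

/* Models the loopback test; it needs a dst that has not been released. */
static bool model_looped_back(const struct model_skb *skb)
{
    assert(skb->dst != NULL);   /* the original oops was a read through NULL here */
    return !skb->dst->is_input;
}

/* Order used by the second patch: check first, release the dst afterwards. */
static enum model_verdict model_receive(struct model_skb *skb)
{
    if (skb->multicast && model_looped_back(skb))
        return MODEL_DROP_LOOPED;
    model_pull_header(skb);
    return MODEL_DELIVER;
}
```

Swapping the two steps in model_receive() (release first, check second) trips the assert for any multicast packet, which corresponds to the NULL dereference in the original report.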