Bug 99081 - Bridge multicast snooping breaks ICMPv6
Summary: Bridge multicast snooping breaks ICMPv6
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-28 10:34 UTC by Steinar H. Gunderson
Modified: 2023-10-26 08:39 UTC

See Also:
Kernel Version: 4.0.4
Subsystem:
Regression: No
Bisected commit-id:


Attachments
cap on both pc and router. (28.82 KB, application/vnd.rar)
2023-10-26 05:33 UTC, jade

Description Steinar H. Gunderson 2015-05-28 10:34:55 UTC
Hi,

I've seen this reported many times around the net, with no definite solution; since I got hit by it again and now used the latest kernel, I thought the best place would be the kernel Bugzilla.

I have a bridge br0 with eth0 on it, and then some tap devices from KVM guests. In this situation, by default, I don't have IPv6. I can't ping anything because ND fails. I've seen this over multiple kernel versions for years, so I'm fairly certain this is not something about my specific setup.

The common workaround is:

 echo 0 > /sys/devices/virtual/net/br0/bridge/multicast_snooping 

which immediately makes IPv6 work well again.

IPv6 unicast is unaffected; only multicast (i.e., ND) seems to have the problem.
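
For anyone who wants to script the workaround above, here is a minimal sketch in Python. It only reads and writes the same sysfs file as the echo command; the helper names and the `root` parameter are made up for illustration (the parameter exists so the logic can be exercised against a scratch directory), and writing to the real file needs root.

```python
from pathlib import Path

# Location used by the workaround in the report; "br0" is the reporter's bridge name.
SYSFS_NET = Path("/sys/devices/virtual/net")


def snooping_path(bridge: str, root: Path = SYSFS_NET) -> Path:
    """Return the path of the multicast_snooping knob for a bridge."""
    return root / bridge / "bridge" / "multicast_snooping"


def get_snooping(bridge: str, root: Path = SYSFS_NET) -> bool:
    """True if the knob currently reads 1 (snooping enabled)."""
    return snooping_path(bridge, root).read_text().strip() == "1"


def set_snooping(bridge: str, enabled: bool, root: Path = SYSFS_NET) -> None:
    """Write 1 or 0 to the knob, like `echo 0 > .../multicast_snooping`."""
    snooping_path(bridge, root).write_text("1" if enabled else "0")
```

Calling `set_snooping("br0", False)` as root would be the equivalent of the echo command; like the echo, it is a workaround and does not survive a reboot or re-creation of the bridge.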
Comment 1 Eric Dumazet 2015-05-28 11:16:43 UTC
I noticed the following bug, not sure if it is relevant...

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index a3abe6ed111e..22fd0419b314 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1822,7 +1822,7 @@ static void br_multicast_query_expired(struct net_bridge *br,
        if (query->startup_sent < br->multicast_startup_query_count)
                query->startup_sent++;
 
-       RCU_INIT_POINTER(querier, NULL);
+       RCU_INIT_POINTER(querier->port, NULL);
        br_multicast_send_query(br, NULL, query);
        spin_unlock(&br->multicast_lock);
 }
Comment 2 Linus Lüssing 2015-05-28 17:32:11 UTC
Hm, no, that one can't be relevant as the querier port information is only used by the export which is supposed to be used by batman-adv later (patches for that are still pending review and inclusion on the batman-adv mailing list). The selected querier port rcu-pointer isn't used by the forwarding decisions of the bridge. (Nevertheless thanks for finding it! Would have taken me quite some time to spot from a more complex setup with batman-adv+bridge)

The recent patch from Thadeu Lima de Souza Cascardo seems promising though ("bridge: fix parsing of MLDv2 reports"). Steinar, could it be that you are one of the (few?) people using MLDv2 queriers? Also, behind which port is your querier, eth0, br0 or one of the tap interfaces? Does multicast traffic on the port which has the querier work fine?
Comment 3 Steinar H. Gunderson 2015-05-28 20:01:05 UTC
Uhm, I'm fairly IPv6 versed, but I don't think I know what an MLDv2 querier is. Care to explain?

The regular network is behind eth0, and I can't reach stuff on that from my machine, IIRC even before a VM is started (so eth0 is essentially alone on br0).
Comment 4 Linus Lüssing 2015-05-28 20:49:27 UTC
Sure. The bridge learns about which host is interested in which multicast traffic via IGMP (IPv4) or MLD (IPv6). Multicast listeners issue so called IGMP/MLD reports on the link to inform switches and multicast routers about what they want to receive. However they don't do this unconditionally: They only do this if they are asked to report. For this some host needs to regularly issue so called IGMP/MLD queries (usually such a querying host is a multicast router or sometimes a bridge/snooping switch itself).

The Linux bridge also only kicks into gear once it sees that a querier is present. Otherwise there wouldn't be any reports and the bridge wouldn't be able to learn about any multicast listeners.

Could you maybe figure out with tcpdump behind which port the selected querier resides? E.g. do a "tcpdump -i eth0 icmp6", check the IPv6 source address and figure out which bridge port this source is coming from. And please post the output line from tcpdump about the query message here so we can figure out whether it's an MLDv1 or MLDv2 querier. (For the latter, and the MLDv2 reports it triggers, there is a known issue which just got fixed and queued for stable.)
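
To make the MLDv1/MLDv2 distinction concrete, here is a small Python sketch that classifies a message from its ICMPv6 type and length. It assumes you have already extracted those two numbers from a capture; the function name is made up for illustration. The type values are the standard ones (130 query, 131 MLDv1 report, 132 MLDv1 done, 143 MLDv2 report), and the length rule for queries follows RFC 3810 (exactly 24 octets for MLDv1, 28 or more for MLDv2).

```python
# ICMPv6 types used by MLD, other than the query (type 130).
MLD_NON_QUERY_TYPES = {
    131: "MLDv1 report",
    132: "MLDv1 done",
    143: "MLDv2 report",
}


def classify_mld(icmp6_type: int, icmp6_length: int) -> str:
    """Classify an ICMPv6 message as an MLD message kind.

    icmp6_length is the length of the whole ICMPv6 message in octets.
    """
    if icmp6_type == 130:
        # Queries of both versions share type 130; only the length differs.
        if icmp6_length == 24:
            return "MLDv1 query"
        if icmp6_length >= 28:
            return "MLDv2 query"
        return "malformed query"
    return MLD_NON_QUERY_TYPES.get(icmp6_type, "not MLD")
```

For example, `classify_mld(130, 28)` returns "MLDv2 query", which is the case the parsing fix mentioned above is about.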
Comment 5 Steinar H. Gunderson 2015-05-28 21:07:34 UTC
We have a 3560-X as router, including IPv6 multicast routing. It's speaking MLDv2, and we have MLD snooping in place in most of the network. Unfortunately, there's way too much IPv6 going on on this subnet for a simple icmp6 tcpdump to be very selective, and I can't see any MLD packets from a simple eyeballing. It's news to me that ND would be treated differently depending on whether we've seen a multicast router or not, but I'll believe you there. :-)

As for which bridge port it would be behind, it has to be eth0 if so; none of the KVM hosts (the only other things on the bridge) speak much IPv6 at all, let alone route it.
Comment 6 Salah Coronya 2015-07-07 05:55:50 UTC
I have the same problem (with the same setup, bridged kvm eth0) with the same symptom - turning off multicast_snooping fixes the problem. Kernel is 4.0.5. My setup is a lot simpler though: Just 1 computer and a Linksys router.

However, I manually applied "bridge: fix parsing of MLDv2 reports" (47cc84ce0c2fe75c99ea5963c4b5704dd78ead54) and that seems to have fixed it.
Comment 7 Steinar H. Gunderson 2015-08-03 11:29:05 UTC
I'm still seeing this in 4.2-rc4. Isn't that supposed to have the patch?
Comment 8 Linus Lüssing 2015-08-23 13:12:03 UTC
Yes, that patch is there since 4.2-rc1 and 4.1-rc5.

4.1-rc1 introduced a regression which is supposed to be fixed in master or with the upcoming 4.2-rc8: "net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code" (a516993f0ac1694673412eb2d16a091eafa77d2a). Not quite sure whether missing this patch would create erratic behaviour other than crashing.

Steinar, if the issue is still present for you for current master or any (non-rc) 4.1 kernel, I'd be very interested in having a look at some tcpdumps of ICMPv6 traffic and the corresponding "/sbin/bridge mdb show" output.
Comment 9 Steinar H. Gunderson 2015-08-23 15:16:00 UTC
I'm down on 4.1.4 now (4.2-rc* had other blocking bugs); I've enabled multicast snooping again, and will let you know if I see problems.
Comment 10 Steinar H. Gunderson 2016-01-20 10:50:49 UTC
I still see this in 4.3.0 and 4.4.0. However, it's intermittent; I just experienced it and could get a bridge dump, but once I started the tcpdump, it started working (possibly due to promisc mode?). Anyway, the bridge dump is:

pannekake:~# /sbin/bridge mdb show
dev br0 port eth2 grp ff02::2 temp
dev br0 port eth2 grp ff02::1:ff00:60 temp
dev br0 port eth2 grp ff02::1:ff00:53 temp
dev br0 port eth2 grp ff02::1:ff00:61 temp
dev br0 port eth2 grp ff02::1:ff00:56 temp
dev br0 port eth2 grp ff02::202 temp
dev br0 port eth2 grp ff02::1:ff00:68 temp
dev br0 port eth2 grp ff02::1:ff00:63 temp
dev br0 port eth2 grp ff02::1:ff00:69 temp
dev br0 port eth2 grp ff02::1:ffac:97d6 temp

This was from before starting the tcpdump.
Comment 11 Steinar H. Gunderson 2016-01-21 01:02:43 UTC
Yes indeed, starting tcpdump fixes the problem. I had a hanging SSH to an IPv6 host on the same subnet, and the second I started tcpdump, the connection went through:

01:56:12.880344 IP6 2001:67c:29f4::29 > ff02::1:ff00:50: ICMP6, neighbor solicitation, who has 2001:67c:29f4::50, length 32
01:56:12.880379 IP6 2001:67c:29f4::50 > 2001:67c:29f4::29: ICMP6, neighbor advertisement, tgt is 2001:67c:29f4::50, length 32

::50 here is my machine, ::29 was the one I was SSH-ing to.

I did some testing on ::29, doing “ip -6 neigh flush dev eth0” and trying to ping ::50. No response. “ifconfig br0 promisc” on ::50, and it works. “ifconfig br0 -promisc” and flush the table again, it doesn't. Turn off multicast snooping, and it immediately works. So now I can reproduce this 100% by simple means.

I note that ff02::1:ff00:50 is not in the output of “bridge mdb show”. Might it be that it doesn't understand it should listen on that group, and thus does not get the neighbor solicitation?
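
The check described above can be sketched in Python: derive the solicited-node multicast group for a unicast address (ff02::1:ff00:0/104 plus the low 24 bits of the address) and look for it in the “bridge mdb show” output. The function names are made up for illustration, and the parser assumes the line layout shown in this report ("dev ... port ... grp <group> ..."); other iproute2 versions may print more fields.

```python
import ipaddress


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast group for an IPv6 unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def mdb_groups(mdb_output: str) -> set:
    """Collect the group addresses that follow the 'grp' keyword on each line."""
    groups = set()
    for line in mdb_output.splitlines():
        fields = line.split()
        if "grp" in fields:
            idx = fields.index("grp")
            if idx + 1 < len(fields):
                groups.add(ipaddress.ip_address(fields[idx + 1]))
    return groups


# A few lines copied from the mdb dump in comment 10.
SAMPLE_MDB = """\
dev br0 port eth2 grp ff02::2 temp
dev br0 port eth2 grp ff02::1:ff00:60 temp
dev br0 port eth2 grp ff02::1:ffac:97d6 temp
"""

group = solicited_node("2001:67c:29f4::50")
print(group, "in mdb:", group in mdb_groups(SAMPLE_MDB))
```

With the sample lines, the group for 2001:67c:29f4::50 is not found, which matches the observation that the bridge's own group is missing from the dump.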
Comment 12 Steinar H. Gunderson 2016-06-03 21:01:06 UTC
Still there in 4.7-rc1:

pannekake:~# /sbin/bridge mdb show
dev br0 port eth2 grp ff02::1:ff00:63 temp
dev br0 port eth2 grp ff02::1:ff00:0 temp
dev br0 port eth2 grp ff02::1:ff2f:1158 temp
dev br0 port eth2 grp ff02::fb temp
dev br0 port eth2 grp ff02::1:ff00:56 temp
dev br0 port eth2 grp ff02::1:ff00:1007 temp
dev br0 port eth2 grp ff02::2 temp
dev br0 port eth2 grp ff02::202 temp
dev br0 port eth2 grp ff02::1:ff4f:316 temp
dev br0 port eth2 grp ff02::1:ff00:68 temp
Comment 13 Linus Lüssing 2016-06-16 21:15:03 UTC
Thanks for still checking for this issue! It's a pity the issue is still there for you even with a very recent kernel...

-----

Regarding the bridge mdb output, hm, I'm getting a similar output, entries for the bridge device itself seem to be missing in "bridge mdb show" for me too...

I added some printk's in br_multicast_ipv6_rcv() and I am receiving reports from br0 (that is "port == NULL" in there) and no errors are thrown by any functions called within br_multicast_ipv6_rcv(). Seems it is being parsed and added fine in there at least in my VMs.

I am ping6'ing the solicited-node multicast address of br0 from a second VM. With no querier, those multicast echo requests are flooded on all other ports and br0 itself. Then I enabled the built-in multicast_querier of br0 and echo replies continued to arrive from br0, while the echo requests stopped being flooded on other bridge ports. So just as expected.

It seems that maybe the bridge tool or netlink interface is somehow broken or at least misleading? I'm having the same issue with a 4.0.2 kernel.

-----

Regarding your packet loss, it'd still be very helpful to get a tcpdump of all ICMPv6 traffic for 10-15 minutes from affected interfaces, while using the "--no-promiscuous-mode" option. Or at least a capture of all MLD messages and the corresponding broken neighbor solicitation/advertisement exchanges. We need to check first whether MLD works fine.

Very specific captures should be doable relatively easily with wireshark or tshark and their powerful filter rules (for instance "icmpv6.type == 130" or 131/132/143 for MLD). A full ICMPv6 capture with tcpdump on multiple interfaces, even if large, would be fine for me too. I can find my way around with wireshark.
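
As a rough sketch of the capture request above, the following Python helpers only build the command lines and the display filter; they do not run anything. The helper names, interface and file names are placeholders. The tcpdump options used (-i, --no-promiscuous-mode, -w) and the tshark options (-r, -Y) are the standard ones, and "icmp6" is the usual capture filter for all ICMPv6 traffic.

```python
# ICMPv6 types that carry MLD: query, MLDv1 report, MLDv1 done, MLDv2 report.
MLD_ICMPV6_TYPES = (130, 131, 132, 143)


def tcpdump_command(interface: str, outfile: str) -> list:
    """tcpdump invocation capturing all ICMPv6 without promiscuous mode."""
    return ["tcpdump", "-i", interface, "--no-promiscuous-mode",
            "-w", outfile, "icmp6"]


def mld_display_filter(types=MLD_ICMPV6_TYPES) -> str:
    """Wireshark/tshark display filter matching only MLD messages."""
    return " || ".join(f"icmpv6.type == {t}" for t in types)


def tshark_command(capture_file: str) -> list:
    """tshark invocation that prints only the MLD messages of a capture."""
    return ["tshark", "-r", capture_file, "-Y", mld_display_filter()]


print(" ".join(tcpdump_command("eth0", "eth0-icmp6.pcap")))
print(mld_display_filter())
```

Stopping the capture after 10-15 minutes is left to whoever runs it (for example with Ctrl-C); the non-promiscuous option matters here because starting a promiscuous capture was seen to hide the problem.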
Comment 14 jade 2023-10-26 05:28:17 UTC
I have the same issue. I have an OpenWrt router whose kernel is 5.4.225; it has a bridge br0 with wan (eth1) and lan (lan2/lan3/lan4).

The PC connects to the router's lan3, and the router connects to the PON modem via wan (eth1).

When multicast_snooping is set to 1, the PC sends RS and DHCPv6 packets but does not get the corresponding responses. However, when multicast_snooping is set to 0, the PC gets both the RA and the DHCPv6 response, and obtains an IPv6 address.

See the attachment: it contains the packets captured with Wireshark on the PC, and the packets captured on the router's LAN and WAN interfaces.
Comment 15 jade 2023-10-26 05:33:21 UTC
Created attachment 305296
cap on both pc and router.
