Bug 14586

Summary: bridge on bonding interface: DHCP replies don't get through
Product: Networking Reporter: Harald Dunkel (harri)
Component: Other Assignee: Arnaldo Carvalho de Melo (acme)
Status: CLOSED INVALID    
Severity: normal CC: harri
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.31.5 Subsystem:
Regression: No Bisected commit-id:
Attachments: output of 'brctl show', 'ip addr list', and 'cat /proc/net/bonding/bond*'

Description Harald Dunkel 2009-11-11 14:27:58 UTC
I would like to run a bridge for kvm on a bonding interface (4 * 1Gbit, Intel e1000e). Problem: The DHCPDISCOVER packets sent by the guest show up on my dhcp server as expected, but the DHCPOFFER sent as a reply doesn't reach the guest behind the bridge.

Using tcpdump on host and guest I can see the DHCPOFFER on the bond0 and br0 interface, but it never shows up on vnet0 or on the guest's eth0.

If I drop the bonding interface and use the host's eth2 for the bridge instead, then there is no such problem.

Kernel on host and guest is 2.6.31.5. Attached you can find more information about my setup. 

I had sent this information to the linux kvm mailing list before, but consensus was that this is a bridging problem. See

	http://www.spinics.net/lists/kvm/msg25153.html

There was no reply on the linux bridge mailing list, see 

	https://lists.linux-foundation.org/pipermail/bridge/2009-November/006749.html
Comment 1 Harald Dunkel 2009-11-11 14:31:01 UTC
Created attachment 23748
output of 'brctl show', 'ip addr list', and 'cat /proc/net/bonding/bond*'
Comment 2 Andrew Morton 2009-11-12 22:39:17 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 11 Nov 2009 14:28:00 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=14586
> 
>            Summary: bridge on bonding interface: DHCP replies don't get
>                     through
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.31.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: harald.dunkel@t-online.de
>         Regression: No
> 
> 
> I would like to run a bridge for kvm on a bonding interface (4 * 1Gbit, Intel
> e1000e). Problem: The DHCPDISCOVER packets sent by the guest show up on my
> dhcp server as expected, but the DHCPOFFER sent as a reply doesn't reach the
> guest behind the bridge.
> 
> Using tcpdump on host and guest I can see the DHCPOFFER on the bond0 and br0
> interface, but it never shows up on vnet0 or on the guest's eth0.
> 
> If I drop the bonding interface and use the host's eth2 for the bridge
> instead, then there is no such problem.
> 
> Kernel on host and guest is 2.6.31.5. Attached you can find more information
> about my setup.
> 
> I had sent this information to the linux kvm mailing list before, but
> consensus was that this is a bridging problem. See
> 
>     http://www.spinics.net/lists/kvm/msg25153.html
> 
> There was no reply on the linux bridge mailing list, see
> 
>     https://lists.linux-foundation.org/pipermail/bridge/2009-November/006749.html
Comment 3 Stephen Hemminger 2009-11-12 22:56:10 UTC
> 
> I would like to run a bridge for kvm on a bonding interface (4 * 1Gbit, Intel
> e1000e). Problem: The DHCPDISCOVER packets sent by the guest show up on my
> dhcp server as expected, but the DHCPOFFER sent as a reply doesn't reach the
> guest behind the bridge.
> 
> Using tcpdump on host and guest I can see the DHCPOFFER on the bond0 and br0
> interface, but it never shows up on vnet0 or on the guest's eth0.
> 
> If I drop the bonding interface and use the host's eth2 for the bridge
> instead, then there is no such problem.
> 
> Kernel on host and guest is 2.6.31.5. Attached you can find more information
> about my setup.
> 

What is the configuration?
# brctl showstp virbr0
# brctl showmacs virbr0

Is dhclient being run on the bridge interface?
# cat /proc/net/ptype

# cat /proc/net/bonding/bond0


How are the bond and the bridge configured? Are you bonding bridges (wrong)
or bridging bonded interfaces?

Are all links up?

Since this is the initial packet it will have to be flood forwarded by
the bridge. Is there any iptables/netfilter rule that might be blocking
packets?
Comment 4 Andy Gospodarek 2009-11-13 03:23:33 UTC
On Thu, Nov 12, 2009 at 02:55:29PM -0800, Stephen Hemminger wrote:
> > 
> > I would like to run a bridge for kvm on a bonding interface (4 * 1Gbit,
> > Intel e1000e). Problem: The DHCPDISCOVER packets sent by the guest show up
> > on my dhcp server as expected, but the DHCPOFFER sent as a reply doesn't
> > reach the guest behind the bridge.
> > 
> > Using tcpdump on host and guest I can see the DHCPOFFER on the bond0 and
> > br0 interface, but it never shows up on vnet0 or on the guest's eth0.
> > 
> > If I drop the bonding interface and use the host's eth2 for the bridge
> > instead, then there is no such problem.
> > 
> > Kernel on host and guest is 2.6.31.5. Attached you can find more
> > information about my setup.
> > 
> 
> What is the configuration?
> # brctl showstp virbr0
> # brctl showmacs virbr0
> 

I'm quite sure this output will show us why this isn't working and it
won't be the first time I've seen this.

What happens with mode 0 (balance-rr) is that the DHCPDISCOVER goes out
and, since the switch connected to the host has not learned the MAC, the
frame comes back on all members of the bond other than the one that
sent it.
 
Since the bond interface is receiving the frame, the bridge will relearn
the source address of the guest on the bonding interface and any
subsequent frames received on the bond interface that have the
destination MAC of the guest will be dropped by the bridge.  This is as
expected since a bridge should drop frames when the destination MAC of
the incoming frame matches an entry in the forwarding database that
indicates those frames are destined for the receiving port.
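
The sequence described above can be sketched with a toy model of a
learning bridge. This is only an illustration, not the kernel's bridge
code: the class name, the MAC addresses, and the assumption that the
DHCPOFFER is unicast to the guest's MAC are all made up for the example.

```python
class LearningBridge:
    """Toy model of a learning bridge's forwarding database (FDB)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src, dst):
        """Process one frame; return the list of ports it is forwarded to."""
        # Learn (or relearn) where the sender lives from the source MAC.
        self.fdb[src] = in_port
        out_port = self.fdb.get(dst)
        if out_port is None:
            # Unknown or broadcast destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is believed to sit behind the receiving port: drop.
            return []
        return [out_port]


# Hypothetical addresses, chosen only for this example.
GUEST = "52:54:00:12:34:56"
SERVER = "00:16:3e:aa:bb:cc"
BCAST = "ff:ff:ff:ff:ff:ff"

bridge = LearningBridge(["bond0", "vnet0"])

# 1. The guest's DHCPDISCOVER enters on vnet0 and is flooded out bond0.
step1 = bridge.receive("vnet0", GUEST, BCAST)

# 2. The switch sends the broadcast back on another bond member, so the
#    bridge sees the guest's MAC as a source on bond0 and relearns it there.
step2 = bridge.receive("bond0", GUEST, BCAST)

# 3. The DHCPOFFER addressed to the guest arrives on bond0. The FDB now
#    says the guest is behind bond0, so the frame is dropped.
step3 = bridge.receive("bond0", SERVER, GUEST)

print(step1, step2, step3, bridge.fdb[GUEST])
```

In this model the third call returns an empty list, which corresponds to
the DHCPOFFER being visible on bond0 and br0 but never on vnet0.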

I would suggest switching to mode 5 (balance-tlb) if your switch cannot
handle bonding or mode 2 or 4 (balance-xor or 802.3ad, respectively) if
it can.  These modes avoid this problem: mode 5 will drop additional
broadcast frames, and with modes 2 and 4 broadcast frames are not sent
back to any of the bond's member interfaces.
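
A small sketch for checking which mode a bond is currently in, assuming
the bonding driver's sysfs file /sys/class/net/bond0/bonding/mode, which
reports the mode as a name and a number (for example "balance-rr 0").
The function names and the table of suggested modes are made up for
this example; the table only restates the modes suggested above.

```python
# Modes suggested above as avoiding the problem (number -> name).
SUGGESTED_MODES = {2: "balance-xor", 4: "802.3ad", 5: "balance-tlb"}


def parse_bond_mode(text):
    """Split sysfs mode text such as 'balance-rr 0' into (name, number)."""
    name, number = text.split()
    return name, int(number)


def is_suggested(text):
    """Return True if the mode text names one of the suggested modes."""
    _, number = parse_bond_mode(text)
    return number in SUGGESTED_MODES


def read_bond_mode(bond="bond0"):
    """Read the raw mode text for a bond; the bond name is an assumption."""
    with open(f"/sys/class/net/{bond}/bonding/mode") as f:
        return f.read()
```

On a host with a bond0 interface, is_suggested(read_bond_mode()) would
tell whether the bond is already in one of these modes.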

There is a Red Hat kbase article that talks about this problem as well:

http://kbase.redhat.com/faq/docs/DOC-16051

When it was originally written, mode 6 was thought to be a workaround
as well, but it has recently been shown to still be a problem and the
article needs to be updated.
Comment 5 Harald Dunkel 2009-11-17 11:04:06 UTC
Many thanx for your detailed explanation. Using balance-xor the DHCP reply comes through as expected. (802.3ad did not work at all, but I would guess this was a local misconfiguration).

I would say this issue is resolved.


Many thanx to all

Harri