Bug 25062
Summary: | Bonding packet deduplication doesn't work properly anymore | | |
---|---|---|---|
Product: | Networking | Reporter: | Kevin Lapagna (kevin.lapagna) |
Component: | Other | Assignee: | Arnaldo Carvalho de Melo (acme) |
Status: | RESOLVED OBSOLETE | | |
Severity: | high | CC: | alan, andreas.jud, kirr, peter.zuercher, tadavis |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | > 2.6.33 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Description
Kevin Lapagna
2010-12-17 11:45:10 UTC
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Fri, 17 Dec 2010 11:45:18 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=25062
>
> Summary: Bonding packet deduplication doesn't work properly anymore
> Product: Networking
> Version: 2.5
> Kernel Version: > 2.6.33
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Other
> AssignedTo: acme@ghostprotocols.net
> ReportedBy: kevin.lapagna@bigtag.ch
> Regression: No
>
>
> Here's the setup:
>
> switch: ordinary Cisco switch
> eth0: NIC with kernel module tg3
> eth1: NIC with kernel module e1000e
> bond0: bond with slaves eth0,eth1 in mode 1 (or 5)
> bond0.100: vlan device created with vconfig
> bridge100: bridge created with brctl
> tap1: tap device created with tunctl
> vguest: qemu-kvm vguest with emulated e1000 NIC
>
>
>  ________                                                          ________
> |        |-- eth0 \                                               |        |
> | switch |          -- bond0 -- bond0.100 -- bridge100 -- tap1 -- | vguest |
> |________|-- eth1 /                                               |________|
>
> When the vguest emits an Ethernet broadcast (DHCP request), it is forwarded
> all the way up to the switch through eth0. The switch floods the broadcast,
> including back out the port connected to eth1. The packet then travels all
> the way back to bridge100. As a result, the last location bridge100 learns
> for the vguest's MAC address is behind bond0.100 (instead of tap1). If a
> DHCP server responds to the request, the reply travels to bridge100, which
> now has a faulty MAC address table, so the packet is rejected and never
> reaches tap1 and therefore never reaches the vguest.
>
> I witnessed this wrong behavior in kernel 2.6.37-rc5 (debian package),
> 2.6.36.2 and 2.6.35.9 (self compiled - vanilla). The setup has worked with
> kernels <= 2.6.33.7. I've never tried 2.6.34.
>
> I assume the setup above is a common way to separate virtual guests on a
> network level. So this could become a major issue for a lot of people when
> upgrading their kernels.
>

Andrew Morton <akpm@linux-foundation.org> wrote:

>On Fri, 17 Dec 2010 11:45:18 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
>
>> https://bugzilla.kernel.org/show_bug.cgi?id=25062
>>
>> Summary: Bonding packet deduplication doesn't work properly anymore

Just a note that I have reproduced what I believe is the same problem (I didn't use tap, and assigned an IP to the bridge). I used arping to generate Ethernet broadcasts.

I see the problem on 2.6.36.2, but not on today's net-next-2.6. I'll see if I can dig up the root cause tomorrow.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
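
The failure mode described in the report can be illustrated with a toy model of a learning bridge. This is only a sketch, not kernel code: the class `LearningBridge`, the function `simulate`, the flag `bond_drops_duplicates`, and both MAC addresses are hypothetical names chosen for illustration. It assumes, as the report does, that the bond is expected to discard the copy of the guest's own broadcast that the switch sends back over the second slave.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningBridge:
    """Toy learning bridge: remembers the last port each source MAC was seen on."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # forwarding table: MAC address -> port name

    def receive(self, in_port, src, dst):
        """Learn src on in_port and return the set of ports the frame leaves on."""
        self.fdb[src] = in_port
        if dst == BROADCAST or dst not in self.fdb:
            # Flood to every port except the one the frame arrived on.
            return self.ports - {in_port}
        out_port = self.fdb[dst]
        if out_port == in_port:
            # Destination appears to live where the frame came from: drop it.
            return set()
        return {out_port}


def simulate(bond_drops_duplicates):
    """Return the ports on which the DHCP reply leaves bridge100."""
    guest = "52:54:00:00:00:01"   # hypothetical vguest MAC
    server = "00:11:22:33:44:55"  # hypothetical DHCP server MAC
    bridge = LearningBridge(["tap1", "bond0.100"])

    # 1. The guest's broadcast enters the bridge from tap1.
    bridge.receive("tap1", guest, BROADCAST)

    # 2. The switch floods the broadcast back over eth1. If the bond does not
    #    discard that copy, it re-enters the bridge from bond0.100.
    if not bond_drops_duplicates:
        bridge.receive("bond0.100", guest, BROADCAST)

    # 3. The DHCP server's unicast reply arrives from the network side.
    return bridge.receive("bond0.100", server, guest)
```

With `bond_drops_duplicates=True` the reply is forwarded to `tap1`; with `False` the bridge has relearned the guest's MAC on `bond0.100` and drops the reply, which matches the symptom in the report.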