Bug 9719 - when a system is configured as a bridge, and at the same time configured to have multipath weighted route, with one leg goes thru NAT and another without NAT, the nat path will intermittently get packets leaking out using internal IP without being SNAT...
Summary: when a system is configured as a bridge, and at the same time configured to h...
Status: CLOSED INVALID
Alias: None
Product: Networking
Classification: Unclassified
Component: Netfilter/Iptables
Hardware: All Linux
Importance: P1 normal
Assignee: networking_netfilter-iptables@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-09 11:55 UTC by Ming-Ching Tiew
Modified: 2008-02-02 05:28 UTC

See Also:
Kernel Version: failing on 2.6.22.15 and also 2.6.23
Subsystem:
Regression: ---
Bisected commit-id:


Description Ming-Ching Tiew 2008-01-09 11:55:50 UTC
Latest working kernel version: 2.6.23
Earliest failing kernel version: 2.6.22.15
Distribution: iptables 1.4.0 was used with kernel 2.6.23 and iptables 1.3.8 with 2.6.22.15
Hardware Environment: 3 interfaces, 2 interfaces bridged to form br0, and another connects to internet using pppoe.
Software Environment: bridge, multipath routing
Problem Description: When a system is configured as a bridge with an IP assigned to the br0 interface, and at the same time has a multipath weighted default route in which one default route is NAT-ed and the other is not, the NAT-ed interface will occasionally leak packets that still carry private IPs.

Steps to reproduce: 
1) Set up the bridge interface and assign an IP to it.
2) Set up a default gateway on side B of the bridge (without NAT) and point the bridge's default route at this gateway.
3) Set up a client on side A of the bridge and point its default route at the bridge br0 interface.
4) Start pinging an internet site, for example www.google.com, from the client.
   Run the ping continuously, for example:
         while true
         do
            ping -c 1 www.google.com
            sleep 1
         done
5) After consistently getting ping responses from www.google.com, bring up another uplink to the internet on the bridge system; this uplink is SNAT-ed.

       (e.g. iptables -t nat -A POSTROUTING -o eth2 -j MASQUERADE)

6) Verify that the second uplink is working.
7) Change the default route on the bridge to a multipath weighted route with equal weight on both uplinks.
8) Sniff the NAT-ed interface for packets coming from the LAN client. Occasionally, packets with private IPs leak onto the NAT-ed interface.
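
Steps 7 and 8 could look roughly like the sketch below. It is only an illustration: the gateway addresses, the LAN subnet and the use of eth2 as the NAT-ed interface (borrowed from the step 5 example) are assumptions, not values from the report. By default the script only prints the commands.

```shell
#!/bin/sh
# Assumed values, not taken from the report: adjust to the actual setup.
GW_PLAIN="${GW_PLAIN:-192.168.1.1}"   # hypothetical non-NAT gateway on side B
GW_NAT="${GW_NAT:-10.0.0.1}"          # hypothetical next hop on the NAT-ed uplink
NAT_IF="${NAT_IF:-eth2}"              # NAT-ed interface, as in the step 5 example
LAN_NET="${LAN_NET:-192.168.1.0/24}"  # hypothetical private LAN subnet

# Step 7: equal-weight multipath default route over both uplinks.
multipath_cmd() {
    echo "ip route replace default nexthop via $GW_PLAIN dev br0 weight 1 nexthop via $GW_NAT dev $NAT_IF weight 1"
}

# Step 8: capture packets on the NAT-ed interface that still carry a private source address.
sniff_cmd() {
    echo "tcpdump -n -i $NAT_IF src net $LAN_NET"
}

# Print the commands; set RUN=1 (as root) to execute them instead.
for cmd in "$(multipath_cmd)" "$(sniff_cmd)"; do
    if [ "${RUN:-0}" = "1" ]; then $cmd; else echo "$cmd"; fi
done
```

Any packet that tcpdump reports with a source address inside the LAN subnet has left the NAT-ed interface without being masqueraded.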
Comment 1 Anonymous Emailer 2008-01-09 15:28:43 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed,  9 Jan 2008 11:55:50 -0800 (PST)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9719
> 
>            Summary: when a system is configured as a bridge, and at the same
>                     time configured to have multipath weighted route, with
>                     one leg goes thru NAT and another without NAT, the nat
>                     path will intermittently get packets leaking out using
>                     internal IP without being SNAT-ted
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.22.15 and 2.6.23
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Netfilter/Iptables
>         AssignedTo: networking_netfilter-iptables@kernel-bugs.osdl.org
>         ReportedBy: mingching.tiew@redtone.com
> 
> 
> Latest working kernel version: 2.6.23
> Earliest failing kernel version: 2.6.22.15

This doesn't make sense.  What we're trying to ask here (and we've been
unable to find a pair of questions which 100% of reporters can successfully
answer) is whether this is a regression, and in which kernel release did we
regress?

In other words: did we break it, and if so, when did we break it?

Comment 2 Ming-Ching Tiew 2008-01-09 16:01:19 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> This doesn't make sense. What we're trying to ask here (and we've been
> unable to find a pair of questions which 100% of reporters can successfully
> answer) is whether this is a regression, and in which kernel release did we
> regress?
>
> In other words: did we break it, and if so, when did we break it?
>
>   

Sorry for the confusion. I realized that mistake immediately after I 
posted it on the web interface. However, the web interface does not seem 
to allow me to correct that.

What I meant was that it failed on both of the kernel versions I tested. I 
am afraid it is a problem that has existed all along. Perhaps it has been 
broken for quite some time already. I need to go back and try some older 
kernel versions to see whether I can reproduce the problem.

Regards.
Comment 3 Ming-Ching Tiew 2008-01-09 16:07:53 UTC
Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Wed,  9 Jan 2008 11:55:50 -0800 (PST)
> bugme-daemon@bugzilla.kernel.org wrote:
>
>   
>> http://bugzilla.kernel.org/show_bug.cgi?id=9719
>>
>>            Summary: when a system is configured as a bridge, and at the same
>>                     time configured to have multipath weighted route, with
>>                     one leg goes thru NAT and another without NAT, the nat
>>                     path will intermittently get packets leaking out using
>>                     internal IP without being SNAT-ted
>>            Product: Networking
>>            Version: 2.5
>>      KernelVersion: 2.6.22.15 and 2.6.23
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: Netfilter/Iptables
>>         AssignedTo: networking_netfilter-iptables@kernel-bugs.osdl.org
>>         ReportedBy: mingching.tiew@redtone.com
>>
>>
>> Latest working kernel version: 2.6.23
>> Earliest failing kernel version: 2.6.22.15
>>     
>
> This doesn't make sense.  What we're trying to ask here (and we've been
> unable to find a pair of questions which 100% of reporters can successfully
> answer) is whether this is a regression, and in which kernel release did we
> regress?
>
> In other words: did we break it, and if so, when did we break it?
>   

Sorry for the confusion and for such a lousy first time bug reporter.

I realized that mistake immediately after I posted it on the web 
interface. However, the web interface does not seem to allow me to 
correct that.

What I meant was that it failed on both of the kernel versions I tested. I 
am afraid it is a problem that has existed all along. Perhaps it has been 
broken for quite some time already. I need to go back and try some older 
kernel versions to see whether I can reproduce the problem.

Regards.
Comment 4 Ming-Ching Tiew 2008-01-09 19:32:38 UTC
Ming-Ching Tiew wrote:
>
> What I meant was that it failed on both of the kernel versions I tested. 
> I am afraid it is a problem that has existed all along. Perhaps it has 
> been broken for quite some time already. I need to go back and try some 
> older kernel versions to see whether I can reproduce the problem.

OK, based on my attempts to repeat the problem, so far I could not find such 
misbehaviour on kernel 2.6.18. I will do more tests to make it more 
conclusive.
Comment 5 Ming-Ching Tiew 2008-01-09 20:26:07 UTC
Ming-Ching Tiew wrote:
> Ming-Ching Tiew wrote:
>>
>> What I meant was that it failed on both of the kernel versions I tested. 
>> I am afraid it is a problem that has existed all along. Perhaps it has 
>> been broken for quite some time already. I need to go back and try some 
>> older kernel versions to see whether I can reproduce the problem.
>
> OK, based on my attempts to repeat the problem, so far I could not find such 
> misbehaviour on kernel 2.6.18. I will do more tests to make it more 
> conclusive.
>

Sorry for jumping the gun. Kernel 2.6.18 has the same problem too.
From now on, I will refrain from posting early until I have conclusive 
results.
Comment 6 Patrick McHardy 2008-01-10 07:45:59 UTC
Andrew Morton wrote:
>> Distribution: iptables 1.4.0 was used with kernel 2.6.23 and iptables 1.3.8
>> with 2.6.22.15
>> Hardware Environment: 3 interfaces, 2 interfaces bridged to form br0, and
>> another connects to internet using pppoe.
>> Software Environment: bridge, multipath routing
>> Problem Description: when a system is configured as a bridge with IP
>> assigned
>> to br0 interface, and at the same time it is configured to have multipath
>> weighted default route, and one of the default route is NAT-ed and another
>> of
>> the default route is not NAT-ed, then it is NAT-ed interface will
>> occasionally
>> get packets leaking out to it with packets with private IPs.


That is most likely because the route changes over time (when the cache
is flushed) and the NAT mappings for the connection have been set up on
a different interface. The way to properly do this is to add routing
rules based on fwmark and use CONNMARK to bind a connection to one of
the interfaces after the initial multipath routing decision.
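
One way to watch the effect described above might be to ask the kernel which next hop it picks for a forwarded packet, flush the routing cache, and ask again. This is a hedged sketch for the kernel versions discussed in this report, which still have a routing cache; the client and destination addresses are hypothetical, and the script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical addresses for illustration; none of them come from the report.
CLIENT_IP="${CLIENT_IP:-192.168.1.10}"  # assumed LAN client behind br0
DEST_IP="${DEST_IP:-192.0.2.1}"         # assumed destination (documentation range)

# Ask the kernel which next hop it would use for a packet forwarded from the client.
route_get_cmd() {
    echo "ip route get $DEST_IP from $CLIENT_IP iif br0"
}

# Drop cached routing decisions so the multipath choice is made again.
flush_cmd() {
    echo "ip route flush cache"
}

# Print the sequence: look up, flush, look up again. Set RUN=1 (as root) to execute.
for cmd in "$(route_get_cmd)" "$(flush_cmd)" "$(route_get_cmd)"; do
    if [ "${RUN:-0}" = "1" ]; then $cmd; else echo "$cmd"; fi
done
```

If the two lookups report different output interfaces, an already established connection can end up on an uplink where its NAT mapping was never set up.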
Comment 7 Patrick McHardy 2008-02-02 03:56:32 UTC
Expected and well known behaviour. Proper way to configure this is:

ip route add default nexthop <route 1> realm 1 nexthop <route 2> realm 2

ip route add default <route 1> table 100
ip rule add fwmark 0x1 table 100

ip route add default <route 2> table 200
ip rule add fwmark 0x2 table 200

iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark
iptables -t mangle -A POSTROUTING -m connmark --mark 0x0 -m realm --realm 1 -j CONNMARK --set-mark 0x1
iptables -t mangle -A POSTROUTING -m connmark --mark 0x0 -m realm --realm 2 -j CONNMARK --set-mark 0x2

which makes sure connections stay on the same route that was initially chosen.

Please close.
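
Once the rules above are in place, a few read-only commands can help confirm that they took effect. This is a sketch rather than part of the recommended configuration; it assumes the table numbers 100 and 200 used above and prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Read-only checks; table numbers 100 and 200 are the ones used in the commands above.
check_cmds() {
    echo "ip rule show"                              # the fwmark rules should be listed
    echo "ip route show table 100"                   # default route for connections marked 0x1
    echo "ip route show table 200"                   # default route for connections marked 0x2
    echo "iptables -t mangle -L POSTROUTING -v -n"   # packet counters of the CONNMARK rules
}

# Print each check; set RUN=1 (as root) to execute them instead.
check_cmds | while read -r cmd; do
    if [ "${RUN:-0}" = "1" ]; then $cmd; else echo "$cmd"; fi
done
```

Rising packet counters on both CONNMARK --set-mark rules suggest that new connections are being marked on each route.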
Comment 8 Ming-Ching Tiew 2009-03-25 04:36:08 UTC
The original summary for this bug was longer than 255 characters, and so it was truncated when Bugzilla was upgraded. The original summary was:

when a system is configured as a bridge, and at the same time configured to have multipath weighted route, with one leg goes thru NAT and another without NAT, the nat path will intermittently get packets leaking out using internal IP without being SNAT-ted
