Bug 39132
Summary: | Starting with 3.0.0-rc6, masquerading seems to be broken. | ||
---|---|---|---|
Product: | Networking | Reporter: | David Hill (hilld) |
Component: | Netfilter/Iptables | Assignee: | networking_netfilter-iptables (networking_netfilter-iptables) |
Status: | CLOSED CODE_FIX | ||
Severity: | high | CC: | FJ.Whittle, florian, hilld, kaber, maciej.rutecki, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.0.0-rc6 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 36912 | ||
Attachments: | .config file for 3.0-rc6; Temp firewall used to validate that there was not a custom firewall rule in the way; tcpdump log of failed https data sessions |
Description
David Hill
2011-07-10 19:45:13 UTC
The server is a proxy, and its iptables rules seem to be affected by 3.0.0-rc6: proxying SSL sites stopped working with that kernel release. --log-level DEBUG is no longer recognised by iptables ...
OK, this is weird but getting clearer as my investigation progresses. The server is a proxy server serving some computers on the local area network. If I'm using a PC, I'm not experiencing any issues with anything. If I'm using a Mac and the built-in proxy engine in Mac OS X, SSL doesn't work at all. Safari and Chrome are thus not functional. If I'm using a Mac with Firefox 5.0, I'm not experiencing any SSL issues. I'm still trying to figure out where it's breaking and why. Any ideas on this? It doesn't make any sense but this is what I've found so far. If I reboot my server with 2.6.39.3, everything works fine for everybody regardless of the platform used.
OK... I have some more information on this issue. On the PC it's working because all protocols were using the proxy ... which wasn't the case with Chrome/Safari ... I've managed to get Chrome/Safari to work through the proxy, but if it tries to go direct, it doesn't work at all. It kind of "times out" ... This was working with kernel 2.6.39.3, so it's not a firewall issue. Furthermore, I've reduced my firewall rules to the strict minimum and it does the same thing.
Masquerading SSL is definitely broken for all platforms...
Finally, even non-SSL requests seem to be broken... The server isn't forwarding the request to the client that initiated the HTTP connection.
Created attachment 65222 [details]
.config file for 3.0-rc6
I've mentioned 2.6.39.3 a lot ... but it does work with 3.0-rc5 ... the problem seems to have started with rc6 only, and since I've seen some bug fixes in netfilter, I'm confident that some recent changes are creating this issue.
Created attachment 65232 [details]
Temp firewall used to validate that there was not a custom firewall rule in the way.
While looking at the commits between rc5 and rc6, I saw a netfilter_route bug fix... Could it be that commit that broke everything?
Here is a traceroute (I didn't think about running one before coming across a bug report on the Debian bug tracker). Notice the same behavior: when initiating a traceroute from the client to a given server, that server appears as the first hop even though it is not...
papineau:~ eth1$ traceroute www.google.ca
traceroute: Warning: www.google.ca has multiple addresses; using 74.125.93.105
traceroute to www.l.google.com (74.125.93.105), 64 hops max, 52 byte packets
1 qw-in-f105.1e100.net (74.125.93.105) 1.389 ms 0.737 ms 0.738 ms
2 10.71.240.1 (10.71.240.1) 7.557 ms 10.583 ms 9.352 ms
3 10.170.171.17 (10.170.171.17) 10.125 ms 11.803 ms 8.215 ms
4 10.170.162.114 (10.170.162.114) 8.807 ms 16.941 ms 8.658 ms
5 216.113.123.113 (216.113.123.113) 8.913 ms 25.996 ms 9.820 ms
6 216.113.122.58 (216.113.122.58) 30.744 ms 28.963 ms 27.825 ms
7 72.14.214.126 (72.14.214.126) 95.914 ms 27.461 ms 26.347 ms
8 216.239.48.108 (216.239.48.108) 27.329 ms 26.826 ms 26.238 ms
9 209.85.248.75 (209.85.248.75) 34.111 ms 37.592 ms 34.852 ms
10 209.85.254.237 (209.85.254.237) 33.582 ms 209.85.254.233 (209.85.254.233) 33.699 ms 209.85.254.235 (209.85.254.235) 33.069 ms
11 216.239.47.34 (216.239.47.34) 35.223 ms 216.239.46.78 (216.239.46.78) 41.498 ms 216.239.47.34 (216.239.47.34) 44.584 ms
12 qw-in-f105.1e100.net (74.125.93.105) 31.651 ms 33.862 ms 33.643 ms
That seems likely since we didn't have any other changes in that area. Could you please provide a dump captured with tcpdump -w <file> -s0 of the broken case and retry with that patch reverted?
This would be the faulty commit
commit 0e90ed0e8b9b1c25040442f1d20c799751b1e727
Merge: 5fc3054 16adf5d
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu Jun 30 10:44:52 2011 -0700
I reverted that commit, recompiled the kernel and rebooted. Voila! Everything is working fine!
I'm running 3.0.0-rc7 and NATting is still alive!
On 12.07.2011 05:49, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=39132
>
> --- Comment #11 from David Hill <hilld@binarystorm.net> 2011-07-12 03:49:40 ---
> This would be the faulty commit
>
> commit 0e90ed0e8b9b1c25040442f1d20c799751b1e727
> Merge: 5fc3054 16adf5d
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Thu Jun 30 10:44:52 2011 -0700
>
> I reverted that commit, recompiled the kernel and rebooted.
So you reverted the entire merge or just the routing fix?
Yes... I don't know how to unmerge a specific patch in that merge. I see there are 24 commits in that branch ... is there an easy way to unmerge only the netfilter patch?
commit 0e90ed0e8b9b1c25040442f1d20c799751b1e727
Merge: 5fc3054 16adf5d
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu Jun 30 10:44:52 2011 -0700
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
usbnet: Remove over-broad module alias from zaurus.
MAINTAINERS: drop Michael from bfin_mac driver
net/can: activate bit-timing calculation and netlink based drivers by default
rionet: fix NULL pointer dereference in rionet_remove
net+crypto: Use vmalloc for zlib inflate buffers.
netfilter: Fix ip_route_me_harder triggering ip_rt_bug
ipv4: Fix IPsec slowpath fragmentation problem
ipv4: Fix packet size calculation in __ip_append_data
cxgb3: skb_record_rx_queue now records the queue index relative to the net_device.
bridge: Only flood unregistered groups to routers
qlge: Add maintainer.
MAINTAINERS: mark socketcan-core lists as subscribers-only
MAINTAINERS: Remove Sven Eckelmann from BATMAN ADVANCED
r8169: fix wrong register use.
net/usb/kalmia: signedness bug in kalmia_bind()
net/usb: kalmia: Various fixes for better support of non-x86 architectures.
rtl8192cu: Fix missing firmware load
udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packet
ipv6/udp: Use the correct variable to determine non-blocking condition
netconsole: fix build when CONFIG_NETCONSOLE_DYNAMIC is turned on
...
git revert ed6e4ef836d425bc35e33bf20fcec95e68203afa should do the trick.
net/ipv4/netfilter/nf_nat_standalone.c: In function 'nf_nat_standalone_init':
net/ipv4/netfilter/nf_nat_standalone.c:287:80: warning: the comparison will always evaluate as 'true' for the address of 'nat_decode_session' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c: In function 'nf_nat_protocol_unregister':
net/ipv4/netfilter/nf_nat_core.c:528:92: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_unknown_protocol' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c: In function 'nf_nat_init':
net/ipv4/netfilter/nf_nat_core.c:739:93: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_unknown_protocol' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c:740:84: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_protocol_tcp' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c:741:84: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_protocol_udp' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c:742:86: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_protocol_icmp' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c:751:78: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_seq_adjust' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c:753:94: warning: the comparison will always evaluate as 'true' for the address of 'nfnetlink_parse_nat_setup' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_core.c:756:78: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_get_offset' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_ftp.c: In function 'nf_nat_ftp_init':
net/ipv4/netfilter/nf_nat_ftp.c:123:64: warning: the comparison will always evaluate as 'true' for the address of 'nf_nat_ftp' will never be NULL [-Waddress]
net/ipv4/netfilter/nf_nat_irc.c: In function 'nf_nat_irc_init':
net/ipv4/netfilter/nf_nat_irc.c:85:52: warning: the comparison will always evaluate as 'true' for the address of 'help' will never be NULL [-Waddress]
net/netfilter/nf_conntrack_core.c: In function 'nf_conntrack_init':
net/netfilter/nf_conntrack_core.c:1579:83: warning: the comparison will always evaluate as 'true' for the address of 'nf_conntrack_attach' will never be NULL [-Waddress]
net/netfilter/nf_conntrack_core.c:1580:79: warning: the comparison will always evaluate as 'true' for the address of 'destroy_conntrack' will never be NULL [-Waddress]
net/netfilter/nf_conntrack_proto.c: In function 'nf_conntrack_l3proto_unregister':
net/netfilter/nf_conntrack_proto.c:210:102: warning: the comparison will always evaluate as 'true' for the address of 'nf_conntrack_l3proto_generic' will never be NULL [-Waddress]
net/netfilter/nf_conntrack_proto.c: In function 'nf_conntrack_l4proto_unregister':
net/netfilter/nf_conntrack_proto.c:345:102: warning: the comparison will always evaluate as 'true' for the address of 'nf_conntrack_l4proto_generic' will never be NULL [-Waddress]
net/netfilter/nf_conntrack_proto.c: In function 'nf_conntrack_proto_init':
net/netfilter/nf_conntrack_proto.c:370:103: warning: the comparison will always evaluate as 'true' for the address of 'nf_conntrack_l3proto_generic' will never be NULL [-Waddress]
Reverting that commit solves the problem.
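For anyone wanting to repeat the single-commit revert discussed in this thread, here is a minimal sketch of how the commits brought in by a merge might be listed so that one of them can be picked out and reverted. The helper name merged_commits is hypothetical; the hashes in the usage comments are the ones quoted in this thread, and the sketch assumes it is run inside a git checkout of the kernel tree.

```shell
# Hypothetical helper: list the non-merge commits that a merge commit
# brought in, i.e. everything reachable from its second parent but not
# from its first parent. Prints one "<short-hash> <subject>" per line.
merged_commits() {
    # $1 = the merge commit to inspect
    git log --no-merges --oneline "$1^1..$1^2"
}

# Possible usage in a kernel tree (not executed here):
#   merged_commits 0e90ed0e8b9b1c25040442f1d20c799751b1e727 | grep -i netfilter
#   git revert ed6e4ef836d425bc35e33bf20fcec95e68203afa
```

Reverting only the suspect commit, rather than the whole merge, keeps the other networking fixes from that pull in place while testing.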
But a traceroute still appears weird:
papineau:~ eth1$ traceroute google.ca
traceroute: Warning: google.ca has multiple addresses; using 74.125.91.147
traceroute to google.ca (74.125.91.147), 64 hops max, 52 byte packets
1 qy-in-f147.1e100.net (74.125.91.147) 1.867 ms 1.192 ms 1.085 ms
2 10.71.240.1 (10.71.240.1) 15.449 ms 7.803 ms 6.383 ms
3 10.170.171.17 (10.170.171.17) 11.362 ms 11.794 ms 10.842 ms
4 10.170.162.114 (10.170.162.114) 15.019 ms 9.564 ms 8.221 ms
5 216.113.123.125 (216.113.123.125) 35.551 ms 9.471 ms 9.960 ms
6 216.113.123.190 (216.113.123.190) 22.374 ms 22.067 ms 23.155 ms
7 72.14.214.126 (72.14.214.126) 24.040 ms 41.641 ms 23.489 ms
8 216.239.48.108 (216.239.48.108) 31.167 ms 24.720 ms 24.154 ms
9 209.85.248.75 (209.85.248.75) 30.767 ms 38.030 ms 76.998 ms
10 209.85.254.239 (209.85.254.239) 30.425 ms 209.85.254.233 (209.85.254.233) 31.642 ms 28.777 ms
11 209.85.240.53 (209.85.240.53) 35.177 ms 209.85.240.21 (209.85.240.21) 39.539 ms 209.85.240.53 (209.85.240.53) 38.604 ms
12 qy-in-f147.1e100.net (74.125.91.147) 32.317 ms 31.584 ms 31.203 ms
Similar story here, bug is still present in v3.0
On Thu, 4 Aug 2011 21:38:48 +0300 (EEST) Julian Anastasov <ja@ssi.bg> wrote:
> Hello,
>
> On Thu, 4 Aug 2011, Florian Mickler wrote:
>
> > Can someone take a look at this regression?
> >
> > Begin forwarded message:
> >
> > Date: Thu, 28 Jul 2011 04:51:12 GMT
> > From: bugzilla-daemon@bugzilla.kernel.org
> > To: florian@mickler.org
> > Subject: [Bug 39132] Starting with 3.0.0-rc6, masquerading seems to be broken.
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=39132
>
> So, problem points again to "Fix ip_route_me_harder triggering ip_rt_bug" ? May be David C. Hill or Florian can provide some information, eg. is tproxy used, what NAT rules are used, any rules in OUTPUT hooks (NAT/mangle) and which packets are dropped.
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
That would have to come from David C. Hill, since I'm not experiencing this bug.
Regards,
Flo
p.s.: I added the bugzilla daemon to the cc in the hope that this mail will land as a comment in there.
I'm not the only one experiencing this bug... so you're telling me that my firewall rules (which haven't changed since kernel v2.6.0) are causing this bug? If I revert the commit specified above, it fixes the problem... so why say the bug is somewhere else when I can revert it and get back to normal? I'm attaching my firewall rules ... In fact... the firewall rules are already attached along with the .config file used to compile the kernel. I can only say that reverting the patch above solves almost all issues.
Sorry for the comment above. I cannot delete it, and it's off topic since I misread what you typed ... hehe. Sorry again.
Reply-To: ja@ssi.bg
Hello,
On Fri, 5 Aug 2011, David Hill wrote:
> I'm not using TPROXY and I've used a blank firewall with only masquerading and reproduced the issue.
> Nothing is in NAT/mangle nor OUTPUT but the rules mentioned in the attached files to this bug.
>
> Francis Whittle (Comment #18) has the same issue.
I compiled 3.0 kernel, added one -j MASQUERADE and tried TCP connection - it works. I'm not sure ip_route_me_harder is called for masqueraded traffic, usually it is called from LOCAL_OUT handlers or to send TCP RST (-j REJECT) via LOCAL_OUT, not for forwarded traffic.
Can you show lines of tcpdump output with addresses and ports, so that I can understand what kind of traffic is dropped, is it initial forwarded packet or its response, is it problem with some ICMP packets, I assume there is no problem with locally generated traffic.
Can you show output from:
# grep . /proc/sys/net/ipv4/conf/*/rp_filter
# grep . /proc/sys/net/ipv4/conf/*/send_redirects
If it works with -rc5 it should not be rp_filter, for NAT, problem can be with ICMP redirects or something else. Can you tell us if the internal and external devices are same or may be many.
Regards
--
Julian Anastasov <ja@ssi.bg>
Created attachment 68282 [details]
tcpdump log of failed https data sessions
Applies when using -j SNAT --to (external ipv4 address) as well.
Some specific sequences are never forwarded for outgoing NAT traffic. As you can see, routed traffic (in this case IPv6 because I only have the one public v4 address) works fine.
$ grep . /proc/sys/net/ipv4/conf/*/rp_filter /proc/sys/net/ipv4/conf/*/send_redirects
/proc/sys/net/ipv4/conf/all/rp_filter:0
/proc/sys/net/ipv4/conf/apex-v6/rp_filter:0
/proc/sys/net/ipv4/conf/default/rp_filter:0
/proc/sys/net/ipv4/conf/eth1/rp_filter:0
/proc/sys/net/ipv4/conf/lo/rp_filter:0
/proc/sys/net/ipv4/conf/mon.wlan0/rp_filter:0
/proc/sys/net/ipv4/conf/pan1/rp_filter:0
/proc/sys/net/ipv4/conf/ppp0/rp_filter:0
/proc/sys/net/ipv4/conf/sit0/rp_filter:0
/proc/sys/net/ipv4/conf/wlan0/rp_filter:0
/proc/sys/net/ipv4/conf/all/send_redirects:1
/proc/sys/net/ipv4/conf/apex-v6/send_redirects:1
/proc/sys/net/ipv4/conf/default/send_redirects:1
/proc/sys/net/ipv4/conf/eth1/send_redirects:1
/proc/sys/net/ipv4/conf/lo/send_redirects:1
/proc/sys/net/ipv4/conf/mon.wlan0/send_redirects:1
/proc/sys/net/ipv4/conf/pan1/send_redirects:1
/proc/sys/net/ipv4/conf/ppp0/send_redirects:1
/proc/sys/net/ipv4/conf/sit0/send_redirects:1
/proc/sys/net/ipv4/conf/wlan0/send_redirects:1
Apologies for the extraneous data in the tcpdump log. It didn't look so bad before I uploaded it. The lines you're looking for are packets that come from frank.noxious and are not forwarded as coming from local-external-ipv4.
This appears to mostly happen to fragmented packets. In particular I noticed this for seq 352:1792; there are others.
I'm not using the same type of
(In reply to comment #23)
> Reply-To: ja@ssi.bg
>
> Hello,
>
> On Fri, 5 Aug 2011, David Hill wrote:
>
> > I'm not using TPROXY and I've used a blank firewall with only masquerading and reproduced the issue.
> > Nothing is in NAT/mangle nor OUTPUT but the rules mentioned in the attached files to this bug.
> >
> > Francis Whittle (Comment #18) has the same issue.
>
> I compiled 3.0 kernel, added one -j MASQUERADE and tried TCP connection - it works. I'm not sure ip_route_me_harder is called for masqueraded traffic, usually it is called from LOCAL_OUT handlers or to send TCP RST (-j REJECT) via LOCAL_OUT, not for forwarded traffic.
>
> Can you show lines of tcpdump output with addresses and ports, so that I can understand what kind of traffic is dropped, is it initial forwarded packet or its response, is it problem with some ICMP packets, I assume there is no problem with locally generated traffic.
>
> Can you show output from:
>
> # grep . /proc/sys/net/ipv4/conf/*/rp_filter
> # grep . /proc/sys/net/ipv4/conf/*/send_redirects
>
> If it works with -rc5 it should not be rp_filter, for NAT, problem can be with ICMP redirects or something else. Can you tell us if the internal and external devices are same or may be many.
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
My internal and external devices are two devices that are not the same brand...
00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
The Intel adapter is plugged into the cable modem and the Realtek is plugged into the switch.
Reply-To: ja@ssi.bg
Hello,
On Fri, 5 Aug 2011, David Hill wrote:
> Hello Julian,
>
> I'm not using TPROXY and I've used a blank firewall with only masquerading and reproduced the issue.
> Nothing is in NAT/mangle nor OUTPUT but the rules mentioned in the attached files to this bug.
>
> Francis Whittle (Comment #18) has the same issue.
>
> > Hello,
> >
> > On Thu, 4 Aug 2011, Florian Mickler wrote:
> >
> > > Can someone take a look at this regression?
> > >
> > > Begin forwarded message:
> > >
> > > Date: Thu, 28 Jul 2011 04:51:12 GMT
> > > From: bugzilla-daemon@bugzilla.kernel.org
> > > To: florian@mickler.org
> > > Subject: [Bug 39132] Starting with 3.0.0-rc6, masquerading seems to be broken.
> > >
> > > https://bugzilla.kernel.org/show_bug.cgi?id=39132
> >
> > So, problem points again to "Fix ip_route_me_harder triggering ip_rt_bug" ? May be David C. Hill or Florian can provide some information, eg. is tproxy used, what NAT rules are used, any rules in OUTPUT hooks (NAT/mangle) and which packets are dropped.
May be it is a sequence of two problems. I now checked the tcpdump log from Francis Whittle. The "seq 352:1792" packet at 18:44:29.235154 that is not SNAT-ed is long, can it be some PMTU event that triggers ICMP response to the internal host? Because I see changes in MSS. May be rc5 triggers ICMP FRAG NEEDED while rc6 does not. It can happen because:
1. ICMP uses non-local iph->saddr when XFRM is compiled, reverse lookup fails with ENOENT but fl4->saddr is already damaged with the original daddr (non-local). Fix is here: http://marc.info/?t=131118984300003&r=1&w=2
2. The patched ip_route_me_harder between 3.0-rc5 and 3.0-rc6 expects that sockets always provide local address. This is wrong for some cases such as TCP (uses different SOCK_RAW socket for some packets and can cause problem for tproxy), RAW (can use spoofed sources) and now the ICMP code that incorrectly provides non-local address. Fix is here: http://marc.info/?t=131274411600001&r=1&w=2
I hope (any of) these two fixes should solve the masquerading problems. If that is not true, tcpdump from rc5 would be helpful for comparison.
Regards
--
Julian Anastasov <ja@ssi.bg>
Works for me in kernel mainline, prior to the commit applying http://marc.info/?t=131274411600001 (797fd3913a...). At a guess http://marc.info/?t=131118984300003&r=1&w=2 did the trick, though I haven't tested further.
Merged in v3.1-rc1:
commit 415b3334a21aa67806c52d1acf4e72e14f7f402f
Author: David S. Miller <davem@davemloft.net>
Date: Fri Jul 22 06:22:10 2011 -0700
icmp: Fix regression in nexthop resolution during replies.
Merged in v3.1-rc2:
commit 797fd3913abf2f7036003ab8d3d019cbea41affd
Author: Julian Anastasov <ja@ssi.bg>
Date: Sun Aug 7 09:11:00 2011 +0000
netfilter: TCP and raw fix for ip_route_me_harder |
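To check whether a particular kernel tree already carries the two fixes named above, something along these lines could be used. This is only a sketch: has_commit and report_fixes are hypothetical helper names, and it assumes the commands are run inside a git checkout of the kernel whose history contains the commit ids being queried.

```shell
# Hypothetical helper: succeed if commit $1 is an ancestor of (or equal
# to) the currently checked-out HEAD. Unknown commit ids count as absent.
has_commit() {
    git merge-base --is-ancestor "$1" HEAD 2>/dev/null
}

# Hypothetical helper: print "<id> present" or "<id> missing" for each
# commit id given as an argument.
report_fixes() {
    for id in "$@"; do
        if has_commit "$id"; then
            echo "$id present"
        else
            echo "$id missing"
        fi
    done
}

# Possible usage in a kernel tree (not executed here):
#   report_fixes 415b3334a21aa67806c52d1acf4e72e14f7f402f \
#                797fd3913abf2f7036003ab8d3d019cbea41affd
```

A "missing" result for either id suggests the tree predates the corresponding fix or has not had it backported under the same commit id.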