Most recent kernel where this bug did *NOT* occur: 2.6.20.7
Distribution: Debian unstable
Hardware Environment: EM64T (Pentium D) running amd64 kernel
Software Environment: Debian unstable

Problem Description:
I have the hercules s/390 emulator running on an EM64T host, both running Debian unstable. I use a tun interface, a second IP address on eth0 and iptables/nat so the emulator has its own address on my local network.

With 2.6.21.1 on the host, networking between the emulator and the host system is fine (I can ssh from the host into the emulator without problems), but communication from the emulator with other boxes is broken. Other boxes also don't see the emulator if I ping its external address.

If I ping another box on my LAN from the emulator while running wireshark on the host, I can see that:
- the echo request gets sent OK
- the other box replies OK
- the host receives the echo reply
- but the tun interface never gets it.

If I boot the host with 2.6.20 everything works fine again.

Here is how the setup looks:

|---------------- host system --------------------| |-- emulator --|
          eth0                    tun                    ctc0
LAN <---> 10.19.66.21
LAN <---> 10.19.66.92   <--->   10.19.92.2    <--->    10.19.92.1
                         nat                   P2P

The only active iptables rules are:
iptables -t nat -A PREROUTING -d 10.19.66.92 \
    -j DNAT --to-destination 10.19.92.1
iptables -t nat -A POSTROUTING -s 10.19.92.1 \
    -j SNAT --to-source 10.19.66.92
Andrew Morton wrote:
> On Mon, 21 May 2007 13:05:36 -0700
> bugme-daemon@bugzilla.kernel.org wrote:
>
>>Problem Description:
>>I have the hercules s/390 emulator running on an EM64T host, both running
>>Debian unstable. I use a tun interface, a second IP address on eth0 and
>>iptables/nat so the emulator has its own address on my local network.
>>
>>With 2.6.21.1 on the host, networking between the emulator and the host system
>>is fine (I can ssh from the host into the emulator without problems), but
>>communication from the emulator with other boxes is broken. Other boxes also
>>don't see the emulator if I ping its external address.
>>
>>If I ping another box on my LAN from the emulator while running wireshark on
>>the host, I can see that:
>>- the echo request gets sent OK
>>- the other box replies OK
>>- the host receives the echo reply
>>- but the tun interface never gets it.
>>
>>If I boot the host with 2.6.20 everything works fine again.

Please post the output of lsmod and cat /proc/net/ip_conntrack after sending a ping.
Created attachment 11565 [details] lsmod on the host
Created attachment 11566 [details] Output of cat /proc/net/ip_conntrack on the host
Both files are after trying to ping 10.19.66.1 from the emulator. 10.19.66.1 and 10.19.66.2 are the DNS servers for my local LAN.
icmp 1 12 src=10.19.66.11 dst=10.19.66.255 type=8 code=0 id=58132 packets=1 bytes=28 [UNREPLIED] src=10.19.66.255 dst=10.19.66.11 type=0 code=0 id=58132 packets=0 bytes=0 mark=0 secmark=0 use=1
icmp 1 12 src=10.19.66.11 dst=10.19.66.0 type=8 code=0 id=58132 packets=1 bytes=28 [UNREPLIED] src=10.19.66.0 dst=10.19.66.11 type=0 code=0 id=58132 packets=0 bytes=0 mark=0 secmark=0 use=1

The first one shows the broadcast address as destination, the second one the network address. Are these really the addresses you pinged?
Created attachment 11567 [details]
Wireshark capture file

No, as I said in my later comment (#4), I pinged 10.19.66.1. 10.19.66.11 is my laptop. It has nothing to do with the ping.

I've tried again with another host (one that has no special role in the LAN) and left a ping running for a long time inside the emulator. Nothing at all shows up in ip_conntrack related to that ping during all that time. While the ping is running, ip_conntrack shows only the established ssh session between the emulator and the host and nothing else:

tcp 6 431563 ESTABLISHED src=10.19.92.2 dst=10.19.92.1 sport=34678 dport=22 packets=71 bytes=7036 src=10.19.92.1 dst=10.19.92.2 sport=22 dport=34678 packets=54 bytes=7072 [ASSURED] mark=0 secmark=0 use=1

So it looks like my timing was bad for the one I sent earlier :-/

Attached a wireshark capture file for the ping to that other host (10.19.66.19) while listening on all interfaces.
Created attachment 11568 [details]
Wireshark capture file for 2.6.20

I have just done the same ping with 2.6.20, and with that there is also _no_ listing in ip_conntrack for a connection with 10.19.66.19, even though it *does* work. The attached wireshark capture file clearly shows the packets being forwarded to 10.19.92.1 and the subsequent ssh traffic updating the display for the output of the ping command.
The connection tracking entry for pings is destroyed as soon as a reply is seen, so at least the reply seems to be properly associated with the connection. Can you please add logging rules to check how far the packet makes it, something like this:

for i in PREROUTING INPUT FORWARD OUTPUT POSTROUTING; do
    iptables -t mangle -I $i -p icmp -j LOG --log-prefix "$i "
done

Thanks.
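Once the test is done, the LOG rules inserted by the loop above can be taken out again by repeating each rule specification with -D instead of -I. The following is only a sketch: the function name remove_icmp_log_rules and the IPTABLES override variable are illustrative and not part of the original request, and it assumes the rules were added exactly as shown.

```shell
#!/bin/sh
# Hypothetical cleanup helper (not from the thread).
# IPTABLES can be overridden, e.g. IPTABLES=echo for a dry run.
IPTABLES="${IPTABLES:-iptables}"

remove_icmp_log_rules() {
    # Same chains and rule specification as the logging loop, but -D
    # deletes the matching rule from the mangle table instead of inserting it.
    for chain in PREROUTING INPUT FORWARD OUTPUT POSTROUTING; do
        "$IPTABLES" -t mangle -D "$chain" -p icmp -j LOG --log-prefix "$chain "
    done
}

# Usage (as root), after the ping test:
#   remove_icmp_log_rules
```

Running it with IPTABLES=echo first prints the five commands without touching the rule set.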
Here's what I get after a 'ping -c 1 10.19.66.19':

PREROUTING IN=tun0 OUT= MAC= SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1872 SEQ=1
FORWARD IN=tun0 OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1872 SEQ=1
POSTROUTING IN= OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1872 SEQ=1
PREROUTING IN=eth0 OUT= MAC=00:16:76:04:ff:09:00:10:83:cf:15:a5:08:00 SRC=10.19.66.19 DST=10.19.66.92 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19817 PROTO=ICMP TYPE=0 CODE=0 ID=1872 SEQ=1
bugme-daemon@bugzilla.kernel.org wrote:
> ------- Additional Comments From elendil@planet.nl 2007-05-23 03:55 -------
> Here's what I get after a 'ping -c 1 10.19.66.19':
>
> PREROUTING IN=tun0 OUT= MAC= SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00
> PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1872 SEQ=1
> FORWARD IN=tun0 OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00
> PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1872 SEQ=1
> POSTROUTING IN= OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00
> PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1872 SEQ=1
> PREROUTING IN=eth0 OUT= MAC=00:16:76:04:ff:09:00:10:83:cf:15:a5:08:00
> SRC=10.19.66.19 DST=10.19.66.92 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=19817
> PROTO=ICMP TYPE=0 CODE=0 ID=1872 SEQ=1

This all looks OK, but I still can't figure out what's wrong. Could you install the conntrack tool (included in Debian unstable, apt-get install conntrack) and post the ctnetlink events caused by the ping (conntrack -E)? Your kernel needs both CONFIG_NF_CONNTRACK_EVENTS and CONFIG_NF_CT_NETLINK for this.

Thanks.
Isn't there a final FORWARD missing?

Here's the output of conntrack -E after letting ping run for a bit:

[NEW] icmp 1 30 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 [UNREPLIED] src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041
[NEW] icmp 1 30 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 [UNREPLIED] src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041
[UPDATE] icmp 1 28 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041
[DESTROY] icmp 1 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 packets=1 bytes=84 src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041 packets=0 bytes=0
[UPDATE] icmp 1 30 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041
[DESTROY] icmp 1 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 packets=1 bytes=84 src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041 packets=0 bytes=0
[NEW] icmp 1 30 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 [UNREPLIED] src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041
[UPDATE] icmp 1 30 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041
[DESTROY] icmp 1 src=10.19.92.1 dst=10.19.66.19 type=8 code=0 id=2041 packets=1 bytes=84 src=10.19.66.19 dst=10.19.66.92 type=0 code=0 id=2041 packets=0 bytes=0

BTW, thanks for your efforts on this and for your quick responses.
For comparison, here is what I get with 2.6.20 with the iptables logging. Note the additional FORWARD and POSTROUTING lines.

PREROUTING IN=tun0 OUT= MAC= SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
IN=tun0 OUT= MAC= SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
IN=tun0 OUT= MAC= SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
FORWARD IN=tun0 OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
POSTROUTING IN= OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
IN= OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
PREROUTING IN=eth0 OUT= MAC=00:16:76:04:ff:09:00:10:83:cf:15:a5:08:00 SRC=10.19.66.19 DST=10.19.66.92 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=46515 PROTO=ICMP TYPE=0 CODE=0 ID=1642 SEQ=1
FORWARD IN=eth0 OUT=tun0 SRC=10.19.66.19 DST=10.19.92.1 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=46515 PROTO=ICMP TYPE=0 CODE=0 ID=1642 SEQ=1
POSTROUTING IN= OUT=tun0 SRC=10.19.66.19 DST=10.19.92.1 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=46515 PROTO=ICMP TYPE=0 CODE=0 ID=1642 SEQ=1

I also checked the output of conntrack -E with 2.6.20, but that looked identical.
Oops, I also had some '-t nat -j LOG' entries. Here is a clean output for 2.6.20 without those:

PREROUTING IN=tun0 OUT= MAC= SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
FORWARD IN=tun0 OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
POSTROUTING IN= OUT=eth0 SRC=10.19.92.1 DST=10.19.66.19 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1642 SEQ=1
PREROUTING IN=eth0 OUT= MAC=00:16:76:04:ff:09:00:10:83:cf:15:a5:08:00 SRC=10.19.66.19 DST=10.19.66.92 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=46515 PROTO=ICMP TYPE=0 CODE=0 ID=1642 SEQ=1
FORWARD IN=eth0 OUT=tun0 SRC=10.19.66.19 DST=10.19.92.1 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=46515 PROTO=ICMP TYPE=0 CODE=0 ID=1642 SEQ=1
POSTROUTING IN= OUT=tun0 SRC=10.19.66.19 DST=10.19.92.1 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=46515 PROTO=ICMP TYPE=0 CODE=0 ID=1642 SEQ=1
As it looked to me like we got stuck on this issue, I've done a git bisect and traced the regression to this commit:

commit 8030f54499925d073a88c09f30d5d844fb1b3190
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Thu Feb 22 01:53:47 2007 +0900

    [IPV4] devinet: Register inetdev earlier.

I have verified that this commit is the culprit by building a kernel from 2.6.21 with only that commit reverted. With that kernel installed on the host I could again ping normally from within the emulator.

Up to you people to figure out *how* this seemingly innocent patch causes the regression :-)

Note that the commit is related to 45ba9dd2007da23da5ac21179451c3c9fee30a96, which does the same for IPv6.
bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=8519
> ------- Additional Comments From elendil@planet.nl 2007-05-27 11:41 -------
> As it looked to me we got stuck on this issue, I've done a git bisect and
> traced the regression to this commit:
> commit 8030f54499925d073a88c09f30d5d844fb1b3190
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Thu Feb 22 01:53:47 2007 +0900
>
> [IPV4] devinet: Register inetdev earlier.
>
> I have verified that this commit is the culprit by building a kernel from
> 2.6.21 with only that commit reverted. With that kernel installed on the host
> I could again ping normally from within the emulator.
>
> Up to you people to figure out *how* this seemingly innocent patch causes the
> regression :-)
> Note that the commit is related to 45ba9dd2007da23da5ac21179451c3c9fee30a96,
> which does the same for IPv6.

Thanks a lot and sorry for the delay! This should make it a lot easier to figure out what's wrong. I'll look into it ..
I cannot see any side effects of this change that could be responsible. Herbert, would you mind having a look?
Hi: I suggest that you have a look at the rp_filter setting on eth0. If it's enabled then try disabling it.
Oh, and if that doesn't help then please take a capture of all the content of /proc/sys/net/ipv4/conf/eth0 with and without the patch to see if they're different. Thanks.
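A capture like the one requested above can be taken with a small loop that prints each file name followed by its value, so that two snapshots can be compared with diff. This is a rough sketch only: the function name snapshot_conf and the output file names are illustrative, and it assumes the per-interface settings live under /proc/sys/net/ipv4/conf/eth0.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): dump every setting in a
# directory as "path:" followed by the file's content.
snapshot_conf() {
    # $1: directory to dump, $2: output file
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        printf '%s:\n' "$f"
        # Some /proc entries may be unreadable; skip their content quietly.
        cat "$f" 2>/dev/null
    done > "$2"
}

# Usage, once per kernel:
#   snapshot_conf /proc/sys/net/ipv4/conf/eth0 ipv4.good   # without the patch
#   snapshot_conf /proc/sys/net/ipv4/conf/eth0 ipv4.bad    # with the patch
#   diff -u ipv4.good ipv4.bad
```

Keeping the file name on its own line means a changed value shows up in the diff directly under the setting it belongs to.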
> I suggest that you have a look at the rp_filter setting on eth0.

$ cat /proc/sys/net/ipv4/conf/eth0/rp_filter
0

The second suggestion shows the problem:

--- ipv4.good 2007-05-28 10:43:32.000000000 +0200
+++ ipv4.bad 2007-05-28 10:40:44.000000000 +0200
 /proc/sys/net/ipv4/conf/eth0/forwarding:
-1
+0
 /proc/sys/net/ipv4/conf/eth0/rp_filter:
-1
+0

Now, how does this patch cause the "forwarding" setting to not be set?

My /etc/sysctl.conf contains:
<snip>
# Uncomment the next line to enable Spoof protection (reverse-path filter)
net.ipv4.conf.default.rp_filter=1

# Uncomment the next line to enable TCP/IP SYN cookies
net.ipv4.tcp_syncookies=1

# Uncomment the next line to enable packet forwarding for IPv4
net.ipv4.conf.default.forwarding=1
</snip>

This file is read during /etc/rcS.d/S30procps.sh.

(/me feels kind of stupid for not checking these values earlier, but as the configuration had always worked fine...)
Changing the value in default only affects interfaces which are registered afterwards. Previously it affected interfaces which were brought up afterwards. I'll talk to others to see if we could come up with a way to minimise this sort of pain.
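Given the behavior described above, one way to avoid depending on when "default" is written is to set the value explicitly on every interface that is already registered. The sketch below is not something proposed in this thread: the function name apply_to_existing and the CONF_ROOT variable are illustrative, and it assumes the usual /proc/sys/net/ipv4/conf/<interface>/ layout.

```shell
#!/bin/sh
# Hypothetical workaround sketch (not from the thread).
# CONF_ROOT can be overridden for testing against a scratch directory.
CONF_ROOT="${CONF_ROOT:-/proc/sys/net/ipv4/conf}"

apply_to_existing() {
    # $1: setting name (e.g. forwarding or rp_filter), $2: value to write
    for dir in "$CONF_ROOT"/*/; do
        name=$(basename "$dir")
        # Leave the "all" and "default" pseudo-entries untouched; only
        # interfaces that are already registered are updated here.
        case "$name" in
            all|default) continue ;;
        esac
        [ -w "$dir$1" ] || continue
        printf '%s\n' "$2" > "$dir$1"
    done
}

# Usage (as root), after the interfaces exist:
#   apply_to_existing forwarding 1
#   apply_to_existing rp_filter 1
```

The "default" entries in /etc/sysctl.conf would still be needed for interfaces registered later, such as a tun device created after boot.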
> Changing the value in default only affects interfaces which are registered
> afterwards. Previously they affected interfaces which are brought up
> afterwards.

To me this seems like a significant change in behavior that is going to affect and probably break quite a few systems. I don't know about other distributions, but at least in Debian this has been the standard way of setting such values for all interfaces for a long time.

I wonder if the change was intentional or an unexpected result of this commit? At least the changelog offers no indication that this was intended.

I hope you will find a way to resolve this. If the new behavior remains, this will require at least careful documentation.
It should be fixed in the current kernel.