Works with 2.6.34; the problem is still present in 2.6.35-rc2.

Container network is down because of iptables, but ONLY if the container is i386 (x86_64 containers work without trouble, which is strange). The host is 2.6.35-rc1 (fc13); container devices are veth (with a bridge at host level).

All i386 containers boot correctly, but the network is unable to route packets (ping gets no response). The problem is related to iptables activation: adding a SINGLE RULE (within the container's iptables) triggers it.

Here is the iptables example:

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:std - [0:0]
#-----------------------------------------------------------
#standard rules
#-A FORWARD -j std
#troubled Sequence !
-A INPUT -j std
#no packet reach if using that sequence
-A std -j ACCEPT
#
#correct No trouble
-A INPUT -j ACCEPT
#packet are accepted.
#-----------------------------------------------------------
COMMIT

I didn't try a host with an i386 2.6.35-rc1 kernel, only an x86_64 host with a mix of different distributions (fc10 -> fc12) and architectures as containers.
The bug is still present in 2.6.35-rc3; the exact same test setup works with 2.6.34.
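To make the two variants easier to switch between while testing, here is a minimal sketch that prints either ruleset in iptables-restore format. The function name make_ruleset and the mode names "jump" and "direct" are made up for illustration; the rules themselves are the ones from the report above.

```shell
#!/bin/sh
# Hypothetical helper: print one of the two rulesets from the report.
#   jump   : INPUT -> user chain "std" -> ACCEPT  (reported as failing on i386)
#   direct : INPUT -> ACCEPT                      (reported as working)
make_ruleset() {
    case $1 in
        jump)
            rules='-A INPUT -j std
-A std -j ACCEPT'
            ;;
        direct)
            rules='-A INPUT -j ACCEPT'
            ;;
        *)
            echo "usage: make_ruleset jump|direct" >&2
            return 1
            ;;
    esac
    # Table header and chain declarations are identical in both variants.
    cat <<EOF
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:std - [0:0]
$rules
COMMIT
EOF
}
```

Inside the container, as root, the output could be loaded with something like "make_ruleset jump | iptables-restore", then compared against "make_ruleset direct | iptables-restore".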
Please try adding a "-j TRACE" rule to see where the packets disappear.
Here is the data (just ONE ping):

Jun 15 22:02:11 Sorel kernel: [ 1886.459895] TRACE: mangle:PREROUTING:policy:1 IN=eth0 OUT= MAC=00:26:b9:67:a7:e1:00:19:d1:a8:be:cd:08:00 SRC=X.Y.Z.T DST=192.168.31.34 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=40727 SEQ=1
Jun 15 22:02:11 Sorel kernel: [ 1886.459923] TRACE: nat:PREROUTING:policy:1 IN=eth0 OUT= MAC=00:26:b9:67:a7:e1:00:19:d1:a8:be:cd:08:00 SRC=X.Y.Z.T DST=192.168.31.34 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=40727 SEQ=1
Jun 15 22:02:11 Sorel kernel: [ 1886.459953] TRACE: mangle:FORWARD:policy:1 IN=eth0 OUT=br0 SRC=X.Y.Z.T DST=192.168.31.34 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=40727 SEQ=1
Jun 15 22:02:11 Sorel kernel: [ 1886.459968] TRACE: filter:FORWARD:rule:1 IN=eth0 OUT=br0 SRC=X.Y.Z.T DST=192.168.31.34 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=40727 SEQ=1
Jun 15 22:02:11 Sorel kernel: [ 1886.459987] TRACE: filter:std:rule:7 IN=eth0 OUT=br0 SRC=X.Y.Z.T DST=192.168.31.34 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=40727 SEQ=1
Jun 15 22:02:11 Sorel kernel: [ 1886.460004] TRACE: mangle:POSTROUTING:policy:1 IN= OUT=br0 SRC=X.Y.Z.T DST=192.168.31.34 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=40727 SEQ=1
Jun 15 22:02:11 Sorel kernel: [ 1886.460020] TRACE: nat:POSTROUTING:policy:1 IN= OUT=br0 SRC=X.Y.Z.T DST=192.168.31.34 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=40727 SEQ=1

Keep in mind, we are working in container mode. The command line

iptables -t raw -p icmp -s X.Y.Z.T/32 -A PREROUTING -j TRACE

was applied on the container side (within /etc/rc.d/rc.local; X.Y.Z.T is the IP of a public server) but seems to apply to the host side too...
Here is the data for an x86-64 container on the same host (ping working):

Jun 15 22:10:23 Sorel kernel: [ 2378.271703] TRACE: raw:PREROUTING:policy:2 IN=eth0 OUT= MAC=00:26:b9:67:a7:e1:00:19:d1:a8:be:cd:08:00 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1
Jun 15 22:10:23 Sorel kernel: [ 2378.271738] TRACE: mangle:PREROUTING:policy:1 IN=eth0 OUT= MAC=00:26:b9:67:a7:e1:00:19:d1:a8:be:cd:08:00 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1
Jun 15 22:10:23 Sorel kernel: [ 2378.271766] TRACE: nat:PREROUTING:policy:1 IN=eth0 OUT= MAC=00:26:b9:67:a7:e1:00:19:d1:a8:be:cd:08:00 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1
Jun 15 22:10:23 Sorel kernel: [ 2378.271796] TRACE: mangle:FORWARD:policy:1 IN=eth0 OUT=br0 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1
Jun 15 22:10:23 Sorel kernel: [ 2378.271811] TRACE: filter:FORWARD:rule:1 IN=eth0 OUT=br0 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1
Jun 15 22:10:23 Sorel kernel: [ 2378.271829] TRACE: filter:std:rule:7 IN=eth0 OUT=br0 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1
Jun 15 22:10:23 Sorel kernel: [ 2378.271846] TRACE: mangle:POSTROUTING:policy:1 IN= OUT=br0 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1
Jun 15 22:10:23 Sorel kernel: [ 2378.271861] TRACE: nat:POSTROUTING:policy:1 IN= OUT=br0 SRC=X.Y.Z.T DST=192.168.31.35 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=16152 SEQ=1

I see no noticeable difference... once the packet has gone out on br0 we are in the dark.
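Comparing the two TRACE dumps by eye is error-prone, so here is a small sketch that reduces each dump to the sequence of table:chain:type:rulenum steps for one destination address. The function name trace_path and the log file argument are assumptions for illustration; the log line layout is the one shown in the dumps above.

```shell
#!/bin/sh
# Hypothetical helper: list the netfilter steps a traced packet went through.
#   $1 = destination IP exactly as it appears after "DST="
#   $2 = file holding the kernel log lines
trace_path() {
    dst=$1
    log=$2
    # Keep only lines for this destination (the trailing space avoids
    # matching a longer address that merely starts with $dst), then print
    # the word that follows "TRACE: ".
    grep -F "DST=$dst " "$log" |
        sed -n 's/.*TRACE: \([^ ]*\) .*/\1/p'
}
```

Running it once per container address on the same log file and diffing the two outputs should show at a glance whether the chains traversed differ.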
It might be related to bridge netfilter changes. Do you have CONFIG_BRIDGE_NETFILTER enabled? If so, please try:

echo 0 >/proc/sys/net/bridge/bridge-nf-call-iptables

If that doesn't help, try disabling the config option.

Do you see the packet in tcpdump on br0?
Yes:

CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NETFILTER=y
CONFIG_NETFILTER_DEBUG=y
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y

Now,

cat /proc/sys/net/bridge/bridge-nf-call-iptables
0

Same result: x86-64 OK, i386 not.
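For completeness, a short sketch that prints every bridge-nf-call-* setting in one go rather than only the iptables one. The function name bridge_nf_state is made up; the optional directory argument is an assumption added so the function can be pointed at a copy of the files.

```shell
#!/bin/sh
# Hypothetical helper: print "name=value" for each bridge-nf-call-* file.
#   $1 = directory to read (defaults to /proc/sys/net/bridge)
bridge_nf_state() {
    dir=${1:-/proc/sys/net/bridge}
    for f in "$dir"/bridge-nf-call-*; do
        # Skip the unexpanded pattern when the directory has no such files
        # (e.g. bridge netfilter not built in or module not loaded).
        [ -r "$f" ] || continue
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

On the host, "bridge_nf_state" with no argument would be expected to print one line per setting, such as bridge-nf-call-iptables=0 after the echo above.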
tcpdump on br0 (host side) shows:

to i386
22:36:25.729891 IP X.Y.Z.T > 192.168.31.34: ICMP echo request, id 60696, seq 1, length 64

to x86-64
22:37:15.964035 IP X.Y.Z.T > 192.168.31.35: ICMP echo request, id 61720, seq 1, length 64
22:37:15.964077 IP 192.168.31.35 > X.Y.Z.T: ICMP echo reply, id 61720, seq 1, length 64
It might be corrupting the MAC header. What does tcpdump show using '-e'? Please also post the output of tcpdump on the underlying ethernet device.
NO packet received on the container side. tcpdump -e on the host side:

22:52:19.100993 Out ea:13:7d:b6:4c:71 ethertype IPv4 (0x0800), length 100: X.Y.Z.T > 192.168.31.34: ICMP echo request, id 24345, seq 1, length 64
22:52:19.100998 Out ea:13:7d:b6:4c:71 ethertype IPv4 (0x0800), length 100: X.Y.Z.T > 192.168.31.34: ICMP echo request, id 24345, seq 1, length 64
22:52:35.005870 In 00:19:d1:a8:be:cd ethertype IPv4 (0x0800), length 100: X.Y.Z.T > 192.168.31.34: ICMP echo request, id 24857, seq 1, length 64
22:52:35.005891 Out ea:13:7d:b6:4c:71 ethertype IPv4 (0x0800), length 100: X.Y.Z.T > 192.168.31.34: ICMP echo request, id 24857, seq 1, length 64
22:52:35.005897 Out ea:13:7d:b6:4c:71 ethertype IPv4 (0x0800), length 100: X.Y.Z.T > 192.168.31.34: ICMP echo request, id 24857, seq 1, length 64

I do not know if this is meaningful: for the first ping after the container boot I get two lines of output; for the second ping, three lines. The ping command used was

ping -c1 192.168.31.34
New kernel with config:

CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NETFILTER=y
CONFIG_NETFILTER_DEBUG=y
# CONFIG_NETFILTER_ADVANCED is not set

kernel=2.6.35-rc3-NONET-trace-1+

Same trouble.
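Since the number of lines per ping seems to vary, here is a sketch that counts ICMP echo lines per ICMP id and direction in "tcpdump -i any -e" text output. The function name count_icmp is made up; it assumes the line layout shown above, where the second field is In or Out and the id follows the word "id".

```shell
#!/bin/sh
# Hypothetical helper: count ICMP echo lines per (id, direction).
# Reads tcpdump text output from the named files, or stdin if none given.
count_icmp() {
    awk '
        / ICMP echo / {
            dir = $2                      # "In" or "Out" with -i any -e
            id = "?"
            for (i = 1; i < NF; i++)
                if ($i == "id") {         # the id is the word after "id"
                    id = $(i + 1)
                    sub(/,$/, "", id)     # drop the trailing comma
                }
            n[id " " dir]++
        }
        END { for (k in n) print k, n[k] }
    ' "$@" | LC_ALL=C sort
}
```

Fed the five lines above, this should print "24345 Out 2", "24857 In 1" and "24857 Out 2", which matches the two-then-three pattern described.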
I'm only seeing a single MAC address in the traces. Please attach a binary dump (-s0 -w dump) from both -i eth0 and -i br0 on the host side.
OK, now we've overlapped :) So it's not bridge netfilter related. I'm currently out of ideas, let me think about it.
The command used was

/usr/sbin/tcpdump -n -i any -e icmp and host X.Y.Z.T

so all interfaces are "dumped" for ICMP packets coming from X.Y.Z.T.
Handled-By : Patrick McHardy <kaber@trash.net>
Problem STILL PRESENT with 2.6.35-rc4. I have redone the test; the exact same containers were used with both 2.6.34 and 2.6.35-rc4.

Please note (hopefully this is meaningful): there is a bug in 2.6.34 regarding containers and /sys. On 2.6.34, /sys is NOT namespaced, so a listing of /sys/class/net shows:

lrwxrwxrwx 1 root root 0 Jul 10 16:00 br0 -> ../../devices/virtual/net/br0
lrwxrwxrwx 1 root root 0 Jul 10 16:00 eth0 -> ../../devices/pci0000:00/0000:00:1c.5/0000:04:00.0/net/eth0
lrwxrwxrwx 1 root root 0 Jul 10 16:00 lo -> ../../devices/virtual/net/lo
lrwxrwxrwx 1 root root 0 Jul 10 16:00 sit0 -> ../../devices/virtual/net/sit0
lrwxrwxrwx 1 root root 0 Jul 10 16:00 To_2595 -> ../../devices/virtual/net/To_2595
lrwxrwxrwx 1 root root 0 Jul 10 16:00 To_2662 -> ../../devices/virtual/net/To_2662
lrwxrwxrwx 1 root root 0 Jul 10 16:00 To_2749 -> ../../devices/virtual/net/To_2749

This means that within a container you can see the "host network devices".

This bug is fixed starting with 2.6.35; now /sys/class/net displays only the container's own network devices:

lrwxrwxrwx 1 root root 0 Jul 10 16:05 eth0 -> ../../devices/virtual/net/eth0
lrwxrwxrwx 1 root root 0 Jul 10 16:05 lo -> ../../devices/virtual/net/lo
lrwxrwxrwx 1 root root 0 Jul 10 16:06 sit0 -> ../../devices/virtual/net/sit0

This is perfect... I wonder if the iptables problem could be related to the /sys improvement; if so, why only in i386 mode?

I have done some tests with a container using an (old) RH8.0 template and the problem is the same.
Sorry for the delay. I'll try to reproduce locally.
Is this issue still present in current mainline kernels?
I believe the fix is in 2.6.36-rc3:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=cca77b7c81876d819a5806f408b3c29b5b61a815

It's also in the queue for 2.6.35-stable.
Confirmed: the bug is fixed in 2.6.36-rc6. Thanks.
Thank you for the confirmation!