ens3 MTU is 9000 bytes. The internal and guest network MTU is 65535 bytes. Sending packets to the guest network:
#> ip netns exec test ping -s 8700 fd00::6
PING fd00::6(fd00::6) 8700 data bytes
8708 bytes from fd00::6: icmp_seq=1 ttl=64 time=0.674 ms
8708 bytes from fd00::6: icmp_seq=2 ttl=64 time=0.648 ms
^C
--- fd00::6 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1008ms
rtt min/avg/max/mdev = 0.648/0.661/0.674/0.013 ms
[root@arch user (0)] #> ip netns exec test ping -s 32768 fd00::6
PING fd00::6(fd00::6) 32768 data bytes
^C
--- fd00::6 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1022ms
[root@arch user (1)] #>
Sending a large IPv6 packet through vxlan fails.
Created attachment 286095: My network interfaces
Isn't this expected behavior? Or should vxlan fragment the packets going over the tunnel? Since vxlan (65k) -> UDP (65k + header) and no ICMP need frag... Anyway, adding myself since I'm interested in the outcome.
Yes, vxlan is not sending the ICMP "need fragmentation" message for IPv6. According to the specification, an IPv6 packet that would have to be fragmented should trigger an ICMP need frag packet. VXLAN is not sending that packet, and this is what causes the fragmentation problem.
That's not what I'm saying... I'm saying that the vxlan tunnel is doing the right thing even though your 32K packets don't go through. My reasoning: for vxlan, your packet is within the MTU range, so it is accepted and tunneled. The next step is that it sends a UDP packet (size 32k + header) to the interface. But MTU is a fickle beast: your packet will be dropped in a bridge or in anything else in between, and since it's a UDP packet, it will be dropped silently. If it were a TCP packet, it could have been segmented, but the host might not expect that and still drop it as "broken". I assume that this is an OpenStack setup with openvswitch hybrid networking?
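To put rough numbers on the "32k + header" point, here is a minimal sketch of the size arithmetic. It assumes an IPv6 underlay and plain header sizes (no IPv6 extension headers, no VLAN tags); the thread does not say which address family the tunnel endpoints use, and the function and variable names are made up for illustration.

```shell
#!/bin/sh
# Assumed header sizes in bytes (no extension headers, no VLAN tags).
IPV6_HDR=40
ICMPV6_HDR=8
UDP_HDR=8
VXLAN_HDR=8
INNER_ETH_HDR=14

# outer_size PAYLOAD
# Size of the outer IPv6 packet that carries a tunneled ICMPv6 echo
# request with PAYLOAD data bytes (the value passed to "ping -s").
outer_size() {
    inner=$(( $1 + ICMPV6_HDR + IPV6_HDR ))
    echo $(( inner + INNER_ETH_HDR + VXLAN_HDR + UDP_HDR + IPV6_HDR ))
}

UNDERLAY_MTU=9000   # ens3 MTU from the report

for payload in 8700 32768; do
    size=$(outer_size "$payload")
    if [ "$size" -le "$UNDERLAY_MTU" ]; then
        echo "ping -s $payload: outer packet is $size bytes, fits in $UNDERLAY_MTU"
    else
        echo "ping -s $payload: outer packet is $size bytes, exceeds $UNDERLAY_MTU"
    fi
done
```

Under these assumptions the 8700-byte ping becomes an 8818-byte outer packet, which fits in 9000, while the 32768-byte ping becomes 32886 bytes, which does not. That matches the two results in the report.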
My network does not use openvswitch or OpenStack. It is based on netns and a Linux bridge.
Sorry, I'd still say it works as intended; this would be expected behavior. I looked briefly at the vxlan code and it only seems to propagate "do not fragment" flags, no fragmentation seems to be done. I'd say that the expectation is that the vxlan MTU has to be interface_mtu - vxlan_overhead, i.e. there would be no way to tunnel a network with a larger MTU.
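As a sketch of the interface_mtu - vxlan_overhead rule, the following computes the largest vxlan MTU for a given underlay MTU. The overhead values assume the usual encapsulation (outer IP header + 8 bytes UDP + 8 bytes VXLAN + 14 bytes inner Ethernet, no VLAN tags or IP options); vxlan_mtu is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# vxlan_mtu UNDERLAY_MTU FAMILY
# Largest inner MTU that still fits in one underlay packet.
# FAMILY is the address family of the tunnel endpoints: 4 or 6.
vxlan_mtu() {
    case "$2" in
        4) overhead=50 ;;  # 20 IPv4 + 8 UDP + 8 VXLAN + 14 inner Ethernet
        6) overhead=70 ;;  # 40 IPv6 + 8 UDP + 8 VXLAN + 14 inner Ethernet
        *) echo "family must be 4 or 6" >&2; return 1 ;;
    esac
    echo $(( $1 - overhead ))
}

vxlan_mtu 9000 4   # IPv4 underlay: prints 8950
vxlan_mtu 9000 6   # IPv6 underlay: prints 8930
```

For the ens3 MTU of 9000 in this report, that would mean lowering the vxlan and guest-side MTU from 65535 to at most 8950 or 8930, for example with ip link set dev <vxlan-device> mtu 8930, where the device name depends on the setup.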