Bug 206089

Summary: veth, virtio tx checksum wrong, forwarding drops packets
Product: Networking
Component: IPV4
Reporter: Harry Coin (hcoin)
Assignee: Stephen Hemminger (stephen)
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.3.0-24-generic #26-Ubuntu SMP
Subsystem:
Regression: No
Bisected commit-id:

Description Harry Coin 2020-01-05 21:34:41 UTC
With both IPv4 and IPv6, the problem appears when iface below is either
a virtio-net device in a KVM guest
or,
on bare metal, one end of a veth pair whose other end is in a filtering bridge.

With TX checksum offload on:

ethtool -K iface tx-checksum-ip-generic on
echo -n "rsschecksum test" | nc -w 3 -4 -u -b -s 192.168.172.3 192.168.172.63 52722
tcpdump -e -p -c 3 -n -vv -i iface ...

Bad TX Checksum generated by interface cephnoc0iface
15:04:21.350281 52:54:bc:75:61:1b > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 58: (tos 0x0, ttl 64, id 39761, offset 0, flags [DF], proto UDP (17), length 44)
192.168.172.3.48926 > 192.168.172.63.52722: [bad udp cksum 0xd9bd -> 0x1f01!] UDP, length 16
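
The two checksum values in the capture above can be reproduced by hand. The following is a minimal sketch (the helper names are illustrative and not part of this report) that computes the standard UDP/IPv4 checksum and, separately, the folded one's-complement sum of the pseudo-header alone. For the packet shown, the pseudo-header sum comes out as 0xd9bd and the full checksum as 0x1f01. This is the pattern a capture usually shows when checksum offload is on and the packet is observed before the checksum has been completed; on its own it does not establish why the packets are later dropped.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Fold data into a 16-bit one's-complement sum (not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksums(src: str, dst: str, sport: int, dport: int, payload: bytes):
    """Return (pseudo_header_sum, full_checksum) for a UDP/IPv4 datagram."""
    udp_len = 8 + len(payload)
    # IPv4 pseudo-header: source, destination, zero, protocol, UDP length
    pseudo = (
        socket.inet_aton(src)
        + socket.inet_aton(dst)
        + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_len)
    )
    # UDP header with the checksum field set to zero
    header = struct.pack("!HHHH", sport, dport, udp_len, 0)
    partial = ones_complement_sum(pseudo)
    full = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    # A computed checksum of zero is transmitted as 0xFFFF for UDP
    return partial, full or 0xFFFF


# Values taken from the capture above
partial, full = udp_checksums(
    "192.168.172.3", "192.168.172.63", 48926, 52722, b"rsschecksum test"
)
print(hex(partial), hex(full))
```

The first value is what tcpdump reported in the checksum field, the second is what it expected.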

But with TX checksum offload off:

ethtool -K iface tx-checksum-ip-generic off
echo -n "rsschecksum test" | nc -w 3 -4 -u -b -s 192.168.172.3 192.168.172.63 52722
tcpdump -e -p -c 3 -n -vv -i iface ...
tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
14:06:50.791664 52:54:00:0b:3d:7c > 52:54:a2:98:33:f0, ethertype IPv4 (0x0800), length 61: (tos 0x0, ttl 64, id 9609, offset 0, flags [DF], proto UDP (17), length 47)
10.12.112.180.34476 > 10.12.112.65.52722: [udp sum ok] UDP, length 19
14:06:50.791734 52:54:00:0b:3d:7c > 52:54:e6:39:16:e8, ethertype IPv4 (0x0800), length 61: (tos 0x0, ttl 64, id 20912, offset 0, flags [DF], proto UDP (17), length 47)
10.12.112.180.34476 > 10.12.112.66.52722: [udp sum ok] UDP, length 19
14:06:50.791784 52:54:00:0b:3d:7c > 52:54:da:59:f5:e0, ethertype IPv4 (0x0800), length 61: (tos 0x0, ttl 64, id 1977, offset 0, flags [DF], proto UDP (17), length 47)
10.12.112.180.34476 > 10.12.112.67.52722: [udp sum ok] UDP, length 19
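
When comparing longer captures, the checksum verdict can be pulled out of the `tcpdump -vv` text mechanically. This is a rough sketch that only recognises the two UDP markers that appear in the output above; the function name is hypothetical and other protocols or tcpdump versions may print different markers.

```python
import re

# Matches the marker tcpdump -vv printed for a failed UDP checksum above
BAD_UDP_RE = re.compile(r"\[bad udp cksum (0x[0-9a-f]+) -> (0x[0-9a-f]+)!\]")


def udp_checksum_verdict(line: str):
    """Return ("bad", found, expected), ("ok", None, None), or None."""
    match = BAD_UDP_RE.search(line)
    if match:
        return ("bad", int(match.group(1), 16), int(match.group(2), 16))
    if "[udp sum ok]" in line:
        return ("ok", None, None)
    return None  # line carries no UDP checksum verdict


# Lines copied from the two captures in this report
on_line = "192.168.172.3.48926 > 192.168.172.63.52722: [bad udp cksum 0xd9bd -> 0x1f01!] UDP, length 16"
off_line = "10.12.112.180.34476 > 10.12.112.65.52722: [udp sum ok] UDP, length 19"
print(udp_checksum_verdict(on_line))
print(udp_checksum_verdict(off_line))
```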

The problem was detected when packets were dropped while passing through all-Linux routers and filtering VLAN bridges. chrony running in a vanilla VM could not reach, over IPv6, an ntpsec time server running on bare metal on a physically adjacent server in a different subnet. The examples above use IPv4 and UDP, but the results are the same with IPv6 and with TCP.