Bug 198521

Summary: VRF: VRF device does not egress all broadcast(255.255.255.255) destined packet
Product: Networking
Reporter: Sukumar (sukumarg1973)
Component: IPV4
Assignee: David Ahern (dsahern)
Status: RESOLVED CODE_FIX
Severity: blocking
Priority: P1
Hardware: All
OS: Linux
Kernel Version: Linux version 4.9.71
Subsystem:
Regression: No
Bisected commit-id:

Description Sukumar 2018-01-19 12:59:23 UTC
CONFIGURATION AND PACKET FLOW:
==============================

1) Created a VRF device (vrf_258) and enslaved the network device v2_F4252 to this VRF.

/exos/bin # ip link show vrf_258
13: vrf_258: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:96:9a:b4:f7 brd ff:ff:ff:ff:ff:ff


/exos/bin # ip link show v2_F4252
150: v2_F4252: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf_258 state UNKNOWN mode DEFAULT group default qlen 1
    link/ether 00:04:96:9a:b4:f7 brd ff:ff:ff:ff:ff:ff

/exos/bin # ifconfig -a v2_F4252
v2_F4252  Link encap:Ethernet  HWaddr 00:04:96:9A:B4:F7  
          inet addr:20.20.20.10  Bcast:20.20.20.255  Mask:255.255.255.0
          inet6 addr: 2001::1/64 Scope:Global
          inet6 addr: fe80::204:96ff:fe9a:b4f7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:44 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:12628 (12.3 KiB)  TX bytes:1184 (1.1 KiB)

/exos/bin # ifconfig -a vrf_258
vrf_258   Link encap:Ethernet  HWaddr 00:04:96:9A:B4:F7  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP RUNNING NOARP MASTER  MTU:65536  Metric:1
          RX packets:96 errors:0 dropped:0 overruns:0 frame:0
          TX packets:48 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:28368 (27.7 KiB)  TX bytes:14592 (14.2 KiB)


/exos/bin # ip route show  table 258
default via 20.20.20.1 dev v2_F4252 proto gated metric 10 
unreachable default metric 8192 
broadcast 20.20.20.0 dev v2_F4252 proto kernel scope link src 20.20.20.10 
20.20.20.0/24 dev v2_F4252 proto kernel scope link src 20.20.20.10 
local 20.20.20.10 dev v2_F4252 proto kernel scope host src 20.20.20.10 
broadcast 20.20.20.255 dev v2_F4252 proto kernel scope link src 20.20.20.10 
local 90.90.90.10 dev v9_F4254 proto kernel scope host src 90.90.90.10 
broadcast 127.0.0.0 dev vrf_258 proto kernel scope link src 127.0.0.1 
127.0.0.0/8 dev vrf_258 proto kernel scope link src 127.0.0.1 
local 127.0.0.1 dev vrf_258 proto kernel scope host src 127.0.0.1 
broadcast 127.255.255.255 dev vrf_258 proto kernel scope link src 127.0.0.1 


2) Opened a UDP socket, bound it to the vrf_258 device with SO_BINDTODEVICE, and enabled the SO_BROADCAST socket option.
A UDP packet is transmitted with SrcIP = 20.20.20.10 and DstIP = 255.255.255.255; the egress interface v2_F4252 is specified in the pktinfo cmsg header.
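
For anyone trying to reproduce this, the socket setup in step 2 can be sketched roughly as follows. This is a minimal user-space sketch, not the reporter's actual program: the helper names (open_bcast_socket, fill_pktinfo, send_lbcast) and the pktinfo_buf union are made up for illustration, and it assumes the "pktinfo cmsg header" is a standard IP_PKTINFO control message carrying the egress ifindex and the source address.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes struct in_pktinfo on glibc */
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper type: control buffer sized and aligned for one IP_PKTINFO. */
union pktinfo_buf {
    char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
    struct cmsghdr align;
};

/* Open a UDP socket bound to bind_dev with SO_BROADCAST enabled.
 * Returns the fd, or -1 on error (SO_BINDTODEVICE may require privileges). */
int open_bcast_socket(const char *bind_dev)
{
    int one = 1;
    int fd;

    assert(bind_dev != NULL);
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, bind_dev,
                   (socklen_t)strlen(bind_dev) + 1) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &one, sizeof(one)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Attach an IP_PKTINFO control message (egress ifindex + source IP) to msg. */
int fill_pktinfo(struct msghdr *msg, union pktinfo_buf *cb,
                 int ifindex, const char *src_ip)
{
    struct in_pktinfo pi;
    struct cmsghdr *cm;

    assert(msg != NULL && cb != NULL && src_ip != NULL);
    memset(&pi, 0, sizeof(pi));
    pi.ipi_ifindex = ifindex;
    if (inet_pton(AF_INET, src_ip, &pi.ipi_spec_dst) != 1)
        return -1;

    memset(cb, 0, sizeof(*cb));
    msg->msg_control = cb->buf;
    msg->msg_controllen = sizeof(cb->buf);
    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_level = IPPROTO_IP;
    cm->cmsg_type = IP_PKTINFO;
    cm->cmsg_len = CMSG_LEN(sizeof(pi));
    memcpy(CMSG_DATA(cm), &pi, sizeof(pi));
    return 0;
}

/* Send one datagram to 255.255.255.255:port, asking for egress_dev via pktinfo. */
ssize_t send_lbcast(int fd, const char *egress_dev, const char *src_ip,
                    uint16_t port, const void *data, size_t len)
{
    union pktinfo_buf cb;
    struct sockaddr_in dst;
    struct iovec iov;
    struct msghdr msg;
    unsigned int ifindex = if_nametoindex(egress_dev);

    if (ifindex == 0)
        return -1;              /* no such interface */
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);

    iov.iov_base = (void *)data;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &dst;
    msg.msg_namelen = sizeof(dst);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    if (fill_pktinfo(&msg, &cb, (int)ifindex, src_ip) < 0)
        return -1;
    return sendmsg(fd, &msg, 0);
}
```

With the devices from this report, the calls would be open_bcast_socket("vrf_258") followed by send_lbcast(fd, "v2_F4252", "20.20.20.10", port, buf, len); the port and payload are not given in the report, so they are left to the caller.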

3) udp_sendmsg() receives the packet and hands it to VRF processing.
vrf_ip_out() exempts only multicast packets from VRF processing. Broadcast packets are not exempted, so the VRF device
starts processing the broadcast packet destined to 255.255.255.255.

4) vrf_ip_out() gets the vrf->rth dst entry and invokes vrf_output().

5) Finally, the packet enters vrf_process_v4_outbound(). Here a route lookup is performed with
    ip_route_output_flow() for this flow on vrf_258.
    The lookup returned:
          rt->rt_gateway = 0,
          rt_type = 3 (BROADCAST),
          rt->rt_flags = 0x90000000 (BROADCAST and LOCAL),
          rt->dst.dev = vrf_258

    Instead of letting the packet egress, the check below (rt->dst.dev == vrf_dev) forces the packet onto the Rx path,
    so the packet is looped back and never egresses.
    if (rt->dst.dev == net->loopback_dev || rt->dst.dev == vrf_dev ) {
    }


Workaround:
===========

 1) Is 255.255.255.255 a routable address? If not, the packet should not be handed to VRF processing.
 2) This packet should also bypass VRF processing, as multicast packets do. The following patch solved the issue:
  
   static struct sk_buff *vrf_ip_out(struct net_device *vrf_dev,
                                     struct sock *sk,
                                     struct sk_buff *skb)
   {
           /* don't divert multicast */
           if (ipv4_is_multicast(ip_hdr(skb)->daddr))
                   return skb;

           /* MY PATCH BEGIN */
           /* don't divert broadcast */
           if (ipv4_is_lbcast(ip_hdr(skb)->daddr))
                   return skb;
           /* MY PATCH END */
Comment 1 David Ahern 2018-01-25 01:17:41 UTC
Thanks for the bug report.

The suggested change allows a packet to go out, but it is not the complete solution -- responses do not make it back to the sending socket. Getting that to happen is a bigger change. I'll get a patch out in the next few days.
Comment 2 David Ahern 2018-01-29 22:47:13 UTC
The reported problem is resolved by commit 1e19c4d689dc ("net: vrf: Add support for sends to local broadcast address"). That patch will get backported to 4.14. With that change you can bind a socket to the enslaved device using SO_BINDTODEVICE, send to the local broadcast address and receive responses.

A follow-on change, commit 9515a2e082f9 ("net/ipv4: Allow send to local broadcast from a socket bound to a VRF"), allows a socket to be bound to a VRF device and use IP_UNICAST_IF to set the egress interface. This patch will be in 4.16 and up.