VRF: ICMPV6 Echo Reply failed to egress if ingress pkt Src is IPV6 Global and Dest is IPV6 Link Local. CONFIGURATION AND PACKET FLOW: ============================== 1) Created VRF device(Vrf_258) and enslaved network device(v1_F4246) to this VRF. Assigned Global address 2001::1 and fe80::1 to VLAN device. /exos/bin # ifconfig -a v1_F4246 v1_F4246 Link encap:Ethernet HWaddr 00:04:96:98:C9:18 inet6 addr: 2001::1/64 Scope:Global inet6 addr: fe80::1/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:52 errors:0 dropped:0 overruns:0 frame:0 TX packets:98 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:3024 (2.9 KiB) TX bytes:8116 (7.9 KiB) /exos/bin # ifconfig -a vrf_258 vrf_258 Link encap:Ethernet HWaddr 00:04:96:98:C9:18 inet addr:127.0.0.1 Mask:255.0.0.0 UP RUNNING NOARP MASTER MTU:65536 Metric:1 RX packets:27 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:1512 (1.4 KiB) TX bytes:0 (0.0 B) /exos/bin # ip link show vrf_258 14: vrf_258: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 00:04:96:98:c9:18 brd ff:ff:ff:ff:ff:ff /exos/bin # ip link show v1_F4246 54: v1_F4246: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf_258 state UNKNOWN mode DEFAULT group default qlen 1000 link/ether 00:04:96:98:c9:18 brd ff:ff:ff:ff:ff:ff 2) Received Ping request to local intf IP 2001::1 from neigbour DUT with following SRC/Dest. Rx ICMPV6 Echo Req Pkt's SRC IP = fe800000:0:0:2, DEST IP = 20010000:0:0:1 3) Incoming packet SRC is IPv6 Global address so skb->dev changed to vrf_258 by ip receive function. 4) Since it is local delivered packet icmpv6_echo_reply() is invoked. This function tries to send reply with SRC IP = 20010000:0:0:1 DEST IP = fe800000:0:0:2 flowi6_oif = vrf_dev258 5) ip6_dst_lookup() is failed for this packet and got dropped. fl6.flowi6_oif = icmp6_iif(skb); flowi6_oif is having vrf device index when ip6_dst_lookup() is invoked. Pkt flows to ip6_dst_lookup() -> ip6_dst_lookup_tail()->ip6_route_output_flags()->l3mdev_link_scope_lookup()->vrf_link_scope_lookup() There is a check in vrf_link_scope_lookup() which is dropping the reply packet. In this case flowi6_oif and dev->ifindex are same. .. if (fl6->flowi6_oif == dev->ifindex) { dst = &net->ipv6.ip6_null_entry->dst; dst_hold(dst); return dst; } .. TEMP FIX: ========= Return Ingress vlan device's ifIndex instead of skb->dev's. static int icmp6_iif(const struct sk_buff *skb) { int iif = IP6CB(skb)->iif; // skb->dev->ifindex; .. .. }
Hi David, Thanks for your suggested fix. Ping response with Src Global and Dest Link-Local is a valid scenario. In many circumstances we check ping to Global address from the interface which have only link-local configured. I tried your changes with small correction. I am trying to send ping response packet out with Src Global and Dest Link local so the check should be /* Link local, Multicast should not have vrf device as egress */ if (fl6->flowi6_oif == dev->ifindex && rt6_need_strict(&fl6->saddr)) { .. return dst; } After adding the corrected check the packet proceed further. Now rtlookup is failing with oif as vrf device (Src=20010000:0:0:1, Dest=fe800000:0:0:2, oif=vrf_258). rt = vrf_ip6_route_lookup(net, dev, fl6, fl6->flowi6_oif, flags); vrf_ip6_route_lookup() calls ip6_pol_route() which returned error. Routing table ========== /exos/bin # ip -6 route show table 258 anycast 2001:: dev v1_F4246 proto kernel metric 0 pref medium local 2001::1 dev v1_F4246 proto kernel metric 0 pref medium 2001::/64 dev v1_F4246 proto kernel metric 256 pref medium anycast fe80:: dev v1_F4246 proto kernel metric 0 pref medium local fe80::1 dev v1_F4246 proto kernel metric 0 pref medium fe80::/64 dev v1_F4246 proto kernel metric 256 pref medium ff00::/8 dev v1_F4246 metric 256 pref medium To make route look pass set FLOWI_FLAG_SKIP_NH_OIF to fl6->flowi6_flags in vrf_link_scope_lookup() routine for the global source address. Proposed code changes: =================== changes in-line as "Fix" static struct dst_entry *vrf_link_scope_lookup(const struct net_device *dev, struct flowi6 *fl6) { .. .. /* VRF device does not have a link-local address and * sending packets to link-local or mcast addresses over * a VRF device does not make sense */ if (fl6->flowi6_oif == dev->ifindex && rt6_need_strict(&fl6->saddr)) { => Fix1<= .. .. } if (!ipv6_addr_any(&fl6->saddr)) flags |= RT6_LOOKUP_F_HAS_SADDR; => Fix2 start here<= /* Skip NH vrf oif for route lookup */ if (fl6->flowi6_oif == dev->ifindex) { fl6->flowi6_flags |= FLOWI_FLAG_SKIP_NH_OIF; } => Fix2 end here<= .. .. } On Fri, May 4, 2018 at 9:02 AM, David Ahern <dsahern@gmail.com> wrote: > On 4/30/18 6:58 AM, Sukumar Gopalakrishnan wrote: >> VRF: ICMPV6 Echo Reply failed to egress if ingress pkt Src is IPV6 >> Global and Dest is IPV6 Link Local. > > ... > >> if (fl6->flowi6_oif == dev->ifindex) { > > try adding ' && !rt6_need_strict(saddr)' to the above. > > If it works, add a comment above the line noting this case (link local > dest and global source) submit a patch. > > >> dst = &net->ipv6.ip6_null_entry->dst; >> dst_hold(dst); >> return dst; >> } >> .. >> >> >> TEMP FIX: >> ========= >> >> Return Ingress vlan device's ifIndex instead of skb->dev's. >> >> static int icmp6_iif(const struct sk_buff *skb) >> { >> int iif = IP6CB(skb)->iif; // skb->dev->ifindex; >> >> .. >> .. >> } >> >
This is fixed in the following commit in net tree: commit e1ae5c2ea478 Author: Stephen Suryaputra <ssuryaextr@gmail.com> Date: Mon Jun 10 10:32:50 2019 -0400 vrf: Increment Icmp6InMsgs on the original netdev Get the ingress interface and increment ICMP counters based on that instead of skb->dev when the the dev is a VRF device. This is a follow up on the following message: https://www.spinics.net/lists/netdev/msg560268.html v2: Avoid changing skb->dev since it has unintended effect for local delivery (David Ahern). Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(In reply to Stephen Suryaputra from comment #2) > This is fixed in the following commit in net tree: > > commit e1ae5c2ea478 > Author: Stephen Suryaputra <ssuryaextr@gmail.com> > Date: Mon Jun 10 10:32:50 2019 -0400 > > vrf: Increment Icmp6InMsgs on the original netdev > > Get the ingress interface and increment ICMP counters based on that > instead of skb->dev when the the dev is a VRF device. > > This is a follow up on the following message: > https://www.spinics.net/lists/netdev/msg560268.html > > v2: Avoid changing skb->dev since it has unintended effect for local > delivery (David Ahern). > Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> > Reviewed-by: David Ahern <dsahern@gmail.com> > Signed-off-by: David S. Miller <davem@davemloft.net> My apology. This commit alone doesn't fix the bug.