Bug 210569

Summary: ping over geneve would fail
Product: Networking Reporter: Jianlin Shi (jishi)
Component: OtherAssignee: Stephen Hemminger (stephen)
Status: NEW ---    
Severity: high CC: xiyou.wangcong
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.10.0-rc6 Subsystem:
Regression: No Bisected commit-id:

Description Jianlin Shi 2020-12-09 02:04:16 UTC
ping over geneve would fail on 5.10.0-rc6 after commit 
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
Commit: 55fd59b003f6 - Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

reproducer:

ip netns add client                                                                                   
ip netns add server                                                                                   
ip link add veth0_c type veth peer name veth0_s                                                       
ip link set veth0_c netns client                                                                      
ip link set veth0_s netns server                                                                      
ip netns exec client ip link set lo up                                                                
ip netns exec client ip link set veth0_c up                                                           
ip netns exec server ip link set lo up                                                                
ip netns exec server ip link set veth0_s up                                                           
ip netns exec client ip addr add 2000::1/64 dev veth0_c                                               
ip netns exec client ip addr add 10.10.0.1/24 dev veth0_c                                             
ip netns exec server ip addr add 2000::2/64 dev veth0_s                                               
ip netns exec server ip addr add 10.10.0.2/24 dev veth0_s                                             
ip netns exec client ping 10.10.0.2 -c 2                                                              
ip netns exec client ping6 2000::2 -c 2                                                               
ip netns exec client ip link add geneve1 type geneve vni 1234 remote 10.10.0.2 ttl 64                 
ip netns exec server ip link add geneve1 type geneve vni 1234 remote 10.10.0.1 ttl 64                 
ip netns exec client ip link set geneve1 up                                                           
ip netns exec client ip addr add 1.1.1.1/24 dev geneve1                                               
ip netns exec server ip link set geneve1 up                                                           
ip netns exec server ip addr add 1.1.1.2/24 dev geneve1
ip netns exec client ping 1.1.1.2 -c 3
Comment 1 Cong Wang 2020-12-10 06:22:05 UTC
Is this a regression of 5.10 merge window?
Comment 2 Jianlin Shi 2020-12-10 06:25:15 UTC
(In reply to Cong Wang from comment #1)
> Is this a regression of 5.10 merge window?

it seems so, the issue occurred since commit 55fd59b003f6
Comment 3 Cong Wang 2020-12-10 06:27:06 UTC
(In reply to Jianlin Shi from comment #2)
> (In reply to Cong Wang from comment #1)
> > Is this a regression of 5.10 merge window?
> 
> it seems so, the issue occurred since commit 55fd59b003f6

If you could do a bisect, it would save us a lot of time narrowing down this bug.
Comment 4 Cong Wang 2020-12-10 06:46:16 UTC
The last change is this:

commit 4179b00c04d18ea7013f68d578d80f3c9d13150a
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Dec 1 01:05:07 2020 -0800

    geneve: pull IP header before ECN decapsulation
    

I do not see anything wrong with this commit. Does reverting it make this bug disappear?
Comment 5 Cong Wang 2020-12-10 06:49:38 UTC
Well... Looks like the logic is reversed...

I feel confident we should reverse it back:

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 627c33333866..b24e79e3e697 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -263,11 +263,11 @@ static void geneve_rx(struct geneve_dev *geneve, struct geneve_sock *gs,
 
        switch (skb_protocol(skb, true)) {
        case htons(ETH_P_IP):
-               if (pskb_may_pull(skb, sizeof(struct iphdr)))
+               if (!pskb_may_pull(skb, sizeof(struct iphdr)))
                        goto rx_error;
                break;
        case htons(ETH_P_IPV6):
-               if (pskb_may_pull(skb, sizeof(struct ipv6hdr)))
+               if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
                        goto rx_error;
                break;
        default:
Comment 6 Jianlin Shi 2020-12-11 00:46:24 UTC
confirmed that the issue occurred since commit 4179b00c04d18ea7013f68d578d80f3c9d13150a on git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
Comment 7 Cong Wang 2020-12-12 00:25:40 UTC
(In reply to Jianlin Shi from comment #6)
> confirmed that the issue occurred since commit
> 4179b00c04d18ea7013f68d578d80f3c9d13150a on
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git

It has been reverted:

commit c02bd115b1d25931159f89c7d9bf47a30f5d4b41
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed Dec 9 14:39:56 2020 -0800

    Revert "geneve: pull IP header before ECN decapsulation"