Bug 206223 - Kernel >= 5.4.11 breaks macvlan IPv6 (non link-local)
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_network@kernel-bugs.osdl.org
 
Reported: 2020-01-16 10:48 UTC by Tarek
Modified: 2020-01-23 19:43 UTC

Kernel Version: 5.4.11
Regression: No

Description Tarek 2020-01-16 10:48:03 UTC
Starting with kernel 5.4.11, macvlan links created in bridge mode on top of my Ethernet port can no longer communicate over IPv6 once they are brought up. No traffic to global IPv6 addresses gets through on the macvlan interface, although IPv4 still works. This is on an Arch Linux system.

Before kernel 5.4.11, a macvlan instance was automatically assigned a global IPv6 address after being brought up, as expected (the router uses stateless IPv6 configuration).

# ip link add mvlan link enp7s0 type macvlan mode bridge
# ip link set mvlan up
# ip addr show mvlan
6: mvlan@enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 16:65:77:04:2e:23 brd ff:ff:ff:ff:ff:ff
    inet6 fd00:1::1465:77ff:fe04:2e23/64 scope global dynamic mngtmpaddr
       valid_lft 14699sec preferred_lft 14399sec
    inet6 2001:xxxx:xxxx:xxxx:1465:77ff:fe04:2e23/64 scope global dynamic mngtmpaddr
       valid_lft 14699sec preferred_lft 14399sec
    inet6 fe80::1465:77ff:fe04:2e23/64 scope link
       valid_lft forever preferred_lft forever

Before 5.4.11, pinging both global and link-local IPv6 addresses works through the mvlan interface.
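
For reference, the interface identifier in the addresses above (1465:77ff:fe04:2e23) follows from the interface's MAC address 16:65:77:04:2e:23 by the modified EUI-64 rule used in stateless autoconfiguration: flip the universal/local bit of the first octet and insert ff:fe in the middle. A minimal C sketch of that derivation is below. The function names mac_to_eui64 and format_iid are hypothetical and not part of any kernel or library API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Build the 64-bit modified EUI-64 interface identifier from a 48-bit MAC:
 * flip the universal/local bit (0x02) of the first octet and insert
 * ff:fe between the third and fourth octets. */
void mac_to_eui64(const uint8_t mac[6], uint8_t iid[8])
{
    assert(mac != NULL && iid != NULL);
    iid[0] = mac[0] ^ 0x02;
    iid[1] = mac[1];
    iid[2] = mac[2];
    iid[3] = 0xff;
    iid[4] = 0xfe;
    iid[5] = mac[3];
    iid[6] = mac[4];
    iid[7] = mac[5];
}

/* Format the identifier as the last four 16-bit groups of an IPv6
 * address, each group printed in hex without leading zeros. */
void format_iid(const uint8_t iid[8], char *out, size_t len)
{
    assert(iid != NULL && out != NULL);
    snprintf(out, len, "%x:%x:%x:%x",
             (unsigned)((iid[0] << 8) | iid[1]),
             (unsigned)((iid[2] << 8) | iid[3]),
             (unsigned)((iid[4] << 8) | iid[5]),
             (unsigned)((iid[6] << 8) | iid[7]));
}
```

Calling mac_to_eui64 on 16:65:77:04:2e:23 and passing the result to format_iid yields 1465:77ff:fe04:2e23, which is the suffix shared by the fd00:1::, 2001: and fe80:: addresses in the output above.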

Starting with kernel 5.4.11, only a link-local IPv6 address is assigned and no global IPv6 address is assigned when the device is brought up. 

# ip link set mvlan up
# ip addr show mvlan
6: mvlan@enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 16:65:77:04:2e:23 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::1465:77ff:fe04:2e23/64 scope link
       valid_lft forever preferred_lft forever

Manually adding a global IPv6 address to the macvlan interface succeeds without errors, but pinging any global IPv6 address then fails with "Network is unreachable." Pings to link-local IPv6 addresses still work. Hence, IPv6 communication with the outside world is effectively broken on the macvlan interface. I also could not find any errors in dmesg related to this problem.

IPv4 still works, however. If I assign an IPv4 address manually to the macvlan interface or use dhclient to lease one from the router, I can ping and communicate over IPv4 on the macvlan interface. So the issue affects IPv6 only.

I've also tried this on 5.4.12 and the latest 5.5-rc6 kernel, and the problem is still there. 5.4.10 is the last kernel where macvlan IPv6 worked for me.

I diffed 5.4.10 against 5.4.11 and saw a change in drivers/net/macvlan.c: the macvlan_broadcast() function was changed to use skb_eth_hdr() instead of eth_hdr(). After I reverted that change back to eth_hdr(), my macvlan instances could communicate over IPv6 again on kernel 5.4.11 and up. With the revert, a global IPv6 address is auto-assigned when the device is brought up, just as on 5.4.10, and pinging the outside world works as expected. The same revert also fixes the problem on 5.4.12 and 5.5-rc6.
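
To illustrate why the choice between the two helpers can matter, here is a toy userspace model in C. It rests on two assumptions about the kernel helpers: eth_hdr() returns the Ethernet header at the recorded MAC header offset, and skb_eth_hdr() returns whatever the data pointer currently points at. On the receive path the data pointer has normally already been advanced past the 14-byte Ethernet header, so the two can disagree. Everything named toy_* is hypothetical and is not the kernel's struct sk_buff or its API. This is a sketch of a likely explanation, not a statement of what the driver does in every path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_HLEN 14  /* 6 bytes dst + 6 bytes src + 2 bytes EtherType */

/* Toy stand-in for a socket buffer; NOT the kernel's struct sk_buff. */
struct toy_skb {
    uint8_t buf[64];    /* frame storage */
    size_t  mac_header; /* offset of the Ethernet header within buf */
    size_t  data;       /* offset of the current data pointer within buf */
};

/* Analogue of eth_hdr(): use the recorded MAC header offset. */
const uint8_t *toy_eth_hdr(const struct toy_skb *skb)
{
    return skb->buf + skb->mac_header;
}

/* Analogue of skb_eth_hdr(): assume data points at the Ethernet header. */
const uint8_t *toy_skb_eth_hdr(const struct toy_skb *skb)
{
    return skb->buf + skb->data;
}

/* Build an IPv6 frame. If on_rx_path is nonzero, mimic a received frame
 * whose data pointer was already moved past the Ethernet header;
 * otherwise mimic a frame whose data pointer is still at the header. */
void toy_fill(struct toy_skb *skb, const uint8_t dst[6],
              const uint8_t src[6], int on_rx_path)
{
    assert(skb != NULL);
    memset(skb, 0, sizeof(*skb));
    memcpy(skb->buf, dst, 6);
    memcpy(skb->buf + 6, src, 6);
    skb->buf[12] = 0x86;            /* EtherType 0x86dd = IPv6 */
    skb->buf[13] = 0xdd;
    skb->buf[TOY_ETH_HLEN] = 0x60;  /* first payload byte: IP version 6 */
    skb->mac_header = 0;
    skb->data = on_rx_path ? TOY_ETH_HLEN : 0;
}
```

In this model, for a received frame addressed to 33:33:00:00:00:01, toy_eth_hdr() returns the real destination address, while toy_skb_eth_hdr() returns the first bytes of the IPv6 header, so any filtering based on the destination address would look at the wrong bytes. When the data pointer is still at the Ethernet header, both helpers agree.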

I'm not a network programmer by any means, so I don't know why that change broke IPv6 or whether reverting it will have unintended consequences. But reverting that change is what fixed it for me, and I haven't seen any other issues so far.
Comment 1 Scott Ellis 2020-01-19 00:03:31 UTC
I see different behavior, but I believe it has the same root cause.

My symptom is that ARP responses aren't sent from macvlan interfaces (e.g., docker containers using macvlan).  The culprit is https://patchwork.ozlabs.org/patch/1218459/ , introduced in 5.4.11.

Reverting that change restores the original (desired) behavior.
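
One plausible link between this ARP symptom and the IPv6 symptom in the description: ARP requests are sent to the Ethernet broadcast address, and unsolicited IPv6 router advertisements are sent to the all-nodes multicast group ff02::1, which maps to the Ethernet address 33:33:00:00:00:01. Both destinations have the Ethernet group bit set, so a fault in how group-addressed frames are handled could affect both. The C sketch below shows the group-bit test and the IPv6 multicast-to-MAC mapping. The toy_* names are hypothetical, and the assumption that macvlan routes all such frames through its broadcast path is a reading of the thread, not something verified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True for broadcast and multicast Ethernet addresses: the group bit is
 * the least significant bit of the first octet. */
bool toy_is_group_addr(const uint8_t mac[6])
{
    assert(mac != NULL);
    return (mac[0] & 0x01) != 0;
}

/* Map an IPv6 multicast address to its Ethernet address:
 * 33:33 followed by the low 32 bits of the IPv6 address. */
void toy_ipv6_mcast_to_mac(const uint8_t ip6[16], uint8_t mac[6])
{
    assert(ip6 != NULL && mac != NULL);
    mac[0] = 0x33;
    mac[1] = 0x33;
    mac[2] = ip6[12];
    mac[3] = ip6[13];
    mac[4] = ip6[14];
    mac[5] = ip6[15];
}
```

With these helpers, ff02::1 maps to 33:33:00:00:00:01, and both that address and ff:ff:ff:ff:ff:ff are classified as group addresses, while a unicast address such as 16:65:77:04:2e:23 is not.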
Comment 2 Dave 2020-01-19 17:21:34 UTC
Just confirming that I have the same behavior as described by Scott Ellis, and it does affect IPv4 as well.  About half of my macvlan docker containers were unreachable outside of the host machine by IPv4.
Comment 3 Z. Liu 2020-01-22 17:30:11 UTC
I have the same problem as described by Scott Ellis after updating the kernel to 4.19.97, which has the same patch applied.
Comment 4 Andy Wang 2020-01-23 03:24:51 UTC
I'm seeing the same problem with my libvirt VMs using macvlan.
Comment 5 Tarek 2020-01-23 19:43:31 UTC
FYI, this has been fixed as of today in the latest kernel, 5.4.14, with commit: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c17e025049a639b78bb87a15494116b90f2de94f
