Bug 194749 - kernel bonding does not work in a network namespace in versions above 3.10.0-229.20.1
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: x86-64 Linux
Importance: P1 blocking
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-01 21:08 UTC by Dan Geist
Modified: 2017-03-02 22:23 UTC

See Also:
Kernel Version: > 3.10.0-229.20.1
Subsystem:
Regression: No
Bisected commit-id:


Description Dan Geist 2017-03-01 21:08:01 UTC
A bond interface is used in active-backup (active/standby) mode with two physical NICs inside a network namespace (netns) to provide switch-path redundancy.

The netns is set up post-boot with the following commands:

ip netns add vntp
ip link set p4p1 netns vntp
ip link set p4p2 netns vntp
ip link set bond0 netns vntp
ip netns exec vntp ip link set lo up
ip netns exec vntp ip link set p4p1 up
ip netns exec vntp ip link set p4p2 up
ip netns exec vntp ip link set bond0 up
ip netns exec vntp ifenslave bond0 p4p1 p4p2

This works as expected with kernel versions up to and including 3.10.0-229.20.1 (CentOS 7 packages). With the next patch level and every later version released by the packager, bringing up bond0 fails and the following appears in the system logs:

Mar  1 19:33:42 fed1ntpi01 rc.local: Cannot find device "bond0"
Mar  1 19:33:42 fed1ntpi01 rc.local: Master 'bond0': Error: handshake with driver failed. Aborting
Mar  1 19:33:42 fed1ntpi01 rc.local: Cannot find device "bond0"
Mar  1 19:33:42 fed1ntpi01 rc.local: Cannot find device "bond0"
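
To check whether a given host falls in the affected range, the running kernel release can be compared against the last known-good package version. The following is a minimal sketch, not part of the original report: the function name `kernel_newer_than` is hypothetical, it assumes GNU `sort -V` is available, and it assumes the release string carries a distribution suffix of the form `.el7.x86_64`, which is stripped before comparing.

```shell
# Hypothetical helper: succeeds if kernel release $2 is strictly newer than $1.
# Assumes GNU sort with -V (version sort) and an optional ".elN.arch" suffix.
kernel_newer_than() {
    base=$1
    cand=${2%%.el*}   # drop a distribution suffix such as ".el7.x86_64"
    # Equal versions are not "newer".
    [ "$cand" != "$base" ] || return 1
    # After version-sorting both strings, the newer one comes last.
    newest=$(printf '%s\n%s\n' "$base" "$cand" | sort -V | tail -n 1)
    [ "$newest" = "$cand" ]
}

# Usage (not executed here):
#   kernel_newer_than 3.10.0-229.20.1 "$(uname -r)" && echo "bond0 cannot be moved into a netns"
```
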
Comment 1 Dan Geist 2017-03-02 22:23:57 UTC
On advice from the kernel networking developers, I learned that this behavior is intentional: it prevents non-hardware interface types (bond, bridge, etc.) from being moved between network namespaces.
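
One way to observe this restriction on a running system is to look for the `netns-local` device feature in the output of `ethtool -k`, assuming the kernel exposes the restriction through that flag (recent kernels may not list it). The helper name `is_netns_local` is hypothetical and the snippet is only a sketch:

```shell
# Hypothetical helper: reads `ethtool -k <dev>` output on stdin and succeeds
# if the device is flagged as local to its network namespace.
is_netns_local() {
    grep -Eq '^[[:space:]]*netns-local:[[:space:]]+on'
}

# Usage (not executed here):
#   ethtool -k bond0 | is_netns_local && echo "bond0 cannot change netns"
```

A device flagged this way has to be created inside the target namespace rather than moved into it.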

After disabling the default bond0 configuration in the CentOS startup framework and creating bond0 directly inside the netns, as in the following script, the setup works with the newest kernel in the RHEL7/CentOS7 releases:

$ cat /etc/rc.local
#!/bin/bash
# Load the module once with its options; modprobe on an already loaded module is a no-op and would ignore them
modprobe bonding mode=1 resend_igmp=1 updelay=0 use_carrier=1 miimon=100 downdelay=0 xmit_hash_policy=0 primary_reselect=0 fail_over_mac=0 arp_validate=0 lacp_rate=0 arp_interval=0 ad_select=0
ip netns add vntp
ip link set p4p1 netns vntp
ip link set p4p2 netns vntp
ip netns exec vntp ip link add dev bond0 type bond mode active-backup
ip netns exec vntp ip link set lo up
ip netns exec vntp ip link set p4p1 up
ip netns exec vntp ip link set p4p2 up
ip netns exec vntp ip link set bond0 up
ip netns exec vntp ifenslave bond0 p4p1 p4p2
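
The sequence above generalizes to other namespace, bond, and NIC names. As a rough sketch (the function name `bond_in_netns_cmds` is hypothetical and not part of the original setup), the commands can be generated first and reviewed before they are run as root. The key point is that the physical NICs are moved into the netns while the bond is created there:

```shell
# Hypothetical helper: prints the commands that build an active-backup bond
# inside a network namespace. Usage: bond_in_netns_cmds <netns> <bond> <nic>...
bond_in_netns_cmds() {
    ns=$1
    bond=$2
    shift 2
    echo "ip netns add $ns"
    # Physical NICs may be moved into the namespace.
    for nic in "$@"; do
        echo "ip link set $nic netns $ns"
    done
    # The bond is created inside the namespace instead of being moved.
    echo "ip netns exec $ns ip link add dev $bond type bond mode active-backup"
    echo "ip netns exec $ns ip link set lo up"
    for nic in "$@"; do
        echo "ip netns exec $ns ip link set $nic up"
    done
    echo "ip netns exec $ns ip link set $bond up"
    echo "ip netns exec $ns ifenslave $bond $*"
}

# Usage (not executed here):
#   bond_in_netns_cmds vntp bond0 p4p1 p4p2        # review the commands
#   bond_in_netns_cmds vntp bond0 p4p1 p4p2 | sh   # apply them, as root
```
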

I believe this bug report can be closed and marked as "user misunderstood feature".
