Bug 89161

Summary: Regression in bonding driver with devices that have no MAC address
Product: Networking
Component: Other
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.2.3
Subsystem:
Regression: Yes
Bisected commit-id:
Reporter: Toby Corkindale (tjc)
Assignee: Stephen Hemminger (stephen)
CC: alan, dingtianhong, szg00000, tjc
Attachments: Patch that resolves issue for me

Description Toby Corkindale 2014-12-02 10:14:56 UTC
In kernel 3.13 and earlier, a user could use the "bonding" network module to enslave devices without MAC addresses, such as ppp devices, in certain modes (such as balance-rr).

At some point between 3.13 and 3.17, this ability was removed -- even though apparently a commit was made to specifically ALLOW this to happen:
https://github.com/torvalds/linux/commit/f54424412b6b2f64cae4d7c39d981ca14ce0052c


On kernel 3.13, this was printed to syslog upon enslaving ppp0:
bonding: bond0: Warning: The first slave device specified does not support setting the MAC address. Setting fail_over_mac to active.
bonding: bond0: enslaving ppp0 as an active interface with an up link.

On kernels 3.17.2 and 3.17.4 (and probably others), this error appears instead:
bond0: Adding slave ppp0
bond0: The slave device specified does not support setting the MAC address

And the slave is not added.


In case it is relevant, I resorted to manually creating bond0 with the appropriate options (including fail_over_mac) preset, and yet the problem persists.

To replicate the problem, set up at least one ppp connection, and then follow these instructions:

    modprobe bonding
    echo '+bond0' > /sys/class/net/bonding_masters
    echo 'active' > /sys/class/net/bond0/bonding/fail_over_mac
    ifconfig bond0 down
    echo 'balance-rr' > /sys/class/net/bond0/bonding/mode
    ifconfig bond0 202.xx.xx.xx netmask 255.255.255.255 mtu 1492 up
    echo '500' > /sys/class/net/bond0/bonding/miimon

    ifconfig ppp0 down
    echo '+ppp0' > /sys/class/net/bond0/bonding/slaves


Under earlier kernel versions, that would work -- but current stable kernels fail.
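
To confirm whether the enslave step actually took effect, the bond's slaves file in sysfs can be inspected afterwards. The following is only a minimal sketch: bond_has_slave and the SYSFS_NET override are hypothetical names introduced for illustration, and it assumes the slaves file lists interface names separated by whitespace.

```shell
# Hypothetical helper: check whether IFACE is listed as a slave of BOND.
# SYSFS_NET is an assumed override (defaults to /sys/class/net) so the
# check can be pointed at a different directory tree.
# Returns 0 if listed, 1 if not listed, 2 if the slaves file is unreadable.
bond_has_slave() {
    bond=$1
    iface=$2
    slaves_file="${SYSFS_NET:-/sys/class/net}/$bond/bonding/slaves"

    if [ ! -r "$slaves_file" ]; then
        return 2
    fi

    # Walk the whitespace-separated interface names in the slaves file.
    for s in $(cat "$slaves_file"); do
        if [ "$s" = "$iface" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, running `bond_has_slave bond0 ppp0 && echo enslaved || echo "not enslaved"` after the steps above would be expected to report "not enslaved" on the affected kernels, since the slave is not added.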
Comment 1 Alan 2014-12-08 15:31:28 UTC
This is best also reported by email to netdev@vger.kernel.org.
Comment 2 Toby Corkindale 2015-09-23 04:07:34 UTC
Hi, just checking in to see if anyone has looked at this. I have seen that there have been a bunch of commits to the bonding driver since July 15, but the relevant bit of code still looks the same.
Comment 3 dingtianhong 2015-09-24 03:47:39 UTC
I don't think this is a kernel bug in 3.17. fail_over_mac should only be valid for bond AB (active-backup) mode, and you cannot change the mode when the slave does not support setting its MAC address. The right solution is to return an error, then set the right mode and try again.
Comment 4 Toby Corkindale 2015-09-24 03:49:09 UTC
Ah, but the error appears regardless of bond mode -- in my example above, I'm using balance-rr.

The error is returned regardless of me using "none", "active" or "follow" for fail_over_mac.
Comment 5 Toby Corkindale 2015-11-04 04:15:58 UTC
Just thought I'd drop in and mention that the regression/bug/issue is still present on kernel 4.2.3.

I do note that the code path to add slaves clearly looks like it should let the slave in with a warning only, but actually attempting this on a machine results in a failure and no slaves listed.
Comment 6 Toby Corkindale 2015-11-04 07:47:21 UTC
Created attachment 192041
Patch that resolves issue for me

I suspect this simple patch isn't of sufficient quality to use directly, but it works for me.