Bug 92081
| Summary: | skb->len=0 and getting "EOF on netlink" with "ip monitor all" (of iproute) when adding a vlan with "bridge vlan add" | | |
|---|---|---|---|
| Product: | Networking | Reporter: | Rami Rosen (ramirose) |
| Component: | Other | Assignee: | Stephen Hemminger (stephen) |
| Status: | CLOSED PATCH_ALREADY_AVAILABLE | | |
| Severity: | high | CC: | ramirose, roopa, szg00000 |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.17.6-300 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Rami Rosen
2015-01-26 18:15:01 UTC
The reason for the zero-length message in this case is that the user is sending the setlink request to the bridge with the self flag set. Since getlink on the bridge device only returns bytes when the device is a bridge port, there are no bytes in the skb.

There are two fixes needed:
- one is the skb->len check
- the other is to fix the bridge driver's ndo_bridge_getlink to return correct vlan changes for the bridge device

Pasting here more detailed comments from Rami Rosen on the netdev list:

For the sake of those who are interested in more implementation details and in a code walkthrough of this scenario, here is what happens when running "bridge vlan add vid 1 dev br0 self".

The rtnl_bridge_setlink() method is invoked in this case. See:
http://lxr.free-electrons.com/source/net/core/rtnetlink.c#L2782

If the SELF flag is set, it calls dev->netdev_ops->ndo_bridge_setlink(). See:
http://lxr.free-electrons.com/source/net/core/rtnetlink.c#L2840

It then calls rtnl_bridge_notify(). See:
http://lxr.free-electrons.com/source/net/core/rtnetlink.c#L2850

rtnl_bridge_notify() in turn calls dev->netdev_ops->ndo_bridge_getlink() when the self flag is set. See:
http://lxr.free-electrons.com/source/net/core/rtnetlink.c#L2767

When "bridge vlan add" is run on a bridge device, as we do here (and **not on a bridge port**), the dev variable is an instance of a software bridge. So this calls the ndo_bridge_getlink() callback of the software bridge, which is br_getlink(). See:
http://lxr.free-electrons.com/source/net/bridge/br_netlink.c#L205

br_getlink() first checks whether the device is a bridge port:

    struct net_bridge_port *port = br_port_get_rtnl(dev);

and it returns 0 if it is not. As a result, skb->len is 0 and an empty notification is sent. When the rtnetlink socket, which is opened by "ip monitor all" and listens for netlink messages, receives an empty notification, it terminates with the "EOF" message (as mentioned in the bugzilla link).
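The path described above can be sketched as a small user-space model. This is only an illustration, not kernel code: `model_skb`, `model_dev`, `model_getlink()`, `model_notify()` and the 32-byte payload size are hypothetical stand-ins for the real structures and callbacks, and the `check_len` flag is an assumed simplification of the skb->len check named above as one of the two fixes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct sk_buff and struct net_device. */
struct model_skb {
    size_t len;                 /* number of bytes written into the message */
};

struct model_dev {
    const char *name;
    bool is_bridge_port;        /* false for the bridge device itself (br0) */
};

/*
 * Models the getlink behaviour described in the report: when the device is
 * not a bridge port, return 0 (success) without writing anything.
 */
static int model_getlink(struct model_skb *skb, const struct model_dev *dev)
{
    if (!dev->is_bridge_port)
        return 0;               /* success, but the skb stays empty */

    skb->len += 32;             /* arbitrary placeholder payload size */
    return 0;
}

/*
 * Models the notify step. Returns the number of bytes that would be sent to
 * listeners, or -1 if no notification is sent at all.
 */
static int model_notify(const struct model_dev *dev, bool check_len)
{
    struct model_skb skb = { .len = 0 };

    assert(dev != NULL);

    if (model_getlink(&skb, dev) < 0)
        return -1;

    /* The guard: do not send a notification that carries no bytes. */
    if (check_len && skb.len == 0)
        return -1;

    return (int)skb.len;        /* 0 here means an empty message went out */
}
```

In this model, calling `model_notify()` for a device with `is_bridge_port == false` and `check_len == false` returns 0, which corresponds to the empty notification that makes "ip monitor all" report EOF. With `check_len == true` the same call returns -1, meaning nothing is sent. The second fix, making getlink report vlan changes for the bridge device itself, is not modelled here.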
It seems to me that this bug should be closed. The following patch from Roopa Prabhu fixed it:
http://www.spinics.net/lists/netdev/msg314256.html
This bug ID is mentioned in the commit message as the reason for submitting the patch. The patch was already integrated in 3.19. I tested the same scenario on 4.4 and the problem described in this bug did not occur. So unless I get an objection within 24 hours, I intend to close it.

Regards,
Rami Rosen
Intel Corporation

As said yesterday, I am changing the status to "RESOLVED", as the following patch from Roopa Prabhu fixed it:
http://www.spinics.net/lists/netdev/msg314256.html

Rami Rosen
Intel Corporation

Closing the bug.

Rami Rosen
Intel Corporation