Bug 200755
| Summary: | UDP packets can arrive on wrong sockets when using connected sockets and SO_REUSEPORT | | |
|---|---|---|---|
| Product: | Networking | Reporter: | Andrew Cann (shum) |
| Component: | Other | Assignee: | Stephen Hemminger (stephen) |
| Status: | NEW --- | | |
| Severity: | normal | CC: | mark.keaton |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 4.14.51 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | Bug example python code; Bug example C code | | |
Description
Andrew Cann
2018-08-07 06:22:48 UTC
I have recently been hit by this issue after upgrading some of the Linux systems that I use for testing. A program that has worked for many years now has this UDP port reuse issue when run using a 4.15.0 kernel. The last kernel that I know of without this issue is 4.4.0.

The program uses multiple UDP sockets, all bound to the same local UDP port number, to communicate with multiple remote endpoints simultaneously. To do this, the program creates UDP sockets that have the SO_REUSEPORT option enabled using setsockopt(), are bound to the same local address/port using bind(), and are connected to the different remote endpoints using connect(). The program must keep each flow of datagrams to/from the remote endpoints separate in order to function correctly. It is using the SO_REUSEPORT option purely to allow the single UDP port number to be reused by all of the UDP sockets, not to distribute the incoming datagrams between the UDP sockets (i.e., load balancing).

When run on a system with a 4.15.0 kernel, I am observing that all of the incoming datagrams are delivered to only one of the UDP sockets, regardless of the incoming datagram's source address and port. All of the other UDP sockets receive nothing. The program needs to have incoming datagrams delivered to the proper UDP socket that is connected to the datagram's source address/port.

The man page socket(7) states that the SO_REUSEPORT option on UDP sockets can provide distribution of incoming datagrams over all of the sockets sharing the port number (assuming that the sockets are not connected, I would think). However, the program is not demonstrating that behavior either, as only one socket is receiving all of the incoming datagrams.

I have attached a small C program that demonstrates this issue. Just compile and run the program.

Regarding the notion that this might just be due to a bug in the connect(2) man page, I believe that the man page is correct as written.
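For readers without the attachments at hand, the socket setup described in this comment (SO_REUSEPORT, then bind(), then connect()) can be sketched in Python roughly as follows. This is an illustrative sketch, not the attached program: the helper name `open_flows` and its parameters are hypothetical, and it assumes a Linux host where `socket.SO_REUSEPORT` is available.

```python
import socket


def open_flows(local_host, local_port, remotes):
    """Open one connected UDP socket per remote, all sharing one local port.

    `open_flows` is a hypothetical helper written for this illustration.
    """
    flows = []
    for remote in remotes:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # SO_REUSEPORT has to be set before bind() on every socket
        # that is going to share the local port.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((local_host, local_port))
        # If port 0 was requested, reuse whatever port the kernel picked
        # so that the remaining sockets bind to the same one.
        local_port = s.getsockname()[1]
        # connect() records the peer; a connected UDP socket is expected
        # to receive only datagrams sent from that address and port.
        s.connect(remote)
        flows.append(s)
    return flows
```

With this setup, calling `recv()` on each returned socket is expected to yield only datagrams from that socket's own peer. On the affected kernels the report says this expectation does not hold.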
It has long been the behavior for a connected UDP socket to send packets only to, and receive packets only from, the specified remote address and UDP port number. The use of SO_REUSEPORT on multiple UDP sockets in order to be able to simply reuse a single local UDP port number should not interfere with this connected datagram socket behavior.

Looking at the kernel source code for 4.15.0 and comparing it to that for 4.4.0, the bug appears to be in the udp.c file in the udp4_lib_lookup2() and __udp4_lib_lookup() functions. These functions iterate over all of the UDP socket structures and call compute_score() on each socket in order to determine where an incoming datagram should be directed. There is new code that calls into reuseport_select_sock() for the first socket that has a positive score and has the port reuse option set, and returns in the middle of the loop if the call returns a non-NULL value. This new code does not do any testing to see if the socket is a connected socket, and can cause the loop to return before all of the UDP sockets have been tested for the highest score.

Finally, I have not been successful in finding a simple workaround for this issue until it is fixed. I have even added BPF filters to the UDP sockets in an attempt to filter the incoming datagrams based on source address and port, but this didn't work either as the filter does not have access to the IP header. I am faced with having to rearchitect the program in order to utilize a single UDP socket with packet multiplexing/demultiplexing done in user space, which should not be necessary.

Created attachment 283509
Bug example C code
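The lookup behavior described in the comment can be illustrated with a small user-space model. This is a deliberately simplified sketch in Python and is not the kernel code: the `Sock` class, the score values, and the port-modulo stand-in for the reuseport hash are all assumptions made for illustration. It only shows how returning from the loop at the first reuseport match can bypass a connected socket that a full scan would have chosen.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Addr = Tuple[str, int]


@dataclass(frozen=True)
class Sock:
    """Toy stand-in for a UDP socket entry (not a kernel structure)."""
    name: str
    local_port: int
    peer: Optional[Addr] = None   # set when the socket is connected
    reuseport: bool = False


def score(sk: Sock, src: Addr, dport: int) -> int:
    """Simplified scoring: -1 means no match, higher means more specific."""
    if sk.local_port != dport:
        return -1
    if sk.peer is None:
        return 1
    return 5 if sk.peer == src else -1


def lookup_full_scan(socks: List[Sock], src: Addr, dport: int) -> Optional[Sock]:
    """Test every socket and keep the one with the highest score."""
    best, best_score = None, 0
    for sk in socks:
        sc = score(sk, src, dport)
        if sc > best_score:
            best, best_score = sk, sc
    return best


def select_from_group(socks: List[Sock], sk: Sock, src: Addr) -> Sock:
    """Pick a member of the port-sharing group, ignoring connected state.

    The source-port modulo is an assumed placeholder for the real hash.
    """
    group = [s for s in socks if s.reuseport and s.local_port == sk.local_port]
    return group[src[1] % len(group)]


def lookup_early_return(socks: List[Sock], src: Addr, dport: int) -> Optional[Sock]:
    """Return from inside the loop at the first reuseport socket that scores."""
    best, best_score = None, 0
    for sk in socks:
        sc = score(sk, src, dport)
        if sc > best_score:
            if sk.reuseport:
                return select_from_group(socks, sk, src)
            best, best_score = sk, sc
    return best


# Three connected sockets sharing local port 5000 (documentation addresses).
SOCKS = [
    Sock("to_A", 5000, ("192.0.2.1", 4000), reuseport=True),
    Sock("to_B", 5000, ("192.0.2.2", 4001), reuseport=True),
    Sock("to_C", 5000, ("192.0.2.3", 4002), reuseport=True),
]
```

In this model, a datagram from 192.0.2.1:4000 is delivered to `to_A` by `lookup_full_scan`, while `lookup_early_return` hands it to whichever group member the placeholder hash selects, regardless of which peer that socket is connected to.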
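The fallback mentioned in the comment, a single UDP socket with demultiplexing done in user space, might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the commenter's program: the class name `UdpDemux` and its methods are hypothetical, and flows are keyed simply by the source (address, port) pair reported by `recvfrom()`.

```python
import socket
from collections import defaultdict


class UdpDemux:
    """One unconnected UDP socket; datagrams are sorted by source in user space.

    Hypothetical helper written for this illustration.
    """

    def __init__(self, local_addr):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(local_addr)
        # Maps (host, port) of the sender to the list of payloads received.
        self.flows = defaultdict(list)

    def poll(self, timeout=1.0):
        """Receive one datagram and file it under its source address.

        Returns the source address, or None if nothing arrived in time.
        """
        self.sock.settimeout(timeout)
        try:
            data, src = self.sock.recvfrom(65535)
        except socket.timeout:
            return None
        self.flows[src].append(data)
        return src

    def send(self, dst, payload):
        """Send to one remote endpoint through the shared socket."""
        return self.sock.sendto(payload, dst)

    def close(self):
        self.sock.close()
```

The cost of this design is that the per-peer separation the kernel normally provides for connected sockets has to be reimplemented by the application, which is the rearchitecting the comment objects to.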