Bug 120441

Summary: af_packet no longer uses symmetric hashing
Product: Networking
Component: Other
Reporter: Derek Ditch (derek.ditch)
Assignee: Stephen Hemminger (stephen)
CC: bastienphilbert, brad, eric
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: >4.4
Subsystem:
Regression: No
Bisected commit-id:

Description Derek Ditch 2016-06-16 04:08:02 UTC
I've encountered a bug when using AF_PACKET for network capture with Bro (http://bro.org). After additional investigation, I found this issue tracked in Bro [1] and also in Suricata [2]. These tools (among others, such as netsniff-ng) in some cases need a symmetric hash to ensure that both sides of a network connection (at least for IPv4 and IPv6) are processed by the same thread of a given fanout group. In the case of Bro and Suricata, this allows a single thread to analyze both sides of a connection for potentially malicious traffic. In netsniff-ng, it ensures that the PCAP captured by each process contains both sides of the conversation.
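
To illustrate what "symmetric" means here, the following is a minimal Python sketch, not the kernel's implementation. The function names are made up for illustration, zlib.crc32 stands in for the kernel's hash function, and the modulo stands in for however the kernel maps a hash onto the sockets of a fanout group. The only point is that the two endpoints are put into a fixed order before hashing, so both directions of a connection produce the same value.

```python
import ipaddress
import zlib


def symmetric_flow_hash(src_ip, src_port, dst_ip, dst_port, proto):
    """Hash a flow so that A->B and B->A give the same value.

    Hypothetical helper: crc32 is only a stand-in for the kernel's hash.
    """
    a = (ipaddress.ip_address(src_ip).packed, src_port)
    b = (ipaddress.ip_address(dst_ip).packed, dst_port)
    # Canonical ordering: the "lower" endpoint always comes first,
    # regardless of which side sent the packet.
    lo, hi = sorted((a, b))
    key = (lo[0] + lo[1].to_bytes(2, "big")
           + hi[0] + hi[1].to_bytes(2, "big")
           + bytes([proto]))
    return zlib.crc32(key)


def fanout_index(flow_hash, num_sockets):
    """Pick a fanout group member (assumption: plain modulo for simplicity)."""
    return flow_hash % num_sockets


if __name__ == "__main__":
    fwd = symmetric_flow_hash("10.0.0.1", 40000, "10.0.0.2", 80, 6)
    rev = symmetric_flow_hash("10.0.0.2", 80, "10.0.0.1", 40000, 6)
    # Both directions should select the same worker.
    print(fanout_index(fwd, 4) == fanout_index(rev, 4))
```

With a hash like this, both halves of a TCP connection always select the same fanout member, which is the behavior the tools above rely on.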

Besides the evidence in bro logs that I was looking at, a more direct test is with the current master of netsniff-ng [3] that allows hash-based fanout:

For example, w/ 4 workers:

sudo nohup /usr/local/sbin/netsniff-ng -T 0xa1b2c3d4 --fanout-group 13 --fanout-type hash --mmap --ring-size 256MiB --bind-cpu 0 --silent --in eth1 --out /data/pcap/ --prefix "eth1-0." --interval 60sec  & 
sudo nohup /usr/local/sbin/netsniff-ng -T 0xa1b2c3d4 --fanout-group 13 --fanout-type hash --mmap --ring-size 256MiB --bind-cpu 1 --silent --in eth1 --out /data/pcap/ --prefix "eth1-1." --interval 60sec  & 
sudo nohup /usr/local/sbin/netsniff-ng -T 0xa1b2c3d4 --fanout-group 13 --fanout-type hash --mmap --ring-size 256MiB --bind-cpu 2 --silent --in eth1 --out /data/pcap/ --prefix "eth1-2." --interval 60sec  & 
sudo nohup /usr/local/sbin/netsniff-ng -T 0xa1b2c3d4 --fanout-group 13 --fanout-type hash --mmap --ring-size 256MiB --bind-cpu 3 --silent --in eth1 --out /data/pcap/ --prefix "eth1-3." --interval 60sec  & 


Looking at the resulting PCAP files, for TCP connections one should see all the packets for a given 4-tuple in a single file, but this is not the case.
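
One way to check this mechanically is sketched below in Python. It assumes the 4-tuples have already been extracted from each worker's capture with some other tool; the function names and the worker labels ("eth1-0", "eth1-1") are hypothetical. It reports every connection whose packets show up in more than one worker's output.

```python
from collections import defaultdict


def canonical_flow(src_ip, src_port, dst_ip, dst_port):
    """Return a direction-independent key for a 4-tuple."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def find_split_flows(tuples_by_file):
    """Map each flow seen in more than one capture to the set of captures.

    tuples_by_file: dict of capture label -> iterable of
    (src_ip, src_port, dst_ip, dst_port) tuples.
    """
    files_by_flow = defaultdict(set)
    for label, tuples in tuples_by_file.items():
        for four_tuple in tuples:
            files_by_flow[canonical_flow(*four_tuple)].add(label)
    return {flow: files
            for flow, files in files_by_flow.items()
            if len(files) > 1}


if __name__ == "__main__":
    # Made-up sample data: the reply direction landed on another worker.
    sample = {
        "eth1-0": [("10.0.0.1", 40000, "10.0.0.2", 80)],
        "eth1-1": [("10.0.0.2", 80, "10.0.0.1", 40000),
                   ("10.0.0.3", 5000, "10.0.0.4", 443)],
    }
    for flow, files in find_split_flows(sample).items():
        print(flow, sorted(files))
```

With symmetric hashing in place, the returned dictionary should be empty for TCP traffic.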


[1] https://bro-tracker.atlassian.net/browse/BIT-1575
[2] https://redmine.openinfosecfoundation.org/issues/1777
[3] https://github.com/netsniff-ng/netsniff-ng/
Comment 1 Eric Leblond 2016-06-16 11:59:58 UTC
It seems the problem was introduced by https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=42aecaa9bb2bd57eb8d61b4565cee5d3640863fb, where the symmetric jhash_3words function was replaced by jhash2.
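
As a rough illustration of the symptom (not of the kernel code itself), the Python sketch below compares a direction-sensitive hash with one that orders the endpoints first. zlib.crc32 is only a stand-in for the kernel's jhash variants, the flows are randomly generated, and all names are made up for this example.

```python
import random
import zlib


def directional_hash(src, sport, dst, dport):
    """Hash the fields in packet order, so A->B and B->A generally differ."""
    return zlib.crc32(f"{src}:{sport}>{dst}:{dport}".encode())


def canonical_hash(src, sport, dst, dport):
    """Order the two endpoints first, then hash (direction-independent)."""
    lo, hi = sorted(((src, sport), (dst, dport)))
    return directional_hash(lo[0], lo[1], hi[0], hi[1])


def count_split_flows(hash_fn, flows, workers=4):
    """Count flows whose two directions map to different workers."""
    split = 0
    for src, sport, dst, dport in flows:
        fwd = hash_fn(src, sport, dst, dport) % workers
        rev = hash_fn(dst, dport, src, sport) % workers
        if fwd != rev:
            split += 1
    return split


# Synthetic flows: random 32-bit addresses, client port to server port 80.
rng = random.Random(0)
flows = [(rng.getrandbits(32), rng.randrange(1024, 65536),
          rng.getrandbits(32), 80) for _ in range(1000)]

if __name__ == "__main__":
    print("direction-sensitive:", count_split_flows(directional_hash, flows))
    print("canonical ordering: ", count_split_flows(canonical_hash, flows))
```

With four workers, a direction-sensitive hash would be expected to split a large share of flows across workers, while the canonical variant splits none.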
Comment 2 Eric Leblond 2016-06-16 12:35:54 UTC
This patch was introduced in 4.2.
Comment 3 [account disabled by administrator] 2016-06-21 02:14:15 UTC
Try a newer kernel. It seems someone missed setting up both key values and checking that they were set. Judging from the latest mainline kernel sources, it is probably fixed now.