Bug 4809 - AF_PACKET sockets ignore the SO_TIMESTAMP sockopt
Summary: AF_PACKET sockets ignore the SO_TIMESTAMP sockopt
Status: REJECTED INVALID
Alias: None
Product: Networking
Classification: Unclassified
Component: Other (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: Patrick McHardy
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-06-27 23:58 UTC by James Gregory
Modified: 2009-03-23 10:27 UTC (History)
2 users (show)

See Also:
Kernel Version: 2.6.26
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Patch to disable timestamps when requested (384 bytes, patch)
2005-07-20 08:19 UTC, Patrick McHardy
Details | Diff
Patch to disable timestamps if requested for net-2.6.24 (1.83 KB, patch)
2007-09-13 03:41 UTC, Stephen Hemminger
Details | Diff

Description James Gregory 2005-06-27 23:58:47 UTC
Distribution: debian
Hardware Environment: Pentium IV
Problem Description:

Disabling the SO_TIMESTAMP sockopt on AF_PACKET sockets has no effect; the
packets continue to have timestamps added to them, which means calling
do_gettimeofday for every packet. Oprofile has shown this to be the dominant
cost in using AF_PACKET sockets.

Steps to reproduce:

* Open an AF_PACKET socket.
* Disable the SO_TIMESTAMP sockopt
* Observe the timestamps as you read packets from it.

The follwing patch was against 2.6.8 (the same code is in 2.6.12) and fixes this
for the tpacket_rcv case:

diff -urN kernel-source-2.6.8/net/packet/af_packet.c
kernel-source-2.6.8-new/net/packet/af_packet.c
--- kernel-source-2.6.8/net/packet/af_packet.c  2004-08-14 15:37:25.000000000 +1000
+++ kernel-source-2.6.8-new/net/packet/af_packet.c      2005-06-22
09:37:16.704601489 +1000
@@ -625,7 +625,7 @@
        h->tp_snaplen = snaplen;
        h->tp_mac = macoff;
        h->tp_net = netoff;
-       if (skb->stamp.tv_sec == 0) { 
+       if (sk->sk_rcvtstamp && skb->stamp.tv_sec == 0) { 
                do_gettimeofday(&skb->stamp);
                sock_enable_timestamp(sk);
        }

I'm not sufficiently familiar with the codepath that packet_rcv uses, so I'm not
sure if the same bug exists there, or how to fix it (there's no direct
do_gettimeofday call in that version).
Comment 1 Patrick McHardy 2005-07-20 08:19:35 UTC
Created attachment 5350 [details]
Patch to disable timestamps when requested

Does this patch fix the problem? Please keep in mind that if _any_ socket on
the
system requests timestamps, or you use ip_queue or anything else that requests
timestamps, do_gettimeofday will still be called for all packets.
Comment 2 James Gregory 2005-07-20 18:35:16 UTC
I haven't tested your patch, but my strong suspicion is that it will not fix the
problem. The issue was that the AF_PACKET handler was ignoring the sk_rcvtstamp
value in the sk structure; not that the value itself was being set incorrectly.

In a brief inspection of the packet_recvmsg function in af_packet.c, an
unguarded call to sock_recv_timestamp will inspect the sk->sk_rcvtstamp and then
do a do_gettimeofday if that value is true.

The equivalent code in tpacket_rcv works differently however; rather than
calling sock_recv_timestamp it will stamp the packet itself, without inspecing
the sk->sk_rcvtstamp value. I suppose the real fix is to make tpacket_rcv call
sock_recv_timestamp.

I appreciate that there are many use cases for packet timestamps, however my
profiling showed that the cost of fetching those timestamps is huge (on my test
system, around 80% of the cpu time was spent in get_offset_pmtmr when running
memory-mapped tcpdump to capture packets off the wire. My patch plus disabling
SO_TIMESTAMP on the pcap socket eliminated that time expenditure). If making a
sockopt operate as documented will have the effect of allowing some use cases
that don't need timestamps to avoid incurring the cost of fetching them, I see
that as a win.

What would be really cool is a way to set the accuracy of the timestamps; rarely
do you need the accuracy that gettimeofday provides in networking code (and I
think calling it actually takes longer than the microsecond accuracy it
allegedly provides), but that's probably a separate discussion.

The other thing to note here is that your patch suggests that the code to handle
that sockopt has changed recently. On a 2.6.8 tree that I've got lying around,
the sock_setsockopt function in net/core/sock.c uses the following code to
handle SO_TIMESTAMP:

        case SO_TIMESTAMP:
            sk->sk_rcvtstamp = valbool;
            if (valbool)
                sock_enable_timestamp(sk);
            break;

Which looks perfectly sane to me. Is it possible that a regression has been
introduced here?
Comment 3 Patrick McHardy 2005-07-21 00:54:10 UTC
You're right, there seem to be two seperate bugs. I'll have a closer look
tomorrow.
Comment 4 Adrian Bunk 2006-02-03 15:26:27 UTC
Patrick, what is the status of these issues?
Comment 5 Adrian Bunk 2006-11-09 04:56:51 UTC
Patrick?
Comment 6 Stephen Hemminger 2007-09-06 08:23:20 UTC
A fix is heading to 2.6.24 to reduce the overhead by only stamping packets
after the socket filter has run. Since the timestamp is part of the ABI,
we really can't turn it off easily.
Comment 7 James Gregory 2007-09-09 19:36:27 UTC
Hi Stephen,

Which ABI is it a part of? Having an empty timestamp value surely doesn't affect the ABI? Surely it's equally bad to ignore the will of the socket's creator.

Will the new behaviour you have suggested fix the case I initially pointed out as an example of why this is bad? Will tcpdump operate with more reasonable CPU usage?
Comment 8 Stephen Hemminger 2007-09-13 03:41:42 UTC
Created attachment 12815 [details]
Patch to disable timestamps if requested for net-2.6.24
Comment 9 Alan 2009-03-23 10:27:02 UTC
Closing out old stale bugs

Note You need to log in before you can comment on or make changes to this bug.