Bug 7938 - all UDP packets from localhost have wrong checksum
Summary: all UDP packets from localhost have wrong checksum
Status: CLOSED INVALID
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: i386 Linux
Importance: P2 normal
Assignee: Jeff Garzik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-02-05 05:52 UTC by Toralf Förster
Modified: 2007-07-20 13:13 UTC

See Also:
Kernel Version: 2.6.19
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
config (24.45 KB, text/plain)
2007-02-07 08:18 UTC, Toralf Förster

Description Toralf Förster 2007-02-05 05:52:29 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.18
Distribution:
Hardware Environment:
ThinkPad T41
# lspci | grep Ether
02:01.0 Ethernet controller: Intel Corporation 82540EP Gigabit Ethernet
Controller (Mobile) (rev 03)
02:02.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC
(rev 01)

Software Environment:
Gentoo Linux (stable), e1000 driver for the Gigabit network card

Problem Description:

All UDP packets sent by my local host show a wrong checksum when captured
locally, whereas the received UDP packets are OK.

Steps to reproduce:
initial bug report : http://bugs.gentoo.org/show_bug.cgi?id=164694
Comment 1 Toralf Förster 2007-02-05 07:33:48 UTC
I first noticed this while playing with Wireshark.
I can reproduce it with some kernels between v2.6.18 and v2.6.19 at the command
line with tcpdump:

$>tcpdump udp -A -vv -i eth0 -l -q -p | grep UDP | grep 'bad udp cksum'

Unfortunately the behaviour can only be reproduced within a LAN (100 Mbit
Ethernet) and not at home (4 Mbit DSL), and it is not possible to reproduce the
issue with User Mode Linux :-(

My first attempt to bisect the first bad commit was unsuccessful. I started
bisecting v2.6.18 ... v2.6.19, but the resulting commit gave me a kernel which
panicked instead of booting. I made some attempts like "$>git reset --hard HEAD~30"
but did not get a bootable kernel.

The next attempt was "$> git bisect start drivers/net/", but after some steps I
made a mistake (typed "$> git bisect good" instead of "$> git bisect bad").
I edited the .git/BISECT_LOG file (cut the last 2 lines) and ran "$> git
bisect replay .git/BISECT_LOG", but the result was that the bisected tree was
reset and the file .git/BISECT_LOG was lost :-(
Comment 2 Auke Kok 2007-02-05 12:09:19 UTC
maybe I'm completely off here...

what if the hardware sets the checksum for every packet? that means that tcpdump
can never see the correct checksum before it goes out. AFAIK all e1000 hw sets
the checksum in hardware, so this is kind of expected.

if at all this is a bug, it is that tcpdump doesn't know about hw csum offload
capabilities of the NIC it's tracing.

This "issue" should show up for any NIC that does HW csum offload on transmit.
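
To illustrate what the sniffer is checking, here is a minimal Python sketch of the UDP checksum over the IPv4 pseudo-header. The addresses, ports and payload are made-up example values. The placeholder left in the checksum field when transmit offload is active is assumed here to be the folded pseudo-header sum; that detail is an assumption about the stack, not something stated in this report.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header that is covered by the UDP checksum."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_length))


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum for a UDP segment whose checksum field is currently zero."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    csum = ~ones_complement_sum(data) & 0xFFFF
    return csum or 0xFFFF  # a computed 0 is transmitted as 0xFFFF in UDP


def checksum_is_valid(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Check as a receiver or sniffer would: the total must fold to 0xFFFF."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


def with_checksum(segment: bytes, csum: int) -> bytes:
    """Return the segment with bytes 6..7 (the checksum field) replaced."""
    return segment[:6] + struct.pack("!H", csum) + segment[8:]


# Example values (hypothetical): a 5-byte payload from port 123 to port 123.
SRC, DST = "192.168.0.10", "192.168.0.1"
payload = b"hello"
bare = struct.pack("!HHHH", 123, 123, 8 + len(payload), 0) + payload

# What ends up on the wire once the checksum has been filled in.
on_wire = with_checksum(bare, udp_checksum(SRC, DST, bare))

# What a local capture may see before the NIC fills in the checksum
# (assumption: the field holds only the folded pseudo-header sum).
placeholder = ones_complement_sum(pseudo_header(SRC, DST, len(bare)))
captured_locally = with_checksum(bare, placeholder)

print(checksum_is_valid(SRC, DST, on_wire))
print(checksum_is_valid(SRC, DST, captured_locally))
```

Under these assumptions the on-wire segment verifies and the locally captured one does not, which matches the "bad udp cksum" lines seen only on the sending host.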

Comment 3 Toralf Förster 2007-02-07 08:18:58 UTC
Created attachment 10333 [details]
config

I tracked down the problem now to the module iptable_nat. I reproduced the
issue with kernel 2.6.18 (kernel config attached).

After booting into that kernel the command

$>tcpdump udp -A -vv -i eth0 -l -q -p | grep UDP | grep 'bad udp cksum'

gives a lot of output if ntpd was started and some DNS queries are made. If I
then run

$>modprobe iptable_nat

I get no subsequent bad UDP packets.
Comment 4 Auke Kok 2007-02-07 08:39:34 UTC
Yes, of course that makes the problem go away. DUH

netfilter is too stupid ;) to know that the outgoing interface will do the
checksum offloading for us, and therefore always calculates it when translating
addresses, because it *has* to - it just modified the packet!

You just found a case where we can optimize iptable_nat to _NOT_ recalculate the
csum for us and let the hardware do it.
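
Why rewriting an address forces a checksum fix-up can be sketched in Python. The example below uses the incremental update from RFC 1624, HC' = ~(~HC + ~m + m'), and compares it with a full recomputation. It is an illustration with made-up addresses and payload, not a description of what the iptable_nat code does; the UDP special case of sending a computed 0 as 0xFFFF is left out for brevity.

```python
import socket
import struct


def fold(total: int) -> int:
    """Fold carries of a one's-complement sum back into 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus a segment with a zeroed checksum field."""
    data = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_UDP, len(segment))
            + segment)
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    return ~fold(total) & 0xFFFF


def update_for_new_address(old_csum: int, old_ip: str, new_ip: str) -> int:
    """RFC 1624 incremental update for one rewritten IPv4 address."""
    old_words = struct.unpack("!HH", socket.inet_aton(old_ip))
    new_words = struct.unpack("!HH", socket.inet_aton(new_ip))
    total = ~old_csum & 0xFFFF
    for old, new in zip(old_words, new_words):
        total += (~old & 0xFFFF) + new  # remove the old word, add the new one
    return ~fold(total) & 0xFFFF


# Example values (hypothetical): source address rewritten as NAT would do.
OLD_SRC, NEW_SRC, DST = "192.168.0.10", "10.0.0.1", "192.168.0.1"
payload = b"hello"
segment = struct.pack("!HHHH", 123, 123, 8 + len(payload), 0) + payload

before = full_udp_checksum(OLD_SRC, DST, segment)
incremental = update_for_new_address(before, OLD_SRC, NEW_SRC)
recomputed = full_udp_checksum(NEW_SRC, DST, segment)

print(before != recomputed)       # the old checksum no longer fits
print(incremental == recomputed)  # the incremental update agrees
```

Because the source address is part of the pseudo-header, the old checksum stops matching as soon as the address changes, so something (software or the NIC) has to produce a new one before the packet leaves.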
Comment 5 Toralf Förster 2007-02-09 00:34:48 UTC
OK, with kernel 2.6.20 iptable_nat doesn't seem to calculate the checksum: now
all sniffed UDP packets have a wrong checksum. That is as expected, yes?

If that's the case I'll close this bug.

(BTW, the TCP checksums are not affected, are they?)
Comment 6 Auke Kok 2007-02-09 07:34:49 UTC
The TCP checksum should also be missing, but tcpdump might notice that and not
complain about it.

In any case: if your receiving end sees the right checksums then everything is
working the way it should.

Please close this issue :)
