Bug 8043 - curious communication breakage with e1000 and NBT
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: i386 Linux
Importance: P2 normal
Assignee: Jesse Brandeburg
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-02-20 04:20 UTC by Wolf Wiegand
Modified: 2008-10-30 23:38 UTC

See Also:
Kernel Version: 2.6.18.7
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
tcpdump trace (7.41 KB, text/plain)
2007-02-20 04:22 UTC, Wolf Wiegand
patch to fix tcp zero csum (2.88 KB, patch)
2008-10-06 13:09 UTC, Jesse Brandeburg

Description Wolf Wiegand 2007-02-20 04:20:57 UTC
Most recent kernel where this bug did *NOT* occur: 2.4.32
Distribution: Univention Corporate Server
Hardware Environment: i386, PCI I2O RAID controller 
Software Environment: Linux 2.6.18.7
Problem Description:

I am installing Windows clients using the unattended project. Short description
of this: a PC boots a FreeDOS image using PXE. From within FreeDOS, all Windows
installation files (about 500MB) are copied to the local disk for later
installation. I have encountered a very strange problem using the e1000 network
driver. At some point (the exact point differs, but this can always be
reproduced), communication between the Linux server and the DOS client breaks
down completely. I will attach a tcpdump trace. In the trace you can see that
packets are being sent out to the client, but either they don't arrive at the
client or the reply packets are lost. This leads to errors on the DOS client
that cannot be recovered from.

This error only occurs with a 2.6 kernel and the e1000 driver. Switching to a
2.4.32 kernel or using a different network card (we successfully tested this
with a Realtek 8139) makes the problem go away. It can only be reproduced with
some clients: all machines of the same class either show or don't show
this behaviour. It has been reproduced at a customer site with two more Linux
machines on different networks. All this indicates that the problem is not
caused by faulty hardware and is related to the e1000 driver.

This can be reproduced with distribution kernels 2.6.18.1 and 2.6.14.7 and
vanilla 2.6.18.7. I cannot test this with more recent kernels, as the hardware
available for tests does not boot with 2.6.19/2.6.20 kernels (an issue related
to the I2O RAID controller).

What I've tried so far to work around this issue:

- add the parameters debug=16, RxDescriptors=4096, and Flowcontrol=1 to the e1000 module
- ethtool -K eth0 tso off
- ethtool -K eth0 sg off
- echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

None of this helped.

What further information is needed to track this down?

Steps to reproduce:

I can describe the entire setup needed for reproduction (including the
unattended Windows setup) if this is desired.
Comment 1 Wolf Wiegand 2007-02-20 04:22:36 UTC
Created attachment 10470
tcpdump trace
Comment 2 Auke Kok 2007-02-26 08:48:31 UTC
Might be related to SNAP:


14:00:37.469645 IP (tos 0x0, ttl  30, id 29939, offset 0, flags [none], proto:
TCP (6), length: 95) winxp.olb.test.45080 > master.olb.test.netbios-ssn: P
53998:54053(55) ack 6120349 win 1450
>>> NBT Session Packet
NBT Session Message
Flags=0x0
Length=51 (0x33)
WARNING: Short packet. Try increasing the snap length by 13
Comment 3 Jesse Brandeburg 2007-02-26 16:42:17 UTC
Will you please try the latest standalone driver from http://e1000.sf.net, as it
is much newer than the driver in the kernels you are using?  It should work on
any kernel that is not 2.6.20.

I'm curious whether this has something to do with the CRC stripping changes.

If you get us a more detailed description of how to reproduce I can have our lab
try it.
Comment 4 Wolf Wiegand 2007-02-27 07:01:28 UTC
Hi, thanks for your response. I've rerun the tests with the latest driver, which
made no difference:

# modinfo e1000 | grep version
version:        7.3.20
srcversion:     C7395247572355AB0396F9B

I've run some more tests with tcpdump and ethereal. The following packets are
being sent when the problem occurs (i.e., the file being transferred at that
moment cannot be read later on):

10.200.7.230 == linux server, 10.200.7.231 == FreeDOS client:

336.189987  10.200.7.230  10.200.7.231  NBSS  [TCP Window Full] NBSS
Continuation Message
336.282886  10.200.7.231  10.200.7.230  TCP   39952 > netbios-ssn [ACK]
Seq=1076691 Ack=195971747 Win=1450 Len=0
336.283383  10.200.7.230  10.200.7.231  NBSS  [TCP Window Full] NBSS
Continuation Message
336.506917  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
336.946934  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
337.826998  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
339.587090  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
343.107283  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
350.147649  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
364.028368  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
380.519862  10.200.7.231  10.200.7.230  SMB   Read Raw Request, FID: 0x1a65
380.558958  10.200.7.230  10.200.7.231  TCP   [TCP Window Full] netbios-ssn >
39952 [ACK] Seq=195973197 Ack=1076746 Win=5840 Len=0
385.559216  10.200.7.230  10.200.7.231  ARP   Who has 10.200.7.231?  Tell
10.200.7.230
385.559343  10.200.7.231  10.200.7.230  ARP   10.200.7.231 is at 00:30:05:19:8e:04
392.179830  10.200.7.230  10.200.7.231  NBSS  [TCP Retransmission] NBSS
Continuation Message
392.180049  10.200.7.231  10.200.7.230  TCP   [TCP ZeroWindow] 39952 >
netbios-ssn [ACK] Seq=1076746 Ack=195973197 Win=0 Len=0
392.180091  10.200.7.231  10.200.7.230  TCP   [TCP Window Update] 39952 >
netbios-ssn [ACK] Seq=1076746 Ack=195973197 Win=1450 Len=0
392.180517  10.200.7.230  10.200.7.231  NBSS  NBSS Continuation Message
392.180593  10.200.7.230  10.200.7.231  NBSS  [TCP Window Full] NBSS
Continuation Message
392.181931  10.200.7.231  10.200.7.230  TCP   39952 > netbios-ssn [ACK]
Seq=1076746 Ack=195974647 Win=1450 Len=0
392.182366  10.200.7.230  10.200.7.231  NBSS  NBSS Continuation Message
392.182415  10.200.7.230  10.200.7.231  NBSS  [TCP Window Full] NBSS
Continuation Message

The [TCP Window Full] messages don't seem to be a problem; they also occur when
using the Realtek card.
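
The spacing of the [TCP Retransmission] entries in the trace above can be checked with a short calculation. The following is a minimal Python sketch (the variable and function names are illustrative, not part of any capture tool) that takes the retransmission timestamps from the trace and prints the gap between consecutive ones. Each gap comes out at roughly double the previous one, which is what a sender's exponential retransmission backoff would be expected to produce, so the server appears to be retrying normally while the client stays silent.

```python
# Timestamps (seconds) of the "[TCP Retransmission]" lines in the trace above.
retransmit_times = [
    336.506917, 336.946934, 337.826998, 339.587090,
    343.107283, 350.147649, 364.028368, 392.179830,
]


def gaps(times):
    """Return the interval between each pair of consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def ratios(intervals):
    """Return each interval divided by the one before it."""
    return [b / a for a, b in zip(intervals, intervals[1:])]


intervals = gaps(retransmit_times)
for gap in intervals:
    print(f"{gap:7.3f} s")
print("ratios:", [round(r, 2) for r in ratios(intervals)])
```

This only describes the server's timing; it says nothing about why the client fails to acknowledge the segment.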

Steps to reproduce this:

- Get http://bitz150.bitz.briteline.de/undis3c.img and dd it onto a floppy disc
(PXE boot is also possible), then boot from this disc. It contains FreeDOS.
- Configure a DHCP server to give out an IP address to the client.
- During boot, some errors will probably occur, as the configured share will not
be present. Override the given values with the name of a Samba server and a
share on it.
- When you end up at the command prompt, try to copy a large folder off the
network share onto the local hard drive. The hard drive has to be pre-formatted,
as the disc contains no tools for this.

Unfortunately, the problem only occurs with some clients. At the moment, we can
reproduce this on a client where lspci shows the following:

0000:00:00.0 Host bridge: Intel Corp. 82810E DC-133 GMCH [Graphics Memory
Controller Hub] (rev 03)
0000:00:01.0 VGA compatible controller: Intel Corp. 82810E DC-133 CGC [Chipset
Graphics Controller] (rev 03)
0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev 05)
0000:00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 05)
0000:00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 05)
0000:00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 05)
0000:00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 05)
0000:00:1f.4 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #2) (rev 05)
0000:00:1f.5 Multimedia audio controller: Intel Corp. 82801BA/BAM AC'97 Audio
(rev 05)
0000:01:08.0 Ethernet controller: Intel Corp. 82801BA/BAM/CA/CAM Ethernet
Controller (rev 03)

Regarding the SNAP message: "WARNING: Short packet. Try increasing the snap
length by 13" - This seems to be a tcpdump display issue only; with '-s 180' or
so, it no longer happens.
Comment 5 Auke Kok 2007-02-27 08:20:54 UTC
Try turning off TCP window scaling; you might have a broken router or peer.
Comment 6 Wolf Wiegand 2007-02-28 00:16:02 UTC
We already tried that (it was pretty much the first thing we did), and it did
not help. I just rechecked this by establishing a direct connection between
server and client using a crossover cable; the same problem still shows up.

Concerning the client network driver, the driver on the DOS disk we are using is
a generic 3Com driver (3Com Universal NDIS driver v1.00) which is supposed to
work with any network card. We've replaced it with an e100b driver
(running strings on it reveals "Intel(R) PRO/100 Network Connection Driver v4.57
112304"), which did not help.
Comment 7 Auke Kok 2007-03-03 22:25:41 UTC
This may be a lengthy process, but it might be interesting to try to regress
through our drivers from 7.3.20 back to 7.0.33 or even back to the last version
that's roughly the same as in 2.4.32 (5.4.11) and seeing if the 2.6 kernel works
correctly with any of these drivers.
Comment 8 Olaf Kirch 2007-03-06 00:23:41 UTC
It may help to attach a raw tcpdump (tcpdump -s 512 -w /tmp/somefile) - there
are some peculiarities in the dump.

The pattern looks similar in both cases. However, the dump in attachment #5
is too dumbed down to actually look at sequence numbers etc.

 -	client max window seems to be set at 1450, and Linux server transmits
	using a segment size of half that window - 725.

 -	master retransmits the same packet all over again, and client ignores
	it.

 -	client sends another readraw request

 -	server retransmits old segment plus additional data, and now the
	client groks it.

Looking at the dump from attachment #1:

master -> client: 6064241:6064966(725) ack 53650 win 5840
	This seems to be an old packet.
	The next packet we see comes 0.2 seconds later, but
	notice the huge difference in sequence numbers - the
	send sequence differs by about 56 KB (56108 bytes)!

master -> client: 6120349:6121074(725) ack 53998 win 5840
	this is repeated several times

client -> master: 53998:54053(55) ack 6120349 win 1450
	this is the read request
	The ACK shows that the client hasn't processed any
	of the reply packets sent above.

master -> client: 6121799:6121799(0) ack 54053 win 5840
	empty ACK of read request

master -> client: 6120349:6121074(725) ack 54053 win 5840
	that same old packet again

client -> master: 54053:54053(0) ack 6121799 win 0
	whoops - now it ACKs that old segment, but it
	seems the following segment was already sent
	*and* received.

	Note that the client advertises a zero window here.
	There's a rather funky TCP stack at work here...

client -> master: 54053:54053(0) ack 6121799 win 1450
	Client reopens TCP window. Apparently it needed
	a little break to process these two segments
	in the queue.

master -> client: 6121799:6122524(725) ack 54053 win 5840
	now we go on sending more data

In summary, the TCP exchange here is highly unusual, but it should
continue after this. The fact that it doesn't would mean (to me)
that the client's TCP stack is terminally confused.

Here's a funny theory: for some strange reason, the e1000 driver retransmits
an old packet which should have been purged from the TX ring long ago.
Client's TCP stack says "omigosh" and things go downhill from here.

Having a raw tcpdump (with some packets before and after the hang) would
help to check that theory.
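
The pattern in the walkthrough above can be flagged mechanically. The following is a rough Python sketch, assuming only the master -> client sequence ranges quoted in this comment; the function name and the repeat count of three for the segment that "is repeated several times" are assumptions made for illustration. It marks every data segment whose byte range had already been sent, and confirms that the payload size is half the client's 1450-byte window.

```python
CLIENT_WINDOW = 1450  # window advertised by the client in the trace

# (start_seq, end_seq) of master -> client segments, in the order listed above.
# The second segment is shown three times here; the real repeat count is unknown.
segments = [
    (6064241, 6064966),
    (6120349, 6121074),
    (6120349, 6121074),
    (6120349, 6121074),
    (6121799, 6121799),  # empty ACK of the read request
    (6120349, 6121074),
    (6121799, 6122524),
]


def find_retransmissions(segs):
    """Return indices of data segments whose byte range was already sent."""
    highest_end = None
    repeats = []
    for index, (start, end) in enumerate(segs):
        if end == start:
            continue  # no payload, nothing to retransmit
        if highest_end is not None and end <= highest_end:
            repeats.append(index)
        highest_end = end if highest_end is None else max(highest_end, end)
    return repeats


payload_sizes = {end - start for start, end in segments if end != start}
print("payload sizes:", payload_sizes, "half window:", CLIENT_WINDOW // 2)
print("retransmitted segment indices:", find_retransmissions(segments))
```

With the list above, indices 2, 3 and 5 are reported as repeats. This shows only that the same range went out again; telling an ordinary TCP retransmission from a stale frame re-sent by the driver would still need the raw capture.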
Comment 9 Jesse Brandeburg 2007-04-03 15:23:29 UTC
Wolf, did you have any luck getting a tcpdump -s 512 -w /tmp/dumpfile while
having this problem?
Comment 10 Jesse Brandeburg 2007-05-17 22:04:37 UTC
ping.
Comment 11 Wolf Wiegand 2007-05-22 07:15:23 UTC
Sorry for the delay. I've uploaded the raw tcpdump to
http://bitz150.bitz.briteline.de/tcpdump.out.s512.filtered 
Comment 12 Wolf Wiegand 2007-07-24 07:17:40 UTC
(In reply to comment #7)
> This may be a lengthy process, but it might be interesting to try to regress
> through our drivers from 7.3.20 back to 7.0.33 or even back to the last
> version that's roughly the same as in 2.4.32 (5.4.11) and seeing if the 2.6
> kernel works correctly with any of these drivers.

This problem also occurs with versions 7.0.33 and 5.7.6. Version 5.4.11 wouldn't compile on kernel 2.6.14.7, and I was not able to make the necessary changes to the source code to get it to compile.
Comment 13 Jesse Brandeburg 2008-10-06 12:48:11 UTC
I believe we actually fixed this bug in e1000.  I'm not sure if a kernel patch was pushed to do the same.

The problem is that the e1000 hardware was inserting or misinterpreting an incorrect checksum for packets with 0x0000 checksum.

I'll see if I can dig up the patch, as I assume this is still occurring on current kernels.
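
For background on the 0x0000 case: TCP uses the 16-bit one's-complement Internet checksum (RFC 1071), and for some segments the correct value happens to be exactly 0x0000. The following is a minimal Python sketch, not the driver patch, that computes the checksum and constructs data whose checksum is 0x0000. The function names are hypothetical, and the sketch checksums raw bytes only; a real TCP checksum also covers the pseudo-header and the TCP header.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: complement of the one's-complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def pad_to_zero_checksum(data: bytes) -> bytes:
    """Append one 16-bit word so the checksum of the result is 0x0000."""
    if len(data) % 2:
        data += b"\x00"
    filler = 0xFFFF - ones_complement_sum(data)  # brings the sum to 0xFFFF
    return data + struct.pack("!H", filler)


sample = pad_to_zero_checksum(b"NBSS continuation data")
print(f"checksum of sample: 0x{internet_checksum(sample):04x}")
```

Data built this way could serve as a starting point for a test payload, although hitting 0x0000 on the wire would also require accounting for the addresses, ports and sequence numbers that enter the real checksum.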
Comment 14 Jesse Brandeburg 2008-10-06 13:09:10 UTC
Created attachment 18184
patch to fix tcp zero csum

This patch was only compile-tested, but the change has undergone extensive testing in our out-of-tree drivers.
Comment 15 Jesse Brandeburg 2008-10-06 13:30:26 UTC
Not sure which hardware you had (didn't look, sorry), but the same patch is likely needed for e1000 as well.

Jeff Kirsher will probably post both to netdev soon, hopefully for inclusion in 2.6.28.
