Bug 5644 - NFS v3 TCP 3-way handshake incorrect, iptables blocks access
Summary: NFS v3 TCP 3-way handshake incorrect, iptables blocks access
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Networking
Classification: Unclassified
Component: Netfilter/Iptables
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Jozsef Kadlecsik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-11-23 07:57 UTC by jl-icase
Modified: 2005-11-29 02:54 UTC

See Also:
Kernel Version: 2.6.14
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Keep netfilter from dropping ACK probes for half-open connections (5.37 KB, patch)
2005-11-23 10:04 UTC, Trond Myklebust
Keep netfilter from ignoring all RST if the previous packet was an ACK (1.39 KB, patch)
2005-11-23 10:05 UTC, Trond Myklebust

Description jl-icase 2005-11-23 07:57:40 UTC
Most recent kernel where this bug did not occur:
Distribution: Can't remember, possibly FC2.
Hardware Environment:
Software Environment:
Problem Description:

Steps to reproduce:
1. Boot NFS v3 TCP client running iptables & mount NFS filesystem
2. Do a normal NFS client reboot & try mounting the same filesystem again
3. Experience intermittent failure to read superblock

The cause of this problem is the NFS server's improper response to the SYN
packet sent by the client.  This occurs *after* successful client authorization,
when the client tries to open the connection (i.e. sends a SYN to the server's
nfs port) to read the superblock.  The server (sometimes) responds with a pure
ACK without the SYN bit set.  This is blocked by iptables -- thus, mount fails
with a "could not read superblock" message.

Here is an excerpt from ethereal log:

      3 0.021733    client           SERVER           TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=24095 TSER=0 WS=2
      4 0.021846    SERVER           client           TCP      nfs > 800 [ACK] Seq=9138391 Ack=3580883479 Win=16022 Len=0 TSV=244936050 TSER=1149400
      5 0.021864    client           SERVER           ICMP     Destination unreachable (Host administratively prohibited)

The above problem occurs with a very simple default+ssh iptables configuration.
 Disabling iptables on the client makes the problem go away.  Even with iptables
active, there is no problem when nfsd responds with a proper [SYN,ACK] instead
of just pure ACK (this happens intermittently after the client reboot).

Please fix nfsd so that it reliably responds to SYN packets with proper
[SYN,ACK] packets instead of just ACK packets.  Apparently, nfsd state doesn't
get properly reset on client reboots.  Other people have reported autofs
failures which may be related (e.g. on remounts).
Comment 1 Trond Myklebust 2005-11-23 10:02:49 UTC
Olaf Kirch responds with:

We've seen this previously, and submitted a fix to netfilter which
supposedly went into mainline at some point. It seems to be gone
from 2.6.14 though.

The problem is with conntrack, and filtering on RELATED (I assume
your netfilter config does that).

What happens is that the client reboots, opens a new TCP connection
with the same port as last time (say 800), sends SYN. Server still has
an active TCB for this, and thus replies with an ACK containing
its current sequence numbers. Now the client is supposed to RST the
connection.

Unfortunately, conntrack does not expect a lone ACK in this state
and ignores it. So the client will retransmit the SYN until timeout.
Then it picks a new port, and succeeds (maybe).
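
The exchange described above can be sketched as a toy model. This is only an
illustration under stated assumptions, not kernel code: the names Segment,
StaleServer, client_connect and ack_reaches_client are hypothetical, and the
model covers only SYN retransmits on the same port (it does not model the
later switch to a new port).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    flags: frozenset  # subset of {"SYN", "ACK", "RST"}
    seq: int = 0
    ack: int = 0


class StaleServer:
    """Toy server that still holds a TCB for the rebooted client's old connection."""

    def __init__(self, snd_nxt, rcv_nxt):
        self.has_tcb = True
        self.snd_nxt = snd_nxt
        self.rcv_nxt = rcv_nxt

    def receive(self, seg):
        if "RST" in seg.flags:
            # Accept the reset only if it carries the sequence number we asked for.
            if self.has_tcb and seg.seq == self.rcv_nxt:
                self.has_tcb = False
            return None
        if "SYN" in seg.flags:
            if self.has_tcb:
                # Unexpected SYN on a live connection: reply with a lone ACK
                # carrying the current sequence numbers.
                return Segment(frozenset({"ACK"}), self.snd_nxt, self.rcv_nxt)
            # No TCB left: ordinary handshake reply.
            self.rcv_nxt = seg.seq + 1
            return Segment(frozenset({"SYN", "ACK"}), self.snd_nxt, self.rcv_nxt)
        return None


def client_connect(server, ack_reaches_client, isn=1000, max_syn=3):
    """Return the list of events seen while the client tries to connect."""
    events = []
    for _ in range(max_syn):
        events.append("SYN")
        reply = server.receive(Segment(frozenset({"SYN"}), seq=isn))
        if reply is None:
            continue
        if reply.flags == {"SYN", "ACK"}:
            events.append("SYN-ACK")
            return events
        if not ack_reaches_client:
            # The client's firewall discards the lone ACK, so no RST is sent.
            events.append("ACK dropped")
            continue
        events.append("ACK")
        # The client resets the stale connection using the acknowledged number.
        server.receive(Segment(frozenset({"RST"}), seq=reply.ack))
        events.append("RST")
    return events
```

With the sequence numbers from the capture in the description,
client_connect(StaleServer(9138391, 3580883479), ack_reaches_client=True)
reaches a SYN-ACK on the second SYN, while ack_reaches_client=False only
repeats the SYN and leaves the server's stale TCB in place.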
Comment 2 Trond Myklebust 2005-11-23 10:04:35 UTC
Created attachment 6666
Keep netfilter from dropping ACK probes for half-open connections

From: Jozsef Kadlecsik
Patch-mainline: submitted by Jozsef Kadlecsik
References: SUSE46818

Mounting NFS file systems after a (warm) reboot could take a long time if
firewalling and connection tracking were enabled.

The reason is that NFS clients tend to use the same ports (800 and
counting down). Now on reboot, the server would still have a TCB for an
existing TCP connection client:800 -> server:2049. The client sends a
SYN from port 800 to server:2049, which elicits an ACK from the server.
The firewall on the client drops the ACK because (from its point of
view) the connection is still in half-open state, and it expects to see
a SYNACK.

The client will eventually time out after several minutes.

The following patch corrects this, by accepting ACKs on half open connections
as well.

Acked-By: Olaf Kirch <okir@suse.de>
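
The rule change described in this patch can be pictured with a small toy
function. It is a sketch under assumptions, not the attached patch: the name
reply_verdict and the flag accept_ack_probe are hypothetical, the verdict
strings are simplified, and only the two reply types discussed here are
modelled.

```python
def reply_verdict(flags, accept_ack_probe):
    """Toy verdict for a reply seen while the tracked connection is half-open.

    flags: iterable of TCP flag names set on the reply packet.
    accept_ack_probe: False models the behaviour reported in this bug,
    True models the behaviour the patch description asks for.
    Returns "ACCEPT", "DROP", or None for packets this sketch does not model.
    """
    flags = frozenset(flags)
    if flags == {"SYN", "ACK"}:
        # The reply a half-open connection normally expects.
        return "ACCEPT"
    if flags == {"ACK"}:
        # Lone ACK from a server that still holds a stale TCB.
        return "ACCEPT" if accept_ack_probe else "DROP"
    return None
```

For example, reply_verdict({"ACK"}, accept_ack_probe=False) gives "DROP",
which corresponds to the lone ACK never reaching the client's TCP stack, and
reply_verdict({"ACK"}, accept_ack_probe=True) gives "ACCEPT", which lets the
client answer with an RST.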
Comment 3 Trond Myklebust 2005-11-23 10:05:38 UTC
Created attachment 6667
Keep netfilter from ignoring all RST if the previous packet was an ACK

From: Martin Josefsson
Patch-mainline: 2.6.11-rc1
References: SUSE50484

This is an incremental fix to netfilter-tcp-rst-ack-fix (#46818).

The change was that an RST is ignored if the previous packet was an ACK.
This happens all the time. I know it was intended as a fix for the
SYN - ACK probe - RST sequence but it breaks normal usage. The problem
is that connections that end with RST never get their state changed and
are left in ESTABLISHED state with a large timeout.
The patch below adds a check for
!test_bit(IPS_ASSURED_BIT, &conntrack->status) so your change will only
be active for unassured connections.

Acked-By: Karsten Keil <kkeil@suse.de>
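
The effect of the added check can be illustrated with a toy predicate. This is
a sketch of the logic as described in the text above, not the kernel code: the
function name rst_changes_state and its parameters are hypothetical, with
assured standing in for IPS_ASSURED_BIT being set on the connection.

```python
def rst_changes_state(last_packet_was_ack, assured, guard_with_assured):
    """Return True if an RST is allowed to change the tracked connection's state.

    last_packet_was_ack: the packet seen just before the RST was an ACK.
    assured: the connection is assured (stands in for IPS_ASSURED_BIT).
    guard_with_assured: False models the earlier change (ignore every RST
    that follows an ACK), True models the incremental fix (ignore it only
    for unassured connections).
    """
    ignore_rst = last_packet_was_ack
    if guard_with_assured:
        # Equivalent of adding "!test_bit(IPS_ASSURED_BIT, ...)" to the condition.
        ignore_rst = ignore_rst and not assured
    return not ignore_rst
```

Without the guard, rst_changes_state(True, True, False) is False, so an
assured connection that ends with ACK followed by RST keeps its ESTABLISHED
state. With the guard, rst_changes_state(True, True, True) is True, while the
unassured SYN - ACK probe - RST case, rst_changes_state(True, False, True),
is still ignored as the earlier change intended.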
Comment 4 Trond Myklebust 2005-11-23 15:00:52 UTC
Neil points to the bugzilla report at

   https://bugzilla.novell.com/show_bug.cgi?id=104379

Our conclusion appears to be that this is a netfilter issue. Reassigning the bug
to the networking folks...
Comment 5 jl-icase 2005-11-23 21:30:25 UTC
If netfilter fixes help, fine.  However, to me the current situation seems
fragile from NFS perspective alone.  Normal client reboot ought to (?) close the
TCP connection to the NFS server -- this could also fix the problem.  Is there a
technical reason why normal client shutdown doesn't cause the server to close
the connection?
Comment 6 Jozsef Kadlecsik 2005-11-29 02:54:23 UTC
The updated netfilter patch which fixes the problem has been sent to
the netdev and netfilter-devel lists for review and kernel inclusion.
 
> Normal client reboot ought to (?) close the TCP connection to the NFS server 
> -- this could also fix the problem.

Yes, exactly; however, netfilter blocked the packets required to tear down the
half-open connection. NFS or even the TCP stack cannot work around that, so
it was completely a netfilter issue.
