Bug 6557 - NFS client (10x) performance regression -> all later kernels
Summary: NFS client (10x) performance regression -> all later kernels
Alias: None
Product: File System
Classification: Unclassified
Component: NFS
Hardware: i386 Linux
Importance: P2 normal
Assignee: Trond Myklebust
Depends on:
Reported: 2006-05-15 06:11 UTC by Jakob
Modified: 2006-08-28 05:24 UTC (History)
1 user

See Also:
Kernel Version: 2.6.17-rc4
Tree: Mainline
Regression: ---

Program to reproduce the regression (1.20 KB, text/plain)
2006-05-15 06:13 UTC, Jakob
Patch for 2.6.16 (6.85 KB, patch)
2006-05-15 06:15 UTC, Jakob
Original patch from Trond, works with 2.6.17-rc6 (8.16 KB, patch)
2006-06-06 04:55 UTC, Klaus S. Madsen

Description Jakob 2006-05-15 06:11:01 UTC
Most recent kernel where this bug did not occur:

 Debian GNU/Linux 3.1 (Sarge)

Hardware Environment:
 NFSv3 server on dual opteron
 NFSv3 clients on dual PIII or Opteron or even VMWare

Software Environment:
 Attached nfsbench program which exposes the problem, other than that, 
standard Debian installations. The problem can even be triggered with an old 
Red Hat inside VMWare.

Problem Description:
 Certain I/O patterns run horribly slowly due to a cache bug in the NFS 
client code in later kernels.

 GNU ld (the linker) exposes this problem especially well - link jobs on large 
files run 10-50 times slower on newer kernels than they did on 
older kernels.

 The attached nfsbench program exposes the problem too, by sort-of exercising 
the same I/O pattern as GNU ld.

Steps to reproduce:
1) You need an NFS server and an NFS client.
2) Compile the test program: gcc -o nfsbench nfsbench.c
3) On the NFS client, execute the test program 100 times:
   for i in `seq 1 100`; do ./nfsbench; done

On a good kernel, all runs will complete in roughly a second each. On a bad 
kernel, *most* (but not necessarily all) runs will take roughly one minute to 
complete.

The attached patch seems to fix the problem.
Comment 1 Jakob 2006-05-15 06:13:21 UTC
Created attachment 8115 [details]
Program to reproduce the regression

Just compile and run.
Run the test program 100 times to make sure that you catch bad kernels - a bad
kernel *may* execute one run of the program in about a second, but *most* runs
will take about a minute. Good kernels should finish a run of the program in
about one second.
Comment 2 Jakob 2006-05-15 06:15:27 UTC
Created attachment 8116 [details]
Patch for 2.6.16

Patch from Trond, adapted by ksm@evalesco.com for 2.6.16
Comment 3 Klaus S. Madsen 2006-06-06 04:15:58 UTC
I can still reproduce this problem on Linux 2.6.17-rc6
Comment 4 Klaus S. Madsen 2006-06-06 04:55:46 UTC
Created attachment 8266 [details]
Original patch from Trond, works with 2.6.17-rc6
Comment 5 Trond Myklebust 2006-06-06 08:21:43 UTC
Already queued up for 2.6.18. Will not fix for 2.6.17 since it is not a critical
fix.
Comment 6 Dick Snippe 2006-08-22 15:21:55 UTC
Today I did some testing on NFS related problems in kernels > 2.6.14
I'd like to point out that the patch against 2.6.17(.9) works in our environment.
However, there's a slight twist that might interest you:
It appears that the bad cache behaviour in the NFS client _only_ happens when
the NFS server is very busy / slow.
When testing (with an unpatched 2.6.17 kernel) against an idle server, all is
well. One run of the nfsbench program takes about one second and generates about
1000 rpc calls.
Testing against a busy Linux server, with filesystems exported as "async", all
is also well: 1 second, 1000 RPCs.
But testing against a busy, slow server (filesystems exported as "sync"), all is
not well: 10 seconds, 10000 RPCs.
My conclusion is that caching fails when the NFS server is slow in responding.
This could lead to a downward spiral: NFS server slow -> caching fails -> more
rpc's -> NFS server even slower -> ... -> crunch!
I'm pretty sure I've seen this happening on our production systems running
2.6.15 with mysql instances on NFS shares (although I didn't realize it at the
time). Our response then was to go back to 2.6.14. We're still running 2.6.14 on
those systems.
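
For reference, the sync/async behaviour mentioned above is controlled per
export in /etc/exports on the server. A minimal illustrative fragment - the
paths and client specifications are made up, only the sync/async options
matter here:

```
# /etc/exports on the server (illustrative paths and clients)
/export/data     *(rw,sync)    # server replies only after data reaches stable storage
/export/scratch  *(rw,async)   # server may reply before commit; faster, but unsafe on a crash
```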
Comment 7 Trond Myklebust 2006-08-22 16:39:55 UTC
UDP mount? If so, then that is quite expected. Don't use UDP in situations where
you have frequent congestion issues.

TCP mounts can also see poor performance if you have too few nfsd threads, since
the Linux NFS server limits the number of allowed TCP connections. The
limit is proportional to the number of threads.
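
To make both points above concrete, a hedged example - the server name and
export path are placeholders, and these commands obviously need a real NFS
server and client to try:

```
# On the client: mount over TCP rather than UDP (NFSv3)
mount -t nfs -o proto=tcp,vers=3 server:/export /mnt/nfs

# On the server: raise the number of nfsd threads (here to 32),
# which also raises the TCP connection limit mentioned above
rpc.nfsd 32
```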
Comment 8 Trond Myklebust 2006-08-28 05:24:18 UTC
The patch was merged into 2.6.18-rc1, so I'm closing the bug.
