Bug 204939

Summary: 15 MB default readahead is extremely large
Product: File System
Reporter: Alkis Georgopoulos (alkisg)
Component: NFS
Assignee: Trond Myklebust (trondmy)
Status: NEW
Severity: high
CC: lists
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.2
Subsystem:
Regression: No
Bisected commit-id:

Description Alkis Georgopoulos 2019-09-21 10:56:11 UTC
The current defaults cause 15 MB readaheads for NFS, which are extremely large and result in excessive traffic and lag. To reproduce the issue:

Mount an NFS dir using the defaults:
# mount -t nfs localhost:/srv/ltsp /mnt

See the defaults (i.e. rsize=1M):
# grep /mnt /proc/mounts
localhost:/srv/ltsp /mnt nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1 0 0

Find the DEV number:
# cat /proc/fs/nfsfs/volumes
NV SERVER   PORT DEV          FSID                              FSC
v4 7f000001  801 0:67         3f3a98a25a94496f:bd99243e3e0529bd no 
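
The DEV number can also be looked up per mount point. The following is a minimal sketch, assuming that the major:minor field of /proc/self/mountinfo matches the bdi directory name, as it does for the 0:67 example here. `nfs_bdi_dev` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: print the major:minor device number for a mount point.
# Usage: nfs_bdi_dev MOUNTPOINT [MOUNTINFO_FILE]
# MOUNTINFO_FILE defaults to /proc/self/mountinfo, where field 3 is
# major:minor and field 5 is the mount point.
nfs_bdi_dev() {
    # Keep the last match so that the most recent mount on that path wins;
    # exit non-zero when the mount point is not listed at all.
    awk -v mp="$1" '$5 == mp { dev = $3 }
                    END { if (dev != "") print dev; else exit 1 }' \
        "${2:-/proc/self/mountinfo}"
}
```

With this helper, `cat /sys/devices/virtual/bdi/$(nfs_bdi_dev /mnt)/read_ahead_kb` would read the value for the /mnt mount directly.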

See the resulting read_ahead_kb kernel variable:
# cat /sys/devices/virtual/bdi/0:67/read_ahead_kb
15360

I.e. the kernel will try to read ahead up to 15 MB for each opened file, resulting in excessive traffic and lags.
For example, netbooting an ext4-over-NFS client with the default generates 1160 MB of traffic, while with `echo 4 > /sys/devices/virtual/bdi/0:67/read_ahead_kb` (on the client) it generates only 221 MB, and the client boots a lot faster and without lag.
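
The workaround above can be wrapped in a small function. This is only a sketch: `set_read_ahead_kb` is a hypothetical name, and the optional third argument exists purely so that the function can be tried against a scratch directory instead of the real sysfs tree:

```shell
#!/bin/sh
# Hypothetical helper: write a read_ahead_kb value for one backing device.
# Usage: set_read_ahead_kb DEV KB [BDI_ROOT]
# BDI_ROOT defaults to /sys/devices/virtual/bdi; writing there needs root.
set_read_ahead_kb() {
    target="${3:-/sys/devices/virtual/bdi}/$1/read_ahead_kb"
    # Refuse to create a stray file if the device directory does not exist.
    [ -e "$target" ] || { echo "no such bdi: $1" >&2; return 1; }
    printf '%s\n' "$2" > "$target"
}
```

On the client, `set_read_ahead_kb 0:67 4` would be equivalent to the `echo 4 > ...` command above. Assuming the value is reinitialized at mount time, it would need to be reapplied after each mount.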

Non-NFS devices default to a read_ahead_kb of 128 (KB):
# cat /sys/devices/virtual/bdi/*/read_ahead_kb
128

Initially, NFS_MAX_READAHEAD=15 was used along with rsize=512 to *lower* read_ahead_kb to 7 (15 × 512 bytes = 7.5 KB, truncated) for NFS, as 128 KB is too big for NFS.
Now, with rsize=1M, the same multiplier causes readaheads of 15 MB, which are clearly too large.
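
The arithmetic behind both numbers can be sketched as follows. This is a simplification that follows the reasoning in this report (15 times rsize); the kernel tracks readahead in pages, so actual values for small rsize may differ. `nfs_readahead_kb` is a hypothetical name used only for illustration:

```shell
#!/bin/sh
# Readahead in KB as described above: NFS_MAX_READAHEAD (15) times rsize,
# converted from bytes to KB. Integer division truncates 7.5 to 7.
nfs_readahead_kb() {
    echo $(( 15 * $1 / 1024 ))
}

nfs_readahead_kb 512       # prints 7
nfs_readahead_kb 1048576   # prints 15360
```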

Thank you,
Alkis Georgopoulos