Bug 201939 - There is no 'no space' message when using xfs with nfs.
Summary: There is no 'no space' message when using xfs with nfs.
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: File System
Classification: Unclassified
Component: XFS
Hardware: All Linux
Importance: P1 high
Assignee: FileSystem/XFS Default Virtual Assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-09 12:52 UTC by gbkwon
Modified: 2018-12-11 02:32 UTC

See Also:
Kernel Version: 3.4.113
Subsystem:
Regression: No
Bisected commit-id:


Description gbkwon 2018-12-09 12:52:29 UTC
Hi all,

I am using an XFS file system exported via NFS on two Linux servers.

Even when the XFS file system is full, the NFS client does not get a 'no space' error.

The server's load average goes up and kworker threads consume CPU.

The NFS service on the server stops responding.

When ext3 is full in the same environment, I get a 'no space' message.

Which part of nfs and xfs should be checked?
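
One way to narrow this down is to check on the client which system call, if any, reports the 'no space' error. The following is a minimal sketch in Python, not a verified reproducer for this bug. The function name first_enospc_stage and the test path in the usage comment are assumptions made for illustration. Because NFS clients may buffer writes, the error can surface at fsync() or close() rather than at write(), so the sketch records the stage.

```python
import errno
import os


def first_enospc_stage(path, chunk=b"\0" * 32768, max_chunks=1024):
    """Write to `path` and report which call first fails with ENOSPC.

    Returns "open", "write", "fsync" or "close", or None if no call
    reported ENOSPC. Any other OSError is re-raised unchanged.
    """
    stage = "open"
    fd = None
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        stage = "write"
        for _ in range(max_chunks):
            os.write(fd, chunk)
        stage = "fsync"
        os.fsync(fd)
        stage = "close"
        closing, fd = fd, None  # never close the same descriptor twice
        os.close(closing)
        return None
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return stage
        raise
    finally:
        if fd is not None:
            try:
                os.close(fd)
            except OSError:
                pass


# Hypothetical usage on the client (the file name is an assumption):
#   print(first_enospc_stage("/mnt/2/enospc-probe.bin"))
```

Running the same probe against the ext3 export and the XFS export would show whether the error is missing entirely or only reported at a later stage. On Linux, pointing it at /dev/full should return "write", which is a quick way to check the probe itself.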


Below is the environment I am using.


kernel : nfs server 3.4.113 vanilla, nfs client 2.6.32-573.el6.x86_64

nfs : nfs v3 with tcp


==================================================================
nfs server
================================================================== 
/dev/mapper/lv2  500G  500G   20K 100% /lv2

/lv2    0.0.0.0/0.0.0.0(rw,async,wdelay,nohide,nocrossmnt,insecure,no_root_squash,no_all_squash,no_subtree_check,insecure_locks,no_acl,fsid=1543743056,anonuid=65534,anongid=65534)


================================================================== 
nfs client
================================================================== 
10.0.0.20:/lv2   500G  500G   32K 100% /mnt/2 

nfsstat  -m
/mnt/2 from 10.0.0.20:/lv2
 Flags: rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=udp,timeo=11,retrans=3,sec=sys,mountaddr=10.0.0.20,mountvers=3,mountport=2047,mountproto=udp,local_lock=none,addr=10.0.0.20
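
The Flags line above can be split into a dictionary so that the effective client options are easy to compare against the intended ones. This is a minimal sketch; parse_mount_flags is a hypothetical helper written for this illustration, not part of nfs-utils.

```python
def parse_mount_flags(flags):
    """Split a comma-separated NFS mount flag string into a dict.

    Options of the form key=value map to the string value;
    bare options such as "rw" or "hard" map to True.
    """
    opts = {}
    for item in flags.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else True
    return opts


# Flags copied from the nfsstat -m output above.
FLAGS = ("rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,"
         "proto=udp,timeo=11,retrans=3,sec=sys,mountaddr=10.0.0.20,"
         "mountvers=3,mountport=2047,mountproto=udp,local_lock=none,"
         "addr=10.0.0.20")

opts = parse_mount_flags(FLAGS)
print(opts["vers"], opts["proto"], "noac" in opts)  # prints: 3 udp False
```

Note that the parsed flags show proto=udp, while the environment summary above says NFS v3 with TCP, so it may be worth confirming which transport is actually in use.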
Comment 1 gbkwon 2018-12-09 12:55:40 UTC
Please see the log below.


top - 21:54:10 up 58 min,  0 users,  load average: 8.50, 8.37, 8.83
Tasks: 347 total,   2 running, 345 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  4.2%sy,  0.0%ni, 95.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12368 root      20   0     0    0    0 S   12  0.0   4:32.44 kworker/2:2
11092 root      20   0     0    0    0 S   10  0.0   4:42.57 kworker/3:1
11166 root      20   0     0    0    0 R    9  0.0   2:09.47 kworker/5:1
12473 root      20   0     0    0    0 S    8  0.0   2:09.36 kworker/4:2
 2944 root      20   0     0    0    0 D    6  0.0   2:09.10 nfsd
 2953 root      20   0     0    0    0 D    6  0.0   2:24.65 nfsd
 3024 root      20   0     0    0    0 D    6  0.0   2:23.12 nfsd
Comment 2 Christian Kujau 2018-12-10 23:19:46 UTC
> kernel : nfs server 3.4.113 vanilla, nfs client 2.6.32-573.el6.x86_64

Both kernel versions have been EOL for a long time (or is the server maybe on 4.4.113 instead?). Please either try to reproduce with a recent kernel version or contact the vendor about this issue.

That being said, there have been some problems with NFS & ENOSPC in the (very distant) past:

> NFS behaviour when filesystem is 100% full
> https://technicalprose.blogspot.com/2013/11/nfs-behaviour-when-filesystem-is-100.html

> NFS corruption on ENOSPC
> https://www.redhat.com/archives/linux-lvm/2010-December/msg00030.html
Comment 3 gbkwon 2018-12-11 02:05:59 UTC
Thanks for your reply.


The same problem does not occur in 3.18 kernels.

Unfortunately, I have to use the 3.4 kernel.

If I add 'noac' to the NFS mount options, the problem does not occur.

However, the 'noac' option causes a performance penalty of more than 50%.

I want to fix the problem without adding the 'noac' option.

If I add 'actimeo=0' to the NFS mount options instead, the problem still happens, but it disappears after about 5 minutes.
Comment 4 Eric Sandeen 2018-12-11 02:32:17 UTC
(In reply to gbkwon from comment #3)
> Thanks for your reply.
> 
> The same problem does not occur in 3.18 kernels.
> 
> Unfortunately, I have to use the 3.4 kernel.

So, it seems you have reported a bug in a 6-year-old kernel which was already fixed 4 years ago. Which means that it's not a bug now.

This is an upstream bug tracker, not a user support forum.  You may wish to ask your distribution for further support of the old kernel, or possibly try a user mailing list.  You could also spot-check kernels in between, or do a full bisect, to look for the commit(s) which fixed the old bug.
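
The spot-check approach amounts to a binary search over an ordered list of kernel versions for the first one where the problem is gone. Below is a minimal sketch of that search in Python, assuming a single broken-to-fixed transition between a known-broken and a known-fixed version. The function first_fixed, the candidate list, and the placeholder predicate are all assumptions for illustration; only 3.4 (broken) and 3.18 (fixed) come from this report, and a real predicate would boot each kernel and rerun the reproduction.

```python
def first_fixed(versions, is_fixed):
    """Return the first entry of `versions` for which is_fixed() is True.

    Assumes `versions` is ordered oldest to newest, versions[0] is known
    broken, versions[-1] is known fixed, and there is one transition.
    """
    if len(versions) < 2:
        raise ValueError("need at least one broken and one fixed version")
    lo, hi = 0, len(versions) - 1  # lo: known broken, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Candidate test points between the two kernels named in this report.
# The intermediate versions are arbitrary choices, not known results.
CANDIDATES = ["3.4", "3.6", "3.8", "3.10", "3.12", "3.14", "3.16", "3.18"]


def placeholder_is_fixed(version):
    # PLACEHOLDER ONLY: pretends the fix landed at an arbitrary point so
    # the search can be demonstrated. This is NOT a real test result.
    return CANDIDATES.index(version) >= 4


print(first_fixed(CANDIDATES, placeholder_is_fixed))  # demo output only
```

With eight candidates the search needs three kernel builds. For a commit-level search, git bisect supports the same inverted question (finding a fix rather than a regression) through custom terms such as --term-old=broken and --term-new=fixed.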

But because this issue has apparently already been resolved in upstream kernels, I'm closing this bug.

Thanks,
-Eric
