Bug 104911

Summary: caching issue
Product: File System
Component: VFS
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.1.6
Tree: Mainline
Regression: No
Reporter: Leandro Awa (lawa)
Assignee: fs_vfs
CC: szg00000, trondmy
Attachments:
    log file from 'git bisect' testing
    [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

Description Leandro Awa 2015-09-23 22:39:23 UTC
We appear to be running into 'caching issues' with NFS in this version of the kernel. We recently switched to this version to get around a problem we had encountered with NFS in version 3.18.9.

After switching to version 4.1.6, our parallelized and distributed workflows now fail consistently with errors of the form:

T34: ./regex.c:39:22: error: config.h: No such file or directory

The error message above is from a relatively simple test case (compared to our regular workflow): a parallelized and distributed build of binutils 2.25.1 using lsmake (a proprietary make utility from IBM/LSF). The test case runs in parallel on 2 hosts. In the failures, "config.h" is almost always created on host A, with the failure happening on host B.
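
One way to observe this symptom outside of a full build is to poll for the file from host B after host A has created it. The following is a minimal sketch of such a probe, not something used in the testing reported here; the function name probe_visibility() and its parameters are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical probe (not part of the original report): poll `path` with
 * stat() up to `max_tries` times, sleeping `delay_ms` between tries.
 * Returns the zero-based try on which the file became visible,
 * -1 if every try failed with ENOENT, or -2 on any other stat() error.
 */
int probe_visibility(const char *path, int max_tries, long delay_ms)
{
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };

    for (int attempt = 0; attempt < max_tries; attempt++) {
        struct stat st;

        if (stat(path, &st) == 0)
            return attempt;          /* the name resolved on this try */
        if (errno != ENOENT)
            return -2;               /* unrelated failure, stop probing */
        if (attempt + 1 < max_tries)
            nanosleep(&delay, NULL); /* wait before asking again */
    }
    return -1;                       /* still "No such file or directory" */
}
```

Run on host B against the path of "config.h" on the NFS mount, a return value of -1 long after host A has created the file would match the failure described above.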

We have already tried mounting the filesystem used for the test case with progressively lower values of acregmin/acregmax/acdirmin/acdirmax, and even with lookupcache=none. None of these helped.

We have also run simpler tests using the nfstest_cache utility from http://wiki.linux-nfs.org/wiki/index.php/NFStest. The results suggest that NFS caching is behaving normally.

Could you help shed some light on this issue?
Thank you.

Leandro Awa
Comment 1 Leandro Awa 2015-10-07 16:18:21 UTC
Created attachment 189641
log file from 'git bisect' testing
Comment 2 Leandro Awa 2015-10-07 16:20:14 UTC
FYI: our 'git bisect' testing points to the following commit as the
likely cause of the behavior we've been seeing:

commit 766c4cbfacd8634d7580bac6a1b8456e63de3e84
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Thu May 7 19:24:57 2015 -0400

    namei: d_is_negative() should be checked before ->d_seq validation

    Fetching ->d_inode, verifying ->d_seq and finding d_is_negative() to
    be true does *not* mean that inode we'd fetched had been NULL - that
    holds only while ->d_seq is still unchanged.

    Shift d_is_negative() checks into lookup_fast() prior to ->d_seq

    Reported-by: Steven Rostedt <rostedt@goodmis.org>
    Tested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Leandro Awa
Comment 3 Trond Myklebust 2015-10-07 18:47:39 UTC
Yes, that looks bad. The negative lookup code in lookup_fast() is circumventing the revalidation of the dentry.
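
To illustrate the ordering problem, here is a minimal user-space sketch in C. It is a toy model, not kernel code: the toy_* names, the struct, and the result codes are all hypothetical, and the RCU-walk and ->d_seq handling of the real lookup_fast() are not modelled. It only shows why testing for a negative entry before revalidation lets a stale "does not exist" answer through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cached name entry; NOT the kernel's struct dentry. */
struct toy_dentry {
    bool cached_negative;   /* the cache believes the name does not exist */
    bool exists_on_server;  /* what a revalidation round trip would find */
};

enum toy_result { TOY_FOUND, TOY_ENOENT, TOY_NEED_SLOW_LOOKUP };

/* Hypothetical revalidation: false means the cached entry is stale. */
bool toy_revalidate(const struct toy_dentry *d)
{
    bool server_negative = !d->exists_on_server;
    return d->cached_negative == server_negative;
}

/* Problematic ordering: the negative check short-circuits revalidation. */
enum toy_result lookup_negative_first(const struct toy_dentry *d)
{
    assert(d != NULL);
    if (d->cached_negative)
        return TOY_ENOENT;            /* a stale negative entry is trusted */
    if (!toy_revalidate(d))
        return TOY_NEED_SLOW_LOOKUP;
    return TOY_FOUND;
}

/* Revalidate first, then look at whether the entry is negative. */
enum toy_result lookup_revalidate_first(const struct toy_dentry *d)
{
    assert(d != NULL);
    if (!toy_revalidate(d))
        return TOY_NEED_SLOW_LOOKUP;  /* stale entry: ask the server again */
    if (d->cached_negative)
        return TOY_ENOENT;
    return TOY_FOUND;
}
```

For an entry that is cached as negative while the file now exists on the server, lookup_negative_first() returns TOY_ENOENT, whereas lookup_revalidate_first() returns TOY_NEED_SLOW_LOOKUP. The first outcome corresponds to the "No such file or directory" failures seen on host B.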

Reassigning to the VFS maintainer.
Comment 4 Trond Myklebust 2015-10-08 12:58:42 UTC
Created attachment 189751
[PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

Please could you check whether or not the attached patch helps?
Comment 5 Leandro Awa 2015-10-09 00:05:06 UTC
FYI: the patch definitely helped. I just completed a test run and it passed.
Thank you.

Leandro Awa