Bug 219278

Summary: >=linux-6.6.0: Resource temporarily unavailable when reading file attributes the first time
Product: File System
Reporter: Jan (linux)
Component: NFSD
Assignee: Filesystem/NFSD virtual assignee (filesystem_nfsd)
Status: NEW
Severity: normal
CC: linmaxi, linux
Priority: P3
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Jan 2024-09-14 10:48:54 UTC
Description
-----------
When a file is written via nfs, calling lsattr on it on the server side fails with:
Resource temporarily unavailable While reading flags from <filename>

This only happens the first time.
The error first appeared in kernel version 6.6.0.


I bisected the issue to this commit:

commit 1d3dd1d56ce8322fb5b2a143ec9ff38c703bfeda
Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Thu Jun 29 18:52:40 2023 -0700

    NFSD: Enable write delegation support
    
    This patch grants write delegations for OPEN with NFS4_SHARE_ACCESS_WRITE
    if there is no conflict with other OPENs.
    
    Write delegation conflicts with another OPEN, REMOVE, RENAME and SETATTR
    are handled the same as read delegation using notify_change,
    try_break_deleg.
    
    The NFSv4.0 protocol does not enable a server to determine that a
    conflicting GETATTR originated from the client holding the
    delegation versus coming from some other client. With NFSv4.1 and
    later, the SEQUENCE operation that begins each COMPOUND contains a
    client ID, so delegation recall can be safely squelched in this case.
    
    With NFSv4.0, however, the server must recall or send a CB_GETATTR
    (per RFC 7530 Section 16.7.5) even when the GETATTR originates from
    the client holding that delegation.
    
    An NFSv4.0 client can trigger a pathological situation if it always
    sends a DELEGRETURN preceded by a conflicting GETATTR in the same
    COMPOUND. COMPOUND execution will always stop at the GETATTR and the
    DELEGRETURN will never get executed. The server eventually revokes
    the delegation, which can result in loss of open or lock state.
    
    Tracepoint added to track whether read or write delegation is granted.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

 fs/nfsd/nfs4state.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------
 fs/nfsd/trace.h     |  1 +
 2 files changed, 78 insertions(+), 20 deletions(-)


Steps to reproduce
------------------
# mkdir -p /mnt/test/{shared,local}
# cd /mnt/test/local
# python test.py; lsattr -l testfile; lsattr -l testfile;

/etc/exports
----------------------------------------------------------------------
/mnt/test/local 127.0.0.1(rw,no_subtree_check,crossmnt,no_root_squash)
----------------------------------------------------------------------

/etc/fstab
---------------------------------------------------------------------
localhost:/mnt/test/local   /mnt/test/shared  nfs   rw,noatime   0 0
---------------------------------------------------------------------

/mnt/test/local/test.py
--------------------------------------------------------------------------
import os
fd = os.open("/mnt/test/shared/testfile", os.O_CREAT | os.O_EXCL | os.O_RDWR)
os.close(fd)
--------------------------------------------------------------------------
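
The steps above can also be sketched as a single Python helper that creates the file through the NFS mount and then runs lsattr twice against the server-side path. This is only a sketch: the function name `reproduce` and its `probe` parameter are made up for illustration, it assumes `lsattr` is in PATH, and it assumes the export and mount from this report are already in place.

```python
import os
import subprocess


def reproduce(shared_path, local_path, probe=("lsattr", "-l")):
    """Create a file via the NFS mount, then probe the server-side path twice.

    Returns a list of two (exit_code, stderr_text) tuples, one per probe run.
    `probe` is a hypothetical parameter so another command can be swapped in.
    """
    # Same flags as test.py: fails if the file already exists.
    fd = os.open(shared_path, os.O_CREAT | os.O_EXCL | os.O_RDWR)
    os.close(fd)

    results = []
    for _ in range(2):
        run = subprocess.run(
            [*probe, local_path], capture_output=True, text=True
        )
        results.append((run.returncode, run.stderr.strip()))
    return results
```

With the paths from this report it would be called as `reproduce("/mnt/test/shared/testfile", "/mnt/test/local/testfile")`. On an affected kernel the stderr text of the first run is expected to contain the error shown under "Actual result", while the second run should be clean.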


Actual result
-------------
lsattr: Resource temporarily unavailable While reading flags from testfile
testfile    Extents


Expected result
---------------
testfile    Extents
testfile    Extents


Background
----------
I am using certbot to obtain Let's Encrypt certificates. The certbot service runs on a different host than the web server. Certbot writes the challenge files to the web server over NFS, using code very similar to the provided test.py. When Let's Encrypt tries to validate the challenge, the operation fails because the web server gets 'Resource temporarily unavailable' when it tries to read the file.
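
Since the error only shows up on the first access, a server-side reader written in Python could retry on EAGAIN. The following is a minimal sketch of that idea, not a fix for the underlying problem; the helper name `retry_on_eagain` and its default values are assumptions, and a web server would need its own equivalent.

```python
import errno
import time


def retry_on_eagain(operation, attempts=3, delay=0.1):
    """Call operation() and retry while it fails with EAGAIN.

    Any other OSError, or EAGAIN on the last attempt, is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:
            # EAGAIN is "Resource temporarily unavailable".
            if exc.errno != errno.EAGAIN or attempt == attempts:
                raise
            time.sleep(delay)
```

Here `operation` is any callable that performs the read. Errors other than EAGAIN are passed through unchanged so that real failures are not hidden.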
Comment 1 Max 2024-09-15 11:44:23 UTC
I have successfully reproduced the bug and will try to fix it.
Comment 2 Jan 2024-10-15 08:09:30 UTC
A workaround is to force NFS version 4.0 in fstab, like this:

/etc/fstab
---------------------------------------------------------------------
localhost:/mnt/test/local   /mnt/test/shared  nfs   rw,noatime,vers=4.0   0 0
---------------------------------------------------------------------
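
To check that the mount really uses the forced version, the mount options can be read back from /proc/mounts. The sketch below assumes the negotiated version is listed there as a `vers=` option; the function name `nfs_version` is made up for illustration.

```python
def nfs_version(mounts_text, mount_point):
    """Return the vers= value of the NFS mount at mount_point, or None.

    mounts_text is expected to be in /proc/mounts format:
    device, mount point, fs type, options, dump, pass.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        if not fields[2].startswith("nfs"):
            continue
        for option in fields[3].split(","):
            if option.startswith("vers="):
                return option[len("vers="):]
    return None
```

For the setup in this report it would be called as `nfs_version(open("/proc/mounts").read(), "/mnt/test/shared")`, which should give "4.0" once the workaround is active.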