When calling fanotify_mark on an NFS mount exported with root_squash, fanotify_mark fails because root doesn't have permission to access the mount point. The workaround for this is trivial: mark the file system in a separate process with privileges dropped to match the owner of the mounted fs, as returned by stat.

However, once you receive an fanotify event from this mark, the file descriptor returned is not readable/seekable, since NFS verifies the permissions of the accessing process and the system call returns EIO. Again, a trivial workaround would be to fork, drop privileges to the appropriate user and perform the IO operations in a separate process. In a multi-threaded environment, however, this becomes dangerous and creates memory allocation deadlock scenarios unless the forked process performs an exec. Constantly forking and exec'ing in order to perform seeks and reads on the file descriptor creates extra processing overhead and degrades performance. The alternative is to manage IO processing daemons that are started when an event is triggered and that use sockets to communicate with the parent process. This adds a lot of unnecessary complexity to handling IO.

To reproduce this issue as root, create a directory and a file:

  mkdir /tmp/nfs_export
  date > /tmp/nfs_export/example
  chown -R nobody:nogroup /tmp/nfs_export

Add this to /etc/exports:

  /tmp/nfs_export *(rw,async,root_squash)

Reload the NFS kernel server:

  /etc/init.d/nfs-kernel-server reload

Create a mount point and mount the file system:

  mkdir /tmp/nfs_mount
  mount localhost:/tmp/nfs_export /tmp/nfs_mount

Attempt to access the file:

  cat /tmp/nfs_mount/example

This should fail with an EACCES error.

Now write or use a simple fanotify program to mark the /tmp/nfs_mount mount point and handle the event for an access attempt on /tmp/nfs_mount/example. You will not be able to read from the file descriptor provided by the fanotify event; the read results in an EIO error.
This is applicable to NFS protocol version 3.
For NFSv4, the issue is somewhat worse. You can drop privileges to mark the mount point with fanotify_mark, but when poll signals that an fanotify event is available on the fanotify file descriptor, reading from the fanotify file descriptor fails with EACCES for events relating to files that are not accessible by a squashed root user. With NFSv3, you can at least read the events from the fanotify file descriptor; you just can't read from the event's associated file descriptor.
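To tell the two failure modes apart in a handler, something along these lines may help. It is a sketch under assumptions: drain_once is a made-up name, the one-byte probe read stands in for whatever the real scanner does, and no response is sent, so for permission events a fanotify_response would still have to be written before the accessing process is unblocked.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Read one batch of events from fan_fd and probe each event's fd.
 * *total    receives the number of events parsed,
 * *readable receives how many event fds a read() succeeded on.
 * Returns 0 on success, -1 if the fanotify fd itself could not be read
 * (the NFSv4 symptom) or the metadata version does not match. */
int drain_once(int fan_fd, size_t *total, size_t *readable)
{
    struct fanotify_event_metadata buf[64]; /* struct array keeps the buffer aligned */
    struct fanotify_event_metadata *ev = buf;
    ssize_t len;
    char byte;

    assert(total != NULL && readable != NULL);
    *total = 0;
    *readable = 0;

    len = read(fan_fd, buf, sizeof buf);
    if (len < 0) {
        perror("read(fanotify fd)");
        return -1;
    }

    while (FAN_EVENT_OK(ev, len)) {
        if (ev->vers != FANOTIFY_METADATA_VERSION)
            return -1;
        (*total)++;
        if (ev->fd >= 0) {                   /* FAN_NOFD (-1) carries no file */
            if (read(ev->fd, &byte, 1) >= 0)
                (*readable)++;
            else
                perror("read(event fd)");    /* the NFSv3 symptom (EIO) */
            close(ev->fd);
        }
        ev = FAN_EVENT_NEXT(ev, len);
    }
    return 0;
}
```

On NFSv3 as described above one would expect total to be non-zero and readable to stay at zero; on NFSv4 the function would return -1 before any event is parsed.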
Assigning this to the fanotify maintainers to decide if there is a bug here. As far as the NFS filesystem is concerned, it is performing as per design, and there is no bug.
I'm very confused. You are chowning /tmp/nfs_export to nobody:nogroup, but NFS is going to squash to nfsnobody. So your client doesn't have read permission on the directory where you are trying to add a mark. I just set up something similar, but chowned the server backing filesystem to the right uid, and everything worked....
I forgot to mention changing the mode to 0700 on the directory and file, or 0600 on the file to be strictly correct. When you say the right uid, what uid are you referring to? nobody:nogroup? I will set up a standalone script that I can reproduce the issue with here and attach it...
I'm saying that root_squash is going to map to nfsnobody, and nfsnobody != nobody. So if the file/dir is 600/700, it is not accessible to root on the client, and this is expected... I'm confused....
Sorry. Let me elaborate a little. If you have an NFS export with root squash, owned by a non-root user that exists on the client, and that local user has read access, then there is potential for a malicious remote file to be copied to the local system. Fanotify provides hooks to allow a given process to control this with the FAN_OPEN_PERM mode of operation.

So /tmp/nfs_export/example is owned by 'foo' with a mode of 0600, and /tmp/nfs_export is also owned by 'foo' with a mode of 0700. When I call fanotify_mark on /tmp/nfs_mount in the on-access monitoring process, running as root, fanotify_mark returns -1 because NFS squashes access and prevents root access, as you quite rightly pointed out. I work around this by executing a subprocess that inherits the fanotify_fd and performs the fanotify_mark with dropped privileges as 'foo'. This is something root has always been able to do with root_squash, which is why I don't know why it squashes you to nfsnobody as opposed to the owner of the file (provided they exist locally). Anyway, that's another issue and I am getting sidetracked...

So now I have a mark on a root-owned fanotify_fd, placed by a subprocess I ran as 'foo'. If I now read the /tmp/nfs_mount/example file as user 'foo':

  sudo -u foo cat /tmp/nfs_mount/example

...I get an event in the fanotify process running as root, with a file descriptor copied from kernel space to user space for the file being accessed by foo. However, the fanotify process running as root gets an EIO error reading from this file descriptor when determining if it is "safe". Meanwhile, user 'foo' is blocked while the process decides whether to fail open or not. This is, again, because NFS squashes root access to nfsnobody.

Again, it is possible to work around this by forking yet another process, running as 'foo', to perform seeks and reads on the event file descriptor; 'foo' is determined by stat'ing the /proc/`event.pid`/fd/`event.fd` file and looking at the value of st_uid.
My issue is that forking is very cumbersome for on-access scanning, given that it will increase the load average of the system for file operations like a recursive grep. It is also not thread safe: if the fanotify process has 5 or more threads parsing fanotify events and responding based on some evaluation, forking becomes very problematic, and you can't just drop privileges in a given thread without first blocking the other threads.

This is also the case with GVFS mounted file systems, but I am more concerned that for NFS v4 the read operations on fanotify_fd fail with EACCES under these scenarios, so you can't even obtain the event struct. I would expect the kernel to allow root to obtain file owner privileges when constructing the fanotify event and creating the accessed file descriptor. Without this, fanotify can't secure GVFS or NFS file system access.

I am part way through my test script; do you still want me to attach it?

Thanks
Craig
I guess I'm going to need to see the test script to figure out what you are really doing. Seems like this is trivially fixed by using anon_uid=nobody in your exports instead of squashing root to have 0 access...
Created attachment 106913
Test demonstrating drop privileges required for fanotify_mark() on root squash fs.

This exhibits the fanotify_mark() issue but, for some as yet unknown reason, it doesn't exhibit the read failure on the event file descriptor. I am sifting through the production code to see if I can identify how it differs from my stand-alone example. The only obvious difference is that my stand-alone example is single threaded, so could it be lock/mutex related? The argument to the script is the owner of the NFS export directory and files, so you can specify some other user than the default of 'mail' that I have used.
The script can also be used to demonstrate the NFS v4 issue. Just change the vers=3 mount option to vers=4 and you get the following output:

fanotify_init(): Initialising...
fanotify_init(): Initialised on fd 3
fanotify_mark(): Marking (null)...
Error: 13: Permission denied
Okay, so let's do that again with dropped privileges...
(This wouldn't work in a multi-threaded environment without forking and executing another process, so let's do it like that)
Getting stat info: /tmp/fanotify_test.sh.lxwvvNN1/nfs_mount
Waiting for fanotify_mark() child process...
child: Setting real & effective uid: 8
Closing fanotify fd: 3
Child exited with exit status: 0
fanotify_mark(): Marked /tmp/fanotify_test.sh.lxwvvNN1/nfs_mount
loop: Waiting for events...
loop: Waiting for events...
loop: Waiting for events...
loop: Error: Failed to read from fanotify fd: 20: Not a directory
loop: Waiting for events...
Received SIGTERM, shutting down...
loop: Shutdown gracefully
Closing fanotify fd: 3
Also, noticed a printf ordering bug in the form of Marking (null). Feel free to fix that one :-)
I have been able to reproduce the issue on NFS 3 with my stand-alone script. It seems that on Kernel version 3.2.0-23, it works. But on the 3.5.0-34 kernel I have, it doesn't work. I will paste the output separately...
Kernel version: 3.5.0-34-generic
Compiling /tmp/fanotify_test.sh.E55gudiy/fanotify_test.c...
Setting up directories...
Testing NFS export owned by mail...
drwxrwxrwx 4 root root 4096 Jul 18 10:37 /tmp/fanotify_test.sh.E55gudiy
drwx------ 2 mail root 4096 Jul 18 10:37 /tmp/fanotify_test.sh.E55gudiy/nfs_export
-rwx------ 1 mail root 29 Jul 18 10:37 /tmp/fanotify_test.sh.E55gudiy/nfs_export/example
-rw-r--r-- 1 root root 6521 Jul 18 10:37 /tmp/fanotify_test.sh.E55gudiy/fanotify_test.c
drwxrwxrwx 2 root root 4096 Jul 18 10:37 /tmp/fanotify_test.sh.E55gudiy/nfs_mount
-rwxr-xr-x 1 root root 13744 Jul 18 10:37 /tmp/fanotify_test.sh.E55gudiy/fanotify_test
-rw-r--r-- 1 root root 389 Jul 18 10:37 /tmp/fanotify_test.sh.E55gudiy/exports.bak
Configuring NFS server...
 * Re-exporting directories for NFS kernel daemon... [ OK ]
Mounting the NFS export to /tmp/fanotify_test.sh.E55gudiy/nfs_mount...
Starting fanotify handler as root...
Waiting for process to settle...
Attempting to access /tmp/fanotify_test.sh.E55gudiy/nfs_mount/example as nobody...
Thu Jul 18 10:37:56 BST 2013
=== Output from fanotify handler process ===
fanotify_init(): Initialising...
fanotify_init(): Initialised on fd 3
fanotify_mark(): Marking (null)...
Error: 13: Permission denied
Okay, so let's do that again with dropped privileges...
(This wouldn't work in a multi-threaded environment without forking and executing another process, so let's do it like that)
Getting stat info: /tmp/fanotify_test.sh.E55gudiy/nfs_mount
Waiting for fanotify_mark() child process...
child: Setting real & effective uid: 8
Closing fanotify fd: 3
Child exited with exit status: 0
fanotify_mark(): Marked /tmp/fanotify_test.sh.E55gudiy/nfs_mount
loop: Waiting for events...
loop: Waiting for events...
loop: Waiting for events...
loop: Read 24 bytes from fanotify fd
loop: Event file descriptor: 4
loop: Error: Readlink failed: 2: No such file or directory
loop: Allowing access to file: /proc/6595/fd/4
loop: Reading from event file descriptor...
loop: Error: Read failed: 5: Input/output error
loop: Waiting for events...
Received SIGTERM, shutting down...
loop: Shutdown gracefully
Closing fanotify fd: 3
=== End of output ===
Restoring exports from /tmp/fanotify_test.sh.E55gudiy...
 * Re-exporting directories for NFS kernel daemon...

Note the EIO error trying to read from the event file descriptor.
Kernel version: 3.2.0-23-generic
Compiling /tmp/fanotify_test.sh.MdFC1yq7/fanotify_test.c...
Setting up directories...
Testing NFS export owned by mail...
drwxrwxrwx 4 root root 4096 Jul 18 10:41 /tmp/fanotify_test.sh.MdFC1yq7
-rw-r--r-- 1 root root 389 Jul 18 10:41 /tmp/fanotify_test.sh.MdFC1yq7/exports.bak
drwxrwxrwx 2 root root 4096 Jul 18 10:41 /tmp/fanotify_test.sh.MdFC1yq7/nfs_mount
-rw-r--r-- 1 root root 6521 Jul 18 10:41 /tmp/fanotify_test.sh.MdFC1yq7/fanotify_test.c
-rwxr-xr-x 1 root root 12212 Jul 18 10:41 /tmp/fanotify_test.sh.MdFC1yq7/fanotify_test
drwx------ 2 mail root 4096 Jul 18 10:41 /tmp/fanotify_test.sh.MdFC1yq7/nfs_export
-rwx------ 1 mail root 29 Jul 18 10:41 /tmp/fanotify_test.sh.MdFC1yq7/nfs_export/example
Configuring NFS server...
 * Re-exporting directories for NFS kernel daemon... [ OK ]
Mounting the NFS export to /tmp/fanotify_test.sh.MdFC1yq7/nfs_mount...
Starting fanotify handler as root...
Waiting for process to settle...
Attempting to access /tmp/fanotify_test.sh.MdFC1yq7/nfs_mount/example as nobody...
Thu Jul 18 10:41:16 BST 2013
=== Output from fanotify handler process ===
fanotify_init(): Initialising...
fanotify_init(): Initialised on fd 3
fanotify_mark(): Marking (null)...
Error: 13: Permission denied
Okay, so let's do that again with dropped privileges...
(This wouldn't work in a multi-threaded environment without forking and executing another process, so let's do it like that)
Getting stat info: /tmp/fanotify_test.sh.MdFC1yq7/nfs_mount
Waiting for fanotify_mark() child process...
child: Setting real & effective uid: 8
Closing fanotify fd: 3
Child exited with exit status: 0
fanotify_mark(): Marked /tmp/fanotify_test.sh.MdFC1yq7/nfs_mount
loop: Waiting for events...
loop: Waiting for events...
loop: Waiting for events...
loop: Read 24 bytes from fanotify fd
loop: Event file descriptor: 4
loop: Error: Readlink failed: 2: No such file or directory
loop: Allowing access to file: /proc/2940/fd/4
loop: Reading from event file descriptor...
loop: Read 29 bytes from file: /proc/2940/fd/4
loop: Waiting for events...
Received SIGTERM, shutting down...
loop: Shutdown gracefully
Closing fanotify fd: 3
=== End of output ===
Restoring exports from /tmp/fanotify_test.sh.MdFC1yq7...
 * Re-exporting directories for NFS kernel daemon...

Note on this version, it works fine and there is no EIO error.
Kernel version: 3.8.0-26-generic
Compiling /tmp/fanotify_test.sh.eqXYHrki/fanotify_test.c...
Setting up directories...
Testing NFS export owned by mail...
drwxrwxrwx 4 root root 4096 Jul 18 11:14 /tmp/fanotify_test.sh.eqXYHrki
-rw-r--r-- 1 root root 389 Jul 18 11:14 /tmp/fanotify_test.sh.eqXYHrki/exports.bak
drwxrwxrwx 2 root root 4096 Jul 18 11:14 /tmp/fanotify_test.sh.eqXYHrki/nfs_mount
-rw-r--r-- 1 root root 6521 Jul 18 11:14 /tmp/fanotify_test.sh.eqXYHrki/fanotify_test.c
-rwxr-xr-x 1 root root 12212 Jul 18 11:14 /tmp/fanotify_test.sh.eqXYHrki/fanotify_test
drwx------ 2 mail root 4096 Jul 18 11:14 /tmp/fanotify_test.sh.eqXYHrki/nfs_export
-rwx------ 1 mail root 29 Jul 18 11:14 /tmp/fanotify_test.sh.eqXYHrki/nfs_export/example
Configuring NFS server...
 * Re-exporting directories for NFS kernel daemon... [ OK ]
Mounting the NFS export to /tmp/fanotify_test.sh.eqXYHrki/nfs_mount...
Starting fanotify handler as root...
Waiting for process to settle...
Attempting to access /tmp/fanotify_test.sh.eqXYHrki/nfs_mount/example as nobody...
Thu Jul 18 11:14:15 BST 2013
=== Output from fanotify handler process ===
fanotify_init(): Initialising...
fanotify_init(): Initialised on fd 3
fanotify_mark(): Marking /tmp/fanotify_test.sh.eqXYHrki/nfs_mount...
Error: 13: Permission denied
Okay, so let's do that again with dropped privileges...
(This wouldn't work in a multi-threaded environment without forking and executing another process, so let's do it like that)
Getting stat info: /tmp/fanotify_test.sh.eqXYHrki/nfs_mount
Waiting for fanotify_mark() child process...
child: Setting real & effective uid: 8
Closing fanotify fd: 3
Child exited with exit status: 0
fanotify_mark(): Marked /tmp/fanotify_test.sh.eqXYHrki/nfs_mount
loop: Waiting for events...
loop: Waiting for events...
loop: Waiting for events...
loop: Read 24 bytes from fanotify fd
loop: Event file descriptor: 4
loop: Error: Readlink failed: 2: No such file or directory
loop: Allowing access to file: /proc/1886/fd/4
loop: Reading from event file descriptor...
loop: Error: Read failed: 5: Input/output error
loop: Waiting for events...
Received SIGTERM, shutting down...
loop: Shutdown gracefully
Closing fanotify fd: 3
=== End of output ===
Restoring exports from /tmp/fanotify_test.sh.eqXYHrki...
 * Re-exporting directories for NFS kernel daemon...
I have noticed on 32bit platforms and a 3.2 kernel, the EIO errors occur intermittently. Below are two identical runs, several seconds apart. The first run worked fine and was able to read from the file descriptor. The second run failed and was not able to read from the file descriptor (EIO error).

The good run:

Platform: Linux somehost.somedomain 3.2.0-48-generic #74-Ubuntu SMP Thu Jun 6 19:45:16 UTC 2013 i686 i686 i386 GNU/Linux
Compiling /tmp/fanotify_test.sh.rTXTGfRG/fanotify_test.c...
Setting up directories...
Testing NFS export owned by mail...
drwxrwxrwx 4 root root 4096 Jul 23 12:17 /tmp/fanotify_test.sh.rTXTGfRG
-rw-r--r-- 1 root root 389 Jul 23 12:17 /tmp/fanotify_test.sh.rTXTGfRG/exports.bak
drwxrwxrwx 2 root root 4096 Jul 23 12:17 /tmp/fanotify_test.sh.rTXTGfRG/nfs_mount
-rw-r--r-- 1 root root 6521 Jul 23 12:17 /tmp/fanotify_test.sh.rTXTGfRG/fanotify_test.c
-rwxr-xr-x 1 root root 12212 Jul 23 12:17 /tmp/fanotify_test.sh.rTXTGfRG/fanotify_test
drwx------ 2 mail root 4096 Jul 23 12:17 /tmp/fanotify_test.sh.rTXTGfRG/nfs_export
-rwx------ 1 mail root 29 Jul 23 12:17 /tmp/fanotify_test.sh.rTXTGfRG/nfs_export/example
Configuring NFS server...
 * Re-exporting directories for NFS kernel daemon... [ OK ]
Mounting the NFS export to /tmp/fanotify_test.sh.rTXTGfRG/nfs_mount...
Starting fanotify handler as root...
Waiting for process to settle...
Attempting to access /tmp/fanotify_test.sh.rTXTGfRG/nfs_mount/example as nobody...
Tue Jul 23 12:17:37 BST 2013
=== Output from fanotify handler process ===
fanotify_init(): Initialising...
fanotify_init(): Initialised on fd 3
fanotify_mark(): Marking /tmp/fanotify_test.sh.rTXTGfRG/nfs_mount...
Error: 13: Permission denied
Okay, so let's do that again with dropped privileges...
(This wouldn't work in a multi-threaded environment without forking and executing another process, so let's do it like that)
Getting stat info: /tmp/fanotify_test.sh.rTXTGfRG/nfs_mount
Waiting for fanotify_mark() child process...
child: Setting real & effective uid: 8
Closing fanotify fd: 3
Child exited with exit status: 0
fanotify_mark(): Marked /tmp/fanotify_test.sh.rTXTGfRG/nfs_mount
loop: Waiting for events...
loop: Waiting for events...
loop: Waiting for events...
loop: Read 24 bytes from fanotify fd
loop: Event file descriptor: 4
loop: Error: Readlink failed: 2: No such file or directory
loop: Allowing access to file: /proc/2068/fd/4
loop: Reading from event file descriptor...
loop: Read 29 bytes from file: /proc/2068/fd/4
loop: Waiting for events...
Received SIGTERM, shutting down...
loop: Shutdown gracefully
Closing fanotify fd: 3
=== End of output ===
Restoring exports from /tmp/fanotify_test.sh.rTXTGfRG...
 * Re-exporting directories for NFS kernel daemon...

The failed run:

Platform: Linux somehost.somedomain 3.2.0-48-generic #74-Ubuntu SMP Thu Jun 6 19:45:16 UTC 2013 i686 i686 i386 GNU/Linux
Compiling /tmp/fanotify_test.sh.H8b9ap4o/fanotify_test.c...
Setting up directories...
Testing NFS export owned by mail...
drwxrwxrwx 4 root root 4096 Jul 23 12:17 /tmp/fanotify_test.sh.H8b9ap4o
-rw-r--r-- 1 root root 389 Jul 23 12:17 /tmp/fanotify_test.sh.H8b9ap4o/exports.bak
drwxrwxrwx 2 root root 4096 Jul 23 12:17 /tmp/fanotify_test.sh.H8b9ap4o/nfs_mount
-rw-r--r-- 1 root root 6521 Jul 23 12:17 /tmp/fanotify_test.sh.H8b9ap4o/fanotify_test.c
-rwxr-xr-x 1 root root 12212 Jul 23 12:17 /tmp/fanotify_test.sh.H8b9ap4o/fanotify_test
drwx------ 2 mail root 4096 Jul 23 12:17 /tmp/fanotify_test.sh.H8b9ap4o/nfs_export
-rwx------ 1 mail root 29 Jul 23 12:17 /tmp/fanotify_test.sh.H8b9ap4o/nfs_export/example
Configuring NFS server...
 * Re-exporting directories for NFS kernel daemon... [ OK ]
Mounting the NFS export to /tmp/fanotify_test.sh.H8b9ap4o/nfs_mount...
Starting fanotify handler as root...
Waiting for process to settle...
Attempting to access /tmp/fanotify_test.sh.H8b9ap4o/nfs_mount/example as nobody...
Tue Jul 23 12:17:40 BST 2013
=== Output from fanotify handler process ===
fanotify_init(): Initialising...
fanotify_init(): Initialised on fd 3
fanotify_mark(): Marking /tmp/fanotify_test.sh.H8b9ap4o/nfs_mount...
Error: 13: Permission denied
Okay, so let's do that again with dropped privileges...
(This wouldn't work in a multi-threaded environment without forking and executing another process, so let's do it like that)
Getting stat info: /tmp/fanotify_test.sh.H8b9ap4o/nfs_mount
Waiting for fanotify_mark() child process...
child: Setting real & effective uid: 8
Closing fanotify fd: 3
Child exited with exit status: 0
fanotify_mark(): Marked /tmp/fanotify_test.sh.H8b9ap4o/nfs_mount
loop: Waiting for events...
loop: Waiting for events...
loop: Waiting for events...
loop: Read 24 bytes from fanotify fd
loop: Event file descriptor: 4
loop: Error: Readlink failed: 2: No such file or directory
loop: Allowing access to file: /proc/2148/fd/4
loop: Reading from event file descriptor...
loop: Error: Read failed: 5: Input/output error
loop: Waiting for events...
Received SIGTERM, shutting down...
loop: Shutdown gracefully
Closing fanotify fd: 3
=== End of output ===
Restoring exports from /tmp/fanotify_test.sh.H8b9ap4o...
 * Re-exporting directories for NFS kernel daemon...
I ran the aforementioned tests with tcpdump running to capture the NFS packets. As far as NFS is concerned, all three requests were served, and we can see 29 bytes being transferred in all three dumps. But the first two failed with EIO errors, while the last one was able to read 29 bytes from the event file descriptor. Does this suggest that NFS is not preventing access?
Created attachment 106995
Revised test script.

I have made some changes to the test script.
I have eliminated the intermittency of the EIO error on NFSv3 and isolated it to whether a response has been processed or not. In my sample program, I didn't want to block IO unnecessarily, so I made sure I sent a response first (ALLOW). If this response is handled in time, reading from the event file descriptor does not cause an EIO error.

However, I revised this decision and moved the sending of the response to after the attempts to read from the event file descriptor. If I don't send a response first, I consistently get EIO errors when reading from the event file descriptor (some sort of kernel file locking issue maybe?).

I also moved the sending of the response back to before the read and stuck a second sleep after the write() call to attempt synchronisation. With the sleep in after sending the response, we get no EIO errors at all, but this presents the problem that a response has been sent before the handler has identified what type of response to send.
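For clarity, the ordering a content-based handler needs (read first, respond afterwards) can be sketched as below; per the observations above, this is the ordering that consistently hits EIO on the affected kernels. The names looks_unsafe and inspect_then_respond are made up for illustration, the "EVIL" prefix check stands in for a real scanner, and failing open when the read errors is an assumed policy, not something taken from the test script.

```c
#include <assert.h>
#include <string.h>
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical content check: treat data starting with "EVIL" as unsafe. */
int looks_unsafe(const char *data, size_t len)
{
    return len >= 4 && memcmp(data, "EVIL", 4) == 0;
}

/* Read from the event fd first, then write the verdict to the fanotify fd,
 * then close the event fd.
 * Returns the verdict sent (FAN_ALLOW or FAN_DENY), or -1 if the
 * response could not be written. */
int inspect_then_respond(int fan_fd, int event_fd)
{
    struct fanotify_response resp;
    char buf[4096];
    ssize_t n;
    int rc;

    assert(fan_fd >= 0);

    /* The accessing process stays blocked until the response is written. */
    n = read(event_fd, buf, sizeof buf);

    resp.fd = event_fd;
    /* Assumed policy: fail open when the content cannot be read (e.g. EIO). */
    resp.response = (n > 0 && looks_unsafe(buf, (size_t)n)) ? FAN_DENY
                                                            : FAN_ALLOW;

    rc = (write(fan_fd, &resp, sizeof resp) == (ssize_t)sizeof resp)
             ? (int)resp.response
             : -1;
    close(event_fd);
    return rc;
}
```

Swapping the read() and the write() gives the "respond first" variant from the original sample program, which avoids the EIO only by giving up the ability to deny access based on content.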
I enabled NFS debugging output:

  rpcdebug -m nfs -s all
  rpcdebug -m nfsd -s all

It seems that the originating process, after being unblocked by the ALLOW response, performs a flush and release on the inode. This seems to allow the root fanotify process to read from the event file descriptor. I guess this is where my input will stop, since I don't know enough about what is going on in kernel space. I've had a glance over the NFS codebase, but all I can see is an area of code that descends from the fh_verify() call and checks the current credentials for NFSD_MAY_LOCK or ( NFSD_MAY_READ and NFSD_MAY_USER_OVERRIDE ). The ball is in your court now; I hope my feedback has been useful.
> Seems like this is trivially fixed by using anon_uid=nobody in your exports
> instead of squashing root to have 0 access...

I tried setting this option in the exports; it still does not allow the root fanotify process to read from the event file descriptor. Any further update on this would be useful.
We have bisected the NFSv4 issue (unable to receive fanotify events - just get "Not a directory" errors). The bug originated at:

commit 1788ea6e3b2a58cf4fb00206e362d9caff8d86a7
Author: Jeff Layton <jlayton@redhat.com>
Date: Fri Nov 4 13:31:21 2011 -0400

    nfs: when attempting to open a directory, fall back on normal lookup (try #5)

    commit d953126 changed how nfs_atomic_lookup handles an -EISDIR return
    from an OPEN call. Prior to that patch, that caused the client to fall
    back to doing a normal lookup. When that patch went in, the code began
    returning that error to userspace. The d_revalidate codepath however
    never had the corresponding change, so it was still possible to end up
    with a NULL ctx->state pointer after that.

    That patch caused a regression. When we attempt to open a directory that
    does not have a cached dentry, that open now errors out with EISDIR. If
    you attempt the same open with a cached dentry, it will succeed.

    Fix this by reverting the change in nfs_atomic_lookup and allowing
    attempts to open directories to fall back to a normal lookup

    Also, add a NFSv4-specific f_ops->open routine that just returns
    -ENOTDIR. This should never be called if things are working properly,
    but if it ever is, then the dprintk may help in debugging.

    To facilitate this, a new file_operations field is also added to the
    nfs_rpc_ops struct.

    Cc: stable@kernel.org
    Signed-off-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

This fits with the errno of ENOTDIR that we are getting from the fanotify event read.
Link to the commit in cgit: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1788ea6e3b2a58cf4fb00206e362d9caff8d86a7
To clarify: the 3.{1,2} kernels, where the regression was introduced, give ENOTDIR. The current tip of Linus' tree gives 13 (Permission denied) when reading the event.
So is fanotify trying to call dentry_open() on behalf of the filesystem in fs/notify/fanotify/fanotify_user.c:create_fd()? That would be a bug...
How should fanotify reopen the file, ideally skipping authentication checks?
Why does a notification system need to do this in the first place? Open by dentry is race-prone in NFS: there is no guarantee that the file won't have been replaced on the server. There is no way to skip authentication checks in NFS. Every RPC call that is sent is authenticated, and the server will check whether or not that user has permission to perform that particular operation at this time.
fanotify passes the fd to the user-space process to do content-based access control. See: http://lwn.net/Articles/339253/ How about passing the credentials from a different process? create_fd is called within the context of the fanotify process, but needs to use the credentials of the original process.
If you really must run this on an NFS client (rather than on the server), then I suggest creating a proper callback into the filesystem. Trying to hack it by directly calling VFS filesystem helper functions such as dentry_open() isn't going to be supported.