Bug 217850 - overlayfs: cannot rename symlink if lower filesystem is FUSE/NFS
Status: NEW
Alias: None
Product: Linux
Classification: Unclassified
Component: Kernel
Hardware: All Linux
Importance: P3 high
Assignee: Virtual assignee for kernel bugs
 
Reported: 2023-08-31 17:17 UTC by Ruiwen Zhao
Modified: 2023-09-01 09:52 UTC

Regression: No


Description Ruiwen Zhao 2023-08-31 17:17:20 UTC
Hi, 

We recently found a regression in the Linux kernel: rename(2) on a symlink through overlayfs fails with ENXIO when the lowerdir is on FUSE.

*What happened*

When running `mv` on a symlink through overlayfs, with the overlayfs lowerdir on FUSE or NFS, the command fails with "No such device or address". This happens on kernels 5.15 and 6.1, but not on 5.10.

*How to reproduce*
Environment: Debian bookworm (kernel 6.1.0)

1. To prepare the FUSE fs, create a file and a symlink under the VM's root dir:

```
ruiwen@instance-1:/tmp$ ls / -l | grep foo
-rw-r--r--   1 root root     0 Aug 30 23:10 foo
lrwxrwxrwx   1 root root     3 Aug 30 23:12 foolink -> foo
```
and then run libfuse's passthrough (https://github.com/libfuse/libfuse/blob/master/example/passthrough.c), which mounts a FUSE filesystem by mirroring the root dir:

```
ruiwen@instance-1:~/fuse-3.16.1/build/example$ ./passthrough -o allow_other /tmp/fusemount
ruiwen@instance-1:~/fuse-3.16.1/build/example$ ls /tmp/fusemount/ -l | grep foo
-rw-r--r--   1 root root     0 Aug 30 23:10 foo
lrwxrwxrwx   1 root root     3 Aug 30 23:12 foolink -> foo
```

2. Create an overlayfs mount, with the lower dir being the mount point of the FUSE filesystem:
```
ruiwen@instance-1:/tmp$ mkdir -p fusemount upper work merged
ruiwen@instance-1:/tmp$ sudo mount -t overlay overlay -o lowerdir=fusemount,upperdir=upper,workdir=work merged
ruiwen@instance-1:/tmp$ ls -l merged/ | grep foo
-rw-r--r--   1 root root     0 Aug 30 23:10 foo
lrwxrwxrwx   1 root root     3 Aug 30 23:12 foolink -> foo
```

3. Try to move the symlink and see the failure:
```
ruiwen@instance-1:/tmp$ mv merged/foolink merged/foolink2
mv: cannot move 'merged/foolink' to 'merged/foolink2': No such device or address
```
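
A minimal sketch of the check in step 3 as a reusable shell function, assuming a POSIX shell and `mv` from coreutils. The function name `check_symlink_rename` and the `.renamed` suffix are hypothetical, chosen only for illustration.

```shell
# Try to rename a symlink inside DIR and report the result.
# On success the symlink is renamed back so the directory listing is unchanged.
check_symlink_rename() {
    dir=$1
    name=$2
    if [ ! -L "$dir/$name" ]; then
        echo "SKIP: $dir/$name is not a symlink"
        return 2
    fi
    # Capture the error text from mv so it can be included in the report.
    if err=$(mv "$dir/$name" "$dir/$name.renamed" 2>&1); then
        mv "$dir/$name.renamed" "$dir/$name"
        echo "OK: rename succeeded"
        return 0
    fi
    echo "FAIL: $err"
    return 1
}
```

With the setup above this would be called as `check_symlink_rename merged foolink`. Note that a successful rename copies the symlink up to the upper layer, so a repeat run on the same overlay no longer exercises the copy-up from the lower filesystem.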


*Some observations*

1. The same bug has been reported in the Debian bug tracker, where overlayfs is used with NFS: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1049885. This makes me think that the bug is in overlayfs rather than in FUSE or NFS.

2. This issue can be reproduced on kernels 5.15 and 6.1, but CANNOT be reproduced on kernel 5.10. There is a notable overlayfs-related change in 5.15: (https://github.com/torvalds/linux/commit/72db82115d2bdfbfba8b15a92d91872cfe1b40c6), which introduces copy-up of fileattr.

3. When reproducing this bug, we found that the ENXIO error actually comes from retrieving the lower fileattr. In dmesg we see: "failed to retrieve lower fileattr (/link, err=-6)". So it seems that overlayfs, for some reason, fails to get the file attributes of the source file from the underlying filesystem.
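
To pull that message out of the kernel log, something like the following can be used. It is only a sketch: the function name `ovl_fileattr_errors` is hypothetical, and the pattern assumes the message has exactly the form quoted above. On Linux, errno 6 is ENXIO, which matches the "No such device or address" text printed by `mv`.

```shell
# Read kernel log text on stdin and print "path errno" for each
# overlayfs "failed to retrieve lower fileattr" message found.
ovl_fileattr_errors() {
    sed -n 's/.*failed to retrieve lower fileattr (\(.*\), err=-\([0-9][0-9]*\)).*/\1 \2/p'
}
```

Typical use would be `sudo dmesg | ovl_fileattr_errors`, run right after the failing `mv`.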
Comment 1 Bagas Sanjaya 2023-09-01 09:52:39 UTC
(In reply to Ruiwen Zhao from comment #0)

First, verify that the issue also occurs on the latest mainline kernel, self-compiled (currently v6.5). If it persists, try bisecting (see Documentation/admin-guide/bug-bisect.rst for instructions).
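
Going by the report, the bisection could start with `git bisect start`, `git bisect good v5.10` and `git bisect bad v5.15`. A minimal sketch of a per-kernel check follows, to be run after booting each candidate kernel and recreating the overlay mount with an empty upperdir. The function name `bisect_verdict` and the `.tmp` suffix are hypothetical. The function only prints the command to record and does not run git itself.

```shell
# Print the git bisect verdict for the currently booted kernel.
# LINK is the path of a symlink inside the merged overlayfs directory.
bisect_verdict() {
    link=$1
    if [ ! -L "$link" ]; then
        # The test setup is missing, so this kernel cannot be judged.
        echo "git bisect skip"
        return 0
    fi
    if mv "$link" "$link.tmp" 2>/dev/null; then
        # Rename worked: restore the original name and mark as good.
        mv "$link.tmp" "$link"
        echo "git bisect good"
    else
        echo "git bisect bad"
    fi
}
```

For the reproduction in comment #0 this would be called as `bisect_verdict /tmp/merged/foolink`.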
