Bug 204705

Summary: Race condition between vfs_rename and vfs_link
Product: File System Reporter: Xavier Roche (xavier.roche+bugzilla.kernel.org)
Component: OtherAssignee: fs_other
Status: NEW ---    
Severity: normal CC: xavier.roche+bugzilla.kernel.org, xavier.roche
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.3.0-rc6 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: Script demonstrating the race condition
Naive fix demonstrating the origin of the issue

Description Xavier Roche 2019-08-27 07:08:57 UTC
Created attachment 284621 [details]
Script demonstrating the race condition

1. Problem description

There seem to be a race condition vfs_rename and vfs_link, when those operations are done in parallel:

1. Moving a file to an existing target file to replace it (eg. mv file target)
2. Creating a link from the target file (eg. ln target link)

The attached rename_link_race.sh script demonstrated the issue:
$ ./rename_link_race.sh
ln: failed to create hard link 'link' => 'target': No such file or directory

The link call randomly yields ENOENT.

2. Concurrent operation guarantee

My understanding is that on client side, as the target file always exist and is never erased, but just replaced, the link should never fail.

The POSIX standard does not seem to directly address this case, but my understanding is that rename() atomicity should guarantee correct behavior w.r.t rename()/link() concurrent operations:

See IEEE Std 1003.1,

(1) https://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html

"If the link named by the new argument exists, it shall be removed and old renamed to new. In this case, a link named new shall remain visible to other processes throughout the renaming operation and refer either to the file referred to by new or old before the operation began."

"This rename() function is equivalent for regular files to that defined by the ISO C standard. Its inclusion here expands that definition to include actions on directories and specifies behavior when the new parameter names a file that already exists. That specification requires that the action of the function be atomic."

(2) https://pubs.opengroup.org/onlinepubs/009695399/functions/link.html
[ENOENT]
    "A component of either path prefix does not exist; the file named by path1 does not exist; or path1 or path2 points to an empty string."

The only erroneous [ENOENT] case is when the file does not "exist"; but the renaming should guarantee that the file always "exists" ("a link named new shall remain visible to other processes throughout the renaming operation and refer either to the file referred to by new or old before the operation began")

3. Regression origin

This issue seems to be an old regression introduced in 2.6.38 by a fix (aae8a97d3ec30) in fs/namei.c preventing to hardlink a deleted file (with i_nlink == 0)

A naive "fix" is to remove the "if (inode->i_nlink == 0 && !(inode->i_state & I_LINKABLE))" case, but this is probably not something we want to do.

Note that this bug report summarize the linux-fsdevel report (https://marc.info/?t=156663157200002&r=1&w=2)
Comment 1 Xavier Roche 2019-08-27 07:09:36 UTC
Created attachment 284623 [details]
Naive fix demonstrating the origin of the issue