Created attachment 284621 [details] Script demonstrating the race condition 1. Problem description There seem to be a race condition vfs_rename and vfs_link, when those operations are done in parallel: 1. Moving a file to an existing target file to replace it (eg. mv file target) 2. Creating a link from the target file (eg. ln target link) The attached rename_link_race.sh script demonstrated the issue: $ ./rename_link_race.sh ln: failed to create hard link 'link' => 'target': No such file or directory The link call randomly yields ENOENT. 2. Concurrent operation guarantee My understanding is that on client side, as the target file always exist and is never erased, but just replaced, the link should never fail. The POSIX standard does not seem to directly address this case, but my understanding is that rename() atomicity should guarantee correct behavior w.r.t rename()/link() concurrent operations: See IEEE Std 1003.1, (1) https://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html "If the link named by the new argument exists, it shall be removed and old renamed to new. In this case, a link named new shall remain visible to other processes throughout the renaming operation and refer either to the file referred to by new or old before the operation began." "This rename() function is equivalent for regular files to that defined by the ISO C standard. Its inclusion here expands that definition to include actions on directories and specifies behavior when the new parameter names a file that already exists. That specification requires that the action of the function be atomic." (2) https://pubs.opengroup.org/onlinepubs/009695399/functions/link.html [ENOENT] "A component of either path prefix does not exist; the file named by path1 does not exist; or path1 or path2 points to an empty string." The only erroneous [ENOENT] case is when the file does not "exist"; but the renaming should guarantee that the file always "exists" ("a link named new shall remain visible to other processes throughout the renaming operation and refer either to the file referred to by new or old before the operation began") 3. Regression origin This issue seems to be an old regression introduced in 2.6.38 by a fix (aae8a97d3ec30) in fs/namei.c preventing to hardlink a deleted file (with i_nlink == 0) A naive "fix" is to remove the "if (inode->i_nlink == 0 && !(inode->i_state & I_LINKABLE))" case, but this is probably not something we want to do. Note that this bug report summarize the linux-fsdevel report (https://marc.info/?t=156663157200002&r=1&w=2)
Created attachment 284623 [details] Naive fix demonstrating the origin of the issue
See new thread also: https://lore.kernel.org/lkml/CAJfpegvVKWHhhXwOi9jDUOJi2BnYSDxZQrp1_RRrpVjjZ3Rs2w@mail.gmail.com/T/