Bug 203271 - CIFS: open file with O_CREAT return "No such file or directory" after unlink for SMB2.0+ - xfstests generic/531
Status: RESOLVED WILL_NOT_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: CIFS
Hardware: All Linux
Importance: P1 normal
Assignee: fs_cifs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-11 07:29 UTC by Xiaoli Feng
Modified: 2019-08-16 14:06 UTC

See Also:
Kernel Version: 5.1.0-rc3+
Subsystem:
Regression: No
Bisected commit-id:



Description Xiaoli Feng 2019-04-11 07:29:25 UTC
Set up a Samba server and mount it locally with vers=3.11. The reproducer below checks whether a file can be opened again after it has been unlinked. This fails for SMB2.0+: opening the same file again with O_CREAT returns "No such file or directory".

reproducer.c:
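#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>   /* unlink() */
#include <errno.h>

/* Create ./testfile, then unlink it while the descriptor is still open. */
void test(void)
{
    int fd;
    int ret;

    fd = open("./testfile", O_RDWR | O_CREAT, 0644);
    /* Note: errno is only meaningful when open() has failed (fd < 0). */
    printf("errno=%d\n", errno);
    if (fd < 0)
    {
        perror("open failed");
        exit(1);
    }
    /* fd is not closed, so the file stays open after the unlink. */
    ret = unlink("./testfile");
    if (ret < 0)
    {
        perror("unlink failed");
        exit(1);
    }
}

int main(int argc, char **argv)
{
    int index = 50;

    /* Repeat open(O_CREAT) + unlink on the same name 50 times. */
    while (index)
    {
        index--;
        printf("loop %d\n", index);
        test();
    }
    return 0;
}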
Comment 2 Steve French 2019-06-24 03:26:51 UTC
Silly rename strategy might work in some cases (e.g. http://nfs.sourceforge.net/#faq_d2)
Comment 3 Ronnie Sahlberg 2019-06-24 03:32:21 UTC
(In reply to Steve French from comment #2)
> Silly rename strategy might work in some cases (e.g.
> http://nfs.sourceforge.net/#faq_d2)

Not sure. Silly rename only works as long as you can guarantee that all clients do it. All clients would also need to know about silly-rename and know that the names need to be filtered from readdir() (and maybe the nlink numbers fudged for the parent).

But you would still have the situation where open(O_CREAT) could fail because a different client, or a native app on the Windows server, has set the delete-on-close flag. I think it is just a situation where NTFS is different from POSIX and it is difficult to come up with a good errno value.
Comment 4 Steve French 2019-06-24 03:34:46 UTC
NFS does it - it might be worth a try.   For the readdir example, we shouldn't list files that are delete on close as they are not supposed to be in the namespace - maybe we could construct a test to open a file, unlink it, do a readdir and make sure it is not displayed (we should skip files with delete on close attribute .. right?)
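The test proposed above (open a file, unlink it, then readdir) could look roughly like the following. This is only a sketch: the helper names `dir_lists` and `unlinked_file_still_listed` are hypothetical and not part of xfstests or the kernel, and the directory to test (for example a CIFS mount point) is passed in by the caller.

```c
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if `name` is listed in directory `dir`, 0 if not, -1 on error. */
static int dir_lists(const char *dir, const char *name)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int found = 0;

    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    closedir(d);
    return found;
}

/*
 * Hypothetical check: create `name` in `dir`, unlink it while the
 * descriptor is still open, then scan the directory.
 * Returns 0 if the name is no longer listed, 1 if it still is, -1 on error.
 */
static int unlinked_file_still_listed(const char *dir, const char *name)
{
    char path[4096];
    int fd, n, listed;

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (unlink(path) < 0) {
        close(fd);
        return -1;
    }

    listed = dir_lists(dir, name);  /* fd is still open at this point */
    close(fd);
    return listed;
}
```

On a local POSIX filesystem the check is expected to return 0. A return value of 1 on the mount under test would mean a delete-on-close file is still visible in the namespace.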
Comment 5 Ronnie Sahlberg 2019-06-24 03:36:47 UTC
Well, we can try.
Comment 6 Ronnie Sahlberg 2019-08-05 22:06:52 UTC
We can't do silly-rename like NFS does, since you can't rename an open file in SMB.
We talked some about whether we could do something with the POSIX extensions and add a posix-unlink() or similar, but that wouldn't work either since it would change the semantics for Windows clients.

This is just a situation where we can't get perfect POSIX semantics in SMB2.


One thing we should fix, though, is that if the initial rename fails we try to set delete-on-close on the destination and then try the rename again.
We should do this delete-and-retry if and only if the protocol is SMB1, since it will never work on SMB2. On SMB2 it leads to "rename failed but the target file was deleted", which is not good.
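The retry policy described above can be sketched as follows. This is not the actual cifs.ko code: `enum dialect`, `struct rename_ops`, `rename_with_fallback` and the `fake_*` stubs are all hypothetical names used only to illustrate the decision of when the target may be deleted.

```c
#include <errno.h>

/* Hypothetical dialect tag and backend hooks, for illustration only. */
enum dialect { DIALECT_SMB1, DIALECT_SMB2_PLUS };

struct rename_ops {
    int (*do_rename)(const char *from, const char *to);  /* 0 or -errno */
    int (*delete_target)(const char *to);                /* 0 or -errno */
};

/* Try the rename; fall back to delete-and-retry only on SMB1. */
static int rename_with_fallback(enum dialect d, const struct rename_ops *ops,
                                const char *from, const char *to)
{
    int rc = ops->do_rename(from, to);

    if (rc == 0)
        return 0;
    /* Per the comment above, the retry never works on SMB2, so the
     * target must not be touched there. */
    if (d != DIALECT_SMB1)
        return rc;
    if (ops->delete_target(to) != 0)
        return rc;
    return ops->do_rename(from, to);
}

/* Fake backend: the rename always fails, deletions are only counted. */
static int fake_deletes;

static int fake_rename_fail(const char *from, const char *to)
{
    (void)from;
    (void)to;
    return -EACCES;
}

static int fake_delete(const char *to)
{
    (void)to;
    fake_deletes++;
    return 0;
}
```

With the fake backend, a failed rename on `DIALECT_SMB2_PLUS` returns the error and leaves `fake_deletes` at 0, while the same call on `DIALECT_SMB1` deletes the target once before retrying.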
Comment 7 Johannes ter Haak 2019-08-16 14:06:35 UTC
So is this a permanent, unfixable problem? This means whoever wants to use cifs to mount home directories in a mixed environment with pam_mount is bound to SMB1.

This variation of the above code, using `rename()` instead of `unlink()`, fails with "Permission denied" when run on SMB3 (SMB1 is fine):

```
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main( int argc, char **argv ) {
  int i, fd, ret;
  char file[20];

  for( i=0; i < 50; i++ ){
    printf( "loop %d\n", i );
    sprintf( file, "./testfile-%d", rand() );

    fd = open( file, O_RDWR|O_CREAT, 0644 );
    printf( "errno=%d\n", errno );
    if( fd < 0 ){
        perror( "open failed" );
        exit(1);
    }

    ret = rename( file, "./testfile" );
    if ( ret < 0 ){
        perror( "rename failed" );
        exit(1);
    }
  }
  return 0;
}
```

Unfortunately this is a common atomic file replacement strategy, and dconf uses it heavily. Every time a value is updated in dconf, the database file is replaced, which fails. Dconf also stores state for Nautilus and the GNOME desktop in general. A broken dconf service renders the GNOME desktop (and other desktops too) unusable (for example, keyboard layout settings do not work).
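For reference, the general write-to-temporary-then-rename pattern looks roughly like this. It is a generic sketch, not dconf's actual code: the function name `replace_file_atomically` and the `.XXXXXX` temporary suffix are assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Replace `path` with `len` bytes from `data` by writing a temporary file
 * in the same directory and renaming it over the target.
 * Returns 0 on success, -1 on error with errno set.
 */
static int replace_file_atomically(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    int fd, n, saved;

    n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof(tmp)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = mkstemp(tmp);              /* creates and opens a unique file */
    if (fd < 0)
        return -1;

    /* A short write is treated as an error to keep the sketch small. */
    if (write(fd, data, len) != (ssize_t)len) {
        saved = (errno != 0) ? errno : EIO;
        close(fd);
        goto fail;
    }
    if (fsync(fd) < 0) {
        saved = errno;
        close(fd);
        goto fail;
    }
    if (close(fd) < 0) {
        saved = errno;
        goto fail;
    }
    if (rename(tmp, path) < 0) {    /* the step that replaces the target */
        saved = errno;
        goto fail;
    }
    return 0;

fail:
    unlink(tmp);                    /* do not leave the temporary behind */
    errno = saved;
    return -1;
}
```

Readers of `path` see either the old or the new contents, never a partial write, as long as `rename()` over an existing target succeeds. That is the step reported as failing here.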

Changing a setting:

```
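# Start the watcher first (e.g. in a second terminal); -m keeps it running.
inotifywait -rm .config/dconf
# Then change a setting:
dconf write /org/gnome/rhythmbox/player/volume 1.0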
```

Results in:

```
.config/dconf/ OPEN user
.config/dconf/ CREATE user.I5XA2Z
.config/dconf/ OPEN user.I5XA2Z
.config/dconf/ MODIFY user.I5XA2Z
.config/dconf/ CLOSE_WRITE,CLOSE user.I5XA2Z
.config/dconf/ DELETE user
.config/dconf/ MOVED_FROM user.I5XA2Z
.config/dconf/ MOVED_TO user
```

I suspect the following Ubuntu bug report is related to this:

https://bugs.launchpad.net/ubuntu/+source/cifs-utils/+bug/1764778

We're on Fedora 29 and Kernel 5.2.7-100.fc29.x86_64
Samba server is 4.8.3 on CentOS 7
