Bug 203269

Summary: CIFS: block in wait_for_free_credits when open file for SMB1.0 - xfstests generic/531
Product: File System
Reporter: Xiaoli Feng (fengxiaoli0714)
Component: CIFS
Assignee: fs_cifs (fs_cifs)
Status: RESOLVED CODE_FIX
Severity: high
CC: smfrench
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.1.0-rc3+
Subsystem:
Regression: No
Bisected commit-id:
Attachments: kernel patch to cifs.ko to fix cause of hang reported here

Description Xiaoli Feng 2019-04-11 07:27:36 UTC
Set up a Samba server and mount the share locally with vers=1.0. Run the reproducer below to test whether opening a file still succeeds after an unlink. The process blocks in wait_for_free_credits and never returns.

[<0>] wait_for_free_credits+0x258/0x460 [cifs]
[<0>] SendReceive+0xcd/0x360 [cifs]
[<0>] CIFSPOSIXCreate+0x1bf/0x440 [cifs]
[<0>] cifs_posix_open+0x1ee/0x300 [cifs]
[<0>] cifs_do_create+0x447/0x710 [cifs]
[<0>] cifs_atomic_open+0x1ab/0x530 [cifs]
[<0>] path_openat+0xd54/0x1670
[<0>] do_filp_open+0x93/0x100
[<0>] do_sys_open+0x186/0x220
[<0>] do_syscall_64+0x55/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff

reproducer.c:
#define _GNU_SOURCE            
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/file.h>
#include <errno.h>
#include <unistd.h>   /* unlink() */
void test()
{
    int fd;
    int ret;

    fd = open("./testfile",O_RDWR|O_CREAT,0644);
    printf("errno=%d\n",errno);
    if (fd < 0)
    {
        perror("open failed");
        exit(1);
    }
    ret = unlink("./testfile");
    if (ret < 0)
    {
        perror("unlink failed");
        exit(1);
    }
}
int main(int argc, char **argv)
{
    int index=50;
    while(index)
    {
        index--;
        printf("loop %d\n",index);
        test();
    }
    return 0;
}
Comment 1 Xiaoli Feng 2019-04-15 02:17:49 UTC
This is a regression. The test passes on v4.9-rc8 but fails on v5.0.
Comment 2 Steve French 2019-04-17 04:29:18 UTC
Looks like a fairly simple bug

Presumably the oplock break issue mentioned above is easy to work around with "modprobe cifs enable_oplocks=N", but the underlying problem seems to be that we are leaking 'credits' (or, more accurately for the SMB1 dialect, the active request count), one for every oplock break. It should be a fairly easy fix. It probably wasn't noticed sooner because use of SMB1 is discouraged: security is worse in this very old dialect.
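
To make the accounting problem concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions, not the cifs.ko code: the names toy_server, toy_try_acquire, toy_release and toy_run are hypothetical, the limit on outstanding requests is an arbitrary parameter, and it assumes each open+unlink iteration triggers exactly one oplock break whose request slot is never given back.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the cifs.ko data structures. */
struct toy_server {
    int in_flight;      /* requests currently counted as outstanding */
    int max_in_flight;  /* assumed limit on simultaneous requests */
};

/* Take one request slot. Returns false where a real client would
 * instead sleep waiting for a slot to become free. */
static bool toy_try_acquire(struct toy_server *s)
{
    if (s->in_flight >= s->max_in_flight)
        return false;
    s->in_flight++;
    return true;
}

/* Give one request slot back once the request has completed. */
static void toy_release(struct toy_server *s)
{
    assert(s->in_flight > 0);
    s->in_flight--;
}

/* Run up to `iterations` open+unlink rounds. Each round sends one
 * ordinary request (slot taken and returned) and handles one oplock
 * break (slot taken, returned only if release_on_break is true).
 * Returns how many rounds completed before a request would block. */
static int toy_run(int max_in_flight, int iterations, bool release_on_break)
{
    struct toy_server s = { .in_flight = 0, .max_in_flight = max_in_flight };
    int done = 0;

    for (int i = 0; i < iterations; i++) {
        if (!toy_try_acquire(&s))   /* the open request */
            break;
        toy_release(&s);

        if (!toy_try_acquire(&s))   /* the oplock break handling */
            break;
        if (release_on_break)
            toy_release(&s);        /* the step the leak skips */

        done++;
    }
    return done;
}
```

In this model, toy_run(3, 50, false) stops after 3 rounds because every slot has leaked, while toy_run(3, 50, true) completes all 50. That is the same shape as the reproducer stalling partway through its 50-iteration loop, although the real limit and the exact leak path are in the attached patch, not in this sketch.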
Comment 3 Steve French 2019-04-17 13:39:04 UTC
I tested Ronnie's fix with the reproducer and it worked. Let us know if you see any other related problems.
Comment 4 Steve French 2019-04-17 13:40:06 UTC
Created attachment 282369
kernel patch to cifs.ko to fix cause of hang reported here