Summary: CIFS umount fails since 14302ee33 with some servers (exit code 32)
Product: File System
Reporter: Moritz Duge (duge)
Description Moritz Duge 2022-03-31 16:47:35 UTC
With upstream kernel 5.16.9, CIFS umount fails when using certain SMB servers. "umount" returns exit code 32 and the "mount" command still lists the mount as present. See below for the bad commit I've bisected.

The bug has been reproduced multiple times with upstream kernel 5.16.9! But additionally I've done much testing with openSUSE kernels. Here's the openSUSE bug report:
https://bugzilla.opensuse.org/show_bug.cgi?id=1194945

Additionally, with the same servers there's a problem showing the free space with the "df" command. But I haven't been able to find out whether this is really related to the umount problem.

= SMB Server =

I haven't been able to identify the exact server-side settings. But this problem occurred with at least this SMB server (with upstream kernel 5.16.9):
NetApp (Release 9.7P12) with dfs and CIFS mount options "vers=3.1.1,seal" (quota state unknown)

Additionally I've verified the bug with the openSUSE kernel 5.3.18-lp152.72-default and this SMB server:
Windows Server 2019 with dfs and quota enabled (no explicit "vers" or "seal" mount options)

Additionally the bug appeared with another NetApp SMB server (tested upstream 5.16.9) and two unknown servers (tested only openSUSE-15.2 kernels).

Also it looks like the bug may need a setup where the user can only read //server/share/username/ but has no permissions to read //server/share/.

= Bad Commit =

With the openSUSE kernel I bisected the problem down to this commit (6ae27f2b2) between openSUSE-15.2 kernels 5.3.18-lp152.69.1 and 5.3.18-lp152.72.1.
https://github.com/SUSE/kernel/commit/6ae27f2b260e91f16583bbc1ded3147e0f7c5d94

This commit is also present in the upstream kernel (14302ee33).
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=14302ee3301b3a77b331cc14efb95bf7184c73cc
And it has been merged between 5.11 and 5.12.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=d0df9aabefda4d0a64730087f939f53f91e29ee6

As said, I can't reproduce this with arbitrary SMB servers. And it's always a time-consuming procedure for me to do a test with the affected production SMB servers. But if you're really unhappy with the bisect search on the openSUSE kernel, I can repeat the test with the upstream commit 14302ee33 and its predecessor.
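For anyone who wants to check whether their setup shows the same symptom, here is a minimal sketch of the check described above: run "umount", record its exit code, and see whether the mount point is still listed afterwards. The function names (is_listed, try_umount) and the example mount point /mnt/cifs are assumptions made for illustration and do not come from the report. The sketch reads /proc/mounts rather than parsing the output of the "mount" command.

```python
import subprocess


def is_listed(mount_point, mounts_text):
    """Return True if mount_point is a mount target in /proc/mounts-style text."""
    target = mount_point.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        # Note: /proc/mounts escapes spaces in paths as \040; this sketch
        # does not decode them.
        if len(fields) >= 2 and fields[1] == target:
            return True
    return False


def try_umount(mount_point, run=subprocess.run, mounts_path="/proc/mounts"):
    """Run umount, then report (exit code, still listed as mounted)."""
    result = run(["umount", mount_point])
    with open(mounts_path) as f:
        still_listed = is_listed(mount_point, f.read())
    return result.returncode, still_listed
```

Called as try_umount("/mnt/cifs") with sufficient privileges, a result of (32, True) would match the behavior reported here. Other results only mean that this particular symptom did not show up on that system.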
Comment 1 Moritz Duge 2022-05-04 16:50:26 UTC
Update: I talked to Paulo (the author of the mentioned commits). I'll try to get network traces of that behavior for Paulo. But I may not get permission for that because of organizational regulations ... :-/ (still trying)

Sadly the bug only happens inside networks with some annoying security regulations. And as said, I haven't found a way to reproduce it. If anyone with more experience with NetApp or Windows Server has an idea how to reproduce this with a clean setup, please give me a hint.
Comment 2 Moritz Duge 2022-05-19 15:01:28 UTC
The problem disappeared with one of the latest openSUSE kernel updates. It must have been one of these:
2022-04-06: 5.3.18-150300.59.63
2022-05-05: 5.3.18-150300.59.68

I didn't do a bisect. But the following commit, contained in SUSE kernel 5.3.18-150300.59.68, seems most likely to me.
kernel.org: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=5d7e282541fc91b831a5c4477c5d72881c623df9
SUSE kernel: https://github.com/SUSE/kernel/commit/f6c7673fbee1985e8fcfe4936ca6d91852f86b13
SUSE kernel source: https://github.com/SUSE/kernel-source/commit/e7007189db138241fce6440c3bcfa084a0cf7c72
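To see whether a given machine already runs one of the kernels named above, a rough sketch like the following can compare release strings. The helper names (release_key, at_least) are made up for this example. The comparison is purely numeric, so it is only meaningful between kernels of the same openSUSE release series. It also assumes, without a bisect to confirm it, that 5.3.18-150300.59.68 is the first kernel without the problem.

```python
import platform
import re


def release_key(release):
    """Turn a release string like '5.3.18-150300.59.68-default' into a tuple of ints."""
    return tuple(int(n) for n in re.findall(r"\d+", release))


def at_least(release, minimum):
    """Return True if release is numerically at or above minimum."""
    return release_key(release) >= release_key(minimum)


# Assumed threshold: the later of the two candidate updates from this comment.
ASSUMED_FIXED = "5.3.18-150300.59.68"
running = platform.release()
print(running, "at or above", ASSUMED_FIXED, ":", at_least(running, ASSUMED_FIXED))
```

On a different distribution or release series the printed result says nothing about whether the fix is present.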