With upstream kernel 5.16.9, CIFS umount fails when using certain SMB servers.
"umount" returns exit code 32 and the "mount" command still lists the mount as being present.
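The symptom can be checked with something like the following. This is only a minimal sketch: the function names are made up for illustration, it assumes the mount table is available in /proc/mounts, and it only looks at the given mountpoint itself (with dfs there may be additional nested mounts, which are not handled here).

```python
import re
import subprocess


def unescape_mount_field(field):
    # /proc/mounts encodes space, tab, newline and backslash as
    # three-digit octal escapes such as \040.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def cifs_mount_listed(mounts_text, mountpoint):
    """Return True if mountpoint shows up as a cifs mount in
    /proc/mounts-style text (hypothetical helper)."""
    target = mountpoint.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # fields: device, mountpoint, fstype, options, ...
        if fields[2] == "cifs" and unescape_mount_field(fields[1]) == target:
            return True
    return False


def try_umount(mountpoint):
    """Run umount and return (exit code, still listed as cifs mount)."""
    result = subprocess.run(["umount", mountpoint])
    with open("/proc/mounts") as f:
        still_listed = cifs_mount_listed(f.read(), mountpoint)
    return result.returncode, still_listed
```

On an affected system, `try_umount("/mnt/user")` (the path is just an example) would be expected to return a non-zero exit code, 32 in my case, together with `True`.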
See below for the bad commit I've bisected.
The bug has been reproduced multiple times with upstream kernel 5.16.9!
In addition, I've done a lot of testing with openSUSE kernels.
Here's the openSUSE bugreport:
Additionally with the same servers there's a problem showing the free space with the "df" command. But I haven't been able to find out if this is really related to the umount problem.
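To look at the free space values without going through "df", the filesystem statistics can be queried directly. A minimal sketch, assuming Python's `os.statvfs` is usable on the mountpoint; the function name `free_space_report` is made up for illustration and I haven't verified what it returns on the affected servers.

```python
import os
import sys


def free_space_report(path):
    """Return total, free and available bytes for the filesystem
    containing path, as reported by statvfs."""
    st = os.statvfs(path)
    return {
        "total": st.f_blocks * st.f_frsize,
        "free": st.f_bfree * st.f_frsize,
        "available": st.f_bavail * st.f_frsize,
    }


if __name__ == "__main__":
    # Pass the CIFS mountpoint as the first argument; defaults to ".".
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    try:
        print(free_space_report(target))
    except OSError as err:
        print(f"statvfs failed for {target}: {err}")
```

If the call fails or returns zeros on the CIFS mountpoint but works on a local filesystem, that would at least narrow the "df" problem down to the statistics reported for the mount.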
= SMB Server =
I haven't been able to identify the exact server-side settings. But this problem occurred with at least this SMB server (with upstream kernel 5.16.9):
NetApp (Release 9.7P12) with dfs and CIFS mount options "vers=3.1.1,seal"
(quota state unknown)
Additionally I've verified the bug with the openSUSE kernel 5.3.18-lp152.72-default and this SMB server:
Windows Server 2019 with dfs and quota enabled
(no explicit "vers" or "seal" mount options)
Additionally the bug appeared with another NetApp SMB server (tested upstream 5.16.9) and two unknown servers (tested only openSUSE-15.2 kernels).
Also it looks like the bug may need a setup where the user can only read //server/share/username/ but has no permissions to read //server/share/.
= Bad Commit =
With the openSUSE kernel I bisected the problem down to this commit (6ae27f2b2) between openSUSE-15.2 kernels 5.3.18-lp152.69.1 and 5.3.18-lp152.72.1.
This commit is also present in the upstream kernel (14302ee33).
And it has been merged between 5.11 and 5.12.
As said, I can't reproduce this with arbitrary SMB servers. And it's always a time-consuming procedure for me to do a test with the affected production SMB servers. But if you're really unhappy with the bisect search on the openSUSE kernel, I can repeat the test with the upstream commit 14302ee33 and its predecessor.
I talked to Paulo (the author of the mentioned commits).
I'll try to get network traces of that behavior for Paulo.
But I may not get the permission for that because of organizational regulations ... :-/
Sadly the bug only happens inside networks with some annoying security regulations. And as said, I haven't found a way to reproduce it outside of them.
If anyone, who may have more experience with NetApp or Windows Server, has any idea how to reproduce this with a clean setup, please give me a hint.
The problem disappeared with one of the latest openSUSE kernel updates.
It must have been one of these:
I didn't do a bisect. But the following commit, contained in SUSE kernel 5.3.18-150300.59.68, seems most likely to me.
SUSE kernel source: