Bug 15440

Summary: alfresco smb shares won't list any file
Product: File System
Reporter: Javier Barroso (javibarroso)
Component: CIFS
Assignee: Jeff Layton (jlayton)
Status: CLOSED DOCUMENTED
Severity: normal
CC: jlayton, maciej.rutecki, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.33
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:    
Bug Blocks: 14885    
Attachments: mounting smb share from alfresco resulting in an empty dir
dmesg after echo 7 > /proc/fs/cifs/cifsFYI
patch -- turn on SMBFLG2_IS_LONG_NAME
strace of ls when alfresco is mounted without options
after turn-on SMBFLG2_IS_LONG_NAME patch applied

Description Javier Barroso 2010-03-04 09:02:09 UTC
Hi,

First: this is the same bug as #15427. I'll try to mark that one as a duplicate; this one is assigned to cifs, as Andrew told me.

I'm running Debian sid, and last Thursday I reported a bug [1].

Ben told me to report it here, so:

I have an alfresco (document manager) smb share (it is configured to auth
against an ldap which is the master in a PDC from SAMBA).

When I mount a share from Alfresco with kernel 2.6.30 it works fine, and the
ls command shows me all my files.

When I mount the same share from Alfresco with kernel 2.6.32 from sid or 2.6.33
from experimental, the mount succeeds, but ls doesn't show any files, and I
can't access any file (it looks as if they are not there).

I tried with both types:

mount -t cifs -o username=user //alfresco/alfresco ~/alfresco
mount -t smbfs -o username=user //alfresco/alfresco ~/alfresco

And the same result (as indicated previously)

Please tell me if you need more info. My config is the Debian sid default, and
the Debian bug report contains a lot of information collected by reportbug that
could be useful here (I don't want to copy it all and take up space here).

Thank you very much

[1] http://bugs.debian.org/571459
Comment 1 Javier Barroso 2010-03-04 09:04:07 UTC
*** Bug 15427 has been marked as a duplicate of this bug. ***
Comment 2 Jeff Layton 2010-03-04 13:41:46 UTC
Ok, doesn't surprise me that "cifs" and "smbfs" work similarly here -- I think debian has some sort of goop that turns smbfs mount attempts into cifs mounts.

Does this problem go away if you mount with '-o nomodeset' ?
Comment 3 Jeff Layton 2010-03-04 13:44:02 UTC
Hmm...I also see this in the logs on the debian bug:

[ 5051.884072] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
[ 5051.884082]  CIFS VFS: Send error in SessSetup = -13
[ 5051.884095]  CIFS VFS: cifs_mount failed w/return code = -13

...does a new set of these messages pop up whenever you try to mount, or were those places where you fat-fingered the password?
Comment 4 Javier Barroso 2010-03-04 14:55:57 UTC
Hi Jeff,

Those messages are from earlier; I suppose I mistyped my password.

# dmesg -c
$ mount -t cifs -o username=user,nomodeset //alfresco/alfresco ~/alfresco/
$ dmesg

[26123.836198] CIFS: Unknown mount option nomodeset

I searched Google for a nomodeset option in Samba, but had no luck; I only found the unrelated kernel parameter.

Can I try another parameter?

I will try booting SystemRescueCd, which has a 2.6.32.9 kernel, and I will report the results here.

Thanks
Comment 5 Jeff Layton 2010-03-04 14:57:36 UTC
Sorry...that's what I get for posting before I've fully woken up...

I meant to say, can you try mounting with '-o noserverino' ?
Comment 6 Javier Barroso 2010-03-04 15:15:28 UTC
It works !!

So is it a bug in Debian, or a configuration error?

Thank you very much !
Comment 7 Jeff Layton 2010-03-04 15:26:31 UTC
No, it's a kernel problem so we can work on it here. noserverino should be ok as a workaround for now.

What kind of server is this? The problem is likely an issue with how the server presents uniqueid's.

What would be very helpful would be to get a wire capture of a failed mount attempt. The instructions on how to do that are on this page:

    http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting

Start up the capture, attempt the mount without the '-o noserverino' option and then stop the capture. Then attach the capture file to this bug.
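
A rough sketch of such a capture session, assuming tcpdump is installed and the commands run as root. The share and mount point are the ones from the original report; the output path and the function name are made up for illustration, and the port filter assumes the server listens on the usual SMB ports (445 or 139).

```shell
# Placeholders: share and mount point come from the original report,
# OUT is a hypothetical path for the capture file.
SHARE=//alfresco/alfresco
MNT="$HOME/alfresco"
OUT=/tmp/cifs-failed-mount.pcap

capture_failed_mount() {
    # Capture full packets (-s 0) on the usual SMB ports into $OUT.
    tcpdump -s 0 -w "$OUT" port 445 or port 139 &
    cap_pid=$!
    sleep 1                         # give tcpdump a moment to start

    # Reproduce the failure: mount WITHOUT noserverino, list, unmount.
    mount -t cifs -o username=user "$SHARE" "$MNT"
    ls "$MNT"
    umount "$MNT"

    # Stop the capture; tcpdump flushes the file when it is terminated.
    kill "$cap_pid"
    wait "$cap_pid" 2>/dev/null
}

# Run as root:
#   capture_failed_mount
```

The resulting file can then be compressed and attached to the bug.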
Comment 8 Javier Barroso 2010-03-04 17:23:50 UTC
Hi,

Alfresco [1] has its own SMB protocol implementation [2]; I don't know whether it is a standard Java one.

I'm uploading a session (cifs-alfresco-bug.dump.gz).

1. mount without any option
2. ls on mount point (empty)
3. umount

My wireshark told me:

Stopped processing module RFC1213-MIB due to error(s) to prevent potential crash in libsmi.
Module's conformance level: 1.
See details at: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=560325

but I think this is another issue ;)

Thank you 

[1] http://alfresco.com
[2] http://wiki.alfresco.com/wiki/CIFS_linux
Comment 9 Javier Barroso 2010-03-04 17:25:50 UTC
Created attachment 25356 [details]
mounting smb share from alfresco resulting in an empty dir

Captured with tcpdump:

mount
ls
umount

Thanks
Comment 10 Javier Barroso 2010-03-04 17:47:02 UTC
Created attachment 25358 [details]
dmesg after echo 7 > /proc/fs/cifs/cifsFYI

dmesg -c
echo 7 > /proc/fs/cifs/cifsFYI
mount 
umount
echo 0 > /proc/fs/cifs/cifsFYI
dmesg > cifs-alfresco-bug.dmesg.gz

Thanks
Comment 11 Javier Barroso 2010-03-04 17:48:07 UTC
I hope the dmesg output may be useful here, too.

Thank you
Comment 12 Jeff Layton 2010-03-04 18:52:56 UTC
Thanks, that helps...

It looks like the server is sending back a very strange error:

SMB	Trans2 Response, FIND_FIRST2, Error: Unknown SRV error (ff)

This actually looks like a server bug, you may want to send this capture to the Alfresco people and have them fix it -- they should be sending back a sane error code.

We probably have a client side bug here as well. I'm not sure that we should be trying to use the FIND_FIRST2 infolevel that we are against this server, based on what it sent in the NegotiateProtocol response. Let me do a bit more research and I'll get back to you.

For now, you should be able to use '-o noserverino' to work around the issue.
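
Spelled out against the share from the original report, the workaround might look like the sketch below. The function name is made up for illustration, and the share, mount point and username are the reporter's; adjust them to your setup.

```shell
# Values taken from the original report; adjust as needed.
SHARE=//alfresco/alfresco
MNT="$HOME/alfresco"

mount_without_server_inodes() {
    # noserverino: the client generates inode numbers itself instead of
    # using the ones reported by the server.
    mount -t cifs -o username=user,noserverino "$SHARE" "$MNT"
}

# Run as root:
#   mount_without_server_inodes && ls "$MNT"
```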
Comment 13 Javier Barroso 2010-03-04 19:32:34 UTC
I posted in their forum; I hope they tell me where I should post the question (Alfresco community edition are archived in their bugtracker)...

http://forums.alfresco.com/en/viewtopic.php?f=47&t=25451

Thanks for your effort
Comment 14 Jeff Layton 2010-03-05 18:52:12 UTC
Created attachment 25370 [details]
patch -- turn on SMBFLG2_IS_LONG_NAME

I really sort of doubt this will make any difference, but I noticed in the capture that we're not setting this bit in the header. You might want to try patching the kernel with this patch and then testing against this server again. Maybe it'll behave better?
Comment 15 Rafael J. Wysocki 2010-03-05 19:43:21 UTC
Handled-By : Jeff Layton <jlayton@redhat.com>
Comment 16 Jeff Layton 2010-03-05 20:12:32 UTC
To summarize...

This appears to be a server-side bug. The server is sending back an invalid error code in response to a SMB_FIND_FILE_ID_FULL_DIR_INFO infolevel TRANS2_FIND_FIRST2 request. This causes the client to do the best it can when translating the error and it returns an -EIO.

It looks like the application calls back down with another readdir. At that point, we have an initiated search already so it tries to do a FIND_NEXT2 request. That apparently translates any error into -ENOENT (which seems a little broken, actually, but seems to be a somewhat benign bug).

What might be interesting to see is an strace of the "ls" command after mounting without -o noserverino. I'd like to see what the application is actually doing here -- it seems like it's calling readdir twice so I have to wonder whether it might not be getting back the original -EIO for some reason.

We also have a (seemingly benign) bug where we're not setting a flag in the header properly, but it's not likely that that's a real problem here.

Either way though, the proper fix will be to the server, but '-o noserverino' should help serve as a workaround for now.
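
One way to collect such a trace, sketched under the assumption that strace is installed and the share is already mounted at ~/alfresco without noserverino. The log path and the function name are hypothetical.

```shell
MNT="$HOME/alfresco"            # mount point from the original report
LOG=/tmp/ls-alfresco.strace     # hypothetical output path

trace_ls() {
    # Record every system call ls makes, including the getdents calls
    # that back readdir(), together with their return values.
    strace -o "$LOG" ls "$MNT"

    # Show only the directory-reading calls and what they returned.
    grep getdents "$LOG"
}

# Usage:
#   trace_ls
```

The return value of each getdents line shows whether the kernel reported an error or an (apparently) empty directory to the application.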
Comment 17 Javier Barroso 2010-03-06 12:51:36 UTC
Created attachment 25383 [details]
strace of ls when alfresco is mounted without options

I'm attaching strace log from ls as requested

I'll try to recompile 2.6.33 kernel with your patch

Thank you very much
Comment 18 Javier Barroso 2010-03-08 08:07:04 UTC
Created attachment 25404 [details]
after turn-on SMBFLG2_IS_LONG_NAME patch applied

The same result with patch applied.

I'm attaching a tar.gz with 3 files:

- tcpdump trace
- dmesg messages
- ls strace

Thanks
Comment 19 Jeff Layton 2010-03-08 12:08:17 UTC
Ok, not terribly surprising. I think this server just doesn't support that infolevel properly, and doesn't return the correct sort of error.

The interesting thing though is the strace output you sent in comment #17. That shows the first getdents() call returning success, which seems wrong -- I think that probably ought to return an error. So we may have a minor bug here. Fixing that however won't make this server start working correctly -- it should just make it so that an error is returned more quickly.
Comment 20 Javier Barroso 2010-03-21 20:28:36 UTC
Hi,

Why is this bug invalid ?

With 2.6.30 the mount works fine, but with 2.6.3[2-3] it doesn't work (without the option mentioned by Jeff). Something in the code changed and caused this behaviour, didn't it?

Thank you very much
Comment 21 Rafael J. Wysocki 2010-03-21 20:34:41 UTC
This is a server problem.

Unfortunately there's no "this is not a kernel issue" option to give as the reason for closing, so I used "invalid".  Changing to "documented", but the outcome still is we don't think that's a bug in the kernel.
Comment 22 Javier Barroso 2010-03-25 11:52:09 UTC
Hi again,

I discovered another regression:

$ # uname -r
2.6.32-3-686
$ sudo mount -o username=user,noserverino //alfresco/alfresco /home/user/alfresco/
$ touch alfresco/a
touch: cannot touch `alfresco/a': Permission denied
$ su 
# touch /home/user/alfresco/a # works fine

$ # uname -r
2.6.30-1-686
$ sudo mount -o username=user,noserverino //alfresco/alfresco /home/user/alfresco/
$ touch alfresco/a # works fine

Is this another kernel issue, or maybe caused by sudo ?

Should I change status from this ticket ? I guess yes, but I'm not sure .. so I don't touch 

Thank you very much
Comment 23 Jeff Layton 2010-03-25 12:24:38 UTC
(In reply to comment #22)
> Should I change status from this ticket ? I guess yes, but I'm not sure .. so
> I don't touch
> 

No, I think you should open a new bug for this. There's no reason to believe that this problem is related to the original problem you reported here.