The Linux client mishandles the following response, which includes the opaque FileId value 0x48900000489. With this value, the Linux client misses the entry (it does not show up in the output of the ls command).

+ Tcp: Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=46874, PayloadLen=376, Seq=3277917 - 3278293, Ack=1359064130, Win=16384
+ SMBOverTCP: Length = 372
- Smb: R; Transact2, Find First2, ID Full Directory Info
    Protocol: SMB
    Command: Transact2 50(0x32)
  + NTStatus: 0x0, Facility = FACILITY_SYSTEM, Severity = STATUS_SEVERITY_SUCCESS, Code = (0) STATUS_SUCCESS
  + SMBHeader: Response, TID: 0x003F, PID: 0x6319, UID: 0x003F, MID: 0x7FDD
  - RTransaction2:
      WordCount: 10 (0xA)
      TotalParameterCount: 10 (0xA)
      TotalDataCount: 304 (0x130)
      Reserved: 0 (0x0)
      ParameterCount: 10 (0xA)
      ParameterOffset: 56 (0x38)
      ParamDisplacement: 0 (0x0)
      DataCount: 304 (0x130)
      DataOffset: 68 (0x44)
      DataDisplacement: 0 (0x0)
      SetupCount: 0 (0x0)
      Reserved2: 0 (0x0)
      ByteCount: 317 (0x13D)
      Pad1: Binary Large Object (1 Bytes)
    + FindFirst2ParameterBlock:
      Pad2: Binary Large Object (2 Bytes)
    - IDFullDirInfo: .
        NextEntryOffset: 88 (0x58)
        FileIndex: 8 (0x8)
        CreationTime: 09/10/2010, 07:45:55.000000 UTC
        LastAccessTime: 09/10/2010, 14:13:53.000000 UTC
        LastWriteTime: 09/10/2010, 07:48:50.000000 UTC
        LastChangeTime: 09/10/2010, 07:48:50.000000 UTC
      - EndOfFile: 0
          LargeInteger: 0 (0x0)
      - AllocationSize: 0
          LargeInteger: 0 (0x0)
      + ExtFileAttributes: 0x0010
        FileNameLength: 2 (0x2)
        EaSize: 0 (0x0)
        Reserved: 0 (0x0)
      - FileId: 1161
          LargeInteger: 4952097293449 (0x48100000489)
      + FileName: .
        EntryPad: Binary Large Object (6 Bytes)
    - IDFullDirInfo: ..
        NextEntryOffset: 88 (0x58)
        FileIndex: 12 (0xC)
        CreationTime: 09/10/2010, 07:45:47.000000 UTC
        LastAccessTime: 09/10/2010, 09:29:06.000000 UTC
        LastWriteTime: 09/10/2010, 07:48:52.000000 UTC
        LastChangeTime: 09/10/2010, 07:48:52.000000 UTC
      - EndOfFile:
          LargeInteger: 0 (0x0)
      - AllocationSize: 0
          LargeInteger: 0 (0x0)
      + ExtFileAttributes: 0x0010
        FileNameLength: 4 (0x4)
        EaSize: 0 (0x0)
        Reserved: 0 (0x0)
      - FileId: 1161
          LargeInteger: 4599909975177 (0x42F00000489)
      + FileName: ..
        EntryPad: Binary Large Object (4 Bytes)
    - IDFullDirInfo: ProhibitUnusedCapture.pm
        NextEntryOffset: 0 (0x0)
        FileIndex: 500 (0x1F4)
        CreationTime: 09/10/2010, 07:45:56.000000 UTC
        LastAccessTime: 09/10/2010, 07:56:54.000000 UTC
        LastWriteTime: 09/07/2009, 21:37:07.000000 UTC
        LastChangeTime: 09/10/2010, 07:45:56.000000 UTC
      - EndOfFile: 13508
          LargeInteger: 13508 (0x34C4)
      - AllocationSize: 16384
          LargeInteger: 16384 (0x4000)
      + ExtFileAttributes: 0x0020
        FileNameLength: 48 (0x30)
        EaSize: 0 (0x0)
        Reserved: 0 (0x0)
      - FileId: 1161
          LargeInteger: 4986457031817 (0x48900000489) <<<<<<<<
      + FileName: ProhibitUnusedCapture.pm

The same response with FileId=0x489 works around the problem.
+ Tcp: Flags=...AP..., SrcPort=Microsoft-DS(445), DstPort=46874, PayloadLen=376, Seq=3275589 - 3275965, Ack=1359062598, Win=16384
+ SMBOverTCP: Length = 372
- Smb: R; Transact2, Find First2, ID Full Directory Info
    Protocol: SMB
    Command: Transact2 50(0x32)
  + NTStatus: 0x0, Facility = FACILITY_SYSTEM, Severity = STATUS_SEVERITY_SUCCESS, Code = (0) STATUS_SUCCESS
  + SMBHeader: Response, TID: 0x003F, PID: 0x6317, UID: 0x003F, MID: 0x7FD4
  - RTransaction2:
      WordCount: 10 (0xA)
      TotalParameterCount: 10 (0xA)
      TotalDataCount: 304 (0x130)
      Reserved: 0 (0x0)
      ParameterCount: 10 (0xA)
      ParameterOffset: 56 (0x38)
      ParamDisplacement: 0 (0x0)
      DataCount: 304 (0x130)
      DataOffset: 68 (0x44)
      DataDisplacement: 0 (0x0)
      SetupCount: 0 (0x0)
      Reserved2: 0 (0x0)
      ByteCount: 317 (0x13D)
      Pad1: Binary Large Object (1 Bytes)
    + FindFirst2ParameterBlock:
      Pad2: Binary Large Object (2 Bytes)
    - IDFullDirInfo: .
        NextEntryOffset: 88 (0x58)
        FileIndex: 8 (0x8)
        CreationTime: 09/10/2010, 07:45:55.000000 UTC
        LastAccessTime: 09/10/2010, 14:13:40.000000 UTC
        LastWriteTime: 09/10/2010, 07:48:50.000000 UTC
        LastChangeTime: 09/10/2010, 07:48:50.000000 UTC
      - EndOfFile: 0
          LargeInteger: 0 (0x0)
      - AllocationSize: 0
          LargeInteger: 0 (0x0)
      + ExtFileAttributes: 0x0010
        FileNameLength: 2 (0x2)
        EaSize: 0 (0x0)
        Reserved: 0 (0x0)
      - FileId: 1161
          LargeInteger: 1161 (0x489)
      + FileName: .
        EntryPad: Binary Large Object (6 Bytes)
    - IDFullDirInfo: ..
        NextEntryOffset: 88 (0x58)
        FileIndex: 12 (0xC)
        CreationTime: 09/10/2010, 07:45:47.000000 UTC
        LastAccessTime: 09/10/2010, 09:29:06.000000 UTC
        LastWriteTime: 09/10/2010, 07:48:52.000000 UTC
        LastChangeTime: 09/10/2010, 07:48:52.000000 UTC
      - EndOfFile: 0
          LargeInteger: 0 (0x0)
      - AllocationSize: 0
          LargeInteger: 0 (0x0)
      + ExtFileAttributes: 0x0010
        FileNameLength: 4 (0x4)
        EaSize: 0 (0x0)
        Reserved: 0 (0x0)
      - FileId: 1161
          LargeInteger: 1161 (0x489)
      + FileName: ..
        EntryPad: Binary Large Object (4 Bytes)
    - IDFullDirInfo: ProhibitUnusedCapture.pm
        NextEntryOffset: 0 (0x0)
        FileIndex: 500 (0x1F4)
        CreationTime: 09/10/2010, 07:45:56.000000 UTC
        LastAccessTime: 09/10/2010, 07:56:54.000000 UTC
        LastWriteTime: 09/07/2009, 21:37:07.000000 UTC
        LastChangeTime: 09/10/2010, 07:45:56.000000 UTC
      - EndOfFile: 13508
          LargeInteger: 13508 (0x34C4)
      - AllocationSize: 16384
          LargeInteger: 16384 (0x4000)
      + ExtFileAttributes: 0x0020
        FileNameLength: 48 (0x30)
        EaSize: 0 (0x0)
        Reserved: 0 (0x0)
      - FileId: 1161
          LargeInteger: 1161 (0x489) <<<<<<<<
      + FileName: ProhibitUnusedCapture.pm
Thanks for the bug report. Generally we use the FileId as the inode number. The exception is on 32-bit machines, where we have to hash it down to a 32-bit number, but that shouldn't prevent the entry from showing up in a readdir listing. What arch is the client that you're using here? Any chance you could provide a raw capture of this info? I'd like to look at it carefully with Wireshark...
Created attachment 31902 [details] capture of the client <-> server

Please find the two cases in this attachment. The first is nominal (frame 137): the client requests an SMB QueryPathInfo for the entry. The second case (frame 758) is the same request without our trick: the Index Number is 0x0000048900000489, and so this time the client doesn't request the QueryPathInfo.
Created attachment 134391 [details] tcpdump file of FIND_FIRST2 response
I'm not sure this issue is solved. On two occasions I have found that one file is missing from a file listing of a 64-bit Win2008 server. We never saw this when connecting to older Windows machines. We list and copy 100000+ files from Windows to Linux daily. The issue seems to be the same as the one Eric reported: the high and low parts of the Index Number / FileId (which seems to be used as the inode number) are the same. In my case it is "Index Number: 0x001a0000001a0000". Eric's number follows the same pattern. This was tested on kernel 3.2.54. Please take a look at attachment 134391 [details]; it contains the raw pcap dump corresponding to this "ls -la":

root@cs1:/home/po# ll /mnt/
total 2348
drwxr-xr-x 1 root root 1855488 Apr 29 19:19 .
drwxr-xr-x 22 root root 4096 Jun 18 2013 ..
drwxr-xr-x 0 root root 0 Feb 22 01:54 123455
drwxr-xr-x 0 root root 0 Feb 22 01:54 139345
drwxr-xr-x 0 root root 0 Feb 22 01:54 200865
drwxr-xr-x 0 root root 0 Feb 22 01:54 220814
-rwxr-xr-x 0 root root 12078 Apr 29 13:45 227074_SogXml_176_5669052549_2014042913403231.tif
-rwxr-xr-x 0 root root 2253 Apr 29 13:45 227074_SogXml_176_5669052549_2014042913403231.xml
-rwxr-xr-x 0 root root 12316 Apr 29 13:31 227074_SogXml_176_6230340199_2014042913263995.tif
-rwxr-xr-x 0 root root 2264 Apr 29 13:31 227074_SogXml_176_6230340199_2014042913263995.xml
-rwxr-xr-x 0 root root 13154 Apr 29 13:39 227074_SogXml_176_6484810269_2014042913343266.tif
-rwxr-xr-x 0 root root 2309 Apr 29 13:39 227074_SogXml_176_6484810269_2014042913343266.xml
-rwxr-xr-x 0 root root 12480 Apr 29 13:31 227074_SogXml_176_9263533623_2014042913263973.tif
-rwxr-xr-x 0 root root 2342 Apr 29 13:31 227074_SogXml_176_9263533623_2014042913263973.xml
-rwxr-xr-x 0 root root 11624 Apr 29 13:08 227074_SogXml_176_9288946628_2014042913042497.tif
-rwxr-xr-x 0 root root 2238 Apr 29 13:08 227074_SogXml_176_9288946628_2014042913042497.xml
-rwxr-xr-x 0 root root 12234 Apr 29 16:28 227074_SogXml_177_6484495426_2014042916243070.tif
-rwxr-xr-x 0 root root 2273 Apr 29 16:28 227074_SogXml_177_6484495426_2014042916243070.xml
-rwxr-xr-x 0 root root 11702 Apr 29 16:28 227074_SogXml_177_6484790628_2014042916243120.tif
-rwxr-xr-x 0 root root 2259 Apr 29 16:28 227074_SogXml_177_6484790628_2014042916243120.xml
-rwxr-xr-x 0 root root 12894 Apr 29 16:28 227074_SogXml_177_8535381076_2014042916243099.tif
-rwxr-xr-x 0 root root 2403 Apr 29 16:28 227074_SogXml_177_8535381076_2014042916243099.xml
-rwxr-xr-x 0 root root 12206 Apr 29 12:02 227074_SogXml_177_9309621150_2014042911583252.tif
-rwxr-xr-x 0 root root 2253 Apr 29 12:02 227074_SogXml_177_9309621150_2014042911583252.xml
drwxr-xr-x 0 root root 0 Feb 22 01:54 234567
drwxr-xr-x 0 root root 0 Feb 22 01:55 235267
drwxr-xr-x 0 root root 0 Feb 22 01:55 361196
drwxr-xr-x 0 root root 0 Feb 22 01:55 421713
drwxr-xr-x 0 root root 0 Jan 26 2013 422378
drwxr-xr-x 0 root root 0 Feb 22 01:55 431718
drwxr-xr-x 0 root root 0 Feb 22 01:55 490011
drwxr-xr-x 0 root root 0 Feb 22 01:55 543983
-rwxr-xr-x 0 root root 14962 Apr 29 15:59 move.txt
-rwxr-xr-x 0 root root 363008 Apr 29 16:08 Thumbs.db

stat of file:

root@cs1:/home/po# stat /mnt/422147_SogXml_177_4069079814_2014042314353706.xml
  File: `/mnt/422147_SogXml_177_4069079814_2014042314353706.xml'
  Size: 2262 Blocks: 8 IO Block: 16384 regular file
Device: 15h/21d Inode: 7318349396180992 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2014-04-23 14:40:38.069000000 +0200
Modify: 2014-04-23 14:40:38.069000000 +0200
Change: 2014-04-29 16:05:44.142000000 +0200
 Birth: -
OK, I see the problem. On a 32-bit arch, this function comes out to be 0 when the top and bottom 32-bit words are identical:

    static inline ino_t
    cifs_uniqueid_to_ino_t(u64 fileid)
    {
        ino_t ino = (ino_t) fileid;
        if (sizeof(ino_t) < sizeof(u64))
            ino ^= fileid >> (sizeof(u64)-sizeof(ino_t)) * 8;
        return ino;
    }

When you do that on a 32-bit arch:

    cifs_uniqueid_to_ino_t(0x001a0000001a0000) == 0

The kernel likely still returns that entry in readdir(), but I think ls may then skip printing it out.
Created attachment 134451 [details] patch -- cifs: fix cifs_uniqueid_to_ino_t not to ever return 0 So anyway, I'm assuming your client is a 32-bit arch here? If so, then could you test this patch against this server and see if it helps. Be forewarned that the st_ino values of the dentries on this share will be different with this patch.
Yes, you are right, this is a 32-bit machine. Your patch is almost working, good! It is missing one line of code, the "return ino;":

    #if BITS_PER_LONG == 64
    static inline ino_t
    cifs_uniqueid_to_ino_t(u64 fileid)
    {
        return (ino_t)fileid;
    }
    #else
    static inline ino_t
    cifs_uniqueid_to_ino_t(u64 fileid)
    {
        ino_t ino;
        ino = hash_64(fileid, (sizeof(ino_t) * 8) - 1) + 1;
        /***********************/
        return ino;
        /***********************/
    }
    #endif

Is this hash_64 safe? What is the risk of collision? A collision would generate the same kind of error, a missing directory item, wouldn't it? A stat on the file returns the same inode number as before, so this code doesn't seem to be used anywhere other than in readdir(). When would your statement "Be forewarned that the st_ino values of the dentries on this share will be different with this patch." be a problem? Can this cause problems for others? Thanks!
Doh! Well spotted. I'll fix that before I send it out. Yes, we do need to return the value. hash_64 is likely to be safer and have better distribution than the current hashing routine. This scheme does reduce the range of possible values a bit, from 2^32-1 to 2^31, but that's not likely to matter much for most folks. As always when hashing, there is a risk of collision, but there was one before. The kernel doesn't really care about the st_ino/i_ino value, but some userland apps may. Not much we can reasonably do about that, unfortunately. If you care about the st_ino values being reported to userland, then move to a 64-bit OS. As far as the inode numbers not changing, it may just be an artifact of how the hashing works in this case.
I'm not sure about this: "As far as the inode numbers not changing, it may just be an artifact of how the hashing works in this case." If you are referring to what I wrote about the inode number reported by stat: stat seems to report a 64-bit value even on a 32-bit OS. pcap/Wireshark reports "Index Number: 0x001a0000001a0000" and stat reports "Inode: 7318349396180992", which is the same value, so it does not seem to be hashed at all. This value is the raw "Index Number" both before and after the patch.
The way stat() works on 32-bit arches is somewhat complicated... glibc always calls the stat64() syscall. The stat program on your machine is almost certainly compiled with -D_FILE_OFFSET_BITS=64, which means that the struct stat passed in has a 64-bit field for st_ino. Thus you'll see the "raw" 64-bit value with that program. Where you likely will see the numbers change is in the inode numbers reported in dirent->d_ino returned by readdir(3).
Thanks for the clarification and great thanks for your cooperation. Best regards Per-Ola
Created attachment 134731 [details] patch -- cifs: fix cifs_uniqueid_to_ino_t not to ever return 0 Revised patch. Fix the 32-bit version of the function to actually return a value.