[Samba] Re: Large numbers of files in a directory - take #2 :-)

Michael Lueck mlueck at lueckdatasystems.com
Thu Feb 3 20:21:41 GMT 2005


Jeremy Allison wrote:

> The secret to this is really in the "case sensitive = True"
> line - it tells smbd never to scan for case-insensitive
> versions of names. So if an application asks for a file
> called "FOO", and it can't be found by a simple stat call,
> then smbd will return file not found immediately without
> scanning the containing directory for a version of a different
> case. The other "xxx case xxx" lines make this work by forcing
> a consistent case on all files created by smbd.

Hang on here... Windows app asks for file "Foo" and under this proposal it will not be found?

If so, could this create an issue where a Windows app writes "Foo" successfully, yet goes back to read it and is told it is not there?

This is starting to sound like the M$ FTPD, which is case sensitive only for the DIR and LS commands, while GET and PUT are case insensitive. So you can get into situations where, if "FOO" is the existing file on the server and you upload "Foo" over the top of it, then dir/ls for "Foo" to check that the size matches, you are told the file is not there. MOST ANNOYING to code around M$ FTPD, to say the least.

I recently was dealing with this case issue and came up with the following scheme which I will try for a while.

1) Upper case the entire directory/file name
2) CRC32 that string
3) Store that hash along with file data

When a request comes in for a file, again 1) upper case the entire directory/file name and 2) CRC32 that string, then check the result against the list of known files to see if there is a match. Step through hash collisions as needed. Scanning my own desktop I already hit one collision using this scheme, but one on a full hard drive is not bad! ;-)
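
A minimal sketch of the scheme described above, in Python, might look like the following. It is only an illustration under stated assumptions: the names name_key and CaseInsensitiveIndex are hypothetical and not taken from the actual program, the index is a plain in-memory dict rather than anything stored "along with file data", and UTF-8 is assumed when turning the name into bytes for CRC32.

```python
import zlib
from collections import defaultdict


def name_key(path):
    """Steps 1 and 2: upper case the whole path, then CRC32 that string."""
    # Assumption: names are encoded as UTF-8 before hashing.
    return zlib.crc32(path.upper().encode("utf-8"))


class CaseInsensitiveIndex:
    """Hypothetical index mapping CRC32 keys to the stored file names."""

    def __init__(self):
        # One bucket per hash value; a bucket holds every stored name
        # that produced that hash, so collisions can be stepped through.
        self._buckets = defaultdict(list)

    def add(self, path):
        """Step 3: record the name under its hash; return the stored name."""
        bucket = self._buckets[name_key(path)]
        for stored in bucket:
            if stored.upper() == path.upper():
                # Same file, different case: keep the existing entry.
                return stored
        bucket.append(path)
        return path

    def find(self, path):
        """Look up a requested name regardless of case; None if unknown."""
        wanted = path.upper()
        # Step through the bucket so a CRC32 collision between two
        # different names is never mistaken for a match.
        for stored in self._buckets.get(name_key(path), []):
            if stored.upper() == wanted:
                return stored
        return None


index = CaseInsensitiveIndex()
index.add("/data/Foo")
print(index.find("/DATA/FOO"))
print(index.find("/data/bar"))
```

Comparing the upper-cased names inside each bucket is what makes collisions harmless: the CRC32 only narrows the search, and the final string comparison decides whether the file is really there.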

Anyway, I came up with the above to avoid developing something like the case guessing code found in Samba. (Still considering how to deal with M$ FTPD when I get to the FTP I/O part of my program.)

-- 
Michael Lueck
Lueck Data Systems

Remove the upper case letters NOSPAM to contact me directly.


