[Samba] Directory with large number of files (follow-up)

paul.r.schenk at accenture.com paul.r.schenk at accenture.com
Tue Jul 23 11:02:02 GMT 2002


Hello all,

This is a follow-up to my post a few weeks ago about poor performance when
serving files from a directory with a large number of files (in this case
over 600 000 files).

I traced this down to two places in the code:

1) The trans2 routine get_lanman2_dir_entry loops through the entire
directory looking for possible matches. I can see why this is the best idea
for the general case (better than doing all possible variations on the
name). However, with this many files it takes quite a long time to loop
through the whole directory (over half a million files). I patched this
routine to only look for the name exactly as supplied. This breaks backward
compatibility and forces case sensitivity, but in this case I have control
over what files are being asked for (it's an application we maintain).

2) The routine OpenDir in dir.c creates a Dir structure that contains every
directory entry. It even does this if 'dont descend' is set for the
directory (this must be a bug).  I patched OpenDir to return after
retrieving at most 50 entries. Since I don't loop through the list to get
my files (see point 1 above), this is not a loss.
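
To make the trade-off in point 1 concrete, here is a rough sketch in plain
POSIX C. It is not the Samba code: find_by_scan and find_exact are
hypothetical names made up for illustration. The first walks the directory
and compares names case-insensitively, so its cost grows with the number of
entries. The second asks the filesystem for exactly the name supplied, which
is one stat() call but is only as case-insensitive as the filesystem itself.

```c
#define _POSIX_C_SOURCE 200809L  /* for strcasecmp() */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <sys/stat.h>

/* Hypothetical helper: case-insensitive lookup by walking the directory.
 * Returns 1 on a match (real name copied to 'found'), 0 if no entry
 * matches, -1 if the directory cannot be opened. */
int find_by_scan(const char *dir, const char *wanted,
                 char *found, size_t found_len)
{
    assert(dir != NULL && wanted != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int hit = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcasecmp(e->d_name, wanted) == 0) {
            if (found != NULL && found_len > 0)
                snprintf(found, found_len, "%s", e->d_name);
            hit = 1;
            break;
        }
    }
    closedir(d);
    return hit;
}

/* Hypothetical helper: look for the name exactly as supplied.
 * Returns 1 if dir/wanted exists, 0 if not, -1 if the path is too long. */
int find_exact(const char *dir, const char *wanted)
{
    assert(dir != NULL && wanted != NULL);

    char path[4096];
    int n = snprintf(path, sizeof path, "%s/%s", dir, wanted);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    struct stat st;
    return stat(path, &st) == 0 ? 1 : 0;
}
```

With a file stored as Report.TXT, find_by_scan(dir, "report.txt", ...)
finds it after reading some or all of the directory, while find_exact only
succeeds when the caller already uses the stored spelling (or the
filesystem ignores case).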

So now I can open any file using Samba on my HP D380/2 in the same amount
of time it takes a Pentium II thing running NT4 to serve the files.

As an aside, 'dont descend' seems only marginally useful. Given my walk
through the code, every directory is scanned in its entirety at least
twice (by OpenDir) before a decision not to show any files is made. If it
takes over 1 minute to scan the directory, you can say good-bye to your CPU
pretty quickly. This option prevents browsing, but I've seen some requests
for \dirname\* that caused get_lanman2_dir_entry to find the matches for
this (I would have expected 'dont descend' to stop this).

So what do I suggest in general? A 'max compatible number' would be a good
option. It would work something like:

in smb.conf:

max compatible number = 5000

This would do two things:

1) If OpenDir gets more than the given number of files, it will abort with
only the partial list.
2) If get_lanman2_dir_entry loops through this many files with no match, it
will give up and just try for the name exactly as supplied.
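
The first behaviour (a capped directory read) could look roughly like the
sketch below. This is standalone POSIX C, not a patch against dir.c: the
struct dir_list type and the read_dir_capped and free_dir_list functions
are hypothetical names, and the real Dir structure in Samba looks different.
The point is only that the read stops once the cap is reached and records
that the list is partial.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cached directory listing. */
struct dir_list {
    char **names;    /* heap-allocated copies of the entry names */
    size_t count;    /* number of names stored */
    int truncated;   /* 1 if the cap was hit before the directory ended */
};

void free_dir_list(struct dir_list *l)
{
    for (size_t i = 0; i < l->count; i++)
        free(l->names[i]);
    free(l->names);
    l->names = NULL;
    l->count = 0;
}

/* Read at most 'cap' entries from 'path' ("." and ".." count if the
 * system returns them). Returns 0 on success, -1 on error. */
int read_dir_capped(const char *path, size_t cap, struct dir_list *out)
{
    assert(path != NULL && out != NULL);
    out->names = NULL;
    out->count = 0;
    out->truncated = 0;

    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    if (cap > 0) {
        out->names = calloc(cap, sizeof *out->names);
        if (out->names == NULL) {
            closedir(d);
            return -1;
        }
    }

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (out->count == cap) {     /* more entries exist: stop here */
            out->truncated = 1;
            break;
        }
        out->names[out->count] = strdup(e->d_name);
        if (out->names[out->count] == NULL) {
            free_dir_list(out);
            closedir(d);
            return -1;
        }
        out->count++;
    }
    closedir(d);
    return 0;
}
```

A caller would pass the configured value as 'cap' and check 'truncated' to
know whether a later name lookup can trust the list or has to go to the
filesystem directly.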

This would let Samba deal with these strange cases of lots of files and
still keep all the old clients happy in most cases.  If anybody else thinks
this might be of use, I can try to pretty up my hacks to implement this.
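
For the second behaviour, the give-up rule itself is small. The sketch
below shows it over an in-memory list of names so the logic is easy to
follow; scan_with_limit and the scan_result values are hypothetical and do
not come from the Samba source. On SCAN_GAVE_UP the caller would then try
the name exactly as supplied, for example with a single stat().

```c
#define _POSIX_C_SOURCE 200809L  /* for strcasecmp() */
#include <assert.h>
#include <stddef.h>
#include <strings.h>

enum scan_result {
    SCAN_MATCH,     /* a case-insensitive match was found within the limit */
    SCAN_NO_MATCH,  /* whole list examined, nothing matched */
    SCAN_GAVE_UP    /* limit reached with entries left: try the exact name */
};

/* Compare 'wanted' against at most 'limit' names, ignoring case.
 * On SCAN_MATCH the index of the hit is stored in *match_index
 * (if match_index is not NULL). */
enum scan_result scan_with_limit(const char *const *names, size_t count,
                                 const char *wanted, size_t limit,
                                 size_t *match_index)
{
    assert(names != NULL && wanted != NULL);

    for (size_t i = 0; i < count; i++) {
        if (i >= limit)
            return SCAN_GAVE_UP;
        if (strcasecmp(names[i], wanted) == 0) {
            if (match_index != NULL)
                *match_index = i;
            return SCAN_MATCH;
        }
    }
    return SCAN_NO_MATCH;
}
```

Directories smaller than the limit behave exactly as before, which is why
old clients should stay happy in most cases; only the oversized ones lose
the case-insensitive matching.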

I've been hacking 2.2.5 in case it matters.

Thanks
Paul

