[Samba] Samba Performance question

Belgardt, Wolfgang Wolfgang.Belgardt at hp.com
Fri Dec 6 20:52:00 GMT 2002


Hello Paul,

Thanks for your explanation of what Samba does in this case.

1) Where does Samba build the in-memory list: on the server or on the client? On the server, I assume?
1a) All files have 8.3 names; they were installed via an NT client.

2) For your information: I believe the customer's software searches for the files using wildcards. What is the customer doing? On an NT client, he reads music CDs and builds a 30-second MP3 file from every track on each CD. The software automatically creates a directory with an 8-character name and writes, for each track of the CD, an MP3 file with an 8.3 name. Only the 3-character extension differs, to identify the track. Writing is not a problem, because the write time is not relevant for the customer.
What is done with these files? When a customer in the CD shop wants to listen to music from a CD, he holds the CD to a barcode reader, and the software then uses the barcode key to find the MP3 files and play the music. This search is done, I believe, with wildcards. The search takes 6 seconds in a directory with 45000 files when the filename has 257 extensions.
When we search this directory for a file with fewer extensions (e.g. 15), the search takes 3 seconds. With far fewer files in the directory (10000), the search takes less than 1 second.
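
To illustrate why the search time would grow with directory size, here is a minimal Python sketch of a wildcard search over a directory listing. It is only an assumption that the customer software issues a pattern of the form `BASENAME.*`; the function name `wildcard_search` and the sample file names are hypothetical, and `fnmatch` only approximates DOS wildcard semantics.

```python
import fnmatch


def wildcard_search(entries, pattern):
    """Return (matches, entries_examined) for a case-insensitive wildcard.

    Every entry has to be compared against the pattern, so the work grows
    with the number of files in the directory, not with the number of hits.
    """
    wanted = pattern.upper()
    matches = []
    examined = 0
    for name in entries:
        examined += 1
        if fnmatch.fnmatchcase(name.upper(), wanted):
            matches.append(name)
    return matches, examined
```

Calling `wildcard_search(listing, "00000001.*")` on a listing of 45000 names examines all 45000 entries even if only a handful match, which would be consistent with the smaller directory answering faster.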



Kind Regards / Grüsse 

Wolfgang 

-----Original Message-----
From: paul.r.schenk at accenture.com [mailto:paul.r.schenk at accenture.com]
Sent: Friday, December 06, 2002 19:33
To: Belgardt, Wolfgang
Cc: samba at samba.org
Subject: Re: [Samba] Samba Performance question


We had the same problem here and I traced it to how Samba pretends to be a
Windows server.

Basically Samba does this:

1) build an in-memory list of a directory's contents, with 8.3 mangled
names
2) When asked for a file, look through the list created in step 1) trying
to find a match.  It tries an exact match and then an 8.3 match.

With large numbers of files in directories (I have one with about 650000
files), 1) creates a huge list and 2) takes forever and pegs the CPU at
100%.
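
The two steps above can be sketched roughly as follows in Python. This is an illustration only, not Samba's code: `mangle_8_3` is a toy stand-in for the real name-mangling algorithm, and `build_listing` and `lookup` are hypothetical names.

```python
def mangle_8_3(name):
    """Toy 8.3 mangling for illustration; NOT Samba's real algorithm."""
    base, dot, ext = name.rpartition(".")
    if not dot:  # no extension at all
        base, ext = name, ""
    short = base[:6].upper() + "~1"
    return short + "." + ext[:3].upper() if ext else short


def build_listing(names):
    """Step 1: in-memory list of (real name, mangled name) pairs."""
    return [(name, mangle_8_3(name)) for name in names]


def lookup(listing, wanted):
    """Step 2: linear scan, exact match first, then 8.3 match.

    Returns (real name or None, number of comparisons made).
    """
    comparisons = 0
    for real, _short in listing:
        comparisons += 1
        if real == wanted:
            return real, comparisons
    target = wanted.upper()
    for real, short in listing:
        comparisons += 1
        if short == target:
            return real, comparisons
    return None, comparisons
```

In this sketch a miss walks the list twice, so every failed lookup in a directory of n files costs 2n comparisons, on top of the cost of building the list in the first place.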

In cases with large numbers of files, Windows wins hands-down, because the
8.3-stupid-stuff is handled by the filesystem.

I solved this by making a modified version of a few routines.
1) I made the routines that create the directory list abort after 100 files
and pretend there are no more files.
2) I modified the file opening routine (trans2_readdir, I think) to attempt
to open the file using the filesystem first, bypassing all
case-insensitive-8.3-mangling code. If that fails, I let it try the
look-up-in-a-list method (except for a hard-coded directory where I return
file-not-found if the direct attempt failed).
3) I set 'dont descend' on the big directories, to help users who
mistakenly try to browse the directory with explorer, although mod 1) would
mean they'd only see 100 files anyway.
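
The idea behind modification 2) can be sketched in Python as below. This is not the actual patch (which was made in Samba's C source); the function name `open_direct_first` and the `scan_fallback` switch are hypothetical and only illustrate "ask the filesystem first, scan the list only on failure".

```python
import os


def open_direct_first(directory, requested, scan_fallback=True):
    """Try the exact name on the filesystem before any directory scan."""
    path = os.path.join(directory, requested)
    try:
        # Fast path: one filesystem lookup, no in-memory list needed.
        return open(path, "rb")
    except FileNotFoundError:
        if not scan_fallback:
            # Comparable to the hard-coded big directory: fail at once.
            raise
    # Slow path: case-insensitive scan of the whole directory.
    wanted = requested.lower()
    for entry in os.listdir(directory):
        if entry.lower() == wanted:
            return open(os.path.join(directory, entry), "rb")
    raise FileNotFoundError(requested)
```

With `scan_fallback=False`, a request whose case does not match the name on disk fails instead of triggering a scan, which is the trade-off described in the paragraphs that follow.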

Making these changes allows my HP9000-D380/2 to outperform a Windows NT 4
Pentium 2 when dealing with directories of over 600000 files. Stock Samba
compiled from source (or the depot from itrc) served files from this
directory at about the rate of 5 min/file, with the CPU pegged at 100%. NT
can handle this in less than 1 second. Now I have over 400 people opening
files in this directory all day, and the CPU doesn't even work up a sweat.

The mods I made break what I understand SMB to be. The broken-ness would
only affect old clients (Win 3.1) and clients that try to open 'AFILE.DOC'
and expect to get 'afile.doc'. Since I control what the client requests, I
could get around this. YMMV.

Hope this helps. Does anybody know if changes to address this problem are
in Samba 3?

All the best,
Paul



From:    "Belgardt, Wolfgang" <Wolfgang.Belgardt at hp.com>
Sent by: samba-admin at lists.samba.org
Date:    05/12/2002 04:45 PM
To:      <samba at samba.org>
cc:
Subject: [Samba] Samba Performance question



Dear all,

I have a difficult problem with Samba 2.2.5; I hope someone can help me.

My customer has Samba 2.2.5 running on an HP Alpha Server ES40 cluster with
Tru64 V5.1. The share on this server has 3.1 million files in 16000
directories.

Some of these directories contain 45000 files.

The problem: when we search for a file in one of these big directories from
an NT client, the response time is too long for the customer.

He has run a similar application against an NT file server. NT responded
after 1 second, whereas Samba needs 6 seconds.

Can someone please explain what I can do to improve the performance?






Kind Regards / Mit freundlichen Grüssen


Wolfgang Belgardt
Customer Support Consultant

Hewlett-Packard GmbH
Customer Support
Bonsiepen 5
D-45136 Essen
Phone: ++49 (0) 201 2663 258
Fax:     ++49 (0) 201 2663 200
mobil:   +49 (0171 3357 256)
E-mail:  Wolfgang.Belgardt at hp.com
http://www.hp.com/de











