[Samba] Samba 3.0.6 and OpenLDAP performance problem

Tomasz Finke tomasz at finke.pl
Wed Oct 6 11:06:57 GMT 2004


Hello,

I'm running Samba 3.0.6 PDC with OpenLDAP 2.1.25 backend on a Linux
machine with RedHat 3.0 ES installed.  This is a large installation
with separate Samba BDC and 2 file servers.  The BDC server uses a 
replica LDAP server, working as slave for the master LDAP server
installed on the PDC.  There are about 1850 domain accounts, and at
the moment about 500 machines have been added to the Samba domain.  The
number of machines has increased slowly since April, and for the last
few weeks we have observed large delays during domain logons.

The logon process for some Windows machines takes as long as 10-20
minutes (!).  For most users such times are of course unacceptable.

Most of the users start work and log on to the domain between
7:30 and 8:30 AM.  During this hour the load average on the PDC server
sometimes exceeds 100-120.  About 90% of the CPU time is used by slapd.

The PDC/BDC machines are HP DL-380 servers, each with a single 2.80 GHz
Xeon CPU, 2.5 GB of RAM, no swap, and a Gigabit Ethernet interface.

I turned on a high debug level for both the Samba and OpenLDAP daemons
and found that, while the logon script is being processed, Samba asks
the LDAP backend to search for every domain user, one by one, and
repeats this 3 or 4 times.  This gives about 8-9 _thousand_ LDAP
directory searches for a single logon session!  A small part of the
slapd debug log follows:


Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=65 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd01)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=65 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=66 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd02)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=66 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=67 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd03)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=67 SEARCH RESULT tag=101 err=0 nentries=1 text=


... and so on.  For some reason every user is looked up in LDAP several
times.  All these searches are performed while the logon script is
being processed.  Since many of our users are still on Win98
workstations, the system "hangs" for them for several minutes with an
empty screen and only a logon script window open.
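
To put numbers on this, the slapd log can be summarized per connection
and per uid.  The following is only a sketch: it assumes one log record
per line (as syslog writes them) in the format quoted above, and the
function name summarize_searches is made up for illustration.

```python
import re
from collections import Counter

# Matches slapd "SRCH" records in the syslog format quoted above and
# captures the connection number and the search filter.
SRCH_RE = re.compile(r'conn=(\d+) op=\d+ SRCH .*filter="([^"]*)"')
# Pulls the uid value out of a filter such as
# (&(uid=umwadd01)(objectClass=sambaSamAccount)).
UID_RE = re.compile(r'\(uid=([^)]*)\)')


def summarize_searches(lines):
    """Count SRCH operations per connection and per searched uid.

    `lines` is any iterable of log lines, e.g. an open log file.
    Returns two Counters: (per_connection, per_uid).
    """
    per_conn = Counter()
    per_uid = Counter()
    for line in lines:
        match = SRCH_RE.search(line)
        if match is None:
            continue  # SEARCH RESULT lines and other records are skipped
        conn, ldap_filter = match.groups()
        per_conn[conn] += 1
        uid = UID_RE.search(ldap_filter)
        if uid is not None:
            per_uid[uid.group(1)] += 1
    return per_conn, per_uid
```

Feeding it the log of one slow and one fast logon and comparing
per_conn.most_common(5) would show whether the slow session really
repeats the same uid lookups several times on a single connection.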

What's more confusing, for some of the domain users only about 60
LDAP searches are performed and they are able to log on to the domain
in a few seconds.  I compared their exported LDIF data with that of
users who experience the delays, but found nothing exceptional: only
their names, UIDs and SIDs are different.
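
A comparison like this can be automated so that no attribute is
overlooked.  The sketch below is an illustration only: it handles plain
"attr: value" LDIF lines and folded continuation lines, compares
base64-encoded values without decoding them, and the default list of
ignored attributes is an assumption based on the differences named
above.  Note that it looks at user entries only; posixGroup membership
(memberUid) is stored on the group entries and would not show up here.

```python
def parse_ldif_entry(text):
    """Parse one LDIF entry into {attribute_lowercase: set(values)}."""
    unfolded = []
    for line in text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]  # folded continuation of previous line
        elif line and not line.startswith("#"):
            unfolded.append(line)
    attrs = {}
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an "attr: value" line
        attrs.setdefault(name.lower(), set()).add(value.strip())
    return attrs


# Attributes expected to differ between any two users (assumed list).
EXPECTED_TO_DIFFER = ("dn", "uid", "cn", "sn", "displayname",
                      "uidnumber", "sambasid")


def diff_entries(fast, slow, ignore=EXPECTED_TO_DIFFER):
    """Return {attr: (values only in fast, values only in slow)}."""
    result = {}
    for name in sorted(set(fast) | set(slow)):
        if name in ignore:
            continue
        a, b = fast.get(name, set()), slow.get(name, set())
        if a != b:
            result[name] = (a - b, b - a)
    return result
```

An empty result for a fast/slow pair would support the observation that
the user entries themselves do not explain the difference.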

The problem does not depend on the operating system of the workstation
- we've tested Win98, NT, W2000 and XP systems.  It seems to be rather
user-centric.

I tried to increase OpenLDAP and nscd performance by setting the thread
number up to 256 and increasing the cache size, but this gives only a
small improvement.  The indexes in slapd.conf are defined as
described in the Samba docs:


index           default                         sub

index objectClass                               eq
index uidNumber,gidNumber                       eq
index memberUid                                 eq
index cn,sn,uid,displayName                     pres,sub,eq
index mail,givenname                            eq,subinitial
index nisMapName,nisMapEntry                    eq,pres,sub
index homeDirectory,sambaLogonScript            eq

index           sambaSID                        eq
index           sambaPrimaryGroupSID            eq
index           sambaDomainName                 eq

sizelimit       -1

cachesize       100000
dbcachesize     15000000
threads         256
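
Whether the attributes in a logged filter are covered by an equality
index can be checked mechanically.  This is a rough sketch, not a full
slapd.conf parser: it reads only "index" lines, treats every assertion
in the filter as an equality match, and the function names are
hypothetical.

```python
import re


def parse_indexes(conf_text):
    """Return {attribute_lowercase: set(index types)} from 'index' lines."""
    default = set()
    indexes = {}
    for line in conf_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "index":
            continue
        attrs = fields[1].split(",")
        types = set(fields[2].split(",")) if len(fields) > 2 else None
        if attrs == ["default"]:
            default = types or set()  # "index default <types>"
            continue
        for attr in attrs:
            # An index line without types falls back to the default set.
            indexes[attr.lower()] = types if types is not None else set(default)
    return indexes


def unindexed_eq_attrs(ldap_filter, indexes):
    """List filter attributes that have no 'eq' index configured."""
    attrs = re.findall(r'\(([A-Za-z][\w;-]*)=', ldap_filter)
    return [a for a in attrs if "eq" not in indexes.get(a.lower(), set())]
```

For the filter seen in the log,
(&(uid=...)(objectClass=sambaSamAccount)), the index lines above give
an empty list, since both uid and objectClass carry an eq index.  That
would suggest the cost comes from the number of searches rather than
from a missing index.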


We have the BDC server configured as the second logon server, but for
some reason only a small number of workstations choose it as their
logon server.  Perhaps I should increase the "os level" for the BDC
from 33 to 255, as it is configured for the PDC?

The smb.conf of the PDC server follows:

[global]
         workgroup = XXCOMP
         security = user
         server string = XX Company - PDC
         passdb backend = ldapsam:ldap://127.0.0.1
         idmap backend = ldap:ldap://127.0.0.1
         idmap uid = 40000-50000
         idmap gid = 40000-50000
         log level = 1
         log file = /var/log/samba/log.%m
         max log size = 500
         time server = Yes
         socket options = TCP_NODELAY IPTOS_LOWDELAY SO_KEEPALIVE SO_RCVBUF=8192 SO_SNDBUF=8192
         logon path =
         logon drive = K:
         logon home = \\fileserv02\homes\%U

         #logon script = %U.bat

         domain logons = Yes
         os level = 255
         local master = Yes
         preferred master = Yes
         domain master = Yes
         wins proxy = Yes
         wins support = Yes
         ldap suffix = dc=XX Company,dc=pl
         ldap group suffix = ou=groups
         ldap user suffix = ou=people
         ldap idmap suffix = ou=idmap,dc=XX Company,dc=pl
         ldap machine suffix = ou=machines
         ldap admin dn = cn=Manager,dc=XX Company,dc=pl
         ldap ssl = no
         ldap passwd sync = Yes
         remote browse sync = 10.255.255.255 130.130.255.255
         printing = cups
         hide unreadable = Yes
         nt acl support = Yes
         admin users = "Domain Admins"
         name resolve order = lmhosts wins hosts bcast
         ldap timeout = 15


I tried to use the idmap feature of Samba, but for some reason, after I
created the "ou=idmap,dc=XX Company,dc=pl" container in LDAP, Samba
does not populate it with SID-GID mappings.  Perhaps this is the root
cause of our problem.

The whole Samba domain worked properly without these logon delays for
several months.  When the number of users and workstations was small, no
performance problems occurred.  Now we have a serious problem with only
about 1/3 of the intended number of domain workstations (500 of ~1500).
Unless I find a solution, our management will probably decide to
migrate from Samba to Active Directory...

Thanks in advance,
Tomasz Finke



