[Samba] Samba 3.0.6 and OpenLDAP performance problem
Tomasz Finke
tomasz at finke.pl
Wed Oct 6 11:06:57 GMT 2004
Hello,
I'm running Samba 3.0.6 PDC with OpenLDAP 2.1.25 backend on a Linux
machine with RedHat 3.0 ES installed. This is a large installation
with separate Samba BDC and 2 file servers. The BDC server uses a
replica LDAP server, working as slave for the master LDAP server
installed at PDC. The number of domain accounts is about 1850 and
at the moment about 500 machines are added to the Samba domain. The
number of machines has increased slowly since April, and over the last few
weeks we have observed long delays during domain logons.
The logon process for some Windows machines takes as much as 10-20
minutes (!). For most of the users these times are of course
unacceptable.
Most of the users start their work and log on to the domain between
7:30 and 8:30 AM. During this hour the load on the PDC server sometimes
exceeds 100-120. About 90% of the CPU time is used by slapd.
The PDC/BDC machines are HP DL-380 servers, each with a single 2.80 GHz
Xeon CPU, 2.5 GB of RAM, no swap, and a Gigabit Ethernet interface.
I turned on a high debug level for both the Samba and OpenLDAP
daemons and found that, during the processing of the logon script,
Samba orders the LDAP backend to search for every domain user, and
repeats this 3 or 4 times. This gives about 8-9 _thousand_ LDAP
directory searches for a single logon session!
A small part of the slapd debug file follows:
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=65 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd01)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=65 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=66 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd02)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=66 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=67 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd03)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=67 SEARCH RESULT tag=101 err=0 nentries=1 text=
... and so on. For some reason every user is looked up in LDAP several
times. All these searches are performed while the logon script is
being processed. Since many of our users are still using Win98
workstations, the system "hangs" for them for several minutes with an
empty screen and only a logon script window open.
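To compare a slow logon with a fast one, it may help to count the SRCH
operations per connection and the number of lookups per uid directly from
the slapd log. The following is only a minimal sketch: it assumes each log
entry sits on a single line (as syslog writes it, not wrapped as in this
mail), and the names summarize_searches and SAMPLE are made up for
illustration.

```python
import re
from collections import Counter

# Matches slapd "SRCH" request lines; "SEARCH RESULT" lines do not match.
SRCH_RE = re.compile(r'conn=(\d+) op=\d+ SRCH\b.*?filter="([^"]*)"')
# Pulls the uid value out of a filter such as (&(uid=foo)(objectClass=...)).
UID_RE = re.compile(r'\(uid=([^)]+)\)')

# Three request/result pairs in the format shown in the excerpt above.
SAMPLE = '''\
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=65 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd01)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=65 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=66 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd02)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=66 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=67 SRCH base="dc=XX Company,dc=pl" scope=2 filter="(&(uid=umwadd03)(objectClass=sambaSamAccount))"
Sep 27 15:01:09 umwsap11 slapd[16930]: conn=458 op=67 SEARCH RESULT tag=101 err=0 nentries=1 text=
'''


def summarize_searches(lines):
    """Return (searches per connection, lookups per uid) as two Counters."""
    per_conn = Counter()
    per_uid = Counter()
    for line in lines:
        match = SRCH_RE.search(line)
        if not match:
            continue  # not a search request line
        conn, ldap_filter = match.groups()
        per_conn[int(conn)] += 1
        uid = UID_RE.search(ldap_filter)
        if uid:
            per_uid[uid.group(1)] += 1
    return per_conn, per_uid


if __name__ == "__main__":
    per_conn, per_uid = summarize_searches(SAMPLE.splitlines())
    print("busiest connections:", per_conn.most_common(5))
    print("most repeated uids:", per_uid.most_common(5))
```

Run against the full log of one slow and one fast logon, the per-uid
counts should show whether every account really is fetched 3 or 4 times,
and the per-connection counts show which smbd connection issues them.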
What's more confusing, for some of the domain users only about 60
LDAP searches are performed and they are able to log on to the domain
in a few seconds. I tried to compare their exported LDIF data with
that of users who experience the delays, but there's nothing
exceptional; only their names, UIDs and SIDs are different.
The problem does not depend on the operating system of the workstation
- we've tested Win98, NT, W2000 and XP systems. It seems to be rather
user-centric.
I tried to increase OpenLDAP and nscd performance by setting the thread
number up to 256 and increasing the cache size, but this gives only a
small improvement. The indexes in slapd.conf are defined as
described in the Samba docs:
index default sub
index objectClass eq
index uidNumber,gidNumber eq
index memberUid eq
index cn,sn,uid,displayName pres,sub,eq
index mail,givenname eq,subinitial
index nisMapName,nisMapEntry eq,pres,sub
index homeDirectory,sambaLogonScript eq
index sambaSID eq
index sambaPrimaryGroupSID eq
index sambaDomainName eq
sizelimit -1
cachesize 100000
dbcachesize 15000000
threads 256
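As a sanity check on the index list above, the attributes used in the
logged filters can be compared against the configured equality indexes.
This is a rough sketch under my own assumptions: it only reads the "index"
lines as text (an attribute list followed by optional index types, with
"default" setting the types for attributes listed without any), it handles
only simple equality assertions, and it cannot tell whether slapd actually
uses the indexes or whether they were rebuilt with slapindex after being
added. The function names are hypothetical.

```python
import re

# The index lines quoted above, as plain text.
INDEX_LINES = """\
index default sub
index objectClass eq
index uidNumber,gidNumber eq
index memberUid eq
index cn,sn,uid,displayName pres,sub,eq
index mail,givenname eq,subinitial
index nisMapName,nisMapEntry eq,pres,sub
index homeDirectory,sambaLogonScript eq
index sambaSID eq
index sambaPrimaryGroupSID eq
index sambaDomainName eq
""".splitlines()

# Simple equality assertions only: (attr=value) with no wildcard in value.
EQ_ASSERTION_RE = re.compile(r'\(([A-Za-z][A-Za-z0-9;-]*)=([^()*]+)\)')


def parse_indexes(conf_lines):
    """Map lower-cased attribute name -> set of configured index types."""
    default_types = set()
    indexes = {}
    for line in conf_lines:
        parts = line.split()
        if len(parts) < 2 or parts[0].lower() != "index":
            continue
        types = set(parts[2].split(",")) if len(parts) > 2 else None
        for attr in parts[1].split(","):
            if attr.lower() == "default":
                default_types = types or set()
            elif types is None:
                indexes[attr.lower()] = set(default_types)
            else:
                indexes[attr.lower()] = types
    return indexes


def unindexed_equality_attrs(ldap_filter, indexes):
    """List attributes tested for equality that have no 'eq' index."""
    return [attr for attr, _value in EQ_ASSERTION_RE.findall(ldap_filter)
            if "eq" not in indexes.get(attr.lower(), set())]


if __name__ == "__main__":
    idx = parse_indexes(INDEX_LINES)
    logged = "(&(uid=umwadd01)(objectClass=sambaSamAccount))"
    print(unindexed_equality_attrs(logged, idx))
```

For the filter seen in the log, both uid and objectClass have an eq index
in this configuration, so by this check the per-user searches themselves
should be indexed lookups; the cost would then come from their sheer
number rather than from a missing index.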
We have the BDC server configured as the second logon server, but for
some reason only a small number of workstations choose it as their
logon server. Perhaps I should increase the "os level" for the BDC from
33 to 255, as it is configured for the PDC?
The smb.conf of the PDC server follows:
[global]
workgroup = XXCOMP
security = user
server string = XX Company - PDC
passdb backend = ldapsam:ldap://127.0.0.1
idmap backend = ldap:ldap://127.0.0.1
idmap uid = 40000-50000
idmap gid = 40000-50000
log level = 1
log file = /var/log/samba/log.%m
max log size = 500
time server = Yes
socket options = TCP_NODELAY IPTOS_LOWDELAY SO_KEEPALIVE SO_RCVBUF=8192 SO_SNDBUF=8192
logon path =
logon drive = K:
logon home = \\fileserv02\homes\%U
#logon script = %U.bat
domain logons = Yes
os level = 255
local master = Yes
preferred master = Yes
domain master = Yes
wins proxy = Yes
wins support = Yes
ldap suffix = dc=XX Company,dc=pl
ldap group suffix = ou=groups
ldap user suffix = ou=people
ldap idmap suffix = ou=idmap,dc=XX Company,dc=pl
ldap machine suffix = ou=machines
ldap admin dn = cn=Manager,dc=XX Company,dc=pl
ldap ssl = no
ldap passwd sync = Yes
remote browse sync = 10.255.255.255 130.130.255.255
printing = cups
hide unreadable = Yes
nt acl support = Yes
admin users = "Domain Admins"
name resolve order = lmhosts wins hosts bcast
ldap timeout = 15
I tried to use the idmap feature of Samba, but for some reason, after
I created the "ou=idmap,dc=XX Company,dc=pl" container in LDAP, Samba
does not populate it with SID-GID mappings. Perhaps this is the root
cause of our problem.
The whole Samba domain worked properly without these logon delays for
several months. When the number of users and workstations was small, no
performance problems occurred. Now we have a serious problem with only
about 1/3 of the intended number of domain workstations (500 of ~1500).
Unless I find the solution, our management will probably decide to
migrate from Samba to Active Directory...
Thanks in advance,
Tomasz Finke