[Samba] delay in winbindd user/group lookups with many users/groups in domain

Walter Haidinger listrelay.samba.0411 at banshee.dnsalias.org
Mon Nov 29 10:44:41 GMT 2004


I have the following problem: winbindd getgrnam/getpwnam/lookupname calls 
do not return for a long time (> 30 seconds), which is obviously quite 
annoying to the users. This usually happens at the "first" login, i.e. 
subsequent connections are not delayed until some timeout (probably of a 
cache) expires. I have not yet been able to determine which component is 
responsible for the caching.

I'll provide version and configuration details below.

First, here are some examples from log.winbindd:

[2004/11/29 09:02:19, 3] nsswitch/winbindd_sid.c:winbindd_lookupname(96)
  [28927]: lookupname MY_DOMAIN/HOST_FOO1
[2004/11/29 09:02:56, 3] 
[2004/11/29 09:28:17, 3] nsswitch/winbindd_group.c:winbindd_getgrnam(243)
  [21242]: getgrnam MY_DOMAIN/GROUP_BAR23
[2004/11/29 09:28:54, 3] 
[2004/11/29 09:45:15, 3] nsswitch/winbindd_user.c:winbindd_getpwnam(126)
  [21242]: getpwnam MY_DOMAIN/USER_JOE232
[2004/11/29 09:45:51, 3] 

See the delays? Short question: How can I eliminate that?
Searching the mailing-list and the web did not turn up any evident solution.
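
The gaps in the log excerpt above can be measured with a small script. This is a minimal sketch in Python, assuming that the next log entry is only written once the request has completed (so the gap merely approximates the call duration) and that log.winbindd uses the header format shown above. `request_delays` is a hypothetical helper name, not part of Samba.

```python
import re
from datetime import datetime

# Header of a Samba log entry, e.g. "[2004/11/29 09:02:19, 3] nsswitch/..."
HEADER = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*\d+\]")
# Indented request line, e.g. "  [28927]: lookupname MY_DOMAIN/HOST_FOO1"
REQUEST = re.compile(r"^\s+\[\d+\]:\s+(\w+)\s+(\S+)")


def request_delays(lines):
    """Yield (call, name, seconds) for each logged request.

    seconds is the gap between the header of the request's log entry
    and the header of the next log entry (an approximation only).
    """
    pending = None    # (call, name, start_time) waiting for the next header
    last_time = None  # timestamp of the most recent header seen
    for line in lines:
        header = HEADER.match(line)
        if header:
            stamp = datetime.strptime(header.group(1), "%Y/%m/%d %H:%M:%S")
            if pending is not None:
                call, name, start = pending
                yield call, name, (stamp - start).total_seconds()
                pending = None
            last_time = stamp
            continue
        request = REQUEST.match(line)
        if request and last_time is not None:
            pending = (request.group(1), request.group(2), last_time)
```

Fed the nine excerpt lines above (for example via `request_delays(open(path))` with the path to log.winbindd), this yields gaps of 37, 37 and 36 seconds for the lookupname, getgrnam and getpwnam requests respectively.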

I did mess with 'winbind cache time', turned off winbindd enumeration and 
even increased nscd's cache size and timeouts. Unfortunately the delays 
remain.

Also, it seems that not all users suffer from the delays, only some, 
despite the fact that _all_ are members of the same domain group (see 
smb.conf example below). I'm sorry, but I would like to have more 
reproducible behaviour too!

But then, perhaps it isn't even a Samba issue since getpwnam/getgrnam are 
plain C library (NSS) calls... Still, is there any way to speed up the 
lookups?
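
To check whether the delay is visible outside of Samba, the same lookups can be timed directly through the C library. The following is a minimal sketch in Python, whose `pwd` and `grp` modules wrap the libc getpwnam/getgrnam functions and therefore go through whatever nsswitch.conf configures. `time_lookup` and `compare_first_and_repeat` are hypothetical helper names, and the user and group names in the usage note are just the placeholders from this post.

```python
import grp
import pwd
import time


def time_lookup(lookup, name):
    """Return (seconds, found) for a single lookup(name) call."""
    start = time.monotonic()
    try:
        lookup(name)
        found = True
    except KeyError:
        # pwd.getpwnam and grp.getgrnam raise KeyError for unknown names
        found = False
    return time.monotonic() - start, found


def compare_first_and_repeat(lookup, name):
    """Time the same lookup twice so a cold and a (possibly) cached
    call can be compared."""
    first, found = time_lookup(lookup, name)
    repeat, _ = time_lookup(lookup, name)
    return {"name": name, "found": found, "first_s": first, "repeat_s": repeat}
```

For example, `compare_first_and_repeat(pwd.getpwnam, "USER_JOE232")` and `compare_first_and_repeat(grp.getgrnam, "GROUP_BAR23")` would show whether only the first call is slow. If nscd is running, it may answer the repeated call from its own cache, so the numbers alone do not show which layer did the caching.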

All of the above happens with Samba 3.0.9 under SuSE Linux 9.1 (RPMs 
installed from ftp.suse.com:/projects/samba), SuSE Linux kernel 
2.6.5-7.111-default on a P-II/400 with 512MB (it does nothing but run 
Samba). 
It runs as an NT4 domain client. The domain is _large_: almost 20000 users 
and 170000 groups (yes, no error: 170 thousand! Please don't ask me why so 
many groups, I'm _not_ the domain admin...)

  security = domain
  workgroup = MY_DOMAIN
  password server = *
  wins server =
  name resolve order = wins bcast
  idmap uid = 100000-500000
  idmap gid = 100000-500000
  winbind enum users = no
  winbind enum groups = no
  # try a big cache timeout value
  winbind cache time = 86400
  winbind enable local accounts = no
  winbind use default domain = yes
  template shell = /bin/false
  template homedir = /tmp

  preferred master = No
  local master = No
  domain master = No
  os level = 0

     path = /opt/shared_dir
     # It seems that Samba is slow in determining if the connecting
     # user is in the GROUP_BAR23 domain group...
     valid users = +MY_DOMAIN/GROUP_BAR23 MY_DOMAIN/USER_JOE232
     read only = yes

Hope I've provided enough configuration details. If not, please tell me!

One last thing: When searching the mailing-list, I found a getpwnam-cache 
patch: http://lists.samba.org/archive/samba-cvs/2004-November/052998.html
Would this help me? But then, what about a group cache?

Any hints to solve the above problem are _very_ appreciated!
Thanks in advance!
Regards, Walter

PS: No errors in the logfiles but this one in the winbind log:

"winbindd_pam_auth_crap: sam_logon returned ACCESS_DENIED.  Maybe the 
trust account password was changed and we didn't know it.  Killing 
connections to domain MY_DOMAIN"

Yet, 'wbinfo -t' succeeds.

I do not know (yet) whether the error message is related to my problem 
above or what is causing it.

