[Samba] optimizing and scaling ntlm_auth

Louis Munro lmunro at inverse.ca
Thu Sep 11 08:22:17 MDT 2014


On 2014-09-11, at 9:23 , Volker Lendecke <Volker.Lendecke at SerNet.DE> wrote:

> Yes, it does. This winbind child is really idle. If at the same time
> you have multiple ntlm_auth processes asking for attention, we have a
> scheduling problem in the winbind parent. Just to make sure: Was this
> really under load while one child was busy and ntlm_auth processes were
> not being served?


This is what the process tree for winbind looks like: 

# pstree -p 1460
winbindd(1460)─┬─winbindd(1463)
               ├─winbindd(1465)
               ├─winbindd(2030)
               ├─winbindd(3192)
               ├─winbindd(4426)
               ├─winbindd(4427)
               ├─winbindd(11303)
               ├─winbindd(16894)
               ├─winbindd(25401)
               ├─winbindd(25404)
               ├─winbindd(25408)
               ├─winbindd(25410)
               ├─winbindd(25413)
               ├─winbindd(25416)
               ├─winbindd(25419)
               ├─winbindd(25422)
               ├─winbindd(25428)
               ├─winbindd(25431)
               ├─winbindd(25435)
               ├─winbindd(25438)
               ├─winbindd(25445)
               ├─winbindd(25448)
               ├─winbindd(25452)
               ├─winbindd(25456)
               ├─winbindd(25459)
               ├─winbindd(25461)
               ├─winbindd(25464)
               ├─winbindd(25468)
               ├─winbindd(25471)
               ├─winbindd(25473)
               ├─winbindd(25479)
               ├─winbindd(25484)
               ├─winbindd(25486)
               ├─winbindd(25490)
               ├─winbindd(25492)
               ├─winbindd(25495)
               ├─winbindd(25498)
               ├─winbindd(25501)
               ├─winbindd(28832)
               └─winbindd(28834)



If I trace the first process (strace -p 1463) I see it handling requests. 
The further down the list I go, the less activity there is. 
For example, the second process (1465) handles only a few requests. 
The third one (2030) handles even fewer.  

By the time I reach the bottom of the list, the processes seem never to handle any requests, no matter how long I trace them (admittedly for minutes, not hours). 

This may be normal. I don't know enough about the winbind internals to say whether that is how requests should balance or not. 
I do wonder why I have so many processes if winbind is not using them. 
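
Rather than attaching strace to each child in turn, one rough way to compare them might be to read the accumulated CPU time of every child from /proc. The following is only a sketch, assuming Linux and the field layout documented in proc(5). The function names are my own, and CPU ticks are a proxy for activity, not a count of requests: a child that is blocked waiting on the DC would also show few ticks.

```python
import os


def parse_stat(stat_line):
    """Return (ppid, utime_ticks, stime_ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')' before tokenizing.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the state (field 3 in proc(5)); ppid is field 4,
    # utime is field 14 and stime is field 15.
    return int(fields[1]), int(fields[11]), int(fields[12])


def child_cpu_ticks(parent_pid, proc_root="/proc"):
    """Map each direct child PID of parent_pid to its utime + stime ticks."""
    totals = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                ppid, utime, stime = parse_stat(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        if ppid == parent_pid:
            totals[int(entry)] = utime + stime
    return totals


if __name__ == "__main__":
    # 1460 is the winbindd parent PID from the pstree output above.
    for pid, ticks in sorted(child_cpu_ticks(1460).items()):
        print(pid, ticks)
```

Running it twice a few minutes apart and comparing the numbers would show which children accumulated CPU time in between.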

As mentioned, I have a fairly constant stream of ntlm_auth requests coming in. 
I log the time ntlm_auth takes to return, and it varies from a few ms to a few seconds in the worst case. 
I am trying to find out whether the wait is mostly on the server running winbind or on the DC. 
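
To make "varies significantly" more concrete, the logged durations could be reduced to a few percentiles. This is a minimal sketch: the function names and the sample values are made up for illustration, and it uses the nearest-rank percentile definition.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Return min, median, 95th percentile and max of the samples."""
    return {
        "min": min(samples),
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Hypothetical ntlm_auth durations in milliseconds, not real measurements.
    durations_ms = [3, 5, 8, 12, 2500]
    print(summarize(durations_ms))
```

A large gap between p50 and p95 would suggest occasional stalls rather than uniformly slow authentication, though it would not by itself say whether winbind or the DC is responsible.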

Is there a way for me to know how long the winbind process has been waiting for a reply from the DC?
Or how many requests are in the "queue" at any given time? 
I can crank up the logging but I just don't know what to look for. 
 
Thank you for your help.

--
Louis Munro
lmunro at inverse.ca  ::  www.inverse.ca 
+1.514.447.4918 x125  :: +1 (866) 353-6153 x125
Inverse inc. :: Leaders behind SOGo (www.sogo.nu) and PacketFence (www.packetfence.org)

