[Samba] optimizing and scaling ntlm_auth

Volker Lendecke Volker.Lendecke at SerNet.DE
Thu Sep 11 08:39:37 MDT 2014


On Thu, Sep 11, 2014 at 10:22:17AM -0400, Louis Munro wrote:
> On 2014-09-11, at 9:23 , Volker Lendecke <Volker.Lendecke at SerNet.DE> wrote:
> 
> > Yes, it does. This winbind child is really idle. If at the same time
> > you have multiple ntlm_auth processes asking for attention, we have a
> > scheduling problem in the winbind parent. Just to make sure: Was this
> > really under load while one child was busy and ntlm_auth processes were
> > not being served?
> 
> 
> This is what the process tree for winbind looks like: 
> 
> #pstree 1460 -p
> winbindd(1460)─┬─winbindd(1463)
>                ├─winbindd(1465)
>                ├─winbindd(2030)
>                ├─winbindd(3192)
>                ├─winbindd(4426)
>                ├─winbindd(4427)
>                ├─winbindd(11303)
>                ├─winbindd(16894)
>                ├─winbindd(25401)
>                ├─winbindd(25404)
>                ├─winbindd(25408)
>                ├─winbindd(25410)
>                ├─winbindd(25413)
>                ├─winbindd(25416)
>                ├─winbindd(25419)
>                ├─winbindd(25422)
>                ├─winbindd(25428)
>                ├─winbindd(25431)
>                ├─winbindd(25435)
>                ├─winbindd(25438)
>                ├─winbindd(25445)
>                ├─winbindd(25448)
>                ├─winbindd(25452)
>                ├─winbindd(25456)
>                ├─winbindd(25459)
>                ├─winbindd(25461)
>                ├─winbindd(25464)
>                ├─winbindd(25468)
>                ├─winbindd(25471)
>                ├─winbindd(25473)
>                ├─winbindd(25479)
>                ├─winbindd(25484)
>                ├─winbindd(25486)
>                ├─winbindd(25490)
>                ├─winbindd(25492)
>                ├─winbindd(25495)
>                ├─winbindd(25498)
>                ├─winbindd(25501)
>                ├─winbindd(28832)
>                └─winbindd(28834)
> 
> 
> 
> If I trace the first process (strace -p 1463) I see it handling requests. 
> The further down the list I go, the less activity there is. 
> I.e. the second process (1465) handles only a few requests. 
> The third one (2030) even less.  
> 
> By the time I make it to the bottom of the list they seem to never handle any requests, no matter how long I trace for (admittedly minutes and not hours). 
> 
> This may be normal. I don't know enough about the winbind internals to say whether that is how requests should balance or not. 
> I do wonder why I have so many processes if winbind is not using them. 

At one point winbind was flooded with 30 simultaneous
requests. winbind children never exit again once they have
been started; this was discussed in another thread. I don't
remember whether the code to exit idle children eventually
made it in.

> 
> As mentioned I have a pretty constant stream of ntlm_auth coming in. 
> I log the time ntlm_auth takes to return and it varies significantly, from just a few ms up to a few seconds in the worst case. 
> I am trying to find out if the wait is mostly on the server running winbind or the DC. 
> 
> Is there a way for me to know how long the winbind process has been waiting for a reply from the DC?
> Or how many requests are in the "queue" at any given time? 
> I can crank up the logging but I just don't know what to look for. 

We don't gather these statistics right now. Should be simple
to do, but right now we don't have that. From a network
trace I think I could spot whether a DC was slow overall or
whether it was us, but without further code instrumentation
I don't have a good idea how to find what you need from
logs. tcpdump can do rolling logs, so it's possible to keep
the last gigabyte of traffic around and rotate the rest away.
It might be worthwhile to run that for a while and later on
analyze the capture from the moment when ntlm_auth was slow.
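
As a rough sketch of the rolling capture, something like the
following builds a tcpdump command line that keeps about one
gigabyte in a ring of files. The DC address, the output path,
and the function name are placeholders I made up; -C (file
size in millions of bytes) and -W (number of files to cycle
through) are the standard tcpdump options for this. Filtering
on the DC's address alone is an assumption; you may want to
narrow it further.

```python
import shlex


def rolling_tcpdump_cmd(dc_host, out_prefix, total_mb=1000, file_mb=100, iface="any"):
    """Build a tcpdump argv that keeps roughly total_mb of traffic in a ring of files.

    dc_host and out_prefix are hypothetical values supplied by the caller.
    """
    if file_mb <= 0 or total_mb < file_mb:
        raise ValueError("need 0 < file_mb <= total_mb")
    # Number of files in the ring; tcpdump overwrites the oldest one when full.
    file_count = total_mb // file_mb
    return [
        "tcpdump",
        "-i", iface,            # "any" captures on all interfaces (Linux)
        "-s", "0",              # full packets, not truncated
        "-C", str(file_mb),     # rotate after this many million bytes
        "-W", str(file_count),  # keep at most this many files
        "-w", out_prefix,       # tcpdump appends a number to this name
        "host", dc_host,        # only traffic to and from the DC
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for the real DC.
    print(shlex.join(rolling_tcpdump_cmd("192.0.2.10", "/var/tmp/winbind-dc.pcap")))
```

Run the printed command as root and leave it going. When
ntlm_auth logs a slow reply, note the time and look at the
capture file whose modification time covers it.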

Volker

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt at sernet.de
