[PATCH][WIP] Make the Samba AD DC multi-process

Andrew Bartlett abartlet at samba.org
Wed Dec 14 03:11:53 UTC 2016


On Wed, 2016-12-14 at 07:22 +1300, Andrew Bartlett wrote:
> On Tue, 2016-12-13 at 17:37 +0100, Stefan Metzmacher wrote:
> > Hi Andrew,
> > 
> > > 
> > > > 
> > > > I don't see my concerns of my previous mails addressed,
> > > > so please don't push it as is.
> > > > 
> > > > I can try to explain them again if required later today.
> > > 
> > > I'm really sorry to have to trouble you for them, because I've
> > > been
> > > working hard to address them.
> > > 
> > >  - I have added an smb.conf option for the lsa over netlogon
> > > behaviour
> > >  - I have added tests for that option
> > >  - I've explained that we can't conditionally check for incorrect
> > > association groups until we know which endpoint we bound to
> > 
> > This is wrong we know the endpoint from the time we do the
> > listen on the socket. Or am I missing something?
> > 
> > call->conn->endpoint should be available before we even enter
> > dcesrv_bind() everything else is a bug that needs to be fixed.
> > So please keep the assoc_group checks in one place.
> 
> Thanks.  I missed where we could rely on that after my previous
> attempt
> to find iface, which wasn't available earlier.  

Done.

> > We also don't need any check in dcesrv_handle_fetch()
> > as there won't be any handles if they're not created.
> 
> The purpose of that is to assert that the pipe's flags are correct.
>  I
> don't want to be debugging situations that happen only in some
> contexts, just because the pipe miss-declared if it uses handles. 
> 
> We can drop that patch if it still makes you unhappy. 
> 
> > We also need to make sure we explicitly set
> > ep->use_single_process = true; if the lsa interface
> > tries to use the netlogon named pipe.
> 
> +	/*
> +	 * By default don't force into a single process, but if any
> +	 * interface on this endpoint on this service uses handles
> +	 * (most do), then we must force into single process mode
> +	 */
> +	if (use_single_process == true) {
> +		ep->use_single_process = true;
>  	}
>  
> 
> I'll expand the comment to explain that this more than covers lsa
> over
> netlogon, it also covers the ncalrpc case (all on one socket), which
> also remains a single process.

Done.

> > And we may need to use DLIST_ADD_END() instead of
> > DLIST_ADD() in some places and also check
> > ep->use_single_process if find_endpoint() found an endpoint.
> 
> Why do we need to change this?  What specifically do you want
> changed.
>  I need, sadly, more detail than 'in some places'. 
> 
> If find_endpoint() finds an endpoint, then we use that, unless we are
> over TCP/IP (previously all bound to a single port) and only if we
> are
> not forced to single process mode. 
> 
> > Before calling dcesrv_add_ep() we may reset e->use_single_process
> > if this_model_ops == model_ops.
> 
> Correct, we reset e->use_single_process to true if any process on the
> endpoint needs single process mode.  This happens most of the time.

I've expanded the comments about this behaviour.

> > Maybe it would be good to have (temporary) DEBUG() statements
> > which print endpoint details and all available interfaces
> > in dcesrv_add_ep(). If we would have such code I'd like to
> > see the output of it in various situations, with all combinations
> > of possible process models and lsaovernetlogon=yes/no.
> > This would it much easier to judge if I can be happy with the
> > logic you've implemented.
> 
> You can see that the lsa over netlogon works because we have explicit
> tests added for that.  If adding lsa to \pipe\netlogon it wasn't
> controlling the process model for netlogon, it wouldn't succeed/fail
> when changing this option.
> 
> We can just drop that lsa over netlogon over NP part for now if it
> really bothers you that much.  The important client (winbindd)
> prefers
> ncacn_ip_tcp anyway. 
> 
> > > 
> > >  - I've changed to specifying flags, not just the use of handles
> > >  - I've explained why the dcerpc interface should and does
> > > declare
> > > about the use of handles, not the ignoring of association groups,
> > > because per your request, only when all interfaces on an endpoint
> > > don't
> > > use handles, do we ignore association groups.  (And the DCE/RPC
> > > server
> > > could track this some other way if it wanted). 
> > >  - I've explained that unused records are removed in the same way
> > > as
> > > the old code, on startup.
> > 
> > That's not true. You removed memcache_delete() without a
> > replacement
> > and challenge_cache_fetch() is used if the client actually
> > switches the connection between netr_ServerReqChallenge()
> > and netr_ServerAuthenticate(), otherwise we'll use
> > dce_call->context->private_data to get the challenge.
> 
> Thanks.  I'll add some tests to cover that, as it has some security
> properties in the case where we don't enforce schannel.  We have to
> remove it even if we didn't fetch it, so removing it on the
> challenge_cace_fetch() isn't enough.  Otherwise someone could re-use
> an
> observed challenge (they probably couldn't do very much with the
> result, but it wouldn't be good).   

I've added important tests for this, which has shown up some very
important bugs. 

> 
> > I also guess some of the new includes are not required
> > anymore, at least "replace.h" comes already via
> > "includes.h" and #include "system/filesys.h" should
> > follow includes.h.
> > 
> > And I think netlogon_cache_entry should be moved to
> > schannel.idl next to netlogon_creds_CredentialState.
> 
> OK. 

Done.

> > > 
> > >  - And, as you have seen from Garming, we have shown this makes a
> > > significant improvement to the use case:  high-load 802.1x (NTLM)
> > > authentication via winbind on domain members. 
> > > 
> > > We have a customer who suffers significant load on the DCs after
> > > a
> > > wifi
> > > outage, because the DC needs to suddenly re-authenticate all the
> > > users.
> > >  That is why we are working on this. 
> > > 
> > > I've re-read your last mail, and I'm honestly at my wits end as
> > > what
> > > more you want, so in the desperate hope that we are simply in
> > > miscommunication, can you please make clear again what specific
> > > changes
> > > you want?  I would really like to move on from this, and lock it
> > > in
> > > for
> > > 4.6.
> > 
> > I also hope we'll have it in 4.6.
> > 
> > Thanks for your patience.
> > metze
> 
> I'll see what I can do today, starting with using the endpoint for
> the
> association group check, and the DB delete.

I hope that what I attach here meets with your approval.  In
particular, I've added a test to cover bug 11291 (netapp challenge
reuse). 

Thanks,

Andrew Bartlett
-------------- next part --------------
A non-text attachment was scrubbed...
Name: multi-process-samba-ad-dc.patch
Type: text/x-patch
Size: 72475 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20161214/67671aaa/multi-process-samba-ad-dc.bin>


More information about the samba-technical mailing list