[PATCH][WIP] Make the Samba AD DC multi-process

Andrew Bartlett abartlet at samba.org
Wed Dec 14 15:00:24 UTC 2016


On Wed, 2016-12-14 at 12:08 +0100, Stefan Metzmacher wrote:
> Am 14.12.2016 um 05:47 schrieb Andrew Bartlett:
> > 
> > On Wed, 2016-12-14 at 16:11 +1300, Andrew Bartlett wrote:
> > > 
> > > On Wed, 2016-12-14 at 07:22 +1300, Andrew Bartlett wrote:
> > > > 
> > > > On Tue, 2016-12-13 at 17:37 +0100, Stefan Metzmacher wrote:
> > > > > 
> > > > > Hi Andrew,
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > I don't see my concerns of my previous mails addressed,
> > > > > > > so please don't push it as is.
> > > > > > > 
> > > > > > > I can try to explain them again if required later today.
> > > > > > 
> > > > > > I'm really sorry to have to trouble you for them, because
> > > > > > I've
> > > > > > been
> > > > > > working hard to address them.
> > > > > > 
> > > > > >  - I have added an smb.conf option for the lsa over
> > > > > > netlogon
> > > > > > behaviour
> > > > > >  - I have added tests for that option
> > > > > >  - I've explained that we can't conditionally check for
> > > > > > incorrect
> > > > > > association groups until we know which endpoint we bound to
> > > > > 
> > > > > This is wrong we know the endpoint from the time we do the
> > > > > listen on the socket. Or am I missing something?
> > > > > 
> > > > > call->conn->endpoint should be available before we even enter
> > > > > dcesrv_bind() everything else is a bug that needs to be
> > > > > fixed.
> > > > > So please keep the assoc_group checks in one place.
> > > > 
> > > > Thanks.  I missed where we could rely on that after my previous
> > > > attempt
> > > > to find iface, which wasn't available earlier.  
> > > 
> > > Done.
> > > 
> > > > 
> > > > > 
> > > > > We also don't need any check in dcesrv_handle_fetch()
> > > > > as there won't be any handles if they're not created.
> > > > 
> > > > The purpose of that is to assert that the pipe's flags are
> > > > correct.
> > > >  I
> > > > don't want to be debugging situations that happen only in some
> > > > contexts, just because the pipe miss-declared if it uses
> > > > handles. 
> > > > 
> > > > We can drop that patch if it still makes you unhappy. 
> > > > 
> > > > > 
> > > > > We also need to make sure we explicitly set
> > > > > ep->use_single_process = true; if the lsa interface
> > > > > tries to use the netlogon named pipe.
> > > > 
> > > > +	/*
> > > > +	 * By default don't force into a single process, but
> > > > if
> > > > any
> > > > +	 * interface on this endpoint on this service uses
> > > > handles
> > > > +	 * (most do), then we must force into single process
> > > > mode
> > > > +	 */
> > > > +	if (use_single_process == true) {
> > > > +		ep->use_single_process = true;
> > > >  	}
> > > >  
> > > > 
> > > > I'll expand the comment to explain that this more than covers
> > > > lsa
> > > > over
> > > > netlogon, it also covers the ncalrpc case (all on one socket),
> > > > which
> > > > also remains a single process.
> > > 
> > > Done.
> > > 
> > > > 
> > > > > 
> > > > > And we may need to use DLIST_ADD_END() instead of
> > > > > DLIST_ADD() in some places and also check
> > > > > ep->use_single_process if find_endpoint() found an endpoint.
> > > > 
> > > > Why do we need to change this?  What specifically do you want
> > > > changed.
> > > >  I need, sadly, more detail than 'in some places'. 
> > > > 
> > > > If find_endpoint() finds an endpoint, then we use that, unless
> > > > we
> > > > are
> > > > over TCP/IP (previously all bound to a single port) and only if
> > > > we
> > > > are
> > > > not forced to single process mode. 
> > > > 
> > > > > 
> > > > > Before calling dcesrv_add_ep() we may reset e-
> > > > > >use_single_process
> > > > > if this_model_ops == model_ops.
> > > > 
> > > > Correct, we reset e->use_single_process to true if any process
> > > > on
> > > > the
> > > > endpoint needs single process mode.  This happens most of the
> > > > time.
> > > 
> > > I've expanded the comments about this behaviour.
> > > 
> > > > 
> > > > > 
> > > > > Maybe it would be good to have (temporary) DEBUG() statements
> > > > > which print endpoint details and all available interfaces
> > > > > in dcesrv_add_ep(). If we would have such code I'd like to
> > > > > see the output of it in various situations, with all
> > > > > combinations
> > > > > of possible process models and lsaovernetlogon=yes/no.
> > > > > This would it much easier to judge if I can be happy with the
> > > > > logic you've implemented.
> > > > 
> > > > You can see that the lsa over netlogon works because we have
> > > > explicit
> > > > tests added for that.  If adding lsa to \pipe\netlogon it
> > > > wasn't
> > > > controlling the process model for netlogon, it wouldn't
> > > > succeed/fail
> > > > when changing this option.
> > > > 
> > > > We can just drop that lsa over netlogon over NP part for now if
> > > > it
> > > > really bothers you that much.  The important client (winbindd)
> > > > prefers
> > > > ncacn_ip_tcp anyway. 
> > > > 
> > > > > 
> > > > > > 
> > > > > > 
> > > > > >  - I've changed to specifying flags, not just the use of
> > > > > > handles
> > > > > >  - I've explained why the dcerpc interface should and does
> > > > > > declare
> > > > > > about the use of handles, not the ignoring of association
> > > > > > groups,
> > > > > > because per your request, only when all interfaces on an
> > > > > > endpoint
> > > > > > don't
> > > > > > use handles, do we ignore association groups.  (And the
> > > > > > DCE/RPC
> > > > > > server
> > > > > > could track this some other way if it wanted). 
> > > > > >  - I've explained that unused records are removed in the
> > > > > > same
> > > > > > way
> > > > > > as
> > > > > > the old code, on startup.
> > > > > 
> > > > > That's not true. You removed memcache_delete() without a
> > > > > replacement
> > > > > and challenge_cache_fetch() is used if the client actually
> > > > > switches the connection between netr_ServerReqChallenge()
> > > > > and netr_ServerAuthenticate(), otherwise we'll use
> > > > > dce_call->context->private_data to get the challenge.
> > > > 
> > > > Thanks.  I'll add some tests to cover that, as it has some
> > > > security
> > > > properties in the case where we don't enforce schannel.  We
> > > > have to
> > > > remove it even if we didn't fetch it, so removing it on the
> > > > challenge_cace_fetch() isn't enough.  Otherwise someone could
> > > > re-
> > > > use
> > > > an
> > > > observed challenge (they probably couldn't do very much with
> > > > the
> > > > result, but it wouldn't be good).   
> > > 
> > > I've added important tests for this, which has shown up some very
> > > important bugs. 
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > I also guess some of the new includes are not required
> > > > > anymore, at least "replace.h" comes already via
> > > > > "includes.h" and #include "system/filesys.h" should
> > > > > follow includes.h.
> > > > > 
> > > > > And I think netlogon_cache_entry should be moved to
> > > > > schannel.idl next to netlogon_creds_CredentialState.
> > > > 
> > > > OK. 
> > > 
> > > Done.
> > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > 
> > > > > >  - And, as you have seen from Garming, we have shown this
> > > > > > makes
> > > > > > a
> > > > > > significant improvement to the use case:  high-load 802.1x
> > > > > > (NTLM)
> > > > > > authentication via winbind on domain members. 
> > > > > > 
> > > > > > We have a customer who suffers significant load on the DCs
> > > > > > after
> > > > > > a
> > > > > > wifi
> > > > > > outage, because the DC needs to suddenly re-authenticate
> > > > > > all
> > > > > > the
> > > > > > users.
> > > > > >  That is why we are working on this. 
> > > > > > 
> > > > > > I've re-read your last mail, and I'm honestly at my wits
> > > > > > end as
> > > > > > what
> > > > > > more you want, so in the desperate hope that we are simply
> > > > > > in
> > > > > > miscommunication, can you please make clear again what
> > > > > > specific
> > > > > > changes
> > > > > > you want?  I would really like to move on from this, and
> > > > > > lock
> > > > > > it
> > > > > > in
> > > > > > for
> > > > > > 4.6.
> > > > > 
> > > > > I also hope we'll have it in 4.6.
> > > > > 
> > > > > Thanks for your patience.
> > > > > metze
> > > > 
> > > > I'll see what I can do today, starting with using the endpoint
> > > > for
> > > > the
> > > > association group check, and the DB delete.
> > > 
> > > I hope that what I attach here meets with your approval.  In
> > > particular, I've added a test to cover bug 11291 (netapp
> > > challenge
> > > reuse). 
> > 
> > I've fixed the extra #include lines and fixed another test to run
> > without "allow nt4 crypto". 
> > 
> > See attached.  
> 
> I've pushed the torture tests alone as I want to make sure we already
> pass them (as we should) with the current code.

Good idea.  I just ran out of work day to re-order, so thanks!

> I'd push the "rpc_server:netlogon Move from memcache to a tdb cache"
> patch with the following additions squashed:
> https://git.samba.org/?p=metze/samba/wip.git;a=commitdiff;h=7b19da551
> 6a63126cb087403057639cca600fc10

Thanks.

> Then I'll push the "lsa over netlogon" patches alone.

Great.

> I still think we should not handle
> DCESRV_INTERFACE_FLAGS_HANDLES_NOT_USED
> in dcesrv_handle_fetch(), if you want to add something
> let dcesrv_handle_new() fail and make sure all callers
> of dcesrv_handle_new() check the return.
> 
> And before calling dcesrv_add_ep() we should
> set e->use_single_process = true if this_model_ops is
> the same as model_ops. If someone uses samba -Msingle
> e->use_single_process should always be true.

OK, that makes a little more sense to me now.  Could you do that for
me?  (Otherwise I'll get to it next week).

Thanks!
-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba




More information about the samba-technical mailing list