[PATCH][WIP] Make the Samba AD DC multi-process

Stefan Metzmacher metze at samba.org
Wed Dec 14 11:08:33 UTC 2016

Am 14.12.2016 um 05:47 schrieb Andrew Bartlett:
> On Wed, 2016-12-14 at 16:11 +1300, Andrew Bartlett wrote:
>> On Wed, 2016-12-14 at 07:22 +1300, Andrew Bartlett wrote:
>>> On Tue, 2016-12-13 at 17:37 +0100, Stefan Metzmacher wrote:
>>>> Hi Andrew,
>>>>>> I don't see my concerns of my previous mails addressed,
>>>>>> so please don't push it as is.
>>>>>> I can try to explain them again if required later today.
>>>>> I'm really sorry to have to trouble you for them, because I've
>>>>> been
>>>>> working hard to address them.
>>>>>  - I have added an smb.conf option for the lsa over netlogon
>>>>> behaviour
>>>>>  - I have added tests for that option
>>>>>  - I've explained that we can't conditionally check for
>>>>> incorrect
>>>>> association groups until we know which endpoint we bound to
>>>> This is wrong we know the endpoint from the time we do the
>>>> listen on the socket. Or am I missing something?
>>>> call->conn->endpoint should be available before we even enter
>>>> dcesrv_bind() everything else is a bug that needs to be fixed.
>>>> So please keep the assoc_group checks in one place.
>>> Thanks.  I missed where we could rely on that after my previous
>>> attempt
>>> to find iface, which wasn't available earlier.  
>> Done.
>>>> We also don't need any check in dcesrv_handle_fetch()
>>>> as there won't be any handles if they're not created.
>>> The purpose of that is to assert that the pipe's flags are correct.
>>>  I
>>> don't want to be debugging situations that happen only in some
>>> contexts, just because the pipe miss-declared if it uses handles. 
>>> We can drop that patch if it still makes you unhappy. 
>>>> We also need to make sure we explicitly set
>>>> ep->use_single_process = true; if the lsa interface
>>>> tries to use the netlogon named pipe.
>>> +	/*
>>> +	 * By default don't force into a single process, but if
>>> any
>>> +	 * interface on this endpoint on this service uses handles
>>> +	 * (most do), then we must force into single process mode
>>> +	 */
>>> +	if (use_single_process == true) {
>>> +		ep->use_single_process = true;
>>>  	}
>>> I'll expand the comment to explain that this more than covers lsa
>>> over
>>> netlogon, it also covers the ncalrpc case (all on one socket),
>>> which
>>> also remains a single process.
>> Done.
>>>> And we may need to use DLIST_ADD_END() instead of
>>>> DLIST_ADD() in some places and also check
>>>> ep->use_single_process if find_endpoint() found an endpoint.
>>> Why do we need to change this?  What specifically do you want
>>> changed.
>>>  I need, sadly, more detail than 'in some places'. 
>>> If find_endpoint() finds an endpoint, then we use that, unless we
>>> are
>>> over TCP/IP (previously all bound to a single port) and only if we
>>> are
>>> not forced to single process mode. 
>>>> Before calling dcesrv_add_ep() we may reset e->use_single_process
>>>> if this_model_ops == model_ops.
>>> Correct, we reset e->use_single_process to true if any process on
>>> the
>>> endpoint needs single process mode.  This happens most of the time.
>> I've expanded the comments about this behaviour.
>>>> Maybe it would be good to have (temporary) DEBUG() statements
>>>> which print endpoint details and all available interfaces
>>>> in dcesrv_add_ep(). If we would have such code I'd like to
>>>> see the output of it in various situations, with all combinations
>>>> of possible process models and lsaovernetlogon=yes/no.
>>>> This would it much easier to judge if I can be happy with the
>>>> logic you've implemented.
>>> You can see that the lsa over netlogon works because we have
>>> explicit
>>> tests added for that.  If adding lsa to \pipe\netlogon it wasn't
>>> controlling the process model for netlogon, it wouldn't
>>> succeed/fail
>>> when changing this option.
>>> We can just drop that lsa over netlogon over NP part for now if it
>>> really bothers you that much.  The important client (winbindd)
>>> prefers
>>> ncacn_ip_tcp anyway. 
>>>>>  - I've changed to specifying flags, not just the use of
>>>>> handles
>>>>>  - I've explained why the dcerpc interface should and does
>>>>> declare
>>>>> about the use of handles, not the ignoring of association
>>>>> groups,
>>>>> because per your request, only when all interfaces on an
>>>>> endpoint
>>>>> don't
>>>>> use handles, do we ignore association groups.  (And the DCE/RPC
>>>>> server
>>>>> could track this some other way if it wanted). 
>>>>>  - I've explained that unused records are removed in the same
>>>>> way
>>>>> as
>>>>> the old code, on startup.
>>>> That's not true. You removed memcache_delete() without a
>>>> replacement
>>>> and challenge_cache_fetch() is used if the client actually
>>>> switches the connection between netr_ServerReqChallenge()
>>>> and netr_ServerAuthenticate(), otherwise we'll use
>>>> dce_call->context->private_data to get the challenge.
>>> Thanks.  I'll add some tests to cover that, as it has some security
>>> properties in the case where we don't enforce schannel.  We have to
>>> remove it even if we didn't fetch it, so removing it on the
>>> challenge_cace_fetch() isn't enough.  Otherwise someone could re-
>>> use
>>> an
>>> observed challenge (they probably couldn't do very much with the
>>> result, but it wouldn't be good).   
>> I've added important tests for this, which has shown up some very
>> important bugs. 
>>>> I also guess some of the new includes are not required
>>>> anymore, at least "replace.h" comes already via
>>>> "includes.h" and #include "system/filesys.h" should
>>>> follow includes.h.
>>>> And I think netlogon_cache_entry should be moved to
>>>> schannel.idl next to netlogon_creds_CredentialState.
>>> OK. 
>> Done.
>>>>>  - And, as you have seen from Garming, we have shown this makes
>>>>> a
>>>>> significant improvement to the use case:  high-load 802.1x
>>>>> (NTLM)
>>>>> authentication via winbind on domain members. 
>>>>> We have a customer who suffers significant load on the DCs
>>>>> after
>>>>> a
>>>>> wifi
>>>>> outage, because the DC needs to suddenly re-authenticate all
>>>>> the
>>>>> users.
>>>>>  That is why we are working on this. 
>>>>> I've re-read your last mail, and I'm honestly at my wits end as
>>>>> what
>>>>> more you want, so in the desperate hope that we are simply in
>>>>> miscommunication, can you please make clear again what specific
>>>>> changes
>>>>> you want?  I would really like to move on from this, and lock
>>>>> it
>>>>> in
>>>>> for
>>>>> 4.6.
>>>> I also hope we'll have it in 4.6.
>>>> Thanks for your patience.
>>>> metze
>>> I'll see what I can do today, starting with using the endpoint for
>>> the
>>> association group check, and the DB delete.
>> I hope that what I attach here meets with your approval.  In
>> particular, I've added a test to cover bug 11291 (netapp challenge
>> reuse). 
> I've fixed the extra #include lines and fixed another test to run
> without "allow nt4 crypto". 
> See attached.  

I've pushed the torture tests alone as I want to make sure we already
pass them (as we should) with the current code.

I'd push the "rpc_server:netlogon Move from memcache to a tdb cache"
patch with the following additions squashed:

Then I'll push the "lsa over netlogon" patches alone.

I still think we should not handle DCESRV_INTERFACE_FLAGS_HANDLES_NOT_USED
in dcesrv_handle_fetch(), if you want to add something
let dcesrv_handle_new() fail and make sure all callers
of dcesrv_handle_new() check the return.

And before calling dcesrv_add_ep() we should
set e->use_single_process = true if this_model_ops is
the same as model_ops. If someone uses samba -Msingle
e->use_single_process should always be true.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20161214/be8cf8de/signature.sig>

More information about the samba-technical mailing list