Guidelines about fixing bugs

Stefan Metzmacher metze at samba.org
Thu Feb 2 10:06:30 UTC 2017


Hi Matthieu,

> I had a chat offline with Jeremy today (well yesterday) and he seems to
> be ok with the idea of doing ASYNC resolution.
> 
> In source4/libcli/resolve/dns_ex.c we are doing ASYNC DNS (for the AD
> server part) by forking a child and running getaddrinfo in it. Is
> something like that what we want to do for Winbindd (and other tools)?

I think we could use pthreadpool_tevent_job_send/recv instead of doing
a fork().
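
A minimal sketch of the idea, using plain POSIX threads as a stand-in for
Samba's pthreadpool_tevent API: each blocking getaddrinfo() call runs in
its own worker thread, so a slow lookup does not hold up the others and no
fork() is needed. The names struct lookup_job, lookup_worker,
resolve_parallel and free_jobs are hypothetical and do not exist in the
Samba tree.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* One lookup job: the name to resolve plus the worker's results. */
struct lookup_job {
	const char *name;      /* input: host name or numeric address */
	struct addrinfo *res;  /* output: result list, NULL on failure */
	int gai_ret;           /* output: getaddrinfo() return code */
	pthread_t thread;
	int started;           /* non-zero if the worker thread was created */
};

/* Worker: runs the blocking getaddrinfo() call off the main thread. */
static void *lookup_worker(void *private_data)
{
	struct lookup_job *job = private_data;
	struct addrinfo hints;
	struct addrinfo *res = NULL;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;

	job->gai_ret = getaddrinfo(job->name, NULL, &hints, &res);
	job->res = (job->gai_ret == 0) ? res : NULL;
	return NULL;
}

/*
 * Start one worker per name, then wait for all of them.
 * Returns the number of names that resolved successfully.
 */
static size_t resolve_parallel(struct lookup_job *jobs, size_t num_jobs)
{
	size_t i;
	size_t num_ok = 0;

	assert(jobs != NULL || num_jobs == 0);

	for (i = 0; i < num_jobs; i++) {
		jobs[i].res = NULL;
		jobs[i].gai_ret = EAI_FAIL; /* kept if the thread never runs */
		jobs[i].started = (pthread_create(&jobs[i].thread, NULL,
						  lookup_worker,
						  &jobs[i]) == 0);
	}

	for (i = 0; i < num_jobs; i++) {
		if (jobs[i].started) {
			pthread_join(jobs[i].thread, NULL);
		}
		if (jobs[i].gai_ret == 0 && jobs[i].res != NULL) {
			num_ok++;
		}
	}
	return num_ok;
}

/* Release all result lists owned by the jobs. */
static void free_jobs(struct lookup_job *jobs, size_t num_jobs)
{
	size_t i;

	for (i = 0; i < num_jobs; i++) {
		if (jobs[i].res != NULL) {
			freeaddrinfo(jobs[i].res);
			jobs[i].res = NULL;
		}
	}
}
```

In Samba itself the worker function would presumably be handed to
pthreadpool_tevent_job_send() and completion picked up through the usual
tevent_req callback and pthreadpool_tevent_job_recv(), rather than by
joining threads as this sketch does. A bounded pool would also be
preferable to one thread per name when the DC list has hundreds of entries.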

We could also use pthreadpool_tevent_job_send/recv to wrap krb5 and/or
gssapi calls. Together with the send_to_kdc/send_to_realm hooks available
in heimdal and recent MIT versions, we could build a fully async
gensec_gssapi module instead of having the semi-async mess in
smb_krb5_send_and_recv_func_int().

metze

> On 02/01/2017 02:22 AM, Andreas Schneider wrote:
>> On Wednesday, 1 February 2017 01:18:17 CET Matthieu Patou wrote:
>>> Hello All,
>>>
>>> We have been witnessing some issues related to the way Winbindd does
>>> DNS lookups of DCs for various services (ldap, kdc, ...)
>>>
>>> It mainly boils down to this bug:
>>> https://bugzilla.samba.org/show_bug.cgi?id=12533, that is to say DNS
>>> resolution of names from the SRV records is done sequentially;
>>> when a slow DNS server is combined with a huge network of DCs,
>>> the resolution can take so long (and is made worse by other bugs) that
>>> clients time out.
>>>
>>> While looking at this issue I've found a couple of inefficiencies like
>>>
>>> 	, where the refresh_usn
>>> function would be called multiple times in parallel, or
>>>
>>> https://bugzilla.samba.org/show_bug.cgi?id=12548, where get_dc_name
>>> is called twice almost back to back. Given that this function
>>> causes the SRV records for _kerberos to be looked up and the names to
>>> be resolved, it basically doubles the time it takes to do refresh_usn,
>>> which is kind of a big deal when DNS is slow and the number of DCs is
>>> huge.
>>>
>>> I'm not as familiar as Volker, Metze or Guenther with this part of the
>>> code base, so I would appreciate it if one of you (or more) could
>>> chime in on the high-level solutions proposed in those bugs so that
>>> Pradeep, Ravindra or I do the work in the right direction.
>> Uri has been working in this area lately. I've discovered this issue during a
>> 'net ads join' too. We get the list of DCs (e.g. 200) and then we resolve each
>> DC name to an IP address. Depending on the network this can take several
>> minutes. We only need one name to resolve, since we normally talk to only one
>> DC in the list. Fixing this means rewriting the whole logic and making sure
>> everything is site-aware.
>>
>> Maybe Uri can give some more pointers, he implemented site support.
>>
>>
>> Cheers,
>>
>>
>> 	Andreas
>>
>>
> 
> 
