Guidelines about fixing bugs

Stefan Metzmacher metze at
Thu Feb 2 10:06:30 UTC 2017

Hi Matthieu,

> I had a chat offline with Jeremy today (well, yesterday) and he seems to
> be OK with the idea of doing ASYNC resolution.
> In source4/libcli/resolve/dns_ex.c we are doing ASYNC DNS (for the AD
> server part) by forking a child and running getaddrinfo in it. Is
> something like that what we want to do for Winbindd (and other tools)?

I think we could use pthreadpool_tevent_job_send/recv instead of doing
a fork().
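
To illustrate the general idea (a blocking getaddrinfo() moved into a
worker thread instead of a forked child), here is a minimal sketch using
plain pthreads and a notification pipe. It deliberately does not use the
pthreadpool_tevent API; struct resolve_job, resolve_job_start() and
resolve_job_finish() are hypothetical names made up for this example, and
a real implementation would hand the job to the thread pool and get the
completion via tevent instead.

```c
/* Sketch only: blocking getaddrinfo() in a worker thread, completion
 * signalled over a pipe that an event loop could watch for readability.
 * All names below are hypothetical, not Samba API. */
#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct resolve_job {
	const char *name;        /* host name to resolve (borrowed) */
	int ai_flags;            /* e.g. 0 or AI_NUMERICHOST */
	int notify_fd[2];        /* [0]: read end for the event loop */
	pthread_t thread;
	int gai_ret;             /* getaddrinfo() return code */
	struct addrinfo *result; /* handed to the caller on finish */
};

static void *resolve_worker(void *private_data)
{
	struct resolve_job *job = private_data;
	struct addrinfo hints;
	char done = 1;
	ssize_t n;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = job->ai_flags;

	/* The blocking call runs here, off the main thread. */
	job->gai_ret = getaddrinfo(job->name, NULL, &hints, &job->result);

	/* Wake up whoever is watching the read end of the pipe. */
	n = write(job->notify_fd[1], &done, 1);
	(void)n;
	return NULL;
}

/* Start a lookup; returns NULL on failure. */
static struct resolve_job *resolve_job_start(const char *name, int ai_flags)
{
	struct resolve_job *job = calloc(1, sizeof(*job));

	if (job == NULL) {
		return NULL;
	}
	job->name = name;
	job->ai_flags = ai_flags;
	if (pipe(job->notify_fd) != 0) {
		free(job);
		return NULL;
	}
	if (pthread_create(&job->thread, NULL, resolve_worker, job) != 0) {
		close(job->notify_fd[0]);
		close(job->notify_fd[1]);
		free(job);
		return NULL;
	}
	return job;
}

/* Call once notify_fd[0] is readable. Returns the getaddrinfo() code;
 * on success the caller owns *res and releases it with freeaddrinfo(). */
static int resolve_job_finish(struct resolve_job *job, struct addrinfo **res)
{
	int ret;

	assert(job != NULL);
	pthread_join(job->thread, NULL);
	close(job->notify_fd[0]);
	close(job->notify_fd[1]);
	ret = job->gai_ret;
	*res = job->result;
	free(job);
	return ret;
}
```

The caller would add job->notify_fd[0] to its event loop and call
resolve_job_finish() from the readable handler, so the main loop never
blocks on DNS.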

We could also use pthreadpool_tevent_job_send/recv to wrap krb5 and/or
gssapi calls. Together with the send_to_kdc/send_to_realm hooks available
in Heimdal and recent MIT versions, we could build a fully async
gensec_gssapi module instead of having the semi-async mess in
smb_krb5_send_and_recv_func_int().


> On 02/01/2017 02:22 AM, Andreas Schneider wrote:
>> On Wednesday, 1 February 2017 01:18:17 CET Matthieu Patou wrote:
>>> Hello All,
>>> We have been witnessing some issues related to the way Winbindd does DNS
>>> lookups of DCs for various services (ldap, kdc, ...)
>>> It mainly boils down to this bug:
>>>, that is to say DNS
>>> resolution of names from the SRV records is done sequentially;
>>> when there is a combination of a slow DNS server and a huge network of
>>> DCs, the resolution can take so long (and is made worse by other bugs)
>>> that clients time out.
>>> While looking at this issue I've found a couple of inefficiencies like
>>>, where the refresh_usn
>>> function would be called multiple times in parallel, or
>>>, where the get_dc_name
>>> is called twice almost back to back. Given that this function
>>> causes the SRV records for _kerberos to be looked up and the names to be
>>> resolved, it basically doubles the time it takes to do refresh_usn,
>>> which is kind of a big deal when DNS is slow and the number of DCs is huge.
>>> I'm not as familiar as Volker, Metze or Guenther with this part of the
>>> code base, so I would appreciate it if one of you (or more) could
>>> chime in on the high-level solutions proposed in those bugs so that
>>> Pradeep, Ravindra or I do the work in the right direction.
>> Uri has been working in this area lately. I've discovered this issue during a
>> 'net ads join' too. We get the list of DCs (e.g. 200) and then we resolve each
>> DC name to an IP address. Depending on the network this can take several
>> minutes. We only need one IP to resolve; we normally only talk to one in the
>> list. Fixing this means rewriting the whole logic and making sure everything
>> is site-aware.
>> Maybe Uri can give some more pointers; he implemented site support.
>> Cheers,
>> 	Andreas
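
The point quoted above (only one of the DC names needs to resolve) can be
sketched as a "first success wins" lookup: start all resolutions
concurrently and take the first name that yields an address. This is only
an illustration with plain pthreads and one thread per name;
resolve_first() and the race_* types are hypothetical names, and it does
not address site-awareness or limit the number of threads.

```c
/* Sketch only: resolve several names concurrently and report the first
 * one that succeeds. All names below are hypothetical, not Samba API. */
#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

struct race_state {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int pending; /* lookups still running */
	int winner;  /* index of the first name that resolved, or -1 */
};

struct race_lookup {
	struct race_state *state;
	const char *name;
	int index;
	int ai_flags;
};

static void *race_worker(void *private_data)
{
	struct race_lookup *l = private_data;
	struct addrinfo hints, *res = NULL;
	int ret;

	memset(&hints, 0, sizeof(hints));
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = l->ai_flags;
	ret = getaddrinfo(l->name, NULL, &hints, &res);
	if (ret == 0) {
		freeaddrinfo(res); /* the sketch only reports which name won */
	}

	pthread_mutex_lock(&l->state->lock);
	if (ret == 0 && l->state->winner < 0) {
		l->state->winner = l->index;
	}
	l->state->pending--;
	pthread_cond_signal(&l->state->cond);
	pthread_mutex_unlock(&l->state->lock);
	return NULL;
}

/* Returns the index of the first name that resolved, or -1 if none did. */
static int resolve_first(const char **names, int count, int ai_flags)
{
	struct race_state st;
	struct race_lookup *lookups;
	pthread_t *threads;
	int i, started = 0, winner;

	assert(names != NULL);
	lookups = calloc(count, sizeof(*lookups));
	threads = calloc(count, sizeof(*threads));
	if (lookups == NULL || threads == NULL) {
		free(lookups);
		free(threads);
		return -1;
	}
	pthread_mutex_init(&st.lock, NULL);
	pthread_cond_init(&st.cond, NULL);
	st.pending = 0;
	st.winner = -1;

	for (i = 0; i < count; i++) {
		lookups[i] = (struct race_lookup){ &st, names[i], i, ai_flags };
		pthread_mutex_lock(&st.lock);
		st.pending++;
		pthread_mutex_unlock(&st.lock);
		if (pthread_create(&threads[started], NULL,
				   race_worker, &lookups[i]) != 0) {
			pthread_mutex_lock(&st.lock);
			st.pending--;
			pthread_mutex_unlock(&st.lock);
			continue;
		}
		started++;
	}

	/* Wait until one lookup succeeded or all of them finished. */
	pthread_mutex_lock(&st.lock);
	while (st.winner < 0 && st.pending > 0) {
		pthread_cond_wait(&st.cond, &st.lock);
	}
	winner = st.winner;
	pthread_mutex_unlock(&st.lock);

	/* The answer is known at this point. For simple cleanup the sketch
	 * still joins the stragglers; real code would return right away
	 * and reap the remaining lookups later. */
	for (i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
	}
	pthread_cond_destroy(&st.cond);
	pthread_mutex_destroy(&st.lock);
	free(threads);
	free(lookups);
	return winner;
}
```

With slow DNS, the wait ends as soon as the fastest successful lookup
returns, rather than after the sum of all lookups as in the sequential
case.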
