winbindd main process hangs on samba-dc

Jeremy Allison jra at samba.org
Tue Sep 8 19:13:04 UTC 2020


On Tue, Sep 08, 2020 at 09:05:30PM +0200, Isaac Boukris wrote:
> On Tue, Sep 8, 2020 at 8:59 PM Jeremy Allison <jra at samba.org> wrote:
> >
> > On Tue, Sep 08, 2020 at 08:56:35PM +0200, Isaac Boukris via samba-technical wrote:
> > > Hi,
> > >
> > > This issue was initially reported on ipa-dc, but I'm able to somewhat
> > > reproduce it in a lab with samba-dc, by dropping the TCP packets
> > > returned from a DC in a trusted domain (iptables -A INPUT -p tcp -s
> > > 192.168.0.120 -j DROP).
> > >
> > > As you can see in the attached log, the main winbind process goes into
> > > blocking DC calls such as get_sorted_dc_list(), and depending on the
> > > number of DCs to try, it may cause clients (such as wbinfo -p, or more
> > > importantly, smbd!) to hang for minutes and to time out.
> > >
> > > Here, for instance, we block for 5 seconds per DC:
> > > [2020/09/08 20:27:49.595952,  3, pid=66128, effective(0, 0), real(0,
> > > 0)] ../../source3/lib/util_sock.c:447(open_socket_out_send)
> > >   Connecting to 192.168.0.120 at port 445
> > > [2020/09/08 20:27:49.601764,  3, pid=66128, effective(0, 0), real(0,
> > > 0)] ../../source3/lib/util_sock.c:447(open_socket_out_send)
> > >   Connecting to 192.168.0.120 at port 139
> > > [2020/09/08 20:27:54.603044, 10, pid=66128, effective(0, 0), real(0,
> > > 0), class=winbind]
> > > ../../source3/winbindd/winbindd_cm.c:1712(find_new_dc)
> > >   find_new_dc: smbsock_any_connect failed for domain ACOM address
> > > 192.168.0.120. Error was NT_STATUS_IO_TIMEOUT
> > >
> > > On a member machine I couldn't trigger it, as it seems the
> > > get_sorted_dc_list() call is made in the per-domain process (as is the
> > > call to fork_child_dc_connect()), while here it happens in the main
> > > process.
> > >
> > > Any ideas?
> >
> > What version of Samba is this?
> >
> > I may have already fixed this in master with
> > the async DNS SRV record -> A/AAAA lookup
> > changes.
> 
> git master. In this test I only block TCP packets, btw.

OK, so we should be getting a good list in a reasonable time.
Looking at smbsock_any_connect(), it should be pinging
a new DC every second, and timing out in total after 10
seconds.

Can you add DEBUG to print out the number of DCs you
get back from get_sorted_dc_list(), and the timings
inside find_new_dc()?
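
To make the two timing pictures in this thread concrete, here is a rough
back-of-the-envelope model, not Samba code. It compares the behaviour seen
in the log (about 5 seconds lost per unreachable DC, one after another)
with the behaviour described above (a new connect started every second,
one overall 10 second timeout). The function names and the default values
are assumptions taken from the numbers quoted in this thread, and the model
ignores any per-connection timeout.

```python
import math


def sequential_wait(num_dcs, per_dc_timeout=5.0):
    """Worst-case block when unreachable DCs are tried one after another."""
    return num_dcs * per_dc_timeout


def attempts_started(num_dcs, stagger=1.0, total_timeout=10.0):
    """How many connects get started before the overall timeout fires.

    Assumes connects start at t = 0, stagger, 2*stagger, ... and that
    nothing new is started once total_timeout has elapsed.
    """
    return min(num_dcs, math.ceil(total_timeout / stagger))


def staggered_wait(num_dcs, total_timeout=10.0):
    """Worst-case block when connects overlap and share one overall timeout."""
    return total_timeout if num_dcs > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical DC counts, only to show how the two models diverge.
    for n in (1, 3, 6, 12):
        print(
            f"{n:2d} DCs: sequential {sequential_wait(n):5.1f}s, "
            f"staggered {staggered_wait(n):5.1f}s, "
            f"connects started {attempts_started(n)}"
        )
```

If the main process really blocks per DC, the hang grows linearly with the
length of the DC list, which is why the DC count and the timings inside
find_new_dc() are the interesting numbers to log.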



