winbindd main process hangs on samba-dc

Isaac Boukris iboukris at gmail.com
Tue Sep 8 20:20:08 UTC 2020


On Tue, Sep 8, 2020 at 9:13 PM Jeremy Allison <jra at samba.org> wrote:
>
> On Tue, Sep 08, 2020 at 09:05:30PM +0200, Isaac Boukris wrote:
> > On Tue, Sep 8, 2020 at 8:59 PM Jeremy Allison <jra at samba.org> wrote:
> > >
> > > On Tue, Sep 08, 2020 at 08:56:35PM +0200, Isaac Boukris via samba-technical wrote:
> > > > Hi,
> > > >
> > > > This issue was initially reported on ipa-dc, but I'm able to somewhat
> > > > reproduce it in the lab with samba-dc by dropping the TCP packets returned
> > > > by a DC of a trusted domain (iptables -A INPUT -p tcp -s 192.168.0.120 -j
> > > > DROP).
> > > >
> > > > As you can see in the attached log, the main winbind process goes into
> > > > blocking DC calls such as get_sorted_dc_list(), and depending on the
> > > > amount of DCs to try, it may cause clients (such as wbinfo -p, or more
> > > > importantly, smbd!) to hang for minutes and to timeout.
> > > >
> > > > Here for instance, we block for 5 seconds per DC:
> > > > [2020/09/08 20:27:49.595952,  3, pid=66128, effective(0, 0), real(0,
> > > > 0)] ../../source3/lib/util_sock.c:447(open_socket_out_send)
> > > >   Connecting to 192.168.0.120 at port 445
> > > > [2020/09/08 20:27:49.601764,  3, pid=66128, effective(0, 0), real(0,
> > > > 0)] ../../source3/lib/util_sock.c:447(open_socket_out_send)
> > > >   Connecting to 192.168.0.120 at port 139
> > > > [2020/09/08 20:27:54.603044, 10, pid=66128, effective(0, 0), real(0,
> > > > 0), class=winbind]
> > > > ../../source3/winbindd/winbindd_cm.c:1712(find_new_dc)
> > > >   find_new_dc: smbsock_any_connect failed for domain ACOM address
> > > > 192.168.0.120. Error was NT_STATUS_IO_TIMEOUT
> > > >
> > > > On a member machine i couldn't trigger it as it seems the
> > > > get_sorted_dc_list is done in the per-domain process (as well as the
> > > > call to fork_child_dc_connect()), while here it happens in the main
> > > > process.
> > > >
> > > > Any ideas?
> > >
> > > What version of Samba is this ?
> > >
> > > I may have already fixed this in master with
> > > the async DNS SRV record -> A/AAAA lookup
> > > changes.
> >
> > git master, in this test i only block tcp packets btw.
>
> OK, so we should be getting a good list in a reasonable time.
> Looking at the smbsock_any_connect() that should be pinging
> a new DC every second, and timing out in total after 10
> seconds.
>
> Can you add DEBUG to print out the number of DC's you
> get back from get_sorted_dc_list(), and the timings
> inside find_new_dc() ?

Attached is an updated log file with an XXX printout before and after the
call to find_new_dc(). The calls occur about once a minute and hang for
5 seconds per DC (I only have one DC in this test, but multiple DCs and
multiple trusts could easily exceed the ~20 second timeouts of smbd
clients).

[2020/09/08 22:10:49.862339,  1, pid=101776, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd_cm.c:1988(cm_open_connection)
  XXX entering find_new_dc()
...
[2020/09/08 22:10:54.875914,  1, pid=101776, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd_cm.c:1992(cm_open_connection)
  XXX after find_new_dc() setting offline
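
To make the scaling concern concrete, here is a rough sketch of the
arithmetic in C. It assumes every DC of every trusted domain is tried one
after another and that each attempt times out after the 5 seconds seen in
the log; that sequential behaviour is an assumption for illustration, and
the function names are hypothetical, not taken from winbindd.

```c
#include <stdbool.h>

/*
 * Estimate how long the main winbindd process could block if each DC of
 * each trusted domain is tried sequentially and every attempt times out.
 * Hypothetical helper: sequential retries are assumed, not confirmed.
 */
unsigned int worst_case_block_secs(unsigned int trusted_domains,
				   unsigned int dcs_per_domain,
				   unsigned int per_dc_timeout_secs)
{
	return trusted_domains * dcs_per_domain * per_dc_timeout_secs;
}

/*
 * Report whether the estimated block outlasts a client's own timeout
 * (roughly 20 seconds for smbd clients, per the observation above).
 */
bool exceeds_client_timeout(unsigned int block_secs,
			    unsigned int client_timeout_secs)
{
	return block_secs > client_timeout_secs;
}
```

With the single DC in this test, worst_case_block_secs(1, 1, 5) gives 5
seconds, which stays under a 20 second client timeout. Two trusts with
three DCs each would give 30 seconds under the same assumptions, which is
already past it.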
-------------- next part --------------
A non-text attachment was scrubbed...
Name: log.winbindd
Type: application/octet-stream
Size: 153527 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20200908/ca78023e/log-0001.obj>

