[PATCH] tfork fix issues with waiter process termination

Andrew Bartlett abartlet at samba.org
Sat Sep 16 18:58:45 UTC 2017

On Sun, 2017-09-17 at 06:25 +1200, Andrew Bartlett via samba-technical
> On Sat, 2017-09-16 at 07:52 -0700, Ralph Böhme via samba-technical
> wrote:
> > On Sat, Sep 16, 2017 at 10:13:37AM +0000, Gary Lockyer wrote:
> > > > Your fix for the status fd is spot on, I'd like to change the event fd fix
> > > > however. Depending on whether tfork_event_fd() has been called, ownership of the
> > > > event fd should be transferred to the caller.
> > > 
> > > That makes more sense as it will reduce the chances of file descriptor
> > > leaks.
> > > > 
> > > > Please take a look at attached patchset. The FIXUP commits are some minor
> > > > fixes. The event fd changes are in the "lib/util: only close the event_fd in
> > > > tfork if the caller didn't call tfork_event_fd()" commit.
> > > 
> > > I'm happy with the revised patch set.
> > 
> > great!
> > 
> > We also need a backport for 4.7, don't we? metze already created a bug for this
> > one, so I'm going to add the bug URL and push later on if I don't hear any
> > objections.
> > 
> > https://bugzilla.samba.org/show_bug.cgi?id=13037
> A BUG URL is always a good thing, but I'm not sure we need the backport
> critically.

Regardless I've filled in a short summary of the issue into the bug,
for clarity. 

> The reason this was never noticed is that in the current use, a sibling
> (actually cousin) worker process still holding the FD from the parent
> is quite rare.  You would get it if you ran 'samba -M single' for smbd
> and winbindd, but thankfully everything is marked close-on-exec, so no
> duplicate is found then, and the monitor process closes all FDs using a
> loop.

The biggest risk would be resolve_name_dns_ex_send(), which does a
fork().  I could imagine a race between the exit from a samba_kcc run
and a slow blocking DNS lookup in -M single.  

By luck I don't think they are likely to happen in the same process by
default (-M standard), and dns with eventually time out, but luck and
lockups are not a good combination.

> The new use case, to handle a forked-but-not-execed samba child in a
> process model re-factor is what caught this case.
> I'll let Gary tell more of the tale of how we found this, but the
> whiteboard sessions were quite epic, tracing who had what read and
> write FDs open!

The biggest risk isn't the code as-is, but something else that is
backported that uses fork(), or tfork().

So, on further consideration, BUG and backport it.  We never want to
debug this on a customer site :-)


Andrew Bartlett
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba

More information about the samba-technical mailing list