TALLOC: Discussion of destructor processing

simo <idra at samba.org>
Tue May 1 06:00:01 MDT 2012


On Tue, 2012-05-01 at 07:31 -0400, Stephen Gallagher wrote: 
> On Tue, 2012-05-01 at 15:00 +1000, Andrew Bartlett wrote:
> > On Mon, 2012-04-30 at 13:38 -0400, Stephen Gallagher wrote:
> > > I was having a discussion with Pavel Březina about talloc destructors
> > > and he made me aware of the underlying algorithm used to process them
> > > during a nested free operation.
> > > 
> > > So, to summarize quickly, the behavior is as follows (pseudo-code):
> > > 
> > > if (destructor(ctx) == -1) {
> > >      return -1;
> > > }
> > > free_children(ctx); /* This recurses */
> > > free(ctx);
> > > 
> > > 
> > > I see several problems with this approach.
> > > 
> > > 1) The results of the destructors for the children are not considered
> > > when determining whether to report success or failure for the
> > > talloc_free(). This means that if a destructor anywhere in the
> > > hierarchy below the top-most context fails, its memory is silently
> > > left unfreed, producing a memory leak (with probably unreachable
> > > memory).
> > 
> > This is an interesting issue.  In general, it is a really bad idea for a
> > destructor to fail (this isn't made very clear in the docs however). 
> 
> 
> Perhaps I was putting the cart before the horse with my initial
> comments. I suppose my first question really should be: "Why do we allow
> destructors to fail?" In the current code, there is really no sane way
> to deal with a failed destructor at all. It might make sense, then, to
> simply turn destructors into void functions and ALWAYS free memory
> after they've completed.
> 
> The real problem with destructors on nested contexts is that there's no
> way to report back whether any or all of the children failed their
> destructors. Returning -1 to the caller only when the top-level
> context's destructor fails is misleading at best.
> 
> I think it would be a better design pattern to remove the return value
> of destructors entirely and always assume they completed successfully.
> This would encourage people, first of all, to write their destructors
> such that they cannot fail.
> 
> Can someone show me an example of a case where handling destructor
> failure is actually relied upon in Samba?

My memory is quite bad, but I seem to remember some places where we rely
on the destructor returning a failure in order to prevent memory from
being freed. That may have been removed after the introduction of
talloc_reference(), though.

In general, having a destructor fail indeed makes little sense, as there
is no way to recover from the situation.
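
To make both points concrete, here is an untested sketch against the
public talloc API (the exact fate of the surviving child may differ
between talloc versions, but the return values are the interesting
part):

#include <stdio.h>
#include <talloc.h>

struct pinned {
        int busy;
};

/* The old "veto" idiom: returning -1 from the destructor makes
 * talloc_free() fail and keeps the memory alive. */
static int pinned_destructor(struct pinned *p)
{
        if (p->busy) {
                return -1;
        }
        return 0;
}

int main(void)
{
        TALLOC_CTX *parent = talloc_new(NULL);
        struct pinned *child = talloc_zero(parent, struct pinned);

        child->busy = 1;
        talloc_set_destructor(child, pinned_destructor);

        /* Freeing the child directly does report the veto... */
        printf("talloc_free(child)  = %d\n", talloc_free(child));

        /* ...but freeing the parent returns 0 even though the child's
         * destructor refused: the child's memory is never released and
         * the caller is never told -- Stephen's point 1. */
        printf("talloc_free(parent) = %d\n", talloc_free(parent));

        return 0;
}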

> > 
> > > 2) It strikes me as strange that we're calling the destructor before
> > > we're handling the destructors of the children. It's contrary to (at
> > > least my) expectations about the order in which destructors should
> > > execute. Freeing of memory and execution of destructors should happen
> > > from the bottom of the hierarchy up.
> > 
> > If we did it in that order, then a top-level destructor could not rely
> > on the contents of the structures that it may be accessing.  This is
> > because typically the talloc tree follows the pointer tree, and many of
> > the elements in a structure with a destructor are themselves allocated
> > children (and some of those allocations might be made with malloc/etc.
> > and cleaned up by a destructor).
> > 
> 
> This is an interesting point. I wonder whether we're encouraging a bad
> practice here, though. In order to minimize the likelihood of failure, a
> destructor really should be truly minimalistic. It should not be
> performing complex operations on child contexts, and I'd argue that a
> properly written destructor should rely ONLY on the current context
> being valid and on any components it needs living outside the memory
> context being freed.

It would be nice if we could be so rigid, but I can think of many cases
where a destructor relies on a structure being accessible, and therefore
on its children still being available.
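
A typical (made-up) example would be a destructor that logs through a
talloc-child string and closes a descriptor stored in the structure;
with bottom-up freeing, the name would already be gone by the time the
destructor runs:

#include <stdio.h>
#include <unistd.h>
#include <talloc.h>

struct connection {
        char *name;     /* talloc child of the connection */
        int fd;
};

static int connection_destructor(struct connection *c)
{
        /* If the children had already been freed bottom-up, c->name
         * would be dangling here. */
        fprintf(stderr, "closing connection %s\n", c->name);
        if (c->fd != -1) {
                close(c->fd);
        }
        return 0;
}

static struct connection *connection_new(TALLOC_CTX *mem_ctx,
                                         const char *name, int fd)
{
        struct connection *c = talloc_zero(mem_ctx, struct connection);
        if (c == NULL) {
                return NULL;
        }
        c->name = talloc_strdup(c, name);       /* child of c */
        c->fd = fd;
        talloc_set_destructor(c, connection_destructor);
        return c;
}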

However, in a 2-pass strategy where all destructors are called first and
only then all children are freed, this would be guaranteed.
Unfortunately, it would also guarantee that talloc_free() is always
slower. I guess we could optimize it, though: when a destructor is set,
walk up the parents and set a flag indicating that somewhere below there
is a child with a destructor. That would allow us to skip the double
pass when no child has a destructor and confine it to the branches where
destructors are set. It shifts the penalty of walking the branch to the
time the destructor is set, which is not too expensive, because walking
up parent by parent is arguably far fewer steps than inspecting all the
children.
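
Roughly what I have in mind, as a standalone toy sketch (this is not
talloc's real internals; all the names here are made up):

#include <stdbool.h>
#include <stdlib.h>

struct node {
        struct node *parent;
        struct node *child;             /* first child */
        struct node *next;              /* next sibling */
        int (*destructor)(struct node *);
        bool has_destructor_below;      /* some descendant has one */
};

static void node_set_destructor(struct node *n, int (*d)(struct node *))
{
        struct node *p;

        n->destructor = d;
        /* Pay the walk once, at set time; stop as soon as a parent
         * already carries the flag. */
        for (p = n->parent; p && !p->has_destructor_below; p = p->parent) {
                p->has_destructor_below = true;
        }
}

/* Pass 1: run destructors top-down, skipping flag-free branches. */
static void run_destructors(struct node *n)
{
        struct node *c;

        if (n->destructor != NULL) {
                n->destructor(n);       /* result ignored in this sketch */
        }
        if (!n->has_destructor_below) {
                return;                 /* no destructor anywhere below */
        }
        for (c = n->child; c != NULL; c = c->next) {
                run_destructors(c);
        }
}

/* Pass 2: release the memory, children first. */
static void release(struct node *n)
{
        struct node *c = n->child;

        while (c != NULL) {
                struct node *next = c->next;
                release(c);
                c = next;
        }
        free(n);
}

static void node_free(struct node *n)
{
        run_destructors(n);     /* cheap when no destructors are set */
        release(n);
}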

> > > My vision of how this should behave is as follows: We should iterate
> > > through the hierarchy tree, starting from the leaves, and work our way
> > > back up, calling destructors where they exist until either we have
> > > called all of them or one has failed. If any failed, return -1 to the
> > > top-level free. If all succeeded, THEN iterate the tree a second time
> > > to actually perform the deallocation.
> > 
> > Given that talloc_free() is rarely checked for errors, wouldn't this
> > leave even more memory potentially leaked, as any destructor anywhere
> > could abort the free of an entire tree?
> > 
> 
> Yes, it could. But now we are at least left with a complete memory
> hierarchy that we can investigate, instead of a partially-deallocated
> mess. And obviously if we changed to this approach, we'd need to be more
> consistent about checking talloc_free() for errors. This would
> necessitate a change in design patterns.

No, this would be too big a shift. All developers are used to thinking
of free() as an operation that either always works or, if it fails,
leaves nothing to be done about it. And indeed that is what it is.
Even if we were to check the return value of talloc_free(), what would
we do if it failed? At most we would print a warning and live with the
leak, because calling it again would be pointless; it would almost
certainly fail again.

So the only possibility is to make it clear that destructors should
never fail, and perhaps add a logging facility to talloc so that when a
destructor does fail we log it (perhaps with a hierarchy dump in the
logs) and just keep going.
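
Until something like that exists inside talloc itself, the caller-side
equivalent would look roughly like this (a sketch; the wrapper name is
made up, talloc_report_full() is the existing API):

#include <stdio.h>
#include <talloc.h>

/* If a destructor vetoes the free, log it with a hierarchy dump and
 * move on rather than retrying.  A real fix would live inside talloc,
 * since a vetoing child below the top context is invisible here. */
static void logging_talloc_free(TALLOC_CTX *ctx)
{
        if (ctx == NULL) {
                return;
        }
        if (talloc_free(ctx) == -1) {
                fprintf(stderr, "talloc_free() failed, leaking:\n");
                talloc_report_full(ctx, stderr);
        }
}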

> > This area is very much worth thinking about - much of what talloc does
> > is just a little bit tricky, and it has taken time to work out if the
> > way it works is really the best way in the long term.  What I mention
> > above is just some counter-points to help you think over the complexity
> > of the issue.
> 
> 
> Oh, I don't disagree at all. That's why I started this thread. I'm sure
> that I've overlooked plenty of historical decisions (I've only been
> working with talloc for about four years, I don't claim to know
> everything about it). That's also why this is a discussion instead of a
> patch as Jeremy requested.

Yeah, this is correct; before we can make a patch we need to agree on
the behavior of destructors and talloc_free().

Simo.

-- 
Simo Sorce
Samba Team GPL Compliance Officer <simo at samba.org>
Principal Software Engineer at Red Hat, Inc. <simo at redhat.com>


