the sorry saga of the talloc soname 'fix'

Wed Jul 8 08:15:52 GMT 2009

Although the soname debate is important (and how we handle soname
bumping and ABI changes), I want to draw some attention back to the
talloc fix.

I believe these comments will also provide some relief for the immediate
soname problem.

* Andrew Bartlett wrote, On 07/07/09 08:10:
>
> I think Volker is right, that given we can easily (for some definition
> of easily) avoid changing the ABI, then it's an easier way out of this
> mess.  Had you proposed a more radical solution, there would perhaps
> have been less resistance :-)
>
> But I wonder if we really have preserved the ABI.  Given that
> talloc_free() and talloc_steal() now does a different thing, won't it
> now allow a use-after-free?  This would be an ABI change (even if not a
> signature change), and justify the change. 
>
> This is more than just a cleanup of possible memory leaks, isn't it?
>   

Andrew; you make a good point.

This change only means that talloc_free merely has a different conflict
with talloc_reference.

Before this change talloc_reference/talloc_unreference had to be avoided
for fear of runtime talloc errors.

After this change, talloc_free must be avoided for the same fear.

I don't know about others views, but I don't think this is any kind of fix.

talloc_free is not safe to use, so we may as well get rid of it
altogether and thus fully justify the .so version bump.

However I have a suggestion, which if followed will avoid this problem,
and the immediate issues of this "sorry soname saga".

I really really think it is the solution to the current drawn out debate.

Really really.

As no-one has /shown/ any evidence of understanding what I've said
before or of how it helps or even if it makes a difference, I'm offering
a solution here. (The only objection I ever had to it was from tridge
who felt that it was no different from what we had before).

Please make the effort to understand what I suggest here, and then
please make a judgement on whether or not this true talloc tip will
solve the whole problem including the sorry soname saga in this instance.

[From my own point of view I'm going to let this go now, but it concerns
me to see this needless long debate, and a new talloc which I cannot
ship with it's current talloc_free restrictions]

*The true talloc tip...*

Before I explain, some points must be acknowledged:

   1. There exists much code that uses talloc_steal/talloc_free
      ambiguously.
   2. This code needs fixing anyway.
   3. It is being fixed with the aid of tridge's patch

Before I explain, these points must be understood:

   1. talloc_reference, talloc_reparent, talloc_unlink are pure and holy
      and have zero ambiguity
   2. talloc_steal and talloc_free are said to be ambiguous (herein lies
      the entire problem).

These are the features of the solution:

   1. To make talloc_steal and talloc_free unambiguous
   2. To preserve the utility of talloc
   3. To preserve the API signatures and ABI
   4. To avoid a soname bump

To understand the problem these points must be understood:

   1. talloc_steal and talloc_free are like talloc_reparent and
      talloc_unlink except that they have an implicit "parent" argument
   2. the current ambiguity comes from knowing which parent to use as
      the implicit argument
   3. the solution must remove ambiguity of the implicit argument

The solution is to:

    *make the implicit argument of talloc_steal and talloc_free be the
    original allocating parent*

if we do this, then there is absolutely no ambiguity [see point 3 below]

Implications of this solution

   1. The solution can remove ambiguity while preserving the API
      signatures and ABI
      Really really
   2. It does NOT make buggy use of talloc_free suddenly become bug free
      Faulty use of talloc_free and talloc_steal is will still be faulty
   3. Just because talloc_steal and talloc_free are not ambigious
      doesn't mean your program is not ambiguous.
      Use of talloc/talloc_free is no more error-prone than use
      malloc/free - which doesn't say a lot, but it is a benchmark.
      If your code is too complicated for malloc/free then you should
      not be using talloc/talloc_free but talloc/talloc_unlink.
   4. If you don't want the original allocating parent to have any
      special treatment then don't use talloc_free/talloc_steal
      you can't use them now anyway unless you can swear that no-one
      else is taking a reference to your object
   5. Arguably we didn't preserve the ABI as we slightly changed the
      meaning of talloc_steal/talloc_unfree
      We did remove an ambiguity though, one which caused many uses of
      talloc_steal/talloc_free to be broken

And, (here I really do start to repeat myself) we get all of this merely by:

   1. preventing the internal promotion of reference to parent
   2. using a "no_owner" place holder as parent when talloc_free is
      called while references are still held. This no_owner does not
      have the power to prevent destruction when the references all go.
      This no_owner can be stolen from.

The patches for this, including patches to the test suite can be found in:
   http://lists.samba.org/archive/samba-technical/2009-July/065483.html

The specific "no_owner" patch is the third attachment. Much old code in
that patch is commented out rather than removed because I was
anticipating some discussion on the way I implemented it. [I shall
remove the commented out code in my next rebase anyway as it "works for me"]

I don't intend to make these points again, but as said before, the only
person who showed any evidence of having read the solution didn't seem
to think it made any difference.

Sam