the sorry saga of the talloc soname 'fix'

ronnie sahlberg ronniesahlberg at gmail.com
Wed Jul 8 08:27:00 GMT 2009


> Before I explain, some points must be acknowledged:
>
>   1. There exists much code that uses talloc_steal/talloc_free
>      ambiguously.
>   2. This code needs fixing anyway.
...
> These are the features of the solution:
>
>   1. To make talloc_steal and talloc_free unambiguous

There is nothing inherently ambigous with talloc_steal and talloc_free.
I use them a lot in CTDB and there is aboslutely no need to fix
anything regarding these functions or their use there.

These functions are NOT ambiguous by themself. They are only ambiguous
when they are used together with _reference/unreference. Never else.


ronnie sahlberg



On Wed, Jul 8, 2009 at 6:15 PM, Sam Liddicott<sam at liddicott.com> wrote:
> Although the soname debate is important (and how we handle soname
> bumping and ABI changes), I want to draw some attention back to the
> talloc fix.
>
> I believe these comments will also provide some relief for the immediate
> soname problem.
>
> * Andrew Bartlett wrote, On 07/07/09 08:10:
>>
>> I think Volker is right, that given we can easily (for some definition
>> of easily) avoid changing the ABI, then it's an easier way out of this
>> mess.  Had you proposed a more radical solution, there would perhaps
>> have been less resistance :-)
>>
>> But I wonder if we really have preserved the ABI.  Given that
>> talloc_free() and talloc_steal() now does a different thing, won't it
>> now allow a use-after-free?  This would be an ABI change (even if not a
>> signature change), and justify the change.
>>
>> This is more than just a cleanup of possible memory leaks, isn't it?
>>
>
> Andrew; you make a good point.
>
> This change only means that talloc_free merely has a different conflict
> with talloc_reference.
>
> Before this change talloc_reference/talloc_unreference had to be avoided
> for fear of runtime talloc errors.
>
> After this change, talloc_free must be avoided for the same fear.
>
> I don't know about others views, but I don't think this is any kind of fix.
>
> talloc_free is not safe to use, so we may as well get rid of it
> altogether and thus fully justify the .so version bump.
>
>
> However I have a suggestion, which if followed will avoid this problem,
> and the immediate issues of this "sorry soname saga".
>
> I really really think it is the solution to the current drawn out debate.
>
> Really really.
>
> As no-one has /shown/ any evidence of understanding what I've said
> before or of how it helps or even if it makes a difference, I'm offering
> a solution here. (The only objection I ever had to it was from tridge
> who felt that it was no different from what we had before).
>
> Please make the effort to understand what I suggest here, and then
> please make a judgement on whether or not this true talloc tip will
> solve the whole problem including the sorry soname saga in this instance.
>
> [From my own point of view I'm going to let this go now, but it concerns
> me to see this needless long debate, and a new talloc which I cannot
> ship with it's current talloc_free restrictions]
>
> *The true talloc tip...*
>
> Before I explain, some points must be acknowledged:
>
>   1. There exists much code that uses talloc_steal/talloc_free
>      ambiguously.
>   2. This code needs fixing anyway.
>   3. It is being fixed with the aid of tridge's patch
>
> Before I explain, these points must be understood:
>
>   1. talloc_reference, talloc_reparent, talloc_unlink are pure and holy
>      and have zero ambiguity
>   2. talloc_steal and talloc_free are said to be ambiguous (herein lies
>      the entire problem).
>
> These are the features of the solution:
>
>   1. To make talloc_steal and talloc_free unambiguous
>   2. To preserve the utility of talloc
>   3. To preserve the API signatures and ABI
>   4. To avoid a soname bump
>
> To understand the problem these points must be understood:
>
>   1. talloc_steal and talloc_free are like talloc_reparent and
>      talloc_unlink except that they have an implicit "parent" argument
>   2. the current ambiguity comes from knowing which parent to use as
>      the implicit argument
>   3. the solution must remove ambiguity of the implicit argument
>
> The solution is to:
>
>    *make the implicit argument of talloc_steal and talloc_free be the
>    original allocating parent*
>
> if we do this, then there is absolutely no ambiguity [see point 3 below]
>
> Implications of this solution
>
>   1. The solution can remove ambiguity while preserving the API
>      signatures and ABI
>      Really really
>   2. It does NOT make buggy use of talloc_free suddenly become bug free
>      Faulty use of talloc_free and talloc_steal is will still be faulty
>   3. Just because talloc_steal and talloc_free are not ambigious
>      doesn't mean your program is not ambiguous.
>      Use of talloc/talloc_free is no more error-prone than use
>      malloc/free - which doesn't say a lot, but it is a benchmark.
>      If your code is too complicated for malloc/free then you should
>      not be using talloc/talloc_free but talloc/talloc_unlink.
>   4. If you don't want the original allocating parent to have any
>      special treatment then don't use talloc_free/talloc_steal
>      you can't use them now anyway unless you can swear that no-one
>      else is taking a reference to your object
>   5. Arguably we didn't preserve the ABI as we slightly changed the
>      meaning of talloc_steal/talloc_unfree
>      We did remove an ambiguity though, one which caused many uses of
>      talloc_steal/talloc_free to be broken
>
> And, (here I really do start to repeat myself) we get all of this merely by:
>
>   1. preventing the internal promotion of reference to parent
>   2. using a "no_owner" place holder as parent when talloc_free is
>      called while references are still held. This no_owner does not
>      have the power to prevent destruction when the references all go.
>      This no_owner can be stolen from.
>
> The patches for this, including patches to the test suite can be found in:
>   http://lists.samba.org/archive/samba-technical/2009-July/065483.html
>
> The specific "no_owner" patch is the third attachment. Much old code in
> that patch is commented out rather than removed because I was
> anticipating some discussion on the way I implemented it. [I shall
> remove the commented out code in my next rebase anyway as it "works for me"]
>
> I don't intend to make these points again, but as said before, the only
> person who showed any evidence of having read the solution didn't seem
> to think it made any difference.
>
> Sam
>


More information about the samba-technical mailing list