the sorry saga of the talloc soname 'fix'

Sam Liddicott sam at liddicott.com
Thu Jul 9 11:31:01 GMT 2009


* tridge wrote, On 09/07/09 11:02:
> Hi Sam,
>
> My apologies for not having done a proper analysis of your proposal. 
>
> I have just tried to apply your patches to a test tree, but they fail
> to apply, and the fixups do not look trivial. Could you send me a
> clean copy of talloc.[ch] and testsuite.c for me to look at?
>   
Yes. I don't know why they failed for you, they apply cleanly to 22 June
master:
e129384d7c1df664e447186673dd107e190e2894

perhaps you need to revert you patches, but I attach the files you ask for.
> I do have some comments based on your description of the patch though,
> although whether those comments are useful depends on my correct
> interpretation of your text description of the changes. I always
> prefer to work from real code, as descriptions are rarely sufficient,
> especially with such a complex change.
>
> ok, so some wild specualation based on your description:
>
>  1) you propose that talloc_free() be allowed with a reference is
>  present, but that it should always free the 'original' or 'real'
>  parent. That is exactly the opposite behaviour of what talloc has
>  done since talloc_reference() was added, so changing it would be
>  rather a surprise to existing code. 
yes, I sadly consider that this code is already broken through the
ambiguity that is currently generally recognized, and which has the
consequences of leaving dangling references.
> The existing behaviour, before my
>  recent changes, was to remove the most recent reference. I am always
>  reluctant to completely invert existing behaviour. To prevent
>  ambiguous calls is one thing, especially when a clear warning is
>  given, but to do the exact opposite of what has been documented seems
>  to be asking for trouble. I certainly wouldn't call it a ABI (or
>  API) compatible change.
>   
I think this is not a complete inversion of the old behaviour, as I
think free cannot be called more than once (unless there was an
intevening steal). Therefore free isn't "removing the oldest parent
first" as such (which would be a complete inversion) it is instead
removing ONLY the allocating parent. free is a way for the parent to
abandon the child but let the child live off it's references.

I had never imagined that code would call talloc_free intending to free
a reference more recent that the allocating parent. The thought scares
me; however...
>  2) I think, from your description, that talloc_steal would change to
>  become a function that sometimes changes the effective reference
>  count and sometimes doesn't. 
yes, because "no_owner" doesn't count as a reference.
Also, no amount of consecutive calls to free can have a combined effect
of removing more than one reference.
> Let's look at an example:
>
>     int *p1 = talloc(NULL, int);
>     int *p2 = talloc(NULL, int);
>     int *p3 = talloc(NULL, int);
>     int *c1 = talloc(p1, int);
>     int *r1 = talloc_reference(p2, c1);
>
> at this point the effective reference count of c1 is 2. It will take 2
> free calls for it to go away.
>   
it will actually take: (one talloc_unlink(p2, c1)), and (one
talloc_free(c1) or talloc_unlink(p1, c1)).
Keep calling free won't make it go away - otherwise the r1 would become
a dangling reference which was the cause of me noticing all this.
> Now we do this:
>
>     talloc_free(r1);
>
> and if I have understood your patches correctly (a big if, given I
> can't run them!) then we end at this point with p1 having a reference
> to c1, but c1 also having a new type of pseudo-parent.
>   
yes
> At this point c1 has a effective reference count of 1, so a single
> free will get rid of it.
>   
no. A single free will do nothing, the reference will be left until p2
goes away or p2 explicitly unlinks itself; otherwise we get dangling
references and suffer the ambiguity that the "single free" would not
know which reference it was removing.
> Now we do this:
>
>     talloc_steal(p3, r1);
>
> and now, if I have understood your patches correctly, p3 becomes the
> 'real' parent, but the effective reference count of c1 has increased!
> It will now take two free operations for it to go away. 

As above, it will take (one talloc_unlink), and (one talloc_free or
talloc_unlink).

> So in total we
> will need 3 frees to get rid of this pointer if you use a steal, and
> two frees if you don't use a steal.
>   
Yes, but the mismatch isn't as it appears, the context must be from the
perspective of writing code and knowing what to write, not from the
runtime perspective object being referenced - as it is too hard for
humans to weave their minds along the code paths followed by an object.

It follows these rules:

If you talloc you must make arrangements for free (directly or implicitly)
If you steal you must make arrangements for free.

It's actually impossible/undesirable to have to detect if the object you
have (which may only be alive because it is referenced) has been
attemptedly freed. If you steal, you are "adopting" without regard to
whether or not the original parent has given up, and so you have the
obligation to free.

In regular malloc/free code transfer of owership of a pointer needs no
special treatment because the pointer itself does not track the owner.
talloc introduces talloc_steal when such ownership is transferred
because the owner is tracked.

> So that means that talloc_steal() now will, sometimes, change the
> reference count of a pointer. That is not a property it has had
> before, and it is not a desirable property I think. 
>   
I now see your objection. I think it is one of perception because you
refer to tc->parent as the "real parent" and feel that steal and free
operate on the real parent.

I tried to cover this in the talloc "reframing".

Consider that the pointer in tc-> parent has absolutely no
hierachical/tree significance, and that we enforce talloc and
talloc_steal to have to actually add the parent to tc->refs for it to be
a reference.

tc->parent merely becomes a convenience pointer in order to provide the
implicit argument for talloc_free and talloc_steal. Of course this
argument can be good for one use only! (talloc_steal replenishes it with
the new value).

That is how the ambiguity is removed while preserving the ability to use
talloc_free in all code that has the expectation of knowing who the
tc->parent is on account of the passing of ownership being implicit in
the code. This doesn't include code which says implicitly "I know who
all the references are and in what order" because this is too hard to know.

However, if tc->parent must always also be added to tc->refs we don't
actually have to go to the trouble of doing it.
> Have I interpreted the patches correctly, or should I wait until I can
> actually run the code?
>   

I think there was some confusion over repeated calls to free. I hope I
cleared these up.

Sam


More information about the samba-technical mailing list