the sorry saga of the talloc soname 'fix'

Sam Liddicott sam at liddicott.com
Thu Jul 9 13:21:59 GMT 2009


As a spoiler, Simo worked out what I was trying to say!
[cue festival music, etc]

* simo wrote, On 09/07/09 13:55:
> On Thu, 2009-07-09 at 12:31 +0100, Sam Liddicott wrote:
>   
>> I think there was some confusion over repeated calls to free. I hope I
>> cleared these up.
>>     
>
> Sam, I am sorry to say that you aren't always very clear :-)
>   
I've only just worked out what usage model of talloc is being followed
by the rest of you! (More on my enlightenment below)
> ...
>
> On Sam's proposal:
> ------------------
> ...
>
> 1. plain talloc()
> Basically you say that any new allocation is also an implicit reference,
> so when I allocate c1 on p1 then c1 has p1 as parent (which is just an
> implicit *default* reference).
>   
Yes. it could be an explicit reference, but why go to the trouble...
> 2. talloc_reference()
> Here you say that talloc_reference() is just adding a reference to an
> existing memory context. if you have talloc_reference(p2, c1) this means
> c1 has just a new reference. the only way to remove this reference is by
> calling talloc_unlink(p2, c1)
>   
Or automatically when p2 goes away (explicitly or when it's parents go
away etc)
> 2.1 Q. does talloc_free(p2) also automatically free c1 if p2 was the
> last reference for c1 but not the parent ? Or do you need to set an
> explicit destructor on p2 to call talloc_unlink(p2,c1) ?
>   
if p2 was the last reference for c1 (meaning c1's implicit reference in
tc->parent is also gone) then c1 will be freed automatically.

> 3. talloc_steal()
> For normal cases talloc_steal(p3, c1) just changes the parent and at all
> effects c1 changed ownership.
> For cases where it would happen that c1 was alive only because of
> additional references then talloc_steal ADDs a new parent.
>   
I would say talloc_steal always adds a new parent, it just doesn't
remove the old one because it's already been removed. But yes, there is
one more parent than before. The new parent is an implicit reference as
it is held in tc->parent.
> Therefore you still need to talloc_free(c1) (or more frequently just p3)
> and to talloc_unlink(p2, c1)
>   
yes. But you don't need to think about as long as each stealer knows it
has a free call to match, and each talloc knows it has a stealer or free
call to match.
> It's unlikely you are going to mix talloc_steal() and talloc_reference()
> in the same functions anyway, so the case where a talloc_steal() is
> called on a freed object is going to be rare.
>   
yes.
> 4. talloc_free()
> A free only removes the parent, it *never* promotes a reference to be
> the new parent, if there are no more references the object is actually
> really freed. If there are other references the object will not be freed
> until each referencing object is actually freed or an explicit
> talloc_unlink(ref, c1) is called.
>   
yes
> If 2.1 holds true then I see no major problem here, if "all parents" are
> freed the object is freed as well, and as long as one referencing parent
> exists the object exists
>   
yes
> 4.1 What will happen if I call talloc_free() on an object that has no
> parent ? Do we get an error ?
>   
On reflection, I think we should. If not, it would only be a "lucky
reference" that stopped what would have been a double-free.
> Problem:
> How do you identify the parent as freed? NULL can't be used as it is a
> valid parent. We may need to introduce a boolean:
> tc->parent_allocated = true/false
>   
I tried using a flag, but in the end went for a special singleton
no_owner_context owner, where no_owner_context is an internal talloc
context.
>
> Conclusions:
> ------------
>
> Your patch would change fundamentally just one thing:
> if you used to use talloc_free() to directly kill a reference then you
> may get errors/memleaks depending on the implementation.
>   
yes. And code that wasn't getting these errors or leaks might start to
get them just because other code has started taking references or is now
taking references at a different time. Which is very possible in an
asynchronous server.
> This is not going to be a huge problem in most cases and is going to be
> contained in the whereabouts of the original talloc_reference() so that
> it should be easy to *know* who is the parent to actually perform a
> talloc_unlink()
> (It may be nice to have a talloc_unreference() call perhaps to lessen
> confusion for the API users).
>   
It would be nice.
> It still means some code need to be changed, but that is equally true
> with current Tridge's patch, the difference is that changes should be
> localized around uses of talloc_reference() in most sane cases.
>
> The other *good* difference is that only the code using
> talloc_reference() will need to worry about the effects of
> talloc_reference(), 
yes
> code not using it can keep being the usual simple
> combination of talloc(), talloc_steal(), talloc_free() and no ill
> effects will be felt if some foreign code takes a reference on some
> memory object.
which is exactly what I've been doing
>  The only effect will be (as expected) that the referenced
> memory will be kept alive for some longer time.
>   
yes. The other effect might be that the destructor gets called later. If
there were non-object related destruction effects it might be
troublesome if these were called later. I think such would be bad form
anyway.
>
> Do I get this right ?
>   
You do!
I owe you a meal over which you can tell me what I might have done to
explain better...

I've been enlightened into what was going on;

talloc users hoped to do:

ref = talloc_reference(me, thing);

and then later be able to do:

talloc_free(ref);

The pointer ref is seen as having life in it's own right; I had only
seen "ref" as a being the same as "thing" (except segfaultingly NULL if
the reference failed) and therefore needing to be un-referenced rather
than freed.


I confess that I had never imagined this as it seems so unhealthy in
inherently risky.

I guess if ref were some kind of C++ auto-dereferencing pointer it could
be made to work, but now I see the reasons why talloc_free would remove
the latest reference in the hopes that it was the right one.

I see that from the point of view of this usage model it would have been
hard to work out the benefit of what I was suggesting.

I thank all those who made the effort to understand, and Simo for
succeeding, Metze and Volker and many others for letting me confuse them
repeatedly, and Tridge for writing talloc (which is pretty sweet), etc etc

Sam



More information about the samba-technical mailing list