[QUICK] talloc bugs

Sam Liddicott sam at liddicott.com
Mon Jun 29 02:37:03 MDT 2009


* ronnie sahlberg wrote, On 29/06/09 07:39:
>   On Mon, Jun 29, 2009 at 2:25 PM, <tridge at samba.org> wrote:
>   
>> Hi Sam,
>>
>> Thanks for bringing this to our attention again.
>>
>> As I mentioned the last time around, my preference is to keep the
>> current behaviour, except to disallow direct talloc_free() of a
>> pointer that has outstanding references. The idea is that if you free
>> something with a reference either by the reference parent or by a
>> direct parent then the intention of the caller is very clear. 

Yes, this is very clear and would solve this particular problem (but
introduce another).

Ironically, by solving problem through refusing to talloc_free if there
are references, it is therefore preventing the promotion of reference to
parent; and that (as far as I can tell) is the smallest change required
to solve the problem. We don't need to restrict the use of talloc_free.

However depending on the meaning of your clause "disallow direct
talloc_free() of a pointer that has outstanding references" this
constraint on talloc_free could lead instead to memory leaks.

   1. If "disallow" means "have no effect" then we have memory leaks as
      free fails to remove the parent because of a reference that the
      allocating code did not anticipate. When the reference is freed,
      the parent would still remain.
   2. if "disallow" means return an error that has to be dealt with,
      then it is just an exchange of complications.
   3. if "disallow" means "not actually free the allocation, just remove
      the parent" - then this is what I am suggesting.


>> The only
>> ambiguity is where you call talloc_free() on a pointer directly where
>> that pointer has a outstanding reference. In that case the talloc code
>> has no way of knowing which parent you intend to keep, and the caller
>> should instead call a function that provides that information (such as
>> talloc_unlink).
>>     

This is true if we insist that references and the parent are equal
(Ronnie argues against this, as do I).
Also, this behaviour (fall-back to talloc_unlink) would be onerous to
manage at run-time (point 2, above)

Furthermore, this question of which reference to removed could only
truly exists if a reference is promoted to parent, but
your suggested restriction on talloc_free (above) would prevent this
promotion. If a reference is not promoted, then the parent to remove is
obviously the allocating (or stealing) parent.

This is also reflected in Ronnie's model below.

>> What I don't know is whether this should be the default behaviour, or
>> if we should have an inheritable flag to set this behaviour. 

Having flags will just lead to unpredictable run-time breakage, or extra
documentation on all functions that take talloc'd pointers to identify
whether or not the pointer can be referenced or not, and much pleading
as coders beg primary authors to change the flag.

>> That
>> really depends on whether how much we are currently relying on direct
>> free of pointers with references in our current code base. I haven't
>> run that experiment.
>>     

By "direct free of pointers" do you mean the idea that the pointer
should be freed and it's destructor called instantly that free  is
called, regardless of references? Or did you mean talloc_free as oppose
to talloc_unreference ?

If the former, then I can only see that such instant direct freeing of
pointers matters if we are counting on the destructor to be called at
the point of free, something which talloc_reference seems designed to
prevent; but I think there is something to be talked about here (the
point at which the destructor [which can block free] should be called
when there are also references)

If the latter, then ignore the previous paragraph.

>> Do you have time to test this solution?
>>     

Your constraint on talloc_free of referenced objects may or may not
cause problems for samba as it stands, but I hope that we can see that
the likelyhood of memory leaks in general for other talloc-ing programs
makes it unnecessary to make a test; because even if samba passes the
test, the disallowment of talloc_free on referenced memory can result in
leaks when the reference finally goes away.


(* ronnie sahlberg wrote, On 29/06/09 07:39:)
>   
> I am not convinced using references is that useful at all.
>   

not right now :-)

>
> Really, the purpose of a referencing system is to make memory
> deallocation automatic and easy to use.
> A reference system that contains "surprises" or is hard to use
> (perhaps even harder than just doing the tracking manually) is less
> than useful.
>
>
> I think that it is a mistake and making it more complex for humans to
> use to as now really having two different reference systems in place
> but with confusion about what talloc_free() really means.
>   

I think that you are right, the meaning of talloc_free is now unclear.
I have treated it to mean mere removal of the parent, but I now see that
there are other understandings of what may mean.

> I think one would have to decide on create time what reference system
> is used and that this would affect which operation is used to finally
> free the object.
> I think an easier API would be :
> Single-Parented objects
> * normal talloc allocations as today, these allocations always have a
> single parent object and they can be reparented with talloc_"steal"
> and they are freed by talloc_free().
> *You can not add a reference to these single-parented objects.
>
> Multi-Referenced objects
> * a new set of APIs that mirror the current allocators but with a
> _ref() prefix. These create a "referenced" object.
>   Perhaps they should be created similar to
> talloc_reference(<parent>, talloc(NULL, <type>))
>   I.e. the objects are created without a parent, and instead of a
> parent they have an initial reference.
> * On objects that were created using *_ref() you can add additional
> references using talloc_reference().
> * You can not use talloc_free() on these objects, instead you always
> use talloc_unreference().
> Once the last reference is deleted, the entire object itself is
> automatically freed.
>
>
> This would keep the normal hierarical single-parented pure talloc tree
> structure and the refcounted multiparented non-tree useages using
> separate API.
> The API itself would indicate which kind of reference model is used,
> which would make it much easier for code review or when looking at
> someone elses code, since the model would be explicit from the API
> used.
>
> I think it would be a mistake to try to get the two different models
> use the same API or make one model a subset of the other, since this
> will just "confuse".
>   

You have good points here. If we follow this model then I hope that in
samba, the single parent objects would be phased out. I would hope that
all programs would only use the multi-reference objects, because sooner
or later an additional reference will need to be kept.

This deprecation of the single parent objects would be the same as
phasing out talloc_free and talloc_steal.

> In my usecases in CTDB I never use refcounting, though I du use
> talloc_steal() a lot.  The only thing wrong with talloc_steal() is its
> name since it inherently suggests one is doing something
> invalid/dodgy. It should be renamed talloc_reparent() instead

As far as I can tell, those who use ref-counting get nothing but
trouble, and it is due to the promotion of reference to owner.

Tridges solution prevents this promotion by restricting use of
talloc-free but results in run-time talloc_free failures or memory
leaks, instead of the current run-time dangling pointers.

We can prevent promotion of reference to owner by using a "no_owner"
owner in talloc free, and thus not change the usability of talloc free.

This would respect Ronnies separation of api's, and yet keep them both
in a single pointer. Colluding code modules can use
talloc/talloc_steal/talloc_free quite safely as they historically have
done; and non-colluding modules can use
talloc_reference/talloc_unreference on the same pointers.

"Colluding" means where the code authors are in collusion and approve or
anticipate each-others use - after all it would be dangerous to call
talloc_steal if the allocator might call talloc_free behind your back.

Sam


More information about the samba-technical mailing list