[QUICK] talloc bugs

Mon Jun 29 03:59:17 MDT 2009

On Mon, Jun 29, 2009 at 6:37 PM, Sam Liddicott<sam at liddicott.com> wrote:
...
> As far as I can tell, those who use ref-counting get nothing but trouble,
> and it is due to the promotion of reference to owner.
>
> Tridges solution prevents this promotion by restricting use of talloc-free
> but results in run-time talloc_free failures or memory leaks, instead of the
> current run-time dangling pointers.
>
> We can prevent promotion of reference to owner by using a "no_owner" owner
> in talloc free, and thus not change the usability of talloc free.
>
> This would respect Ronnies separation of api's, and yet keep them both in a
> single pointer. Colluding code modules can use
> talloc/talloc_steal/talloc_free quite safely as they historically have done;
> and non-colluding modules can use talloc_reference/talloc_unreference on the
> same pointers.
>
> "Colluding" means where the code authors are in collusion and approve or
> anticipate each-others use - after all it would be dangerous to call
> talloc_steal if the allocator might call talloc_free behind your back.
>
> Sam
>

I think you got most of what I was trying to say but not all, since I
didnt what I really myself exactly what I felt was wrong.

Lets try again.

The current structure of talloc is a tree structure where every object
(except root) has ONE parent, and every object is a child to another
object.
A true tree structure where talloc_*() adds a leaf to the tree,
talloc_steal() moves a subtree within the tree, and talloc_free()
removes a subtree from the tree.

When you start using references, I dont think that you have a tree
structure at all anymore but just a generic graph.
Currently this is very confusingly implemented as an add-on to the
talloc()/talloc_free() semantics where all references are equal,
except  some references that are more equal (and different) and thus
to to really make talloc_free() you need to be aware of the current
sideefeccts in order to use it safely/sanely. A reference model where
you need to be aware of inner details of the reference model is imho
sub-optimal.

When using references, I dont think there is a tree at all but a pure
graph. Pretending that we still have a tree and parents et al I think
is a mistake.
The data models and the differences between a tree and a graph are
vast and they really need different APIs, if not for anything else
then for makiung it explicit for people reading the code "this object
is treated as an object in a graph" vs "this object is an object in a
tree".

In a graph I personally dont think there are any differences between
different references. They are all equal and they are all replaceable.
A consequence of this is that there is no longer a concept of a
"parent", and then and there any and all relation to a tree falls.

In a graph, there would be refcounting but you can never free an
object. The only thing that makes sense in a graph is adding and
removing edges.
Once there are no edges that connects to a certain object, that object
can safely be freed.

Example : A circular linked list. A circular linked list today with
talloc is treated as one meta-object owning the metadata for the list
och which all list elements are child objects to that linked list
metadata object. This can be argued to be a very unnatural hierarchy
for a list.
If the list instead was treated as a graph, this would more naturally
map to the actual structure of a list.
Each node in a list owns a "reference" to the next* pointer in the ring.

I.e. circular dependencies would not be a pathological cornercase but
rather the natural way to structure talloc dependencies for several
classes of objects and relations.

To continue the example, today with talloc since there are no
relations/references between the individual objects in a list, doing
things like removing an object from the list by talloc_free() is
cumbersome since you dont have any relations between the onbject you
delete and the previous object that links to you and references to
you.
You thus need explicit code for handling removal of a node entry in a
talloc linked list, that essentially needs to process the entire list
from the metadata head object all the way to the item removed.
This is due to that talloc is a tree model, not a true reference count model.
Instead if this was managed as a refcounted graph, deleting an item in
the linked list would NOT have to do the linear walk of the list from
"head" to the object to find the object to be deleted, instead it
would do something like this :
   PreviousObject has a pointer that points to Object.
   Since it has this pointer it also owns a reference to Object,
PreviousObject references Object.
   When we now delete/unlink Object, similar to how a
talloc_destructor() works we should now have a callback to
PreviousObject that says
       "Object, which you have a reference to is going away, do what
you need to do to deal with it."
   This callback for PreviousObject would then as arguments get both
PreviousObject itself and also Object as argument and could then
modify
      PreviousObject->next to point to PreviousObject->next->next
instead, thus changing the linking of the list.
      This callback would then also call
talloc_reference(PreviousObject, PreviousObject->next->next) to make
sure that the next item we now link to
      does not go away.

(The example above would require a new type of talloc function   :
talloc_destroy() that would be similar to talloc_free() but would call
the "referenced object is being destroyed" callback before actually
removing the object).

I thus think references should require a separate API that deals with
the fact they operate in a graph and not a tree.
Thus needing similar allocators as the talloc_*() ones but maybe with
a _ref postfix.
They would use talloc_reference/unreference to add remove references
from an object.
Once all references to an object are removed, the object is automatically freed.
Objects can register a "referent_destroyed" callback to be called when
an object which is referenced is freed/destroyed.
A new function talloc_destroy() can be used to delete an object with
existing references. Causing the callbacks above to be called before
the actual removal.

ronnie