talloc quiz, and: dangling talloc references and inconsistencies in the talloc model

Thu Jan 22 10:42:35 GMT 2009

* tridge wrote, On 16/01/09 00:19:
> Hi Sam,
>
> First off, I should say that talloc_reference() is definitely the
> trickiest part of talloc, and that I often try to avoid it. I've also
> been quite tempted to change talloc to fail explicit talloc_free()
> calls when a pointer has a reference and to require that
> talloc_unlink() be used in that case. An explicit talloc_free() just
> doesn't give enough information to the talloc library to always do the
> 'right thing'.
>
> That said, I don't agree with the proposed solution (though perhaps
> you can convince me). I deliberately chose the current semantics for a
> good reason.
>
> To keep it concrete, let's look at the example you added to the test
> suite:
>
> 	void *root, *p1, *p2, *ref, *r1;
>
> 	root = talloc_named_const(NULL, 0, "root");
> 	p1 = talloc_named_const(root, 1, "p1");
> 	p2 = talloc_named_const(p1, 1, "p2");
> 	/* Now root owns p1, and p1 owns p2 */
>
> 	r1 = talloc_named_const(root, 1, "r1");
> 	ref = talloc_reference(r1, p2);
>
> This is the setup you are concerned about. You now worry about the
> difference from the point of view of r1 between talloc_free(p1) and
> talloc_free(p2). I'd like to expand that to a 3rd case for you to
> consider, which is talloc_free(r1).
>
> What talloc is really trying to simulate with references is the
> ability for a pointer to have two parents. The only difference between
> the two parents of p2 is that one was established earlier than the
> other. The fact that internally talloc considers one to be an 'owner'
> and the other a 'reference' is supposed to be hidden as far as
> possible from the programmer. Unfortunately it isn't completely
> hidden.
>
> So from that point of view the memory tree looks like this:
>
>                            root
>                             / \
>                            /   \
>                           /     \
>                          /       \
>                         r1       p1
>                           \      / 
>                            \    /
>                             \  /
>                           (p2,ref)
>
> Notice that I've labelled the bottom pointer with two names. The value
> of p2 is guaranteed to be the same value as ref, so they are the same
> pointer. When p2 or ref is passed to a function we have no way to
> distinguish which is being used (as it's the same value).
>
> So let's look at the 3 cases and try to work out the intent of the
> programmer in each case.
>
>   1) talloc_free(r1). The intent in this case is very clear. The
>   programmer is destroying the tree starting at r1, which means we
>   should end up with this:
>
>                            root
>                               \
>                                \
>                                 \
>                                  \
>                                  p1
>                                  / 
>                                 /
>                                /
>                           (p2,ref)
>
>   2) talloc_free(p1). The intent in this case is also clear. The
>   programmer is destroying the tree starting at p1, which means we
>   should end up with this:
>
>                            root
>                             /  
>                            /    
>                           /      
>                          /        
>                         r1         
>                           \        
>                            \     
>                             \   
>                           (p2,ref)
>
>   3) talloc_free(p2). This is the tricky one. There is no way to
>   distinguish this from talloc_free(ref), so we have to choose one of
>   the two above approaches. The approach I chose in talloc was that
>   the most recent parent should be removed. This is because with no
>   way to distinguish what the programmer wanted I needed some
>   consistent rule to use, and that is the most logical rule I could
>   think of and it made sense to me in terms of the common nesting used
>   with references. 
I think this is a good assumption, but I think it is less likely to hold
true in a system of asynchronous modules (like vfs layers), where interest
in structures isn't necessarily nested and where it is nearly impossible
to test at runtime whether a particular use will be safe.
> That gives us this:
>
>                            root
>                             / \
>                            /   \
>                           /     \
>                          /       \
>                         r1       p1
>                                  / 
>                                 /
>                                /
>                           (p2,ref)
>
>
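The three cases above can be sketched with a small toy model. This is not
talloc and does not use the talloc API: struct toy_chunk and the toy_*
functions are hypothetical names invented for illustration, and the model
only tracks the names of a chunk's parents, oldest first. It assumes the
rule described above, namely that an explicit free of a chunk with more
than one parent removes the most recently established parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PARENTS 4

/* Hypothetical stand-in for a chunk such as p2: parent names, oldest first. */
struct toy_chunk {
	const char *parents[TOY_MAX_PARENTS];
	size_t nparents;
	bool alive;
};

/* Create the chunk with its first parent (the 'owner'), e.g. "p1". */
static void toy_init(struct toy_chunk *c, const char *owner)
{
	c->parents[0] = owner;
	c->nparents = 1;
	c->alive = true;
}

/* Add a later parent, playing the role of a reference from e.g. "r1". */
static bool toy_add_parent(struct toy_chunk *c, const char *parent)
{
	if (!c->alive || c->nparents == TOY_MAX_PARENTS)
		return false;
	c->parents[c->nparents++] = parent;
	return true;
}

/* Cases 1 and 2: a named parent is destroyed, so only that link goes away. */
static void toy_parent_freed(struct toy_chunk *c, const char *parent)
{
	for (size_t i = 0; i < c->nparents; i++) {
		if (strcmp(c->parents[i], parent) == 0) {
			memmove(&c->parents[i], &c->parents[i + 1],
				(c->nparents - i - 1) * sizeof(c->parents[0]));
			c->nparents--;
			break;
		}
	}
	if (c->nparents == 0)
		c->alive = false;
}

/* Case 3: explicit free of the chunk itself. No parent is named, so the
 * most recently added parent is dropped. */
static void toy_free(struct toy_chunk *c)
{
	assert(c->alive);
	c->nparents--;
	if (c->nparents == 0)
		c->alive = false;
}
```

In this model, a chunk set up with toy_init(&p2, "p1") and
toy_add_parent(&p2, "r1") is left with the single parent "p1" after
toy_free(&p2), which matches the last diagram. Calling
toy_parent_freed(&p2, "p1") instead leaves only "r1", and
toy_parent_freed(&p2, "r1") leaves only "p1".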
> The test_implicit_explicit_free() test checks on something quite
> different, and ignores the "two parent" view of talloc references. I
> think it is looking for consistency in the wrong way, and ignores the
> fact that talloc_free(p2) is the same call as talloc_free(ref). 
I see why you say this. In fact, the test considers talloc_free unsafe
for exactly the reasons you give above.
> It
> also increases the exposure to the programmer of the idea of who is
> the 'owner' of a pointer, thus reducing the illusion talloc tries to
> create of having a true multi-parent tree structure.
>
> btw, the 'quiz' questions you posted don't make any sense to me, as
> they don't explain what the functions save_for_later() and
> sneak_final_look() do. If you give some explicit, runnable, code then
> it would make more sense to me. Just saying "you got everything right"
> doesn't tell me what the code actually does. There could well be bugs
> in talloc, but I'd need test cases to confirm that it really is a bug.
>
> Cheers, Tridge
>