Musings on talloc and talloc_free

Mon Jan 12 23:12:06 GMT 2009

Thanks for taking to time to point out where I was unclear.

Volker Lendecke wrote:
> On Mon, Jan 12, 2009 at 04:22:27PM -0000, Sam Liddicott wrote:
>   
>> 1. talloc free should never be called more than once per allocation,
>> because it becomes to hard to be sure who the second owner is.
>>     
>
> Wait a second -- how is it possible to have two owners
> assuming nobody calls talloc_reference()? 
It's not - if you make that assumption. See below about samba4 talloc_free
> I must have
> completely misunderstood talloc_steal (or the s3 version of
> it, talloc_move). I always thought it transfers ownership
> and does not duplicate it.
>   
I was referring to cases where there was a reference.
>> 2. that talloc_steal should never happen secretly but be a documented
>> part of whatever function invokes talloc_steal() as part of it's operation.
>> 3. talloc_steal should never be directly called after a talloc_free (OK
>> internal to talloc)
>>     
>
> Again -- doing *anything* to a pointer after having called
> talloc_free on it is just a plain bug.
>   
Unless the thing that does it knows there is a talloc_reference (it 
probably took the reference).
>> 4. When talloc_free is called after talloc_steal it had better be in the
>> context of the new owner - to be safe, why not just use talloc_unlink to
>> prove it?
>>     
>
> I don't get what you mean by "in the context of the new
> owner". You call talloc_free() on a pointer. Where
> is the context??
>
>   
That darn word "context"; sorry. I didn't mean "mem_ctx" but code 
context. If module B steals a pointer from module A, then module B can 
call talloc_free. Module A *could* call talloc_free if it did it with 
permission of the module B author (maybe in a callback or something). 
Thats what I meant by "in the context of the new owner" - in a way that 
the author of the stealing module planned for, officially acting in a 
sanctioned way.
>> 1. talloc_free (when not invoked internally) should raise an error if
>> called more than once on the same allocation. Any code that does this
>>     
>
> Calling talloc_free on a pointer twice just can't happen.
> It's a bug. Period. That's why we have the TALLOC_FREE macro
> in Samba 3. It NULLs out the pointer so that you get the
> nice core dump error message on the second talloc_free call.
>
>   
In samba4, talloc_free takes the top reference and removes it as a 
reference and makes it the new owner.

Thus if talloc_free has been called once on an allocation that also has 
references, the second owner will be one of the references.

If there was only one reference, then after the first call to 
talloc_free there will be no references and just a new owner. If 
talloc_free is called again, then the allocation will be freed.

I'm suggesting that it should not be legitimate to call talloc_free a 
second time on the grounds that it will be too hard to track who the 
second owner will be, and in cases where it seems simple to track it is 
a brittle solution as in future revisions of software other references 
may also be taken which break that expectation.

If samba3 is different in this respect, then it would explain your 
conclusion that talloc_free cannot be mixed with talloc_reference
>> "legitimately" because it "knows" who the second reference is, is both
>>     
>
> You said on irc that talloc_reference is just innocent. From
> that I understood that you don't have to use it to generate
> all the hell you describe here. So I don't really know where
> the second reference you describe comes from, unless I've
> completely misunderstood talloc_steal so far.
>   
Talloc_reference is involved, but as a victim, not a guilty party.

It is safe to mix talloc_reference and talloc_unlink with the two bold 
brothers talloc_free and talloc_steal, and the restriction that comes is 
not on use of talloc_reference but on use of talloc_free and 
talloc_steal. I think this is a more natural view of the problem, 
although blaming talloc_reference is not entirely unreasonable.
> Can you show us some small pieces of sample code that
> demostrate your points?  

They are slightly contrived and demonstrate the error cases rather than 
best use of talloc.

1. bad use of talloc_free after talloc_steal.
if filecache_add did a talloc_reference instead then this would be OK. 
(Of course a temporary mem_ctx might be a better idea if this were not 
so contrived)

struct fileinfo *fi = talloc_zero(mem_ctx, struct fileinfo);
/* filecache_add does a talloc_steal */
filecache_add(ntvfs->fileinfo_cache, fi);
talloc_free(fi);
/* Now the fileinfo cache has a duff entry */
return;

2. bad use of talloc_steal after talloc_free

struct thing* t=talloc_zero(ctx, struct thing);
/* this takes a reference */
stick_in_some_cache(queue, t);
use_thing_til_we_had_enough_right_now(t);
talloc_free(t); /* still in cache */
/* NOTE: Now anything that uses the cache item can talloc_reference all 
it likes but must NOT call anything which does a talloc_steal or the 
cache reference which is now the owner will get replaced by the stealing 
context and when that is free'd the cache will contain a bad entry.
*/

The way to let references work safely is to
1. forbid talloc_free or talloc_steal to be called if talloc_free has 
ever been called;
2. and to be careful not to call talloc_free assuming the wrong owner on 
something that could have been stolen.

The first constraint can be coded into talloc, the second can be done by 
not calling talloc_free, but talloc_unlink.

But what if someone wants to call talloc_steal on something that was 
freed after references were taken? It can be unfair to punish the caller 
just because it did not know the entire history of the allocation.

IDEA: Perhaps talloc_free should not move a reference to become an owner 
as it does now, but set the owner to NULL or to some special "no-owner" 
context, with a flag to say that there actually is no owner. Then 
talloc_steal would be safe to use on something that was free'd and the 
flag could be cleared. This flag would need to be checked when the last 
reference were removed so that the allocation could actually be freed.

The only code at risk would be current code that calls talloc_free more 
than once because it thinks it knows who the new owner is, but that is 
bad code and deserves an error on a twice-called free.

I like the idea and it can be coded pretty much by just an extra flag 
and a spare context. I may try it. It means that owners and references 
are separate things.

Sam