[QUICK] talloc bugs

tridge at samba.org tridge at samba.org
Wed Jul 1 12:06:39 GMT 2009


Hi Sam,

 > I think this debugging is really great and will make talloc debugging
 > much simpler.

yep, it made finding the ambiguous uses of talloc_free and
talloc_steal very easy, though I need to fix up the stderr hack.

 > However I don't think that Samba code was "broken", it made fair and
 > reasonable use of a fair talloc API, as other libtalloc using programs
 > will do.

I think if you look at the actual changes I made, you'll find they
were nearly all real bugs. Is there a particular one you think was
fair and reasonable that should still be allowed?

 > I don't think it is wrong to want to avoid use of talloc_free in a
 > program, but making it a run-time error to call talloc_free if a
 > yet-to-be-written loadable module might decide by race-condition to
 > keep a reference a bit longer, and thus fail talloc_free, is a way
 > of making talloc brittle and fail in hard to foresee or and hard to
 > repeat circumstances.

indeed, although the previous behaviour was also brittle. The previous
behaviour could change behaviour depending on timing. The new
behaviour is to leak memory if that race happens. I think a runtime
leak on error (with a nice big warning telling you where the bug is!)
is better than the previous ambiguous behaviour.

 > I guess parent also means reference there...
 > > If you pass in a reference as the old
 > > reference then the reference is moved to the new parent.
 > >   
 > but that bit doesn't make sense, I guess you meant that if you pass in a
 > reference as the old reference then the new reference also becomes
 > parent (and the old parent becomes a reference?)

no, I meant what I said. If old_parent in talloc_reparent() was a
reference, then it will still be a reference after the
talloc_reparent() call.

With a bit more complex code we could make it become the "real" parent
internally, but the point of these changes is to avoid the ambiguity
that stems from treating references and parents as different. It just
doesn't matter anymore, except for the pretty-printing when you
display the memory tree, and for the one function talloc_parent().

The aim is to make a reference and a parent the same thing as far as
the public API is concerned. Which a pointer is internally should not
matter.

 > I refer you to the re-framed talloc, below; there is no need to fail
 > with an error.

You mentioned in a previous mail that you have been running with this
new model for a while - can you post the patch you are using? I know
you posted a patch a few months back during the previous discussion,
but if I remember correctly that patch was missing some basic
behaviour (I can't remember exactly what). Have you resolved the
problems?

 > The simplest way to fix it is not to have to fix it (samba) at all. We
 > only have to fix it because of this darned who-knows-why insistence on
 > promoting reference to parent while simultaneously admitting through the
 > API that references and parents are not equal and therefore actually
 > admitting that such promotion is unwise but doing it anyway.

Sam, please look at the following bit of code:

	int *p1 = talloc_new(NULL);
	int *p2 = talloc_new(NULL);
	int *c1 = talloc(p1, int);
	int *r1 = talloc_reference(p2, c1);
	talloc_free(p1);

now tell me what you would have it do. What pointers will remain at
the end of that code, and what parent/child relationship?

The current code leaves p2 as the parent of c1. That necessarily
involves 'promoting' the reference to be a parent. I just cannot see
any other sane behaviour. If you can explain why the above example
should behave in any other way than it currently does then I'll start
listening to your proposed changes. Without the above example working
differently, it's all a pointless discussion.

I _want_ the difference between references and parents to not be
visible to users of the talloc API. It is in my view the only natural
way of having references in talloc. The ambiguity we had previously
was the flaw in that, and now it's fixed.

If what you want is an _internal_ change to talloc, so that it
internally becomes a multi-parent data structure and not references
then we could consider that, and if it made the common use cases of
talloc faster then we could do it. I think it will make it slower as
more than one parent is extremely rare as a percentage of all the
talloc pointers we have.

As of now, the fact that internally talloc uses two different ways of
representing parents is hidden from users. Apart from the
talloc_parent() call, I don't know of any way to find out which is the
'real' parent and which is a reference. So the internal representation
is hidden from applications. That means we can change that
representation in the future without breaking apps, if we have reason
to do it.

Cheers, Tridge


More information about the samba-technical mailing list