talloc_stackframe() and talloc_tos()
rusty at rustcorp.com.au
Wed Jun 27 06:12:58 MDT 2012
I've been looking into talloc_pool, talloc_stackframe and
talloc_tos(), because dbwrap() uses talloc_tos and I want to convert S4
to use dbwrap, so I can get ntdb as an option for every database. I've
been hacking for a few days, and am taking tomorrow off, so I thought I'd
post what I have:
(1) talloc_pool() (in libtalloc) implements a bump allocator. This only
makes sense if you have a push-on-pop-off (stack-like) workload. You create
a pool of a given size, and that's used as a simple allocator. If
it fills, we fall back to malloc.
(2) talloc_stackframe() is a samba-specific wrapper which uses
a talloc_pool if the first call uses talloc_stackframe_pool().
If you free a talloc_stackframe(), it frees every stackframe
allocated afterwards, too.
(3) talloc_tos() returns the current top stackframe (ie. last
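To make point (1) concrete, here is a minimal sketch of a bump allocator with a malloc fallback. It is not libtalloc's code: the struct and every toy_* name are hypothetical, and the 16-byte rounding is an assumption chosen only to keep the example simple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy pool; not libtalloc's data structure. */
struct toy_pool {
    unsigned char *base;   /* start of the preallocated region */
    size_t size;           /* total bytes in the region */
    size_t used;           /* bump pointer: bytes handed out so far */
    size_t fallbacks;      /* allocations that had to go to malloc */
};

/* Reserve the region up front. Returns 0 on success, -1 on failure. */
static int toy_pool_init(struct toy_pool *p, size_t size)
{
    p->base = malloc(size);
    p->size = size;
    p->used = 0;
    p->fallbacks = 0;
    return p->base != NULL ? 0 : -1;
}

/* Round up to a multiple of 16 so every offset into the region is too. */
static size_t toy_align(size_t n)
{
    return (n + 15) & ~(size_t)15;
}

/*
 * Hand out n bytes. If the request fits, just advance the bump pointer;
 * otherwise fall back to malloc. *from_pool tells the caller which path
 * was taken: fallback chunks must be released with free().
 */
static void *toy_pool_alloc(struct toy_pool *p, size_t n, int *from_pool)
{
    size_t need = toy_align(n);

    assert(p->base != NULL);
    /* need >= n guards against wraparound in toy_align(). */
    if (need >= n && need <= p->size - p->used) {
        void *ret = p->base + p->used;
        p->used += need;
        *from_pool = 1;
        return ret;
    }
    p->fallbacks++;
    *from_pool = 0;
    return malloc(n);
}

/* "Pop" everything carved out of the region in one step. */
static void toy_pool_reset(struct toy_pool *p)
{
    p->used = 0;
}

/* Give the region back to the system. */
static void toy_pool_destroy(struct toy_pool *p)
{
    free(p->base);
    p->base = NULL;
}
```

The fast path is a bounds check and an addition, and individual chunks are never freed one by one, which is why this only pays off for stack-like allocation patterns.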
(1) talloc_pool is fast 
talloc: 11545659 ops/sec
talloc_pool: 30377035 ops/sec
malloc: 12960209 ops/sec
(2) Yes, even faster than alloc_mmap:
talloc (alloc_mmap): 16943997 ops/sec
(3) For simple functions (eg. ones which do a single allocation) using
talloc_tos() is faster and simpler than (the more general pattern)
of using talloc_stackframe() then allocating off that: one
allocation vs two, eg:
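    /* One allocation, straight off the current top frame: */
    char *fname = talloc_asprintf(talloc_tos(), "%s/%s", dir, path);
    int fd = open(fname, O_RDONLY);

versus:

    /* Two allocations: the frame itself, then the string: */
    TALLOC_CTX *tmp_ctx = talloc_stackframe();
    char *fname = talloc_asprintf(tmp_ctx, "%s/%s", dir, path);
    int fd = open(fname, O_RDONLY);
    TALLOC_FREE(tmp_ctx);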
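The relationship between a stackframe and talloc_tos() can be modelled with a small stand-alone sketch. This is a toy under stated assumptions, not Samba's implementation: all toy_* names are hypothetical, the frame stack is a fixed-size array, and each allocation is tracked with a plain linked list. It illustrates two behaviours described in this post: the "top of stack" is the most recently created frame, and freeing a frame also frees every frame created after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_FRAMES 16

/* One bookkeeping node per allocation, so a frame can free its children. */
struct toy_chunk {
    struct toy_chunk *next;
    void *mem;
};

struct toy_frame {
    struct toy_chunk *chunks;
};

static struct toy_frame *toy_frames[TOY_MAX_FRAMES];
static int toy_depth;       /* number of live frames */
static unsigned toy_live;   /* number of live allocations, for inspection */

/* Push a new frame; it becomes the top of the stack. */
static struct toy_frame *toy_stackframe(void)
{
    struct toy_frame *f;

    assert(toy_depth < TOY_MAX_FRAMES);
    f = calloc(1, sizeof(*f));
    assert(f != NULL);
    toy_frames[toy_depth++] = f;
    return f;
}

/* Return the most recently pushed frame. This toy insists that one exists. */
static struct toy_frame *toy_tos(void)
{
    assert(toy_depth > 0);
    return toy_frames[toy_depth - 1];
}

/* Allocate size bytes owned by frame f. */
static void *toy_alloc(struct toy_frame *f, size_t size)
{
    struct toy_chunk *c = malloc(sizeof(*c));

    assert(c != NULL);
    c->mem = malloc(size ? size : 1);
    assert(c->mem != NULL);
    c->next = f->chunks;
    f->chunks = c;
    toy_live++;
    return c->mem;
}

/* Free frame f and every frame pushed after it, newest first. */
static void toy_frame_free(struct toy_frame *f)
{
    int found = 0;

    for (int i = 0; i < toy_depth; i++) {
        if (toy_frames[i] == f) {
            found = 1;
        }
    }
    assert(found);

    while (toy_depth > 0) {
        struct toy_frame *top = toy_frames[--toy_depth];
        int done = (top == f);

        while (top->chunks != NULL) {
            struct toy_chunk *c = top->chunks;
            top->chunks = c->next;
            free(c->mem);
            free(c);
            toy_live--;
        }
        free(top);
        if (done) {
            break;
        }
    }
}
```

In this model, a helper that allocates off toy_tos() and never frees is cleaned up only when some caller frees its own frame, so a missing free higher up the call chain simply lets memory accumulate.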
(1) Libraries need to set up talloc_stackframe() at every entry point,
otherwise talloc_tos() does a hack where we leak memory.
(2) Only smbd calls talloc_stackframe_pool(), so other code ends up
using non-pool allocations (ie. normal ones).
(3) There's no way to know how full a stackframe got, so tuning is hard
(smbd uses 8192 bytes).
(4) You can talloc_steal() out of a talloc_pool(), leaving the pool
stuck. If you turn on debugging for this, you see many cases where
this is done.
(5) talloc_tos() allocations are often not freed, but left for someone
in the callchain to free. This could lead to more memory being used
than necessary in pathological cases.
(6) talloc_stackframe() and talloc_tos() misuse is hard to see, since
a failure to free is not noticed at runtime.
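Point (4) can be illustrated with a small model of why one stolen object keeps a whole pool alive. This is a sketch under stated assumptions rather than talloc's internals: the pin_* names are hypothetical, and the pool is reduced to one region plus a count of objects still living inside it. The region can only be returned once its owner has been freed and no object carved from it remains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: one region, released only when nothing lives in it. */
struct pin_pool {
    unsigned char *region;   /* the single big allocation */
    size_t size;
    size_t used;
    unsigned live;           /* objects still carved out of region */
    bool owner_gone;         /* the frame that owned the pool was freed */
};

/* Reserve the region. Returns 0 on success, -1 on failure. */
static int pin_pool_init(struct pin_pool *p, size_t size)
{
    p->region = malloc(size);
    p->size = size;
    p->used = 0;
    p->live = 0;
    p->owner_gone = false;
    return p->region != NULL ? 0 : -1;
}

/* Carve n bytes (rounded up to 16) out of the region; NULL if it is full. */
static void *pin_alloc(struct pin_pool *p, size_t n)
{
    size_t need = (n + 15) & ~(size_t)15;
    void *ret;

    if (need < n || need > p->size - p->used) {
        return NULL;
    }
    ret = p->region + p->used;
    p->used += need;
    p->live++;
    return ret;
}

/* Release the region if, and only if, nothing can still point into it. */
static bool pin_maybe_release(struct pin_pool *p)
{
    if (p->owner_gone && p->live == 0 && p->region != NULL) {
        free(p->region);
        p->region = NULL;
        return true;
    }
    return false;
}

/*
 * The owner frees its frame. Objects that stayed with the frame die now;
 * the `stolen` ones were reparented elsewhere and survive inside the region.
 */
static bool pin_owner_free(struct pin_pool *p, unsigned stolen)
{
    assert(stolen <= p->live);
    p->live = stolen;
    p->owner_gone = true;
    return pin_maybe_release(p);
}

/* A stolen object is finally freed by its new parent. */
static bool pin_object_free(struct pin_pool *p)
{
    assert(p->live > 0);
    p->live--;
    return pin_maybe_release(p);
}
```

With an 8192-byte pool, stealing a single 20-byte string out of it keeps all 8192 bytes pinned until that string is freed.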
What I'm trying to do about it:
- Add a talloc_pool_stats() call to show how successful a pool was.
- Change talloc_stackframe_pool() to auto-size. The caller hands a
static struct (carefully internally threadsafe), and we grow/shrink
based on usage. This should make it easier to use elsewhere.
- Add --check-talloc-stack option to configure (developer mode only) which
does sanity checking, using gcc's -finstrument-functions option to
insert calls on every function entry and exit.
This checks that we always (and only) free a talloc_stackframe() in
the function which allocated it, and makes us panic if talloc_tos() is
ever called without a stackframe.
- Make talloc_steal() out of a pool log a warning. Unfortunately,
samba3 does this quite a bit: if these can't be fixed, the stackframe
size will have to be capped quite low, since this prevents immediate
freeing. I'm still working through them :(
- Make our talloc_log_fns call smb_panic if we're in developer mode.
This makes me fix the problems.
- Pass a talloc context into S3's lp_* functions which return a string.
This is almost always talloc_tos(), but avoids us doing a gratuitous
talloc_strdup() in the few dozen places where we want another parent.
- Plus misc fixes and cleanups.
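As an illustration of the auto-sizing idea in the list above, here is one possible feedback rule. It is a sketch only: struct pool_sizer, pool_sizer_update() and the grow/shrink thresholds are all hypothetical, the real design may differ, and unlike the static struct described above this version makes no attempt at thread safety. After each frame is freed, the caller reports how many bytes that frame asked for, and the next pool size is adjusted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-call-site state (the "static struct"). Not thread-safe. */
struct pool_sizer {
    size_t current;   /* pool size to request next time */
    size_t min;       /* never shrink below this */
    size_t max;       /* never grow above this */
};

/*
 * Feed back what the last frame did. `wanted` is the total number of bytes
 * that frame asked for, including anything that overflowed to malloc.
 * Assumes max is small enough that doubling cannot wrap around.
 */
static void pool_sizer_update(struct pool_sizer *s, size_t wanted)
{
    assert(s->min > 0 && s->min <= s->current && s->current <= s->max);

    if (wanted > s->current) {
        /* The pool overflowed: double until it would have fit, up to max. */
        size_t next = s->current * 2;
        while (next < wanted && next < s->max) {
            next *= 2;
        }
        s->current = next > s->max ? s->max : next;
    } else if (wanted < s->current / 4) {
        /* Less than a quarter was used: halve, but stay at or above min. */
        size_t next = s->current / 2;
        s->current = next < s->min ? s->min : next;
    }
}
```

Starting from the 8192 bytes smbd uses today, a frame that wanted 20000 bytes would move the next pool to 32768, and a later run of small frames would bring it back down one halving at a time.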
You can follow the fun in my "talloc-stack-wip" branch... I've just
reshuffled it, so various things might be a bit broken at the moment,
but should make interesting reading:
The benchmark numbers above are from a build with -O2 on an Ubuntu 32-bit x86 box. On sn-devel, results are similar:
talloc: 18524433 ops/sec
talloc_pool: 43509747 ops/sec
malloc: 21832590 ops/sec
talloc (alloc_mmap): 21503065 ops/sec