[TDB] Patches for file and memory usage growth issues

Rusty Russell rusty at samba.org
Tue Apr 19 05:34:13 MDT 2011

On Mon, 18 Apr 2011 09:32:35 -0400, simo <idra at samba.org> wrote:
> On Mon, 2011-04-18 at 22:20 +0930, Rusty Russell wrote:
> > FYI, here are the times & file sizes for ./growtdb-bench 100000 1 on my
> > laptop (I cut the test down so it would run in reasonable time):
> > 
> > Baseline:               13m53         472M
> > Intelligent repack:     11m6          451M
> > Limited expand:         7m48          117M
> > Repack in place:        7m44          113M
>
> Can you check what the maximum memory footprint is while each of these
> tests runs?

I meant to ask you: how did you measure that?  I had a ps running every
second, but it usually missed the peak (which happens right at the end
of the repack).
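
One way to catch the peak without polling, as a rough sketch: let the
kernel keep the high-water mark and read it once the child has exited.
This assumes Linux, where ru_maxrss is reported in kilobytes, and it
gives peak resident size only, not virtual size.  peak_rss_kb is a
made-up helper name, not part of any existing tool.

```python
import resource
import subprocess
import sys


def peak_rss_kb(argv):
    """Run argv to completion and return (exit_status, peak RSS in KB).

    Assumption: Linux, where ru_maxrss is in kilobytes.  RUSAGE_CHILDREN
    reports the largest peak among all children waited for so far, so
    run one benchmark per invocation of this script.
    """
    status = subprocess.call(argv)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return status, usage.ru_maxrss


if __name__ == "__main__" and len(sys.argv) > 1:
    # e.g.  python3 peakrss.py ./growtdb-bench 100000 1
    status, kb = peak_rss_kb(sys.argv[1:])
    print("exit status %d, peak RSS %d KB" % (status, kb))
```

Because the figure is collected at wait time, a spike right at the end
of the run is not missed the way it can be with a once-a-second ps.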

> The bug (found elsewhere in the end :) that made me initially dive into
> these size issues was a large increase in memory footprint.
> With the old method we were basically doing 2 huge mmaps that would
> cause the process to use up to 4g of virtual memory (3.5g RES, 2.2g
> SHR). I would hope that by not using an auxiliary in-memory database,
> there is a way to substantially reduce that.
>
> > As a whim, I put TDB2 to the same test:
> > TDB2                    0m9           297M
> Now, this is awesome! Very fast.
> What is it that is making such a big difference in time used ?

Several things.  The good first:

(1) The hash scales as we get bigger, so 100000 records is pretty easy.
    I usually test with 1 to 5 million records.
(2) We only overallocate records once they actually grow, but then we
    overallocate by 50%, meaning fewer reallocs for the index.
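
To illustrate point (2), here is a toy model of "pad by 50% only once
a record has actually grown".  It is a simplified sketch and not
TDB2's allocator code; count_reallocs and its parameters are
hypothetical names chosen for the example.

```python
def count_reallocs(initial_len, append_len, appends, slack=0.5):
    """Count how often a record must move while it grows by appends.

    Simplified model, not TDB2's actual code: a record starts with
    exactly the room it needs; each time it outgrows that room it is
    reallocated with `slack` (a fraction of its length) extra space.
    """
    length = capacity = initial_len
    reallocs = 0
    for _ in range(appends):
        length += append_len
        if length > capacity:
            # Record has actually grown: move it and pad by `slack`.
            capacity = length + int(length * slack)
            reallocs += 1
    return reallocs


if __name__ == "__main__":
    print("no slack :", count_reallocs(100, 10, 1000, slack=0.0))
    print("50% slack:", count_reallocs(100, 10, 1000))
```

With no slack the record moves on every append; with 50% slack the
number of moves grows only logarithmically with the final record size.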

The bad:

(3) TDB2 doesn't repack.  We'll probably need to eventually, since
    fragmentation can be an issue in any allocator.

(4) I use tdb_append in my benchmark, and that's a simple "read, copy,
    write" inside tdb1, but tdb2 just writes if it has room.  But LDB
    doesn't use tdb_append anyway, so it's cheating :)
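
To put a rough number on point (4), the following toy cost model
compares the two append strategies by counting bytes written.  It is
an illustration under simplifying assumptions (one record, fixed-size
appends, a 50% pad whenever the record has to move), not a measurement
of either library; both function names are invented for the example.

```python
def bytes_moved_copy(append_len, appends):
    """'Read, copy, write': each append rewrites the whole record."""
    length = moved = 0
    for _ in range(appends):
        length += append_len
        moved += length          # old data plus the new tail
    return moved


def bytes_moved_in_place(append_len, appends, slack=0.5):
    """Write only the tail when the record has room; otherwise move it."""
    length = capacity = moved = 0
    for _ in range(appends):
        length += append_len
        if length > capacity:
            capacity = length + int(length * slack)
            moved += length      # whole record relocated
        else:
            moved += append_len  # just the new bytes
    return moved


if __name__ == "__main__":
    print("copy    :", bytes_moved_copy(10, 1000))
    print("in place:", bytes_moved_in_place(10, 1000))
```

The copy strategy is quadratic in the number of appends, which is why
an append-heavy benchmark flatters the in-place approach.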

> > Simo, this is what I'm thinking of pushing, once it's tested.  OK?
> Sounds ok repack wise.
> Do you mind if I also push the patch to fix tdbbackup to not use a
> transaction on the copied db ?

> That one also made quite a difference for the performance and final size
> of the tdb generated. And because the final tdb is a backup file it
> means lower times to copy it wherever it needs to be copied so really
> worthwhile imo.

Yes, please do.  I wanted to make sure we'd nailed the growth problems
first before you made them go away :)

I've pushed it to the autobuilder now.


More information about the samba-technical mailing list