[TDB] Patches for file and memory usage growth issues

simo idra at samba.org
Mon Apr 18 07:32:35 MDT 2011


On Mon, 2011-04-18 at 22:20 +0930, Rusty Russell wrote:
> On Mon, 18 Apr 2011 16:53:00 +0930, Rusty Russell <rusty at samba.org> wrote:
> > On Thu, 14 Apr 2011 14:14:32 -0400, simo <idra at samba.org> wrote:
> > > On Wed, 2011-04-13 at 14:46 +0930, Rusty Russell wrote:
> > > Hi Rusty,
> > > unfortunately this still doesn't seem to really help.
> > > 
> > > I've slightly modified my tests so I've rerun 3 tests:
> > > 1. plain
> > > 2. with your first 3 patches
> > > 3. with all 4 patches
> > > 
> > > plain is the baseline and it includes my patches to use better
> > > heuristics as published in my git tree.
> > > 
> > > Unfortunately, with your patches I actually see an increase in time
> > > spent, in the size of the memory footprint, and in the final size of
> > > the tdb.
> > >
> > >
> > > The strict repack certainly looks like overkill with these tests. The
> > > basic repack patches do not hit too hard on time spent, although the
> > > final tdb is still 300MiB larger than without the patch.
> > 
> > Yes, let's ignore that overaggressive repack as a completely bad idea.
> > 
> > And this clearly reveals YA bug in my repack code:
> 
> Actually, that revealed a bug in tdb_summary().
> 
> Even with my benchmark, your extension fix was such a vast improvement
> that the 1/3 peak memory reduction caused by my much-more-complicated
> tdb_repack code pales in comparison.
> 
> So, I'm going to keep that in my back pocket for now; I *think* I just
> got the last two bugs out, but my record here isn't great :)
> 
> FYI, here are the times & file sizes for ./growtdb-bench 100000 1 on my
> laptop (I cut the test down so it would run in reasonable time):
> 
> Baseline:               13m53         472M
> Intelligent repack:     11m6          451M
> Limited expand:         7m48          117M
> Repack in place:        7m44          113M

Can you check what the maximum memory footprint is while each of these
tests runs?
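
One way to capture that number is sketched below. It is only a sketch under
some assumptions: the helper name peak_child_rss is made up for illustration,
it needs a Unix-like system, and it reports resident memory only (the RES
figure, not total virtual size). The units are whatever the OS reports for
ru_maxrss (kilobytes on Linux).

```python
import resource
import subprocess
import sys


def peak_child_rss(cmd):
    """Run cmd to completion and return the peak RSS reported for children.

    Hypothetical helper, not part of tdb. The value is a high-water mark
    over every child this process has waited for so far, so start a fresh
    interpreter for each benchmark variant to keep the numbers separate.
    """
    subprocess.run(cmd, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Example call, using the benchmark command line quoted above:
#   peak_child_rss(["./growtdb-bench", "100000", "1"])
```

Running it once per build (baseline, limited expand, repack in place) should
give one peak figure per row of the table above.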

The bug (found elsewhere in the end :) that made me initially dive into
these size issues was a large increase in memory footprint.
With the old method we were basically doing 2 huge mmaps that would
cause the process to use up to 4g of virtual memory (3.5g RES, 2.2g
SHR). I would hope that by not using an auxiliary in-memory database,
there is a way to substantially reduce that.

> As a whim, I put TDB2 to the same test:
> TDB2                    0m9           297M

Now, this is awesome! Very fast.
What is making such a big difference in the time used?

> So now I go to copy Simo's limited expand code across to tdb2 :)

:)

> Simo, this is what I'm thinking of pushing, once it's tested.  OK?

Sounds ok repack-wise.
Do you mind if I also push the patch to fix tdbbackup to not use a
transaction on the copied db?
That one also made quite a difference to the performance and to the final
size of the tdb generated. And because the final tdb is a backup file, it
means lower times to copy it wherever it needs to be copied, so really
worthwhile imo.

> Cheers,
> Rusty.
> 
> The following changes since commit af45636166c7a0cb87630105d18ce489e7391525:
> 
>   Fix bug 8072 - PANIC: create_file_acl_common frees handle two times. (2011-04-09 02:05:15 +0200)
> 
> are available in the git repository at:
>   git://git.samba.org/rusty/samba.git tdb-repack
> 
> Rusty Russell (3):
>       tdb: fix transaction recovery area for converted tdbs.
>       tdb: tdb_repack() only when it's worthwhile.
>       tdb: make sure we skip over recovery area correctly.
> 
> Simo Sorce (1):
>       tdb_expand: limit the expansion with huge records
> 
>  lib/tdb/common/io.c          |   25 +++++++++++++++++++----
>  lib/tdb/common/summary.c     |   13 ++++++++++-
>  lib/tdb/common/transaction.c |   43 ++++++++++++++++++++++++++++++++++-------
>  3 files changed, 66 insertions(+), 15 deletions(-)


-- 
Simo Sorce
Samba Team GPL Compliance Officer <simo at samba.org>
Principal Software Engineer at Red Hat, Inc. <simo at redhat.com>


