[TDB] Patches for file and memory usage growth issues
simo
idra at samba.org
Mon Apr 18 07:32:35 MDT 2011
On Mon, 2011-04-18 at 22:20 +0930, Rusty Russell wrote:
> On Mon, 18 Apr 2011 16:53:00 +0930, Rusty Russell <rusty at samba.org> wrote:
> > On Thu, 14 Apr 2011 14:14:32 -0400, simo <idra at samba.org> wrote:
> > > On Wed, 2011-04-13 at 14:46 +0930, Rusty Russell wrote:
> > > Hi Rusty,
> > > unfortunately this still doesn't seem to really help.
> > >
> > > I've slightly modified my tests so I've rerun 3 tests:
> > > 1. plain
> > > 2. with your first 3 patches
> > > 3. with all 4 patches
> > >
> > > plain is the baseline and it includes my patches to use better
> > > heuristics as published in my git tree.
> > >
> > > With your patches I actually see an increase in time spent, in the
> > > size of the memory footprint, and in the final size of the tdb,
> > > unfortunately.
> > >
> > >
> > > The strict repack certainly looks like overkill with these tests. The
> > > basic repack patches do not hit too hard on time spent, although the
> > > final tdb is still 300MiB larger than without the patch.
> >
> > Yes, let's ignore that overaggressive repack as a completely bad idea.
> >
> > And this clearly reveals YA bug in my repack code:
>
> Actually, that revealed a bug in tdb_summary().
>
> Even with my benchmark, your extension fix was such a vast improvement
> that the 1/3 peak memory reduction caused by my much-more-complicated
> tdb_repack code pales in comparison.
>
> So, I'm going to keep that in my back pocket for now; I *think* I just
> got the last two bugs out, but my record here isn't great :)
>
> FYI, here are the times & file sizes for ./growtdb-bench 100000 1 on my
> laptop (I cut the test down so it would run in reasonable time):
>
> Baseline:           13m53  472M
> Intelligent repack: 11m6   451M
> Limited expand:     7m48   117M
> Repack in place:    7m44   113M
Can you check what the maximum memory footprint is while each of these
tests runs?

The bug (found elsewhere in the end :) that made me initially dive into
these size issues was a large increase in memory footprint.

With the old method we were basically doing 2 huge mmaps that would
cause the process to use up to 4g of virtual memory (3.5g RES, 2.2g
SHR). I would hope that by not using an auxiliary in-memory database,
there is a way to substantially reduce that.
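
For collecting that number, here is a minimal sketch of reading the peak
resident set size from inside the process. It assumes Linux, where
getrusage() reports ru_maxrss in kilobytes; peak_rss_kib() is a
hypothetical helper name, not part of tdb or growtdb-bench. It covers
resident memory only, not the virtual size of the mmaps.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of tdb): peak resident set size of the
 * calling process, in KiB as reported by Linux, or -1 on failure.
 * Other systems may use different units for ru_maxrss.
 */
long peak_rss_kib(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return -1;
	return ru.ru_maxrss;
}
```

Printing the result just before the benchmark exits would give one
figure per run. Running the unmodified benchmark under GNU time with -v
should report the same quantity as "Maximum resident set size".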
> As a whim, I put TDB2 to the same test:
> TDB2 0m9 297M
Now, this is awesome! Very fast.
What is making such a big difference in the time used?
> So now I go to copy Simo's limited expand code across to tdb2 :)
:)
> Simo, this is what I'm thinking of pushing, once it's tested. OK?
Sounds ok repack-wise.

Do you mind if I also push the patch to fix tdbbackup to not use a
transaction on the copied db?
That one also made quite a difference to the performance and to the
final size of the tdb generated. And because the final tdb is a backup
file, a smaller file means less time to copy it wherever it needs to
go, so it is really worthwhile imo.
> Cheers,
> Rusty.
>
> The following changes since commit af45636166c7a0cb87630105d18ce489e7391525:
>
> Fix bug 8072 - PANIC: create_file_acl_common frees handle two times. (2011-04-09 02:05:15 +0200)
>
> are available in the git repository at:
> git://git.samba.org/rusty/samba.git tdb-repack
>
> Rusty Russell (3):
>       tdb: fix transaction recovery area for converted tdbs.
>       tdb: tdb_repack() only when it's worthwhile.
>       tdb: make sure we skip over recovery area correctly.
>
> Simo Sorce (1):
>       tdb_expand: limit the expansion with huge records
>
>  lib/tdb/common/io.c          |   25 +++++++++++++++++++----
>  lib/tdb/common/summary.c     |   13 ++++++++++-
>  lib/tdb/common/transaction.c |   43 ++++++++++++++++++++++++++++++++++-------
>  3 files changed, 66 insertions(+), 15 deletions(-)
--
Simo Sorce
Samba Team GPL Compliance Officer <simo at samba.org>
Principal Software Engineer at Red Hat, Inc. <simo at redhat.com>