[TDB] Patches for file and memory usage growth issues
rusty at samba.org
Mon Apr 11 20:14:19 MDT 2011
On Mon, 11 Apr 2011 19:27:27 -0400, simo <idra at samba.org> wrote:
> On Tue, 2011-04-12 at 00:38 +0930, Rusty Russell wrote:
> > I didn't implement tdb_repack; I hope we won't get as fragmented as
> > TDB1 does (we do full coalescing).
> > Simo, here's my tdb_repack rewrite. It's lightly tested, but I'd be
> > interested to see how much it helps you. It's a bit more complex than
> > I'd hoped...
> Hi Rusty,
> unfortunately my tests don't show results to be as good as I hoped
> for. Here is the tree I used:
> Tests take a while to run so I haven't done them all with the latest
> patches, the first 2 tests below w/o your patches also still had my
> original patch that didn't change the 25% allocation overhead, only the
> x2/x100 change for records bigger than 100k.
> With your patch, for some reason memory allocation didn't improve much,
> I still got a peak of 3.9G without compression and 3.2G with compression
> enabled (I have an additional patch in the packages I built for my
> fedora machine that allows me to toggle compression using an environment
> variable so that I don't have to recompile tests and library). This
> compares to previous values of 3.7G and 2.8G with compression, so it
> seems that it actually got worse, but again in the previous runs I didn't
> have the heuristic to change to 10% over-allocation on expand after
> 100MiB so take these results with a grain of salt, I will repeat those
> runs to get a better assessment of the difference with just your patch.
> Your patches also seemed to make the tests much slower :/
> Also notice how your patches are causing the tdb to end up being even
> bigger by about 200MiB (and note that in the non-rusty runs I had
> heuristics to always increase the DB by 25%, while with your patches
> applied I changed them to increase the size by only 10% once the TDB
> grows past 100MiB).
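For reference, the two expansion heuristics simo compares above can be
sketched roughly as follows. This is only an illustration of the description
in the quoted text, not the code from either patch: the function name
expanded_size(), the EXPAND_THRESHOLD macro, and the rule that the result
must always leave room for the requested bytes are all assumptions.

```c
#include <stdint.h>

/* Assumed threshold: "once the TDB grows past 100MiB". */
#define EXPAND_THRESHOLD (UINT64_C(100) * 1024 * 1024)

/*
 * Hypothetical helper: return the new file size when a database of
 * cur_size bytes must grow to fit at least `needed` more bytes.
 * Files at or below the threshold grow by 25%; larger files by 10%.
 */
uint64_t expanded_size(uint64_t cur_size, uint64_t needed)
{
    uint64_t percent = (cur_size > EXPAND_THRESHOLD) ? 10 : 25;
    uint64_t grown = cur_size + (cur_size * percent) / 100;
    uint64_t minimum = cur_size + needed;

    /* Assumption: never return less than what the caller asked for. */
    return (grown > minimum) ? grown : minimum;
}
```

With these assumptions a 200MiB file grows by 20MiB per expansion rather than
50MiB, so the over-allocation shrinks but the number of expansions goes up.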
Hmm, OK. I obviously broke something:
> Rusty's patch no compression: ~ 30 min:
> $ tdbtool tests_sysdb.rusty.nocomp/sssd.ldb
> tdb> info
> Size of file/data: 1538629632/850286307
> Number of records: 90658
> Smallest/average/largest keys: 12/51/65
> Smallest/average/largest data: 43/9327/1289140
> Smallest/average/largest padding: 4/48387/4294967234
That padding number is very very wrong! It implies that this database
won't pass tdb_check() (unless tdb_summary() has a bug).
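One possible reading of that number (a guess, not something confirmed against
the tdb source): 4294967234 is 2^32 - 62, which is what you get when a
record's used length exceeds its allocated length by 62 bytes and the
difference is taken in 32-bit unsigned arithmetic. A minimal sketch, with
hypothetical helper names that are not part of the tdb API:

```c
#include <stdint.h>

/*
 * Hypothetical: padding computed as allocated minus used in 32-bit
 * unsigned arithmetic. If used > allocated, the result wraps modulo 2^32.
 */
uint32_t padding_u32(uint32_t allocated, uint32_t used)
{
    return allocated - used;
}

/*
 * Hypothetical: reinterpret a suspicious padding value as a signed
 * quantity, treating anything >= 2^31 as a wrapped negative.
 */
int64_t padding_as_signed(uint32_t padding)
{
    if (padding >= UINT32_C(0x80000000))
        return (int64_t)padding - (INT64_C(1) << 32);
    return (int64_t)padding;
}
```

Under this reading, padding_as_signed(4294967234) gives -62, i.e. at least one
record claims to use 62 bytes more than were allocated for it.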
I think the first change we need is that one-liner which limits
repacking when databases have grown a great deal.
The expansion heuristic is also an obvious target, we should fix it as
Meanwhile, I will stress tdb_repack and see if I can find the bug. It
may also be that my trimming heuristic there is a bad idea: if it
guesses wrong and trims a growing record, it will increase
fragmentation, leading to slower performance, more expansions and more
I'll hack up a patch to keep stats inside the reserved areas so we can
get more insight into what's happening.