Proposed API change to tdb.

Rusty Russell rusty at rustcorp.com.au
Wed Dec 22 19:12:42 MST 2010


On Thu, 23 Dec 2010 06:54:30 am Jeremy Allison wrote:
> On Wed, Dec 22, 2010 at 12:17:23PM +1030, Rusty Russell wrote:
> > On Wed, 22 Dec 2010 06:02:42 am simo wrote:
> > > On Tue, 2010-12-21 at 11:25 -0800, Jeremy Allison wrote:
> > > > Hi Rusty & friends,
> > > > 
> > > > I'd like to propose an API addition to tdb.
> > > > 
> > > > Currently, when we expand a tdb file on running out
> > > > of freelist, we have a heuristic that we follow that
> > > > states:
> > > > 
> > > > "always make room for at least 100 more records, and at
> > > > least 25% more space."
> > > > 
> > > > and then rounded up to a multiple of a page size.
> > > > 
> > > > I am working with an OEM that is running Samba on
> > > > a memory constrained box, and they are storing some
> > > > of the tdb's in an in-memory filesystem, to prevent
> > > > disk spin-up and consequent power drain.
> > > > 
> > > > The problem with the above heuristic is it creates
> > > > tdb files that are far too large for their box and
> > > > prevents them storing them on the ramfs.
> > 
> > Nothing wrong with this, but AFAICT you're missing the root cause.
> > 
> > Neither the 25% nor the 100 record heuristics are the problem; it's the
> > fact that we disabled right merging of freed blocks for performance.  This
> > means our tdbs get fragmented.  As a heuristic, we repack the entire db when
> > we expand inside a transaction.
> > 
> > So smaller expansions == more regular repacking.  Perhaps try turning the
> > current #ifdef USE_RIGHT_MERGES into a customisable parameter, and turn
> > that on instead?
> 
> Actually I'm not sure that is the cause. This is happening in the
> winbindd_cache tdb when the box is first joined to a domain and
> winbindd is being requested to enumerate all the users in a large
> domain, which then get added into the cache.
> 
> As there hasn't been time for anything to time out, this means
> that all the activity is due to additions, not freelist fragmentation.

Then your patch can only reduce it by 100 records or 25%.  But you said
they're "far too large", which makes me wonder.
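
For reference, the expansion heuristic quoted at the top of the thread can
be sketched roughly as below. This is an illustration only, not the code
in tdb: the function name and its parameters are made up for the example,
and it assumes the page rounding is applied to the resulting file size.

```c
#include <stddef.h>

/*
 * Hypothetical sketch of the heuristic described in this thread:
 * "always make room for at least 100 more records, and at least
 * 25% more space", then round up to a multiple of the page size.
 * Not tdb's actual implementation.
 */
static size_t sketch_expanded_size(size_t map_size, size_t rec_size,
                                   size_t page_size)
{
    size_t by_records = rec_size * 100;  /* room for 100 more records */
    size_t by_percent = map_size / 4;    /* 25% of the current size */
    size_t extra = by_records > by_percent ? by_records : by_percent;
    size_t new_size = map_size + extra;

    /* Assumption: the rounding applies to the new total size. */
    return (new_size + page_size - 1) / page_size * page_size;
}
```

With only additions and no fragmentation, each expansion in this sketch
overshoots by the larger of 100 records or 25%, plus at most one page of
rounding.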

I'm porting my tdb_summary() code across to Samba's tdb now; once
that's done you can run "info" in tdbtool and it will tell you more about
the internals of the db (I've been slack...).

BTW, the other tunable that comes to mind is the 25% overallocation we do
by default.

Cheers,
Rusty.

