[TDB] Patches for file and memory usage growth issues

Mon Apr 11 07:21:33 MDT 2011

On Mon, 2011-04-11 at 14:46 +0200, Stefan (metze) Metzmacher wrote:
> Hi Simo,
> 
> >>  > The first one should be uncontroversial, the original intent makes sense
> >>  > in a tdb database when you use small records, but when a rare huge
> >>  > record is saved in, growing the db by 100 times its size really is
> >>  > overkill. I also reduced the minimum overhead from 25% to 10% as with
> >>  > large TDBs (on the order of 500MB/1GB), growing by 25% is quite a lot.
> >>
> >> The change from 25% to 10% will have the bigger impact of the two
> >> changes I think. The 25% number was always fairly arbitrary, but
> >> changing to 10% means 2.5 times as many tdb_expand calls while a
> >> database grows. Have you measured any impact of this in initially
> >> creating a large database?
> > 
> > The only impact I measured was the size, and it was a quite impressive
> > gain. But I haven't tested if it made a difference in speed.
> 
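As a back-of-the-envelope check on the number of expansions, here is a
minimal sketch. It assumes pure geometric growth with no per-record
minimum, and count_expansions() is a hypothetical helper, not a tdb
function. In the limit the ratio between the two policies is
ln(1.25)/ln(1.10), roughly 2.3, so the 2.5 estimate is in the right
range.

```c
#include <stdint.h>

/* Hypothetical helper (not part of tdb): count how many expansions are
 * needed to grow from start to at least target when every expansion
 * adds percent% of the current size. Assumes start is large enough
 * that start * percent / 100 is non-zero. */
unsigned count_expansions(uint64_t start, uint64_t target, unsigned percent)
{
	unsigned n = 0;
	uint64_t size = start;

	while (size < target) {
		size += size * percent / 100;
		n++;
	}
	return n;
}
```

Calling it with a 1MB start and a 1GB target for 25 and then 10 percent
gives the two counts to compare. It says nothing about the cost of each
individual expansion, which is what would need measuring.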
> maybe we can use something like this:
> 
> uint64_t calc_new_size(uint64_t old_size, uint64_t add_size)
> {
> 	uint64_t needed_size = old_size + add_size;
> 	uint64_t max_size1, max_size2, max_size;
> 
> #define MEGA_BYTE (1024 * 1024)
> 
> 	/* expand by 100MB */
> 	max_size1 = needed_size + (100 * MEGA_BYTE);
> 	/* expand by 25% */
> 	max_size2 = needed_size * 125 / 100;
> 
> 	/* use the minimum */
> 	max_size = MIN(max_size1, max_size2);
> 
> 	/* align to 1MB: round up to the next multiple of MEGA_BYTE */
> 	max_size = (max_size + (MEGA_BYTE - 1)) / MEGA_BYTE * MEGA_BYTE;
> 
> 	return max_size;
> }

You always need to make sure max_size is never smaller than the record
size you need. That's why I always have size*2 as the very minimum.
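
A minimal sketch of that constraint, as an illustration and not the code
from the patch: cap the headroom at the smaller of 100MB and 25%, but
never return less than the current size plus twice the record being
added. The name calc_new_size_floor() and the reading of "size*2" as
old_size + 2 * add_size are assumptions made for this example.

```c
#include <stdint.h>

#define MEGA_BYTE (1024 * 1024)

/* Illustrative only: not the code from the patch. */
uint64_t calc_new_size_floor(uint64_t old_size, uint64_t add_size)
{
	uint64_t needed_size = old_size + add_size;
	/* Assumed meaning of "size*2": room for the record twice over. */
	uint64_t floor_size = old_size + 2 * add_size;
	/* Headroom candidates: a fixed 100MB or 25% of what is needed. */
	uint64_t by_fixed = needed_size + 100ULL * MEGA_BYTE;
	uint64_t by_percent = needed_size + needed_size / 4;
	uint64_t new_size = by_fixed < by_percent ? by_fixed : by_percent;

	/* A rare huge record can exceed the capped headroom. */
	if (new_size < floor_size) {
		new_size = floor_size;
	}
	/* Round up to a multiple of 1MB. */
	return (new_size + (MEGA_BYTE - 1)) / MEGA_BYTE * MEGA_BYTE;
}
```

With a 1MB database and a 10 byte record the 25% rule decides the
result. With a 1MB database and a 500MB record the floor takes over.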

> >> I think the real problem is the inefficient index format in ldb.
> > 
> > Oh, that's totally a huge issue, but I am trying to work on 2 fronts
> > here. I need something to cut down on memory usage quickly, in order to
> > solve problems for current users, w/o making incompatible changes. Then
> > I need to account for efficiency. Of course if both can be achieved
> > quickly that's even better.
> > 
> >> We really need to fix that. The compression will make the file smaller,
> >> but much slower. Then we'll need a compression cache to make it fast
> >> again, and we'll quickly end up with something that is very hard to
> >> maintain. 
> > 
> > A cache would mean keeping huge records in memory, which would cause
> > memory to grow again too much I think, unless we use some LRU and keep
> > the cache size tightly controlled, but that would indeed be expensive.
> > 
> > One thing I would do is to save a key/dn pair with key not bigger than a
> > long integer, and then use this integer in indexes instead of the full
> > DNs. Whether we can make this transparent and efficient I don't know
> > yet, but it would certainly reduce the size of large indexes by more
> > than an order of magnitude.
> 
> Tridge and I discussed something like that at the AD plugfest last year.
> 
> If I remember correctly we discussed using the objectGUID as primary fixed
> size key.
> 
> And a maybe even better solution is using uint64_t values which can be
> used as direct offsets into the tdb mmap area.

ObjectGUID is still much larger than we really need, I think.
All we need is really just a sequential number (64 bits is going to be
bigger than we really need, but looks like a good compromise) that keys
a record that holds the DN.
This means each member of an index will take exactly 8 bytes, instead
of the much bigger current average (which depends on the basedn size but
is usually above 30 bytes).
Direct offsets into the tdb mmap area are not a good idea though. They
would tie data contents to record layout, making it very difficult to
repack or back up tdbs without knowing how to manipulate the data inside
the DB, and causing a lot of relocations each time you try to do that.
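
To make the size argument concrete, here is a minimal sketch of the
idea. It is not ldb code: struct dn_table, its functions and the
fixed-size in-memory array are hypothetical stand-ins for records that
would really be stored in the tdb and keyed by the sequential number.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DNS 1024

/* Hypothetical side table: id -> DN. In a real database each entry
 * would be a tdb record keyed by the id, not an in-memory array. */
struct dn_table {
	const char *dn[MAX_DNS];
	uint64_t next_id;
};

/* Store a DN and return its sequential id, or UINT64_MAX if full. */
uint64_t dn_table_add(struct dn_table *t, const char *dn)
{
	if (t->next_id >= MAX_DNS) {
		return UINT64_MAX;
	}
	t->dn[t->next_id] = dn;
	return t->next_id++;
}

/* Resolve an id back to its DN, or NULL if the id is unknown. */
const char *dn_table_lookup(const struct dn_table *t, uint64_t id)
{
	return id < t->next_id ? t->dn[id] : NULL;
}

/* Bytes an index needs when it stores one 8-byte id per member. */
size_t index_bytes_with_ids(size_t members)
{
	return members * sizeof(uint64_t);
}

/* Bytes an index needs when it stores every DN string in full. */
size_t index_bytes_with_dns(const char *const *dns, size_t members)
{
	size_t total = 0;

	for (size_t i = 0; i < members; i++) {
		total += strlen(dns[i]) + 1;
	}
	return total;
}
```

The price is one extra lookup per index member to get from the id back
to the DN.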

Although, if we made sure that we always reuse these numbers so that we
tightly pack the table, we could easily keep a list of references in
memory as a simple indirection table, so we can skip the hash search for
running databases. Not sure we really need an optimization like that,
since hash lookups tend to be fast enough, but perhaps a 75k-member index
(meaning 75k+1 hash lookups) may actually make it worthwhile.
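
A minimal sketch of such an indirection table, assuming ids are handed
out densely and released ids are reused before new ones are allocated.
struct id_table, the fixed capacity, and the plain uint64_t stored per
slot (standing in for whatever reference a running database would
cache) are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_TABLE_CAP 1024
#define ID_NONE UINT64_MAX

struct id_table {
	uint64_t ref[ID_TABLE_CAP];      /* id -> cached record reference */
	bool used[ID_TABLE_CAP];         /* which ids are currently live */
	uint64_t free_ids[ID_TABLE_CAP]; /* stack of released ids */
	uint64_t nfree;
	uint64_t next_id;                /* lowest id never handed out */
};

/* Allocate an id for ref, reusing a released id when one exists so
 * the table stays tightly packed. Returns ID_NONE when full. */
uint64_t id_table_alloc(struct id_table *t, uint64_t ref)
{
	uint64_t id;

	if (t->nfree > 0) {
		id = t->free_ids[--t->nfree];
	} else if (t->next_id < ID_TABLE_CAP) {
		id = t->next_id++;
	} else {
		return ID_NONE;
	}
	t->ref[id] = ref;
	t->used[id] = true;
	return id;
}

/* Release an id so a later allocation can reuse it. */
void id_table_free(struct id_table *t, uint64_t id)
{
	if (id < t->next_id && t->used[id]) {
		t->used[id] = false;
		t->free_ids[t->nfree++] = id;
	}
}

/* Resolve an id with a plain array access, no hash lookup.
 * Returns false if the id is not currently in use. */
bool id_table_get(const struct id_table *t, uint64_t id, uint64_t *ref)
{
	if (id >= t->next_id || !t->used[id]) {
		return false;
	}
	*ref = t->ref[id];
	return true;
}
```

Resolving a member is then a bounds check and an array read, which is
where the saving over 75k hash lookups would come from.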

Simo.

-- 
Simo Sorce
Samba Team GPL Compliance Officer <simo at samba.org>
Principal Software Engineer at Red Hat, Inc. <simo at redhat.com>