[PATCH] GUID index for LDB

Andrew Bartlett abartlet at samba.org
Mon Sep 11 10:57:16 UTC 2017


On Fri, 2017-09-08 at 10:36 +0200, Stefan Metzmacher wrote:
> Am 08.09.2017 um 05:56 schrieb Andrew Bartlett:
> > On Thu, 2017-09-07 at 12:03 +1200, Andrew Bartlett via samba-technical
> > wrote:
> > > 
> > > I'll put that in the release commit message and in the top of the
> > > ldb_index.c file.
> > 
> > The attached, updated patch set includes this slab (see the patch for
> > the full text):
> > 
> > The new 'GUID index' format is:
> > -------------------------------
> > 
> > dn: @INDEX:NAME:DNSUPDATEPROXY
> > @IDXVERSION: 3
> > @IDX: <binary GUID>[<binary GUID>[...]]
> > 
> > The binary GUID is 16 raw bytes, not expanded as hexadecimal
> > or pretty-printed.  The GUID is taken from the message to be stored,
> > using the attribute named by @IDXGUID on @INDEXLIST.
> > 
> > If there are multiple values the @IDX value simply becomes longer,
> > in multiples of 16 bytes.
> > 
> > The corresponding entry is stored in a TDB record with key:
> > 
> > GUID=<binary GUID>
> > 
> > This allows a very quick translation between the fixed-length index
> > values and the TDB key, while separating entries from other data
> > in the TDB, should they be unlucky enough to start with the bytes of
> > the 'DN=' prefix.
> > 
> > Additionally, this allows a scope BASE search to directly find the
> > record via a simple match on a GUID= extended DN, controlled via
> > @IDX_DN_GUID on @INDEXLIST.
> > 
> > Exception for special @ DNs:
> > 
> > @BASEINFO, @INDEXLIST and all other special DNs are stored as per the
> > original format, as they are never referenced in an index and are used
> > to bootstrap the database.
> > 
> > 
> > Control points for choice of index mode
> > ---------------------------------------
> > 
> > The choice of index and TDB key mode is made (for example, by
> > Samba) based on entries in the @INDEXLIST DN:
> > 
> > dn: @INDEXLIST
> > @IDXGUID: objectGUID
> > @IDX_DN_GUID: GUID
> > 
> > By default, the original DN format is used.
> 
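
As a rough illustration of the format quoted above (not the code from
the patch set), here is a minimal Python sketch of how a packed @IDX
value could be split into 16-byte GUIDs and turned into TDB keys.  The
function names are hypothetical, and the key layout follows only the
description above; it is an assumption that nothing else (such as a
terminator) is appended to the key.

```python
from typing import List

GUID_LEN = 16          # a binary GUID is 16 raw bytes
KEY_PREFIX = b"GUID="  # TDB key prefix taken from the description above


def split_idx_value(idx_value: bytes) -> List[bytes]:
    """Split a packed @IDX value into its 16-byte GUIDs (hypothetical helper)."""
    if len(idx_value) % GUID_LEN != 0:
        raise ValueError("@IDX value length must be a multiple of 16")
    return [idx_value[i:i + GUID_LEN]
            for i in range(0, len(idx_value), GUID_LEN)]


def tdb_key_for_guid(guid: bytes) -> bytes:
    """Build the TDB record key for one binary GUID (hypothetical helper)."""
    if len(guid) != GUID_LEN:
        raise ValueError("a binary GUID must be exactly 16 bytes")
    return KEY_PREFIX + guid


# Two made-up GUIDs packed back to back, as an index with two matches would be.
packed = bytes(range(16)) + bytes(range(16, 32))
keys = [tdb_key_for_guid(g) for g in split_idx_value(packed)]
```

Because every value has the same length, the lookup needs no parsing
beyond slicing, which is the "very quick translation" the description
refers to.
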
> So we're upgrading the database on first use with the new code?
> 
> My fear with this is that a simple package upgrade will make
> a dc with a large database unusable for quite some time.
> 
> Can you please check the cost of an upgrade for databases with
> 1.) 5000 users, 5000 computers and 5000 groups
> 2.) 20000 users, 20000 computers and 20000 groups
> 3.) with the numbers of the largest known customer size

With 100k users in 1-4 groups each, it takes 3min30sec on my desktop.

Oddly, the undo operation (I just wrote a downgrade script in case it
is needed) takes 11mins.

The patches I have that re-index only once per change to the index
help a lot, naturally :-).

> I guess rewriting the whole database consumes quite some cpu
> and also memory. A server may run out of memory while doing this
> as we need more than twice the size of all sam.ldb* databases together.

We only hold the index in memory, the rest could get paged out, but I
agree that running on a server with 2x DB size in free memory would
avoid thrashing. 

I agree that a perfectly packed DB with no large transaction area
already in place seems the worst possible case.  I gave up measuring
the undo run, but the upgrade case only took 4mins, probably because
the index records get to re-use their place in the DB.  (The new GUID
index is smaller; indeed this seems to be the primary advantage.)

The (100k user) database does grow from 800MB to 2700MB. 

Process resident size was around 2.3GB (from memory). 

tdbtool output from all the DBs in sam.ldb.d attached. 
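
For anyone wanting to check the growth against the attached tdbtool
output, a small Python sketch of the arithmetic for
DC=SAMBA,DC=ORG.ldb follows.  The byte counts are copied from the two
attachments; treating the smaller file as the "before" state and the
larger one as the "after" state is an assumption based on the sizes
quoted above, and the variable names are illustrative only.

```python
# Figures for DC=SAMBA,DC=ORG.ldb, copied from the attached tdbtool output.
# Assumption: the smaller file is the database before the change and the
# larger file is the database after it.
SIZE_BEFORE = 892006400
SIZE_AFTER = 2869850112
DEAD_AFTER = 1289592808   # the single dead record in the larger file
FREE_AFTER = 802944316    # the single free record in the larger file

MIB = 1024 * 1024

growth_factor = SIZE_AFTER / SIZE_BEFORE
dead_pct = 100 * DEAD_AFTER / SIZE_AFTER
free_pct = 100 * FREE_AFTER / SIZE_AFTER

print(f"before: {SIZE_BEFORE / MIB:.0f} MiB, after: {SIZE_AFTER / MIB:.0f} MiB")
print(f"growth factor: {growth_factor:.2f}x")
print(f"dead: {dead_pct:.0f}%, free: {free_pct:.0f}%")
```

The dead and free percentages computed this way can be compared with
the "Percentage keys/data/padding/free/dead/..." line that tdbtool
prints for the larger file.
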

Given all this, and the new output being printed, I think it is
reasonable to do this on first use for Samba 4.8, with adequate warning
in the WHATSNEW.txt.

Thanks,

Andrew Bartlett
-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba
-------------- next part --------------
CN=CONFIGURATION,DC=SAMBA,DC=ORG.ldb
Size of file/data: 25231360/4274916
Header offset/logical size: 0/25231360
Number of records: 7245
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/43/141
Smallest/average/largest data: 68/546/25871
Smallest/average/largest padding: 26/152/6483
Number of dead records: 1
Smallest/average/largest dead records: 13504488/13504488/13504488
Number of free records: 1
Smallest/average/largest free records: 6136612/6136612/6136612
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/0/6
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 1/16/4/24/54/1/0
CN=SCHEMA,CN=CONFIGURATION,DC=SAMBA,DC=ORG.ldb
Size of file/data: 30842880/4961727
Header offset/logical size: 0/30842880
Number of records: 10867
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/37/129
Smallest/average/largest data: 68/418/24886
Smallest/average/largest padding: 26/119/6241
Number of dead records: 1
Smallest/average/largest dead records: 16920552/16920552/16920552
Number of free records: 1
Smallest/average/largest free records: 7363632/7363632/7363632
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/1/9
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 1/15/4/24/55/1/0
DC=DOMAINDNSZONES,DC=SAMBA,DC=ORG.ldb
Size of file/data: 1159168/152620
Header offset/logical size: 0/1159168
Number of records: 218
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/42/140
Smallest/average/largest data: 45/657/23643
Smallest/average/largest padding: 19/282/5918
Number of dead records: 1
Smallest/average/largest dead records: 741352/741352/741352
Number of free records: 50
Smallest/average/largest free records: 20/3140/31064
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/0/2
Number of uncoalesced records: 2
Smallest/average/largest uncoalesced runs: 1/1/1
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 1/12/5/14/64/1/3
DC=FORESTDNSZONES,DC=SAMBA,DC=ORG.ldb
Size of file/data: 3784704/86526
Header offset/logical size: 0/3784704
Number of records: 117
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/43/146
Smallest/average/largest data: 45/696/23643
Smallest/average/largest padding: 19/276/5918
Number of dead records: 1
Smallest/average/largest dead records: 757736/757736/757736
Number of free records: 31
Smallest/average/largest free records: 24/92397/2806132
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/0/2
Number of uncoalesced records: 1
Smallest/average/largest uncoalesced runs: 1/1/1
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 0/2/1/76/20/0/1
DC=SAMBA,DC=ORG.ldb
Size of file/data: 2869850112/602742126
Header offset/logical size: 0/2869850112
Number of records: 800884
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/36/129
Smallest/average/largest data: 72/715/17488876
Smallest/average/largest padding: 26/193/4372231
Number of dead records: 1
Smallest/average/largest dead records: 1289592808/1289592808/1289592808
Number of free records: 1
Smallest/average/largest free records: 802944316/802944316/802944316
Number of hash chains: 10000
Smallest/average/largest hash chains: 15/80/211
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 1/20/5/28/45/1/0
-------------- next part --------------
CN=CONFIGURATION,DC=SAMBA,DC=ORG.ldb
Size of file/data: 7503872/5377860
Header offset/logical size: 0/7503872
Number of records: 7245
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/46/173
Smallest/average/largest data: 68/695/145107
Smallest/average/largest padding: 26/190/36291
Number of dead records: 0
Smallest/average/largest dead records: 0/0/0
Number of free records: 5
Smallest/average/largest free records: 12/105882/529148
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/0/5
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 4/67/18/7/0/3/1
CN=SCHEMA,CN=CONFIGURATION,DC=SAMBA,DC=ORG.ldb
Size of file/data: 8908800/5785269
Header offset/logical size: 0/8908800
Number of records: 10867
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/39/173
Smallest/average/largest data: 68/492/107188
Smallest/average/largest padding: 34/138/26815
Number of dead records: 0
Smallest/average/largest dead records: 0/0/0
Number of free records: 7
Smallest/average/largest free records: 12/188648/1320084
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/1/9
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 5/60/17/15/0/3/0
DC=DOMAINDNSZONES,DC=SAMBA,DC=ORG.ldb
Size of file/data: 417792/174502
Header offset/logical size: 0/417792
Number of records: 218
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/46/173
Smallest/average/largest data: 45/754/23643
Smallest/average/largest padding: 19/204/5918
Number of dead records: 0
Smallest/average/largest dead records: 0/0/0
Number of free records: 2
Smallest/average/largest free records: 20/76582/153144
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/0/2
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 2/39/11/37/0/1/10
DC=FORESTDNSZONES,DC=SAMBA,DC=ORG.ldb
Size of file/data: 3026944/96877
Header offset/logical size: 0/3026944
Number of records: 117
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/48/173
Smallest/average/largest data: 45/779/23643
Smallest/average/largest padding: 19/212/5918
Number of dead records: 0
Smallest/average/largest dead records: 0/0/0
Number of free records: 2
Smallest/average/largest free records: 80/1431112/2862144
Number of hash chains: 10000
Smallest/average/largest hash chains: 0/0/2
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 0/3/1/95/0/0/1
DC=SAMBA,DC=ORG.ldb
Size of file/data: 892006400/654663098
Header offset/logical size: 0/892006400
Number of records: 800884
Incompatible hash: no
Active/supported feature flags: 0x00000000/0x00000001
Robust mutexes locking: no
Smallest/average/largest keys: 12/38/173
Smallest/average/largest data: 72/778/17488876
Smallest/average/largest padding: 26/209/4372237
Number of dead records: 0
Smallest/average/largest dead records: 0/0/0
Number of free records: 24
Smallest/average/largest free records: 12/2104659/50510380
Number of hash chains: 10000
Smallest/average/largest hash chains: 9/80/232
Number of uncoalesced records: 0
Smallest/average/largest uncoalesced runs: 0/0/0
Percentage keys/data/padding/free/dead/rechdrs&tailers/hashes: 3/70/19/6/0/3/0

