[PATCH] GUID index for LDB

Andrew Bartlett abartlet at samba.org
Wed Sep 6 15:15:08 UTC 2017


On Wed, 2017-09-06 at 12:44 +0200, Stefan Metzmacher wrote:
> Hi Andrew,
> 
> > > > I'll e-mail Tridge/Douglas/Catalyst for permission on a re-licence to
> > > > LGPLv3 the binary search macros next week.
> > > > 
> > > > I know it is a massive patch set, but some feedback would be helpful.
> > > > 
> > > > http://git.catalyst.net.nz/gitweb?p=samba.git;a=shortlog;h=refs/heads/GUID-index-6
> > > > 
> > 
> > This has been a success, and I have now got some performance numbers.
> > 
> > See the attached graph, with test times normalized to 1.  It shows that
> >  some tasks are better (much better) and that the rest is pretty much
> > un-affected.  (We have found that the noise on these measurements is
> > about 5%). 
> 
> So there's no performance loss of 5% for searches?

There might be, but I don't think so.  Our experience is that, even
after several runs, differences below 5% are not statistically
significant, because the absolute values are too noisy.  Watch how the
'do nothing' line moves around for an idea of the noise.  I'll mail you
the full results file and the absolute values graph tomorrow.

> As we're now doing one more hop from the index (now via the objectGUID)
> to the dn.

Really, we only change the cost of a base DN search, plus the cost of
checking the base for a subtree search.

Anything in an index avoids going from a DN -> key with a casefold, as
both the contents of the index and the key in the DB now match exactly
(plus a prefix). 

So an indexed search that was:
 - index -> casefold -> key
is:
 - index -> key

and a base search that was:
 - DN -> casefold -> key
is:
 - DN -> index -> key

We use a number of tricks to ensure we don't waste the expense of the
casefold. 
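
The lookup paths above can be sketched roughly as follows.  This is a
toy model only, not the LDB code: plain dicts stand in for the database,
the class names, the "DN=" / "GUID=" prefixes and the DN index keyed by
casefolded DN are assumptions made for illustration, and str.casefold()
stands in for the real (schema-aware) DN casefold.

```python
# Toy model: dicts stand in for the key/value DB. All names and key
# prefixes are illustrative assumptions, not LDB's actual layout.

def casefold_dn(dn: str) -> str:
    """Stand-in for the DN casefold step (the real one is schema-aware)."""
    return ",".join(part.strip().casefold() for part in dn.split(","))


class DnKeyedStore:
    """Old scheme: records keyed by casefolded DN; index values are DNs."""

    def __init__(self):
        self.records = {}  # "DN=" + casefolded DN -> record
        self.index = {}    # (attr, value) -> list of DNs

    def add(self, dn, guid, attrs):
        self.records["DN=" + casefold_dn(dn)] = {"dn": dn, "guid": guid, **attrs}
        for item in attrs.items():
            self.index.setdefault(item, []).append(dn)

    def indexed_search(self, attr, value):
        # index -> casefold -> key
        dns = self.index.get((attr, value), [])
        return [self.records["DN=" + casefold_dn(dn)] for dn in dns]

    def base_search(self, dn):
        # DN -> casefold -> key
        return self.records.get("DN=" + casefold_dn(dn))


class GuidKeyedStore:
    """New scheme: records keyed by GUID; index values are GUIDs."""

    def __init__(self):
        self.records = {}   # "GUID=" + guid -> record
        self.index = {}     # (attr, value) -> list of GUIDs
        self.dn_index = {}  # casefolded DN -> GUID (assumed layout)

    def add(self, dn, guid, attrs):
        self.records["GUID=" + guid] = {"dn": dn, "guid": guid, **attrs}
        self.dn_index[casefold_dn(dn)] = guid
        for item in attrs.items():
            self.index.setdefault(item, []).append(guid)

    def indexed_search(self, attr, value):
        # index -> key: the index value is the DB key, plus a prefix
        guids = self.index.get((attr, value), [])
        return [self.records["GUID=" + g] for g in guids]

    def base_search(self, dn):
        # DN -> index -> key: one extra hop via the DN index
        guid = self.dn_index.get(casefold_dn(dn))
        return None if guid is None else self.records["GUID=" + guid]
```

In this model the indexed search on the GUID-keyed store never calls
casefold_dn(), while the base search gains one dictionary lookup.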

> Is it expected that only some workloads are faster?

Yes.  Deleting from an index is what hurts the most in the current
code (it is a linear scan).  The rest of the benefit comes from having a
smaller index overall, which reduces the memcpy time in the read and in
the transaction commit.
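
As a rough illustration of the difference between a linear scan and a
binary search when deleting one value from an index list, here is a
minimal sketch.  It assumes the index values are fixed-size GUIDs kept
in sorted order; the function names are hypothetical and this is not
the LDB implementation.

```python
import bisect


def delete_linear(index_values, target):
    """Compare against each entry in turn until the target is found."""
    for i, value in enumerate(index_values):
        if value == target:
            del index_values[i]
            return True
    return False


def delete_sorted(index_values, target):
    """Binary search a sorted list, then remove the matching entry."""
    i = bisect.bisect_left(index_values, target)
    if i < len(index_values) and index_values[i] == target:
        del index_values[i]
        return True
    return False
```

Both versions still shift the tail of the list when the entry is
removed; what the binary search saves is the comparisons needed to find
it, which drop from O(n) to O(log n).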

> Do these numbers already include the BINARY_SEARCH patches?

Yes.

> > This series also passes a full make test.  It also showed some flapping
> > tests, so I plan to chase those down and I look forward to a positive
> > review!
> 
> I guess you'll re-sort the commits so that the version bump happens
> at the end just before the final patch that activates it in Samba?

Yes, that is the plan.

> Can you also run an autobuild without the activation in Samba,
> so that we're sure we don't insert regressions for possible backports?

Yes, I'll check it with and without activation. 

I have another proposal, in a separate thread, for the changes that
need to make it into 4.7.

> I'm planning to have a closer look during the next two weeks.
> As this is a huge patchset that immediately results in a new
> ldb release, we should take our time for this (as it won't make it
> to 4.7.0) without rushing this into master.

To give you some confidence, Garming has had a look at it, and I plan
to deal with the small details he has found in the next couple of days.

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba



