ldb + samdb perfs and ideas for perf improvement

Mon Jan 21 00:27:10 MST 2013

On Sun, 2013-01-20 at 23:16 -0800, Matthieu Patou wrote:
> On 01/08/2013 09:25 PM, Matthieu Patou wrote:
> > On 01/07/2013 11:45 AM, simo wrote:
> >> On Mon, 2013-01-07 at 00:22 -0800, Matthieu Patou wrote:
> >>> Hello all,
> >>>
> >>> As some are already aware I started a task on performance
> >>> improvements on ldb and by side effect on samdb as well (or more
> >>> exactly on ldb_modules used by samdb).
> >>> The work in progress result of this is located at
> >>> http://git.samba.org/mat/?p=mat/samba.git;a=shortlog;h=refs/heads/ldb_perfs 
> >>>
> >>> It pass make tests, there is some stuff to fix in it like checking
> >>> for null pointers retuned by talloc and also limiting the size of
> >>> the talloc_pool but you should get an idea of it. I tried and think
> >>> managed to not break ldb ABI if I did well let me know I'm pretty
> >>> sure I can work around.
> >>>
> >>> I started my worked based on callgrind trace from Thomas Manniger
> >>> and continued by using a couple of python script to reproduce what I
> >>> think are typical searches, that is to say:
> >>>
> >>> 1) doing a search that will use a not really selective index on a
> >>> subtree that is in turn selective, thus causing the index to return
> >>> a lot of entries by the search itself not so much. This kind of
> >>> search didn't happen too often with the default provision but is
> >>> very likely to happen if you start to partition the users partition,
> >>> for instance if you have CN=<dept>,CN=Users,DC=domain,DC=tld
> >>> where dept can be sales, finances, hr, engineering, it ... and do a
> >>> search like '(objectclass=users)',
> >>> base="CN=IT,CN=Users,DC=domain,DC=tld" you will hit a not very
> >>> selective index but after you will not have a lot entries in IT
> >>> (relatively speaking to the number of total users). Based on my
> >>> experience of IT manager this is scheme is a pretty frequent one and
> >>> admins tends to think that doing search in subcontainers help the
> >>> server to do quick search (hint: it was not true).
> >>>
> >>> 2) doing a search that will use a not really selective index on the
> >>> DN of a partition (DC=domain,DC=tld, or
> >>> DC=Configuration,DC=domain,DC=tld, ...) with also a criteria on a
> >>> non indexed attribute (at least from ldb-tdb point of view). This
> >>> kind of search is *very* frequent in real life ie.
> >>> (&(objectclass=user)(mail=mat*)). This kind of search usually
> >>> returns not a lot of objects and not a lot of attributes.
> >>>
> >>> 3) doing a search that returns a lot of objects with one attribute
> >>>
> >>> 4) doing a basedn search returning a couple of attributes, this kind
> >>> of search is present a lot in the samdb and is not too much
> >>> influenced by indexes as (for the moment) samdb is indexed by DNs.
> >>>
> >>> I've been mainly changing:
> >>>
> >>> * doing the scope filtering before unpacking the object, this impact
> >>> searches like 1) and has a huge impact on performance (200%/300%
> >>> speed gain just this alone) as we don't have to unmarshall most of
> >>> the object and just talloc_free them a couple of seconds after.
> >>>
> >>> * indexing (including onelevel indexing) to use DNs stored in the
> >>> case folded form already, the reason for this is that as tdb's key
> >>> are DNs in the case folded we can save quite a lot of time at search
> >>> by not doing the case folding of DNs returned by an index and just
> >>> marking them already casefolded, it requires to store the DN in a
> >>> casefolded form and offset part of the cost to the write operation
> >>> (but usually the database has much more read than writes), this has
> >>> an influence in 1), 2), and to some extent 3 but the cost of search
> >>> callbacks reduce the impact of this change
> >>>
> >>> * filtering attributes not needed for the search criteria or for the
> >>> requested attributes, so that they are not talloced to be removed
> >>> just a couple of microsecond after the complete filtering.
> >>> This impact searches 1) 2) and 4)
> >>>
> >>> * reworking the operational module to avoid a double loop search to
> >>> be done on every single attribute of every single entry returned by
> >>> a search, this has an influence on search 3)
> >>>
> >>> For search 1) I have 600/700% of increase with a index of 1300
> >>> objects but with only 1 object matching the scope with one
> >>> attribute, that's impressive but it's definitely not the most
> >>> frequent query.
> >>> For search 2) I have 500/550% of increase with an index of 1300
> >>> objects but with only 1 object with criteria on indexed attribute
> >>> and a non indexed criteria
> >>> For search 3) I have 100%/130% of increase
> >>> For search 4) I have 100%/130% of increase
> >>>
> >>> I have also achieved a ~20% improvement on make test TESTS=samba4 
> >>> (tests
> >>> that can be impacted ihmo by improvements in ldb/dsdb) with some very
> >>> long tests like samba4.rpc.samr.large-dc.two or
> >>> samba4.ldap.secdesc.python  being improved from 30% to 50%.
> >>>
> >>> Thomas, can you try those patches but not on the database you have in
> >>> production, before doing test performance you have to reindex the
> >>> database so that the new indexing scheme is used (for the moment
> >>> no-autoupgrade): samba-tool dbcheck --reindex
> >> A few q.s about patches
> >>
> >> In ldb_compare_base_one you use this function:
> >> get_num_char_uptoposition() that does not skip escaped chars, so if you
> >> have a \, you might count it as one component when in fact there is
> >> none.
> > That's done on purpose, I don't want to have a complicated (costly) 
> > algorithm, my hypothesis is that those cases are pretty rare so if I'm 
> > asked if CN=foo\,bar,cn=baz has cn=baz as one level base, it should go 
> > to the path were we case fold both the dn and the base and do the 
> > correct comparison.
> >> I guess the point is that if they have *more* text then it already is at
> >> least one level, well that would be true if the 'optimization' weren't
> >> buggy in the first place.
> > The point is that if it's a first level there is one non escaped ',' 
> > if there is more it's either because it's not a first level or because 
> > there is an escaped ','.
> >>
> >> You could have attrname=foo,dc=bar and name=foo,dc=bar as the 2 DNs, and
> >> this function would return the wrong answer. It would tell you they have
> >> the same base when in fact they do not. We should fix that not build on
> >> this error to also return that there is one more component and match a
> >> onelevel that does not exist.
> > Well I'm pretty sure that there is a test for this but even if not 
> > (I'll fix this minor issue if any) we first do:
> > dif = strlen(attrname=foo,dc=bar) - strlen(name=foo,dc=bar)
> > which will return 3, then we compare that when taking the string of 
> > the dn starting at offset dif it matches the string of the base, in 
> > your example it matches,
> > Then we search in the range [0, dif -1] for the presence of ',' if we 
> > found 1 then it's assuming that there is a child. In your example, the 
> > search for a ',' in the range [0, 2] won't find any ','.
> >
> > But indeed there was an issue (already existing) with subtree scope 
> > with the kind of example you gave so I rewrote the code to be more 
> > careful.
> >
> >>
> >> What signatures 'changed' in "ldb: increase version due to signature
> >> changes" ?
> > It's because I have made the functions:
> >
> > ldb_dn_compare_base_one: int (struct ldb_dn *, struct ldb_dn *)
> > ldb_dn_set_casefolded: void (struct ldb_dn *)
> >
> > And made 2 other functions visible from ldb subsystem (that is to say 
> > they are in the private api but are usable from ldb_tdb):
> > ldb_match_message: int (struct ldb_context *, const struct ldb_message 
> > *, const struct ldb_parse_tree *, enum ldb_scope, bool *)
> > ldb_match_scope: int (struct ldb_context *, struct ldb_dn *, struct 
> > ldb_dn *, enum ldb_sc
> >> Also you increase the version again a few patches later, please increase
> >> it once only, if really necessary.
> > So yes it's needed and I started one set of optimisation and then a 
> > second set, when I'll push it I'll merge both so that we just have one 
> > bump, maybe it can be worth to bump from 1.1.x to 1.2.x.
> >> In general I like the direction you are going, but haven't had time to
> >> finely inspect all patches yet.
> > Good, wait for more comments.
> > Matthieu.
> >
> 
> Hello I pushed here:
> http://git.samba.org/mat/?p=mat/samba.git;a=shortlog;h=refs/heads/wait_reviews
> 
> A first set of patches, can I have a review for those who aren't ?
> They are mostly straight forward apart form:
> 
> ldb: Add ldb_dn_compare_base_one for checking if
> dsdb-operational: rework the loop for attribute removal
> dsdb-extended: use a cache of dsdb_attributes for the

Have you seen the comments metze has in his branch?
https://git.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-tmp3 

It is for metze to write these up formally for you, but I just wanted
you get get some sense that this is being looked at, and there is some
feedback you can expect soon.  Metze I know has been busy trying to sort
out the ACLs, with the patches we have both done recently now finally
being in shape to get into the tree.

In terms of those reviews, I tend to agree with metze's terse markers so
far.

Andrew Bartlett
-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org