Fwd: Regression: ldb performance with indexes

Andrew Bartlett abartlet at samba.org
Thu May 2 20:51:31 UTC 2024


On Thu, 2024-05-02 at 14:49 +0200, Andreas Schneider via samba-
technical wrote:
> On Friday, 22 March 2024 17:55:23 GMT+2 Andréas LEROUX via samba-
> technical wrote:
> > Hi Andreas and Andrew,
> >
> > >>>> > Hi, my colleagues discovered a performance issue in libldb:
> > >>>> > https://bugzilla.samba.org/show_bug.cgi?id=15590
> > >>>> >
> > >>>> > As soon as you use indexes, ldbadd will be magnitudes slower
> > >>>> > than it was before. Could some ldb expert please look into it?
> > >>>>
> > >>>> Your subject says a regression. What version is this a
> > >>>> regression against?
> > >>>
> > >>> Isn't that obvious from the bug report?
> > >>>
> > >>> Here is the short summary:
> > >>> $ bash repro.sh 20000 indexes
> > >>> Added 2 records successfully
> > >>> Added 20000 records successfully
> > >>> On Samba 4.10: 0m01.231s
> > >>> On Samba 4.19: 1m30.924s (that's 90 times slower)
> > >>>
> > >>>> The very nature of a DB index is that it will take time to
> > >>>> create, possibly a lot of time, but should make reads faster.
> > >>>
> > >>> Either the DB index doesn't work at all in Samba 4.10 or there
> > >>> is a huge performance problem in Samba 4.19. What is it?
> > >>
> > >> Thanks, that wasn't written as obviously on the bug, thanks for
> > >> the clarification.
> > >
> > > I used our CentOS 8 Stream CI image for bisecting. You can't bisect
> > > easily on a modern Linux Distribution, as the included waf would
> > > not have support for newer Python versions like 3.12.
> > >
> > > In case you want to reproduce it, here is my run:
> >
> > I'm Andréas from Tranquil IT dev team. Denis and Yohannès asked me
> > this week to take a look at the performance issues on large domains,
> > which include this issue in the current thread along with the mdb
> > large transaction issues.
> >
> > The attached patchset passes all the tdb and ldb make test runs.
> >
> > * LMDB: increase MDB_IDL_LOGN from 16 to 18 to accommodate large
> >   nested transactions
> > * tdb: fail fast when the record hash doesn't match the expected
> >   hash, to avoid reading/copying the entire record
> > * ldb: increase DEFAULT_INDEX_CACHE_SIZE from 491 to 8089 to increase
> >   the number of buckets, so that each bucket is smaller and iteration
> >   within each bucket in tdb_find is faster
> >
> > With this patchset we can upgrade large domains (>200k objects) to
> > FL2k16 level in approximately 1 hour instead of 3 days :-)
> >
> > [root at srvads1-bl1cw ~]# bash repro.sh 20000 indexes
> > Added 2 records successfully
> > Added 20000 records successfully
> > real 0m0.536s
> > user 0m0.798s
> > sys 0m0.105s
> 
> I'm sorry but I'm not able to reproduce this:
> 
> tis-tdbfind.patch:
> bash repro_dev_ldb.sh 10000 indexes
> Added 2 records successfully
> Added 10000 records successfully
> real    0m9.035s
> user    0m9.021s
> sys     0m0.283s
> 
> tis-ldbfind.patch:
> bash repro_dev_ldb.sh 10000 indexes
> Added 2 records successfully
> Added 10000 records successfully
> real    0m8.929s
> user    0m8.980s
> sys     0m0.219s
> 
> 
> I have a patch in the area to get rid of some malloc calls, but they
> only give a really small improvement.
> 
> I don't know exactly which workflow your patches improve, but it would
> be nice to have a reproducer :-)

Just a quick note to connect some threads.  We have three discussions
on this same issue.  We should probably centralise here, as this is
where things started, but just so folks can follow, see:

https://bugzilla.samba.org/show_bug.cgi?id=15590
https://gitlab.com/samba-team/samba/-/merge_requests/3616

In short, the emerging consensus is that what we really need is a
better data structure than an in-memory TDB for the in-memory cache
needed to keep the indexes lined up with the database in this case.
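
To illustrate why the bucket count in the quoted patch description
matters, here is a minimal sketch in Python.  It is not ldb or tdb code:
it only models a generic chained hash table, uses zlib.crc32 as a
stand-in hash function (an assumption, not tdb's hash), and the key
format and the helper names chain_lengths and avg_probe_cost are
hypothetical.  The bucket counts 491 and 8089 and the 20000 records come
from the thread.

```python
import zlib


def chain_lengths(num_keys, num_buckets):
    """Return the number of keys landing in each bucket of a chained hash table."""
    buckets = [0] * num_buckets
    for i in range(num_keys):
        # Hypothetical key format, for illustration only.
        key = f"CN=user{i},DC=example,DC=com".encode()
        # crc32 is a stand-in hash (assumption), not the hash tdb uses.
        buckets[zlib.crc32(key) % num_buckets] += 1
    return buckets


def avg_probe_cost(lengths):
    """Average entries scanned by a successful lookup of a random stored key."""
    total = sum(lengths)
    # A key in a chain of length n is found after (n + 1) / 2 steps on average,
    # so a chain of length n contributes n * (n + 1) / 2 steps in total.
    return sum(n * (n + 1) / 2 for n in lengths) / total


if __name__ == "__main__":
    for size in (491, 8089):
        lengths = chain_lengths(20000, size)
        print(size, max(lengths), round(avg_probe_cost(lengths), 2))
```

With a fixed number of buckets the average chain length is still
proportional to the number of records, so a larger table lowers the
per-lookup cost by a constant factor but does not change how that cost
grows with the record count.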

Andrew Bartlett
-- 
Andrew Bartlett (he/him)       https://samba.org/~abartlet/
Samba Team Member (since 2001) https://samba.org
Samba Team Lead                https://catalyst.net.nz/services/samba
Catalyst.Net Ltd

Proudly developing Samba for Catalyst.Net Ltd - a Catalyst IT group
company
Samba Development and Support: https://catalyst.net.nz/services/samba
Catalyst IT - Expert Open Source Solutions

