Fwd: Regression: ldb performance with indexes

Gary Lockyer gary at catalyst.net.nz
Thu May 2 21:25:21 UTC 2024


You can set the cache size by passing an option to ldb from Python;
see join_provision_own_domain in python/samba/join.py.

So maybe that could be done in the schema upgrade as a temporary measure
until a better data structure can be implemented.
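
As a rough illustration of what passing such an option looks like, here
is a minimal sketch. The option name transaction_index_cache_size and
the value 200000 are what I recall join_provision_own_domain using, so
treat both as assumptions and check python/samba/join.py before relying
on them. The helper names are hypothetical and exist only for this
example.

```python
# Sketch only: build an ldb options list that asks for a larger index cache.

# Assumption: option name recalled from join_provision_own_domain in
# python/samba/join.py; verify it against the Samba source.
INDEX_CACHE_OPTION = "transaction_index_cache_size"


def index_cache_options(size):
    """Return an ldb options list requesting an index cache of `size`."""
    if size <= 0:
        raise ValueError("cache size must be a positive integer")
    return ["%s:%d" % (INDEX_CACHE_OPTION, size)]


def open_with_cache(url, size):
    """Open the database at `url` with the larger cache.

    Hypothetical helper; needs the ldb Python bindings, which are
    imported lazily so index_cache_options() works without them.
    """
    import ldb
    return ldb.Ldb(url, options=index_cache_options(size))
```

A schema upgrade script could call something like
open_with_cache(url, 200000) when it reopens the database, where the
size is a value to tune rather than a recommendation.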

Gary

On 3/05/24 08:51, Andrew Bartlett via samba-technical wrote:
> On Thu, 2024-05-02 at 14:49 +0200, Andreas Schneider via samba-technical wrote:
>> On Friday, 22 March 2024 17:55:23 GMT+2 Andréas LEROUX via samba-technical wrote:
>>> Hi Andreas and Andrew,
>>>
>>>>>>>> Hi,
>>>>>>>> my colleagues discovered a performance issue in libldb:
>>>>>>>> https://bugzilla.samba.org/show_bug.cgi?id=15590
>>>>>>>>
>>>>>>>> As soon as you use indexes, ldbadd will be magnitudes slower
>>>>>>>> than it was before. Could some ldb expert please look into it?
>>>>>>>
>>>>>>> Your subject says a regression. What version is this a
>>>>>>> regression against?
>>>>>>
>>>>>> Isn't that obvious from the bug report?
>>>>>>
>>>>>> Here is the short summary:
>>>>>>
>>>>>> $ bash repro.sh 20000 indexes
>>>>>> Added 2 records successfully
>>>>>> Added 20000 records successfully
>>>>>>
>>>>>> On Samba 4.10: 0m01.231s
>>>>>> On Samba 4.19: 1m30.924s (that's 90 times slower)
>>>>>>>
>>>>>>> The very nature of a DB index is that it will take time to
>>>>>>> create, possibly a lot of time, but should make reads faster.
>>>>>>
>>>>>> Either the DB index doesn't work at all in Samba 4.10 or there is
>>>>>> a huge performance problem in Samba 4.19. What is it?
>>>>>
>>>>> Thanks, that wasn't written as obviously on the bug, thanks for the
>>>>> clarification.
>>>>
>>>> I used our CentOS 8 Stream CI image for bisecting. You can't bisect
>>>> easily on a modern Linux Distribution, as the included waf would not
>>>> have support for newer Python versions like 3.12.
>>>>
>>>> In case you want to reproduce it, here is my run:
>>>
>>> I'm Andréas from the Tranquil IT dev team. Denis and Yohannès asked
>>> me this week to take a look at the performance issues on large
>>> domains, which include the issue in this thread along with the mdb
>>> large transaction issues.
>>>
>>> The attached patchset passes make test for both tdb and ldb.
>>>
>>> * LMDB: increase MDB_IDL_LOGN from 16 to 18 to accommodate large
>>>   nested transactions
>>> * tdb: fail fast when the record hash doesn't match the expected
>>>   hash, to avoid reading/copying the entire record
>>> * ldb: increase DEFAULT_INDEX_CACHE_SIZE from 491 to 8089 to
>>>   increase the number of buckets, so that each bucket is smaller
>>>   and iteration within each bucket in tdb_find is faster
>>>
>>> With this patchset we can upgrade large domains (>200k objects) to
>>> FL2k16 level in approximately 1 hour instead of 3 days :-)
>>>
>>> [root at srvads1-bl1cw ~]# bash repro.sh 20000 indexes
>>> Added 2 records successfully
>>> Added 20000 records successfully
>>>
>>> real 0m0.536s
>>> user 0m0.798s
>>> sys 0m0.105s
>>
>> I'm sorry but I'm not able to reproduce this:
>>
>> tis-tdbfind.patch:
>> bash repro_dev_ldb.sh 10000 indexes
>> Added 2 records successfully
>> Added 10000 records successfully
>>
>> real    0m9.035s
>> user    0m9.021s
>> sys     0m0.283s
>>
>> tis-ldbfind.patch:
>> bash repro_dev_ldb.sh 10000 indexes
>> Added 2 records successfully
>> Added 10000 records successfully
>>
>> real    0m8.929s
>> user    0m8.980s
>> sys     0m0.219s
>>
>>
>> I have a patch in this area to get rid of some malloc calls, but it
>> only gives a really small improvement.
>>
>> I don't know exactly which workflow your patches improve, but it
>> would be nice to have a reproducer :-)
>
> Just a quick note to connect some threads.  We have three discussions
> on this same issue, we should probably centralise here as this is where
> things started, but just so folks can follow, see:
> https://bugzilla.samba.org/show_bug.cgi?id=15590
> https://gitlab.com/samba-team/samba/-/merge_requests/3616
> In short, the emerging consensus is that what we really need is a
> better data structure than an in-memory TDB for the in-memory cache
> needed to keep the indexes lined up with the database in this case.
> Andrew Bartlett
> -- 
> Andrew Bartlett (he/him)       https://samba.org/~abartlet/
> Samba Team Member (since 2001) https://samba.org
> Samba Team Lead                https://catalyst.net.nz/services/samba
> Catalyst.Net Ltd
> Proudly developing Samba for Catalyst.Net Ltd - a Catalyst IT group
> company
> Samba Development and Support: https://catalyst.net.nz/services/samba
> Catalyst IT - Expert Open Source Solutions

-- 
Gary Lockyer
Catalyst.Net Limited - Expert Open Source Solutions

Catalyst.Net Ltd - a Catalyst IT group company
DDI: +64 4 123 4567 | Mob: +64 21 123 4567 | Tel: +64 4 123 4567 | www.catalyst.net.nz

CONFIDENTIALITY NOTICE: This email is intended for the named recipients only. It may contain privileged, confidential or copyright information. If you are not the named recipient, any use, reliance upon, disclosure or copying of this email or its attachments is unauthorised. If you have received this email in error, please reply via email or call +64 4 499 2267.



