[WIP] [PATCH] ldb: new on-disk pack format

Aaron Haslett aaronhaslett at catalyst.net.nz
Wed May 15 23:29:17 UTC 2019


Patches setting the stage for this new pack format are in master at
ea7fd52..85b6f71 (6 patches).

This new version (attached) includes many fixes and code for repacking
databases at the appropriate versions.  We decided to keep GUID
indexing and the new pack format linked: a DN-indexed database is
assumed to use pack format V1 (LDB will error if it doesn't), and a
GUID-indexed database is assumed to use V2 and will be repacked if it
doesn't.
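
As a rough illustration of that rule (a minimal sketch only; the enum
and function names below are hypothetical and are not the identifiers
used in the patch), the decision made when a database is opened looks
something like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the patch. */
enum pack_version { PACK_V1 = 1, PACK_V2 = 2 };
enum pack_action { PACK_OK, PACK_REPACK, PACK_ERROR };

/*
 * Decide what to do given the index mode and the pack version
 * found on disk:
 *   DN index   + V1 -> ok,  DN index   + V2 -> error
 *   GUID index + V2 -> ok,  GUID index + V1 -> repack to V2
 */
static enum pack_action check_pack_version(bool guid_indexed,
					   enum pack_version on_disk)
{
	assert(on_disk == PACK_V1 || on_disk == PACK_V2);

	if (guid_indexed) {
		return (on_disk == PACK_V2) ? PACK_OK : PACK_REPACK;
	}
	return (on_disk == PACK_V1) ? PACK_OK : PACK_ERROR;
}
```

Only the GUID-indexed case ever triggers a repack; a DN-indexed
database found at V2 is treated as an error rather than converted.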

MR: https://gitlab.com/samba-team/samba/merge_requests/450


On 30/04/19 2:55 PM, Andrew Bartlett wrote:
> On Tue, 2019-04-30 at 14:25 +1200, Aaron Haslett via samba-technical
> wrote:
>> Garming discovered poor performance when recursively calculating group
>> membership for a user during LDAP bind.  This WIP patch attempts to fix
>> the problem by separating values from the rest of the data in our LDB
>> pack format.  This should dramatically reduce the amount of data loaded
>> into cache while unpacking with flag LDB_UNPACK_DATA_FLAG_NO_DATA_ALLOC.
>>
>> Correctness testing is included and a CI run is here:
>>
>> https://gitlab.com/samba-team/devel/samba/pipelines/59051539
>>
>> To be done:
>>
>>   * Performance testing
>>   * Research into OpenLDAP's pack format and possible modifications to
>>     ours based on theirs
> I've looked at the OpenLDAP code (mdb_entry_encode()), and the big
> difference is not in the implementation but in the ability to follow
> the code.  Both need more inline comments, but the OL code also avoids
> whitespace (ouch).
>
> One thing this code does that OL doesn't is pack the offsets at
> smaller than 'unsigned int' size.
>
> It looks like OpenLDAP avoids the issue being worked on here (large
> multi-valued attributes needing to be loaded and discarded) by putting
> them in different DB keys with SLAP_ATTR_BIG_MULTI, but it also puts
> the data at the end of the buffer.
>
> So from a 'is there something major we are missing' point of view, I
> think what we are doing is reasonable.
>
> Finally, for a future investigation, I think we should remove the
> 'talloc individual pointers' behaviour entirely, and leave that to the
> 'filter' layer in ldb_key_value (which copies the whole entry). 
>
> I hope this helps answer these questions,
>
> Andrew Bartlett
>
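
To make the "values separated from the rest of the data" idea from the
original mail concrete, here is a toy sketch.  The layout is an
assumption made up for illustration and is NOT the V2 format in the
patch: all the lengths sit together at the front and the value bytes
follow, so a reader can locate any value by walking the small header
without touching the other values' bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy layout (hypothetical, little-endian):
 *   [u32 count][u32 len_0]...[u32 len_{count-1}][value_0]...[value_{count-1}]
 */
static void put_u32(uint8_t *p, uint32_t v)
{
	p[0] = v & 0xff;
	p[1] = (v >> 8) & 0xff;
	p[2] = (v >> 16) & 0xff;
	p[3] = (v >> 24) & 0xff;
}

static uint32_t get_u32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Pack count strings; returns bytes written, or 0 if buf is too small. */
static size_t toy_pack(uint8_t *buf, size_t buf_len,
		       const char *const *vals, uint32_t count)
{
	size_t head = 4 + 4 * (size_t)count;
	size_t need = head;
	size_t off = head;
	uint32_t i;

	assert(buf != NULL && vals != NULL);
	for (i = 0; i < count; i++) {
		need += strlen(vals[i]);
	}
	if (need > buf_len) {
		return 0;
	}
	put_u32(buf, count);
	for (i = 0; i < count; i++) {
		size_t len = strlen(vals[i]);
		put_u32(buf + 4 + 4 * (size_t)i, (uint32_t)len);
		memcpy(buf + off, vals[i], len);
		off += len;
	}
	return off;
}

/*
 * Point *val at value idx inside buf without copying it.  Only the
 * length table is read; no value bytes are touched.
 * Returns 0 on success, -1 on a bad index or truncated buffer.
 */
static int toy_value_ref(const uint8_t *buf, size_t buf_len, uint32_t idx,
			 const uint8_t **val, uint32_t *len)
{
	uint32_t count, i;
	size_t off;

	if (buf_len < 4) {
		return -1;
	}
	count = get_u32(buf);
	if (idx >= count || count > (buf_len - 4) / 4) {
		return -1;
	}
	off = 4 + 4 * (size_t)count;
	for (i = 0; i < idx; i++) {
		off += get_u32(buf + 4 + 4 * (size_t)i);
		if (off > buf_len) {
			return -1;
		}
	}
	*len = get_u32(buf + 4 + 4 * (size_t)idx);
	if (*len > buf_len - off) {
		return -1;
	}
	*val = buf + off;
	return 0;
}
```

With a large multi-valued attribute, a reader in this toy scheme only
pulls the length table into cache until it actually dereferences a
value, which is the effect the patch is after for the no-alloc unpack
path.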
-------------- next part --------------
A non-text attachment was scrubbed...
Name: packing_format_v2.patch
Type: text/x-patch
Size: 64335 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20190516/890dbb8b/packing_format_v2.bin>

