[PATCH] TDB traverse lock changes for massive AD DC perf improvement

Stefan Metzmacher metze at samba.org
Thu Apr 6 07:47:49 UTC 2017


On 05.04.2017 at 20:56, Andrew Bartlett via samba-technical wrote:
> On Wed, 2017-04-05 at 15:40 +0200, Stefan Metzmacher via
> samba-technical wrote:
>> Hi Andrew,
>>
>>>>> Please review.  If reviewed, I'll push with a patch that adds
>>>>> new performance tests that I'm keen to get in.
>>>>
>>>> I'm wondering about all the readonly checks in
>>>> _tdb_transaction_prepare_commit(),
>>>> we already handle that in _tdb_transaction_start().
>>>>
>>>> I'm a bit nervous about the solaris10 problem.
>>>
>>> I am too.  I only got game to formally propose it when Jeremy
>>> essentially proclaimed it dead :-)
>>
>> I discussed this with Volker and we think we have an understanding
>> of what the solaris problem might be.
>>
>> The pattern with the traverse_read and prepare_commit interaction is
>> the following:
>>
>> 1. transaction_start got the allrecord lock with F_RDLCK.
>>
>> 2. the traverse_read code walks the database in a sequence like this
>> (per chain):
>>    2.1  chainlock(chainX, F_RDLCK)
>>    2.2  recordlock(chainX.record1, F_RDLCK)
>>    2.3  chainunlock(chainX, F_RDLCK)
>>    2.4  callback(chainX.record1)
>>    2.5  chainlock(chainX, F_RDLCK)
>>    2.6  recordunlock(chainX.record1, F_RDLCK)
>>    2.7  recordlock(chainX.record2, F_RDLCK)
>>    2.8  chainunlock(chainX, F_RDLCK)
>>    2.9  callback(chainX.record2)
>>    2.10 chainlock(chainX, F_RDLCK)
>>    2.11 recordunlock(chainX.record2, F_RDLCK)
>>    2.12 chainunlock(chainX, F_RDLCK)
>>    2.13 goto next chain
>>
>>    So it always has one record locked in F_RDLCK mode and tries to
>>    get the 2nd one before it releases the first one.
>>
>> 3. prepare_commit tries to upgrade the allrecord lock to F_WRLCK.
>>    If that happens at the time of 2.4, the operation of
>>    2.5 may deadlock with the allrecord lock upgrade.
>>    On Linux step 2.5 succeeds, so the locking makes progress,
>>    but on Solaris it might fail because the kernel
>>    wants to satisfy the 1st lock requester before the 2nd one.
>>
>> I think the first step is a standalone test that does this:
>>
>> process1: F_RDLCK for ofs=0 len=2
>> process2: F_RDLCK for ofs=0 len=1
>> process1: upgrade ofs=0 len=2 to F_WRLCK (in blocking mode)
>> process2: F_RDLCK for ofs=1 len=1
>> process2: unlock ofs=0 len=2
>> process1: should continue at that point
>>
>> Once we have such a test we can run it on several solaris, freebsd,
>> linux or whatever.
>>
>> Then we can decide if we want a configure and/or runtime check for
>> this, and only avoid the transaction F_RDLCK lock in traverse_read
>> if the kernel behaves as expected.
>>
>> Can you write such a standalone test?
> 
> I'll give it a shot today.  Can I do it in python?

No, I think we need this in C.

metze
