[PATCH] TDB traverse lock changes for massive AD DC perf improvement
Stefan Metzmacher
metze at samba.org
Thu Apr 6 07:47:49 UTC 2017
On 05.04.2017 at 20:56, Andrew Bartlett via samba-technical wrote:
> On Wed, 2017-04-05 at 15:40 +0200, Stefan Metzmacher via
> samba-technical wrote:
>> Hi Andrew,
>>
>>>>> Please review. If reviewed, I'll push with a patch that adds new
>>>>> performance tests that I'm keen to get in.
>>>>
>>>> I'm wondering about all the readonly checks in
>>>> _tdb_transaction_prepare_commit(),
>>>> we already handle that in _tdb_transaction_start().
>>>>
>>>> I'm a bit nervous about the Solaris 10 problem.
>>>
>>> I am too. I only got game to formally propose it when Jeremy
>>> essentially proclaimed it dead :-)
>>
>> I discussed this with Volker and we think we have an understanding
>> of what the Solaris problem might be.
>>
>> The pattern with the traverse_read and prepare_commit interaction is
>> the following:
>>
>> 1. transaction_start got the allrecord lock with F_RDLCK.
>>
>> 2. the traverse_read code walks the database in a sequence like this
>> (per chain):
>> 2.1 chainlock(chainX, F_RDLCK)
>> 2.2 recordlock(chainX.record1, F_RDLCK)
>> 2.3 chainunlock(chainX, F_RDLCK)
>> 2.4 callback(chainX.record1)
>> 2.5 chainlock(chainX, F_RDLCK)
>> 2.6 recordunlock(chainX.record1, F_RDLCK)
>> 2.7 recordlock(chainX.record2, F_RDLCK)
>> 2.8 chainunlock(chainX, F_RDLCK)
>> 2.9 callback(chainX.record2)
>> 2.10 chainlock(chainX, F_RDLCK)
>> 2.11 recordunlock(chainX.record2, F_RDLCK)
>> 2.12 chainunlock(chainX, F_RDLCK)
>> 2.13 goto next chain
>>
>> So it always has one record locked in F_RDLCK mode and tries to
>> get the 2nd one before it releases the first one.
>>
>> 3. prepare_commit tries to upgrade the allrecord lock to F_WRLCK.
>> If that happens at the time of 2.4, the operation of
>> 2.5 may deadlock with the allrecord lock upgrade.
>> On Linux step 2.5 succeeds, so the locking makes progress,
>> but on Solaris it might fail because the kernel
>> wants to satisfy the 1st lock requester before the 2nd one.
>>
>> I think the first step is a standalone test that does this:
>>
>> process1: F_RDLCK for ofs=0 len=2
>> process2: F_RDLCK for ofs=0 len=1
>> process1: upgrade ofs=0 len=2 to F_WRLCK (in blocking mode)
>> process2: F_RDLCK for ofs=1 len=1
>> process2: unlock ofs=0 len=2
>> process1: should continue at that point
>>
>> Once we have such a test we can run it on Solaris, FreeBSD,
>> Linux or whatever else.
>>
>> Then we can decide if we want a configure and/or runtime check for
>> this, and only avoid the transaction F_RDLCK lock in traverse_read
>> if the kernel behaves as expected.
>>
>> Can you write such a standalone test?
>
> I'll give it a shot today. Can I do it in python?
No, I think we need this in C.
metze
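
For illustration only, here is a rough sketch of what such a standalone C test could look like. It is not the test that was eventually written for Samba. The names set_lock(), process2() and run_lock_upgrade_test() are made up for this sketch, as are the return codes, the 1 second sleep used to let process1 block, and the 5 second alarm that stops a hung process2.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: set or clear a byte-range lock with fcntl().
 * cmd is F_SETLK (non-blocking) or F_SETLKW (blocking). */
static int set_lock(int fd, int cmd, short type, off_t ofs, off_t len)
{
	struct flock fl = {0};

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = ofs;
	fl.l_len = len;
	return fcntl(fd, cmd, &fl);
}

/* process2 (child): exit 0 = pattern worked, 1 = second read lock
 * failed, 2 = setup error. A hang is ended by SIGALRM. */
static void process2(const char *path, int ready_fd)
{
	char c = 'r';
	int fd = open(path, O_RDWR);

	if (fd == -1) _exit(2);
	/* process2: F_RDLCK for ofs=0 len=1 */
	if (set_lock(fd, F_SETLKW, F_RDLCK, 0, 1) != 0) _exit(2);
	/* tell process1 that it may now try the upgrade */
	if (write(ready_fd, &c, 1) != 1) _exit(2);
	sleep(1);	/* crude assumption: process1 is blocked by now */
	alarm(5);	/* default SIGALRM action kills us if we hang */
	/* process2: F_RDLCK for ofs=1 len=1 */
	if (set_lock(fd, F_SETLKW, F_RDLCK, 1, 1) != 0) _exit(1);
	alarm(0);
	/* process2: unlock ofs=0 len=2 */
	if (set_lock(fd, F_SETLKW, F_UNLCK, 0, 2) != 0) _exit(2);
	_exit(0);
}

/* process1 (parent). Returns 0 if process2 got its second read lock
 * and the upgrade then completed, 1 if the kernel refused or stalled
 * process2, -1 on setup errors. */
int run_lock_upgrade_test(const char *path)
{
	int p[2], fd, status = 0, upgrade_ok;
	pid_t pid;
	char c;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1) return -1;
	/* process1: F_RDLCK for ofs=0 len=2 */
	if (set_lock(fd, F_SETLKW, F_RDLCK, 0, 2) != 0 || pipe(p) != 0) {
		close(fd);
		return -1;
	}
	pid = fork();
	if (pid == -1) {
		close(p[0]); close(p[1]); close(fd);
		return -1;
	}
	if (pid == 0) {
		close(p[0]);
		process2(path, p[1]);	/* never returns */
	}
	close(p[1]);
	if (read(p[0], &c, 1) != 1) {	/* child failed before step 2 */
		waitpid(pid, &status, 0);
		close(p[0]); close(fd);
		return -1;
	}
	/* process1: upgrade ofs=0 len=2 to F_WRLCK (in blocking mode) */
	upgrade_ok = (set_lock(fd, F_SETLKW, F_WRLCK, 0, 2) == 0);
	/* process1: should continue at that point */
	waitpid(pid, &status, 0);
	close(p[0]); close(fd);
	if (WIFEXITED(status) && WEXITSTATUS(status) == 2) return -1;
	return (upgrade_ok && WIFEXITED(status) &&
		WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

To try it, call run_lock_upgrade_test() from a small main() with a scratch file path and print the result. Going by the description in the quoted mail, a kernel that behaves like Linux should give 0, and one that serves the queued upgrade first should give 1. The sleep-based ordering is only a heuristic, so a real test would need a more reliable way to know that process1 is actually blocked.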