[PATCH] [BUG# 12858] Re: [WIP][PATCH] Safe LDB locking for better replication

Stefan Metzmacher metze at samba.org
Mon Jun 26 16:05:04 UTC 2017


Am 26.06.2017 um 04:21 schrieb Andrew Bartlett via samba-technical:
> On Mon, 2017-06-26 at 10:29 +1200, Andrew Bartlett via samba-technical
> wrote:
>> On Fri, 2017-06-23 at 12:45 +0200, Stefan Metzmacher wrote:
>>>
>>> 3 autobuilds passed fine.
>>>
>>> However I looked closer at this commit
>>> "dsdb: Add test showing that the whole DB (including partitions) is
>>> locked during a search"
>>> https://git.samba.org/?p=metze/samba/wip.git;a=commitdiff;h=07afa3c48a1f
>>>
>>> And I noticed that the code is not testing what the commit message says.
>>> It's testing that a search can't start while a prepared transaction is
>>> in progress.
>>>
>>> I changed the test_db_lock1() to use a full SamDB in the child
>>> instead of going directly to the raw sam.ldb backend.
>>> As this simulates the real world.
>>>
>>> I added tests which really test if the prepare_commit is blocked by
>>> a search. And that test pass with the full patchset.
>>>
>>> But it fails unexpected without the "dsdb: Teach the Samba partition
>>> module" commit (or if I remove the per partition locks from it).
>>>
>>> UNEXPECTED(failure):
>>> samba.tests.dsdb.samba.tests.dsdb.DsdbTests.test_db_lock2(ad_dc_ntvfs:local)
>>> REASON: Exception: Exception: Traceback (most recent call last):
>>>   File
>>> "/home/metze/devel/samba/4.0/master4-drsuapi/bin/python/samba/tests/dsdb.py",
>>> line 240, in test_db_lock2
>>>     self.assertGreater(end - start, 1.9)
>>> AssertionError: 0.0003879070281982422 not greater than 1.9
>>>
>>> So it seems that the read_lock doesn't reach the tdb backend for sam.ldb
>>>
>>> Can you please debug this?
>>> We need to understand this behavior and fix it.
>>
>> I've taken some more time than I had on Saturday to look over this more
>> carefully:
>>
>> I think the issue is that the test isn't able to create a write lock on
>> sam.ldb because it only makes a modification to
>> DC=SAMBA,DC=EXAMPLE,DC=COM.ldb and this means that 
>>
>> _tdb_transaction_prepare_commit() will just do:
>>
>> 	/* check for a null transaction */
>> 	if (tdb->transaction->blocks == NULL) {
>> 		return 0;
>> 	}
>>
>> To test what you want to test, we should add two more additional tests,
>> that confirm that adding an @ record to sam.ldb operates correctly
>> (these go into the top partition), and adding a record to the
>> CN=CONFIGURUATION backend db blocks a search, showing that it isn't
>> just the traverse search read lock that is blocking it.  (It isn't, but
>> proof would be nice). 
>>
>> My other comments around the assert() calls in the child still hold,
>> but these were not what was breaking it.  
>>
>> I expect to get back to you later today with these improved tests.
> 
> Attached are the improved tests.  I've not done the tidy and squash
> pass, because I want you to clearly understand what I've done without
> resorting to diffs of diffs.
> 
> Thanks for thinking hard about this stuff, and asking the difficult
> questions.  I hope we are almost complete on the code questions,
> however I still have no good solutions on the backward compatibility
> issues. 
> 
> Finally, I'm also running my own builds on the attached to be sure.

Thanks! I'll have a closer look tomorrow.

But it seems the per partition locks are not optional,
as prepare_commit() might be a no-op.

metze


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20170626/bd0ae6b4/signature.sig>


More information about the samba-technical mailing list