LDAP notification tests fail with full-DB locking

Andrew Bartlett abartlet at samba.org
Fri May 26 21:32:24 UTC 2017

On Fri, 2017-05-26 at 21:33 +0200, Stefan Metzmacher via
samba-technical wrote:
> Hi Andrew,
> > I'm trying to get full-DB locking into LDB, but I'm having trouble,
> > as the LDAP notification tests fail.  I've done a bisect and it is only
> > once we enable the LDB locks in "ldb: Lock the whole backend database
> > for the duration of a search" that we fail, but I can't yet figure out
> > how. 
> > 
> > The branch is here:
> > http://git.catalyst.net.nz/gw?p=samba.git;a=shortlog;h=refs/heads/ldb-safe-locking-private-ev-idx
> > 
> > The aim of this branch is to address a number of failures we have seen,
> > including, we suspect, the flapping replication tests.  We have been able
> > to trigger replication failures and missing search results if we hit
> > the DB hard enough. 
> > 
> > If you have any clues as to why the tests fail, I would be most
> > appreciative. 
> Not really, sorry.
> > Finally, I'm wondering if we can get the patches up to "ldb_tdb: Ensure
> > we correctly decrement ltdb->read_lock_count" merged?  These patches
> > are not enough to solve the lack-of-locking issues entirely, but have
> > tests and at least ensure the read performance improves.
> > 
> > Otherwise, it would be good to at least merge the event loop changes
> > and the index improvements. 
> Can you please base the bare minimum on
> https://git.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-ldb-1
> So that we have all tdb changes first, then all ldb changes
> followed by the strictly required samba patches to pass
> autobuild and at the same time have a tree that doesn't
> introduce regressions.
> Having that will make it much easier to get to the rest.

I agree with that approach (splitting things up, making it minimal).

Since I wrote that mail I've been thinking of handling the other side
of it to reduce the set of patches: first publish, seek review on, and
then push the non-locking changes and (perhaps) some more perf work
Douglas has been doing (not yet published), leaving the tdb bump and
the ldb locking changes for a later ldb release.

I may even skip the event loop changes, because before I propose those
I want to add some cmocka tests for my changes in "ldb: Add
ldb_build_req_common() helper function".

I fear the locking bugs have a while to run, and I don't want to commit
us to any locking changes till I get the whole-DB lock working, in case
it causes us to have to re-visit them. 

Would it be OK if we bumped the ldb version a few times, starting with
the easiest patches first?  

It is getting difficult to land this whole train under one version
number.  (They are not really in short supply, are they?)

> Then prepare a branch with the minimum required patches
> to trigger the notify problem and give me the
> make subset that triggers it.

Thanks.  I will get you this branch next week.  I should even be able
to split that in the other direction, and not even include event loop
changes to show it is just the global lock.  (The event loop changes
are needed for the final fix, naturally).

The command is:
make test TESTS=samba4.ldap.notification.python

Finally, testing has shown me that my test "ldb: Show that writes do not
appear during an ldb_search()" hangs, rather than fails, without the
first set of locking fixes, so I need to sort that out.
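
For anyone following along, here is a toy sketch in Python of the
property that test name describes.  It is not ldb or tdb code: ToyDB,
add(), search() and demo() are hypothetical names made up for the
illustration, and the real locking in ldb_tdb is more involved.  The
idea is only that a whole-database lock is held for the duration of a
search, so a concurrent writer has to wait and its write cannot show
up in the results of a search that is already running.

```python
import threading


class ToyDB:
    """Hypothetical in-memory stand-in for a database backend (not ldb)."""

    def __init__(self, records):
        self._records = dict(records)
        # One lock for the whole database, shared by readers and writers.
        self._lock = threading.Lock()

    def add(self, key, value):
        # A write must take the whole-database lock first.
        with self._lock:
            self._records[key] = value

    def search(self, callback):
        # Hold the whole-database lock for the full duration of the search,
        # so no write can land while results are being delivered.
        with self._lock:
            for key, value in self._records.items():
                callback(key, value)


def demo():
    db = ToyDB({"a": 1, "b": 2})
    seen = []
    state = {"writer": None, "blocked": None}

    def on_result(key, value):
        seen.append(key)
        if state["writer"] is None:
            # Start a concurrent write from inside the search callback.
            writer = threading.Thread(target=db.add, args=("c", 3))
            state["writer"] = writer
            writer.start()
            # Give the writer a moment; it should still be waiting on the lock.
            writer.join(timeout=0.1)
            state["blocked"] = writer.is_alive()

    db.search(on_result)
    # The search has finished and released the lock; the write can now land.
    state["writer"].join()

    final = []
    db.search(lambda key, value: final.append(key))
    return seen, final, state["blocked"]
```

In this sketch the first search only ever sees the records that
existed when it started, and the write becomes visible to the next
search.  Note that the callback must not call back into the database
from the same thread, since the toy lock is not re-entrant.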

This looks like quite a long road, and I just hope we can sort it out
for 4.7!


Andrew Bartlett

Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba
