[PATCH] Re: Proposed ldb 1.1.30 and tdb 1.3.13 (improve AD DC search performance, make multi-process)

Andrew Bartlett abartlet at samba.org
Tue Apr 11 20:23:27 UTC 2017

On Tue, 2017-04-11 at 21:40 +0200, Stefan Metzmacher wrote:
> Hi Andrew,
> > > Here's an update on current master.
> > > 
> > > I've added the run-fcntl-deadlock as test.
> > > 
> > > I still want to run the standalone make test of tdb and ldb on
> > > solaris,
> > > before we push this and have a closer look at the ldb changes.
> > 
> > Thanks!
> > 
> > Finally, this is may also be a 'data corruption' issue:  
> > 
> > My theory is that because there is no read lock held, the index
> > might
> > refer to different records compared with the data.  Sometimes that
> > will
> > fall back to a traverse (we do that on errors), but sometimes that
> > will
> > mean we don't return all the records, as we essentially check
> > twice,
> > both in the index and in the final search filter.  
> > 
> > If a search took a long time, a modify could happen between those
> > two
> > points, and while the modify would be atomic, the search would not
> > be
> > atomic.
> I don't understand what you're trying to say...
> Do you mean we need to call ltdb_lock_read() in a wider window?
> E.g. when starting the ldb_search() from the top level module stack
> until end of that search? Currently only searches are only atomic
> within ltdb_search(), but not ldb_search().

They are not even that atomic!  The bug that started all this was that
ltdb_search() does not actually hold a lock (after the first read).  

I've not considered the implications further up the stack at this

> > On the flip side, with this patch reads will block all writes for
> > longer.
> And here...?

If we actually hold a lock in ltdb_search(), then we will block writes
more often.  That is, what we fix in '[PATCH 07/17] ldb_tdb: Ensure we
correctly decrement ltdb->read_lock_count'.

> > BTW, A test for the LDB changes is in the cmocka thread.  Because
> > LDB
> > is buggy it passes right now, but fails if we do have the ldb fix
> > but
> > don't have the tdb fix. 
> And here...
> Be explicit please and refer to specific commits and tests.


I'll merge the other cmocka tests shortly, then hope to work with
Andreas to get this on merged.  

When we Garming tried to fix the refcounting ([PATCH 07/17] ldb_tdb:
Ensure we correctly decrement ltdb->read_lock_count) and so hold the
lock in ltdb_search(), make test failed with deadlock detected.  This
test re-creates this situation manually, for easy validation. 


Andrew Bartlett

Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba

More information about the samba-technical mailing list