[PATCH] Re: Proposed ldb 1.1.30 and tdb 1.3.13 (improve AD DC search performance, make multi-process)

Andrew Bartlett abartlet at samba.org
Tue Apr 11 20:23:27 UTC 2017


On Tue, 2017-04-11 at 21:40 +0200, Stefan Metzmacher wrote:
> Hi Andrew,
> 
> > > Here's an update on current master.
> > > 
> > > I've added the run-fcntl-deadlock as test.
> > > 
> > > I still want to run the standalone make test of tdb and ldb on
> > > solaris,
> > > before we push this and have a closer look at the ldb changes.
> > 
> > Thanks!
> > 
> > Finally, this is may also be a 'data corruption' issue:  
> > 
> > My theory is that because there is no read lock held, the index
> > might
> > refer to different records compared with the data.  Sometimes that
> > will
> > fall back to a traverse (we do that on errors), but sometimes that
> > will
> > mean we don't return all the records, as we essentially check
> > twice,
> > both in the index and in the final search filter.  
> > 
> > If a search took a long time, a modify could happen between those
> > two
> > points, and while the modify would be atomic, the search would not
> > be
> > atomic.
> 
> I don't understand what you're trying to say...
> 
> Do you mean we need to call ltdb_lock_read() in a wider window?
> E.g. when starting the ldb_search() from the top level module stack
> until end of that search? Currently only searches are only atomic
> within ltdb_search(), but not ldb_search().

They are not even that atomic!  The bug that started all this was that
ltdb_search() does not actually hold a lock (after the first read).  

I've not considered the implications further up the stack at this
point...

> > On the flip side, with this patch reads will block all writes for
> > longer.
> 
> And here...?

If we actually hold a lock in ltdb_search(), then we will block writes
more often.  That is, what we fix in '[PATCH 07/17] ldb_tdb: Ensure we
correctly decrement ltdb->read_lock_count'.

> > BTW, A test for the LDB changes is in the cmocka thread.  Because
> > LDB
> > is buggy it passes right now, but fails if we do have the ldb fix
> > but
> > don't have the tdb fix. 
> 
> And here...
> 
> Be explicit please and refer to specific commits and tests.

https://lists.samba.org/archive/samba-technical/2017-April/119857.html
has 
http://lists.samba.org/pipermail/samba-technical/attachments/20170407/4
a60faea/0002-ldb-Add-test-for-transaction-deadlock-detected-when-.bin

I'll merge the other cmocka tests shortly, then hope to work with
Andreas to get this on merged.  

When we Garming tried to fix the refcounting ([PATCH 07/17] ldb_tdb:
Ensure we correctly decrement ltdb->read_lock_count) and so hold the
lock in ltdb_search(), make test failed with deadlock detected.  This
test re-creates this situation manually, for easy validation. 

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba




More information about the samba-technical mailing list