ntdb for Samba 4.2?

Volker Lendecke Volker.Lendecke at SerNet.DE
Thu Mar 20 04:35:20 MDT 2014


On Thu, Mar 20, 2014 at 03:33:01PM +1030, Rusty Russell wrote:
> Volker Lendecke <Volker.Lendecke at SerNet.DE> writes:
> > On Wed, Mar 19, 2014 at 02:48:33PM +1030, Rusty Russell wrote:
> >> Meanwhile, there's no demand to drive it.  TDB has problems, but scaling
> >> over 3GB isn't the main one.  The locking speed has been greatly
> >> improved by the addition of inline mutexes; and so you won't notice
> >> freelist contention as much.  And using dbwrap means the new API doesn't
> >> even make the code cleaner.
> >
> > Even with mutexes you do notice freelist contention. The
> > real kick for this in our tests is the patch from
> > yesterday: With TDB_VOLATILE we keep some dead records per
> > chain, and while the freelist is blocked we go fishing for
> > dead records in other chains. Essentially, this turns each
> > chain into a small freelist. This together with mutexes
> > really, really rocks. No futex syscalls around anymore even
> > for heavily loaded servers. :-)
> 
> Yes, this is a really good idea.  I'd love to see a benchmark
> included in tdb which demonstrated the improvement, too.  I'm
> not sure tdbtorture will do it.
> 
> But:
>         5f7b481349796cc0e90563ed01353809b403e429 tdb: Fix a tdb corruption
> 
> I don't understand how this can corrupt?  A test case should be fairly
> simple, but I think this change is unnecessary.

The test case is simple: Run tdbtorture after adding
TDB_VOLATILE to the open flags. The problem is that with the

        rec_ptr = tdb_find_lock_hash(tdb, key, hash, F_WRLCK, &rec);

in common.c:393 we read the full record, including the .next
pointer, into a copy on tdb_delete_hash's stack.
tdb_purge_dead (line 412) can then change rec.next in the
database: If rec.next points at a dead record, tdb_purge_dead
will rewrite rec.next on disk without updating the stack
copy. Without 5f7b481349796cc0e90563ed01353809b403e429 we
wrote our whole copy back to disk, including the now invalid
rec.next. The patch changes the write call so that only the
magic value is written; the rest of the record is left
untouched.
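
To make the sequence concrete, here is a toy model of the stale
copy problem. It is only a sketch and not tdb's code: the "disk"
is a small in-memory array, and struct rec, the magic constants
and all function names are made up for illustration. It also
assumes, as a simplification, that the purge walks the chain
starting at the record being deleted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic values, not the ones tdb uses. */
#define LIVE_MAGIC 0x1u
#define DEAD_MAGIC 0x2u
#define FREE_MAGIC 0x3u
#define NREC 4

/* Toy record header: offset of the next record in the chain, plus a magic. */
struct rec {
	uint32_t next;
	uint32_t magic;
};

/* Toy "database file"; offset 0 means end of chain. */
static struct rec disk[NREC];

/* Build the chain 1 -> 2 -> 3, where record 2 is already dead. */
static void disk_init(void)
{
	memset(disk, 0, sizeof(disk));
	disk[1] = (struct rec){ .next = 2, .magic = LIVE_MAGIC };
	disk[2] = (struct rec){ .next = 3, .magic = DEAD_MAGIC };
	disk[3] = (struct rec){ .next = 0, .magic = LIVE_MAGIC };
}

/* Unlink every dead record that follows 'start'. This rewrites
 * next pointers on "disk", behind the back of any in-memory copy. */
static void purge_dead(uint32_t start)
{
	uint32_t cur = start;

	while (cur != 0) {
		uint32_t nxt = disk[cur].next;

		if (nxt != 0 && disk[nxt].magic == DEAD_MAGIC) {
			disk[cur].next = disk[nxt].next;
			disk[nxt].magic = FREE_MAGIC;
			disk[nxt].next = 0;
		} else {
			cur = nxt;
		}
	}
}

/* Old behaviour in this model: write the whole stack copy back. */
static void delete_full_write(uint32_t off)
{
	struct rec copy;

	assert(off != 0 && off < NREC);
	copy = disk[off];		/* copy includes .next */
	purge_dead(off);		/* may change disk[off].next */
	copy.magic = DEAD_MAGIC;
	disk[off] = copy;		/* stale .next goes back to disk */
}

/* Patched behaviour in this model: touch only the magic on disk. */
static void delete_magic_only(uint32_t off)
{
	assert(off != 0 && off < NREC);
	purge_dead(off);
	disk[off].magic = DEAD_MAGIC;	/* .next on disk stays as purged */
}
```

Calling disk_init() and then delete_full_write(1) leaves record 1
pointing at record 2, which the purge has just freed, so record
3 is no longer reachable from record 1. With
delete_magic_only(1) record 1 keeps the next pointer that the
purge wrote.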

Hope that explains it.

With best regards,

Volker Lendecke

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt at sernet.de

