[PATCHES] CTDB parallel database recovery

Michael Adam obnox at samba.org
Wed Oct 14 10:57:41 UTC 2015


On 2015-10-14 at 21:26 +1100, Martin Schwenke wrote:
> Note that it is now in master...  :-)

Gosh - I thought I'd give poor Amitay a heads-up since
no-one seemed to care to even respond... ;-)

Thanks for all the work!

Cheers - Michael

> peace & happiness,
> martin
> 
> On Wed, 14 Oct 2015 10:53:50 +0200, Michael Adam <obnox at samba.org>
> wrote:
> 
> > Hi Amitay!
> > 
> > First of all thanks for this work! - It looks impressive.
> > I wanted to look at it, but have not yet found the time
> > for it. It is quite a patchset... :-)
> > 
> > Just writing this heads-up so you know this hasn't gone
> > unnoticed.
> > 
> > Cheers - Michael
> > 
> > On 2015-09-25 at 16:22 +1000, Amitay Isaacs wrote:
> > > Hi,
> > > 
> > > For the last few months I have been working on fixing the Samba/CTDB
> > > deadlock issue.
> > > 
> > > The problem:
> > > 
> > > Samba often tries to grab multiple record locks in sequence.  Consider
> > > a case where Samba is already holding a record lock on one database and
> > > tries to get a record lock on a second database.  If the second record is
> > > not available on the local node, Samba asks CTDB to migrate the record.  If
> > > recovery occurs at this time (e.g. a node becoming inactive), CTDB cannot
> > > freeze all the databases for recovery since Samba is already holding one
> > > lock and waiting for the second lock.  CTDB can process the second record
> > > request only after the recovery is complete, thus causing a deadlock.
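
The circular wait described above can be written down as a small waits-for graph. This is only an illustrative sketch: the node names ("samba", "migration", "recovery", "freeze_all") and the helper find_cycle are hypothetical labels for the steps in the description, not CTDB identifiers.

```python
def find_cycle(waits_for, start):
    """Follow waits-for edges from start; return the cycle found, or None."""
    path = [start]
    node = waits_for.get(start)
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ended, so nothing waits in a circle
    return path[path.index(node):] + [node]


# Each key waits for its value (hypothetical labels for the scenario above).
waits_for = {
    "samba": "migration",       # Samba waits for the second record to migrate
    "migration": "recovery",    # migration is only processed after recovery
    "recovery": "freeze_all",   # recovery needs every database frozen
    "freeze_all": "samba",      # freezing needs Samba to drop its first lock
}

cycle = find_cycle(waits_for, "samba")
print(" -> ".join(cycle))
```

Removing any one edge breaks the cycle; the approach described next removes the "freeze_all" dependency.
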
> > > 
> > > The solution:
> > > 
> > > In parallel database recovery, each database is frozen and recovered
> > > independently of the others.  So as soon as the second database is
> > > recovered, CTDB will resend and process all the pending migration requests
> > > and then Samba can get the second lock.  Once Samba releases both
> > > locks, CTDB can freeze the first database and recover it, completing
> > > the recovery process.
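
The ordering described above can be sketched as a toy simulation with a single client. Everything here is an assumption made for illustration (the function recover, its parameters, and the event strings are hypothetical); it models only the ordering argument, not the actual CTDB freeze or recovery code.

```python
def recover(databases, held, wanted, parallel):
    """Simulate recovery with one client that holds record locks in the
    databases listed in `held` and waits for a record in `wanted`."""
    events, held, recovered = [], set(held), set()
    progress = True
    while len(recovered) < len(databases) and progress:
        progress = False
        todo = [db for db in databases if db not in recovered]
        if parallel:
            # Per-database freeze: only databases with a held lock must wait.
            freezable = [db for db in todo if db not in held]
        else:
            # Global freeze: nothing can be frozen while any lock is held.
            freezable = todo if not held else []
        for db in freezable:
            events.append("recover " + db)
            recovered.add(db)
            progress = True
        if wanted in recovered:
            # The pending migration request is resent and can now complete.
            events.append("client locks " + wanted + ", then releases all locks")
            held.clear()
            wanted = None
            progress = True
    if len(recovered) < len(databases):
        events.append("deadlock")
    return events


print(recover(["db1", "db2"], held={"db1"}, wanted="db2", parallel=False))
print(recover(["db1", "db2"], held={"db1"}, wanted="db2", parallel=True))
```

With the global freeze the simulation stops without recovering anything; with the per-database freeze db2 is recovered first, the client finishes, and db1 follows.
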
> > > 
> > > The implementation of parallel database recovery requires untangling
> > > freezing, call processing and recovery code.  To be able to recover each
> > > database independently, new controls are added.  The first series of
> > > patches mainly modifies the freeze code.  The patches are available in the
> > > ctdb-freeze branch and are also attached.
> > > 
> > > 
> > > https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-freeze
> > > 
> > > The second set of patches adds a generation to each database and uses the
> > > database generation in the headers of record migration packets.  The
> > > patches are available in the ctdb-generation branch and are also attached.
> > > 
> > > 
> > > https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-generation
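
One way to picture a per-database generation carried in migration packet headers is the sketch below. It assumes that a packet is accepted only if its generation matches the current generation of the target database, and that recovery changes that generation; both points are assumptions for illustration, and the names MigrationPacket, Database, and accept are hypothetical rather than taken from the patches. A simple counter stands in for however the generation is actually chosen.

```python
from dataclasses import dataclass


@dataclass
class MigrationPacket:
    db_id: int
    generation: int  # generation of the database when the packet was sent
    key: bytes


class Database:
    def __init__(self, db_id):
        self.db_id = db_id
        self.generation = 1  # assumption: a counter stands in for the generation

    def recover(self):
        # Assumption: recovering a database gives it a new generation.
        self.generation += 1

    def accept(self, pkt):
        # A packet from before the recovery no longer matches and is rejected.
        return pkt.db_id == self.db_id and pkt.generation == self.generation


db1, db2 = Database(1), Database(2)
pkt1 = MigrationPacket(db1.db_id, db1.generation, b"key")
pkt2 = MigrationPacket(db2.db_id, db2.generation, b"key")

db2.recover()  # only db2 is recovered; db1 keeps its generation
print(db1.accept(pkt1), db2.accept(pkt2))
```

Because the generation is per database in this sketch, recovering db2 invalidates only packets addressed to db2, which fits recovering each database independently.
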
> > > 
> > > The last set of patches introduces lots of new code with tevent_req based
> > > abstractions.  The significant changes are:
> > > 
> > >   - Packet read/write and communication abstractions
> > >   - Abstract and separate protocol serialization routines
> > >   - Create a completely new client side API based on tevent_req
> > > 
> > > The main motivation is to encapsulate the CTDB protocol behind a reasonable
> > > API which can be used by the CTDB tool and Samba.  This will avoid the need
> > > to maintain an independent implementation of the CTDB protocol in the Samba
> > > code.
> > > 
> > > The actual parallel database recovery is implemented in a recovery helper
> > > which is forked from the recovery daemon.  The patches are available in
> > > the ctdb-recovery branch.
> > > 
> > > 
> > > https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-recovery
> > > 
> > > The last set of patches is quite large and therefore not attached.
> > > 
> > > I have added tests for the new abstractions and extensive tests for protocol
> > > serialization.  The new client side API is tested with a re-implementation
> > > of the ctdb tool.  The new tool will eventually replace the current
> > > tools/ctdb.c code.  This code is not yet complete and therefore will not be
> > > pushed just yet.  Those who are interested can take a look at the new ctdb
> > > tool here:
> > > 
> > >   https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-wip
> > > 
> > > Martin has reviewed all the patches; however, these are large changes and I
> > > would really appreciate additional eyes on the code.
> > > 
> > > To reiterate, the sequence of patches (presented as branch names) is:
> > > 
> > > 1. ctdb-freeze
> > > 2. ctdb-generation
> > > 3. ctdb-recovery
> > > 
> > > Thanks.
> > > 
> > > Amitay.
> > 
> > 
> > 
> 

