[PATCHES] CTDB parallel database recovery
Martin Schwenke
martin at meltin.net
Wed Oct 14 10:26:19 UTC 2015
Note that it is now in master... :-)
peace & happiness,
martin
On Wed, 14 Oct 2015 10:53:50 +0200, Michael Adam <obnox at samba.org>
wrote:
> Hi Amitay!
>
> First of all thanks for this work! - It looks impressive.
> I wanted to look at it, but have not yet found the time
> for it. It is quite a patchset... :-)
>
> Just writing this heads-up so you know this hasn't gone
> unnoticed.
>
> Cheers - Michael
>
> On 2015-09-25 at 16:22 +1000, Amitay Isaacs wrote:
> > Hi,
> >
> > For the last few months I have been working on fixing the Samba/CTDB
> > deadlock issue.
> >
> > The problem:
> >
> > Samba often tries to grab multiple record locks in sequence. Consider
> > a case where samba already holds a record lock on one database and tries
> > to get a record lock on a second database. If the second record is not
> > available on the local node, samba asks ctdb to migrate the record. If
> > recovery occurs at this time (e.g. a node becoming inactive), ctdb cannot
> > freeze all the databases for recovery since samba is already holding one
> > lock and waiting for the second lock. CTDB can process the second record
> > request only after the recovery is complete, thus causing a deadlock.
> >
> > The solution:
> >
> > In parallel database recovery, each database is frozen and recovered
> > independently of the others. So as soon as the second database is
> > recovered, CTDB will resend and process all the pending migration requests
> > and then samba can get the second lock. Once samba releases both
> > locks, ctdb can freeze the first database and recover it, completing
> > the recovery process.
> >
> > The implementation of parallel database recovery requires untangling
> > the freezing, call processing and recovery code. To be able to recover
> > each database independently, new controls are added. The first series of
> > patches mainly modifies the freeze code. The patches are available in the
> > ctdb-freeze branch and are also attached.
> >
> >
> > https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-freeze
> >
> > The second set of patches adds a generation to each database and uses the
> > database generation in the packet headers for record migration packets.
> > The patches are available in the ctdb-generation branch and are also
> > attached.
> >
> >
> > https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-generation
> >
> > The last set of patches introduces lots of new code with tevent_req based
> > abstractions. The significant changes are:
> >
> > - Packet read/write and communication abstractions
> > - Abstract and separate protocol serialization routines
> > - Create a completely new client side API based on tevent_req
> >
> > The main motivation is to encapsulate the CTDB protocol behind a
> > reasonable API which can be used by the CTDB tool and Samba. This will
> > avoid the need to maintain an independent implementation of the CTDB
> > protocol in the samba code.
> >
> > The actual parallel database recovery is implemented in a recovery helper
> > which is forked from the recovery daemon. The patches are available in
> > the ctdb-recovery branch.
> >
> >
> > https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-recovery
> >
> > The last set of patches is quite large and therefore not attached.
> >
> > I have added tests for the new abstractions and extensive tests for
> > protocol serialization. The new client side API is tested with a
> > re-implementation of the ctdb tool. The new tool will eventually replace
> > the current tools/ctdb.c code. This code is not yet complete and
> > therefore will not be pushed just yet. Those who are interested can take
> > a look at the new ctdb tool here:
> >
> > https://git.samba.org/?p=amitay/samba.git;a=shortlog;h=refs/heads/ctdb-wip
> >
> > Martin has reviewed all the patches; however, these are large changes and
> > I would really appreciate additional eyes on the code.
> >
> > To reiterate, the sequence of patches (presented as branch names) is:
> >
> > 1. ctdb-freeze
> > 2. ctdb-generation
> > 3. ctdb-recovery
> >
> > Thanks.
> >
> > Amitay.
>
>
>
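
For readers following the thread, the difference between freezing all
databases together and recovering each database independently can be
sketched with a toy model. This is only an illustration of the scenario
described in the quoted mail, not CTDB code: the Database class, its
fields and the helper functions are hypothetical names, and the client
behaviour is reduced to "release all locks once no migration is pending".

```python
from dataclasses import dataclass


@dataclass
class Database:
    # Hypothetical model of one database as seen by recovery.
    name: str
    locked_by_client: bool = False   # client holds a record lock here
    pending_migration: bool = False  # client waits for a record migration
    recovered: bool = False


def client_progress(dbs):
    """Assumed client behaviour: while a migration is pending the client is
    blocked; otherwise it finishes its work and releases all its locks."""
    if any(db.pending_migration for db in dbs):
        return
    for db in dbs:
        db.locked_by_client = False


def global_recovery(dbs, max_rounds=5):
    """Recovery that must freeze every database before recovering any."""
    for _ in range(max_rounds):
        if all(not db.locked_by_client for db in dbs):
            for db in dbs:
                db.recovered = True
                db.pending_migration = False
            return True
        # Migration requests are only served after recovery, so the
        # client cannot make progress here.
        client_progress(dbs)
    return False  # gave up: neither side can move


def parallel_recovery(dbs, max_rounds=5):
    """Recovery that freezes and recovers each database on its own."""
    for _ in range(max_rounds):
        for db in dbs:
            if not db.recovered and not db.locked_by_client:
                db.recovered = True
                db.pending_migration = False  # pending requests are resent
        client_progress(dbs)
        if all(db.recovered for db in dbs):
            return True
    return False


def make_scenario():
    # Client holds a lock on the first database and waits for a record
    # in the second one to be migrated.
    return [
        Database("first", locked_by_client=True),
        Database("second", pending_migration=True),
    ]


if __name__ == "__main__":
    print("global recovery completes:", global_recovery(make_scenario()))
    print("parallel recovery completes:", parallel_recovery(make_scenario()))
```

In this model the global variant never finishes because the first
database stays locked while the migration stays pending, whereas the
per-database variant recovers the second database first, lets the client
release its locks, and then recovers the first database in the next round.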