[PATCHES] CTDB parallel database recovery

Amitay Isaacs amitay at gmail.com
Fri Sep 25 06:22:21 UTC 2015


For the last few months I have been working on fixing the Samba/CTDB
deadlock.

The problem:

Samba often tries to grab multiple record locks in sequence.  Consider
a case where samba already holds a record lock on one database and tries
to get a record lock on a second database.  If the second record is not
available on the local node, samba asks ctdb to migrate the record.  If
recovery occurs at this time (e.g. a node becoming inactive), ctdb cannot
freeze all the databases for recovery, since samba is already holding one
lock and waiting for the second.  CTDB can process the second record
request only after the recovery is complete, so the result is a deadlock.
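
The circular wait described above can be written down as a small wait-for
graph.  This is only a toy model to make the cycle explicit; the node names
("samba", "migration", "recovery", "freeze_all") are labels invented for
this sketch and do not correspond to anything in the CTDB code.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from `start`.

    Returns the cycle as a list of nodes (first node repeated at the end),
    or None if the chain ends without looping back.
    """
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None
    return path[path.index(node):] + [node]


# Each entry reads "key is waiting for value" (illustrative labels only).
waits_for = {
    "samba": "migration",      # holds a lock on db1, waits for a record in db2
    "migration": "recovery",   # ctdb processes the request only after recovery
    "recovery": "freeze_all",  # recovery needs every database frozen
    "freeze_all": "samba",     # freezing db1 needs samba to release its lock
}

cycle = find_cycle(waits_for, "samba")
print(" -> ".join(cycle))
```

Every participant is waiting on the next one in the chain, and the chain
leads back to samba, so nobody can make progress.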

The solution:

In parallel database recovery, each database is frozen and recovered
independently of the others.  So as soon as the second database is
recovered, CTDB will resend and process all the pending migration requests
and then samba can get the second lock.  Once samba releases both
locks, ctdb can freeze the first database and recover it, completing
the recovery process.
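
The difference between the two schemes can be seen in a small simulation.
This is a rough sketch under simplifying assumptions, not the logic of the
patches: a database can be frozen only when the client holds no lock on it,
the client releases all its locks as soon as its pending record arrives,
and the function and event names are made up for illustration.

```python
def recover(databases, held_locks, pending, per_database):
    """Simulate recovery and return the list of events.

    databases:    database names, in the order recovery tries them
    held_locks:   databases on which the client holds a record lock
    pending:      database the client is waiting on for a migration, or None
    per_database: True for per-database recovery, False for a global freeze
    """
    held = set(held_locks)
    recovered = []
    events = []
    while len(recovered) < len(databases):
        if per_database:
            # Freeze and recover every database that is not locked.
            freezable = [db for db in databases
                         if db not in recovered and db not in held]
            progress = bool(freezable)
        else:
            # Global freeze: all databases or none.
            freezable = [] if held else list(databases)
            progress = not held
        for db in freezable:
            recovered.append(db)
            events.append(f"recovered {db}")
        # A pending migration is processed once its database is recovered.
        if pending is not None and pending in recovered:
            events.append(f"migrated record in {pending}")
            events.append("client released locks")
            held.clear()
            pending = None
            progress = True
        if not progress:
            events.append("deadlock")
            return events
    events.append("recovery complete")
    return events


dbs = ["db1", "db2"]
print(recover(dbs, {"db1"}, "db2", per_database=False))
print(recover(dbs, {"db1"}, "db2", per_database=True))
```

With the global freeze the simulation stops immediately with "deadlock".
With per-database recovery, db2 is recovered first, the migration goes
through, the client releases its locks, and db1 is recovered last.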

The implementation of parallel database recovery requires untangling
freezing, call processing and recovery code.  To be able to recover each
database independently, new controls are added.  The first series of
patches mainly modify the freeze code.  The patches are available in the
ctdb-freeze branch and are also attached.


The second set of patches adds a generation to each database and uses the
database generation in the packet headers of record migration packets.  The
patches are available in the ctdb-generation branch and are also attached.
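
To illustrate the idea of carrying a per-database generation in a packet
header, here is a minimal sketch.  The header layout (db_id, generation,
payload length as little-endian 32-bit integers) is hypothetical and is not
the CTDB wire format, and the assumption that a receiver simply ignores a
packet whose generation does not match is mine; the patches may handle such
packets differently (for example by deferring and resending them).

```python
import struct

# Hypothetical header: db_id, generation, payload length (not CTDB's format).
HEADER = struct.Struct("<III")


def pack_packet(db_id, generation, payload):
    """Serialize a header followed by the payload bytes."""
    return HEADER.pack(db_id, generation, len(payload)) + payload


def unpack_packet(data):
    """Parse a packet back into (db_id, generation, payload)."""
    db_id, generation, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return db_id, generation, payload


class Database:
    """Toy database that tracks only its id and current generation."""

    def __init__(self, db_id, generation):
        self.db_id = db_id
        self.generation = generation

    def accept(self, data):
        """Return the payload if the packet matches this database's
        current generation, otherwise None (assumed behaviour)."""
        db_id, generation, payload = unpack_packet(data)
        if db_id != self.db_id or generation != self.generation:
            return None
        return payload


db = Database(db_id=1, generation=7)
packet = pack_packet(1, 7, b"record-key")
before = db.accept(packet)   # generation matches, payload is returned
db.generation = 8            # the database was recovered in the meantime
after = db.accept(packet)    # packet from the old generation is ignored
```

Because the generation is per database, recovering one database only
invalidates in-flight packets for that database and leaves traffic for the
others untouched.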


The last set of patches introduces lots of new code with tevent_req based
abstractions.  The significant changes are:

  - Packet read/write and communication abstractions
  - Abstract and separate protocol serialization routines
  - Create a completely new client side API based on tevent_req

The main motivation is to encapsulate the CTDB protocol behind a reasonable
API which can be used by the CTDB tool and Samba.  This will avoid the need
to maintain an independent implementation of the CTDB protocol in the samba
code.

The actual parallel database recovery is implemented in a recovery helper
which is forked from the recovery daemon.  The patches are available in
the ctdb-recovery branch.


The last set of patches is quite large and therefore not attached.

I have added tests for the new abstractions and extensive tests for protocol
serialization.  The new client side API is tested with a re-implementation
of the ctdb tool.  The new tool will eventually replace the current
tools/ctdb.c code.  This code is not yet complete and therefore will not be
pushed just yet.  Those who are interested can take a look at the new ctdb
tool here:


Martin has reviewed all the patches; however, these are large changes and I
would really appreciate additional eyes on the code.

To reiterate, the sequence of patches (presented as branch names) is:

1. ctdb-freeze
2. ctdb-generation
3. ctdb-recovery


-------------- next part --------------
A non-text attachment was scrubbed...
Name: ctdb-freeze.patches
Type: application/octet-stream
Size: 87821 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20150925/69e4f718/ctdb-freeze.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ctdb-generation.patches
Type: application/octet-stream
Size: 19995 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20150925/69e4f718/ctdb-generation.obj>
