[PATCH] Add replicated database model to ctdb

Tue Jun 27 09:30:56 UTC 2017

On Tue, Jun 27, 2017 at 4:49 PM, Stefan Metzmacher <metze at samba.org> wrote:

> Hi Amitay,
>
> > CTDB has two database models:
> >
> > 1. Persistent databases - replicated and permanent
> > 2. Volatile database - distributed and temporary
> >
> > This patch set adds a new database model
> >
> > 3. Replicated database - replicated and temporary
> >
> > The main purpose of this database model is to store ctdb state information
> > that needs to be replicated across the cluster.  Currently different
> > types of state information are replicated using different methods.  With
> > this database model, we can get rid of various ad-hoc methods and use a
> > consistent API for cluster-wide state replication.
>
> As a side note: do you think it would be possible to introduce
> something like replicated/persistent records to the Volatile database
> model? I guess it would be useful for persistent handles in future.
>
>
I have been thinking of how to add persistence to the existing volatile
databases.

There are a few issues that need addressing.

1. We'll need a different API for updating "persistent" records.  It cannot
be based on the current fetch-lock, but has to be along the lines of the
transaction API.  This basically switches to the client-server model for
data updates (similar to all database engines) rather than the current
shared model.

This can be done by adding a new flag to persistent records.  If the record
is marked persistent, then you cannot use the fetch-lock API to update that
record.
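
As a rough illustration of the flag idea, here is a minimal sketch in C.
Everything in it is hypothetical: the flag name, the record structure and
the function are made up for this mail and are not existing ctdb
definitions.  It only shows a fetch-lock style entry point refusing records
that carry the persistent flag.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bit; NOT an existing ctdb record flag. */
#define SKETCH_REC_PERSISTENT 0x01u

/* Hypothetical, heavily simplified record header for illustration. */
struct sketch_record {
	uint32_t flags;
	int locked;	/* non-zero while held through the fetch-lock path */
};

/*
 * Fetch-lock style access (assumed shape, not the real ctdb call).
 * Records marked persistent are refused, so callers have to go
 * through a transaction style API for them instead.
 *
 * Returns 0 on success, EPERM for a persistent record and EBUSY if
 * the record is already locked.
 */
int sketch_fetch_lock(struct sketch_record *rec)
{
	if (rec->flags & SKETCH_REC_PERSISTENT) {
		return EPERM;
	}
	if (rec->locked) {
		return EBUSY;
	}
	rec->locked = 1;
	return 0;
}
```

With something like this, existing fetch-lock users keep working on
ordinary volatile records, and only records that opt in to persistence are
pushed onto the slower update path.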

2. The other issue is providing persistence across a cluster outage.  We can
start without persistence across a cluster outage as a first
approximation.  Even though no clustered file system provides such
persistence properties yet, this will have to be addressed eventually.

The more I think about it, the more likely it seems that we will have to
come up with a combined volatile+persistent model that
  - maintains the high performance of shared access,
  - provides persistence for "information" that is truly persistent,
  - provides a unified API for access.

It would be really useful to know exactly what information is stored in the
persistent records and the access patterns for such records.

For example: take locking.tdb

  - What information would be stored in each record?
  - Does all that information need to be persistent?
  - In what cases will Samba need read-only access to that record?
  - In what cases will Samba need read-write access to that record?
     - Does the persistent handle need to be updated, once it's created?
     - Do we need to update that record for something else?

Supposing we are using the transaction API to replicate the "persistent"
records,
   - all read-only accesses are completely local and very fast
   - all read-write accesses will need the transaction API
     - if we don't need to update the record much, then we don't have to
       worry about performance
     - if we do need to update the record, can we split the record into
       "volatile" information and "persistent" information?
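
To make the split concrete, here is a small C sketch.  It is a toy model
only: the structures, the three-node array standing in for the cluster and
all function names are assumptions for illustration, and the "commit" is a
plain loop standing in for a real transaction.  The point is just that the
persistent part is written on every node while the volatile part is changed
on one node only.

```c
#include <stdint.h>

#define SKETCH_NODES 3	/* assumed toy cluster size */

/* Hypothetical record split into a persistent and a volatile part. */
struct sketch_split_record {
	uint64_t persistent_handle;	/* rarely updated, replicated */
	uint32_t volatile_counter;	/* frequently updated, node local */
};

/* Toy cluster: one copy of the record per node. */
struct sketch_cluster {
	struct sketch_split_record node[SKETCH_NODES];
};

/* Read-only access: served from the local copy, no cluster traffic. */
uint64_t sketch_read_handle(const struct sketch_cluster *c, int n)
{
	return c->node[n].persistent_handle;
}

/* Volatile update: touches only the copy on node n. */
void sketch_bump_volatile(struct sketch_cluster *c, int n)
{
	c->node[n].volatile_counter++;
}

/*
 * Persistent update: stand-in for a transaction commit, which applies
 * the new value to the copy on every node.
 */
void sketch_commit_handle(struct sketch_cluster *c, uint64_t handle)
{
	int i;

	for (i = 0; i < SKETCH_NODES; i++) {
		c->node[i].persistent_handle = handle;
	}
}
```

If the persistent part really is written only once, when the handle is
created, then the cost of the commit path should matter much less than the
cost of the volatile updates.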

If we can gather such information, then we can appropriately design the
database model.

Amitay.